Our New York data center had a full outage this morning. Connectivity began flapping around 3:30 AM and settled down after 10 minutes, so we saw no need to panic. At approximately 5:00 AM, it happened again. We contacted Peer1, who suspected a connectivity problem and started investigating. Service came back around 5:30 AM, and Peer1 did not find anything. At approximately 6:15 AM it happened again, but this time it was a full outage. Again, Peer1 could not detect anything wrong with the connection. Michael went down to the data center, verified that our router could not talk to the outside world, and then moved the Peer1 network connection from our switch directly to our router. This cleared everything up.
Because bypassing the switch restored connectivity, the most likely cause is either a configuration error on the switch as a whole or a problem with just that port. We see no reason to think that the problem is on Peer1’s end. We are still investigating.
Mitigating factors: This outage did not affect FogBugz customers using the Los Angeles data center. Because the outage occurred overnight in North America, most North American customers were probably not affected.