luke: September 2010 Archives

22 guests rebooted on Mantle


Note that not all guests on Mantle were rebooted. Downtime was approximately 15 minutes. The cause was (my own) human error: I plugged the keyboard into the wrong server when I started the reboot. We were able to cancel the reboot before all guests went down, and then we brought the affected guests back up.

a discussion on the SLA


So, according to some metrics, we had 3 hours of downtime over the last two days. But it was spread over two days, so it really should count for more.

So, here's what SVTIX said about the matter:

In consideration of the downtime experienced in our SVTIX data center on September 13 and 14, I am crediting your account for three days of service. This will be applied to your current invoice.

Now, this seems to be how most of my competitors do it, too. At best, they give you a symbolic apology.
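To put that credit in perspective, here is a rough sketch of the arithmetic. It assumes a 30-day billing month, which is not stated anywhere above; the 3 hours of downtime and the three days of credit come from the figures already quoted. The function names are made up for illustration.

```python
HOURS_PER_DAY = 24


def availability(downtime_hours, period_days):
    """Fraction of the billing period during which service was up."""
    period_hours = period_days * HOURS_PER_DAY
    return 1 - downtime_hours / period_hours


def credit_multiple(credit_days, downtime_hours):
    """How many hours of credit were given per hour of downtime."""
    return (credit_days * HOURS_PER_DAY) / downtime_hours


# Assumption: a 30-day billing month (not given in the post).
PERIOD_DAYS = 30
DOWNTIME_HOURS = 3   # "3 hours of downtime"
CREDIT_DAYS = 3      # "three days of service"

month_availability = availability(DOWNTIME_HOURS, PERIOD_DAYS)
multiple = credit_multiple(CREDIT_DAYS, DOWNTIME_HOURS)
credit_share_of_bill = CREDIT_DAYS / PERIOD_DAYS

print(f"availability for the month: {month_availability:.3%}")
print(f"credit per hour of downtime: {multiple:.0f} hours")
print(f"credit as a share of the monthly bill: {credit_share_of_bill:.0%}")
```

Under that assumption, the month works out to roughly 99.58% availability, and the credit covers 24 hours of service for every hour of downtime, or about a tenth of the monthly bill.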

The thing is that if I had taken the SLA payout from my last network outage and, instead of giving out those credits, spent the money on a new router and a secondary, redundant upstream, this problem would not have been a big deal at all. Customers would not have experienced downtime.

So yeah, while an SLA is a good way of estimating the cost of a problem and aligning the interests of the owner with the interests of the customers with respect to downtime, I think that when the company is in 'full growth' mode like prgmr.com is, it might hurt more than it helps, by removing some of the working capital that would otherwise have paid for infrastructure upgrades.

About this Archive

This page is an archive of recent entries written by luke in September 2010.
