So every now and again a customer will complain of a crashing domain. Occasionally, it is an early sign of a hardware problem that I need to deal with, so I don't want to just ignore it.
Now, the problem is that like a physical server, once the domain has rebooted, most of the information about why it crashed is gone. (and what little is left is in /var/log on the guest, and as a general rule we don't like mucking around in the guest. that's your business, not ours.)
Now, on a physical server, we solve this by using a logging serial console. (I reccomend opengear if you have the money, and a used cyclades if you don't have money. the 'buddy system' (making one server the console server for the next, then the next server the console server for the first) usually requires adding usb serial dongles, but is even cheaper still, for installations with only a few servers. I personally like the IOgear brand usb -> serial dongles Fry's has.
I can turn on debug logging in xenconsoled and that will log the console for all domains to a file (one file for each domain) then I can use those logs to troubleshoot the problem. The thing is, apparently some people have privacy concerns with this, so I haven't done it yet.
Now, personally, I don't think serial consoles are that sensitive. I mean, it's common to leave terminals in data centers where passers by can see the output. They will allow me to see what program is crashing, which may be sensitive, and depending on how you have the thing configured, I can see when people log in and log out.
So, I have several options.
- I could leave it as is, continue to go back and fourth and guess if someone asks me why something crashed after a reboot
- I can log all consoles and delete the data once a week or once a month
- I can apply a patch to log some people's consoles and not others, and let the user decide
Obviously, option 2 makes my life a /whole lot/ easier. Option 3 is better than option 1, but it still means maintaining an out of tree xenconsoled (or pushing it upstream)