June 2008 Archives

mediate disk access with ionice(1).

When you've got a bunch of people running virtual machines on your hardware, there's a certain chance that you'll see contention for disk I/O.  (Disks are slow.  Everybody knows that.)  Although you can't set hard limits and partitions as you can with network QoS, you can use the ionice command to prioritize the different domains into subclasses, with a syntax like:

# ionice -p <PID> -c <class> -n <priority within class>

where -n ranges from 0 to 7, with lower numbers taking precedence.  We recommend always specifying "2" (best-effort) for the class.  Other classes exist -- 3 is idle and 1 is realtime -- but idle is extremely conservative, while realtime is so aggressive as to have a good chance of locking up the system.

Here we'll test ionice with two different domains, one with the highest "normal" priority, the other with the lowest.

First, ionice only works with the CFQ I/O scheduler.  To check that you're using the CFQ scheduler, run this command in the dom0:

# cat /sys/block/[sh]d[a-z]*/queue/scheduler
noop anticipatory deadline [cfq]
noop anticipatory deadline [cfq]

The word in brackets is the selected scheduler.  If it's not [cfq], reboot with the parameter elevator=cfq.
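If you have more than a couple of disks, eyeballing the brackets gets old.  Here's a minimal sketch that pulls the bracketed word out of each scheduler line.  The function name active_scheduler is made up for this example, and the loop assumes the same /sys/block paths as the command above.

```shell
# Print the active I/O scheduler from one line of a queue/scheduler file.
# The active scheduler is the word wrapped in square brackets.
# (active_scheduler is a hypothetical helper name, not a standard tool.)
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active scheduler for each matching disk.
# Prints nothing on a machine with no matching devices.
for f in /sys/block/[sh]d[a-z]*/queue/scheduler; do
    [ -r "$f" ] || continue
    printf '%s: %s\n' "$f" "$(active_scheduler "$(cat "$f")")"
done
```

Any disk that reports something other than cfq is one where ionice won't have the effect described here.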

Next we find the processes we want to ionice.  Since I'm using tap:aio devices, the dom0 process is tapdisk.  If I were using phy: devices, it'd be [xvd <domain id> <device specifier>].  (You can see that phy: devices give you a bunch more information.)

# ps aux | grep tapdisk
root      1054   0.5   0.0   13588   556   ?   Sl   05:45   0:10   tapdisk /dev/xen/tapctrlwrite1 /dev/xen/tapctrlread1
root      1172   0.6   0.0   13592   560   ?   Sl   05:45   0:10   tapdisk /dev/xen/tapctrlwrite2 /dev/xen/tapctrlread2

Now we can ionice our domains.  Note that the numbers in the tapctrl device names correspond to the order the domains were started in, not the domain ID.

# ionice -p 1054 -c 2 -n 7
# ionice -p 1172 -c 2 -n 0
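With more than two domains it's easy to fat-finger a priority, so here's a rough dry-run sketch that checks the priority is in range and prints the command instead of running it.  The helper name ionice_cmd is invented for this example; the PIDs are the ones from the ps output above, and the class is fixed at 2 as recommended earlier.

```shell
# Dry run: print the ionice command for a PID and a best-effort priority.
# (ionice_cmd is a hypothetical helper; it runs nothing itself.)
ionice_cmd() {
    pid=$1
    prio=$2
    case $prio in
        [0-7]) ;;                       # valid priorities are 0 through 7
        *) echo "priority must be 0-7" >&2; return 1 ;;
    esac
    printf 'ionice -p %s -c 2 -n %s\n' "$pid" "$prio"
}

ionice_cmd 1054 7    # lowest "normal" priority
ionice_cmd 1172 0    # highest "normal" priority
```

Once the printed commands look right, run them as root in the dom0.  Running ionice -p with just a PID should then report the class and priority that process ended up with.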

To test ionice, I'll run a couple of bonnie++ processes, and time them.  (After bonnie, I run dd, just to make sure that conditions for the other domain remain unchanged.)

prio 7 domU tmp # /usr/bin/time -v  bonnie++  -u 1 && dd if=/dev/urandom of=load
prio 0 domU tmp # /usr/bin/time -v  bonnie++  -u 1 && dd if=/dev/urandom of=load

Results?  Well, wall-clock-wise, the domU with prio 0 took 3:32.33 to finish, while the prio 7 domU needed 5:07.98.  The bonnie++ results themselves were a bit confusing -- some tests showed a great difference, others not so much.  Try it for yourself.
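To put a number on that gap, here's a small sketch that converts the two wall-clock times to seconds and takes the ratio.  The helper name to_seconds is made up for this example, and it only handles the m:ss form shown here, not h:mm:ss.

```shell
# Convert a "minutes:seconds" wall-clock time to seconds.
# (to_seconds is a hypothetical helper; it does not handle h:mm:ss.)
to_seconds() {
    printf '%s\n' "$1" | LC_ALL=C awk -F: '{ printf "%.2f\n", $1 * 60 + $2 }'
}

fast=$(to_seconds 3:32.33)   # prio 0 domU
slow=$(to_seconds 5:07.98)   # prio 7 domU

# Ratio of the prio 7 time to the prio 0 time.
LC_ALL=C awk -v f="$fast" -v s="$slow" 'BEGIN { printf "%.2f\n", s / f }'
```

That's 212.33 seconds against 307.98, so the prio 7 domU took roughly 45% longer on this run.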

(Of course, this is another thing to integrate into our increasingly baroque domain config files.)

the clouds will be a daisy chain.

Still bouncing back and forth between the various things that need doing.  I think migration is the current lowest-hanging fruit, so I'm going to address that.

The only things I really want to clear up there, other than the usual rewriting, are the (relatively new) -c option to xm save, and the hooks Xen provides to migrate external devices.  One of those is trivial; the other is quite hard.

But hey, I'm having fun, I think.

unplanned obsolescence.

So I've concluded that the ROUTE target, with its associated --tee option, is a complete pack of lies.  I think the best solution for Snort is definitely to throw hardware at the problem.

I'm going to take an unpopular tack here, and opine that bad, out-of-date documentation is still better than no documentation at all.  At least with out-of-date stuff you get an idea of the direction they were going, the sort of things that they implemented, the way things used to work.  Sometimes that's all you need.

Besides, if we were afraid to write anything that would become obsolete. . .  Well, I don't even want to finish that thought.

A couple of days ago I said that we had completed all the research we were ever going to do for the 'frontends' chapter, that we weren't going to install any more, because I didn't want to get into a cycle of chasing their upgrades.

In retrospect, I may have been lying.

See, the issue is that, while Xen is a cool technology, it's also fundamentally kind of boring.  It makes stuff possible by adding a new layer of abstraction and encapsulation, but until tools are written to take advantage of it, there's really not much point.  In some ways the frontend chapter, therefore, is a tremendous opportunity.  (Rather than, as we've been viewing it, a digression on clunky substitutes for expressive command-line tools.)

So I'm going to add some stuff about OpenQRM, based on a tip from our technical reviewer, because it looks like it's got some neat datacenter-oriented features that take advantage of Xen for load balancing and such.

There is power in the virtualization concept -- it's not really just about more little boxen.  Got to keep that in mind.

xen remote control.

More todo!  (The copyediting phase is, after all, the best time to add new sections.)

I want to nail down in my mind, for certain and absolute, whether it is possible to control Xen remotely, and if so, how.  virt-manager seems to allow it, but I haven't found the knobs to twiddle that make it work.  This would be useful to allow, for example, a centralized console server for the customer domains.

This is probably related to the XenAPI in some way.

copyeditor's marks.

We're copyediting!  It occurs to me that we (by which I mostly mean, *ahem* my coauthor) would benefit from an overview of copyediting marks.  So I searched the web and found a couple of references.  (Okay, I admit I'd actually forgotten some of them myself.  It's been a very long time.)

http://bfa.sdsu.edu/editorial/copyediting.htm has most of the common ones that we're likely to use.

http://www.wiley.com/legacy/authors/guidelines/stmguides/4-fig1.htm is a more complete, less friendly reference.

In addition to these, I tend to indicate that I want to move a block by bracketing it [ ] and drawing an arrow.  Circled words most often mean "tt format", for command or file names.  Transposition is indicated with an s-curve between and around the items to switch.

We don't have to be precise, since all the paper markup we're doing is just for scratch.  But there's no reason not to use a standard system.  It's like terminating your UTP cables with the correct scheme, rather than just taking any two pairs.

still tweaking.

This one's a reminder -- we're considering pulling the firewalling bits out of the hosting chapter, expanding them, and putting them into a discussion of dom0 firewalling in the networking chapter.  The idea is that there are substantive differences between firewalling in the Xen case and in the ordinary case -- with Xen, we want to firewall the dom0 but let the domUs have unfiltered access to the network.

Discussion of shaping will most likely still go in hosting.

Even though storage is technically kind of finalized, I want to add a few lines on falling through to custom block devices.  (There's already a brief discussion, but a more concrete example might help.)

We could refer to the block-iscsi script described at http://lists.xensource.com/archives/html/xen-devel/2007-11/msg00782.html

bork bork bork.

My overuse of adverbs makes me sick.  There are a bunch of writing sins I can't seem to escape.  I think the constantly-iterated editing process is helping the book, though.

As to actual status, I told Luke to look at and send off Chapter 3 for copyedit.  I looked at Chapter 9 yesterday and felt like chopping my fingers off at the first joint -- but that should be ready for copyedit tomorrow.

xen, cobbler, and kickstart.

Today I reduced the stratospheric level of procrastination in which I've been indulging by actually getting pypxeboot and cobbler set up and working.  It's really amazingly slick -- a single command provisions a VM and writes a config file.

However, I'm not sure how useful this really is in the Xen context.  It does integrate very well with a "bare metal" provisioning system (like kickstart).  One of the nice things about Xen, though, is that you can just copy disk images around, or dd one physical device to another.  If you've already got a fully-configured pxeboot-based system running on your network, this is useful.  If not, it's probably easier to just write some short scripts.

I've got to admit, though, Red Hat has a slick product.

I just realized that "cobbler" is a pun based on "kickstart" . . .  shoes. . .  kick. . .  my word, Red Hat, that isn't funny at all.

About this Archive

This page is an archive of entries from June 2008 listed from newest to oldest.
