Hating on HATEOAS
Wed, 05 Jan 2011 14:39:09 +0000
In 2011 I will not start blog posts with the word "so".
Lately I've been thinking about RESTfulness again. An observation that
has been widely made is that Roy T. Fielding's definition of REST
differs wildly from what most of the rest (sorry) of the world thinks
it is - while all people with taste and discrimination must surely
agree that the trend from evil-tasting SOAPy stuff back to simple
HTTP-based APIs is a Good Thing, the "discoverability" and "hypertext"
aspects of Canonical REST are apparently not so widely considered as
important for practical use.
My own small contribution to this debate is that the reason people are
not trying to do HATEOAS is that they've been told that the web at
large - the large part of the WWW that's mediated through ordinary web
browsers under the direction of human brains - is an example of how it
works. And the more I think about it the more I think that the example
is rubbish and unhelpful.
It's a rubbish example because the browsers through which we're
viewing these resources have very limited support for most of the
HTTP verbs and HTTP response codes that REST requires, hence silly
workarounds like tunnelling PUT inside POST using _method=put. In a
way that's a trivial complaint because the workarounds do exist, but
it's still a mess. (Note for purists: I write "REST requires" when
what I really mean is "HTTP defines", but you don't qualify for the
"RESTful" badge if you're misusing HTTP, and there's little social
cachet in describing an API as consensual HTTP.)
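For concreteness, here's roughly what that workaround looks like, in
the convention popularised by Rails (the URL and field values are
invented for illustration): the browser can only POST the form, and
the server-side framework unpicks the hidden field to treat it as a
PUT.

    <form action="/widgets/1" method="post">
      <!-- browsers won't send PUT from a form, so smuggle the real
           verb through in a field for the framework to interpret -->
      <input type="hidden" name="_method" value="put" />
      <input type="text" name="quantity" value="3" />
      <input type="submit" value="Update" />
    </form>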
It's also a rubbish example because humans tend to expect a multi-stage "workflow" or "wizard" interaction, but HTML has lousy support for
updating state and indicating a transition at the same time. A
representation of a resource might include a Form to update the
state of that resource, but it says nothing about what you can do
when it's been updated. Alternatively (or additionally) it might
include a navigation link to another resource, but that will be
fetched with a GET and won't change anything server-side. Let's
take a typical shopping cart as an example: a form with two buttons
for "update quantity" and "go to checkout" - whichever button you
press, the resource that gets POSTed to is the same in either
case, and any application state transition that might happen
after you click is driven by the server sending a redirect (or
not) - in effect, the data sent by the client smooshes together both the
updated resource state and the navigation, which doesn't smell
to me like hypertext. And as a side note, we may yet decide to
ignore the client's indication of where it wants to go next if
the data supplied is not valid for the current state of the
resource, and instead send another copy of the shopping cart page
prefixed with a pretty red box that says "sorry, you can't have
3.2j widgets" - and in all probability send it with a "200 OK"
response code because there's no point sending any fancy kind of
40x when you don't know whether the browser will display it or
will substitute with its own error page.
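To make the smooshing concrete, here's a minimal sketch of the kind
of form I mean (names and URL invented): both buttons submit the same
data to the same resource, the pressed button's name rides along as
just more form data, and the server alone decides whether a redirect
to the checkout happens next.

    <form action="/cart" method="post">
      <input type="text" name="quantity" value="3.2j" />
      <!-- same action either way; the navigation hint is mixed in
           with the resource state update -->
      <input type="submit" name="update" value="Update quantity" />
      <input type="submit" name="checkout" value="Go to checkout" />
    </form>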
And thirdly it's a rubbish example because of the browser history
stack and the defensive server-side programming that becomes
necessary when your users start to treat your story as a Choose Your
Own Adventure game. The set of state transitions available to the user is in
practice not just the ones in the document you're showing him,
but also all the other ones you've shown him in any of n previous
documents: some of them may still be allowed, but
others (changing the order details after you've charged his card)
may not. Sending him "409 conflict" in these situations is
probably not going to make him any wiser - you're going to have
to think about the intention behind his navigational meander and
do something that makes sense for the mental
model you think he has. Once the user has hit the Back button and desynced the
application state from the server-side resource state, you're
running to catch up.
To summarise, a web application designed for humans needs to
support human-friendly navigation and validation in ways which
current browsers can't while keeping true to the intended uses of
HTML and HTTP and RESTful style in general. This doesn't mean I
think HATEOAS is bad as a concept - I just think we should be
looking elsewhere than the human-driven web for an example of
where it's good (and I haven't really found a compelling one
yet).
I have a nasty feeling that the comments on this site are presently broken, but responses by email (dan @ telent.net) are welcome - please say if you want your email published or not.
How to create a diskless elastichosts node
Sat, 22 Jan 2011 22:35:28 +0000
Elastichosts is a PAYG (or monthly contract) "cloud" virtual server
provider based on the Linux kvm technology. At $WORK we use it to
provide a horizontally scalable app service, and we need to be able to
add new app servers in less time than it takes to copy a complete
working Debian system. Also we want to be running the same version of
the same software on every server (think "security updates") and we
don't want to be paying for another 3GB of Debian that we don't really
need on each box. So, we need that stuff to be shared.
Elastichosts don't directly support kvm snapshots (or they didn't when
I asked them about it) which leaves us looking for alternative ways to
do the same thing. This blog entry describes one such approach: we
use a read-only CD image for the root filesystem and then mount /usr
and /home over NFS and a ramdisk (populated at boot) on /var. It's
all done using standard Debian tools and Debian setup as of the
"squeeze" 6.0 release.
The finished thing is on github at
https://github.com/telent/squeeze-cd-nfsroot/ . To use, basically you
clone the repo into /usr/local/client, edit the files, and run make.
Slightly less basically, you almost certainly need to know what edits
to make to which files, and you may also want to know how it works
anyway. So read on ...
(Yes, you should be able to clone it elsewhere because I shouldn't have hardcoded that directory name into the Makefile. This may be fixed in a future version if I ever find the need to install it somewhere else myself. Or see the 'conclusion' section if you want to fix it yourself)
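In other words, assuming you're installing into the hardcoded
location, it's something like:

    git clone https://github.com/telent/squeeze-cd-nfsroot/ /usr/local/client
    cd /usr/local/client
    # edit the files described under "A short guide to customising
    # the system" below, then
    make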
How the client boots
- the client boots off a CD (ISO9660) image created by
initramfs-tools which is configured to look for an nfsroot directory.
This directory is created on the server by a Makefile rule that
copies the server's root dir and then replaces, renames and changes a
bunch of stuff in /etc
- it then mounts a ramdisk on /tmp and another on /var. There is an
initscript populate_var which creates all the empty directories that
daemons will expect when they start up (see the sketch after this
list). Note that these directories are entirely ephemeral, which
means for example that syslog must be configured to log remotely
- it mounts /usr and /home (readonly) directly from the server. This
means that most of the packages on the server are available
immediately on the clients - unless they include config files in
/etc, in which case they aren't until you rerun the Makefile that
creates the nfsroot (after, possibly, adjusting the config
appropriately for the client)
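To give a flavour of populate_var, here's a minimal sketch of the
sort of thing it does - the real script is in the repo, and the
directory list here is illustrative rather than exhaustive:

    #!/bin/sh
    ### BEGIN INIT INFO
    # Provides:          populate_var
    # Required-Start:    $local_fs
    # Required-Stop:
    # Default-Start:     S
    # Default-Stop:
    # Short-Description: recreate the ephemeral directories under /var
    ### END INIT INFO
    # /var is a ramdisk, so everything daemons expect to find there
    # has to be recreated at every boot
    mkdir -p /var/run /var/log /var/lock /var/tmp /var/spool /var/lib/misc
    chmod 1777 /var/tmp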
A short guide to customising the system
These files are copied to the client - you may want to review their contents
- template/etc/fstab needs to have the right hostname for your NFS server (there's an example after this list)
- template/etc/initramfs-tools/initramfs.conf - check DEVICE and NFSROOT settings
- template/etc/network/interfaces may need tweaking
- template/etc/resolv.conf is set up for our network, not yours
- template/etc/init.d/populate_var might need directories added or
chown invocations removed, depending on what packages you have installed
- template/etc/rsyslog.conf needs editing for the syslog server's IP address
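For example, the client fstab might look something like this - the
server address is illustrative, and note that the root filesystem
itself is mounted by the initramfs rather than from here:

    10.0.0.1:/usr   /usr   nfs    ro,nolock  0  0
    10.0.0.1:/home  /home  nfs    ro,nolock  0  0
    tmpfs           /tmp   tmpfs  defaults   0  0
    tmpfs           /var   tmpfs  defaults   0  0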
And also
- insserv calls in Makefile may need adjusting if you have other services on the server that you don't want to also run on the client
And on the server
- you'll need to be exporting the nfsroot/ directory over NFS, ditto
/home and /usr. My /etc/exports looks something like this
/usr/local/client/nfsroot 10.0.0.0/24(ro,no_root_squash,no_subtree_check)
/home 10.0.0.0/24(ro,no_root_squash,no_subtree_check)
/usr 10.0.0.0/24(ro,no_root_squash,no_subtree_check)
- you need to run a dhcp server (I use "dnsmasq", which also provides DNS service). Make sure this is only running on your vlan address: I don't know whether elastichosts will filter out rogue DHCP servers running on their network or will just come around and break your fingers for trying, but either way it's not a good idea.
- If you want the clients able to syslog, you need to configure the syslog server to accept syslog messages from them. rsyslogd seems to be standard in Debian these days - I mention this because it does remote syslogging over TCP not the traditional UDP, so make sure both ends are speaking the same protocol and you don't have iptables rules between them that are dropping your messages on the floor.
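By way of illustration (all addresses invented): the dnsmasq side of
that is a few lines in /etc/dnsmasq.conf binding it to the vlan
interface only

    interface=eth1
    bind-interfaces
    dhcp-range=10.0.0.50,10.0.0.150,12h

and the rsyslog arrangement is a line or two at each end - on the
server, enable the TCP listener

    $ModLoad imtcp
    $InputTCPServerRun 514

and in the client's template/etc/rsyslog.conf, forward everything
(@@ means TCP; a single @ would be UDP)

    *.* @@10.0.0.1:514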
How we build the files
The nfsroot
Creating the nfsroot is done by the Makefile rootfs target. It starts
by rsyncing the real root into nfsroot/ with a whole bunch of
exclusions, then copies files from template/ over the copied files to
cater for the bits that need a different configuration on the client
than they do on the server, then does some other fiddling around.
Most notable:
- we have to copy libcrypto and libz out of /usr into the root
filesystem, because the dhcp client needs those libraries and it runs
before /usr is mounted (see http://bugs.debian.org/592361 - though
according to that page this bug is now fixed)
- we blat the generated files etc/udev/rules.d/*persistent* which
are correct for the server but not for the client.
- Debian will run better with no /etc/hostname than it will with the wrong one
- Debian squeeze uses a slightly exciting parallelising
dependency-based system for running init scripts, so we can't just
copy files into init.d, we need to run insserv to make it see them.
(As a long-time Unix user who doesn't pay enough attention when these
kinds of changes are made, this took ages to work out.) Similarly, to
disable daemons that run only on the server, we use insserv -r.
- a couple of files need to be writable, so we replace them with symlinks:
  - /etc/network/run is pointed to /lib/init/rw
  - /etc/mtab is pointed to /proc/mounts
- We create our own etc/resolv.conf. Our elastichosts clients
generally have a public (dynamically allocated) IP address assigned
to eth0 and a vlan attached to eth1. DHCP gets exciting here: the
client boots off eth1 and gets the address of that interface using
boot-time kernel code, then runs the user-space dhclient tool to get
an eth0 address, and we'd rather not rely on the conjunction of all
that to get /etc/resolv.conf right
- populate_var pretty much does what it says on the tin but might need more directories adding/removing depending on what you have installed
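Putting the above together, this is the rough shape of the rootfs
rule, rendered as shell for illustration - the real rule lives in the
Makefile in the repo, and the exclusion list here is abbreviated:

    # copy the server's root, keeping the mount points but not the contents
    rsync -a --delete \
        --exclude '/proc/*' --exclude '/sys/*' --exclude '/dev/pts/*' \
        --exclude '/usr/*' --exclude '/home/*' \
        --exclude '/var/*' --exclude '/tmp/*' \
        / nfsroot/
    # overlay the client-specific config
    cp -a template/. nfsroot/
    # dhclient links against these and runs before /usr is mounted
    cp -a /usr/lib/libcrypto.so.* /usr/lib/libz.so.* nfsroot/lib/
    # server-specific generated files that would be wrong on the client
    rm -f nfsroot/etc/udev/rules.d/*persistent* nfsroot/etc/hostname
    # writable files become symlinks
    rm -f nfsroot/etc/mtab && ln -s /proc/mounts nfsroot/etc/mtab
    rm -rf nfsroot/etc/network/run && ln -s /lib/init/rw nfsroot/etc/network/run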
The initramfs
The Makefile ramfs.img target makes an initramfs image which knows
how to mount root on nfs. This particular magic is built into Debian,
and the only point of note here is that we use
nfsroot/etc/initramfs-tools as the config directory so we know we're
generating a config for the client without treading on the server's
usual initramfs config (which it might need when it boots itself). In
our setup the only file that's actually changed is
template/etc/initramfs-tools/initramfs.conf which has settings for
BOOT, DEVICE, NFSROOT that probably differ from what the server wants
for itself.
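Concretely, the settings in question are just a few lines in
template/etc/initramfs-tools/initramfs.conf (server address
illustrative)

    BOOT=nfs
    DEVICE=eth1
    NFSROOT=10.0.0.1:/usr/local/client/nfsroot

and the rule itself boils down to something like

    mkinitramfs -d nfsroot/etc/initramfs-tools -o ramfs.img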
Creating the cd image
This is pretty straightforward too. The Makefile boot_cd.iso target
runs mkisofs to generate a CD image using the initramfs image and
other files taken from isolinux.
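If you haven't met it before, this is the usual isolinux incantation,
roughly as follows - assuming (for illustration) a cd/ directory
containing isolinux/, the kernel and ramfs.img:

    mkisofs -o boot_cd.iso -b isolinux/isolinux.bin -c isolinux/boot.cat \
        -no-emul-boot -boot-load-size 4 -boot-info-table cd/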
Uploading it
We had to slightly patch the elastichost-upload script to add the
ability to create shared images as well as exclusive ones. This is
controlled by the API key claim:type, which the elastichosts API docs
describe as follows: "either 'exclusive' (the default) or 'shared' to
allow multiple servers to access a drive simultaneously"
The patched version is in the git repo, accompanied by the patch.
Once you've uploaded the first one you can uncomment the DRIVE_UUID
param at the top of Makefile so that subsequent attempts update the
same drive instead of creating a new one every time.
Conclusion
There you have it. It's certainly a bit rough and ready right now and
requires editing a few too many files to be completely turnkey, but
hopefully it will save someone somewhere some time. If you have bug
fixes, send me patches (or fork it on github and send me pull
requests); if you have suggestions, my inbox is open; if you know you
need something like this but can't understand what I'm writing about,
my consulting rates are reasonable ;-)