A big problem we hit with the current release (1.5.0) of the Ruby memcache-client library is that if the memcache server connection dies, it leaves the mongrel permanently broken. I've written a big patch to refactor it to reconnect properly and also retry the request (once) to give it a fresh start.
Might as well just quote myself from the ticket:
I've written a big patch for memcache-client that does two things. Firstly, it reconnects properly if the connection dies, so that you won't get permanently broken mongrels when the memcache server goes down and has since been restarted or otherwise fixed up.
Secondly, if the connection dies, it retries the request – once only; it won't keep looping if things still aren't working.
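To make the reconnect-and-retry-once behaviour concrete, here is a minimal sketch of the general technique. It is not the code from the patch: the class name `RetryOnceConnection`, the `with_connection` method and the connector block are all hypothetical names made up for illustration, and the choice of `IOError` and `SystemCallError` as "the connection died" is an assumption.

```ruby
# Hypothetical sketch only; none of these names come from memcache-client.
class RetryOnceConnection
  # The block passed in opens and returns a fresh connection (assumed interface).
  def initialize(&connector)
    @connector = connector
    @conn = nil
  end

  # Yields a live connection to the caller's block. If the block raises a
  # connection-level error, the dead connection is dropped, a new one is
  # opened, and the block is run exactly one more time.
  def with_connection
    retried = false
    begin
      @conn ||= @connector.call   # (re)connect lazily if we have no connection
      yield @conn
    rescue IOError, SystemCallError
      @conn = nil                 # forget the dead socket so the next call reconnects
      raise if retried            # second failure: give up rather than loop
      retried = true
      retry                       # re-run the begin block with a fresh connection
    end
  end
end
```

Because the dead connection is always discarded before re-raising, a later request starts with a fresh connection even when the retry itself failed, which is the property that stops a process from staying broken after the server comes back.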
In doing this I've also cleaned up a bit of the codebase to provide for the refactoring that implements the retry-once mechanism. I've integrated the two apparently-equivalent patterns that were used to handle locking when multithreading is on and factored that into the mechanism too, so there's less repetition of the locking code.
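As a rough illustration of folding the locking into one place, the sketch below wraps any operation in a single helper that takes the mutex only when multithreading is enabled. The names `LockingHelper`, `with_lock` and the `multithread` constructor argument are hypothetical, and this is an assumption about the general shape of such a refactoring rather than the patch's actual code.

```ruby
require 'thread'

# Hypothetical sketch only; not the locking code from the patch.
class LockingHelper
  # multithread: assumed boolean flag saying whether locking is needed at all.
  def initialize(multithread)
    @mutex = multithread ? Mutex.new : nil
  end

  # Runs the block under the mutex when multithreading is on, and
  # runs it directly otherwise. Mutex#synchronize releases the lock
  # even if the block raises, so callers need no ensure clause.
  def with_lock
    return yield unless @mutex
    @mutex.synchronize { yield }
  end
end
```

With a helper like this, every request method can call `with_lock { ... }` instead of repeating its own lock and unlock logic, and a retry wrapper can sit inside the same block.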
We've tested this new version out with a genuine breakage – specifically, using the (now fixed) Solaris libevent event port bindings to memcache, which made connections die painfully and quite regularly. With the old client, that would quickly leave us with permanently-broken mongrels (until we restarted them), but with this patched client they happily try to reconnect to the memcache server, and handle repeat errors cleanly.