I’ve used memcached quite frequently in projects; it is a great product. I heard of a memcached-compatible product called Membase about 8 months or so ago, but did not have time to try it out until last week. All I can say is WOW. I’ll never use stand-alone memcached server(s) again.
Highlights
- crazy easy to install and make a cluster.
- 0 changes to your app code. It speaks the memcached protocol seamlessly; you only need to modify app code if you want to take advantage of the advanced features.
- you can dynamically add and remove nodes without losing all your keys/data.
- 2 bucket types:
- Membase: supports data persistence (writes data to disk) and replication (if one node dies, you don’t lose your key/value pairs). It sends data to disk as fast as it can (while giving priority to getting data back from disk). This is done asynchronously (with an option for synchronous), so clients shouldn’t be able to perceive a difference between Membase and memcached data buckets. When a bucket is full, Membase buckets free up memory for new data (keeping a hot cache), while memcached buckets throw something old away.
- Memcached: no persistence or replication. All in memory. I would highly recommend going with a Membase bucket unless you have some I/O concerns (like you get charged for I/O in the cloud).
- Awesome admin web UI. You can get a feel for it in this screencast (this is for v2.0, but it’s similar). It updates in real time, gives great stats, and is intuitive.
- lots of documentation
- helpful community (I found the freenode #couchbase irc channel to be really helpful)
How I’m using Membase
I have a Membase cluster set up in EC2, all m1.smalls. I would not recommend smalls if you can afford going larger, as smalls only have one CPU core and “moderate” I/O. Because of the I/O cost ($) in EC2, and because I don’t need persistence or replication for my apps (at this point), I went with the memcached bucket type. I’d highly recommend going with a Membase bucket if you can. I’ve spoken with one of the Couchbase devs, and in practice they have not seen much performance difference between the Membase and memcached buckets.
I’ve had pretty good transactions/sec with just 2 m1.smalls – easily over 1,250/second. I’m sure I could get more no problem. I get roughly the same numbers when using membase and memcached buckets.
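For a rough sense of where that number comes from, here is a minimal PHP sketch of the kind of loop I use to eyeball ops/sec. The hostname is a placeholder for one of your cluster nodes, and this is just a sanity check, not a proper benchmark (single connection, no concurrency, no warm-up).

```php
<?php
// Rough ops/sec sanity check against one cluster node (placeholder hostname).
$mc = new Memcache();
$mc->connect('membase-node1.example.com', 11211);

$ops   = 10000;
$value = str_repeat('x', 100);           // 100-byte payload
$start = microtime(true);

for ($i = 0; $i < $ops; $i++) {
    $mc->set("bench:$i", $value, 0, 60); // 60 second TTL
    $mc->get("bench:$i");
}

$elapsed = microtime(true) - $start;
// Each iteration is one set + one get, i.e. two operations.
printf("%.0f ops/sec\n", ($ops * 2) / $elapsed);
```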
I have 2 autoscale groups set up to keep my cluster highly available AND easily scale up/down based on load. Essentially, I have one “master” node that I assign an Elastic IP. It’s in an autoscale group of min 1/max 1, so if there are problems it gets replaced and automatically gets the EIP. I then have a second autoscale group that on startup joins the cluster via the master node’s EIP. This is a super cheap way of getting enterprise-class HA and scale: four data centers (availability zones) and “IP failover”.
I could have the cluster automatically scale up based on load, but I decided not to go this route because when the cluster topology changes it needs to be “rebalanced”. This can be a resource-intensive task, so I would not want it happening while I was under load (the reason it scaled up in the first place). For now I just monitor load manually; scaling up is one command to add a node and the push of a button to rebalance (when I deem it’s a good time).
If you do decide to run Membase in the cloud, you must read the Membase in the cloud docs.
How my app connects to Membase
My apps that consume the Membase cluster are PHP-based. Specifically, I use php-fpm and nginx. FPM has some great features like fastcgi_finish_request(), but that’s another blog post in itself. Anyway, Membase ships with a great app called Moxi. It runs either client side (where your web app is) or server side (on the Membase cluster nodes). If you want to get up and running fast (without touching your existing app), just point your app at any server in the Membase cluster (port 11211 by default) and it will use the moxi on the Membase server without you even knowing.
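Here is a minimal sketch of that zero-change setup, using the standard PHP Memcache extension; the hostname is a placeholder for whichever cluster node you point at:

```php
<?php
// Exactly the same code you'd write for plain memcached; the only change
// is the hostname, which now points at any Membase node (placeholder below).
// The node's bundled moxi answers on 11211 and routes the key for you.
$mc = new Memcache();
$mc->connect('membase-node1.example.com', 11211);

$mc->set('user:42:profile', json_encode(array('name' => 'Jane')), 0, 300);
$profile = json_decode($mc->get('user:42:profile'), true);
```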
If you want to improve performance and take advantage of connection pooling, check out client-side moxi. Client-side moxi stays in contact with the cluster, so if any node goes down it handles it transparently for you. Also, because it stays in contact with the cluster, when interacting with a key it goes directly to the correct node that has that key (performance++). Client-side moxi also has connection pooling, which is great for FastCGI processes. After installing client-side moxi, just have your app make a persistent memcache connection to localhost:11211 and you’re done. Moxi acts as a proxy and does all the work.
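And here is roughly what that looks like from the app’s side, again with the PHP Memcache extension; it assumes you’ve already installed and configured a local moxi listening on 11211:

```php
<?php
// With client-side moxi running locally, the app only ever talks to
// localhost; moxi tracks cluster topology and routes each key to the
// node that owns it.
$mc = new Memcache();
$mc->pconnect('127.0.0.1', 11211);   // persistent connection, reused across FPM requests

$mc->set('session:abc123', 'serialized-session-data', 0, 1800); // 30 minute TTL
$value = $mc->get('session:abc123');
```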
In conclusion …
- If you use memcached servers, using Membase in their place is a no-brainer. The community version is free (if you can adhere to the open-source license). You can get everything up and running in < 5 minutes on Debian, RPM, or Windows-based machines.
- When creating buckets, you should use Membase-type buckets. Using memcached buckets does not make sense in most cases.
- Spend the time to research client-side moxi. Also, you may be using a language that supports a type 2 Membase client; if so, you may want to use it instead of moxi.
I’m still not a Membase expert, so if you have any thoughts/improvements, please leave me a comment.