Best Practices for Implementing Memcached

In recent weeks, more details about Facebook's implementation and scale of use of Memcached have been astounding the tech community. First, HighScalability.com reported on a presentation at UC San Diego given by Facebook's VP of Technology earlier this month, noting that:
"Their caching tier services 120 million queries every second and it's the core of the site. The problem is Memcached is hard to use because it requires programmer cooperation. It's also easy to corrupt. They've developed a complicated system to keep data in the caching tier consistent with the database, even across multiple distributed data centers."

Last week at Web 2.0 Summit, Facebook's VP of Engineering, Mike Schroepfer, shared even more numbers about the popular site's operations, which can be found in detail on TechCrunch. Many things jumped out at me, but since we are in the business of commercial Memcached and helping websites scale out, this bullet was especially interesting: "50 million operations a second via Memcached - We scaled Memcached 5 X its original performance – we rewrote it."

What Facebook has done with Memcached for itself is similar to what Gear6 has done with our own distribution for those websites that may not necessarily want to be in the business of tweaking and re-writing Memcached for themselves. We've added features for high availability such as replication and clustering, and we've replaced the static slab allocator with one that dynamically optimizes memory use for a smaller footprint.

Still, even with these features baked in, there are basic best practices for implementing and managing Memcached -- whether you go with our commercial distribution or go it alone. The following are the top best practices we've learned from helping over a dozen highly trafficked sites and networks implement Gear6 Web Cache for their caching tier.

1. If you are using MySQL, use the Memcached UDFs (user-defined functions) for MySQL inside your queries.

2. Use 64-bit servers with more than 4 GB of memory. Install as much memory as you can afford. 32-bit machines are a waste of power and rack space.

3. Cache "expensive operations". Expensive means anything that takes a while to run or that is run often.

4. Profile your applications to optimize the size of each Memcached instance and pool.   This will help optimize memory allocation and prevent inefficient distribution of cached objects.

5. Cache bi-directionally. When you update the database, also update the cache.  When you have a cache miss, read from the database, then write it to the cache for next time.

6. Design to withstand failures gracefully. Handle cache misses, as described in #5. Handle losing a cache node by using consistent hashing, so that only the keys stored on the lost node have to be reloaded. With the Gear6 HA feature, losing a cache node becomes much less likely.

7. Use consistent hashing. This is "baked in" to clients based on libmemcached.

8. Be careful with the number of connections. If you have a lot of application servers and a lot of Memcached servers, a fully meshed set of connections can cause N-squared scaling problems, because every application server holds a connection to every Memcached server. Watch that carefully.

9. Monitor your eviction stats.  If you start getting evictions of "hot" data, then Memcached can actually start slowing you down! When this happens, get more memory and/or more nodes.

10. Once again, have enough memory. You want to have your entire "hot" working set sitting in Memcached.

11. Monitor your caches and caching statistics. Sudden changes in these numbers should be treated seriously and tracked down.
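
Practices 5 through 7 can be sketched together in a few lines of Python. This is a minimal illustration, not how libmemcached or Gear6 Web Cache implements anything: plain dicts stand in for the database and for each Memcached server, and the names `HashRing`, `CachedStore`, `points_per_node`, and `drop_node` are hypothetical. It shows a read that falls back to the database on a miss, a write that updates both database and cache, and a consistent hash ring on which losing a node only moves the keys that node held.

```python
import bisect
import hashlib


def _hash(value):
    # Any stable hash works; MD5 is used here only for its even spread.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hash ring: each node owns many points on a circle."""

    def __init__(self, nodes, points_per_node=100):
        self.points_per_node = points_per_node
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.points_per_node):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [point for point in self._ring if point[1] != node]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("no cache nodes available")
        # First point at or after the key's hash, wrapping around the circle.
        idx = bisect.bisect(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


class CachedStore:
    """Bi-directional caching in front of a database (both are dicts here)."""

    def __init__(self, database, node_names):
        self.database = database
        self.nodes = {name: {} for name in node_names}  # stand-in servers
        self.ring = HashRing(node_names)

    def _cache(self, key):
        return self.nodes[self.ring.node_for(key)]

    def read(self, key):
        cache = self._cache(key)
        if key in cache:
            return cache[key]
        value = self.database[key]  # cache miss: read from the database
        cache[key] = value          # then write it to the cache for next time
        return value

    def write(self, key, value):
        self.database[key] = value   # update the database...
        self._cache(key)[key] = value  # ...and also update the cache

    def drop_node(self, name):
        # Simulates losing a cache node; its keys become misses elsewhere.
        self.ring.remove(name)
        del self.nodes[name]
```

After `drop_node`, reads for keys that lived on the lost node simply miss on their new node and are reloaded from the database, while keys on the surviving nodes stay where they were. A real deployment would use a Memcached client library instead of dicts and would also have to decide on expiry times and on what to do when the database write succeeds but the cache update fails.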
