<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>metaduck &#187; Performance</title>
	<atom:link href="http://www.metaduck.com/category/performance/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.metaduck.com</link>
	<description></description>
	<lastBuildDate>Wed, 05 May 2010 12:04:43 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Rails cache in distributed environment</title>
		<link>http://www.metaduck.com/2009/10/rails-cache-in-distributed-environment/</link>
		<comments>http://www.metaduck.com/2009/10/rails-cache-in-distributed-environment/#comments</comments>
		<pubDate>Wed, 07 Oct 2009 10:01:55 +0000</pubDate>
		<dc:creator>Pedro Teixeira</dc:creator>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[Rails]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://localhost/metaduck/?p=22</guid>
		<description><![CDATA[Page and fragment caching are life-savers for Rails application scalability. Page caching in particular can make your app fast, especially if you use a web server like Nginx that serves static files directly without touching the Rails stack.
But maintaining cache consistency across a distributed Rails application can be challenging.

When page caching, Rails writes page result in [...]]]></description>
			<content:encoded><![CDATA[<p>Page and fragment caching are life-savers for Rails application scalability. Page caching in particular can make your app fast, especially if you use a web server like <a href="http://nginx.net/">Nginx</a> that serves static files directly without touching the Rails stack.</p>
<p>But maintaining cache consistency across a distributed Rails application can be challenging.</p>
<p><span id="more-22"></span></p>
<p>With page caching, Rails writes the rendered page to a static file in the public folder (when using the default options), allowing the web server to serve it directly.</p>
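<p>To make the mapping from request path to cache file concrete, here is a minimal plain-Ruby sketch that approximates the default behaviour. The helper name <code>page_cache_file</code> is an assumption made for illustration; it is not the Rails API, and the real rules may differ between Rails versions.</p>

```ruby
# Approximates how page caching picks a file name for a request path.
# "public" and ".html" are the Rails defaults; page_cache_file itself is a
# hypothetical helper for illustration, not a Rails method.
def page_cache_file(request_path, public_dir = "public", extension = ".html")
  name = if request_path.empty? || request_path == "/"
    "/index"
  else
    request_path.chomp("/")
  end
  # Only append the extension when the last path segment has none.
  last_segment = name.split("/").last || name
  name += extension unless last_segment.include?(".")
  File.join(public_dir, name)
end
```

<p>With these rules, a request for <code>/posts/1</code> would be stored as <code>public/posts/1.html</code>, which is the file that has to disappear on every box when the page expires.</p>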
<h2>Expiring cache</h2>
<p>Cache expiration must be done explicitly by your app using the <a href="http://guides.rubyonrails.org/caching_with_rails.html">expire_page method</a>. Do this whenever your models change (creations, deletions and updates); each change may affect one or more pages, depending on your app. The expiration calls belong in sweepers, as <a href="http://guides.rubyonrails.org/caching_with_rails.html#sweepers">explained here</a>.</p>
<h2>Distributed environment</h2>
<p>What about when you are using more than one box to serve your Rails app? When one box calls expire_page, it only cleans its local cache, leaving the caches on the other boxes inconsistent.</p>
<h2>Solutions</h2>
<p>There are several solutions to this. Let's look at them:</p>
<h3>1. DRb cache store</h3>
<p>The DRb (distributed Ruby) cache store relies on a single shared process to communicate your cache decisions. This is not a good solution because:</p>
<ul>
<li>there is a single point of failure: the dRb process</li>
<li>web servers generally can't talk to DRb; even if they could, serving static files locally is much faster</li>
</ul>
<h3>2. Memcache Store</h3>
<p>Using a <a href="http://www.danga.com/memcached/">memcached</a> service is one good solution. Memcached can be clustered and load balanced, and it is pretty fast. But if you are using a distributed Rails environment mainly for the sake of redundancy, or don't want to complicate the environment setup, don't use the memcache store.</p>
<h3>3. Cron-based expiration</h3>
<p>You can expire the cache on a schedule. This can be enough for some applications, but not for those that need to keep tight cache consistency.</p>
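<p>As a rough sketch of scheduled expiration, the script below deletes cached pages older than a given age and could be run from cron on each box. It assumes page caching writes to a dedicated directory rather than straight into the public folder, since otherwise it would also delete static files such as error pages. The name <code>expire_stale_pages</code> is made up for this example.</p>

```ruby
require "find"

# Deletes cached .html pages under cache_dir that are older than max_age
# seconds and returns the list of deleted files.
# Assumption: cache_dir holds only generated pages, nothing hand-written.
def expire_stale_pages(cache_dir, max_age, now = Time.now)
  deleted = []
  Find.find(cache_dir) do |file|
    next unless File.file?(file) && File.extname(file) == ".html"
    next unless now - File.mtime(file) > max_age
    File.delete(file)
    deleted.push(file)
  end
  deleted
end
```

<p>Calling <code>expire_stale_pages("/path/to/page_cache", 600)</code> every few minutes bounds staleness at roughly ten minutes plus the cron interval, which illustrates why this approach cannot give tight consistency.</p>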
<h2>4. Build your own distributed cache cleaning</h2>
<p>In this solution, your model cache sweepers are responsible for cleaning the cache (deleting page cache files) on the other machines.</p>
<p>But how does one machine contact the other machines?</p>
<p>One solution I came up with involves every machine having a Mongrel server listening on a public TCP port. (By public, I mean accessible to the other machines in the cluster. This is not a service that you want to expose to the internet.)</p>
<p>This HTTP service is there just to listen for cache expiration events. It accepts, as arguments, the paths of the page cache files to expire.</p>
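<p>The core of such a service could look like the sketch below: given the paths received in a request, it deletes the matching files under the cache root. The HTTP layer is left out, and the name <code>expire_cached_paths</code> and its return format are assumptions for illustration. Because the paths come from the network, the sketch refuses anything that resolves outside the cache root.</p>

```ruby
# Core of a hypothetical expiration endpoint: deletes the cache files that
# match the given request paths (for example "/posts/1.html").
# Paths that resolve outside cache_root are rejected instead of deleted.
def expire_cached_paths(cache_root, paths)
  root = File.expand_path(cache_root)
  paths.each_with_object({ expired: [], rejected: [] }) do |path, result|
    target = File.expand_path(File.join(root, path))
    if target.start_with?(root + File::SEPARATOR)
      # A missing file is fine: the page is simply not cached on this box.
      File.delete(target) if File.file?(target)
      result[:expired].push(path)
    else
      result[:rejected].push(path)
    end
  end
end
```

<p>Treating a missing file as success keeps the operation idempotent, so a sender can safely repeat a request after a timeout.</p>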
<h3>Security concerns</h3>
<p>This service can be implemented on your Rails app, but it should not be accessible to</p>
<h2>Drawbacks</h2>
<p>There are several problems with this approach:</p>
<h3>1. Every machine must know all the others</h3>
<p>For one machine to contact the others when an expiration must occur, every machine must know about all the other machines. This can be challenging using Rails config, but can be done in Capistrano tasks.</p>
<h3>2. It does not scale well</h3>
<p>Every time you add a machine you are increasing the cache expiration cost.</p>
<h3>3. Fault tolerance</h3>
<p>When you expire a cached page, you must contact every other box. If the cache expiration service on one box is down, the cache expiration will fail. Error handling must be done carefully, with a fall-back mechanism such as putting the cache expiration command on a queue.</p>
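<p>One way to structure that error handling is sketched below. The function contacts every peer, does not stop at the first failure, and returns the peers it could not reach so they can be queued for a retry. The <code>notifier</code> argument is a stand-in for whatever performs the HTTP request; both it and <code>notify_peers</code> are hypothetical names.</p>

```ruby
# Notifies every peer about the expired paths and returns the peers that
# could not be reached, so the caller can retry them later.
# `notifier` is any callable taking (peer, paths); it stands in for the
# HTTP request to the peer's expiration service.
def notify_peers(peers, paths, notifier)
  peers.reject do |peer|
    begin
      notifier.call(peer, paths)
      true   # reached: drop from the failure list
    rescue StandardError
      false  # unreachable: keep in the failure list
    end
  end
end
```

<p>The caller can then push each failed peer, together with the paths, onto a retry queue instead of failing the whole request. Note that the cost still grows with the number of boxes, since each expiration makes one call per peer.</p>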
<h2>A better solution</h2>
<p>A better solution is to make cache expiration events asynchronous. When a page is expired, an event is triggered on a channel that every other box is listening on.</p>
<p>This can be achieved using UDP broadcasts, and having every box listening on this UDP port.</p>
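<p>A minimal sketch of the UDP idea, using only Ruby's standard <code>socket</code> library, might look like this. The port number, the newline-separated message format and all function names are assumptions for illustration, and the sketch has no authentication, so it only makes sense on a trusted private network.</p>

```ruby
require "socket"

EXPIRE_PORT = 9127 # hypothetical port; any free UDP port would do

# Encodes the cache paths as one newline-separated datagram.
def encode_expiration(paths)
  paths.join("\n")
end

# Decodes a datagram back into a list of cache paths.
def decode_expiration(datagram)
  datagram.split("\n").reject(&:empty?)
end

# Fire-and-forget: sends the paths to every box on the local network segment.
def broadcast_expiration(paths, port = EXPIRE_PORT, address = "255.255.255.255")
  socket = UDPSocket.new
  socket.setsockopt(Socket::SOL_SOCKET, Socket::SO_BROADCAST, true)
  socket.send(encode_expiration(paths), 0, address, port)
ensure
  socket.close if socket
end

# Blocks forever, yielding the list of paths from each datagram received.
def listen_for_expirations(port = EXPIRE_PORT)
  socket = UDPSocket.new
  socket.bind("0.0.0.0", port)
  loop do
    datagram, _sender = socket.recvfrom(65_507)
    yield decode_expiration(datagram)
  end
end
```

<p>Each box would run <code>listen_for_expirations</code> in a background process and delete the files it is told about, while the sender no longer needs a list of peers. Since UDP gives no delivery guarantee, a box that is down or drops the datagram simply misses the event.</p>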
<h3>Drawbacks (again)</h3>
<p>A fall-back mechanism must still be in place, though, in case one box is down during the cache expiration event, which would leave its cache inconsistent.</p>
<p>This can be done using some kind of persistent message queue instead of UDP broadcasts, but I think this would be overkill for most applications.</p>
<p>Expect to hear from me soon regarding the implementation of this solution!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.metaduck.com/2009/10/rails-cache-in-distributed-environment/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
