<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-9149488327872994243</id><updated>2011-07-25T11:47:06.342+01:00</updated><category term='design'/><category term='architecture'/><category term='cache'/><category term='software'/><title type='text'>Get it done</title><subtitle type='html'>Software architecture - simpler.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://getitdone-simpler.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://getitdone-simpler.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Pranshu Jain</name><uri>http://www.blogger.com/profile/01238353222080252804</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>8</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-9149488327872994243.post-2102030904787410463</id><published>2008-08-08T12:53:00.001+01:00</published><updated>2008-08-08T12:53:19.369+01:00</updated><title type='text'>test widget</title><content type='html'>&lt;div&gt;none&lt;/div&gt;&lt;br /&gt;&lt;script type="text/javascript" 
src="http://cs44.clearspring.com/o/4702b4df5a93e1de/489c33add812f24e/4702b4df69cf5019/299c48f6/widget.js"&gt;&lt;/script&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9149488327872994243-2102030904787410463?l=getitdone-simpler.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9149488327872994243&amp;postID=2102030904787410463' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/2102030904787410463'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/2102030904787410463'/><link rel='alternate' type='text/html' href='http://getitdone-simpler.blogspot.com/2008_08_01_archive.html#2102030904787410463' title='test widget'/><author><name>Pranshu Jain</name><uri>http://www.blogger.com/profile/01238353222080252804</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9149488327872994243.post-1387336134257275612</id><published>2006-08-30T13:02:00.000+01:00</published><updated>2006-08-30T13:03:17.116+01:00</updated><title type='text'>Linking to Technocrati</title><content type='html'>&lt;a href="http://www.technorati.com/claim/6q8q2gjf" rel="me"&gt;Technorati Profile&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9149488327872994243-1387336134257275612?l=getitdone-simpler.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9149488327872994243&amp;postID=1387336134257275612' title='1 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/1387336134257275612'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/1387336134257275612'/><link rel='alternate' type='text/html' href='http://getitdone-simpler.blogspot.com/2006_08_01_archive.html#1387336134257275612' title='Linking to Technocrati'/><author><name>Pranshu Jain</name><uri>http://www.blogger.com/profile/01238353222080252804</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9149488327872994243.post-968302054078859594</id><published>2006-08-27T18:23:00.000+01:00</published><updated>2006-08-27T19:44:11.443+01:00</updated><title type='text'>Scheduling jobs in a web farm  Or Clustered services</title><content type='html'>I have migrated my blogs to &lt;a href="http://pranshujain.wordpress.com/"&gt;pranshujain.wordpress.com&lt;/a&gt;. This article goes &lt;a href="http://pranshujain.wordpress.com/2006/08/29/scheduling-jobs-in-a-web-farm-or-clustered-services/"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I had asked a question &lt;a href="http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=557490&amp;amp;SiteID=1"&gt;here&lt;/a&gt; about this topic. The answer I got really surprised me. I was told to consider Grid computing. 
Anyway, my thoughts on the topic are below.&lt;br /&gt;&lt;br /&gt;Even though we write 3-tier applications that are clustered and load balanced using hardware load &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0" onclick="BLOG_clickHandler(this)"&gt;balancers&lt;/span&gt;, we are often left wondering what to do about scheduled jobs and long-running jobs, which cannot be web pages.&lt;br /&gt;The options that we face are:&lt;br /&gt;1) Make them &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1" onclick="BLOG_clickHandler(this)"&gt;SQL&lt;/span&gt; Server jobs. Databases are typically clustered, and even when they are not, they must be available for the application to work, so it makes sense to tie the job's failure dependency to the database. The challenge here is that it is not advisable to run custom &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2" onclick="BLOG_clickHandler(this)"&gt;DLLs&lt;/span&gt; on a potentially shared database machine.&lt;br /&gt;2) Make them &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3" onclick="BLOG_clickHandler(this)"&gt;EXEs&lt;/span&gt; triggered by the Windows scheduler. The problem here is that the Windows scheduler is not cluster-aware. If we go this way, we either need to live with manual &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_4" onclick="BLOG_clickHandler(this)"&gt;failover&lt;/span&gt; of the jobs, or we need to schedule the job on multiple machines and implement some kind of locking, possibly using the database, to ensure that only one instance of the job runs at a time.&lt;br /&gt;3) Look at a clustered scheduler, including the Windows cluster APIs if you have OS-level clustering at the web/app server level. In my experience, it is rare to have a cluster at this level, but if there is one, you should be ready to exploit it. 
All OS clusters, including &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_5" onclick="BLOG_clickHandler(this)"&gt;Veritas&lt;/span&gt;, provide programming interfaces for cluster services. You usually have two options. First, you can make the job/service a part of the machine &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_6" onclick="BLOG_clickHandler(this)"&gt;healthcheck&lt;/span&gt;, so that if your job fails, the cluster fails over. Second, you can make the job run only on the primary node of the cluster. The second option is probably the one we are looking for. There are third-party clustered schedulers available, mostly commercial.&lt;br /&gt;4) Windows services: here again, we can take advantage of OS cluster services to make the service run only on the primary node of the cluster. Alternatively, we can code a lock at the database level so that only one service is active.&lt;br /&gt;5) Grid computing APIs: grid computing tools, acting as glorified schedulers, can ensure that the job runs successfully once and only once.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9149488327872994243-968302054078859594?l=getitdone-simpler.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9149488327872994243&amp;postID=968302054078859594' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/968302054078859594'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/968302054078859594'/><link rel='alternate' type='text/html' href='http://getitdone-simpler.blogspot.com/2006_08_01_archive.html#968302054078859594' title='Scheduling jobs in a web farm  Or Clustered services'/><author><name>Pranshu 
Jain</name><uri>http://www.blogger.com/profile/01238353222080252804</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9149488327872994243.post-6078467499581565901</id><published>2006-08-27T18:16:00.000+01:00</published><updated>2006-08-27T18:22:47.052+01:00</updated><title type='text'>Getting around "You are being redirected to an unsecure site"</title><content type='html'>I have migrated my blogs to &lt;a href="http://pranshujain.wordpress.com"&gt;http://pranshujain.wordpress.com&lt;/a&gt; this item goes &lt;a href="http://pranshujain.wordpress.com/2006/08/29/getting-around-you-are-being-redirected-to-an-unsecure-site/"&gt;here&lt;/a&gt;&lt;br /&gt;Doing a Response.Redirect from an HTTPS page to an HTTP page is not considered good, as users get the warning "you are being redirected to an unsecure site".&lt;br /&gt;One way I have used to avoid this warning is to let client-side JavaScript do the redirect.&lt;br /&gt;For example, if I want to go from https://myserver/a.aspx to http://myserver/b.aspx, I do the following:&lt;br /&gt;From a.aspx, I do a Response.Redirect to https://myserver/redirect.aspx?targeturl=http://myserver/b.aspx&lt;br /&gt;Then, on redirect.aspx, a script that runs when the page loads does something like&lt;br /&gt;&amp;lt;script&amp;gt;&lt;br /&gt;window.location.href = "&amp;lt;%=targetUrl%&amp;gt;";&lt;br /&gt;&amp;lt;/script&amp;gt;&lt;br /&gt;This makes the client browser move to the HTTP page.&lt;br /&gt;While coding such a redirect, keep in mind that the application may be deployed to different environments with different server URLs. 
Hence you may want to read the hostname ("www.myserver.com" of https://www.myserver.com/) from what the client has specified.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9149488327872994243-6078467499581565901?l=getitdone-simpler.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9149488327872994243&amp;postID=6078467499581565901' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/6078467499581565901'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/6078467499581565901'/><link rel='alternate' type='text/html' href='http://getitdone-simpler.blogspot.com/2006_08_01_archive.html#6078467499581565901' title='Getting around &quot;You are being redirected to an unsecure site&quot;'/><author><name>Pranshu Jain</name><uri>http://www.blogger.com/profile/01238353222080252804</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9149488327872994243.post-4962265385739768103</id><published>2006-08-27T11:44:00.000+01:00</published><updated>2006-08-27T11:48:12.100+01:00</updated><title type='text'>Sizing for CMS</title><content type='html'>I have migrated my blogs to &lt;a href="http://pranshujain.wordpress.com"&gt;http://pranshujain.wordpress.com&lt;/a&gt; This article goes &lt;a href="http://pranshujain.wordpress.com/2006/08/28/sizing-for-cms/"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I will try to complement Apoorv's blog at http://www.apoorv.info/ on Portals and Content Management.&lt;br /&gt;In one of the recent blogs, Apoorv mentioned sizing. 
That is one thing which I happen to have worked on a lot.&lt;br /&gt;Here are a few things I noticed about the content management systems that I worked on:&lt;br /&gt;- It is the database access that kills.&lt;br /&gt;- SQL queries on the presentation layer tend to be a lot heavier than the queries on the content management backend.&lt;br /&gt;- Unless you have a very simple presentation, it will always make sense to cache the presentation as HTML pages and serve those to customers instead of dynamic pages (of course everyone knows that).&lt;br /&gt;- If the update frequency or volume is large (let's say more than a page a minute on average) and the database gets large, even the publishing process takes its toll. It is good to have aggressive archiving for the content. In case archiving is not feasible (after all, it is a content management system), a replicated database for presentation may be the only option.&lt;br /&gt;- Coming to sizing, you are likely to get better projections by benchmarking against existing applications.&lt;br /&gt;- To benchmark against an existing application, you need the page views per second for the most frequently used dynamic pages and the database size.&lt;br /&gt;- If you have a benchmark, you can halve the expected performance for every 10-fold increase in data size.&lt;br /&gt;- Typically, if you can serve 7 pages per second with 1 CPU of application server and 2 CPUs of database server for a non-cached application, it is considered good performance. CMS pages which are updating only "one content item" can typically achieve this kind of performance for a database holding 10 to 50 thousand content items.&lt;br /&gt;- XML processing is usually a big killer, so if you are transferring large structured documents using web services, and a document is expected to be larger than 20 KB, then you really have to look at the performance. 
As per a benchmark I am running now, 1 CPU can consume a web service returning 1 MB of data only 3 times a second. This is with no processing at all, just a web service call using a regular SOAP client. So as a rule of thumb, if you are making web service calls, a good starting target is half the above benchmark of 7 pages per second per CPU on the app server.&lt;br /&gt;- Some CMSs have object or XML databases. I am not sure how you can size for them once the content grows beyond a certain size.&lt;br /&gt;- Search engines fall in a different league. I am not sure how to size for them.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9149488327872994243-4962265385739768103?l=getitdone-simpler.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9149488327872994243&amp;postID=4962265385739768103' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/4962265385739768103'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/4962265385739768103'/><link rel='alternate' type='text/html' href='http://getitdone-simpler.blogspot.com/2006_08_01_archive.html#4962265385739768103' title='Sizing for CMS'/><author><name>Pranshu Jain</name><uri>http://www.blogger.com/profile/01238353222080252804</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9149488327872994243.post-1599272281855944425</id><published>2006-08-26T22:25:00.000+01:00</published><updated>2006-08-26T22:56:24.931+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category 
scheme='http://www.blogger.com/atom/ns#' term='cache'/><category scheme='http://www.blogger.com/atom/ns#' term='architecture'/><title type='text'>Cache - access and expiry</title><content type='html'>I have migrated my blogs to &lt;a href="http://pranshujain.wordpress.com"&gt;http://pranshujain.wordpress.com&lt;/a&gt; this article goes &lt;a href="http://pranshujain.wordpress.com/2006/08/27/cache-access-and-expiry/"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;In my previous post on &lt;a href="http://getitdone-simpler.blogspot.com/2006_08_01_archive.html#2032513534029848048"&gt;cache implementation&lt;/a&gt; I talked about where to keep the cache. Now I will talk about how to access the cache and how to expire it.&lt;br /&gt;&lt;br /&gt;Before we go into those topics, it is very important to consider how much stale data we can tolerate, since that determines how far we can optimize.&lt;br /&gt;Let's say we cannot tolerate stale data at all: say the application sells a hotel room priced in Thai &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0" onclick="BLOG_clickHandler(this)"&gt;Baht&lt;/span&gt;, converted to Sterling. 
Now, depending on the rate I get from my bank and the date of stay, I get a price.&lt;br /&gt;So &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_1"&gt;every time&lt;/span&gt; anyone requests a room, I have to check the price like&lt;br /&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2" onclick="BLOG_clickHandler(this)"&gt;PriceInGBP&lt;/span&gt;=&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3" onclick="BLOG_clickHandler(this)"&gt;PriceInTHB&lt;/span&gt;* ( Select &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_4" onclick="BLOG_clickHandler(this)"&gt;ExchangeRate&lt;/span&gt; where &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_5" onclick="BLOG_clickHandler(this)"&gt;fromCurrency&lt;/span&gt;=&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_6" onclick="BLOG_clickHandler(this)"&gt;THB&lt;/span&gt; and &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_7" onclick="BLOG_clickHandler(this)"&gt;ToCurrency&lt;/span&gt;=&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_8" onclick="BLOG_clickHandler(this)"&gt;GBP&lt;/span&gt; and &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_9" onclick="BLOG_clickHandler(this)"&gt;ValidityFromDate&lt;/span&gt;&amp;lt;=12Jan and ValidityToDate &gt;= 12Jan and isLive=True)&lt;br /&gt;- Since I am selling for a future date, I have to look up the rate valid on that date.&lt;br /&gt;Now if I cannot tolerate stale data, the best I can do is to put a trigger on the exchange rate table to update an ExchangeRateVersionNumber table. 
The exchange rate version number is updated on any change anywhere in the exchange rate table.&lt;br /&gt;Thus my application changes to&lt;br /&gt;If CachedExchangeRateVersionNumber = select * from ExchangeRateVersionNumber, use the cached exchange rate, else run the above query to fetch the exchange rate.&lt;br /&gt;Here the query on the database, and hence the load on the database, is much smaller, which aids scalability.&lt;br /&gt;&lt;br /&gt;However, at least one query needs to be fired every time.&lt;br /&gt;Now let's assume that we could tolerate stale data for 1 minute;&lt;br /&gt;we could change getExchangeRate to&lt;br /&gt;If CacheExchangeRateTimeStamp &gt;= currentDateTime - 1 minute, then use the cached rate, otherwise do the above.&lt;br /&gt;This results in significantly faster execution, as it saves the round trips to the database.&lt;br /&gt;&lt;br /&gt;Let's now take the scenario where the exchange rate table is updated by a feed from my bank, instead of being entered manually. Here, if I cache, I have to tolerate stale data, even if only for a minute. There can be no trigger (unless supported by my bank) to help check validity.&lt;br /&gt;&lt;br /&gt;Now let's come back to accessing the cache. We want to read from the cache simultaneously from multiple threads; however, while the cache is being updated, we need all the reading threads to wait.&lt;br /&gt;While implementing such a lookup, we need to be careful that we are not locking/synchronizing while reading - i.e. 
we must not force different objects to read in sequence.&lt;br /&gt;&lt;br /&gt;In a typical implementation, I would prefer a hashtable, as it already provides such thread-safety (provided there is only one writing thread).&lt;br /&gt;&lt;br /&gt;The granularity of this lock should be the same as the granularity of the cache update.&lt;br /&gt;&lt;br /&gt;Now looking at expiry:&lt;br /&gt;We have already discussed that in some cases we need a garbage-collector-like daemon clearing the cached items. This is also required for an LRU cache.&lt;br /&gt;If the number of items in the cache is limited, we don't need such a daemon. The expiry check happens while fetching from the cache.&lt;br /&gt;In some other cases, we will expire and seed the cache at pre-defined times.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Here are some questions which I answered in a recent post on MSDN forums:&lt;br /&gt;&lt;strong&gt;1. At what moment should we invalidate the cache: at the start of the transaction, just before updating the database, or at any other time?&lt;br /&gt;&lt;/strong&gt;It really depends on what kind of data you are caching. If you are caching flight information, as in the Expedia example above, the transaction is not in your control, hence you have no option but to have a slight tolerance for stale data.&lt;br /&gt;Assume your own system is updating the cached data and that the data lives in a common datastore like a SQL database. If I also assume that we are talking about a clustered environment, there are multiple instances of the cache: one per app-pool per machine. 
Hence the component updating the data that is eventually cached has no way of reliably marking the cache invalid in all the different instances of the in-memory cache.&lt;br /&gt;This forces all the different caches to have their own "listeners" checking for cache updates. Leaving patterns aside and talking only about implementation: if your application can tolerate stale data, you could have a background thread updating the cache periodically from the database. If your application cannot tolerate a stale cache, you will have to check the DB status before each read from the cache. In the DB you could use triggers to record the cache status in a single-cell table (let's say a last-updated timestamp), which the cache compares with its own and refreshes itself if required. This way, the load on the database is much lower than it would be if the entire cache were retrieved each time.&lt;br /&gt;So in this case, updating the cache-invalid flag as a part of the atomic transaction will help.&lt;br /&gt;&lt;strong&gt;2. When should we lock the cache?&lt;br /&gt;&lt;/strong&gt;In either case, we should lock the cache while the update check or the update is happening. The granularity of the lock depends on the granularity at which we want to refresh the data.&lt;br /&gt;&lt;strong&gt;3. How should we overcome the data retrieval latency issue which ultimately leads to stale data for some time in the cache?&lt;br /&gt;&lt;/strong&gt;For externally maintained data, it may not be possible. For self-maintained data, a sample implementation is discussed above.&lt;br /&gt;&lt;strong&gt;4. 
If a thread is making some changes to data in the database, how can we make sure that the cache gets refreshed so that the other thread gets the fresh value?&lt;br /&gt;&lt;/strong&gt;I think I get your dilemma now. I would say keep the cache as a static hashtable or equivalent and always get data from it for all your threads (something like (MyObject)Cache.getFromCache(itemid)), and lock the getFromCache method while you are checking for updates.&lt;br /&gt;However, if you are looking at code which is definitely going to be non-clustered, you could put data in threads and have events and delegates trigger cache updates, implementing an observer pattern or a state machine. Use a state machine if you have a dependency like&lt;br /&gt;Thread A has cached data-&gt; Class B depends on that state of class A-&gt; class C depends on state of class B&lt;br /&gt;where you want the change to be &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_10"&gt;propagated&lt;/span&gt; all the way to C before you release the locks.&lt;br /&gt;&lt;strong&gt;5. Should we update the cache inside the transaction or after completion of the transaction (in both scenarios of wanting and not wanting stale data)?&lt;br /&gt;&lt;/strong&gt;It should be a part of the atomic transaction in either case.&lt;br /&gt;&lt;strong&gt;6. If one thread is reading or writing a value to the cache, how do we block other threads from accessing that cache object?&lt;br /&gt;&lt;/strong&gt;Keep the cache in one static class and lock the read-from-cache method **when the update or check for update is happening** (I am not talking about synchronous, one-at-a-time access). 
Now you could do an explicit lock of the object/method, or you could use a &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_11" onclick="BLOG_clickHandler(this)"&gt;collection&lt;/span&gt; like &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_12" onclick="BLOG_clickHandler(this)"&gt;Hashtable&lt;/span&gt;.Synchronized, which already has the implementation and locks all &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_13" onclick="BLOG_clickHandler(this)"&gt;getters&lt;/span&gt; when setters are happening.&lt;br /&gt;I would say the following will fit your purpose best, based on what I think your requirement is (self-maintained data in a database, potential clustering, potentially no tolerance for stale data, &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_14" onclick="BLOG_clickHandler(this)"&gt;TTL&lt;/span&gt; and not &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_15" onclick="BLOG_clickHandler(this)"&gt;LRU&lt;/span&gt;):&lt;br /&gt;1) Choose something like a static synchronized &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_16" onclick="BLOG_clickHandler(this)"&gt;hashtable&lt;/span&gt; for your cache.&lt;br /&gt;2) For a cluster, you would need to poll to get changes.&lt;br /&gt;3) Poll for changes at a timeout or at every get, depending on whether you can tolerate stale data or not.&lt;br /&gt;4) Define a coarse granularity of cache expiry (otherwise polling for cache expiry can offset any gains from having the cache in the first place).&lt;br /&gt;If I were writing a generic cache, I would probably go for three different implementations (externally maintained items, self-maintained items and &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_17" onclick="BLOG_clickHandler(this)"&gt;LRU&lt;/span&gt; cache) and maybe a fourth one for non-clustered applications, like games.&lt;br /&gt;&lt;br /&gt;Do post a comment and I will try to keep up with responses.&lt;div class="blogger-post-footer"&gt;&lt;img 
width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9149488327872994243-1599272281855944425?l=getitdone-simpler.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9149488327872994243&amp;postID=1599272281855944425' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/1599272281855944425'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/1599272281855944425'/><link rel='alternate' type='text/html' href='http://getitdone-simpler.blogspot.com/2006_08_01_archive.html#1599272281855944425' title='Cache - access and expiry'/><author><name>Pranshu Jain</name><uri>http://www.blogger.com/profile/01238353222080252804</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9149488327872994243.post-2032513534029848048</id><published>2006-08-26T17:30:00.000+01:00</published><updated>2006-08-26T17:53:26.399+01:00</updated><title type='text'>Cache - Implementation</title><content type='html'>I have migrated my blog to &lt;a href="http://pranshujain.wordpress.com"&gt;http://pranshujain.wordpress.com&lt;/a&gt;  This article goes &lt;a href="http://pranshujain.wordpress.com/2006/08/27/cache-implementation/"&gt;here&lt;/a&gt;.&lt;br /&gt;In my previous post on &lt;a href="http://getitdone-simpler.blogspot.com/2006_08_01_archive.html#269417046337886371"&gt;Cache concepts&lt;/a&gt; I had indicated that cache are defined by how we want to expire them. 
I also talked about granularity of cache.&lt;br /&gt;&lt;br /&gt;In this post, I will look at where to keep the cache.&lt;br /&gt;&lt;br /&gt;When it comes to caching implementation, the considerations are: where should we keep the cache, how do we access the cache, and how do we keep the cache updated. Accessing and updating the cache will form topics for subsequent posts.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Where to keep the cache. &lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;We all agree that the closer the cache is to the consumer, the more optimal it is.&lt;br /&gt;Hence, for a browser-based application, the most optimal way would be to cache in the users' browsers. This can be done by having simple GET URLs and by specifying appropriate meta tags and HTTP headers indicating a cache expiry time.&lt;br /&gt;The problem with this is that it becomes unpredictable: whether the browsers cache or not, and whether they actually expire the cache when we want them to, depends on the browsers and the user settings. It is, however, by all means favorable for items which are not subject to change (typically images &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_0"&gt;do not&lt;/span&gt; change even if the text of an article does). The second problem is that content is cached per user, and not across users.&lt;br /&gt;&lt;br /&gt;The second closest place is proxy servers. Proxy servers have the same advantages and problems as browsers, and I will not go into them. 
Typically, we want to defeat the browser and proxy cache, and I will probably write a &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_1"&gt;separate&lt;/span&gt; post on that.&lt;br /&gt;&lt;br /&gt;The next nearest location is the web servers (I will go ahead and assume that we are talking about a 3-tier application which is clustered at each level).&lt;br /&gt;Here we have the option of using a daemon job to act as a user, access the dynamic page and save the &lt;span class="blsp-spelling-corrected" id="SPELLING_ERROR_2"&gt;resulting&lt;/span&gt; page as a static HTML page. The links point to this HTML page, either directly or using an &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_3" onclick="BLOG_clickHandler(this)"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_0" onclick="BLOG_clickHandler(this)"&gt;ISAPI&lt;/span&gt;&lt;/span&gt; / &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_4" onclick="BLOG_clickHandler(this)"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_1" onclick="BLOG_clickHandler(this)"&gt;NSAPI&lt;/span&gt;&lt;/span&gt; &lt;span class="blsp-spelling-error" id="SPELLING_ERROR_5" onclick="BLOG_clickHandler(this)"&gt;&lt;span class="blsp-spelling-error" id="SPELLING_ERROR_2" onclick="BLOG_clickHandler(this)"&gt;plugin&lt;/span&gt;&lt;/span&gt; which works directly on the web server.&lt;br /&gt;This is the most optimal form of caching on the server. Serving such pages requires only the same processing as a static HTML file. 
This is an ideal candidate for web sites that are accessible without login, or that have an "all or nothing" login with no access based on user profile.&lt;br /&gt;&lt;br /&gt;The next location is within the web-app. The web-app is the most popular location for caching content. Most popular web platforms provide an option to cache parts of a web page, either as part of the platform, like &lt;a href="http://msdn2.microsoft.com/en-us/library/xsbfdd8c.aspx"&gt;ASP.NET&lt;/a&gt; or the &lt;a href="http://docs.sun.com/source/817-1833-10/pwajsp.html#wp42334"&gt;JSP cache tag&lt;/a&gt;, or through third-party components like &lt;a href="http://www.opensymphony.com/oscache/"&gt;OSCache&lt;/a&gt;.&lt;br /&gt;This is highly efficient because we are caching computed HTML, which needs the least possible processing on the server, while still allowing personalized / dynamic content for the rest of the page.&lt;br /&gt;It is worth noting that such a cache is not clustered: there will be a separate instance of the cached item on each web server instance. 
This poses a challenge when refreshing an event-driven cache, since every instance has to be refreshed. It can also lead to inconsistent results for a short period with TTL caches (two users hitting different web servers could see different results).&lt;br /&gt;&lt;br /&gt;So where is such a cache applicable? Well, almost everywhere. Let's say you have an application that requires authentication and personalizes parts of the site, like navigation based on access rights. Here we can cache the "relatively" static parts of the page.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The next option is to cache objects in memory. Almost all applications cache reference data in memory, on the web-app, on the app layer, or both. The cached objects are placed in some kind of singleton or static object. The datastore is structured according to the granularity of cache expiry, and according to whether the cache holds a limited number of items or a growing / very large number of items. Let's take a few examples to understand this better.&lt;br /&gt;As in the last article,&lt;br /&gt;1) Departure boards of all London airports. Here feeds come from different airports, the number of airports is limited, and all flights for an airport get updated together. 
Here it might make sense to have one "hashtable" or similar per airport, with flights as the items within it. On receiving a new feed, the entire hashtable for that airport may be cleared. So here the cache is cleared whenever a new feed is received, and there is no need to worry about the cache running away with all available memory.&lt;br /&gt;2) Flight results at expedia.com or similar: here the flight results depend on the number and age of passengers, departure and return dates, and preferences like economy or business class. It is likely that the same query will never be repeated: if I search for today's flights, nobody will run that search from tomorrow onwards. In such a case, the cached result will just sit and eat up memory. Such cases require cached items to be explicitly recorded in a list, and the expired items in the list to be cleared by a background thread, very much like garbage collection in managed .NET code or Java. 
The datastore in such a case may be a static hashtable plus a table containing object references and expiry times, sorted by expiry time.&lt;br /&gt;3) Inventory for an e-commerce shop. The item summary, including inventory, may be cached at the search results level. On navigating to item details, or on attempting to buy, the cached inventory for the specific item may be refreshed. The datastore in this case may be something like a resultset.&lt;br /&gt;4) Reference data: in case of any change, we may want to refresh all reference data elements.&lt;br /&gt;The datastore in this case may be a static hashtable.&lt;br /&gt;&lt;br /&gt;Remember that these object caches will be one per 
server instance (1 per JVM, 1 per app pool, etc.).&lt;br /&gt;&lt;br /&gt;The final option is to cache objects in the database. There are two distinct ways in which we do that:&lt;br /&gt;1) We create materialized views and hence "cache" the table values in a form that is optimal to retrieve.&lt;br /&gt;2) Data fetched from external sources is stored in the database for subsequent use.&lt;br /&gt;&lt;br /&gt;A cache is kept in the database in several scenarios, like:&lt;br /&gt;a) No tolerance for stale data. A database cache is a single place and hence can be updated (read up on the observer pattern, state machines and related patterns) as soon as the source data gets updated.&lt;br /&gt;b) Grid computing, a 2-tier application or a very large web farm, in which case it might be sub-optimal to have the items cached at each node.&lt;br /&gt;c) Data comes from external interfaces at a very high cost, and hence one fetch per JVM/CLR may be too inefficient. 
Hence the item is cached in the database, and it may also be cached on the individual machines.&lt;br /&gt;&lt;br /&gt;In the next post on caching, I will explore accessing and updating the cache in detail, especially for event-driven caches.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9149488327872994243-2032513534029848048?l=getitdone-simpler.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://getitdone-simpler.blogspot.com/2006_08_01_archive.html#269417046337886371' title='Cache - Implementation'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9149488327872994243&amp;postID=2032513534029848048' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/2032513534029848048'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/2032513534029848048'/><link rel='alternate' type='text/html' href='http://getitdone-simpler.blogspot.com/2006_08_01_archive.html#2032513534029848048' title='Cache - Implementation'/><author><name>Pranshu Jain</name><uri>http://www.blogger.com/profile/01238353222080252804</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9149488327872994243.post-269417046337886371</id><published>2006-08-26T14:44:00.000+01:00</published><updated>2006-08-26T15:27:36.356+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='software'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='cache'/><category scheme='http://www.blogger.com/atom/ns#' term='architecture'/><title type='text'>Cache - the 
concepts</title><content type='html'>I have migrated my blogs to &lt;a href="http://pranshujain.wordpress.com"&gt;http://pranshujain.wordpress.com&lt;/a&gt;. This entry is now &lt;a href="http://pranshujain.wordpress.com/2006/08/27/cache-the-concepts/"&gt;here&lt;/a&gt;.&lt;br /&gt;In the early days of the Internet, when memory was expensive and CPUs were less powerful, when dreams were big and budgets were small, caching was perhaps the biggest buzzword in software architecture. Today, we have CPUs that are at least 10 times more powerful, memory is at least 10 times cheaper, and hardware and software can be scaled many times over ( who would have imagined Windows supporting 64 processors? 
) and yet caching is still here, and is supported out of the box by some web platforms!&lt;br /&gt;I came across &lt;a href="http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=571093&amp;amp;SiteID=1"&gt;this&lt;/a&gt; question in the MSDN architecture forum, which triggered this note.&lt;br /&gt;&lt;br /&gt;There are three distinct kinds of scenario, which need different kinds of caching and cache expiry.&lt;br /&gt;1) Most Frequently Used: Imagine that you are building an application like Amazon.co.uk. You have millions of items in the catalogue, and you can fetch the details of any of them from the database at runtime. We know that book descriptions are not going to change very often, hence we do not need to hit the database every time we show the details of a book or other item. However, the database is so large that we cannot (no, not even now) contemplate keeping all of it in the memory of the application server. The solution here is to keep the most frequently used items in the cache and remove the rest. The cache expiry algorithm here would be to remove the Least Recently Used item from the cache when it is full. This kind of cache is called an LRU cache, for Least Recently Used. 
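&lt;br /&gt;&lt;br /&gt;As a rough illustration of this expiry rule, here is a minimal Python sketch of an LRU cache. The class name LRUCache, its capacity parameter and the sample book keys are assumptions made up for this example. A real catalogue cache would also need thread safety and a loader for cache misses.&lt;br /&gt;
&lt;pre&gt;
```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` items and evicts the least recently used one."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # least recently used first, most recent last

    def get(self, key, default=None):
        if key not in self.items:
            return default
        self.items.move_to_end(key)  # a read counts as a use
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used item


cache = LRUCache(capacity=2)
cache.put("book-1", "Description of book 1")
cache.put("book-2", "Description of book 2")
cache.get("book-1")                            # book-1 becomes the most recent
cache.put("book-3", "Description of book 3")   # cache is full: book-2 is evicted
print(list(cache.items))                       # ['book-1', 'book-3']
```
&lt;/pre&gt;
Reading book-1 before adding book-3 is what saves it from eviction: the LRU cache drops book-2, the entry that has gone unused the longest.&lt;br /&gt;&lt;br /&gt;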
It is funny how the LRU cache is named after its expiry mechanism and not after how elements are cached.&lt;br /&gt;If we think a bit more, we will realize how old this concept is. Virtual memory, part of memory management in computer architecture, uses the same concept, and a quick look at Wikipedia for &lt;a href="http://en.wikipedia.org/wiki/Virtual_memory"&gt;Virtual Memory&lt;/a&gt; shows that it has been around since 1959! Phew!&lt;br /&gt;&lt;br /&gt;2) Time to Live: There is a lot of data for which resources are abundant and we can cache and manage all of it in memory. If you have subscribed to RSS feeds via any of the readers, like &lt;a href="http://www.live.com/"&gt;http://www.live.com/&lt;/a&gt; or &lt;a href="http://www.google.co.uk/ig"&gt;www.google.co.uk/ig&lt;/a&gt;, you know that the data you see when you launch these pages may not be the latest. These readers do not go to the source very frequently to get an update. They fetch the feeds and keep them in memory for about 2 hours, or whatever time you specify. This kind of caching is named "Time to Live" or TTL, again because the item in memory has a certain time to live before it expires. 
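&lt;br /&gt;&lt;br /&gt;A TTL cache can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name TTLCache, the injectable clock argument and the feed key "some-feed" are all invented for the example, and the two-hour lifetime simply mirrors the figure mentioned above.&lt;br /&gt;
&lt;pre&gt;
```python
import time


class TTLCache:
    """Entries expire `ttl_seconds` after they were stored."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock   # injectable so that expiry can be simulated
        self.items = {}      # key -> (value, expiry time)

    def put(self, key, value):
        self.items[key] = (value, self.clock() + self.ttl_seconds)

    def get(self, key, default=None):
        entry = self.items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.items[key]  # stale: forget it so the caller refetches
            return default
        return value


# Simulated clock, so the example does not have to wait two hours.
now = [0.0]
feeds = TTLCache(ttl_seconds=2 * 60 * 60, clock=lambda: now[0])
feeds.put("some-feed", ["headline A", "headline B"])

now[0] = 60 * 60               # one hour later: still served from the cache
fresh = feeds.get("some-feed")

now[0] = 3 * 60 * 60           # three hours later: the entry has expired
stale = feeds.get("some-feed")
```
&lt;/pre&gt;
This sketch only expires an entry when it is next read. Entries that are never read again would need a separate cleanup pass.&lt;br /&gt;&lt;br /&gt;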
You can see this kind of cache on eBay (you will notice that while viewing a list of items, the bid price does not always reflect the latest bids on each item, especially for items that are ending soon).&lt;br /&gt;&lt;br /&gt;3) Event based: Here we cache the data until an external event forces the cache to expire. Some online travel sites cache hotel availability information until they get an error while booking saying that there are no rooms available. This is a trigger for cache expiry. Similarly, B2C news sites have an explicit "publishing" action which clears the cache and lets users see the latest articles.&lt;br /&gt;&lt;br /&gt;In all three scenarios, the characteristics of the cache are in fact the characteristics of its expiry.&lt;br /&gt;&lt;br /&gt;The next thing to consider is the granularity of the cache.&lt;br /&gt;Many times, information comes in packets, and hence the cache should also expire in packets.&lt;br /&gt;Let's say you have an application showing the departure boards of all London airports. It is likely that you will get a feed every "x" minutes from each airport. When you get one, you want to expire the status of all flights for that particular airport and reload them. It might be too much effort to compare individual flights and change them selectively. In some cases, this processing may outweigh the benefit of the cache in the first place.&lt;br /&gt;&lt;br /&gt;It is usually easy to identify the parameters for caching, and for each distinct combination of values of those parameters we can assign a different "cache set". For example, search results will depend on the query parameters typed by the user. 
They may also depend on the language option specified by the user and any other structured search field the user selects. So the cache in that case will have all the structured search options as parameters, and if any of them changes, we cannot use the cached value and have to run the query.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I will talk more about caching and cache implementation in my next post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9149488327872994243-269417046337886371?l=getitdone-simpler.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9149488327872994243&amp;postID=269417046337886371' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/269417046337886371'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9149488327872994243/posts/default/269417046337886371'/><link rel='alternate' type='text/html' href='http://getitdone-simpler.blogspot.com/2006_08_01_archive.html#269417046337886371' title='Cache - the concepts'/><author><name>Pranshu Jain</name><uri>http://www.blogger.com/profile/01238353222080252804</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry></feed>
