Prebuilding Cache

Post Reply
foosh101
Posts: 61
Joined: Tue Oct 22, 2002 2:13 pm

Prebuilding Cache

Post by foosh101 »

We periodically rebuild our database and indexes. If possible, I would like to pre-build the cache of search results for our most common searches but am a little confused as to exactly how the caching works.

In the example for caching and vortex scripts, it uses the SQL tag to query the cache like it is a table. Is it a table, or is it just in memory on the computer the search is done on and the SQL tag just knows the difference between that and a normal query? Is there any way to build the cache on one machine, and then copy it over to several other machines? Also, how does importing new records and re-indexing tables affect the cache? Does it automatically expire, or does a page stay in cache untill I manually expire it?
User avatar
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Prebuilding Cache

Post by mark »

Caching is done by the OS. Using a file or files (by performing SQL queries for example) brings the used parts of the file(s) into cache. Using the same areas of the same files increases the chances that they will stay in the cache.

Or, if you're referring to "Caching frequently accessed results" from the tutorial, the cache is just a table of answers and could be built on one machine and copied.
User avatar
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

Prebuilding Cache

Post by Kai »

The cache example at http://www.thunderstone.com/texis/site/ ... sults.html is indeed an on-disk table in the database, like any other. (RAM tables can't store data for future invocations.) Each row of the cache table contains a URL (with the query), the results it generated, and an id/timestamp. The cache is queried like any other table with <SQL>.

You could copy the cache table to other machines with the same standard caveats about copying any table: cpdb is safest as it uses locks and can copy across different platforms. Alternatively, FTP in binary mode can be used, given the following caveats during the transfer:

o The source and destination platforms are identical (the platform string in parentheses printed by texis -version is identical)

o All modifications to the table are stopped on the source machine

o No accesses to the table at all happen on the destination machine

o The table already was created on the destination machine with the identical schema, or addtable is used afterwards

o All indexes on the table are FTPed at the same time, OR are dropped and re-made on the destination machine. Make sure there is an index on the URL/query, and one on the id field.

The cache does not automatically expire. Exactly how to do so depends on the nature and frequency of data updates, queries, etc. One common method is to periodically adelete cache entries older than a certain time, eg. every 5 minutes run "delete from cache where id < '-10 minutes'" to delete entries older than 10 minutes. More frequent queries might need more frequent deletes to keep the cache table reasonably-sized (ie. fast). You might want to modify the delete to preserve certain very-frequent queries (eg. a permanent query link on the home page).

Whenever the main table changes, the cache must be flushed as well. If the main table is batch-updated infrequently, you can just drop and re-make the cache table after updating the main table. For constantly-updated tables, it's best to just frequently delete old entries as above. The time to delete old entries can be fine-tuned based on the site; keeping older cache entries (eg. delete where id > '-30 minutes') may improve cache hits, but increases the chances of stale results to the user, and vice-versa for a younger cache.
foosh101
Posts: 61
Joined: Tue Oct 22, 2002 2:13 pm

Prebuilding Cache

Post by foosh101 »

Awesome. Very helpful info. Thanks.
Post Reply