ThumbCached
A distributed caching and storage system for small binary files and data
Version: 1.0.1
About
ThumbCached is a simple, high-performance, distributed caching and storage system for small files and data. It is typically used to store large numbers of small, non-critical, frequently read binary files or data, such as image thumbnails and user avatars (custom head icons) on a web site.
Features
- Stores a large number of small data items in two large files. This avoids the disk space that many small files waste, and makes the data simple to move and back up.
- The server and the applications can run on different machines, and multiple applications can access the same server.
- A built-in data cache speeds up access to frequently requested data.
- Uses the standard HTTP protocol, so most programming environments (such as ASP.NET, PHP, etc.) can access the service with their existing HTTP components.
- File or data information and binary content are accessed by a “key”, so there is no need to track paths and file names, which simplifies programming.
- One server can open several storage units, so different types of data (such as user avatars and image thumbnails) can be stored separately.
- Uses asynchronous sockets for high performance.
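To illustrate the first feature, the sketch below appends small blobs to a single data file and keeps an index of offsets and lengths. This is only a toy showing the general packing technique. The names PackedStore, put and get are hypothetical, the in-memory index merely stands in for the info file, and none of it reflects ThumbCached’s actual file format.

```python
import os
import tempfile


class PackedStore:
    """Toy store: blobs are appended to one data file, located via an index."""

    def __init__(self, data_path):
        self.data_path = data_path
        self.index = {}  # key -> (offset, length); a real store would persist this

    def put(self, key, blob):
        # Append the blob and remember where it starts; an update simply
        # appends a new copy and repoints the index entry.
        with open(self.data_path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(blob)
        self.index[key] = (offset, len(blob))

    def get(self, key):
        offset, length = self.index[key]
        with open(self.data_path, "rb") as f:
            f.seek(offset)
            return f.read(length)


if __name__ == "__main__":
    store = PackedStore(os.path.join(tempfile.mkdtemp(), "blocks.dat"))
    store.put("item001", b"\x01\x02\x03")
    print(store.get("item001"))
```

Because every item lives in the same file, the file system allocates space for one large file instead of rounding each small item up to a full cluster.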
Service protocol
ThumbCached communicates with applications over standard HTTP and supports three operations: add/update, fetch, and delete. The target of each operation is called a “block”, which consists of a “key”, a last modified time, and the binary data content. The “key” is a string made up of letters and numbers, for example: "item001", "23A8F001C".
1. Add or update a block
Use the HTTP POST method. URL format: "/key".
The POST request body must contain the block’s binary content. Posting to a key that already exists replaces the old content. To specify the last modified time of a block, append the time to the URL, such as "/key?20080815123000"; the time format is "yyyyMMddHHmmss". On success, the server returns HTTP status code 200 (OK) and empty content.
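The add/update request can be sketched with Python’s standard library as follows. Only the URL format and the time format come from the description above. The helper names, the base URL, and the Content-Type header are assumptions made for illustration.

```python
import re
import urllib.request
from datetime import datetime

KEY_PATTERN = re.compile(r"[A-Za-z0-9]+")  # keys are letters and numbers only


def block_url(base_url, key, when=None):
    """Build ".../key" or ".../key?yyyyMMddHHmmss" (hypothetical helper)."""
    if not KEY_PATTERN.fullmatch(key):
        raise ValueError("key must contain only letters and numbers: %r" % key)
    url = base_url.rstrip("/") + "/" + key
    if when is not None:
        url += "?" + when.strftime("%Y%m%d%H%M%S")
    return url


def add_block(base_url, key, data, last_modified=None):
    """POST the binary content and return the HTTP status code (200 on success)."""
    request = urllib.request.Request(
        block_url(base_url, key, last_modified),
        data=data,
        method="POST",
        # Assumption: the server does not require a particular content type.
        headers={"Content-Type": "application/octet-stream"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

For example, add_block("http://127.0.0.1:18500", "item001", b"\x01\x02\x03", datetime(2008, 8, 15, 12, 30, 0)) would post to "/item001?20080815123000", assuming a server is listening on that address.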
2. Fetch a block
Use the HTTP GET method. URL format: "/key".
On success, the server returns the block content, with the block’s last modified time in the "Last-Modified" HTTP response header. If the specified block does not exist, the server returns HTTP status code 404 (Not Found) and empty content.
A time value can also be specified in the URL, such as "/key?20080815123000". If the block’s last modified time is newer than the specified time, the server returns the block content. Otherwise it returns HTTP status code 304 (Not Modified) and empty content, with the block’s actual last modified time in the "Last-Modified" header. This avoids unnecessary data transmission.
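The three documented outcomes of a fetch (200, 304, 404) can be handled as in the sketch below. The function names and the returned tuple are illustrative assumptions. The "Last-Modified" value is passed through as a raw string because its exact format is not specified here.

```python
import http.client
from datetime import datetime


def fetch_path(key, if_modified_since=None):
    """Return "/key", or "/key?yyyyMMddHHmmss" for a conditional fetch."""
    if if_modified_since is None:
        return "/" + key
    return "/" + key + "?" + if_modified_since.strftime("%Y%m%d%H%M%S")


def interpret(status, body, last_modified):
    """Map a response to a (state, data, last_modified) tuple."""
    if status == 200:
        return ("found", body, last_modified)
    if status == 304:
        return ("not_modified", None, last_modified)
    if status == 404:
        return ("not_found", None, None)
    raise RuntimeError("unexpected HTTP status %d" % status)


def fetch_block(host, port, key, if_modified_since=None):
    """Issue the GET request and classify the server's answer."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", fetch_path(key, if_modified_since))
        response = conn.getresponse()
        body = response.read()
        return interpret(response.status, body, response.getheader("Last-Modified"))
    finally:
        conn.close()
```

A caller that keeps a local copy can pass the time of that copy as if_modified_since and reuse the copy whenever the state is "not_modified".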
3. Delete a block
Use the HTTP GET method. URL format: "/key/remove". On success, the server returns HTTP status code 200 (OK) and empty content.
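A delete request is a plain GET on the "/key/remove" URL, as in this minimal sketch. The helper names and base URL are assumptions. The response for a key that does not exist is not described above, so the sketch simply treats any HTTP error status as failure.

```python
import urllib.error
import urllib.request


def remove_url(base_url, key):
    """Build the ".../key/remove" URL for a block (hypothetical helper)."""
    return base_url.rstrip("/") + "/" + key + "/remove"


def remove_block(base_url, key):
    """Return True on HTTP 200, False when the server answers with an error status."""
    try:
        with urllib.request.urlopen(remove_url(base_url, key), timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False
```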
The installation and usage of ThumbCached
ThumbCached can run as a console application or as a Windows service. Download the source code (building it requires the Visual Studio 2008 compiler) or the compiled version. To run in console mode, run "ThumbCached.exe". To run as a Windows service, first run the "InstallService.bat" batch file to install the service; to uninstall the service, run "UninstallService.bat".
The configuration file is "ThumbCached.exe.config" (console mode) or "ThumbCachedService.exe.config" (Windows service mode). After changing the configuration, restart the service for the changes to take effect.
Configuration examples:
<thumbCached>
  <cache>
    <node id="tcd001">
      <storeFile infoFilename="d:\tcd_avatar.db" dataFilename="d:\tcd_avatar.dat" />
      <blockPool bufferPoolSize="128" activeTime="300" />
    </node>
    <node id="tcd002">
      <storeFile infoFilename="d:\tcd_icon.db" dataFilename="d:\tcd_icon.dat" />
      <blockPool bufferPoolSize="128" activeTime="300" />
    </node>
    <node id="tcd003">
      <storeFile infoFilename="d:\tcd_thumbnail.db" dataFilename="d:\tcd_thumbnail.dat" />
      <blockPool bufferPoolSize="128" activeTime="300" />
    </node>
  </cache>
</thumbCached>
<nbaseHttpd>
  <binds>
    <endpoint address="*" port="18500" nodeId="tcd001" />
    <endpoint address="*" port="18501" nodeId="tcd002" />
    <endpoint address="*" port="18502" nodeId="tcd003" />
  </binds>
  <connection keepAlive="true" timeout="180" connectionsLimit="5000" />
</nbaseHttpd>
Figure 1
In this example the server creates 3 storage units.
The node element defines a storage unit. The main attributes of the element and its children are as follows:
- infoFilename and dataFilename: the paths and file names of the two data files. Note that a FAT32 partition supports a maximum size of 4 GB for a single file, so it is better to store the data files on an NTFS partition.
- bufferPoolSize: the block buffer pool size in MB; the default value is 128.
- activeTime: the maximum time a block can stay in the buffer pool without being accessed; the default value is 300 seconds. A block that has not been accessed within this time is removed from the buffer pool.
- id: each storage unit needs a unique id, which associates it with a network binding end-point.
The endpoint element defines a network binding end-point. Its main attributes are as follows:
- address and port: the IP address and port to bind. To bind all IP addresses (when the server has multiple network interfaces or IP addresses), use an asterisk (*).
- nodeId: the id of the storage unit, as defined above, that this end-point serves.
The connection element defines the connection settings:
- keepAlive: turns persistent (keep-alive) connections between the server and clients on or off; the default value is true.
- timeout: the timeout for persistent connections; the default value is 180 seconds.
- connectionsLimit: the maximum number of connections allowed.
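Since every end-point must refer to a defined storage unit and every unit id must be unique, a configuration can be checked before the service is restarted. The sketch below is one possible check. The function name is hypothetical, and the fragment is wrapped in the configuration root element that .NET configuration files use.

```python
import xml.etree.ElementTree as ET

# Raw string so the backslashes in the Windows paths are kept as-is.
SAMPLE = r"""
<configuration>
  <thumbCached>
    <cache>
      <node id="tcd001">
        <storeFile infoFilename="d:\tcd_avatar.db" dataFilename="d:\tcd_avatar.dat" />
        <blockPool bufferPoolSize="128" activeTime="300" />
      </node>
    </cache>
  </thumbCached>
  <nbaseHttpd>
    <binds>
      <endpoint address="*" port="18500" nodeId="tcd001" />
      <endpoint address="*" port="18501" nodeId="tcd002" />
    </binds>
  </nbaseHttpd>
</configuration>
"""


def check_bindings(config_xml):
    """Return a list of problems: duplicate node ids and unknown nodeId references."""
    root = ET.fromstring(config_xml)
    problems = []
    node_ids = [node.get("id") for node in root.iter("node")]
    for node_id in sorted(set(node_ids)):
        if node_ids.count(node_id) > 1:
            problems.append("duplicate node id: " + node_id)
    for endpoint in root.iter("endpoint"):
        port, node_id = endpoint.get("port"), endpoint.get("nodeId")
        if node_id not in node_ids:
            problems.append(
                "endpoint on port %s refers to unknown node %s" % (port, node_id)
            )
    return problems


if __name__ == "__main__":
    for problem in check_bindings(SAMPLE):
        print(problem)
```

The sample deliberately binds port 18501 to an undefined unit, so the check reports that end-point.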
The usage of TCClient
TCClient is a client library for ThumbCached, written in C# 2.0. It can be used in ASP.NET or in other applications based on the .NET Framework.
1. The main methods
The main classes are ThumbBlock (ThumbBlock.cs) and TCClient (TCClient.cs).
ThumbBlock is the entity object for a block, with 3 properties:
- The key of the block.
- public DateTime LastModifyTime
The last modified time of the block.
- The binary content of the block (the demo code reads it as block.Data).
The main methods of TCClient:
- public void Add(string key, byte[] data)
Adds or updates a block.
- public void Add(string key, byte[] data, DateTime lastModifyTime)
Adds or updates a block and specifies its last modified time.
- public ThumbBlock Get(string key)
Fetches the specified block.
- public ThumbBlock Get(string serverid, string key, DateTime ifModifySince)
Fetches the specified block only if it was modified after ifModifySince. In the demo code, an unmodified block is reported through BlockNotModifyException.
- public void Remove(string key)
Removes a block.
2. Client configuration
To use the TCClient component, add the following sample section to the application configuration file (web.config or app.config):
<configSections>
  <section name="tcclient" type="Doms.TCClient.Configuration.TCClientConfigSection, TCClient, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" />
</configSections>
<tcclient>
  <serverNodes defaultServer="icon01">
    <server id="icon01" address="127.0.0.1" port="18500" />
    <server id="icon02" address="127.0.0.1" port="18501" />
    <server id="icon03" address="127.0.0.1" port="18502" />
  </serverNodes>
</tcclient>
Figure 2
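The mapping from a server id to an address and port can be sketched as below. This is not TCClient code, and the function name is hypothetical. It also assumes that defaultServer names the server to use when no id is given, which the attribute name suggests but the text does not state.

```python
import xml.etree.ElementTree as ET

TCCLIENT_XML = """
<tcclient>
  <serverNodes defaultServer="icon01">
    <server id="icon01" address="127.0.0.1" port="18500" />
    <server id="icon02" address="127.0.0.1" port="18501" />
  </serverNodes>
</tcclient>
"""


def resolve_server(config_xml, server_id=None):
    """Return "http://address:port" for server_id, or for defaultServer if omitted."""
    nodes = ET.fromstring(config_xml).find("serverNodes")
    wanted = server_id or nodes.get("defaultServer")
    for server in nodes.findall("server"):
        if server.get("id") == wanted:
            return "http://%s:%s" % (server.get("address"), server.get("port"))
    raise KeyError("no server with id %r" % wanted)
```

Each server entry here should point at one end-point bound in the ThumbCached server configuration, so that one client id corresponds to one storage unit.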
3. Demo
To use TCClient.dll, first add it to the project’s references. The following code demonstrates using TCClient to add, fetch, and delete a block.
private static void Test()
{
    // "icon02" is a server id defined in the client configuration
    TCClient client = new TCClient("icon02");

    // Add a block with an explicit last modified time
    byte[] data = new byte[] { 1, 2, 3, 4, 5 };
    DateTime time = DateTime.Now;
    client.Add("item001", data, time);

    try
    {
        // Fetch the block only if it was modified after (time + 1 day)
        ThumbBlock block = client.Get("item001", time.AddDays(1));
        Console.WriteLine("block content length: {0}", block.Data.Length);
    }
    catch (BlockNotFoundException)
    {
        Console.WriteLine("block not found");
    }
    catch (BlockNotModifyException ex)
    {
        // Expected here: the block is older than the requested time
        Console.WriteLine("block actual last modified time is: {0}", ex.LastModifyTime);
        Console.WriteLine("block content not modified since: {0}", time.AddDays(1));
    }

    // Remove the block
    client.Remove("item001");
}
Figure 3
Speed test
The test below uses a single-threaded client to measure how many requests ThumbCached can complete per second, with each request carrying 10 KB of binary data.
Note: These test results are neither complete nor rigorous. Actual performance varies with the environment, so the results are for reference only.
Hardware:
CPU: Athlon64 X2 4200+
RAM: 2 GB DDR2
Network: 100 Mbit/s
The test performs 6 kinds of requests:
1. add block
2. update block
3. delete block
4. fetch (read) block
5. add block with the service cache disabled
6. fetch block with the service cache disabled
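A single-threaded measurement of this kind can be sketched as below. This is not the original test program. The function name is hypothetical, and the stand-in operation must be replaced with a real add, update, delete, or fetch call against a running server to obtain meaningful numbers.

```python
import time


def requests_per_second(operation, count):
    """Call operation(i) count times on one thread; return completed calls per second."""
    start = time.perf_counter()
    for i in range(count):
        operation(i)
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


PAYLOAD = bytes(10 * 1024)  # one 10 KB request body, matching the test description

if __name__ == "__main__":
    # Stand-in operation that performs no network I/O.
    rate = requests_per_second(lambda i: len(PAYLOAD), 1000)
    print("%.0f requests/second" % rate)
```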
First, the test was run in local mode (that is, with the client and server on the same machine). The results are as follows:
Figure 4
Then the test was run in remote mode:
Figure 5
Online resources and feedback
The latest source code and documentation for ThumbCached and TCClient can be found at http://sourceforge.net/projects/thumbcached .
To discuss problems, make suggestions, or report bugs, visit the Google group: http://groups.google.com/group/thumbcached
ThumbCached is a product of the Doms team, which focuses on the research and development of distributed data storage systems. You are welcome to visit the web site: http://www.domstorage.com.