EldoS | Feel safer!

Software components for data protection, secure storage and transfer

Understanding the MetaDataCache

Posted: 03/28/2012 09:58:10
by Daniel Wehrle (Priority Standard support level)
Joined: 08/08/2008
Posts: 32

Until now my application ran with the MetaDataCache deactivated. To improve performance I activated it, and everything works fine, but I have some questions about the cache.

My tests, which write and read 1 kB files, were up to 70% faster, which is really good; starting the test two times also increased the performance. But if I run it four times, the write performance (number of files written) is 50% slower than with the MetaDataCache deactivated.

I can imagine that, with more files overall and more files per folder, the MetaDataCache cannot hold all of this information, so cache entries are removed from the cache.

Now I want to understand this in depth: is there a limit on the total amount of cached file information? Is it something like 1000 cached file objects, or can the cache grow up to, say, 100 MB of RAM? Do those values depend on the available system memory, or is there a way to increase the cache size?

What also seems strange to me is that when I use 10 MB files instead of 1 kB files, meaning fewer files are written and read altogether, the read performance seems to decrease by about 20%.

The computer, application and data are identical for all tests. The only difference is whether the MetaDataCache is activated or deactivated.

-- Daniel
Posted: 03/28/2012 12:45:42
by Volodymyr Zinin (EldoS Corp.)

The metadata cache can hold up to 256 objects per volume that represent already closed files/directories. These objects contain everything about a file/directory except the file data. When a file is closed, the number of these objects is checked, and if it exceeds the limit, the least recently used objects are deleted. As a result, the next time such a file/directory is opened and there is no information about it in the metadata cache, the OnGetFileInfo callback is called (before the OnCreate or OnOpen callback). In your case these extra OnGetFileInfo calls are perhaps what causes the performance decrease.
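To illustrate why a 256-entry least-recently-used cache can stop helping once a test cycles through more than 256 files, here is a minimal Python sketch. It is only a model built from the description above, not the CallbackFS implementation: the class `MetadataCacheModel`, its methods and the file names are hypothetical, and every open is assumed to be followed immediately by a close.

```python
from collections import OrderedDict


class MetadataCacheModel:
    """Hypothetical model of a per-volume LRU cache of closed-file metadata."""

    def __init__(self, capacity=256):
        self.capacity = capacity
        self.entries = OrderedDict()   # oldest entry first
        self.get_file_info_calls = 0   # stands in for OnGetFileInfo calls

    def open_and_close(self, path):
        # On open: a miss means the metadata has to be fetched again.
        if path not in self.entries:
            self.get_file_info_calls += 1
        # On close: (re)insert as most recently used, then trim to capacity.
        self.entries[path] = "metadata"
        self.entries.move_to_end(path)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop least recently used


def count_misses(num_files, passes=2):
    """Open and close files 0..num_files-1 in order, `passes` times."""
    cache = MetadataCacheModel()
    for _ in range(passes):
        for i in range(num_files):
            cache.open_and_close(f"file{i}")
    return cache.get_file_info_calls


print(count_misses(200))  # 200: the second pass is served from the cache
print(count_misses(300))  # 600: every open misses, the cache never helps
```

In this model, a sequential pass over more files than the cache can hold evicts each entry shortly before it is needed again, so the hit rate drops to zero rather than degrading gradually.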

In the current CallbackFS version it isn't possible to change the cache size, but I think we will add this later.

Also there is the file data cache which is used to cache file data (for details see the DisableFileCache method).
Posted: 04/11/2012 21:33:16
by Sangmin Lee (Standard support level)
Joined: 06/03/2009
Posts: 57

I have some questions about the metadata and data caches.

In FUSE, a cache time interval can be defined. Its default value is 1 sec.
Thus, if a file isn't used again within the interval, it is automatically removed from the cache.
In the CBFS metadata cache, are objects maintained by replacement?

For the file data cache:
- What are its size and time interval?
- How is it related to the Windows data cache?

The explanation in DisableFileCache method is too simple. -.-;;

Thanks in advance,

Posted: 04/12/2012 03:58:00
by Volodymyr Zinin (EldoS Corp.)

Sangmin Lee wrote:
In the CBFS metadata cache, are objects maintained by replacement?

Yes. There can be up to 256 items per volume that represent already closed files/directories. If a new element is inserted into a full cache, the least recently used one is deleted.

Sangmin Lee wrote:
For the file data cache: what are its size and time interval?

Currently the cache can hold, globally (not on a per-volume basis), up to 90*1024 pages. The pages have a fixed size of 4096 bytes, so the maximum size of the cache is 360 MB. Dirty pages are flushed either asynchronously by a special internal 'dirty page writer' thread or when a file is closed. In any case, by the time the last handle to a file is closed, the data in the cache has already been flushed. The data is also purged (removed) from the cache at that time.
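As a quick check of these figures, the following Python sketch takes the 360 MB total and the 4096-byte page size from the post as given and works out the page count. The helper `pages_needed` is hypothetical, and the assumption that a file always occupies whole pages (so that even a 1 kB file uses a full page) is mine, not something stated above.

```python
PAGE_SIZE = 4096                      # bytes per page, from the post
MAX_CACHE_BYTES = 360 * 1024 * 1024   # 360 MB total, from the post

# Number of fixed-size pages that fit into the whole cache.
max_pages = MAX_CACHE_BYTES // PAGE_SIZE
print(max_pages)                      # 92160, i.e. 90 * 1024


def pages_needed(file_size):
    """Pages used by one file, assuming pages are not shared between files."""
    return -(-file_size // PAGE_SIZE)  # ceiling division


print(pages_needed(1024))                            # 1
print(pages_needed(10 * 1024 * 1024))                # 2560
print(max_pages // pages_needed(10 * 1024 * 1024))   # 36
```

Under these assumptions the cache has room for the data of about 36 fully cached 10 MB files at a time.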

Sangmin Lee wrote:
- How is it related to the Windows data cache?

There is no relation to the system cache. Moreover, the main reason for implementing our own file data cache was to work around some problems that arise when the system cache is used. The main problem was a deadlock that occurred sporadically when the system cache was used in parallel inside the CallbackFS driver and inside the callback implementations (many or all customers use code in the callbacks that explicitly or implicitly uses system caching). The problem lies in the system cache implementation: during a flush, a global resource (a mutex or something similar) is acquired and released only after the flush has finished. So when a flush occurs, the write request comes to the CallbackFS driver, which calls the OnWriteFile callback. This callback can do something on another file system, and these actions can trigger another flush in the system cache, which tries to acquire the same global resource, and this causes a deadlock.

By using the internal data cache we worked around this problem.

BTW, in the next CallbackFS version we are going to significantly upgrade the file data cache (implement variable-sized pages, etc.), so any suggestions/ideas are welcome.
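The lock-ordering problem described here can be modelled in a few lines of Python. This is only a sketch of the pattern, not of the Windows cache manager or the CallbackFS driver: `cache_flush_lock`, `system_cache_flush` and `on_write_file` are hypothetical stand-ins, and a timeout is used so the example reports the stall instead of hanging forever.

```python
import threading

# Stands in for the global resource held for the whole duration of a flush.
cache_flush_lock = threading.Lock()


def on_write_file(result):
    """Hypothetical callback: its work on another file system triggers a
    nested flush, which needs the same global resource."""
    acquired = cache_flush_lock.acquire(timeout=0.2)
    result["nested_flush_ran"] = acquired
    if acquired:
        cache_flush_lock.release()


def system_cache_flush():
    """Hypothetical flush: holds the lock while waiting for the callback."""
    result = {}
    with cache_flush_lock:
        worker = threading.Thread(target=on_write_file, args=(result,))
        worker.start()
        worker.join()  # the flush cannot finish until the callback returns
    return result


outcome = system_cache_flush()
print(outcome["nested_flush_ran"])  # False: the nested flush could not proceed
```

Without the timeout, the callback would wait for the lock while the flush waits for the callback, which is the deadlock. Keeping file data in a separate internal cache means the driver's write path no longer holds that shared resource while the callback runs.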




As of July 15, 2016 EldoS Corporation will operate as a division of /n software inc. For more information, please read the announcement.
