EldoS | Feel safer!

Software components for data protection, secure storage and transfer

Metadata cache woes

Also by EldoS: Solid File System
A virtual file system that offers a feature-rich storage for application documents and data with built-in compression and encryption.
#33751
Posted: 06/23/2015 13:37:50
by david bennett (Standard support level)
Joined: 03/29/2013
Posts: 50

To start off, I'm using CBFS version 4. I am having issues when I move a file from the desktop into my virtual file system using Windows Explorer (I don't know if move is an issue with other programs). Interestingly, if I copy the file rather than move it, there is no problem. It only happens on move.

What makes this tricky is that the file on the desktop has a special extension, but after the file is moved into the CBFS the extension is removed. To let everyone know that this has happened, I call both EldoS' NotifyDirectoryChange() and Windows SHChangeNotify() as shown below.

Code
        // Notify EldoS
        CbFS->NotifyDirectoryChange(virtualname + SPECIAL_EXTENSION, CallbackFileSystem::fanRemoved, true);
        CbFS->NotifyDirectoryChange(virtualname, CallbackFileSystem::fanAdded, true);

        // Notify Windows Explorer
        SHChangeNotify(SHCNE_RENAMEITEM, SHCNF_PATH, (LPCTSTR)(GetMountingPointName() + virtualname + SPECIAL_EXTENSION), (LPCTSTR)virtualname);


Return codes from both CbFS->NotifyDirectoryChange() calls are good. The virtualname variable contains the path relative to the mounting point, including a leading backslash.

The behavior I see is strange and varies from machine to machine and seems to reflect some caching issue. The file will appear to be the wrong size, or deleting the file will fail, or after you delete the file trying to create a file with the same name will fail. After trying to eliminate possibilities, the bottom line is that if I disable the EldoS metadata cache everything works just fine.

Since this is not the first issue I've had with the metadata cache, and since the last issue I reported was apparently fixed in CBFS version 5 and you are not going to back-port the fixes, I decided it would be best for me to try to work around the issue and hope that it's fixed in version 5 when I have time to move forward. My workaround was to try to figure out a way to just flush the EldoS metadata cache. This will affect performance, but only during this rather rare edge case (or so I hoped).

I saw no function in the documentation to force a flush of your metadata cache, so what I tried was to disable the cache, wait a while, then re-enable the cache. I realize that I can't do this in a callback function, so I am sending a signal to my main program and it does the flush. For the most part, this solution works. Unfortunately it doesn't work the same on all machines. On four of the five machines we've tested on, a delay of about 500 ms between disable and enable seems to work fine. On the fifth machine, 5 full seconds is not sufficient; it seems to require seven seconds between disabling and re-enabling the cache to make it flush reliably.

Because of the bug I mentioned that you've already fixed in CBFS 5, I can't really leave the metadata cache permanently disabled. I can probably get away with leaving the cache disabled for some number of seconds, but even that will leave a window where the other problem may occur.

So what I'm really looking for is the answer to a few questions:

1) Does the code fragment I've included above look wrong in some way?
2) Is there a way I've missed that I can force a flush of the metadata cache?
3) Is there any amount of time the cache can be turned off that will guarantee a complete flush of the cache? If it's ten seconds, that's okay, but I just need to know how I can be sure it will work on every machine.

Thanks.
#33752
Posted: 06/23/2015 14:50:53
by Eugene Mayevski (EldoS Corp.)

Thank you for the detailed report.

Quote
david bennett wrote:
1) Does the code fragment I've included above look wrong in some way?


NotifyDirectoryChange is asynchronous in this situation. When it returns there's no guarantee that the operation has been completed. This is done to prevent possible deadlocks.

And this is probably the root of your problems. You need to postpone the refresh calls to after callback completion.

Quote
david bennett wrote:
2) Is there a way I've missed that I can force a flush of the metadata cache?


Turning the cache off and on should work, but as said, the problem is not in the cache but in your code.

Thus question 3 is not applicable.


Sincerely yours
Eugene Mayevski
#33756
Posted: 06/23/2015 21:05:54
by david bennett (Standard support level)
Joined: 03/29/2013
Posts: 50

When you say I need to postpone the refresh calls, what does that mean? I am not calling refresh and I don't know of a refresh function. Are you talking about the Windows SHChangeNotify call?

How can I *know* that the NotifyDirectoryChange call has finished? If I call that outside of the callback will it wait until the metadata change is completed so that I can then call "refresh" (whatever that is) and be certain that it will get the right thing?

Thanks again,
Dave
#33757
Posted: 06/24/2015 01:10:20
by Eugene Mayevski (EldoS Corp.)

Quote
david bennett wrote:
When you say I need to postpone the refresh calls, what does that mean? I am not calling refresh and I don't know of a refresh function. Are you talking about the Windows SHChangeNotify call?


As I understand it, both of your calls, to NotifyDirectoryChange and to SHChangeNotify, are made in the callback / event handler. What I meant is that you have to move them outside of the callback thread. They must not be called from the callback or from any function called synchronously from the callback handler.
For example, you can use PostMessage() WinAPI call and post a message with a special code to your main thread, which will handle this message and call the above mentioned functions in context of this main thread. The key is to use PostMessage or other asynchronous mechanism (you can also use our MsgConnect for interthread communications) and not SendMessage() or other kind of synchronous call.

Quote
david bennett wrote:
How can I *know* that the NotifyDirectoryChange call has finished? If I call that outside of the callback will it wait until the metadata change is completed so that I can then call "refresh" (whatever that is) and be certain that it will get the right thing?


If you call NotifyDirectoryChange like I described above (from the main thread or some other worker thread), it will complete the operation, then return. After this you can call SHChangeNotify. And then you should not need to disable and re-enable the cache.
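The "post and return" pattern described above can be sketched in portable C++. This is only an illustration under stated assumptions: a standard queue with a condition variable stands in for PostMessage(), and `RenameNotice`, `NotifyQueue` and the handler are hypothetical names, not part of CBFS or the Windows API. In a real program the handler would call NotifyDirectoryChange() for the removed and added names and then SHChangeNotify().

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// One queued notification per renamed file (plays the role of a posted message).
// Both names are hypothetical fields holding paths relative to the mounting point.
struct RenameNotice {
    std::wstring oldName;
    std::wstring newName;
};

class NotifyQueue {
public:
    using Handler = std::function<void(const RenameNotice&)>;

    explicit NotifyQueue(Handler handler)
        : handler_(std::move(handler)), worker_([this] { run(); }) {}

    // Drains whatever is still queued, then stops the worker thread.
    ~NotifyQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
        worker_.join();
    }

    // Intended to be called from a callback: it only enqueues and returns.
    void post(RenameNotice notice) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(std::move(notice));
        }
        ready_.notify_one();
    }

private:
    void run() {
        for (;;) {
            RenameNotice notice;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
                if (pending_.empty()) return;  // stopping and fully drained
                notice = std::move(pending_.front());
                pending_.pop();
            }
            // Runs on the worker thread, outside the lock and outside any callback.
            handler_(notice);
        }
    }

    Handler handler_;
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<RenameNotice> pending_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};
```

A callback would call `post()` with the old and new names and return immediately; the handler then runs once per posted notice, in posting order, on the worker thread. Note that this only moves the calls out of the callback; it does not by itself guarantee that the callback has completed before the handler runs.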


Sincerely yours
Eugene Mayevski
#33758
Posted: 06/24/2015 01:56:36
by david bennett (Standard support level)
Joined: 03/29/2013
Posts: 50

Okay, I thought that might be what you meant, but I couldn't be sure. I currently do something similar to flush the metadata cache outside of the callback, since the documentation for those functions, unlike this one, says that they can't be called during the callback.

I think the documentation of this function is pretty light. It is a tricky topic worthy of more than the few sentences in the function description and it says there that "this method may be called from callback / event handlers or from outside of callback / event handlers," which is very different from what you wrote above. While the documentation may be literally correct, it's probably not going to do what one expects so I would recommend that you include a warning to that effect. It would have saved you time and effort required to answer my request. The information will be here now with this post, I guess, so maybe that's good enough.

Thanks for your help.
#33767
Posted: 06/24/2015 09:37:23
by Eugene Mayevski (EldoS Corp.)

You are right, the documentation for some reason doesn't mention what I described above. I was sure that this behavior was described in the docs, and I guess that some modification of the documentation led to removal of the note (such things can happen, unfortunately). I'll fix this now.

NotifyDirectoryChange is discussed extensively in this forum, yet I agree that a forum is not very handy to read when you need a formal description of the function.

Please let us know if the change solves your problem.


Sincerely yours
Eugene Mayevski
#33768
Posted: 06/24/2015 09:43:22
by Volodymyr Zinin (EldoS Corp.)

The file name passed to NotifyDirectoryChange must contain full path and start with the root folder. For example:
CbFS->NotifyDirectoryChange(L"\\folder1\\folder2\\folderX\\file.ext", CallbackFileSystem::fanRemoved, true);

NotifyDirectoryChange called inside the callbacks is always processed asynchronously.
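The two calls in this thread take differently shaped paths: NotifyDirectoryChange wants a path that starts at the root of the virtual file system, while SHChangeNotify with SHCNF_PATH expects fully qualified paths that include the mounting point. A small sketch of helpers that keep the two apart follows; `toVirtualPath` and `toShellPath` are hypothetical names, and the assumption that the mounting point looks like `X:` or `X:\` is mine, not from the documentation.

```cpp
#include <string>

// Path for NotifyDirectoryChange: relative to the mounting point,
// always starting with the root backslash.
std::wstring toVirtualPath(const std::wstring& name) {
    if (!name.empty() && name.front() == L'\\') return name;
    return L"\\" + name;
}

// Path for shell notifications: mounting point followed by the virtual path,
// with exactly one backslash between them.
std::wstring toShellPath(const std::wstring& mountingPoint,
                         const std::wstring& virtualPath) {
    std::wstring base = mountingPoint;
    while (!base.empty() && base.back() == L'\\') base.pop_back();
    return base + toVirtualPath(virtualPath);
}
```

With helpers like these, both the old and the new name passed to the shell would be built with `toShellPath`, and the names passed to NotifyDirectoryChange with `toVirtualPath`.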
#33774
Posted: 06/24/2015 22:48:44
by david bennett (Standard support level)
Joined: 03/29/2013
Posts: 50

Well I tried the mechanism you suggested and the result was pretty interesting. Instead of issuing the NotifyDirectoryChange() call within the callback, I issue a signal that kicks off a thread I created for this specific purpose. That thread calls NotifyDirectoryChange() and then SHChangeNotify(). With my initial testing, this didn't work. I wound up with the same behavior I had before.

Speculating that maybe the background thread is starting before the callback finished, I added a Sleep(500) *before* the call to NotifyDirectoryChange(). This made it work for the simple case where I move a single file. It still didn't work when I tried to move six files. I suspect that I will see differences on different machines as I did with the cache flush version.

My new plan is to take your Mapper example and change it to implement the feature my product uses and assuming that exhibits the same problem, upload that so you guys can take a look at it. This will be a fair amount of work and I don't want to do it unless you think it will be useful.
#33775
Posted: 06/24/2015 23:44:21
by Eugene Mayevski (EldoS Corp.)

The case when you rename six files can be caused by the implementation of your function (the one that calls the notification methods): it's possible that the 6th operation triggers the same "first" execution of the worker function. That is, instead of a signal you should use PostMessage; with 6 files you will have 6 messages and 6 independent calls to your worker function. Each should notify about its own file.

Please try this approach first.
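The difference between a signal and per-file messages can be shown with a deliberately simplified, single-threaded model. It assumes the signal behaves like a flag that stays set until consumed (similar to an auto-reset event); whether the signal in the original code behaves this way is not stated in the thread. `SignalFlag`, `wakeupsWithSignal` and `callsWithMessages` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Models a flag-style signal: raising it again before it is consumed adds nothing.
class SignalFlag {
public:
    void raise() { raised_ = true; }
    // Returns true once per raised state, then resets the flag.
    bool consume() {
        bool was = raised_;
        raised_ = false;
        return was;
    }
private:
    bool raised_ = false;
};

// Counts worker wake-ups when `files` moves all raise the flag before the worker runs.
std::size_t wakeupsWithSignal(std::size_t files) {
    SignalFlag flag;
    for (std::size_t i = 0; i < files; ++i) flag.raise();
    std::size_t wakeups = 0;
    while (flag.consume()) ++wakeups;
    return wakeups;
}

// Counts worker calls when each move posts its own message carrying its file name.
std::size_t callsWithMessages(std::size_t files) {
    std::deque<std::wstring> messages;
    for (std::size_t i = 0; i < files; ++i)
        messages.push_back(L"\\file" + std::to_wstring(i));
    std::size_t calls = 0;
    while (!messages.empty()) {
        messages.pop_front();
        ++calls;
    }
    return calls;
}
```

In this model six raises collapse into a single wake-up, while six messages produce six calls, each with its own file name available to the worker.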

In general, there's no way at the moment to schedule something to be executed after completion of the callback in a guaranteed way. I think that we need to add some mechanism. But it can be added only to CBFS 6 or even later.

Regarding the Mapper sample: it would be interesting to investigate the behavior, although our ability to patch the old version is severely limited, so IF the problem exists in CBFS 5 and 6, it can be fixed only in the recent versions. So let's start with the PostMessage mechanism I described above.


Sincerely yours
Eugene Mayevski
#33777
Posted: 06/25/2015 00:39:02
by david bennett (Standard support level)
Joined: 03/29/2013
Posts: 50

My current method builds a queue and as each file is pulled off the queue it's waiting half a second so the sixth file is processed three seconds after the move completes.

I will try the PostMessage method but it seems unlikely to get me there. I think I may need to try something completely different. If you have any suggestions I would be interested. One thought I had was to explicitly do the move by calling SHFileOp after the close file. This just seems like overkill, but it may be the only way to make it work.

I think that I will probably try to create the mapper example anyway because I can use that to see if CBFS 5 has the same issue. We do plan to move at some point as another bug that I found in metadata caching is supposedly fixed in that.

As of July 15, 2016 EldoS Corporation will operate as a division of /n software inc. For more information, please read the announcement.
