Concurrency issues with CbFS

#18367
Posted: 12/04/2011 09:17:11
by Brahim Bakayoko (Standard support level)
Joined: 11/22/2011
Posts: 37

Code
[2011-12-4 1:0:33.43][3992][OnSetFileAttributes] \Testing\Testroot\TestF1\A.txt
[2011-12-4 1:0:33.43][3992][OnGetFileSecurity] \Testing\Testroot\TestF1\A.txt
[2011-12-4 1:0:33.43][3436][OnSetEndOfFile] \Testing\Testroot\TestF1\A.txt
[2011-12-4 1:0:33.43][3436]      [WARN]:[OnSetEndOfFile] SetLengthByHandle: error=5, msg=Input/output error
         \Testing\Testroot\TestF1\A.txt ->

[2011-12-4 1:0:33.43][2540][OnSetEndOfFile] \Testing\Testroot\TestF1\A.txt
[2011-12-4 1:0:33.43][2540]      [WARN]:[OnSetEndOfFile] SetLengthByHandle: error=5, msg=Input/output error
         \Testing\Testroot\TestF1\A.txt ->

[2011-12-4 1:0:33.43][2428][OnSetAllocationSize] \Testing\Testroot\TestF1\A.txt

In the logs above, you can see 4 threads accessing the same file at the same time (time resolution is in milliseconds).
Per the documentation, these calls should have been serialized into one thread.
Why isn't it the case here?
#18368
Posted: 12/04/2011 09:26:42
by Eugene Mayevski (EldoS Corp.)

Quote
Brahim Bakayoko wrote:
Per the documentation, these calls should have been serialized into one thread. Why isn't it the case here?


There are two properties in the CallbackFileSystem class: SerializeCallbacks and ThreadPoolSize. If you want true serialization, set SerializeCallbacks to true and ThreadPoolSize to 1. Alternatively, perform locking at the file level (i.e. guard each open file with a critical section).
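A minimal sketch of the second option (per-file locking), written in Python for brevity rather than against the CbFS API. `FileLocks`, `file_sizes` and `on_set_end_of_file` are hypothetical names used only for illustration; a real callback would take whatever arguments the CbFS binding passes.

```python
import threading


class FileLocks:
    """Hypothetical helper: hands out one lock per open file path."""

    def __init__(self):
        self._guard = threading.Lock()   # protects the dictionary itself
        self._locks = {}                 # path -> threading.Lock

    def lock_for(self, path):
        with self._guard:
            lock = self._locks.get(path)
            if lock is None:
                lock = self._locks[path] = threading.Lock()
            return lock
        # A real implementation would also drop the entry when the file closes.


file_locks = FileLocks()
file_sizes = {}  # stand-in for the state a callback would modify


def on_set_end_of_file(path, new_size):
    """Stand-in for a callback body; only one thread per path runs it at a time."""
    with file_locks.lock_for(path):
        file_sizes[path] = new_size
```

Callbacks for different paths take different locks and can still run in parallel; only callbacks for the same path wait for each other.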


Sincerely yours
Eugene Mayevski
#18369
Posted: 12/04/2011 14:43:35
by Brahim Bakayoko (Standard support level)
Joined: 11/22/2011
Posts: 37

Hi Eugene,

I don't want true serialization, but I want all operations pertaining to a file to go through only one thread as the documentation states.
And I quote (https://www.eldos.com/documentation/cbfs/ref_cl_cbfs_prp_serializecallbacks.html):
Quote

...
When SerializeCallbacks is false, the callback functions are called from a single thread for the same file, but for different files the callback functions can be called in parallel. The number of parallel threads is determined by ThreadPoolSize property.


Performing locking at the file handle level is just not ideal.

Thanks for looking into this.
#18372
Posted: 12/04/2011 22:27:04
by Eugene Mayevski (EldoS Corp.)

So far you can do what I have described.


Sincerely yours
Eugene Mayevski
#18374
Posted: 12/04/2011 23:25:21
by Brahim Bakayoko (Standard support level)
Joined: 11/22/2011
Posts: 37

Eugene,

What you have described is not practical.

1. Running on a single thread is performance restrictive.
2. And so is locking on the file level.

I understand the current state of affairs, but I am hoping for a more sensible solution to be worked on in the API.
Maybe I should specify that, although the logs don't show it, all 4 threads are executing on the same file handle.
So, when I am talking of file, I am talking of a unique handle.
Multiple threads each having their own handle to a given file is a non-issue, but all threads executing on the same handle leads to unnecessary synchronization issues.

Thanks.
#18375
Posted: 12/05/2011 01:15:56
by Volodymyr Zinin (EldoS Corp.)

Quote
Brahim Bakayoko wrote:
Per the documentation, these calls should have been serialized into one thread.

Calls for one file are serialized, but the thread that makes them can change from time to time. Let me describe it in detail.

Each opened file/directory has an associated structure that holds a queue of pending requests, which must be passed to the user callbacks, and a pointer to the worker thread that is currently processing those requests (this pointer can be NULL if no worker thread has been assigned yet). The number of worker threads is set by the CallbackFileSystem.ThreadPoolSize property (in the future we are going to make it possible to change the number of worker threads dynamically).

The number of worker threads is often smaller than the number of opened files, and in this case worker threads are reassigned from one file to another (i.e. a worker thread can call a callback for one file, then call a callback for another file, and then return to process the requests for the first file). But at any given time only one worker thread can call the callbacks for a file. In other words, callbacks for a file/directory are always called sequentially: one callback finishes, and only then is the next one called. That next callback can, however, be called by a different worker thread. So it isn't necessary to do locking at the file level.

Perhaps in your case the millisecond resolution isn't fine enough, or the function that gets the current time works incorrectly. But of course there is a possibility of a bug in CallbackFS too. Please check it further.
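The scheme described above can be modelled roughly as follows. This is a sketch of the general technique (a per-file request queue served by a shared worker pool), not CallbackFS's actual implementation; `PerFileDispatcher` and all of its methods are hypothetical names.

```python
import queue
import threading
from collections import deque


class PerFileDispatcher:
    """Runs requests for one key sequentially, on whichever worker is free."""

    def __init__(self, pool_size):
        self._lock = threading.Lock()
        self._pending = {}            # key -> deque of requests; present only while owned
        self._ready = queue.Queue()   # keys that have work and no worker assigned
        self._workers = [threading.Thread(target=self._run, daemon=True)
                         for _ in range(pool_size)]
        for worker in self._workers:
            worker.start()

    def submit(self, key, request):
        with self._lock:
            pending = self._pending.get(key)
            if pending is None:
                # No worker owns this file yet: create its queue and schedule it.
                self._pending[key] = deque([request])
                self._ready.put(key)
            else:
                # A worker owns the file (or it is already scheduled): just enqueue.
                pending.append(request)

    def _run(self):
        while True:
            key = self._ready.get()
            if key is None:               # shutdown marker
                self._ready.task_done()
                return
            with self._lock:
                request = self._pending[key].popleft()
            request()                     # the "callback"; never overlaps per key
            with self._lock:
                if self._pending[key]:
                    self._ready.put(key)  # any worker may take the next request
                else:
                    del self._pending[key]
            self._ready.task_done()

    def wait(self):
        """Block until every submitted request has finished."""
        self._ready.join()

    def close(self):
        for _ in self._workers:
            self._ready.put(None)
        for worker in self._workers:
            worker.join()
```

In this model a key is handed back to the shared queue after each request, so consecutive requests for the same file may run on different threads while still never overlapping, which matches the behaviour described in the post.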
#18376
Posted: 12/05/2011 05:29:56
by Eugene Mayevski (EldoS Corp.)

Quote
Brahim Bakayoko wrote:
What you have described is not practical.


I'd say that arguing is much less practical. If the file is accessed from several threads in parallel, something needs to be done for serialization. This something can be done either in the driver or in your code, but it *must* be done. If for any reason the driver doesn't do this (or doesn't do it the way you want it to be done), then you can argue, you can complain, whatever else, but you will have to add synchronization.


Sincerely yours
Eugene Mayevski
#18378
Posted: 12/05/2011 07:38:32
by Brahim Bakayoko (Standard support level)
Joined: 11/22/2011
Posts: 37

@Vladimir

Thanks for the clarity provided.

I am going to both increase the time resolution (at least try) and log the completion of each callback to get more insight into what is going on.
My logs seem to insinuate that the callbacks are received in parallel without any completion wait.
If the API behaves as you have described, then we have the serialization that I am after and the issue would be in my code.
So, let me make the additional changes and report back.
Thanks again.

@Eugene

It is less about arguing than seeking clarity.
Yes, synchronization must be done. The best place to do it is what's in a way being discussed. If the API is working as Vladimir described, then the synchronization (well, pseudo serialization in this case) is being done at the proper place (in the API).
My logs seem to insinuate that I am getting the callbacks in parallel (with no wait on completion), which I am going to re-test and report back.

Thanks.
#18379
Posted: 12/05/2011 07:55:51
by Eugene Mayevski (EldoS Corp.)

Quote
Brahim Bakayoko wrote:
Yes, synchronization must be done. The best place to do it is what's in a way being discussed. If the API is working as Vladimir described, then the synchronization (well, pseudo serialization in this case) is being done at the proper place (in the API). My logs seem to insinuate that I am getting the callbacks in parallel (with no wait on completion), which I am going to re-test and report back.


I am sorry if my words seemed harsh; I was just trying to say that there are things we need to base our actions upon (and which we can't easily change).

You can write "callback enter" and "callback exit" events to the log to track the order.
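One possible shape for such enter/exit logging, sketched in Python rather than against the CbFS API. The `traced` decorator, `log_lines` and `on_set_allocation_size` are hypothetical names; the timestamp comes from `time.perf_counter_ns()`, which gives nanosecond-resolution readings suitable for ordering events within one process.

```python
import functools
import threading
import time

log_lines = []                 # in-memory log; a real handler would write to a file
_log_lock = threading.Lock()


def _log(event, name, path):
    with _log_lock:
        # Take the timestamp under the lock so list order matches time order.
        stamp = time.perf_counter_ns()
        tid = threading.get_ident()
        log_lines.append(f"[{stamp}][{tid}][{name}] {event} {path}")


def traced(callback):
    """Log an ENTER line before the callback runs and an EXIT line after it."""
    @functools.wraps(callback)
    def wrapper(path, *args, **kwargs):
        _log("ENTER", callback.__name__, path)
        try:
            return callback(path, *args, **kwargs)
        finally:
            _log("EXIT", callback.__name__, path)   # logged even on exceptions
    return wrapper


@traced
def on_set_allocation_size(path, size):
    """Stand-in for a real callback body."""
    return size
```

If two ENTER lines for the same file appear without an EXIT between them, the callbacks really did overlap; otherwise the apparent parallelism was an artifact of coarse timestamps.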

As for Vladimir's explanation, it leaves some room for misunderstanding. The whole idea is that file operations are reported (your callbacks are called) in the order they arrive from the OS. However, no assumptions about threading can be made. I.e. two consecutive writes for the same file can happen in two threads, *but* by the time the second operation is started (its callback is called), the first operation must have been completed.

Any additional synchronization (beyond the simple serialization mentioned above) should be left to the user's implementation, because the user knows better which resources need to be guarded and in which way.

Let's wait for results of your tests.


Sincerely yours
Eugene Mayevski
#18383
Posted: 12/06/2011 11:37:31
by Brahim Bakayoko (Standard support level)
Joined: 11/22/2011
Posts: 37

Quick update:

I have increased the time resolution of my logs and added callback exit log entries.

Two things:

1. The issue as posted is with my code and not with the CbFS API.
2. So far, the logs show that each callback does complete before another one is triggered for the same handle (regardless of thread).
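A check like the one described in point 2 could be automated along these lines. This is only a sketch: the event tuple format and the `find_overlaps` function are assumptions made for illustration, not the format of the logs in this thread.

```python
def find_overlaps(events):
    """Return (handle, timestamp) for each callback that entered while
    another callback was still running on the same handle.

    events: iterable of (timestamp, handle, kind), kind is "enter" or "exit".
    """
    in_flight = {}   # handle -> number of callbacks currently running
    overlaps = []
    # sorted() is stable, so events with equal timestamps keep their input order.
    for stamp, handle, kind in sorted(events, key=lambda e: e[0]):
        running = in_flight.get(handle, 0)
        if kind == "enter":
            if running > 0:
                overlaps.append((handle, stamp))
            in_flight[handle] = running + 1
        else:
            in_flight[handle] = running - 1
    return overlaps
```

An empty result means every callback on a handle completed before the next one started, regardless of which thread ran it.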

I am continuing my testing and will post back for a conclusion.