EldoS | Feel safer!

Software components for data protection, secure storage and transfer

File Parallel Reading

#20272
Posted: 05/30/2012 04:55:50
by Kenny Kim (Standard support level)
Joined: 08/19/2009
Posts: 38

From CBFS Help:
Quote
CallbackFileSystem.SerializeCallbacks property
Use this property to tell the driver to serialize all callback function calls to a single thread.

When SerializeCallbacks is true, all callback functions are called sequentially from a single thread. When SerializeCallbacks is false, the callback functions are called from a single thread for the same file, but for different files the callback functions can be called in parallel. The number of parallel threads is determined by the ThreadPoolSize property.


Quote
When SerializeCallbacks is false, the callback functions are called from a single thread for the same file...
Does this mean that there is no way to, let's say, copy one file using Explorer to 2 different locations using 2 threads?
I ask because we are having a problem playing and copying an HD movie file at the same time.

Test scenario:
1. Start playing a movie using KMPlayer or DivX Player.
2. Start copying the movie to another location using Explorer.
Result: the movie player either stops playing or plays the movie intermittently (depending on the movie file's bitrate).

The SerializeCallbacks property is set to False.

I have attached the file info.

CBFS Driver Ver. 3.2.107.271
System Info: Windows 7 Prof x64, Core i5 3.30 GHz, 16 GB RAM.


#20273
Posted: 05/30/2012 07:12:15
by Volodymyr Zinin (EldoS Corp.)

Unfortunately, in the current version there is no way to do such parallel processing. But you are right that such a feature would help to improve reading (and other operations that don't change a file).
We will add it in the next version.
#20503
Posted: 06/18/2012 20:41:55
by Ivan P (Priority Standard support level)
Joined: 04/11/2011
Posts: 66

Hello,

As stated above, callbacks for the same file are serialized even if SerializeCallbacks is false.
But I've just seen the opposite while debugging: a single file is accessed from parallel threads. The drive is mounted as a network share. I tried to open a file in FAR manager (just pressed F3). A debug assertion fired and I broke into the debugger inside my OnCbfsReadFile handler. While I was stepping over my code, another thread entered OnCbfsReadFile for the same file:
Code
[Far.exe: 4436:10]: GetInfo: '\i-robot.jpg' - Size: 25019
[Far.exe: 4436:11]: Open: '\i-robot.jpg' , ctx:00000000, access/share:[00120089:0007] - [GENERIC_READ]:[READ, WRITE, DELETE]
Allocated ctx:001A2A28
[<Unk>: 4436:10]: Reading 4096 bytes of '\i-robot.jpg' (ctx:001A2A28) at offset 0
[<Unk>: 4436:12]: Reading 25019 bytes of '\i-robot.jpg' (ctx:001A2A28) at offset 0


As you can see, the last 2 lines show 2 reads executing in parallel threads (managed threads 10 and 12). These are not 2 sequential calls on different pool threads. They are truly parallel: both are at different locations in the OnCbfsReadFile handler at the same time.

I have a timeout set to 30 seconds.
Could the following scenario happen?:
1. OnCbfsReadFile is called in thread 10. Then thread execution hangs for some reason (in my case, it was stopped in the debugger).
2. The driver detects this as a timeout and reports an error to the application (Far.exe).
3. Far.exe retries the read operation. This causes the driver to use another thread from the thread pool (thread 12) to call the OnCbfsReadFile handler.
4. Thread 10 wakes up, and this causes simultaneous access from threads 10 and 12 to the same data structure associated with the file handle.

Though I'm not sure whether it happens only after the timeout has expired.

The problem here is that I assumed I could avoid thread synchronization when working with the same file handle (thanks to callback serialization).
Now it seems that I cannot (at least when a callback is unable to finish within the timeout period).
I suppose this happens because the thread that was handling the previous call was not terminated after the timeout (and before the next call).

Am I correct?
Does it happen only when a debugger is attached to my handler app (and this prevents the thread from being killed)?
Is it .NET-specific?
What are the recommendations for handling this situation correctly?

Thanks,
IP

CBFS version: 3.2.108.273
Windows 7 x64
#20505
Posted: 06/19/2012 02:39:27
by Volodymyr Zinin (EldoS Corp.)

This is because the timeout occurred for the first call. In this case the CallbackFS driver finishes processing the request, and when a new request arrives, it is processed using another worker thread from the thread pool (whose size is specified by the ThreadPoolSize property).
Of course, in such a case several requests are processed in parallel. It seems to be a bug, and in the next version we will correct it as follows: if the timeout has already occurred for a request that is still being processed in a callback, all new incoming requests for this file will be completed immediately with the timeout error.

PS: This is unrelated to the parallel processing discussed earlier. We are going to implement that, and it will be controlled by a special flag.
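
A minimal sketch of the defensive approach for a handler that keeps mutable per-file state and cannot rely on the driver to serialize calls (for example, after a timeout as described above). It is written in Python purely as an illustration and does not use the CallbackFS API; `FileContext`, `on_read_file` and `demo` are hypothetical names.

```python
import threading


class FileContext:
    """Hypothetical per-file state kept by a handler (not a CallbackFS type)."""

    def __init__(self, data: bytes):
        self.data = data
        self.lock = threading.Lock()  # guards the mutable field below
        self.reads_served = 0


def on_read_file(ctx: FileContext, offset: int, length: int) -> bytes:
    """Stand-in for a read callback that may be entered by several threads."""
    with ctx.lock:
        # Shared state is only touched while the per-file lock is held.
        ctx.reads_served += 1
        return ctx.data[offset:offset + length]


def demo(threads: int = 8, reads_per_thread: int = 1000) -> int:
    """Call the handler for one file from several threads; return the read count."""
    ctx = FileContext(b"x" * 25019)

    def worker():
        for _ in range(reads_per_thread):
            on_read_file(ctx, 0, 4096)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return ctx.reads_served
```

With the lock in place the counter stays consistent no matter how many threads enter the handler for the same file. The lock is per file, so requests for different files do not block each other.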
#21925
Posted: 10/11/2012 03:27:16
by Kenny Kim (Standard support level)
Joined: 08/19/2009
Posts: 38

Good afternoon.

I am again addressing the "File Parallel Reading" problem that I raised several months ago, as it is a very critical issue for us.

I have several questions:
1.
Quote
From "changes.txt":
Parallel execution of read requests to the same file is now possible. It can be enabled using the ParallelProcessingAllowed property.


Does this mean that, file parallel reading is now available, which will solve the problem discussed previously?
Quote
...let's say, copy one file using Explorer to 2 different locations using 2 threads?


2. If the answer to the 1st question is positive, is setting the ParallelProcessingAllowed property to true and the SerializeCallbacks property to false the only step needed to achieve file parallel reading?
Code
mCbFs.SerializeCallbacks=false;
mCbFs.ParallelProcessingAllowed=true;


I have experimented with different values for various CbFs properties, but I still cannot copy file A to locations X and Y simultaneously at the same speed as copying files A and B to locations X and Y.

In the documentation that comes with CBFS 4.0 beta 2, I could not find a clear description of how to read a single file with two or more applications independently of each other.

Quote
OnCreateFile event/delegate/callback description:
Sometimes it can happen that OnCreateFile is fired for a file which already exists. Normally such situation will not happen...


Quote
CallbackFileSystem.CallAllOpenCloseCallbacks property description:
When CallAllOpenCloseCallbacks is true, the driver calls OnCreateFile each time the file is opened and OnCloseFile each time the handle of the file is closed. In other words, if you open the same file from two threads, OnCreateFile will be called twice, once for each open operation.

So, why would we call OnCreateFile more than once for a single file?

Quote
CallbackFileSystem.ParallelProcessingAllowed property description:
When parallel processing is allowed and two or more read requests are made over the same file, they are (given there are worker threads available) sent to OnReadFile callback/event handler in parallel.

Does this mean that, while app A is reading file X with fileHandleA, if app B wants to read file X too, app B is given fileHandleB by calling OnOpenFile?
Or is fileHandleA shared between app A and app B?

CBFS Driver Ver. 4.0.121.296
System Info: Windows 7 Prof x64.

Thank you.
#21933
Posted: 10/11/2012 05:38:39
by Volodymyr Zinin (EldoS Corp.)

Quote
Ulughbek Muslimov wrote:
Does this mean that, file parallel reading is now available, which will solve the problem discussed previously?

Yes. Actually this feature was created because of your report.

Quote
Ulughbek Muslimov wrote:
2. If 1st question's answer is positive, is setting ParallelProcessingAllowed property to true and SerializeCallbacks property to false the only step to achieve file parallel reading?

It's also necessary to set the ThreadPoolSize property to a value greater than 1. These threads are used to call the callbacks. So if all of them are busy (i.e. callbacks are being called) but there are pending requests that are allowed to be processed in parallel, these pending requests can't be processed until a worker thread becomes free.
Just set ThreadPoolSize to some "reasonable" value (maybe NumOfProcessors*2). It depends on your implementation of the CallbackFS callbacks: how long they take to process and whether they wait (i.e. free the processor).
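
To illustrate why the pool size matters, here is a rough simulation using Python's standard thread pool rather than CallbackFS itself. The helper `peak_concurrency` and the variable `suggested_pool_size` are hypothetical names; the sleep stands in for a callback that waits on I/O.

```python
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(pool_size: int, requests: int, work_seconds: float = 0.05) -> int:
    """Run `requests` simulated callbacks on a pool and report how many overlapped."""
    lock = threading.Lock()
    active = 0
    peak = 0

    def callback():
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(work_seconds)  # the callback "waits", freeing the processor
        with lock:
            active -= 1

    # Leaving the `with` block waits for all submitted callbacks to finish.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        for _ in range(requests):
            pool.submit(callback)
    return peak


# The "NumOfProcessors*2" rule of thumb from the post, as a starting point only.
suggested_pool_size = (os.cpu_count() or 1) * 2
```

With a pool of one worker the peak is always 1, however many requests are pending. With a larger pool several simulated reads overlap, which is the effect the ThreadPoolSize setting is meant to allow.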

Quote
Ulughbek Muslimov wrote:
OnCreateFile event/delegate/callback description: Sometimes it can happen that OnCreateFile is fired for a file which already exists. Normally such situation will not happen...

Such a situation can occur, for example, if the CREATE_ALWAYS flag is specified in the CreateFile call.

Quote
Ulughbek Muslimov wrote:
So, why we would call OnCreateFile more than once for a single file?

It seems I didn't understand your question. Any call to the system CreateFile API causes a file to be opened. If it's requested to recreate the file and that is allowed, then the OnCreate callback is called.

Quote
Ulughbek Muslimov wrote:
Does this mean that, while app A is reading file X with fileHandleA, and if app B wants to read file X too, then app B is given fileHandleB by calling OnOpenFile? Or fileHandleA is shared between app A and app B?

In both cases parallel processing is allowed.
#21934
Posted: 10/11/2012 05:56:27
by Kenny Kim (Standard support level)
Joined: 08/19/2009
Posts: 38

Thank you for your detailed answer.

Quote
It's also necessary to set the ThreadPoolSize property to be greater than 1.

ThreadPoolSize is set to maximum available.

Let me try to accomplish what I want over the next couple of days, and I will report back if I succeed.
#22173
Posted: 10/24/2012 15:50:03
by Kenny Kim (Standard support level)
Joined: 08/19/2009
Posts: 38

Good morning.

I have another question about File Parallel Reading.

When I mount a disk and read fileA in parallel, I can see that the file's OpenCount grows with each open request:

...
OPEN: fileA.mkv, PID: 1713, Process Name: explorer.exe, OpenCount: 2
OPEN: fileA.mkv, PID: 4148, Process Name: dllhost.exe, OpenCount: 3
OPEN: fileA.mkv, PID: 4616, Process Name: KmPlayer.exe, OpenCount: 4
OPEN: fileA.mkv, PID: 6460, Process Name: KmPlayer.exe, OpenCount: 5
...

and, as stated in the documentation, each process will initiate a new worker thread for the OnReadFile callback.

How does CbFs detect a new Open/Read request for the same file?
Based on Process ID/Name of the process that originated the operation?

Let me explain the situation:
Virtual Disk X is mounted in Windows 2008 Server.
One of the folders of Disk X is shared using Windows File Sharing Service.
And, a number of users are accessing that shared folder and watching fileA.mkv simultaneously (basically, Windows 2008 Server is being used as a streaming server).
There is no problem with playback itself, but when one of the users starts downloading the file to his computer:
a) His player will freeze for sure, until download completes.
b) Moreover, this will cause other users' players to freeze too.

When I tracked the OnOpenFile callback, I noticed that in this case the PID of the originator of the OpenFile requests is always 4; in Windows Task Manager the description of the process is NT Kernel & System (I guess this is the Network Sharing Service):
...
OPEN: fileA.mkv, PID: 4, Process Name: , OpenCount: 12
...
OPEN: fileA.mkv, PID: 4, Process Name: , OpenCount: 13
...
OPEN: fileA.mkv, PID: 4, Process Name: , OpenCount: 14
...
OPEN: fileA.mkv, PID: 4, Process Name: , OpenCount: 15
...


How does CbFs behave in this case?
Does it differentiate between the OnOpenFile requests initiated by the Network Sharing Service (which is receiving multiple Open/Read requests for the same file from different computers) and start a new thread to complete each request, or are the requests serialized on the assumption that they all originate from a single process?

CbFs driver ver.: 4.0.124.303
Win 7 x64

SerializeCallBacks=false;
ThreadPoolSize=10000;
ParallelProcessingAllowed=true;
CallAllOpenCloseCallBacks=true;


Thank you.
#22174
Posted: 10/25/2012 02:30:21
by Volodymyr Zinin (EldoS Corp.)

Quote
Ulughbek Muslimov wrote:
and, as stated in documentation, each process will initiate a new worker thread for OnReadFile callback.

That isn't so. There is an internal pool of worker threads. When a request from any originator process arrives at CallbackFS, it is possible to process it immediately, and there is a free worker thread in the pool, the request is processed. Otherwise the request is placed in the waiting queue.
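
The pool-and-queue behaviour described here can be modelled roughly as follows. This is a toy model, not CallbackFS code: the `Dispatcher` class and its methods are hypothetical, and it only models the "free worker" condition, not the driver's own rules about which requests may run immediately.

```python
from collections import deque


class Dispatcher:
    """Toy model: a fixed number of workers plus a FIFO waiting queue."""

    def __init__(self, pool_size: int):
        self.free_workers = pool_size
        self.running = []        # requests currently inside a callback
        self.waiting = deque()   # requests waiting for a free worker

    def submit(self, request):
        # The originating process is not consulted; only worker availability matters.
        if self.free_workers > 0:
            self.free_workers -= 1
            self.running.append(request)
        else:
            self.waiting.append(request)

    def complete(self, request):
        self.running.remove(request)
        if self.waiting:
            # The freed worker immediately picks up the oldest waiting request.
            self.running.append(self.waiting.popleft())
        else:
            self.free_workers += 1


d = Dispatcher(pool_size=2)
for pid in (1713, 4148, 4):
    d.submit(("read", pid))
```

After the three submissions the first two requests are running and the third is waiting, regardless of which PID sent it. Completing either running request moves the waiting one onto the freed worker.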

Quote
Ulughbek Muslimov wrote:
How does CbFs detect a new Open/Read request for the same file? Based on Process ID/Name of the process that originated the operation?

Actually this isn't necessary, and CallbackFS doesn't do it. All read requests can be processed in parallel. Of course, if a process sends several read requests asynchronously, then sends write requests, and then sends read requests again, the first group of read requests will be processed in parallel, then the write requests will be processed sequentially, and then the last group of read requests will again be processed in parallel.
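
The ordering described above can be pictured with a small sketch that groups a request stream into batches. This is only an illustration of the read/write ordering from the post, not the driver's actual scheduler; `schedule` is a hypothetical helper, with "R" standing for a read request and "W" for a write request.

```python
from itertools import groupby


def schedule(ops):
    """Group a request stream into batches.

    Consecutive reads form one batch that may run in parallel;
    every write forms its own batch and runs alone.
    """
    batches = []
    for kind, group in groupby(ops):
        group = list(group)
        if kind == "R":
            batches.append(group)                   # one parallel batch of reads
        else:
            batches.extend([op] for op in group)    # writes, one at a time
    return batches


plan = schedule("RRRWWRR")
# plan == [['R', 'R', 'R'], ['W'], ['W'], ['R', 'R']]
```

Each inner list is a set of requests that could be handed to worker threads at the same time, given enough free workers in the pool.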

Quote
Ulughbek Muslimov wrote:
When I tracked OnOpenFile callback, I have noticed that, in this case, the PID of the originator of the OpenFile Requests is always 4, in Windows Task Manager the description of the process is NT Kernel & System (I guess this is Network Sharing Service):

This is the "System" process, which the SMB server uses to support sharing. CallbackFS works with it in the same way as with other processes, but in addition there is some support for network shares: "oplocks", a special synchronization mechanism for network sharing built into Windows. CallbackFS supports it. For Windows 7 and especially Windows 8 this mechanism has been significantly improved, and we are going to add support for the improvement to CallbackFS v4. Perhaps it will help in your case.