EldoS | Feel safer!

Write and delayed close

Posted: 06/23/2011 03:59:51
by Volodymyr Zinin (Team)

Or keep a list of open sockets. Set a maximum number of sockets and, when it is reached, close the least recently used one.
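
A minimal sketch of such a least-recently-used socket list, in portable C. It is an illustration only, not code from CBFS or from the poster's file system: the names `sock_lru`, `sock_lru_init`, `sock_lru_touch`, `SOCK_CACHE_MAX` and `NO_SOCK` are hypothetical, and sockets are represented as plain `int` handles. The cache does not close anything itself; it returns the evicted handle so the caller can close it.

```c
#include <assert.h>
#include <stddef.h>

#define SOCK_CACHE_MAX 8    /* hypothetical upper bound for this sketch */
#define NO_SOCK (-1)        /* returned when nothing had to be evicted */

typedef struct {
    int    socks[SOCK_CACHE_MAX];  /* socks[0] is the most recently used */
    size_t count;                  /* handles currently cached */
    size_t capacity;               /* maximum allowed open sockets */
} sock_lru;

void sock_lru_init(sock_lru *c, size_t capacity)
{
    assert(capacity > 0 && capacity <= SOCK_CACHE_MAX);
    c->count = 0;
    c->capacity = capacity;
}

/*
 * Mark `sock` as just used. If it is new and the cache is full, the least
 * recently used handle is dropped and returned so the caller can close it.
 * Returns NO_SOCK when no eviction was needed.
 */
int sock_lru_touch(sock_lru *c, int sock)
{
    int evicted = NO_SOCK;
    size_t i = 0;

    while (i < c->count && c->socks[i] != sock)
        i++;

    if (i == c->count) {                    /* not cached yet */
        if (c->count == c->capacity) {
            evicted = c->socks[c->count - 1];
            c->count--;
        }
        i = c->count++;
    }

    /* Shift entries [0, i) down one slot and put `sock` at the front. */
    for (; i > 0; i--)
        c->socks[i] = c->socks[i - 1];
    c->socks[0] = sock;

    return evicted;
}
```

A caller would invoke `sock_lru_touch()` on every write and close whatever handle it returns. In a multi-threaded file system the cache would also need a lock around each call, which this sketch leaves out.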
Posted: 06/23/2011 04:16:25
by Sangmin Lee (Standard support level)
Joined: 06/03/2009
Posts: 57

In that case, almost all sockets on Windows end up pending in the 'TIME_WAIT' state.
Our FS performance degrades sharply.

So, in our FS, the socket established on the first WriteFile() call is kept open until writing completes and is then closed in CloseFile().
Posted: 06/23/2011 04:26:29
by Volodymyr Zinin (Team)

Usually the time span between the immediate and the asynchronous close calls isn't big enough to matter. You can see this with Process Monitor from sysinternals.com, but configure it to show IRP_MJ_CLOSE requests, because they are not shown by default.
So it seems there should be no problem with sockets pending in the 'TIME_WAIT' state.
Posted: 06/23/2011 07:21:59
by Sangmin Lee (Standard support level)
Joined: 06/03/2009
Posts: 57

I implemented an alternative method using a counting semaphore.
It limits the maximum number of sockets connected concurrently.

#define MAX_DS_SOCKS 5
On Mount(),
sock_sem = CreateSemaphore(..., MAX_DS_SOCKS, MAX_DS_SOCKS, ..);
DWORD sock_count = MAX_DS_SOCKS;
sock_mutex = CreateMutex();

On Umount(),

On WriteFile(),
WaitForSingleObject(sock_sem, INFINITE);
WaitForSingleObject(sock_mutex, INFINITE);
sockfd = socket();

On CloseFile(),
WaitForSingleObject(sock_mutex, INFINITE);

I tested two cases:
Test case 1: single thread, "C:\] test.exe m:\ 1 6"
Test case 2: two threads, "C:\] test.exe m:\ 2 100"

Test case 1 completed without errors.
Test case 2 did not complete; it blocked on the semaphore.
The reason is that the CloseFile() callbacks for the files being written were never called.

You can confirm this from the attached log.
Please let me know why CBFS does not call CloseFile().


Posted: 06/23/2011 07:49:49
by Sangmin Lee (Standard support level)
Joined: 06/03/2009
Posts: 57

From the 3rd file to the 6th file, the CloseFile() callbacks were delayed.
On the storage node side, the number of connections is limited.

Generally, sockets in the TIME_WAIT state are released after tens of seconds.
Assume user applications on our FS constantly write small files from multiple threads.

There was no problem with applications that use a small number of threads.

Test case 1 log:
CreateFile(default:M:/5/5/_FILE_9051) <-- 1st file
client_write(/5/5/_FILE_9051): ds_connect(), TID(3068)
ds_connect: get, sock_count = 4

WriteFile(default:M:\5\5\_FILE_9051), Position(0), size(5119).
WriteFile(default:M:\5\5\_FILE_9051), Position(520215), size(4113).

CreateFile(default:M:/8/_FILE_3095) <-- 2nd file
client_write(/8/_FILE_3095): ds_connect(), TID(3068)
ds_connect: get, sock_count = 3

WriteFile(default:M:\8\_FILE_3095), Position(0), size(5119).
WriteFile(default:M:\8\_FILE_3095), Position(520215), size(4113).

CloseFile(default:M:\5\5\_FILE_9051): Entry.. <-- 1st file close
InvalidateFile(/5/5/_FILE_9051): release, sock_count(4), TID(2876)

CloseFile(default:M:\8\_FILE_3095): Entry.. <-- 2nd file
InvalidateFile(/8/_FILE_3095): release, sock_count(5), TID(3068)

CreateFile(default:M:/1/10/1/1/_FILE_570) <-- 3rd file
client_write(/1/10/1/1/_FILE_570): ds_connect(), TID(3068)
ds_connect: get, sock_count = 4

WriteFile(default:M:\1\10\1\1\_FILE_570), Position(0), size(5119).
WriteFile(default:M:\1\10\1\1\_FILE_570), Position(520215), size(4113).

CreateFile(default:M:/7/_FILE_8035) <-- 4th file
client_write(/7/_FILE_8035): ds_connect(), TID(3068)
ds_connect: get, sock_count = 3

WriteFile(default:M:\7\_FILE_8035), Position(0), size(5119).
WriteFile(default:M:\7\_FILE_8035), Position(520215), size(4113).

CreateFile(default:M:/1/_FILE_7046) <-- 5th file
client_write(/1/_FILE_7046): ds_connect(), TID(2876)
ds_connect: get, sock_count = 2

WriteFile(default:M:\1\_FILE_7046), Position(0), size(5119).
WriteFile(default:M:\1\_FILE_7046), Position(520215), size(4113).

CreateFile(default:M:/1/_FILE_1731) <-- 6th file
client_write(/1/_FILE_1731): ds_connect(), TID(2876)
ds_connect: get, sock_count = 1

WriteFile(default:M:\1\_FILE_1731), Position(0), size(5119).
WriteFile(default:M:\1\_FILE_1731), Position(520215), size(4113).

CloseFile(default:M:\1\10\1\1\_FILE_570): Entry.. <-- 3rd file
InvalidateFile(/1/10/1/1/_FILE_570): release, sock_count(2), TID(3068)

CloseFile(default:M:\7\_FILE_8035): Entry.. <-- 4th file
InvalidateFile(/7/_FILE_8035): release, sock_count(3), TID(3068)

CloseFile(default:M:\1\_FILE_7046): Entry.. <-- 5th file
InvalidateFile(/1/_FILE_7046): release, sock_count(4), TID(3068)

CloseFile(default:M:\1\_FILE_1731): Entry.. <-- 6th file
InvalidateFile(/1/_FILE_1731): release, sock_count(5), TID(3068)
Posted: 06/24/2011 05:38:15
by Volodymyr Zinin (Team)

When your code waits inside a callback, the worker thread that calls the callback stays blocked and therefore can't process other requests. When all worker threads (see the CallbackFileSystem.ThreadPoolSize property) are waiting, no other requests will be processed.
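
The effect can be illustrated with a small deterministic model in C. This is a simplification written for this thread, not CBFS internals: `dispatched_before_stall` and `req_kind` are hypothetical names, requests are handed to workers strictly in order, a write that obtains a permit is assumed to finish at once, and a write that finds no permit parks its worker until a close runs.

```c
#include <assert.h>
#include <stddef.h>

/* Request kinds, modelled on the callbacks discussed in this thread. */
typedef enum { REQ_WRITE_FIRST, REQ_CLOSE } req_kind;

/*
 * The first write to a file takes one of `permits` socket permits and a
 * close returns one. Returns how many requests were dispatched before the
 * pool ran out of free workers (n means everything was dispatched).
 */
size_t dispatched_before_stall(const req_kind *reqs, size_t n,
                               int workers, int permits)
{
    int free_workers = workers;
    int free_permits = permits;
    int parked = 0;                 /* workers blocked waiting for a permit */

    assert(workers > 0 && permits >= 0);

    for (size_t i = 0; i < n; i++) {
        if (free_workers == 0)
            return i;               /* nobody is left to run this request */

        if (reqs[i] == REQ_WRITE_FIRST) {
            if (free_permits > 0) {
                free_permits--;     /* got a socket, worker returns */
            } else {
                parked++;           /* worker waits inside the callback */
                free_workers--;
            }
        } else {                    /* REQ_CLOSE */
            if (parked > 0) {
                parked--;           /* permit goes straight to a waiter */
                free_workers++;
            } else {
                free_permits++;
            }
        }
    }
    return n;
}
```

With three first writes followed by three closes, one permit and two workers, the model stops after three requests: both workers are parked in the write callback, so the closes that would release permits are never dispatched. With 200 workers the same sequence runs to completion.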
If the problem is a shortage of sockets, why not use just one or a few sockets and send all the requests through them?
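
One way to spread files over a few long-lived connections is to pick the connection by hashing the file path, as in the sketch below. This is only an assumption about how it could be done: `conn_index_for` and `hash_path` are hypothetical helpers, the hash is plain 32-bit FNV-1a, and the storage protocol is assumed to carry the file identity in every request so that several files can share one socket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a hash of a NUL-terminated path. */
static uint32_t hash_path(const char *path)
{
    uint32_t h = 2166136261u;                /* FNV offset basis */
    for (; *path != '\0'; path++) {
        h ^= (uint8_t)*path;
        h *= 16777619u;                      /* FNV prime */
    }
    return h;
}

/*
 * Map a file path to one of `nconns` shared connections. The same path
 * always maps to the same index, so all writes to one file use one socket.
 */
size_t conn_index_for(const char *path, size_t nconns)
{
    assert(nconns > 0);
    return hash_path(path) % nconns;
}
```

Each shared connection would still need its own lock so that requests from different callbacks are not interleaved on the wire, but no callback has to wait for a new socket to become available.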
Posted: 06/24/2011 05:45:23
by Volodymyr Zinin (Team)

BTW you can use Process Monitor from sysinternals.com to find which I/O requests cause the problem. Run it and set a filter to show only requests passed to your virtual disk (for example, the filter can be "Path"->"Begins with"->"X:"). For the most part the system requests correspond to the CBFS callbacks.
Posted: 06/27/2011 22:30:54
by Sangmin Lee (Standard support level)
Joined: 06/03/2009
Posts: 57

I know the default ThreadPoolSize is 65536, and our FS uses this value. In test case 2, the loop count was just 100. Therefore, I think it's impossible that all worker threads in CBFS were suspended, because the number of callbacks waiting in our client never reached ThreadPoolSize.
I also ran Process Monitor during the test. It showed that the close system call was performed for the file whose CloseFile() callback never reached our client.

Can I get the current number of available worker threads in CBFS? If so, I can determine whether all worker threads in CBFS are in use.
Posted: 06/28/2011 03:48:45
by Volodymyr Zinin (Team)

When ThreadPoolSize is 0xFFFF (65535), CallbackFS replaces it with a default value, which in the latest build is numberOfProcessors*10 but may change in the future. Try setting it to, for example, 200.
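
As a rough model of the rule just described (not CallbackFS code), the effective pool size could be computed as below. The function name `effective_pool_size` and the macro `POOL_SIZE_DEFAULT_SENTINEL` are made up for this sketch; the sentinel 0xFFFF and the factor 10 are taken from this reply, which also warns that the default may change.

```c
#include <assert.h>

#define POOL_SIZE_DEFAULT_SENTINEL 0xFFFFu  /* sentinel quoted in this reply */

/*
 * Worker-thread count implied by the rule in this reply: the sentinel
 * selects numberOfProcessors*10, any other value is used as given.
 */
unsigned effective_pool_size(unsigned requested, unsigned num_processors)
{
    assert(num_processors > 0);
    if (requested == POOL_SIZE_DEFAULT_SENTINEL)
        return num_processors * 10u;
    return requested;
}
```

Under this rule a 4-processor machine left at the sentinel value has 40 worker threads, far fewer than the nominal number suggests, which is why a few dozen waiting callbacks can be enough to occupy the whole pool.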

Sangmin Lee wrote:
I also ran Process Monitor during the test. It showed that the close system call was performed for the file whose CloseFile() callback never reached our client.

A file is really closed only when the system passes the IRP_MJ_CLOSE request to CallbackFS. By default Process Monitor doesn't show such requests. To see them, remove the following entry from the Process Monitor filter:
Operation => Begins with => IRP_MJ_ => Exclude

Sangmin Lee wrote:
Can I get the current number of available worker threads in CBFS?

By monitoring your callbacks you can calculate how many worker threads are running. For example, if you set ThreadPoolSize to 200 and 10 callbacks are executing at a given moment (say, 10 calls of the OnClose callback, all of them waiting for sock_mutex), then 190 of the worker threads are free.
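
The counting can be sketched with a C11 atomic counter that every callback increments on entry and decrements on exit. The names `callback_enter`, `callback_leave` and `free_workers` are hypothetical and not part of the CBFS API; the pool size is whatever value was assigned to ThreadPoolSize.

```c
#include <assert.h>
#include <stdatomic.h>

/* Number of callbacks currently executing (static storage, starts at 0). */
static atomic_int active_callbacks;

/* Call as the first statement of every callback. */
void callback_enter(void)
{
    atomic_fetch_add(&active_callbacks, 1);
}

/* Call on every exit path of every callback. */
void callback_leave(void)
{
    int prev = atomic_fetch_sub(&active_callbacks, 1);
    assert(prev > 0);   /* a leave without a matching enter is a bug */
    (void)prev;
}

/* Worker threads not currently inside a callback, for a known pool size. */
int free_workers(int pool_size)
{
    return pool_size - atomic_load(&active_callbacks);
}
```

Logging `free_workers(200)` at the start of each callback shows whether the pool is running dry: with 10 callbacks in progress it reports 190, matching the example above.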



