EldoS | Feel safer!

Software components for data protection, secure storage and transfer

maybe bug in TElSignedCMSMessage destructor

Also by EldoS: MsgConnect
Cross-platform protocol-independent communication framework for building peer-to-peer and client-server applications and middleware components.
#12859
Posted: 03/26/2010 04:34:06
by Christoph Moar (Standard support level)
Joined: 08/28/2009
Posts: 46

Hi,
i have found an issue with the c++ builder implementation of the TElSignedCMSMessage class.
What happens is that - probably - the TElSignedCMSMessage constructor/destructor might be using a resource and not freeing it up completely. I noticed the problem in the following way:

- i make a loop and make approx. 3000 new() and delete() operations on the TElSignedCMSMessage class. i do not notice any exception or problem here.
- after doing this, I am using another api which basically uses tcp connections

Well, strangely enough, this second api gives me errors and cannot really connect to the destination anymore. If i do not make the loop with the TElSignedCMSMessage class, everything goes fine.

I noticed this since we have a batch routing doing mass data signature and timestamping and storage into a document management system. After approx 2000 signature operations (which included 2000 new() and delete() on telsignedcmsmessage), the document management system API was giving me exceptions. I tracked down the situation and identified that the issue disappears as soon as I do not new() and delete() the signedcmsmessage anymore.

Instead, now I create one signedcmsmessage object and "reuse" it all the times using the createnew() method. The issue thus disappeared.

I understand this is not really a very precise error identification, but believe me it took me days to get at least to this point. Not knowing which ressources the telsignedcmsmessage actually acquires/releases, there's not much more input I can give to help you identify a possible leak. Can you maybe have a look at that class constructor/destructor and check if there is any system ressource that might get used and not free'd correctly?

Regards,
Christoph
#12860
Posted: 03/26/2010 06:56:05
by Ken Ivanov (EldoS Corp.)

Thank you for contacting us.

Can you please clarify some points in your message:

Quote
i have found an issue with the c++ builder implementation of the TElSignedCMSMessage class.

Which C++ Builder version are you using?

Quote
i make a loop and make approx. 3000 new() and delete() operations on the TElSignedCMSMessage class. i do not notice any exception or problem here.

Are new() and delete() the only operations you perform within a loop?

Quote
Well, strangely enough, this second api gives me errors and cannot really connect to the destination anymore. If i do not make the loop with the TElSignedCMSMessage class, everything goes fine.

What kind of errors does that "Second API" expose? Do you face exactly the same errors every time you run your application or they differ from time to time?
#12861
Posted: 03/26/2010 08:17:02
by Christoph Moar (Standard support level)
Joined: 08/28/2009
Posts: 46

Hi Innokentiy,
- I use c++ builder 2010
- Yes, I removed anything else in the loop and the only things I do now are new() and delete() operations
- The strange thing itself is that I can make as many new()/delete() operations as I want. I immagined that if there was any resource allocation/free problem in constructor/destructor I should get an error when trying to do it event more time
- The "second api" is basically a black box and permits me to insert/update/delete documents from a document management system over a tcp connection.

Basically, the real scenario was:

(loop over 2000 documents)
for each document
{
other_api -> getdocument()
eldos->signdocument()
other_api -> putdocument()
}

This loop worked fine up to - reproducable - 2004 documents, then the other_api would throw a kind of "connect error" (sorry, no real operating system error here). So i basically dug down into the code and tried to simplify it to identify the error. it got again reproducable by doing this:

one single document
{
other_api -> getdocument()
loop 2000 times: { eldos->signdocument() }
other_api -> putdocument()
}

So the next step was to check what happens in the eldos->signdocument block of code. I removed step by step any call or object i was doing, until, at the end, the following code remains:

one single document
{
other_api -> getdocument()
loop 2000 times
{
new telcmssignedmessage();
delete telcmssignedmessage();
}
other_api -> putdocument()
}

Again, the error occurs.
If I remove the delete telcmssignedmessage() (thus producing a memory leak), the error remains. Once I remove the new telcmssignedmessage() the error
is gone.

So basically now I rewrote the code not to use a new() and delete() of a cmssignedmessage, but to re-use instead the very same instance for all the documents I am signing. no error anymore.

--

So all I can say is a hypothesis: it looks like there might be something in the constructor of telcmssignedmessage which somehow uses a system resource (maybe file handles? maybe sockets?). Obviously I cannot check this, but maybe you understand what I am thinking about. But i do not really understand why - if telcmssignedmessage constuctor would be acquiring such a resource - the same constructor should not throw an exception itself if, after 2000 iterations, it cannot acquire a new one again (like later the other_api suggests do be doing).

I know this is all rather nebulous, for now I have a working workaround anyhow, but I wanted to suggest to have a look at the code, maybe really there is something involved.

thanks for your interest in this issue,
regards

christoph
#12870
Posted: 03/28/2010 22:47:26
by Ken Ivanov (EldoS Corp.)

Thank you for very detailed explanation of the issue. We will try to narrow it down.
#12871
Posted: 03/29/2010 02:25:57
by Christoph Moar (Standard support level)
Joined: 08/28/2009
Posts: 46

Thanks Innokentiy,
and thanks for investigating the issue. As I said, what happens is very strange and I cannot even exclude that the issue is actually a poorly error_safe other_api, but the issue is:

- if instantiating and deleting the telcmssignedmessage is - hypothetically - creating some kind of resource leak
- why is a subsequent new() of telcmssignedmessage not throwing an error
- but a subsequent usage of the other_api throwing a reproducable error

This is rather nebulous, I would assume that if telcmssignedmessage is acquiring resources in the constructor (and, hypothetically, not releasing them in the destructor), then a subsequent constructor call should throw an error. This is not the case. So it might even be that telcmssignedmessage is kind of "robust" to such an issue and maybe falls back on a fallback code, while the other_api is not doing it?


I would regret having you spend time on this issue and maybe finding out that the issue is at the other_api side, even though I have really tried everything to make sure that that is not the case. Maybe it's enough if you look at the constructor/destructor if telcmssignedmessage and tell me if and what kind of resources it is acquiring/releasing, withouth you having to spend too much time?

thanks,
regards
christoph
#12879
Posted: 03/29/2010 06:39:45
by Ken Ivanov (EldoS Corp.)

We have managed to pinpoint the reason. The problem is caused by a small resource leakage in each TElSignedCMSMessage instance. Please apply the following fix to eliminate the leak:
a) open the SBCMS.pas unit,
b) find the implementation of the destructor of TElCMSMessage,
c) add the FreeAndNil(FMessage) call right after the Reset() call. This way, the correct destructor implementation will look like
Code
destructor TElCMSMessage.Destroy;
begin
  Reset;
  FreeAndNil(FMessage);
  inherited;
end;


It is likely that the other_api tries to allocate all the needed resources at once (and fails not being able to do it), while TElSignedCMSMessage allocates a little amount of the leaking resource for every instance. That's why you succeeded to create another TElSignedCMSMessage object, while failing to run the other_api successfully.
#12881
Posted: 03/29/2010 07:09:16
by Christoph Moar (Standard support level)
Joined: 08/28/2009
Posts: 46

Great Innokentiy.
Unfortunately, I am not compiling my own pas units - i use the precompiled c++ builder version. I you want me to test the issue, just tell me from which compiled version on you will have the fix integrated and I'll let you have confirmation by then. Or you send me the SBCMS.dcu/.hpp file along and I'll check it right away. For me, it's no hurry, for the time being I am creating one single message instance and reusing it all the times for now.

Let me know if and when you are releasing a precompiled version with this patchlevel.

thanks!

christoph
#12882
Posted: 03/29/2010 07:44:23
by Ken Ivanov (EldoS Corp.)

Well, the workaround will be slightly more complex then:
1) Create a descendant of TElSignedCMSMessage class (e.g., TElFixedSignedCMSMessage) with the only destructor declared,
2) Override the destructor in the following way (Delphi notation):
Code
destructor TElFixedSignedCMSMessage.Destroy;
begin
  inherited;
  FreeAndNil(FMessage);
end;


The fix will be "officially" included to the future build update (SBB 8 beta).

The workaround you are using is correct and does not lead to resource leakage (a single instance of the object is actually leaking), so you can safely use it for now.

Thank you very much for reporting the issue.
Also by EldoS: CallbackDisk
Create virtual disks backed by memory or custom location, expose disk images as disks and more.

Reply

Statistics

Topic viewed 1536 times

Number of guests: 1, registered members: 0, in total hidden: 0




|

Back to top

As of July 15, 2016 EldoS Corporation will operate as a division of /n software inc. For more information, please read the announcement.

Got it!