S3 Upload consumes a lot of memory

#35085
Posted: 11/23/2015 10:16:42
by ntr1 (Standard support level)
Joined: 02/05/2014
Posts: 73

Hello, I noticed that when you upload a large file to S3 using the WriteObject method, the component allocates an amount of memory equal to the file size.

If you're uploading a 1 GB file, 1 GB is used from RAM.

Is this the default behavior, and is there a way to change it, or have we got some settings wrong?

Thank you
#35087
Posted: 11/23/2015 10:55:14
by Ken Ivanov (EldoS Corp.)

Thank you for contacting us.

Could you please specify how exactly you are calling the method? A piece of the relevant code would be great.

Ken
#35088
Posted: 11/23/2015 11:19:48
by ntr1 (Standard support level)
Joined: 02/05/2014
Posts: 73

Hi,

this is the component creation:

S3Comp := TElAWSS3DataStorage.Create(nil);
S3Comp.OnSecurityHandlerCreated := S3_HandleStorageSecurityHandlerCreated;
S3Comp.OnProgress := S3_HandleStorageProgress;
S3Comp.Overwrite := True;
S3HTTPClient := TElHTTPSClient.Create(nil);
S3HTTPClient.OnCertificateValidate := S3_HandleCertValidate;
S3Comp.HTTPClient := S3HTTPClient;
S3Comp.HTTPClient.SendBufferSize := 65535;
S3Comp.UseVersion4Signatures := True;

then, this simple function to upload a file:

Handler := nil;
F := TFileStream.Create(sFileFullPath, fmOpenRead);
S3Comp.WriteObject(sBucketName, ExtractFileName(sFileFullPath), F, '', '', nil, nil, Handler);
FreeAndNil(F);

When WriteObject is called for a 1 GB file, for example, you will see the process rapidly increase its memory usage to 1 GB, and only then does the upload start. How can we change this and perform the upload without allocating the whole file in memory?
#35092
Posted: 11/23/2015 13:23:48
by Ken Ivanov (EldoS Corp.)

I can see the problem now, thank you.

Due to the specifics of AWS version 4 signature requirements, an S3 client that uploads an object in one-step (non-multipart) mode must include its contents in the overall signature hash calculation. In other words, every such upload operation requires two passes over the object content: the first to calculate the hash, and the second to actually send the object to the server. This means TElAWSS3DataStorage must cache the object data somewhere to avoid rewinding the stream. This was mistakenly implemented via a built-in memory stream, hence the greedy behaviour. We will re-work this part of the code to ask the user for the temporary stream, so that they can pass a different (e.g. file-based) kind of stream and thus avoid huge memory allocations for larger objects.

The ideal solution, though, would be switching to multipart uploads for larger files instead of uploading them in one piece. To activate multipart uploads, please decrease the MultipartUploadThreshold property to the desired threshold value (e.g. 100 MB); all files of larger size will then be uploaded in small chunks, avoiding the creation of large in-memory streams.
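For instance, something along these lines, continuing from the setup code you posted (I'm expressing the threshold in bytes here; please double-check the unit the property expects against the documentation):

// Switch to multipart mode for objects larger than ~100 MB
// (value assumed to be in bytes; verify the expected unit)
S3Comp.MultipartUploadThreshold := 100 * 1024 * 1024;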

Thank you for reporting the problem, it was a good catch.

Ken
#35134
Posted: 11/30/2015 05:09:28
by ntr1 (Standard support level)
Joined: 02/05/2014
Posts: 73

Unfortunately the workaround doesn't work.

You can try it yourself: when you set MultipartUploadThreshold to 100 MB (131072), the OnProgress event is rarely processed, so the program freezes during the upload.

The same thing happens if I change the MultipartUploadPartSize property.

By the way, in the Help file, both properties are described as:

"Specifies the minimal object size for which a multipart upload method will be used."

But they have different names and different default values. What's wrong?
#35142
Posted: 12/01/2015 05:46:29
by Ken Ivanov (EldoS Corp.)

Quote
Unfortunately the workaround doesn't work.

Hmm, do I understand right that the memory allocation issue has not gone away even after you switched to multipart upload mode?

Quote
You can try it yourself: when you set MultipartUploadThreshold to 100 MB (131072), the OnProgress event is rarely processed, so the program freezes during the upload.

We will check that now, thank you.

Quote
By the way, in the Help file, both properties are described as:

"Specifies the minimal object size for which a multipart upload method will be used."

Thank you for letting us know about the mistake. There is a typo in the description of the MultipartUploadPartSize property. The title should read 'Specifies the size of one part in bytes for multipart upload mode'. The rest of the information seems to be correct.

We will get back to you shortly once things are clear with the OnProgress event.

Ken
#35256
Posted: 12/21/2015 07:07:46
by ntr1 (Standard support level)
Joined: 02/05/2014
Posts: 73

Any news about the OnProgress bug and memory leak?
#35257
Posted: 12/21/2015 08:45:02
by Ken Ivanov (EldoS Corp.)

The allocation issue was partially addressed in build 14.0.286 by employing temporary file streams instead of a memory stream. The granularity of OnProgress is currently being re-worked. We expect it to be implemented for the next SecureBlackbox update. Sorry for making you wait.

I'd also like to stress that this is neither a bug nor a memory leak, as you referred to them above. While some over-allocation indeed does take place, the memory is correctly cleaned up after use.

Ken
#36015
Posted: 02/26/2016 09:18:59
by ntr1 (Standard support level)
Joined: 02/05/2014
Posts: 73

Hi,

I've updated CloudBlackbox to version 14.

The upload of a file to Amazon S3 still consumes a lot of memory.

For example, if you want to send a 3 GB file and you have only 2 GB of RAM, the result is an "Out of Memory" error. This makes the component practically unusable in many scenarios.


Setting the MultipartUploadPartSize or MultipartUploadThreshold properties to a small value such as 128 MB doesn't help.

How can we solve this?
#36016
Posted: 02/26/2016 09:40:00
by Ken Ivanov (EldoS Corp.)

Hi,

Quote
For example, if you want to send a 3 GB file and you have only 2 GB of RAM, the result is an "Out of Memory" error. This makes the component practically unusable in many scenarios.

In order to make use of temporary file streams, please handle the TElAWSS3DataStorage.OnCreateTemporaryStream event. Inside the handler you will need to create a temporary file stream object and return it via the handler's by-reference Stream parameter.

If the OnCreateTemporaryStream event is not handled, or if no stream is returned from it, an internal memory stream is created (leading to the same memory allocation issue).
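A minimal sketch of such a handler (I'm inferring the event signature from the description above rather than copying it from the unit, so please verify it against the actual declaration; TUploader is just the class hosting your existing handlers):

uses System.IOUtils; // for TPath.GetTempFileName

procedure TUploader.S3_HandleCreateTemporaryStream(Sender: TObject; var Stream: TStream);
begin
  // Back the cached upload data with a temporary file instead of RAM.
  // The component is expected to use and free this stream internally.
  Stream := TFileStream.Create(TPath.GetTempFileName, fmCreate);
end;

S3Comp.OnCreateTemporaryStream := S3_HandleCreateTemporaryStream;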

Quote
Setting the properties MultipartUploadPartSize or MultipartUploadThreshold to a small value such as 128 MB doesn't help.

That's weird. Did you try smaller values (e.g. threshold = 20MB, part size = 10MB)?
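That is, something like:

// Again assuming byte values; adjust if the properties expect a different unit
S3Comp.MultipartUploadThreshold := 20 * 1024 * 1024; // multipart above ~20 MB
S3Comp.MultipartUploadPartSize := 10 * 1024 * 1024;  // ~10 MB per part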

Ken
