Encoding issue with CBFS and C# / C++ CLR

Posted: 03/05/2015 09:23:37
by Paolo Manili (Basic support level)
Joined: 02/23/2015
Posts: 16


We are developing a C# client using CBFS for our private cloud file sharing solution. Our client forwards callbacks to our C++ CLR libraries, which use UTF-8.
However, we are running into a character encoding problem with strings received from CBFS: when accented characters are involved (e.g. è, à, ò, ù, ì), we end up with corrupt strings.
By searching the forum we found that CBFS should be passing Unicode characters, but we can't seem to convert them successfully.

This is an example of our conversion code:

    System::String ^Converter::ToUTF8(System::String ^str) {
        if (System::String::IsNullOrEmpty(str)) return "";
        System::Text::Encoding^ df = System::Text::Encoding::Default;
        System::Text::Encoding^ u8 = System::Text::Encoding::UTF8;
        // Encode with the system default (ANSI) code page...
        auto bytes = df->GetBytes(str);
        // ...then decode those same bytes as if they were UTF-8.
        auto result = u8->GetString(bytes);
        return result;
    }

We also tried passing System::Text::Encoding::Unicode, but that results in Chinese ideograms coming up.
Are we missing something?

Thanks in advance for any help you can provide,
Paolo Manili

Posted: 03/05/2015 11:03:30
by Volodymyr Zinin (Team)

Hi Paolo,

Is your function correct? As I understand it, the function takes a String object and returns a String object. A String in C# is a collection of Char objects, which represent characters in the UTF-16 format. So the input and the output strings should be the same.

Posted: 03/05/2015 11:33:05
by Eugene Mayevski (Team)

Your function makes no sense.

1) String holds a UTF-16 string in .NET. Your function can't return "UTF8 string".
2) CBFS gives you strings in Unicode. You need to convert them using a call to Encoding::UTF8->GetBytes(SourceString). This method will return an array of bytes that represent a UTF8 string. You then pass a UTF8 byte array (this is a byte array, not an array of chars!) to your function.
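
To make the byte-array point concrete, here is a minimal sketch in standard C++ of what a UTF-16 to UTF-8 conversion produces. It is not CBFS or .NET code: the function name utf16_to_utf8 is hypothetical, and replacing unpaired surrogates with U+FFFD is a choice made for this sketch only. In the C++/CLI bridge, the framework's own UTF8 encoder would normally do this work.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode a UTF-16 string as UTF-8 bytes.
// Unpaired surrogates become U+FFFD (an assumption of this sketch).
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // Low surrogate with no preceding high surrogate.
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With this sketch, the single UTF-16 code unit for è (U+00E8) becomes the two bytes C3 A8. The result is a sequence of bytes, which is why it cannot be carried around as a .NET String.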

Sincerely yours
Eugene Mayevski

Posted: 03/12/2015 07:20:12
by Paolo Manili (Basic support level)
Joined: 02/23/2015
Posts: 16

Ok, thanks for the clarification.

I understand that the quoted code can seem a little pointless out of context; it is a snippet taken from our C# to C++ bridge.
We are now checking whether using Encoding::UTF8->GetBytes(string) gives us the expected result.
We will post the results asap.

Thank you for your support,
Paolo Manili

Posted: 07/28/2015 04:25:16
by Paolo Manili (Basic support level)
Joined: 02/23/2015
Posts: 16

Hi, we worked on the issue and it did turn out to be a problem with how we encoded characters.

We have mostly resolved the issue, though it has presented itself in another form.

When we add files with accented characters to our CBFS volume through Explorer, everything works fine, and we can interact with them fully (delete, rename, move, etc.).

However, when one such file is added remotely to our structure, some types of interaction fail even though the name shows up fine in Explorer, in our database and in the queries. For example, delete calls fail on files containing è, ò, à, ù or ì, and so do renames.
By debugging I found that the call that probably fails is "OnEnumerateDirectory". When we get the call for the file, with the passed Mask as a filter, our database responds with the correct file and the name matches (in the debugger the passed names look identical apart from case). However, Windows treats it as a failure and shows the "File is unavailable" error.

Is there some special escaping we should do when setting the filename in OnEnumerateDirectory?

Thanks in advance,
Paolo Manili

Posted: 07/28/2015 04:55:41
by Eugene Mayevski (Team)

There's no escaping needed, but as I mentioned above, the system and CBFS work with UTF-16. If you perform a conversion, the problem is most likely in the conversion code. Please check the OnEnumerateDirectory, OnOpenFile and OnGetFileInfo event handler implementations: they are all involved in the file deletion operation. Add some logging to your code to ensure that the file names you report are the same as the ones you get in the parameters.
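
Such logging is most useful when it prints the raw UTF-16 code units rather than the rendered text, because two names can look identical in a debugger and still be different strings. The sketch below is standard C++ with a hypothetical helper, dump_code_units, that is not part of CBFS. The example it is meant for is an assumption for illustration, not a cause confirmed in this thread: a precomposed è (U+00E8) and an e followed by a combining grave accent (U+0300) render the same but do not compare equal.

```cpp
#include <cstddef>
#include <string>

// Hypothetical logging helper: render each UTF-16 code unit as four
// uppercase hex digits separated by spaces, e.g. "0065 0300".
std::string dump_code_units(const std::u16string& name) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < name.size(); ++i) {
        if (i != 0) out += ' ';
        unsigned unit = name[i];
        out += digits[(unit >> 12) & 0xF];
        out += digits[(unit >> 8) & 0xF];
        out += digits[(unit >> 4) & 0xF];
        out += digits[unit & 0xF];
    }
    return out;
}
```

Logging the dump of the name received in each handler next to the dump of the name reported back makes any mismatch visible, whatever its source turns out to be.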

Sincerely yours
Eugene Mayevski


