EldoS | Feel safer!

Software components for data protection, secure storage and transfer

SMTP Charsets

#27022
Posted: 10/28/2013 17:05:38
by Tim Frost (Standard support level)
Joined: 07/20/2007
Posts: 17

I am looking (without success yet) for some brief documentation on how SMTP charset conversion is implemented, principally for outbound e-mail. I would like to have a convenient list of the supported charset name syntax, without having to delve into many source files. The source seems to have a concept of registering charsets, and this may be of interest to us if we want to support only a subset of available charsets. There seem to be a large number of source files involved, and these perhaps duplicate tables already available in Windows: what is the reason for not simply using the Windows API for the most common charsets?

I have not found any documentation for any of the Charset classes or functions, other than something like, 'this field has the charset name'.
#27023
Posted: 10/29/2013 01:58:28
by Eugene Mayevski (EldoS Corp.)

SMTP doesn't do any charset conversion: it's a protocol for sending mail, not for processing it. Conversions are done in and by the MIME classes.

SecureBlackbox includes many charsets, and its set is wider than the set of Windows charsets. It doesn't make sense to use the Windows conversion when we have our own, and you've forgotten that SecureBlackbox targets dozens (if not hundreds) of platforms and operating systems, of which "Windows" (meaning the Windows editions that have charset support, as not all of them do) is just a small part.


Sincerely yours
Eugene Mayevski
#27024
Posted: 10/29/2013 04:21:24
by Tim Frost (Standard support level)
Joined: 07/20/2007
Posts: 17

Sorry, "SMTP" was inaccurate. And I had not forgotten that SBB targets other platforms, but this is not necessarily a good reason for including many large tables in all versions when they are not needed in some.

What I cannot find is even the minimum of documentation about how all this works. Is there even a list of supported charset names? Is utf-8 the default?

We are considering switching an existing SMTP/MIME application to use SBB to build the messages. For many years we have used other software to build the messages and SBB to sign/encrypt them when needed. So I need to document for our users how they specify charset names. Even a simple statement that all the IANA-listed names and aliases are supported would be helpful, together with any omissions. And some notes on the benefits of defining USE_CHARSETS_EXPLICITLY and/or SB_REDUCED_CHARSETS (which I found in the source) in a high-volume e-mail environment would be useful also.

The fact that your documentation is now so comprehensive makes the omission of any mention of the charset classes surprising (unless it is hidden somewhere!).
#27026
Posted: 10/29/2013 05:28:17
by Alexander Ionov (EldoS Corp.)

Quote
Tim Frost wrote:
And I had not forgotten that SBB targets other platforms, but this is not necessarily a good reason for including many large tables in all versions when they are not needed in some.

It's much simpler for us to have our own charset conversion library, which operates the same way on all supported platforms, than to deal with charset support on each platform separately.

Quote
Tim Frost wrote:
Is there even a list of supported charset names?

You can get a list of registered charsets in the following way:
Code
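// Callback invoked by EnumCharsets for each registered charset.
// UserData carries the TStrings instance that collects the results.
procedure EnumSupportedCharsets(const Category, Description, Name, Aliases: string;
  UserData: TUserData; var Stop: Boolean);
var
  List: TStrings;
begin
  List := TObject(UserData) as TStrings;
  // Store each entry as "Description=Name"; Category and Aliases are ignored here.
  List.Add(Description + '=' + Name);
end;

// Usage: collect the registered charsets into a string list.
FCharsets := TStringList.Create();
EnumCharsets(EnumSupportedCharsets, FCharsets);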

The EnumCharsets routine is declared in the SBChSConv unit.

Quote
Tim Frost wrote:
Is utf-8 the default?

No, the default charset is ISO-8859-1. MIME RFC 2045 defines US-ASCII as the default charset, but that leaves the upper 128 character codes (128 to 255) unusable, so we use ISO-8859-1 as the default charset instead.
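
To make the difference between the two defaults concrete, here is a minimal sketch that checks whether a string fits into the US-ASCII range (codes 0 to 127) or the ISO-8859-1 range (codes 0 to 255, which map one-to-one onto U+0000 to U+00FF). The program name and the FitsIn helper are hypothetical and written only for illustration; they are not part of SecureBlackbox and do not use its conversion classes.

```pascal
program DefaultCharsetSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Returns True when every character of S has a code no greater than MaxCode.
// Hypothetical helper for illustration only; not a SecureBlackbox routine.
function FitsIn(const S: WideString; MaxCode: Word): Boolean;
var
  I: Integer;
begin
  Result := True;
  for I := 1 to Length(S) do
    if Ord(S[I]) > MaxCode then
    begin
      Result := False;
      Exit;
    end;
end;

const
  AsciiMax  = $7F; // US-ASCII covers codes 0..127
  Latin1Max = $FF; // ISO-8859-1 covers codes 0..255

var
  Sample: WideString;
begin
  // Build "Cafe" with an accented final letter (U+00E9) without relying
  // on the encoding of the source file.
  Sample := 'Caf';
  Sample := Sample + WideChar($00E9);

  Writeln('Fits US-ASCII:   ', FitsIn(Sample, AsciiMax));
  Writeln('Fits ISO-8859-1: ', FitsIn(Sample, Latin1Max));
end.
```

The first check fails and the second succeeds, because U+00E9 lies above 127 but within 255. Text containing characters above U+00FF fits neither charset and would need one of the UTF charsets.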

Quote
Tim Frost wrote:
And some notes on the benefits of defining USE_CHARSETS_EXPLICITLY and/or SB_REDUCED_CHARSETS

Defining SB_REDUCED_CHARSETS reduces the list of registered charsets to the following: UTF-32, UTF-32BE, UTF-16, UTF-16BE, UTF-8, UTF-7, US-ASCII, ISO-8859-1. These charsets are registered explicitly in the SBChSConv unit. If you don't need any other charsets to be registered, add the SB_REDUCED_CHARSETS definition and rebuild the BaseBBox package.
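
As a rough illustration of how a conditional define can trim a charset list at compile time, consider the sketch below. It is not SecureBlackbox's code: the Registered array is hypothetical, the two extra names in the {$ELSE} branch merely stand in for the much longer full set, and the thread does not say where the library expects the define to be placed (project options, an include file, or elsewhere).

```pascal
program ReducedCharsetsSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Comment out the next line to compile the longer (illustrative) list instead.
{$DEFINE SB_REDUCED_CHARSETS}

const
{$IFDEF SB_REDUCED_CHARSETS}
  // The eight charsets the reply lists for the reduced build,
  // spelled here with their IANA names.
  Registered: array[0..7] of string = (
    'UTF-32', 'UTF-32BE', 'UTF-16', 'UTF-16BE',
    'UTF-8', 'UTF-7', 'US-ASCII', 'ISO-8859-1');
{$ELSE}
  // Hypothetical extras standing in for the full set of charsets.
  Registered: array[0..9] of string = (
    'UTF-32', 'UTF-32BE', 'UTF-16', 'UTF-16BE',
    'UTF-8', 'UTF-7', 'US-ASCII', 'ISO-8859-1',
    'windows-1252', 'KOI8-R');
{$ENDIF}

var
  I: Integer;
begin
  // Print whichever list was selected at compile time.
  for I := Low(Registered) to High(Registered) do
    Writeln(Registered[I]);
end.
```

Because the choice is made by the compiler, the tables for the excluded charsets never reach the binary, which is why the package has to be rebuilt after the define is changed.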

Quote
Tim Frost wrote:
The fact that your documentation is now so comprehensive makes the omission of any mention of the charset classes surprising (unless it is hidden somewhere!).

You're the first one who has wanted to deal with the charset support classes directly, so until now there was no need to document them.


--
Best regards,
Alexander Ionov

