Forum: 4WheelHam BBS

Default Character Set when none is defined

From Scott Street@1:266/625 to All on Tue Jun 3 09:45:20 2025

Fellow Developers,

I've been tinkering with writing a Python-based tosser/packer. One thing I've noticed while processing mail: not all messages contain a character set kludge (and I find, to my surprise, that my own current BBS doesn't support it -cringe-). Now to the specific question: what character set does one use as a default when none is defined?

Do I use ASCII, CP437, CP850, or something else? (I'm in the US, so gut reaction is to use CP437) But is that the right choice?

Secondly, the FTS documents suggest that the character set is to be applied to the body and header portions of the message. Does that include the kludge lines? I'm currently handling them separately as 'ascii', which, given the history of Fidonet, I chose as the more likely answer.

Looking for your thoughts.

And if a different echo is more proper, please point me in that direction and I'll continue the discussion there.

Scott

--- Mystic BBS v1.12 A49 2024/05/29 (Linux/64)
* Origin: <=-{ The Digital Post }-=> (1:266/625)

From deon@3:633/509 to Scott Street on Wed Jun 4 08:07:31 2025

Re: Default Character Set when none is defined
By: Scott Street to All on Tue Jun 03 2025 09:45 am

Hey Scott,

Though, more to the specific question: what character set does
one use as a default when one is not defined?

Do I use ASCII, CP437, CP850, or something else? (I'm in the US, so gut reaction is to use CP437) But is that the right choice?

The CHRS kludge is for the benefit of the reader, not the sender.

It basically tells the reader that this message is encoded in (CP.., ASCII.., etc.), so that if:
* The reader uses the same encoding, it can display the message as is, OR
* The reader uses a different encoding, it can convert the message from (CP.., ASCII.., etc.) to the one it uses.
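The two cases above can be sketched in Python roughly as follows. This is a minimal sketch, not how any particular BBS does it: the function name `transcode` and the choice to substitute unmappable characters (`errors="replace"`) are assumptions made for illustration.

```python
def transcode(raw: bytes, source_charset: str, reader_charset: str) -> bytes:
    """Re-encode message bytes from the sender's charset to the reader's."""
    if source_charset.lower() == reader_charset.lower():
        # Same encoding on both ends: display the bytes as is.
        return raw
    # Different encodings: go through Unicode text, then out to the
    # reader's charset. Characters the reader's charset cannot
    # represent are replaced with "?".
    text = raw.decode(source_charset, errors="replace")
    return text.encode(reader_charset, errors="replace")


# 0x82 is "é" in CP437; the same letter is 0xE9 in Latin-1.
converted = transcode(b"caf\x82", "cp437", "latin-1")
```

A CP437 box-drawing byte such as 0xC4 has no Latin-1 equivalent, so with this choice of error handling it would come out as "?".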

So what's the right choice for you? I'm assuming you (and your users) author messages in the same encoding, so that is what it should be set to.

I add CP437 to messages that my mailer authors, when sending out echomail/netmail.

Secondly, the FTS documents suggest that the character set is to be applied to the body and header portions of the message. Does that include the kludge lines? I'm currently handling them separately as 'ascii', which, given the history of Fidonet, I chose as the more likely answer.

So, the To, From, Subject, and Message Body (including Tagline/Origin lines) can all technically be set by the sender, and are thus in the sender's encoding.

I don't recall what the FTSC documents say (there is one on the CHRS kludge), but I think it should apply to all of them, so that a reader knows how to present the received message to a user.

--- SBBSecho 3.27-Linux
* Origin: I'm playing with ANSI+videotex - wanna play too? (3:633/509)

From Rob Swindell@1:103/705 to Scott Street on Tue Jun 3 17:16:35 2025

Re: Default Character Set when none is defined
By: Scott Street to All on Tue Jun 03 2025 09:45 am

-cringe-). Though, more to the specific question: what character set does one use as a default when one is not defined?

Do I use ASCII, CP437, CP850, or something else? (I'm in the US, so gut reaction is to use CP437) But is that the right choice?

Here's what FTS-5003 says about that:
Incoming messages without "CHRS" control lines should be considered
as being written in pure ASCII, but may be treated as being written
in some default character set or character encoding scheme. Such as
IBM codepage 437, IBM codepage 866 or UTF-8. It is recommended that
message readers offer the user the option of manually selecting a
different character set or encoding scheme for these messages on a
per-area, per-message or other basis.

For Synchronet, CP437 is assumed when no other character set/encoding is explicitly specified.
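For a Python tosser, the fallback described above might look something like the sketch below. It assumes kludge lines start with SOH (0x01) and end with CR, and that the kludge has the form `CHRS: <identifier> <level>`. The names `CODECS`, `detect_codec` and `decode_body` are hypothetical, the identifier map is deliberately partial, and the `cp437` default simply mirrors the Synchronet behaviour mentioned here; pass `default="ascii"` for the stricter reading of FTS-5003.

```python
import re

# Partial, illustrative map from CHRS identifiers to Python codec names.
CODECS = {
    "ASCII": "ascii",
    "CP437": "cp437",
    "CP850": "cp850",
    "CP866": "cp866",
    "UTF-8": "utf-8",
}

# A kludge line starts with SOH (0x01) at the beginning of a line.
CHRS_RE = re.compile(rb"(?:\A|[\r\n])\x01CHRS:[ \t]*(\S+)", re.IGNORECASE)


def detect_codec(body: bytes, default: str = "cp437") -> str:
    """Return the Python codec named by the CHRS kludge, else the default."""
    match = CHRS_RE.search(body)
    if match is None:
        return default
    ident = match.group(1).decode("ascii", errors="replace").upper()
    # Unknown identifiers also fall back to the default.
    return CODECS.get(ident, default)


def decode_body(body: bytes, default: str = "cp437") -> str:
    """Decode the raw body using the detected (or default) codec."""
    return body.decode(detect_codec(body, default), errors="replace")
```

Offering `default` as a parameter leaves room for the per-area or per-message override that FTS-5003 recommends.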

Secondly, the FTS documents suggest that the character set is to be applied to the body and header portions of the message. Does that include the kludge lines? I'm currently handling them separately as 'ascii', which, given the history of Fidonet, I chose as the more likely answer.

Interesting question. Are you actually finding non-ASCII chars in kludge lines? I'd be curious what those are (the kludge lines/values).

And if a different echo is more proper, please point me in that direction and I'll continue the discussion there.

Maybe FTSC_PUBLIC would be more appropriate for FTN development questions. I seem to recall a NET_DEV echo too, though I don't think it gets much participation.
--- SBBSecho 3.27-Linux
* Origin: Vertrauen - [vert/cvs/bbs].synchro.net (1:103/705)

From Scott Street@1:266/625 to deon on Tue Jun 3 21:25:52 2025

On 04 Jun 2025, deon said the following...

The CHRS kludge is for the benefit of the reader, not the sender.

I was thinking I would need to do some translation before a message was stored, but I've since changed my thinking on that. I'll let the BBS translate from the writer's charset to the reader's as needed.

Thanks for the reply.

--- Mystic BBS v1.12 A49 2024/05/29 (Linux/64)
* Origin: <=-{ The Digital Post }-=> (1:266/625)

From Scott Street@1:266/625 to Rob Swindell on Tue Jun 3 21:28:58 2025

On 03 Jun 2025, Rob Swindell said the following...

Here's what FTS-5003 says about that:
Incoming messages without "CHRS" control lines should be considered
as being written in pure ASCII, but may be treated as being written
in some default character set or character encoding scheme. Such as
IBM codepage 437, IBM codepage 866 or UTF-8. It is recommended that
message readers offer the user the option of manually selecting a
different character set or encoding scheme for these messages on a
per-area, per-message or other basis.

Perfect. I missed that when reading the docs, but it does spell out what I had thought.

Interesting question. Are you actually finding non-ASCII chars in kludge lines? I'd be curious what those are (the kludge lines/values).

No, not in my small sample of about 500 messages, but I wanted to be prepared.
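A check along these lines could be run over a larger sample to see whether any kludge lines carry bytes outside 7-bit ASCII. It is only a sketch: the function name `non_ascii_kludges` is hypothetical, and it assumes kludge lines start with SOH (0x01) and are terminated by CR.

```python
def non_ascii_kludges(body: bytes) -> list[bytes]:
    """Return the kludge lines that contain any byte >= 0x80."""
    flagged = []
    for line in body.split(b"\r"):
        line = line.lstrip(b"\n")  # tolerate CRLF line endings
        if line.startswith(b"\x01") and any(byte >= 0x80 for byte in line):
            flagged.append(line)
    return flagged


# Made-up sample data: only the last kludge line has a high-bit byte.
sample = (
    b"\x01MSGID: 1:266/625 00000001\r"
    b"\x01TZUTC: -0500\r"
    b"Body text with caf\x82\r"
    b"\x01Via caf\x82\r"
)
suspicious = non_ascii_kludges(sample)
```

Anything this returns would be worth logging verbatim, since the raw bytes show which encoding the sending software actually used.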

Maybe FTSC_PUBLIC would be more appropriate for FTN development
questions. I seem to recall a NET_DEV echo too, though I don't think it gets much participation.

Indeed, thanks for those. I thought I remembered echoes specific to development; it's been a while.

Thanks for the reply.

--- Mystic BBS v1.12 A49 2024/05/29 (Linux/64)
* Origin: <=-{ The Digital Post }-=> (1:266/625)

From Carol Shenkenberger@1:275/100 to Scott Street on Sun Jun 8 23:03:04 2025

Re: Default Character Set when none is defined
By: Scott Street to All on Tue Jun 03 2025 09:45 am

Fellow Developers,

I've been tinkering writing a Python based tosser/packer. Some things I've noticed in processing mail; not all messages contain a character set kludge flag (and I find, to my surprise, my own current BBS doesn't support it -cringe-). Though, more to the specific question: what character set does one use as a default when one is not defined?

Do I use ASCII, CP437, CP850, or something else? (I'm in the US, so gut reaction is to use CP437) But is that the right choice?

Secondly, the FTS documents suggest that the character set is to be applied to the body and header portions of the message. Does that include the kludge lines? I'm currently handling them separately as 'ascii', which, given the history of Fidonet, I chose as the more likely answer.

Looking for your thoughts.

And if a different echo is more proper, please point me in that direction and I'll continue the discussion there.

Scott

Hey Scott, nothing wrong with asking it here, but ask it in FTSC_PUBLIC as well, ok? Oddly, ASIAN_LINK can get replies on technical issues with code pages. I can't promise the developers in that odd, quirky echo are still there, but it's worth a shot.

xxcarol
--- SBBSecho 2.12-Win32
* Origin: Shenk's Express (1:275/100)
