Sunday, May 04, 2008

Digital Mode of the Week: ASCII

ASCII stands for American Standard Code for Information Interchange. It was developed in the United States as a standard means of encoding human-readable text as bits readable by digital computers. It gradually displaced other American codes such as IBM's EBCDIC, and vendor variants such as Commodore's PETSCII (itself derived from an early version of ASCII). US ASCII is something of a de facto standard worldwide.

You're using ASCII right now. It's still the basis for most of the text characters used by computers, although now as a subset of much larger character sets such as Unicode. Plain text files are still usually straight ASCII. It's still fundamental to most of our digital modes.

ASCII was developed in the early 1960s by an American Standards Association committee, with heavy involvement from the Bell System, which wanted it for its wireline TWX (teletypewriter exchange) service. It is essentially an expansion and reordering of ITA2 to make it more useful to computers or "dumb" terminals with modems attached. Today's ASCII is a 7-bit code with 128 characters (numbered 0 through 127), usually sent asynchronously. The first 33 characters (0-32) are non-printing, consisting of null (all zeroes) plus a number of control codes, and the space/blank character (decimal 32). The last character, DEL (decimal 127), is also non-printing.
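For readers who like to poke at this on a computer, here is a minimal Python sketch that splits the 128 codes into non-printing and printing groups. The names (CONTROL_CODES, is_non_printing, and so on) are made up for illustration. It counts DEL (decimal 127) as non-printing along with codes 0-32.

```python
# Sketch: sort the 128 ASCII codes into non-printing and printing groups.
# All names here are illustrative; none come from a standard or library.

CONTROL_CODES = range(0, 32)   # NUL (0) through US (31)
SPACE = 32                     # the space/blank character
DEL = 127                      # delete, also non-printing


def is_non_printing(code):
    """Return True for control codes, space, and DEL."""
    if not 0 <= code <= 127:
        raise ValueError("7-bit ASCII only covers codes 0 through 127")
    return code in CONTROL_CODES or code == SPACE or code == DEL


non_printing = [c for c in range(128) if is_non_printing(c)]
printing = [chr(c) for c in range(128) if not is_non_printing(c)]

print(len(non_printing), "non-printing codes")
print(len(printing), "printing characters:")
print("".join(printing))

# Any single character fits in 7 bits, e.g. the letter A:
print(format(ord("A"), "07b"))
```

Running it lists the printable characters in code order, from "!" through "~", which is a quick way to see how the letters and digits are grouped.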

ASCII can, of course, be sent by frequency-shift keying, and in fact it wasn't long before hams investigated its use as an improvement to radioteletype (RTTY). However, its greater complexity and speed made results on noisy HF circuits disappointing at best when just using straight ASCII. Instead, it's usually sent by packet radio or other error-checking teleprinting schemes.

ASCII characters are sent with "framing" consisting of one start bit and one or two stop bits (remember Baudot's use of a longer stop). Characters usually map to bytes, and since these have 8 bits in modern computers, there's a bit left over. Various things are done with this extra bit.
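The framing can be sketched in a few lines of Python. This is only an illustration, and it assumes the usual asynchronous serial conventions: the start bit is a space (0), stop bits are marks (1), and data bits go out least significant bit first. The function name frame_character is hypothetical.

```python
# Sketch: build the bit sequence for one asynchronously framed character.
# Assumes start bit = 0 (space), stop bits = 1 (mark), data sent LSB first.

def frame_character(ch, data_bits=7, stop_bits=1):
    """Return the framed bits for one character as a list of 0s and 1s."""
    code = ord(ch)
    if code >= (1 << data_bits):
        raise ValueError("character does not fit in %d data bits" % data_bits)
    start = [0]
    data = [(code >> i) & 1 for i in range(data_bits)]  # LSB first
    stop = [1] * stop_bits
    return start + data + stop


# The letter A (decimal 65, binary 1000001) with 7 data bits and one stop bit:
print(frame_character("A"))

# The same letter carried in an 8-bit byte: the extra bit is simply zero.
print(frame_character("A", data_bits=8))
```

With 7 data bits the frame is 9 bits long; with 8 data bits it is 10, and the extra bit is the "left over" one discussed here.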

Many modems have the option to use this 8th bit as a parity bit. This gives a rudimentary error check. If parity is used, the 8th bit will be set or cleared so that every character, counting the parity bit itself, has an even number of ones (even parity) or an odd number (odd parity). If parity is turned off on 7-bit ASCII, the receiver will (hopefully) ignore the 8th bit.
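Here is a rough Python sketch of how that parity bit is computed and checked. The function names add_parity and check_parity are invented for this example, and the parity bit is assumed to sit in the top (8th) position of the byte.

```python
# Sketch: compute and verify a parity bit carried in the 8th bit of a byte.
# Function names are illustrative only.

def add_parity(code, parity="even"):
    """Return the 7-bit code with a parity bit placed in bit 7."""
    if not 0 <= code <= 0x7F:
        raise ValueError("expected a 7-bit ASCII code")
    ones = bin(code).count("1")
    if parity == "even":
        parity_bit = ones % 2          # make the total count of ones even
    elif parity == "odd":
        parity_bit = (ones + 1) % 2    # make the total count of ones odd
    else:
        raise ValueError("parity must be 'even' or 'odd'")
    return code | (parity_bit << 7)


def check_parity(byte, parity="even"):
    """Return True if the received byte has the expected parity."""
    ones = bin(byte & 0xFF).count("1")
    return ones % 2 == (0 if parity == "even" else 1)


sent = add_parity(ord("C"), "even")      # C has three ones, so bit 7 is set
print(hex(sent), check_parity(sent))     # arrives intact

one_error = sent ^ 0b00000100            # noise flips a single bit
print(hex(one_error), check_parity(one_error))

two_errors = sent ^ 0b00000110           # noise flips two bits
print(hex(two_errors), check_parity(two_errors))
```

The last case shows why the check is only rudimentary: flipping two bits leaves the parity unchanged, so the damaged character passes.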

The 8th bit is also used to expand the character set to the full 256. Although the SHIFT IN and SHIFT OUT control codes are provided for switching character sets, setting this bit can also send the expanded characters, essentially treating ASCII as an 8-bit code with no parity check.

Unfortunately, there's no international standard for this, and technically it's something of a misnomer to apply the name ASCII to all 8 bits. The high-bit characters are often dependent on application. They can contain accented letters and symbols used in a particular language, or little pieces of lines and corners useful for drawing boxes on old text based terminals.
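To see how much the high-bit characters depend on who is reading them, here is a small Python sketch that decodes the same three bytes two ways. The two code pages, CP437 (the old IBM PC set) and Latin-1 (ISO 8859-1), are chosen only as examples. The script prints the Unicode names of the resulting characters rather than the characters themselves, so it works on any terminal.

```python
# Sketch: the same high-bit bytes mean different things in different
# 8-bit character sets. CP437 and Latin-1 are just two example code pages.
import unicodedata

raw = bytes([0xC9, 0xCD, 0xBB])   # all three bytes have the 8th bit set

for codec in ("cp437", "latin-1"):
    text = raw.decode(codec)
    print(codec)
    for byte, ch in zip(raw, text):
        print("  0x%02X -> %s" % (byte, unicodedata.name(ch)))

# Strict 7-bit ASCII has no meaning at all for these bytes.
try:
    raw.decode("ascii")
except UnicodeDecodeError:
    print("not valid 7-bit ASCII")
```

Under CP437 the bytes come out as box-drawing pieces, while under Latin-1 the first two are accented capital letters and the third is a punctuation mark.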

All of this leads to those infamous ASCII receiver setup parameters that are used in most of our digital modes. These are character length (7 or 8 data bits), stop bits (one or two), and parity (odd, even, or none). While some straight ASCII software can autobaud, it's usually also necessary to set the baud rate by hand. Common HF rates are 100, 110, 300, 600, 1200, 1800, and 2400.

In a shortwave listening situation, getting all this right is usually a matter of trial and error. It helps that there are really only two settings in common use. These are 7E1 (7 data bits, even parity, one stop bit), and 8N1 (8 data bits, no parity, one stop bit). It is also possible to emulate the old ITA2 alphabet by simply transmitting the appropriate character set in 5N1 or 5N2 (5 data bits, no parity, one or two stop bits). You see this done by the French Navy when sending data in newer modes such as STANAG 4285.
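The shorthand is easy to unpack in code. The Python sketch below parses a setting such as 7E1 and works out how many bits each character costs on the air, and so roughly how many characters per second a given baud rate carries. The names (SerialFormat, parse_format, and so on) are hypothetical. It assumes one start bit per character and a single-digit stop bit count.

```python
# Sketch: parse settings like "7E1" or "8N1" and estimate throughput.
# Names are illustrative; assumes one start bit and whole stop bits.
from typing import NamedTuple


class SerialFormat(NamedTuple):
    data_bits: int
    parity: str      # "none", "even", or "odd"
    stop_bits: int


PARITY_NAMES = {"N": "none", "E": "even", "O": "odd"}


def parse_format(spec):
    """Turn a three-character string such as '7E1' into a SerialFormat."""
    if len(spec) != 3:
        raise ValueError("expected something like '7E1' or '8N1'")
    parity = spec[1].upper()
    if parity not in PARITY_NAMES:
        raise ValueError("parity must be N, E, or O")
    return SerialFormat(int(spec[0]), PARITY_NAMES[parity], int(spec[2]))


def bits_per_character(fmt):
    """Start bit + data bits + optional parity bit + stop bits."""
    parity_bits = 0 if fmt.parity == "none" else 1
    return 1 + fmt.data_bits + parity_bits + fmt.stop_bits


def characters_per_second(baud, fmt):
    return baud / bits_per_character(fmt)


for spec in ("7E1", "8N1", "5N2"):
    fmt = parse_format(spec)
    print(spec, fmt, bits_per_character(fmt), "bits per character")

print(characters_per_second(300, parse_format("8N1")), "characters/second at 300 baud")
```

Note that 7E1 and 8N1 both cost 10 bits per character, so the two common settings give the same throughput at a given baud rate.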

Here's the 7-bit ASCII in a compact table found on Wikipedia:

An expanded listing of this code is at this column's web site.