ASCII and Unicode quotation marks

by Markus Kuhn

Summary: Do not use the ASCII grave accent (0x60) as a left quotation mark together with the ASCII apostrophe (0x27) as the corresponding right quotation mark (as in `quote'). Otherwise, your text will look quite strange with most modern fonts (for example, on Windows and Mac systems). Only older X Window System fonts and some older video terminals display ASCII 0x60/0x27 as left and right quotation marks, while most modern systems follow the ISO and Unicode standards. If you can only use ASCII typewriter characters, use the apostrophe character (0x27) as both the left and the right quotation mark (as in 'quote'). If you can use Unicode characters, directional quotation marks are available as the characters U+2018, U+2019, U+201C, and U+201D (as in ‘quote’ or “quote”).


The Unicode and ISO 10646 standards define the following characters:

U+0022  QUOTATION MARK               "  neutral (vertical), used as an opening or closing quotation mark; the preferred characters in English for paired double quotation marks are U+201C and U+201D
U+0027  APOSTROPHE                   '  neutral (vertical) glyph with mixed usage; the preferred character for the apostrophe is U+2019; the preferred characters in English for paired single quotation marks are U+2018 and U+2019
U+0060  GRAVE ACCENT                 `
U+00B4  ACUTE ACCENT                 ´
U+2018  LEFT SINGLE QUOTATION MARK   ‘
U+2019  RIGHT SINGLE QUOTATION MARK  ’  this is the preferred character to use as an apostrophe
U+201C  LEFT DOUBLE QUOTATION MARK   “
U+201D  RIGHT DOUBLE QUOTATION MARK  ”
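
The character names in the table above can be cross-checked programmatically. The following is a minimal sketch using Python's standard unicodedata module; the names QUOTE_CODE_POINTS and describe are chosen here purely for illustration and do not come from the standards.

```python
import unicodedata

# Code points listed in the table above (the variable name is illustrative).
QUOTE_CODE_POINTS = [0x0022, 0x0027, 0x0060, 0x00B4,
                     0x2018, 0x2019, 0x201C, 0x201D]


def describe(code_point):
    """Return a 'U+XXXX NAME' label for a single code point."""
    return f"U+{code_point:04X} {unicodedata.name(chr(code_point))}"


if __name__ == "__main__":
    # Print each label followed by the character itself.
    for cp in QUOTE_CODE_POINTS:
        print(describe(cp), chr(cp))
```

How the printed characters look depends entirely on the font in use, which is the point of the discussion that follows.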

ASCII and ISO 8859 were only designed to support the very restricted type style available to typewriter users. The two ASCII characters

0x22  QUOTATION MARK  "
0x27  APOSTROPHE      '

are supposed to represent the neutral (vertical) glyphs commonly used on typewriters. They should not be used as directional quotation marks.

ISO 8859 and Unicode fonts must display the two accent characters

0x60  GRAVE ACCENT  `
0xB4  ACUTE ACCENT  ´

as mutually symmetric shapes.

The problem

Unfortunately, X Window System fonts have long contained the following mutually symmetrical glyphs:

0x27  APOSTROPHE    (displayed like ’)
0x60  GRAVE ACCENT  (displayed like ‘)

These shapes were even sanctioned by a North American version of the ISO 646 standard (ANSI X3.4, also known as ASCII), which defined 0x27 as "apostrophe (closing single quotation mark; acute accent)", but they should have been changed when the fonts were extended to cover ISO 8859-1, which added a separate acute accent at 0xB4. Obviously, you cannot have 0x27/0x60 and 0x60/0xB4 as mutually symmetric glyphs and at the same time have different shapes for 0x27 and 0xB4. Since 0x60/0xB4 are defined as accents by modern standards, their symmetric shape takes precedence, except that this was not fixed in the X fonts until 2004 (slightly earlier in the versions that shipped with XFree86).

The old X fonts encouraged some authors of Unix software and documentation to abuse 0x60 together with 0x27 as directional quotation marks. This practice looked somewhat acceptable, as in

‘quote’

when displayed with the old X fonts, but it looked rather ugly, as in

`quote'

in most other modern display environments (for example, with properly designed Windows and Mac TrueType fonts, but also on many vintage video terminals from the 1970s and 1980s, such as those from Siemens/Nixdorf and many other manufacturers).

For example, 0x60 and 0x27 appear in Windows NT 4.0 with the Lucida Console TrueType font (size 14) like this:

[Figure: 0x60 and 0x27 rendered in Lucida Console, size 14, on Windows NT 4.0]

Unicode and ISO 10646 distinguish very clearly between the non-directional, typewriter-style ASCII single quotation mark and apostrophe U+0027, as in

'quote'

and the directional quotation marks U+2018 and U+2019, as in

‘quote’

Unicode 2.1 explicitly says that U+2019 is the preferred character for a punctuation apostrophe, as in “We’ve been here before.” The Unicode standard also notes:

“For historical reasons, U+0027 is a particularly overloaded character. In ASCII, it is used to represent a punctuation mark (such as right single quotation mark, left single quotation mark, apostrophe punctuation, vertical line, or prime) or a modifier letter (such as apostrophe modifier or acute accent). (Punctuation marks generally break words; modifier letters are generally considered part of a word.) In many systems, it is always represented as a straight vertical line and can never represent a curly apostrophe or a right quotation mark.”

What to do?

If you write any Unix software, check whether it uses the ASCII character 0x60 (`) as a left quotation mark, as in `quote'. Change it to use the character 0x27 (') on both sides, as in 'quote'. If you work in an environment where the UTF-8 encoding is already used everywhere (for example, Plan9 and newer GNU/Linux installations), you may even decide to use the proper directional quotation marks, as in ‘quote’ or “quote”.

Check your source code directories with a recursive search for a grave accent (0x60) that is followed later on the same line by an apostrophe (0x27) to find out where modifications are needed. Then use (with due care!) something like

perl -pi.bak -e "s/\`/'/g;" file1 file2 ...

to make the necessary replacements automatically, or make the edits manually.
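
As a more conservative alternative to replacing every 0x60, the search and the replacement can be restricted to paired `quote' occurrences. The following Python sketch illustrates the idea; the regular expression and the function names are assumptions made for this example, and the pattern can still match legitimate m4 or TeX input, so every hit should be reviewed by hand.

```python
import re

# A grave accent, then text containing no quote characters or newlines,
# then an apostrophe. The pattern is an assumption made for this sketch.
ASCII_PAIR = re.compile(r"`([^`'\n]+)'")


def find_pairs(text):
    """Return (line_number, matched_text) for each `quote' candidate."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in ASCII_PAIR.finditer(line):
            hits.append((number, match.group(0)))
    return hits


def to_ascii_neutral(text):
    """Rewrite `quote' as 'quote' (0x27 on both sides)."""
    return ASCII_PAIR.sub(r"'\1'", text)


def to_unicode_directional(text):
    """Rewrite `quote' as U+2018 quote U+2019, for UTF-8 environments."""
    return ASCII_PAIR.sub("\u2018\\1\u2019", text)
```

Because the pattern requires a closing apostrophe, shell command substitution written as `command` is left alone, unlike with a blanket replacement of 0x60.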

The use of 0x60 (grave accent) as a special control character in the Unix shell (to indicate command substitution, as in `command` or, better, $(command)), in Perl, in Lisp, or in TeX/troff (to denote a proper left single quotation mark) does not need to be changed and remains unaffected. Donald Knuth's TeXbook (chapter 2, page 3, end of second paragraph) has warned TeX users since 1986 that the shapes of the apostrophe and the grave accent may appear as required by ISO and Unicode and not as used in the rest of the TeXbook. The Unix m4 macro processor is probably the only widely used tool that uses the `quote' combination as part of its input syntax; however, even this can be modified via changequote.

Why should we fix this?

There are several reasons why the old X fonts had to be fixed, and with them the associated ASCII backquoting practice:

  • Obviously, the grave accent and the acute accent must be mutually symmetrical, which was not the case in the old X fonts.
  • The Unicode 4.0 standard explicitly says that U+0027 is a "neutral (vertical) glyph with mixed usage" and displays the entire ASCII section like this:

    [Figure: ASCII code chart as printed in the Unicode 4.0 standard]

  • The ISO 10646, ISO 8859, and ISO 646/ECMA-6 standards also show the upright typewriter apostrophe for U+0027 and have U+0060 and U+00B4 as accents symmetric to each other.
  • The ANSI X3.4:1986 (“ASCII”) code table, which was printed with the OCR-B font, also shows the upright typewriter apostrophe. Historically, the originally proposed use of 0x60 in the international 7-bit coded character set was as a grave accent (ISO TC97/SC2 meeting, 29-31 October 1963), and its meaning was only later extended in the US implementation standard to also cover usage as a left single quotation mark (CACM 8(4)207-214, 1965).
  • Most European keyboards have keycap labels for the apostrophe and both accents. These have always resembled the ISO and Unicode glyphs. The photo below shows the relevant highlighted keys on a standard German PC keyboard, which has the acute/grave accent key to the left of the backspace key and the number sign/apostrophe key below it:

    [Photo: German PC keyboard with the acute/grave accent key and the number sign/apostrophe key highlighted]

    It can confuse users if the keycap labels and the glyph shapes in fonts do not match, as was the case with the old X fonts.

  • Microsoft and Apple fonts also follow modern standards and do not agree with the old X fonts. X11 users really should not be misled about how the characters they use will appear on other standards-conforming systems. Otherwise, they will fail to notice that, for example, all users of a Windows web browser (screenshot: Internet Explorer 5) see `quotes' as in

    [Screenshot: `quotes' as rendered by Internet Explorer 5 on Windows]

  • Since XFree86 4.0 added support for TrueType fonts, users of GNU/Linux systems are increasingly using modern fonts with the straight 0x27 glyph and get funny-looking quotation marks with older software that tries to display ASCII directional quotes (mostly various GNU packages).
  • The characters 0x27 (apostrophe) and 0x22 (quotation mark) are often used to abbreviate minutes and seconds or feet and inches, which is another reason why 0x27 should be a single-stroke version of 0x22 and not a directional quotation mark.

Updated X Window System core BDF fonts, in which the apostrophe and grave accent were fixed along with various other errors, have been available since 1998. They have replaced the old fonts in XFree86 since version 4.0 and in the X.Org sample implementation since X11R6.8.

Related hints


PostScript has a rather complicated history of how it maps ASCII bytes to glyphs. In PostScript fonts, each glyph is identified not by a code position, but by a glyph name such as "quotesingle". After the publication of the Unicode standard, Adobe released a PostScript glyph name to Unicode mapping table. When a PostScript interpreter displays text, it uses an encoding vector to map the 8-bit values found in text strings to the glyph names found in fonts.

                                               PostScript     encoding vector
Unicode character                    glyph     glyph name     Standard  ISOLatin1  CE
U+0022  QUOTATION MARK               "         quotedbl       0x22      0x22       0x22
U+0027  APOSTROPHE                   '         quotesingle    0xA9      -          0x27
U+0060  GRAVE ACCENT                 `         grave          0xC1      0x91       0x60
U+00B4  ACUTE ACCENT                 ´         acute          0xC2      0x92/0xB4  0xB4
U+2018  LEFT SINGLE QUOTATION MARK   ‘         quoteleft      0x60      0x60       0x91
U+2019  RIGHT SINGLE QUOTATION MARK  ’         quoteright     0x27      0x27       0x92
U+201C  LEFT DOUBLE QUOTATION MARK   “         quotedblleft   0xAA      -          0x93
U+201D  RIGHT DOUBLE QUOTATION MARK  ”         quotedblright  0xBA      -          0x94

PostScript provides several predefined 8-bit encoding vectors, and printer driver authors can easily add their own. As the table above shows, the original PostScript StandardEncoding followed a practice similar to that of the old X fonts, with all its flaws: it assigned the ASCII bytes 0x60 and 0x27 to the opening and closing quotation marks ("quoteleft" and "quoteright" in PostScript glyph naming terminology, or U+2018 and U+2019 in Unicode).

When ISO 8859-1 came out, Adobe added another predefined encoding vector to PostScript called ISOLatin1Encoding. This was supposed to be compatible with ISO 8859-1, but it left 0x60 and 0x27 unchanged from the old StandardEncoding vector and therefore does not correctly print the ISO 8859-1 characters 0x27 and 0x60, which correspond to the Unicode characters U+0027 and U+0060 and must be represented by the PostScript glyphs “quotesingle” and “grave”. The authors of Adobe's PostScript Language Reference, Third Edition (Addison-Wesley, ISBN 0-201-37922-8) acknowledge this in section E.5, footnote 3, page 783, where they note that the “ISOLatin1Encoding encoding vector deviates from the ISO 8859-1 standard” and that an application that wants to “exactly comply with the ISO standard must create a modified encoding vector”. The newer CE encoding vector (Central Europe, corresponding to Windows CP1250), now also described in the PostScript Language Reference, correctly assigns 0x27 to "quotesingle" and 0x60 to "grave".

If you write a PostScript driver, use the official Unicode to PostScript mapping table to map ASCII, ISO 8859, and ISO 10646 characters to PostScript glyphs, just like the Type 1 renderer updated in XFree86 4.0. Do not use the ISOLatin1Encoding encoding vector to print ISO 8859-1 text without first changing it to assign 0x27 to "quotesingle" and 0x60 to "grave". (You may also want to assign 0x2D = HYPHEN-MINUS to the PostScript "hyphen" glyph instead of the "minus" assignment used by ISOLatin1Encoding.)
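
The recommended reassignment can be modelled as a small change to a byte-to-glyph-name mapping. The Python sketch below is only an illustration of that idea: ISOLATIN1_EXCERPT is a partial excerpt limited to the code positions discussed above (a real encoding vector has 256 entries), and both names are hypothetical. A real driver would emit the equivalent PostScript instead.

```python
# Partial excerpt of ISOLatin1Encoding, restricted to the code positions
# discussed in the text. A full encoding vector has 256 entries.
ISOLATIN1_EXCERPT = {
    0x22: "quotedbl",
    0x27: "quoteright",
    0x2D: "minus",
    0x60: "quoteleft",
    0x91: "grave",
    0x92: "acute",
    0xB4: "acute",
}


def iso_8859_1_conformant(vector):
    """Return a copy of the vector with the three reassignments applied."""
    fixed = dict(vector)           # leave the caller's mapping untouched
    fixed[0x27] = "quotesingle"    # U+0027 APOSTROPHE
    fixed[0x60] = "grave"          # U+0060 GRAVE ACCENT
    fixed[0x2D] = "hyphen"         # U+002D HYPHEN-MINUS
    return fixed


if __name__ == "__main__":
    for code, glyph in sorted(iso_8859_1_conformant(ISOLATIN1_EXCERPT).items()):
        print(f"0x{code:02X} -> {glyph}")
```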


The font cmtt10 in TeX's Computer Modern family follows the example of the PostScript StandardEncoding by providing straight double quotes and directional single quotes at the ASCII positions 0x22, 0x60, and 0x27. It also provides a straight single quote, a grave accent, and an acute accent at code positions 0x0d, 0x12, and 0x13 respectively, but lacks directional double quotes:

U+0022  QUOTATION MARK               "
U+0027  APOSTROPHE                   \char"0D
U+0060  GRAVE ACCENT                 \char"12
U+00B4  ACUTE ACCENT                 \char"13
U+2018  LEFT SINGLE QUOTATION MARK   `
U+2019  RIGHT SINGLE QUOTATION MARK  '

So, to demonstrate the result of abusing the ASCII grave accent and apostrophe as directional quotes in a document typeset with LaTeX, you can write \texttt{\char"12quote\char"0D}. The non-typewriter fonts in Computer Modern do not have straight single or double quotes.

Use the LaTeX upquote package (\usepackage{upquote}) to map the ASCII characters 0x27 and 0x60 to the correct glyphs in verbatim modes.


created 12/19/1999 – last modified 12/11/2007

