Escape sequence: Difference between revisions

From Wikipedia, the free encyclopedia
imported>GreenC bot
 
imported>Dgpop
 
Line 1: Line 1:
{{about|sequences of characters that, because of a prefix, have a special meaning, possibly to control peripheral devices|specialized usages|Escape sequence (disambiguation)}}
{{about|sequences of characters that, because of a prefix, have a special meaning, possibly to control peripheral devices|specialized usages|Escape sequence (disambiguation)}}
{{Short description|Character combinations with ulterior meaning}}
{{Short description|Series of characters with a special meaning}}
{{Use American English|date=March 2019}}
{{Use American English|date=March 2019}}


In [[computer science]], an '''escape sequence''' is a combination of [[Character (computing)|characters]] that has a meaning other than the literal characters contained therein;<ref>{{cite web  
In [[computing]], an '''escape sequence''' is a sequence of [[Character (computing)|characters]] that has a special [[semantic]] meaning based on an established convention that specifies an [[escape character]] prefix in addition to the [[syntax]] of the rest of the [[Text string|text]] of a sequence.<ref>{{Cite web|url=https://www.spss-tutorials.com/escape-sequence/|title=Escape Sequence (General Concept)}}</ref><ref>{{cite web  
  |title=Escape Sequence |url=https://www.spss-tutorials.com/escape-sequence}}</ref> it is marked by one or more preceding (and possibly terminating) characters.<ref>{{cite web  
   |title=Characters |work=The Java Tutorials |url=https://docs.oracle.com/javase/tutorial/java/data/characters.html}}</ref> A convention can define any particular character code as a sequence prefix. Some conventions use a normal, printable character such as backslash (\) or ampersand (&). Others use a non-printable (a.k.a. control) character such as [[ASCII]] ''escape''.
   |title=Characters |work=The Java Tutorials |url=https://docs.oracle.com/javase/tutorial/java/data/characters.html}}</ref>
 
Escape sequences date back at least to the 1874 [[Baudot code]].<ref name="Economist_2013"/><ref name="Baudot"/><ref name="TC304"/>


==Examples==
==Examples==
* In [[C (programming language)|C]] and many derivative programming languages, a string escape sequence is a series of two or more characters, [[Escape sequences in C|starting with a backslash <code>\</code>]].<ref>{{cite web  
 
===Data transmission===
A common use of an escape sequence is to remove control characters from a data stream so that they do not trigger their control functions by mistake. Each control character is replaced with an escape character followed by one or more other characters. Once the data has left the context in which the control character would have caused an action, the sequence is replaced by the original character.<ref name="CMD.a"/> To transmit the escape character itself, two copies are sent.<ref name="IEY"/>
 
===Text literal===
An escape sequence is often used in [[character literal|character]] and [[string literal|string]] literals, to encode characters which are not printable or clash with the syntax of characters or strings. For example, [[control character]]s might not be allowed in a source file or may have undesirable side-effects if typed into a command.
 
In [[C (programming language)|C]] and many derivative programming languages, a backslash (<code>\</code>) in a [[string literal]] marks the beginning of an [[Escape sequences in C|escape sequence]].<ref>{{cite web  
   |quote=Character combinations consisting of a backslash <code>\</code> followed by a letter or by a combination of digits are called ''escape sequences''.  
   |quote=Character combinations consisting of a backslash <code>\</code> followed by a letter or by a combination of digits are called ''escape sequences''.  
   |title=Escape Sequences  
   |title=Escape Sequences  
   |date=3 August 2021  
   |date=3 August 2021  
  |url=https://msdn.microsoft.com/en-us/library/h21280bw.aspx}}</ref>
  |url=https://msdn.microsoft.com/en-us/library/h21280bw.aspx}}</ref><ref>{{cite web|url=https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.cbclx01/escape.htm
** Note that in C a backslash immediately followed by a newline does <em>not</em> constitute an escape sequence, but splices physical source lines into logical ones in the second translation phase, whereas string escape sequences are converted in the fifth translation phase.<ref>{{cite web
  |title=Escape sequences|website=[[IBM]] }}</ref> Common escape sequences include: [[carriage return]] {{code|\r}}, [[newline]] {{code|\n}}, [[tab character|tab]] {{code|\t}}. To account for the fact that using a printable character for escape causes that character to lose its normal meaning, a sequence of two backslash characters (<code>\\</code>) encodes a single backslash. An escape sequence can also specify a character by its code value. For example, the backslash can be encoded as either <code>\x5c</code> or <code>\134</code> which specify the character code value as [[hexadecimal]] and [[octal]], respectively.
 
A backslash immediately followed by a [[newline]] is not an escape sequence. In the second translation phase, before escape sequences in literals are converted, the backslash and the newline are deleted, which joins the line with the subsequent line.<ref>{{cite web
   |url=http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1570.pdf#page=29
   |url=http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1570.pdf#page=29
   |title=ISO/IEC 9899:201x Committee Draft N1570
   |title=ISO/IEC 9899:201x Committee Draft N1570
   |language=English
   |language=English
   |quote=5.1.1.2 Translation phases, 2.: Each instance of a backslash character (<code>\</code>) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. [...]}}</ref>
   |quote=5.1.1.2 Translation phases, 2.: Each instance of a backslash character (<code>\</code>) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. [...]}}</ref>
** To represent the backslash character itself, <code>\\</code> can be used, whereby the first backslash indicates an escape and the second specifies that a backslash is being escaped.<ref>{{cite web
  |url=https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.cbclx01/escape.htm
  |title=Escape sequences|website=[[IBM]] }}</ref>
** A character may be escaped in multiple different ways. Assuming ASCII encoding, the escape sequences <code>\x5c</code> ([[hexadecimal]]), <code>\\</code>, and <code>\134</code> ([[octal]]) all encode the same character: the backslash <code>\</code>.
* For devices that respond to [[ANSI escape code|ANSI escape]] sequences, the combination of three or more characters beginning with the ASCII "escape" character (decimal character code 27) followed by the left-bracket character <code>[</code> (decimal character code 91) defines an escape sequence.


==Control sequences==
===Quoting escape===
When directed, this series of [[character (computing)|characters]] is used to change the [[State (computer science)|state]] of [[computer]]s and their attached [[peripheral]] devices, rather than to be displayed or printed as regular [[Data (computing)|data]] bytes would be, these are also known as '''control sequences''', reflecting their use in device control, beginning with the '''Control Sequence Initiator''' - originally the "escape character" ASCII code - character 27 (decimal) - often written "Esc" on [[keycap]]s.
{{see|String literal#Escape sequences}}


With the introduction of ANSI terminals most escape sequences began with the ''two'' characters "ESC" then "[" or a specially-allocated '''CSI''' character with a code 155 (decimal).
When an escape character is needed within a string literal, there are two common strategies:
* Doubled delimiter {{endash}} For example, <code><nowiki>'He didn''t do it.'</nowiki></code><ref name="IEY"/>
* Secondary escape sequence {{endash}} For example, the [[cmd.exe|command prompt]] command {{code|echo Cut^&Paste}} outputs "Cut&Paste" by escaping the ampersand operator with a caret (<code>^</code>)<ref name="CMD.a"/>


Not all control sequences used an escape character; for example:
In C and many related languages, the escape character is the backslash ({{code|\}}). The single quotation mark character can be coded as <code><nowiki>'\''</nowiki></code> since <code><nowiki>'''</nowiki></code> is not valid. Because a string literal is [[delimiter|delimited]] by double quotes (<code>"</code>), its content can contain a double quote only if it is escaped (<code>"\""</code>) or written as a sequence that specifies the code of the double-quote character (<code><nowiki>\x22</nowiki></code>).
* modem control sequences used by AT/[[Hayes command set|Hayes-compatible]] modems<ref name="Hayes"/><ref name="CISCO"/>


* [[Data General]] terminal control sequences,<ref name="Data_General_Terminals"/><ref name="Kermit"/><ref name="DG210"/> but they often were still called escape sequences, and the very common use of "escaping" special characters in programming languages and command-line parameters today often use the "backslash" character to begin the sequence.
In [[Perl]] or [[Python (programming language)|Python 2]], the following is invalid syntax:


Escape sequences in communications are commonly used when a computer and a peripheral have only a single channel through which to send information back and forth (so escape sequences are an example of [[in-band signaling]]).<ref name="Dict"/><ref name="Terminal_Handbook"/> They were common when most [[dumb terminals]] used [[ASCII]] with 7 data bits for communication, and sometimes would be used to switch to a different character set for "foreign" or graphics characters that would otherwise been restricted by the 128 codes available in 7 data bits. Even relatively "dumb" terminals responded to some escape sequences, including the original mechanical Teletype printers (on which "glass Teletypes" or VDUs were based) responded to characters 27 and 31 to alternate between letters and figures modes.
<syntaxhighlight lang="perl">
print "Nancy said "Hello World!" to the crowd."
</syntaxhighlight>


==Keyboard==
This can be fixed by inserting backslashes to escape the inner double quotes:
An escape character is usually assigned to the [[Esc key]] on a [[computer keyboard]], and can be sent in other ways than as part of an escape sequence.  For example, the Esc key may be used as an input character in editors such as [[Vi (text editor)|vi]],<ref name="VI"/> or for backing up one level in a menu in some applications.<ref name="PCWorld_2009"/> The Hewlett Packard [[HP 2640]] terminals had a key for a "display functions" mode which would display graphics for all control characters, including Esc, to aid in [[debugging]] applications.


If the Esc key and other keys that send escape sequences are both supposed to be meaningful to an application, an ambiguity arises if a [[character terminal]] is in use. When the application receives the [[ASCII]] escape character, it is not clear whether that character is the result of the user pressing the Esc key or whether it is the initial character of an escape sequence (e.g., resulting from an arrow key press).  The traditional method of resolving the ambiguity is to observe whether or not another character quickly follows the escape character. If not, it is assumed not to be part of an escape sequence. This [[heuristic]] can fail under some circumstances, especially without fast modern communication speeds.
<syntaxhighlight lang="perl">
print "Nancy said \"Hello World!\" to the crowd."
</syntaxhighlight>


Escape sequences date back at least to the 1874 [[Baudot code]].<ref name="Economist_2013"/><ref name="Baudot"/><ref name="TC304"/>
Alternatively, the following uses "\x" to indicate the subsequent two characters are hexadecimal digits; "22" being the hexadecimal ASCII value for double-quote.


==Modem control==
<syntaxhighlight lang="perl">
The [[Hayes command set]], for instance, defines a single escape sequence, ''[[+++ (modem)|+++]]''. (In order to interpret ''+++'', which may be a part of data, as the escape sequence, the sender stops communication for one second before and after the ''+++''.) When the modem encounters this in a stream of data, it switches from its normal mode of operation, which simply sends any characters to the phone, to a command mode in which the following data is assumed to be a part of the command language. You can switch back to the ''online mode'' by sending the O command.
print "Nancy said \x22Hello World!\x22 to the crowd."
</syntaxhighlight>


The Hayes command set is [[Mode (user interface)|modal]], switching from command mode to online mode.<ref name="Modem_2011"/><ref name="Modem_Programming"/> This is not appropriate in the case where the commands and data will switch back and forth rapidly. An example of a non-modal escape sequence control language is the [[VT100]], which used a series of commands prefixed by a [[Control Sequence Introducer]].
[[C (programming language)|C]], [[C++]], and [[Ruby (programming language)|Ruby]] allow the same two backslash escape styles; [[Java (programming language)|Java]] supports <code>\"</code> but has no <code>\x</code> escape and uses an octal escape such as <code>\42</code> instead. [[PostScript]] and [[rich text format]] (RTF) also use backslash escapes. The [[quoted-printable]] encoding uses the [[equals sign]] as an escape character. [[URL]]s and [[URI]]s use [[percent-encoding]] to quote characters with a special meaning, as well as non-ASCII characters.


==Comparison with control characters==
===ANSI escape sequences===
{{main|Control character}}
A control character is a character that, in isolation, has some control function, such as [[carriage return]] (CR). Escape sequences, by contrast, consist of one or more [[escape character]]s which change the interpretation of subsequent characters.


==ASCII video data terminals==
The [[VT52]] terminal used simple [[Digraph (computing)|digraph]] commands like escape-A. Without the escape character prefix, {{code|A}} simply meant the letter {{code|A}}, but as part of the escape sequence {{code|escape-A}}, it had a different meaning. The VT52 also supported parameters. It was not a straightforward control language encoded as substitution.
The [[VT52]] terminal used simple [[Digraph (computing)|digraph]] commands like escape-A: in isolation, "A" simply meant the letter "A", but as part of the escape sequence "escape-A", it had a different meaning. The VT52 also supported parameters: it was not a straightforward control language encoded as substitution.


The later [[VT100]] terminal implemented the more sophisticated [[ANSI escape sequences]] standard (now ECMA-48) for functions such as controlling cursor movement, character set, and display enhancements. The Hewlett Packard [[HP 2640]] series had perhaps the most elaborate escape sequences for block and character modes, programming keys and their soft labels, graphics vectors, and even saving data to tape or disk files.
The later [[VT100]] terminal implemented the more sophisticated [[ANSI escape sequences]] standard (now ECMA-48) for functions such as controlling cursor movement, character set, and display enhancements. The [[HP 2640]] series had perhaps the most elaborate escape sequences for block and character modes, programming keys and their soft labels, graphics vectors, and even saving data to tape or disk files.


===Use in DOS and Windows===
In [[MS-DOS]] and in command windows of 16-bit [[Windows]], the [[ANSI.SYS]] driver<ref>{{Cite web|url=https://www.oreilly.com/library/view/special-edition-using/0789725738/ch17.html|title=17. Understanding ANSI.SYS - Special Edition Using MS-DOS® 6.22, Third Edition [Book]|website=www.oreilly.com}}</ref> can be used to enable ANSI escape sequence support; in DOS, the escape character can be written as <code>$e</code> in the [[PROMPT (DOS command)|PROMPT]] command. In [[Unix]] and [[Unix-like]] systems, ANSI escape sequences are generally interpreted by the text terminal or terminal emulator (such as [[xterm]]). The rise of [[GUI]] applications has reduced the use of escape sequences, but they can still be used to build full-screen, text-based applications.
A utility, [[ANSI.SYS]],<ref>{{cite book
  |title=17. Understanding ANSI.SYS - Special Edition Using MS-DOS 6.22
  |url=https://www.oreilly.com/library/view/special-edition-using/0789725738/ch17.html}}</ref> can be used to enable the interpreting of the ANSI (ECMA-48) terminal escape sequences under [[DOS]] (by using <code>$e</code> in the [[PROMPT (DOS command)|PROMPT]] command) or in command windows in 16-bit [[Windows]]. The rise of [[GUI]] applications, which directly write to display cards, has greatly reduced the usage of escape sequences on Microsoft platforms, but they can still be used to create interactive random-access character-based screen interfaces with the character-based library routines such as [[printf]] without resorting to a GUI program.


===Use in Linux and Unix displays===
==Related==
The default text terminal, and text windows (such as using [[xterm]]) respond to ANSI escape sequences.


==Quoting escape==
===Control sequence===
===Overview===
A control sequence is a sequence of characters that changes the [[State (computer science)|state]] of a computer [[peripheral]] instead of conveying the normal information that the characters represent. In an ANSI escape sequence, the prefix, called the [[Control Sequence Introducer|control sequence introducer]], can be either ASCII ESC (decimal 27) followed by <code>[</code> or the single character CSI (decimal 155). Notable systems that did not use an escape character for control sequences include:
When an [[escape character]] is needed within the quoted/escaped string, there are two strategies used within programming and scripting languages:
* The [[Hayes command set]] defines a control sequence, <code>[[+++ (modem)|+++]]</code>, that is [[Mode (user interface)|modal]]: it switches the modem from online mode to command mode. To ensure that the sequence is interpreted as a control sequence rather than as part of the data, the sender stops communication for one second before and after sending {{code|+++}}. When the modem detects this condition, it switches from normal mode (sending characters to the phone) to a command mode in which the following data is interpreted as commands. Sending the O command switches back to online mode.<ref name="Modem_2011"/><ref name="Modem_Programming"/><ref name="Hayes"/><ref name="CISCO"/>
* doubled delimiter (e.g. <code><nowiki>'He didn''t do it.'</nowiki></code>)<ref name="IEY"/>
* [[Data General]] terminals used control sequences that did not begin with the escape character,<ref name="Data_General_Terminals"/><ref name="Kermit"/><ref name="DG210"/> although these were often still called escape sequences. Similarly, the very common practice of "escaping" special characters in programming languages and command-line parameters today usually begins the sequence with the backslash character rather than the escape character.
* secondary escape sequence


An example of the latter is in the use of the caret (<code>^</code>). E.g. this outputs "You can do so via Cut&Paste" in [[cmd.exe|CMD]]. (otherwise, the ampersand has a restricted use)<ref name="CMD.a"/>
Escape sequences in communications are commonly used when a computer and a peripheral have only a single channel through which to send information back and forth (so escape sequences are an example of [[in-band signaling]]).<ref name="Dict"/><ref name="Terminal_Handbook"/> They were common when most [[dumb terminals]] used [[ASCII]] with 7 data bits for communication, and were sometimes used to switch to a different character set for "foreign" or graphics characters that would otherwise have been restricted by the 128 codes available in 7 data bits. Even relatively "dumb" terminals responded to some escape sequences; the original mechanical Teletype printers (on which "glass Teletypes" or VDUs were based) responded to characters 27 and 31 to alternate between letters and figures modes.
 
echo You can do so via Cut^&Paste
 
===In detail===
{{see|String literal#Escape sequences}}
{{see also|Escape sequences in C}}
A common use of escape sequences is in fact to remove control characters found in a binary data stream so that they will not cause their control function by mistake. In this case, the control character is replaced by a defined "escape character" (which need not be the US-ASCII escape character) and one or more other characters; after exiting the context where the control character would have caused an action, the sequence is recognized and replaced by the removed character.<ref name="CMD.a"/> To transmit the "escape character" itself, two copies are sent.<ref name="IEY"/>
 
In many [[programming language]]s and command line interfaces escape sequences are used in [[character literal]]s and [[string literal]]s, to express characters which are not printable or clash with the syntax of characters or strings. For example, [[control characters]] themselves might not be allowed to be placed in the program coded by the editor program, or may have undesirable side-effects if typed into a command. The end-of-quote character is also a problem for programmers that can be solved by escaping it. In most contexts the escape character is the [[backslash]] ("'''\'''").
 
===Samples===
For example, the single quotation mark character might be expressed as <code><nowiki>'\''</nowiki></code> since writing <code><nowiki>'''</nowiki></code> is not acceptable.
 
Many modern programming languages specify the doublequote character (<code><nowiki>"</nowiki></code>) as a [[delimiter]] for a string literal. The backslash escape character typically provides ways to include doublequotes inside a string literal, such as by modifying the meaning of the doublequote character embedded in the string (<code><nowiki>\"</nowiki></code>), or by modifying the meaning of a sequence of characters including the hexadecimal value of a doublequote character (<code><nowiki>\x22</nowiki></code>). Both sequences encode a literal doublequote (<code><nowiki>"</nowiki></code>).
 
In [[Perl]] or [[Python (programming language)|Python]] 2
<syntaxhighlight lang="perl">
print "Nancy said "Hello World!" to the crowd.";
</syntaxhighlight>
produces a syntax error, whereas:
<syntaxhighlight lang="perl">
print "Nancy said \"Hello World!\" to the crowd.";  ### example of \"
</syntaxhighlight>
produces the intended output.
Another alternative:
<syntaxhighlight lang="perl">
print "Nancy said \x22Hello World!\x22 to the crowd.";  ### example of \x22
</syntaxhighlight>
uses "\x" to indicate the following two characters are hexadecimal digits, "22" being the ASCII value for a doublequote in hexadecimal.
 
[[C (programming language)|C]], [[C++]], [[Java (programming language)|Java]], and [[Ruby (programming language)|Ruby]] all allow exactly the same two backslash escape styles. The [[PostScript]] language and Microsoft [[Rich Text Format]] also use backslash escapes. The [[quoted-printable]] encoding uses the [[equals sign]] as an escape character.
 
[[URL]] and [[URI]] use [[percent-encoding]] to quote characters with a special meaning, as for non-ASCII characters.
 
Another similar (and partially overlapping) syntactic trick is [[stropping (syntax)|stropping]].


Some programming languages also provide other ways to represent special characters in literals, without requiring an escape character (see e.g. [[delimiter collision]]).
===Esc key===
Many [[computer keyboard]]s have an [[Esc key]] (where ''Esc'' is short for ''escape'') even though it is generally not used for entering an escape sequence. The [[Vi (text editor)|vi text editor]] uses the key to exit from input mode.<ref name="VI"/> Some applications use the key to cancel an operation or to navigate up a level of a nested context.<ref name="PCWorld_2009"/>


==See also==
==See also==
* [[Control character]]
* {{Annotated link|Format (Common Lisp)}}
* [[Escape character]]
* {{Annotated link|printf format string}}
* [[printf format string]]
* {{Annotated link|stropping (syntax)}}
* [[Format (Common Lisp)|format control string]]


==References==
==References==
{{reflist|refs=
{{reflist|refs=
<ref name="Hayes">{{cite web |title=Chapter 5 – AT Commands |url=https://www.perle.com/support_services/documentation_pdfs/5500158.pdf}}</ref>
<ref name="Hayes">{{Cite web|url=https://www.perle.com/support_services/documentation_pdfs/5500158.pdf|title=Chapter 5 – AT Commands}}</ref>
<ref name="CISCO">{{cite web |url=https://www.cisco.com/c/en/us/td/docs/routers/access/2600/software/notes/analogat.html |title=AT Command Set and Register Summary for Analog Modem Modules}}</ref>
<ref name="CISCO">{{Cite web|url=https://www.cisco.com/c/en/us/td/docs/routers/access/2600/software/notes/analogat.html|title=AT Command Set and Register Summary for Analog Modem Modules|website=Cisco}}</ref>
<ref name="Data_General_Terminals">{{Cite FTP |title=Data General terminals: discussion of
<ref name="Data_General_Terminals">{{Cite FTP |title=Data General terminals: discussion of
|server=FTP server
|server=FTP server
|url-status=dead
|url-status=dead
|url=ftp://ftp.invisible-island.net/shuford/terminal/data_general_news.txt}}</ref>
|url=ftp://ftp.invisible-island.net/shuford/terminal/data_general_news.txt}}</ref>
<ref name="Kermit">{{cite web |title=What's a Terminal? |url=http://www.kermitproject.org/terminals.html}}</ref>
<ref name="Kermit">{{Cite web|url=https://www.kermitproject.org/terminals.html|title=What's a Terminal?|website=www.kermitproject.org}}</ref>
<ref name="DG210">{{cite web |title=Data General DG210 DG211 Terminal Emulation Software |url=https://www.hilgraeve.com/knowledge_base/dg210-dg211-emulation}}</ref>
<ref name="DG210">{{Cite web|url=https://www.hilgraeve.com/knowledge_base/dg210-dg211-emulation/|title=Data General DG210 DG211 Terminal Emulation Software}}</ref>
<ref name="Dict">{{cite web |url=https://www6.dict.cc/wp_examples.php?lp_id=1%26lang=en%26s=escape%2520sequence
<ref name="Dict">{{cite web |url=https://www6.dict.cc/wp_examples.php?lp_id=1%26lang=en%26s=escape%2520sequence
|title=Escape sequence}}</ref>
|title=Escape sequence}}</ref>
<ref name="Terminal_Handbook">{{cite web |title=Terminals & Printers Handbook Glossary
<ref name="Terminal_Handbook">{{Cite web|url=https://vt100.net/docs/tp83/glossary.html|title=Terminals & Printers Handbook Glossary|website=vt100.net}}</ref>
|url=https://vt100.net/docs/tp83/glossary.html}}</ref>
<ref name="VI">{{cite web |quote=vi commands […] Pressing the Esc (Escape) key is how you […] |title=Twelve Useful "vi" Commands |url=http://www.eng.buffalo.edu/~yearke/unix/vi12.shtml}}</ref>
<ref name="VI">{{cite web |quote=vi commands […] Pressing the Esc (Escape) key is how you […] |title=Twelve Useful "vi" Commands |url=http://www.eng.buffalo.edu/~yearke/unix/vi12.shtml}}</ref>
<ref name="PCWorld_2009">{{cite magazine |magazine=[[PCworld]] |date=2009-10-29 |title=Five Unexpected Uses for the Esc Key |url=https://www.pcworld.com/article/174661/article.html}}</ref>
<ref name="PCWorld_2009">{{cite magazine |magazine=[[PCworld]] |date=2009-10-29 |title=Five Unexpected Uses for the Esc Key |url=https://www.pcworld.com/article/174661/article.html}}</ref>

Latest revision as of 23:15, 5 October 2025

In computing, an escape sequence is a sequence of characters that has a special semantic meaning based on an established convention that specifies an escape character prefix in addition to the syntax of the rest of the text of a sequence.[1][2] A convention can define any particular character code as a sequence prefix. Some conventions use a normal, printable character such as backslash (\) or ampersand (&). Others use a non-printable (a.k.a. control) character such as ASCII escape.

Escape sequences date back at least to the 1874 Baudot code.[3][4][5]

Examples

Data transmission

A common use of an escape sequence is to remove control characters from a data stream so that they do not trigger their control functions by mistake. Each control character is replaced with an escape character followed by one or more other characters. Once the data has left the context in which the control character would have caused an action, the sequence is replaced by the original character.[6] To transmit the escape character itself, two copies are sent.[7]
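
The scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real protocol: the choice of ASCII ESC (0x1B) as the escape character, the rule of adding 0x40 to a control byte, and the function names `escape_stream` and `unescape_stream` are all hypothetical.

```python
ESC = 0x1B  # assumption: ASCII escape serves as the escape character


def escape_stream(data: bytes) -> bytes:
    """Replace every control byte (< 0x20) so it cannot act on the channel."""
    out = bytearray()
    for b in data:
        if b == ESC:
            out += bytes([ESC, ESC])       # the escape character itself is doubled
        elif b < 0x20:
            out += bytes([ESC, b + 0x40])  # control byte becomes ESC + printable byte
        else:
            out.append(b)                  # ordinary data passes through unchanged
    return bytes(out)


def unescape_stream(data: bytes) -> bytes:
    """Restore the original bytes from an escaped stream."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b != ESC:
            out.append(b)
            continue
        nxt = next(it, None)
        if nxt is None:
            raise ValueError("truncated escape sequence")
        out.append(ESC if nxt == ESC else nxt - 0x40)
    return bytes(out)
```

With these assumptions a line feed (0x0A) travels as ESC followed by `J` (0x4A), and the receiver recovers the original byte once the data is past the point where a raw line feed would have had an effect.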

Text literal

An escape sequence is often used in character and string literals, to encode characters which are not printable or clash with the syntax of characters or strings. For example, control characters might not be allowed in a source file or may have undesirable side-effects if typed into a command.

In C and many derivative programming languages, a backslash (\) in a string literal marks the beginning of an escape sequence.[8][9] Common escape sequences include: carriage return \r, newline \n, tab \t. To account for the fact that using a printable character for escape causes that character to lose its normal meaning, a sequence of two backslash characters (\\) encodes a single backslash. An escape sequence can also specify a character by its code value. For example, the backslash can be encoded as either \x5c or \134 which specify the character code value as hexadecimal and octal, respectively.
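
The equivalence of these spellings can be checked directly. The sketch below uses Python rather than C, on the assumption that its string literals follow the same backslash conventions for these particular sequences; the variable names are illustrative only.

```python
# Three spellings of the same one-character string: a single backslash.
doubled = "\\"       # escaped backslash
hex_form = "\x5c"    # character code given in hexadecimal
octal_form = "\134"  # character code given in octal

assert doubled == hex_form == octal_form
assert len(doubled) == 1 and ord(doubled) == 0x5C

# Common control-character escapes and their code values.
common = {"\r": 13, "\n": 10, "\t": 9}
for ch, code in common.items():
    assert ord(ch) == code
```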

A backslash immediately followed by a newline is not an escape sequence. In the second translation phase, before escape sequences in literals are converted, the backslash and the newline are deleted, which joins the line with the subsequent line.[10]

Quoting escape

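When a delimiter or escape character must itself appear within a string literal, there are two common strategies:

  • Doubled delimiter, for example 'He didn''t do it.'[7]
  • Secondary escape sequence, for example the command prompt command echo Cut^&Paste, which outputs "Cut&Paste" by escaping the ampersand operator with a caret (^)[6]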
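
One of these strategies, doubling the delimiter, can be sketched in Python as follows. The helper names `quote_doubled` and `unquote_doubled` are hypothetical, and the single quotation mark is assumed as the delimiter.

```python
def quote_doubled(text: str, delim: str = "'") -> str:
    """Wrap text in delimiters, doubling any delimiter found inside it."""
    return delim + text.replace(delim, delim * 2) + delim


def unquote_doubled(literal: str, delim: str = "'") -> str:
    """Reverse quote_doubled: strip the outer delimiters and undouble the rest."""
    if len(literal) < 2 or literal[0] != delim or literal[-1] != delim:
        raise ValueError("literal is not delimited")
    return literal[1:-1].replace(delim * 2, delim)
```

For example, `quote_doubled("He didn't do it.")` yields `'He didn''t do it.'`, and `unquote_doubled` restores the original text.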

In C and many related languages, the escape character is the backslash (\). The single quotation mark character can be coded as '\'' since ''' is not valid. Because a string literal is delimited by double quotes ("), its content can contain a double quote only if it is escaped ("\"") or written as a sequence that specifies the code of the double-quote character (\x22).

In Perl or Python 2, the following is invalid syntax:

print "Nancy said "Hello World!" to the crowd."

This can be fixed by inserting backslashes to escape the inner double quotes:

print "Nancy said \"Hello World!\" to the crowd."

Alternatively, the following uses "\x" to indicate the subsequent two characters are hexadecimal digits; "22" being the hexadecimal ASCII value for double-quote.

print "Nancy said \x22Hello World!\x22 to the crowd."

C, C++, and Ruby allow the same two backslash escape styles; Java supports \" but has no \x escape and uses an octal escape such as \42 instead. PostScript and rich text format (RTF) also use backslash escapes. The quoted-printable encoding uses the equals sign as an escape character. URLs and URIs use percent-encoding to quote characters with a special meaning, as well as non-ASCII characters.
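
Percent-encoding and quoted-printable can both be tried with the Python standard library. The sketch below uses `urllib.parse.quote` and `unquote` for percent-encoding and the `quopri` module for quoted-printable; the sample strings are arbitrary.

```python
import quopri
from urllib.parse import quote, unquote

# Percent-encoding: reserved and non-ASCII characters become %XX sequences.
encoded = quote("Cut&Paste 100%")
assert encoded == "Cut%26Paste%20100%25"
assert unquote(encoded) == "Cut&Paste 100%"
assert quote("é") == "%C3%A9"  # non-ASCII text is encoded as UTF-8 bytes first

# Quoted-printable: the equals sign is the escape character, so a literal
# "=" must itself be written as the sequence "=3D".
qp = quopri.encodestring(b"a=b")
assert b"=3D" in qp
assert quopri.decodestring(qp) == b"a=b"
```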

ANSI escape sequences

The VT52 terminal used simple digraph commands like escape-A. Without the escape character prefix, A simply meant the letter A, but as part of the escape sequence escape-A, it had a different meaning. The VT52 also supported parameters. It was not a straightforward control language encoded as substitution.

The later VT100 terminal implemented the more sophisticated ANSI escape sequences standard (now ECMA-48) for functions such as controlling cursor movement, character set, and display enhancements. The HP 2640 series had perhaps the most elaborate escape sequences for block and character modes, programming keys and their soft labels, graphics vectors, and even saving data to tape or disk files.
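
As a rough illustration of the ANSI (ECMA-48) style, the following Python sketch builds two kinds of sequence as plain strings: one that sets display attributes and one that positions the cursor. The helper names `sgr` and `cursor_to` are hypothetical; the sequences take effect only when written to a terminal that interprets them.

```python
ESC = "\x1b"     # ASCII escape, decimal 27
CSI = ESC + "["  # two-character control sequence introducer


def sgr(*codes: int) -> str:
    """Build a 'select graphic rendition' sequence, e.g. bold red is sgr(1, 31)."""
    return CSI + ";".join(str(c) for c in codes) + "m"


def cursor_to(row: int, col: int) -> str:
    """Build a cursor-position sequence (rows and columns count from 1)."""
    return f"{CSI}{row};{col}H"


# Move to the top-left corner, write bold red text, then reset attributes.
demo = cursor_to(1, 1) + sgr(1, 31) + "warning" + sgr(0)
```

Writing `demo` to such a terminal would show the word in bold red at the top-left corner; on a device that ignores the sequences, the raw characters appear instead.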

In MS-DOS and in command windows of 16-bit Windows, the ANSI.SYS driver[11] can be used to enable ANSI escape sequence support; in DOS, the escape character can be written as $e in the PROMPT command. In Unix and Unix-like systems, ANSI escape sequences are generally interpreted by the text terminal or terminal emulator (such as xterm). The rise of GUI applications has reduced the use of escape sequences, but they can still be used to build full-screen, text-based applications.

Related

Control sequence

A control sequence is a sequence of characters that changes the state of a computer peripheral instead of conveying the normal information that the characters represent. In an ANSI escape sequence, the prefix, called the control sequence introducer, can be either ASCII ESC (decimal 27) followed by [ or the single character CSI (decimal 155). Notable systems that did not use an escape character for control sequences include:

  • The Hayes command set defines a control sequence, +++, that is modal: it switches the modem from online mode to command mode. To ensure that the sequence is interpreted as a control sequence rather than as part of the data, the sender stops communication for one second before and after sending +++. When the modem detects this condition, it switches from normal mode (sending characters to the phone) to a command mode in which the following data is interpreted as commands. Sending the O command switches back to online mode.[12][13][14][15]
  • Data General terminals used control sequences that did not begin with the escape character,[16][17][18] although these were often still called escape sequences. Similarly, the very common practice of "escaping" special characters in programming languages and command-line parameters today usually begins the sequence with the backslash character rather than the escape character.
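
The guard-time rule for +++ can be sketched as a small Python function. This is a simplified model under stated assumptions, not modem firmware: input is assumed to arrive as time-ordered `(timestamp_seconds, data)` pairs, transmission time is ignored, and the name `find_escape` and its parameters are hypothetical.

```python
GUARD_SECONDS = 1.0  # silence required before and after "+++", per the text above


def find_escape(events, now, guard=GUARD_SECONDS):
    """Return the index of the first "+++" surrounded by silence, or None.

    events: time-ordered list of (timestamp_seconds, data_bytes) pairs.
    now:    current time, used to measure the silence after the last event.
    """
    for i, (t, data) in enumerate(events):
        if data != b"+++":
            continue  # "+++" embedded in other data is ordinary content
        prev_t = events[i - 1][0] if i > 0 else float("-inf")
        next_t = events[i + 1][0] if i + 1 < len(events) else now
        if t - prev_t >= guard and next_t - t >= guard:
            return i
    return None
```

A "+++" that arrives in the middle of a burst, or too soon after other data, is treated as content; only an isolated one would switch the modem to command mode.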

Escape sequences in communications are commonly used when a computer and a peripheral have only a single channel through which to send information back and forth (so escape sequences are an example of in-band signaling).[19][20] They were common when most dumb terminals used ASCII with 7 data bits for communication, and were sometimes used to switch to a different character set for "foreign" or graphics characters that would otherwise have been restricted by the 128 codes available in 7 data bits. Even relatively "dumb" terminals responded to some escape sequences; the original mechanical Teletype printers (on which "glass Teletypes" or VDUs were based) responded to characters 27 and 31 to alternate between letters and figures modes.
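
The letters/figures switching can be illustrated with a toy decoder in Python. The shift codes 27 and 31 come from the paragraph above; the two three-entry tables are a hypothetical subset made up for this sketch, not the real teleprinter code assignments, and the function name `decode` is likewise illustrative.

```python
FIGS, LTRS = 27, 31  # shift codes: switch to figures mode / letters mode

# Toy tables (hypothetical): the same code means different things per mode.
LETTERS = {1: "A", 2: "B", 3: "C"}
FIGURES = {1: "1", 2: "2", 3: "3"}


def decode(codes):
    """Decode a code sequence, tracking the current shift state in-band."""
    table = LETTERS  # assumption: the receiver starts in letters mode
    out = []
    for code in codes:
        if code == FIGS:
            table = FIGURES
        elif code == LTRS:
            table = LETTERS
        else:
            out.append(table[code])
    return "".join(out)
```

Here `decode([1, 2, 27, 1, 2, 31, 3])` gives `"AB12C"`: the shift codes produce no output of their own and only change how later codes are read.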

Esc key

Many computer keyboards have an Esc key (where Esc is short for escape) even though it is generally not used for entering an escape sequence. The vi text editor uses the key to exit from input mode.[21] Some applications use the key to cancel an operation or to navigate up a level of a nested context.[22]

See also

  • Format (Common Lisp)
  • printf format string
  • Stropping (syntax)

References

  1. "Escape Sequence (General Concept)". https://www.spss-tutorials.com/escape-sequence/
  2. "Characters". The Java Tutorials. https://docs.oracle.com/javase/tutorial/java/data/characters.html
  3. Cite error: Invalid <ref> tag; no text was provided for refs named Economist_2013
  4. Cite error: Invalid <ref> tag; no text was provided for refs named Baudot
  5. Cite error: Invalid <ref> tag; no text was provided for refs named TC304
  6. a b Cite error: Invalid <ref> tag; no text was provided for refs named CMD.a
  7. a b Cite error: Invalid <ref> tag; no text was provided for refs named IEY
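  8. "Escape Sequences". 3 August 2021. https://msdn.microsoft.com/en-us/library/h21280bw.aspx. Quote: "Character combinations consisting of a backslash \ followed by a letter or by a combination of digits are called escape sequences."
  9. "Escape sequences". IBM. https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.cbclx01/escape.htm
  10. "ISO/IEC 9899:201x Committee Draft N1570". http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1570.pdf#page=29. Quote: "5.1.1.2 Translation phases, 2.: Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. [...]"
  11. "17. Understanding ANSI.SYS - Special Edition Using MS-DOS® 6.22, Third Edition [Book]". www.oreilly.com. https://www.oreilly.com/library/view/special-edition-using/0789725738/ch17.html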
  12. Cite error: Invalid <ref> tag; no text was provided for refs named Modem_2011
  13. Cite error: Invalid <ref> tag; no text was provided for refs named Modem_Programming
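  14. "Chapter 5 – AT Commands". https://www.perle.com/support_services/documentation_pdfs/5500158.pdf
  15. "AT Command Set and Register Summary for Analog Modem Modules". Cisco. https://www.cisco.com/c/en/us/td/docs/routers/access/2600/software/notes/analogat.html
  16. "Data General terminals: discussion of". FTP server. ftp://ftp.invisible-island.net/shuford/terminal/data_general_news.txt (dead link)
  17. "What's a Terminal?". www.kermitproject.org. https://www.kermitproject.org/terminals.html
  18. "Data General DG210 DG211 Terminal Emulation Software". https://www.hilgraeve.com/knowledge_base/dg210-dg211-emulation/
  19. "Escape sequence". https://www6.dict.cc/wp_examples.php?lp_id=1%26lang=en%26s=escape%2520sequence
  20. "Terminals & Printers Handbook Glossary". vt100.net. https://vt100.net/docs/tp83/glossary.html
  21. "Twelve Useful "vi" Commands". http://www.eng.buffalo.edu/~yearke/unix/vi12.shtml. Quote: "vi commands […] Pressing the Esc (Escape) key is how you […]"
  22. "Five Unexpected Uses for the Esc Key". PCWorld. 2009-10-29. https://www.pcworld.com/article/174661/article.html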