Empty string: Difference between revisions

From Wikipedia, the free encyclopedia
imported>Symbol & Font Hunter
 
imported>TristanDC
Use in programming languages: Began being clear with terminology, following the formal theory concept, to avoid making the readers totally baffled. They might still need to apply some thought but the best solution eludes me.
 
Line 2: Line 2:
{{Short description|Unique string of length zero}}
{{Short description|Unique string of length zero}}
{{Refimprove|date=November 2009}}
{{Refimprove|date=November 2009}}
In [[formal language theory]], the '''empty string''', or '''empty word''', is the unique [[String (computer science)|string]] of length zero.
In [[formal language theory]], the '''empty string''', also known as the '''empty word''' or '''null string''', is the unique [[String (computer science)|string]] of length zero.


==Formal theory<span class="anchor" id="nullable symbol"></span>==
==Formal theory<span class="anchor" id="nullable symbol"></span>==
Line 16: Line 16:
* ε<sup>R</sup> = ε. Reversal of the empty string produces the empty string, so the empty string is a [[palindrome]].
* ε<sup>R</sup> = ε. Reversal of the empty string produces the empty string, so the empty string is a [[palindrome]].
* <math>\forall c \in s: P(c)</math>. Statements that are about all characters in a string are [[vacuous truth|vacuously true]].
* <math>\forall c \in s: P(c)</math>. Statements that are about all characters in a string are [[vacuous truth|vacuously true]].
* The empty string precedes any other string under [[lexicographical order]], because it is the shortest of all strings.<ref>[http://cs.fit.edu/~ryan/cse1002/lectures/lexicographic.pdf CSE1002 Lecture Notes – Lexicographic]</ref>
* The empty string precedes any other string under [[lexicographical order]], because it is the shortest of all strings.<ref>{{Cite web |url=http://cs.fit.edu/~ryan/cse1002/lectures/lexicographic.pdf |title=CSE1002 Lecture Notes – Lexicographic |access-date=2010-03-27 |archive-date=2009-12-29 |archive-url=https://web.archive.org/web/20091229212044/http://cs.fit.edu/~ryan/cse1002/lectures/lexicographic.pdf |url-status=dead }}</ref>


In [[context-free grammar]]s, a [[production (computer science)|production rule]] that allows a [[symbol (logic)|symbol]] to produce the empty string is known as an ε-production, and the symbol is said to be "nullable".
In [[context-free grammar]]s, a [[production (computer science)|production rule]] that allows a [[symbol (logic)|symbol]] to produce the empty string is known as an ε-production, and the symbol is said to be "nullable".


==Use in programming languages==
==Use in programming languages==
In most [[programming language]]s, strings are a [[data type]]. Strings are typically stored at distinct [[memory address]]es (locations). Thus, the same string (e.g., the empty string) may be stored in two or more places in memory.  
In most [[programming language]]s, the term "string" often refers to instances of a [[data type]] and thus they're a concept distinct from the one in the formal theory. Such strings are typically stored at distinct [[memory address]]es (locations) and thus have an identity. Thus, representatives of the same formal string (e.g., the empty string) may be stored in two or more places in memory and they can be taken as names of the formal empty string.  


In this way, there could be multiple empty strings in memory, in contrast with the formal theory definition, for which there is only one possible empty string. However, a string comparison function would indicate that all of these empty strings are equal to each other.
In this way, there could be multiple representatives of the empty string in memory, in contrast with the formal theory definition, for which there is only one possible empty string. However, a "string comparison function" would indicate that all of these representatives are equal to each other.


Even a string of length zero can require memory to store it, depending on the format being used. In most programming languages, the empty string is distinct from a [[null reference]] (or null pointer) because a null reference points to no string at all, not even the empty string.
Even a string of length zero can require memory to store it, depending on the format being used. In most programming languages, the empty string is distinct from a [[null reference]] (or null pointer) because a null reference points to no string at all, not even the empty string.
Line 50: Line 50:
| [[C (programming language)|C]], [[C++]], [[Objective-C]] (as a C string)
| [[C (programming language)|C]], [[C++]], [[Objective-C]] (as a C string)
|-
|-
| <code>std::string()</code>
| <code>new String()</code> (from <code>java.lang.String</code>)
| [[Java (programming language)|Java]]
|-
| <code>string()</code> (from <code>std::string</code>)
| [[C++]]
| [[C++]]
|-
|-
Line 65: Line 68:
| [[Perl]]
| [[Perl]]
|-
|-
| <code>str()</code><ref>Another way to make an empty string is multiplying a string by 0 or a negative integer.</ref>
| <code>str()</code><ref>Another way to make an empty string is multiplying a string by 0 or a negative integer.</ref> <code>""""""</code> <code>r""</code> <code>u""</code>
| [[Python (programming language)|Python]]
| [[Python (programming language)|Python]]
|-
|-
Line 71: Line 74:
| [[Ruby (programming language)|Ruby]]
| [[Ruby (programming language)|Ruby]]
|-
|-
| <code>String::new()</code><ref>{{Cite web |title=String in std::string - Rust |url=https://doc.rust-lang.org/std/string/struct.String.html#method.new |access-date=2022-11-30 |website=doc.rust-lang.org}}</ref>
| <code>String::new()</code> (from <code>std::string::String</code>)<ref>{{Cite web |title=String in std::string - Rust |url=https://doc.rust-lang.org/std/string/struct.String.html#method.new |access-date=2022-11-30 |website=doc.rust-lang.org}}</ref>
| [[Rust (programming language)|Rust]]
| [[Rust (programming language)|Rust]]
|-
|-
| <code>string.Empty</code>
| <code>String.Empty</code> (from <code>System.String</code>)
| [[C Sharp (programming language)|C#]], [[VB.NET|Visual Basic .NET]]
| [[C Sharp (programming language)|C#]], [[VB.NET|Visual Basic .NET]]
|-  
|-  
Line 88: Line 91:
|<code>“”</code>
|<code>“”</code>
<code>‘’</code>
<code>‘’</code>
<code>„”</code>
<code>‚‘</code>{{refn|All 'smart' quotation marks work as both opening and closing marks in any combination, except that single marks must be paired with single marks and double marks with double marks. For example {{code|"Hello, world„}} is valid. Guillemets are not supported. Note that the German-style low single quotation mark given here is {{Unichar|201A}}; the similar-looking character {{Unichar|2C}} does not function as a quotation mark. PowerShell's official documentation recommends using straight quotation marks.<ref>{{cite web |url=https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_quoting_rules?view=powershell-7.5 |title=about_Quoting_Rules – PowerShell |work=Microsoft Learn |access-date=26 August 2025 |url-status=live |archive-url=https://web.archive.org/web/20250814175659/https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_quoting_rules?view=powershell-7.5 |archive-date=2025-08-14 |quote=PowerShell treats smart quotation marks, also called typographic or curly quotes, as normal quotation marks for strings. Don't use smart quotation marks to enclose strings.}}</ref>}}
|[[PowerShell]]
|[[PowerShell]]
|-
|<code>.byte 0</code>
<code>.ascii ""</code>
<code>.asciz ""</code>
|[[A64 (instruction set)|A64]]
|}
|}



Latest revision as of 18:09, 16 December 2025

In formal language theory, the empty string, also known as the empty word or null string, is the unique string of length zero.

Formal theory

Formally, a string is a finite, ordered sequence of characters such as letters, digits or spaces. The empty string is the special case where the sequence has length zero, so there are no symbols in the string. There is only one empty string, because two strings are only different if they have different lengths or a different sequence of symbols. In formal treatments,[1] the empty string is denoted with ε or sometimes Λ or λ.

The empty string should not be confused with the empty language ∅, which is a formal language (i.e. a set of strings) that contains no strings, not even the empty string.

The empty string has several properties:

  • |ε| = 0. Its string length is zero.
  • ε ⋅ s = s ⋅ ε = s. The empty string is the identity element of the concatenation operation. The set of all strings forms a free monoid with respect to ⋅ and ε.
  • ε^R = ε. Reversal of the empty string produces the empty string, so the empty string is a palindrome.
  • ∀c ∈ s: P(c) holds when s = ε. Statements that are about all characters in the empty string are vacuously true.
  • The empty string precedes any other string under lexicographical order, because it is the shortest of all strings.[2]
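The properties listed above can be checked mechanically. The following is a minimal Python sketch, assuming that Python's built-in `str` type is an adequate stand-in for formal strings; the names `EMPTY`, `samples` and `reverse` are illustrative and do not come from the article.

```python
# Illustrative check of the listed properties, using Python's str as a
# stand-in for formal strings (an assumption, not part of the formal theory).
EMPTY = ""
samples = ["", "a", "abc", "hello world"]


def reverse(s: str) -> str:
    """Return the reversal of s."""
    return s[::-1]


# |ε| = 0
length_is_zero = len(EMPTY) == 0

# ε ⋅ s = s ⋅ ε = s for every sample string s
identity_holds = all(EMPTY + s == s and s + EMPTY == s for s in samples)

# ε^R = ε, so ε is a palindrome
is_palindrome = reverse(EMPTY) == EMPTY

# Any universally quantified statement over the characters of ε is true,
# even two statements that contradict each other on every real character.
vacuous_truth = all(c.isdigit() for c in EMPTY) and all(c.isalpha() for c in EMPTY)

# ε precedes every other string in lexicographic order
precedes_others = all(EMPTY < s for s in samples if s != EMPTY)
```

Each of the five flags evaluates to `True`. The vacuous-truth line relies on `all` returning `True` for an empty iterable.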

In context-free grammars, a production rule that allows a symbol to produce the empty string is known as an ε-production, and the symbol is said to be "nullable".
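Nullability is commonly extended from symbols with a direct ε-production to every symbol that can derive ε in one or more steps. The sketch below computes that set by fixed-point iteration. The grammar encoding (a dictionary from a symbol to a list of production bodies, with an empty list standing for ε) and the names `nullable_symbols` and `EXAMPLE_GRAMMAR` are assumptions made for illustration.

```python
def nullable_symbols(grammar: dict) -> set:
    """Return the set of symbols that can derive the empty string.

    grammar maps each nonterminal to a list of production bodies; each body
    is a list of symbols, and the empty list [] encodes an ε-production.
    """
    nullable = set()
    changed = True
    while changed:  # repeat until no new nullable symbol is found
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A head is nullable if some body consists only of nullable
            # symbols; this holds trivially for the empty body.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


# Hypothetical grammar: lowercase letters are terminals.
EXAMPLE_GRAMMAR = {
    "S": [["A", "B"], ["a"]],
    "A": [[]],              # A -> ε
    "B": [["b"], ["A"]],    # B -> b | A
    "C": [["c"]],
}

result = nullable_symbols(EXAMPLE_GRAMMAR)
```

Here `A` is nullable directly, `B` through `A`, and `S` through `A B`, while `C` is not nullable.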

Use in programming languages

In most programming languages, the term "string" refers to instances of a data type, a concept distinct from the string of the formal theory. Such strings are typically stored at distinct memory addresses (locations) and therefore have an identity. Representatives of the same formal string (e.g., the empty string) may thus be stored in two or more places in memory, and each can be taken as a name of the formal empty string.

In this way, there can be multiple representatives of the empty string in memory, in contrast with the formal theory definition, under which there is only one empty string. However, a string comparison function would report all of these representatives as equal to each other.
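The distinction between equality and identity can be illustrated in Python, where `==` compares contents and `is` compares object identity. This is only a sketch: whether separately produced empty strings share one object is an implementation detail, so the value of `shared_identity` is deliberately left unstated.

```python
# Three empty strings obtained in different ways.
first = ""
second = "".join([])   # joining no pieces
third = "text"[:0]     # a zero-length slice

# Content comparison: every representative equals every other.
all_equal = first == second == third

# Identity comparison: True if the interpreter reuses one object for all
# three, False otherwise. The language does not guarantee either outcome.
shared_identity = (first is second) and (second is third)
```

Only `all_equal` is guaranteed to be `True`; code that needs to recognise the empty string should therefore compare by value, not by identity.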

Even a string of length zero can require memory to store it, depending on the format being used. In most programming languages, the empty string is distinct from a null reference (or null pointer) because a null reference points to no string at all, not even the empty string. The empty string is a legitimate string, upon which most string operations should work. Some languages treat some or all of the following in similar ways: empty strings, null references, the integer 0, the floating point number 0, the Boolean value false, the ASCII character NUL, or other such values.
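A short Python sketch of these distinctions, with `None` playing the role of the null reference. The helper `describe` and the variable names are illustrative assumptions.

```python
empty = ""
missing = None  # Python's counterpart of a null reference


def describe(value):
    """Classify a value as no string, the empty string, or a non-empty string."""
    if value is None:
        return "no string"
    if value == "":
        return "empty string"
    return "non-empty string"


# String operations work on the empty string ...
upper_of_empty = empty.upper()

# ... but not on the null reference, which is not a string at all.
try:
    missing.upper()
    none_supports_upper = True
except AttributeError:
    none_supports_upper = False

# Python treats these values alike in a Boolean context: all are false.
falsy = [bool(v) for v in ("", None, 0, 0.0, False)]

# A string holding the NUL character is not empty in Python: its length is 1.
nul_char = "\0"
```

In Python, `""`, `None`, `0`, `0.0` and `False` are all false in a Boolean context, whereas `"\0"` is a one-character string and is true. Other languages draw these lines differently.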

The empty string is usually represented similarly to other strings. In implementations with a string-terminating character (null-terminated strings or plain text lines), the empty string is indicated by the immediate use of this terminating character.
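The terminator convention can be modelled in Python with a `bytes` object standing in for raw memory. The function `read_c_string` and the sample buffer are hypothetical and written for illustration only.

```python
def read_c_string(memory: bytes, start: int = 0) -> bytes:
    """Read a null-terminated string beginning at offset start.

    Raises ValueError if no terminating NUL byte follows start.
    """
    end = memory.index(b"\x00", start)  # position of the terminator
    return memory[start:end]


# Layout: 'h' 'i' NUL NUL 'o' 'k' NUL
memory = b"hi\x00\x00ok\x00"

at_0 = read_c_string(memory, 0)  # stops at the NUL at offset 2
at_3 = read_c_string(memory, 3)  # offset 3 is itself a NUL
at_4 = read_c_string(memory, 4)  # stops at the NUL at offset 6
```

Reading from offset 3 meets the terminator immediately, so the result is the empty byte string, while offsets 0 and 4 yield `b"hi"` and `b"ok"`.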

Different functions, methods, macros, or idioms exist for checking if a string is empty in different languages.
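As one example, Python offers at least three equivalent idioms for this check. The function names below are made up for illustration; the underlying operations (`len`, `==` and truth testing) are standard.

```python
def is_empty_by_length(s: str) -> bool:
    """Empty means length zero."""
    return len(s) == 0


def is_empty_by_equality(s: str) -> bool:
    """Empty means equal to the empty literal."""
    return s == ""


def is_empty_by_truthiness(s: str) -> bool:
    """Empty strings are false in a Boolean context."""
    return not s


checks = (is_empty_by_length, is_empty_by_equality, is_empty_by_truthiness)
results_for_empty = [check("") for check in checks]
results_for_space = [check(" ") for check in checks]
```

All three agree for string arguments: `""` is empty and `" "` (a single space) is not. The truthiness form also returns `True` for non-string values such as `None` or `0`, so it is only a safe test when the argument is known to be a string.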

λ representation | Programming languages
"" | C, C#, C++, Go, Haskell, Java, JavaScript, Julia, Lua, M, Objective-C (as a C string), OCaml, Perl, PHP, PowerShell, Python, Ruby, Scala, Standard ML, Swift, Tcl, Visual Basic .NET
'' | APL, Delphi, JavaScript, Lua, MATLAB, Pascal, Perl, PHP, PowerShell, Python, R, Ruby, Smalltalk, SQL
character(0) | R
{'\0'} | C, C++, Objective-C (as a C string)
new String() (from java.lang.String) | Java
string() (from std::string) | C++
""s | C++ (since the 2014 standard)
@"" | Objective-C (as a constant NSString object)
[NSString string] | Objective-C (as a new NSString object)
q(), qq() | Perl
str()[3], """""", r"", u"" | Python
%{}, %() | Ruby
String::new() (from std::string::String)[4] | Rust
String.Empty (from System.String) | C#, Visual Basic .NET
String.make 0 '-' | OCaml
{} | Tcl
[[]] | Lua
“”, ‘’, „”, ‚‘ | PowerShell
.byte 0, .ascii "", .asciz "" | A64
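The Python forms listed in the table, together with the multiplication idiom mentioned in its footnote, can be compared directly. The dictionary name `python_forms` is an illustrative choice.

```python
# Each key is the source text of an expression; each value is its result.
python_forms = {
    '""': "",
    "''": '',
    'str()': str(),
    '""""""': """""",       # triple-quoted literal with nothing inside
    'r""': r"",             # raw string literal
    'u""': u"",             # legacy Unicode prefix, accepted since Python 3.3
    '"abc" * 0': "abc" * 0,
    '"abc" * -3': "abc" * -3,
}

# Every form yields a str of length zero that equals "".
all_empty = all(
    isinstance(value, str) and value == "" and len(value) == 0
    for value in python_forms.values()
)
```

All eight expressions produce the same value, so `all_empty` is `True`.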

Representations of the empty string

The empty string is a syntactically valid representation of zero in positional notation (in any base), which does not contain leading zeros. Since the empty string does not have a standard visual representation outside of formal language theory, the number zero is traditionally represented by a single decimal digit 0 instead.
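The claim follows from the usual way of evaluating a digit string: start from zero, and for each digit multiply the running value by the base and add the digit. With no digits, the value stays at zero. The function `parse_positional` below is a hypothetical sketch of that procedure, not a library routine.

```python
def parse_positional(digits: str, base: int = 10) -> int:
    """Evaluate a digit string in the given base (2 to 36) by Horner's method."""
    value = 0
    for ch in digits:
        value = value * base + int(ch, base)  # int() converts one digit
    return value


zero_from_empty = parse_positional("")
forty_two = parse_positional("42")
from_hex = parse_positional("ff", 16)
from_binary = parse_positional("101", 2)

# Python's built-in parser rejects the empty string instead of returning 0.
try:
    int("")
    builtin_accepts_empty = True
except ValueError:
    builtin_accepts_empty = False
```

The sketch maps the empty string to 0 in every base, whereas Python's built-in `int("")` raises `ValueError`, which reflects the convention of writing zero as the digit 0.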

A zero-filled memory area, interpreted as a null-terminated string, is an empty string.
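This can be observed from Python through the standard `ctypes` module, whose `create_string_buffer` function returns a zero-initialised character buffer when given a size. The buffer size of 16 is an arbitrary choice for the sketch.

```python
import ctypes

# A 16-byte character buffer; ctypes initialises it with zero bytes.
buffer = ctypes.create_string_buffer(16)

# The raw contents: sixteen NUL bytes.
raw_bytes = buffer.raw

# The same memory read as a null-terminated string: reading stops at the
# first NUL, which is the very first byte.
as_c_string = buffer.value
```

`raw_bytes` holds sixteen zero bytes, while `as_c_string` is the empty byte string `b""`.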

An empty line of text represents the empty string. Such lines arise from two consecutive EOLs, as often occur in text files. This is sometimes used in text processing to separate paragraphs, e.g. in MediaWiki.
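A small Python sketch shows both effects, assuming `"\n"` as the EOL sequence. The sample text is invented for illustration.

```python
text = "First paragraph,\nsecond line.\n\nSecond paragraph.\n"

# Splitting on the EOL yields one entry per line; the two consecutive EOLs
# produce an empty string, and so does the EOL at the very end.
lines = text.split("\n")

# Treating an empty line as a separator recovers the paragraphs.
paragraphs = text.strip("\n").split("\n\n")
```

Here `lines` contains the empty string at index 2 (the blank line) and at index 4 (after the final EOL), and `paragraphs` contains the two paragraphs.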

See also

References


  1. Script error: No such module "Citation/CS1".
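  2. CSE1002 Lecture Notes – Lexicographic. http://cs.fit.edu/~ryan/cse1002/lectures/lexicographic.pdf. Archived from the original on 2009-12-29. Retrieved 2010-03-27.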
  3. Another way to make an empty string is multiplying a string by 0 or a negative integer.
  4. String in std::string - Rust. doc.rust-lang.org. https://doc.rust-lang.org/std/string/struct.String.html#method.new. Retrieved 2022-11-30.
