Zero-width space

From Wikipedia, the free encyclopedia

The zero-width space (HTML entity: &#8203; or &ZeroWidthSpace;), abbreviated ZWSP, is a non-printing character used in computerized typesetting to mark word boundaries without displaying a visible space in the rendered text. This lets text-processing systems for scripts that do not use explicit spacing recognize where words begin and end, so that line breaks can be placed appropriately.

The zero-width space is Unicode character U+200B, and is located in the Unicode General Punctuation block. In HTML, it can be represented by the character entity reference &ZeroWidthSpace;.

Purpose

The zero-width space marks a potential line break without hyphenation. Its semantics and HTML implementation are similar to the soft hyphen, but soft hyphens display a hyphen character at the point where the line is broken.

The zero-width space can be used to mark word breaks in languages without visible space between words, such as Thai, Myanmar, Khmer, and Japanese.[1]

In justified text, the rendering engine may add inter-character spacing, also known as letter spacing, between letters separated by a zero-width space, unlike around fixed-width spaces.[1]

Example

To show the effect of the zero-width space in text, the following words have been separated with zero-width spaces:

By contrast, the following words have not been separated:

LoremIpsumDolorSitAmetConsecteturAdipiscingElitSedDoEiusmodTemporIncididuntUtLaboreEtDoloreMagnaAliquaUtEnimAdMinimVeniamQuisNostrudExercitationUllamcoLaborisNisiUtAliquipExEaCommodoConsequatDuisAuteIrureDolorInReprehenderitInVoluptateVelitEsseCillumDoloreEuFugiatNullaPariaturExcepteurSintOccaecatCupidatatNonProidentSuntInCulpaQuiOfficiaDeseruntMollitAnimIdEstLaborum

The first text is broken into lines but only at word boundaries, and resizing the browser window will re-break the text accordingly, while the second text is not broken at all.
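
The construction used in this example can be sketched in Python. This is only an illustration under stated assumptions: the helper name join_with_zwsp and the short word list are hypothetical and chosen for the demonstration, not taken from any library.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def join_with_zwsp(words):
    """Join words with zero-width spaces so a renderer may break between them."""
    return ZWSP.join(words)


# Hypothetical sample: the first five words of the text used in the example.
words = ["Lorem", "Ipsum", "Dolor", "Sit", "Amet"]
separated = join_with_zwsp(words)
unseparated = "".join(words)

# The two strings look the same when displayed, but their contents differ.
print(len(unseparated))  # 22
print(len(separated))    # 26 (four invisible characters were added)
print(separated.replace(ZWSP, "") == unseparated)  # True
```

Whether a line is actually broken at these positions depends on the rendering engine; the string itself only carries the break opportunities.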

Usage

HTML

In HTML pages, the HTML element <wbr> functions as a zero-width space. In Internet Explorer 6, the zero-width space was not supported in some fonts.[2]

Prohibition in domain names

ICANN rules prohibit domain names from containing non-displayed characters, including the zero-width space, and most browsers prohibit their use within domain names because they can be used to create a homograph attack, where a malicious URL is visually indistinguishable from a legitimate one.[3][4]
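
A rough way to spot such characters in a host name is sketched below. It is not the check that ICANN or any browser actually performs. The function name find_invisible_chars is hypothetical, and using the Unicode general category Cf (format characters, which includes U+200B) as a stand-in for "non-displayed" is an assumption made for illustration.

```python
import unicodedata


def find_invisible_chars(hostname):
    """Return (index, code point) pairs for format characters (category Cf)."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(hostname)
        if unicodedata.category(ch) == "Cf"
    ]


print(find_invisible_chars("example.com"))        # []
print(find_invisible_chars("exam\u200bple.com"))  # [(4, 'U+200B')]
```

The second host name displays the same as the first in most contexts, which is what makes the character useful for spoofing.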

Encoding

The zero-width space character is encoded in Unicode as U+200B ZERO WIDTH SPACE.[5]

In HTML, it can be referenced as &ZeroWidthSpace;, &#8203; or &#x200B;. Contrary to what their names suggest, the character entities &NegativeThickSpace;, &NegativeMediumSpace;, &NegativeThinSpace;, and &NegativeVeryThinSpace; also refer to the zero-width space.[6]
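
The equivalence of these references can be checked with Python's standard html module, which decodes named and numeric character references using the HTML5 entity table. The sketch below only demonstrates the mapping; the variable name references is an arbitrary choice.

```python
import html
import unicodedata

references = [
    "&ZeroWidthSpace;",
    "&#8203;",
    "&#x200B;",
    "&NegativeThickSpace;",
    "&NegativeMediumSpace;",
    "&NegativeThinSpace;",
    "&NegativeVeryThinSpace;",
]

for ref in references:
    decoded = html.unescape(ref)
    # Each reference decodes to the single character U+200B.
    print(ref, "->", f"U+{ord(decoded):04X}", unicodedata.name(decoded))
```

Each line of output ends with "U+200B ZERO WIDTH SPACE".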

In HTML mailto: links, the percent-encoded sequence %E2%80%8B (the UTF-8 encoding of U+200B) produces a zero-width space, but may interfere with correctly copying the email link.
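
The relationship between U+200B and the percent-encoded form can be verified with Python's urllib.parse module. This is a minimal sketch of the encoding only and does not model how any mail client handles the link.

```python
from urllib.parse import quote, unquote

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

# U+200B occupies three bytes in UTF-8.
print(ZWSP.encode("utf-8"))  # b'\xe2\x80\x8b'

# Percent-encoding writes each of those bytes as %XX.
print(quote(ZWSP))  # %E2%80%8B

# Decoding the sequence restores the original character.
print(unquote("%E2%80%8B") == ZWSP)  # True
```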

The TeX representation is \hskip0pt; the LaTeX representation is \hspace{0pt};[7] and the groff representation is \:.[8]

References

Citations

  1. a b Script error: No such module "citation/CS1".
  2. Script error: No such module "citation/CS1".
  3. Script error: No such module "citation/CS1".
  4. Script error: No such module "citation/CS1".
  5. Script error: No such module "citation/CS1".
  6. Entities/ZeroWidthSpace in MathML Version 2.0
  7. Script error: No such module "citation/CS1".
  8. Script error: No such module "citation/CS1".
