Tab-separated values

From Wikipedia, the free encyclopedia
Revision as of 09:35, 9 June 2025 by imported>GhostInTheMachine (Changing short description from "Text file format" to "Text file format for tabular data")

Tab-separated values (TSV) is a simple, text-based file format for storing tabular data.[1] Records are separated by newlines, and values within a record are separated by tab characters. The TSV format is thus a delimiter-separated values format, similar to comma-separated values.

Because TSV is simple and widely supported, it is often used to exchange tabular data between different computer programs. For example, a TSV file might be used to transfer information from a database to a spreadsheet.

Example

The head of the Iris flower data set can be stored as a TSV using the following plain text (note that the HTML rendering may convert tabs to spaces):

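Sepal length	Sepal width	Petal length	Petal width	Species
5.1	3.5	1.4	0.2	I. setosa
4.9	3.0	1.4	0.2	I. setosa
4.7	3.2	1.3	0.2	I. setosa
4.6	3.1	1.5	0.2	I. setosa
5.0	3.6	1.4	0.2	I. setosa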

The TSV plain text above corresponds to the following tabular data:

Sepal length  Sepal width  Petal length  Petal width  Species
5.1           3.5          1.4           0.2          I. setosa
4.9           3.0          1.4           0.2          I. setosa
4.7           3.2          1.3           0.2          I. setosa
4.6           3.1          1.5           0.2          I. setosa
5.0           3.6          1.4           0.2          I. setosa
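
As a minimal sketch of how such a file could be read, the following Python snippet splits the text on newlines and then on tabs. The names `parse_tsv` and `tsv_text` are illustrative only, and the sketch assumes line feed record separators and no escaping or quoting convention.

```python
def parse_tsv(text):
    """Split TSV text into a list of records, each a list of field strings."""
    # Drop one trailing newline, if present, so it does not yield an empty final record.
    if text.endswith("\n"):
        text = text[:-1]
    return [line.split("\t") for line in text.split("\n")]


# The first rows of the example above, written with explicit tab escapes.
tsv_text = (
    "Sepal length\tSepal width\tPetal length\tPetal width\tSpecies\n"
    "5.1\t3.5\t1.4\t0.2\tI. setosa\n"
    "4.9\t3.0\t1.4\t0.2\tI. setosa\n"
)

header, *rows = parse_tsv(tsv_text)
print(header)
print(rows[0])
```

All fields are returned as strings; converting the numeric columns is left to the caller.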

Character escaping

The IANA media type standard for TSV keeps the format simple by disallowing tabs within fields.

Since the values in the TSV format cannot contain literal tabs or newline characters, a convention is necessary for lossless conversion of text values with these characters. A common convention is to perform the following escapes:[2][3]

Escape sequence  Meaning
\n               line feed
\t               tab
\r               carriage return
\\               backslash
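
The escapes in the table above could be implemented roughly as follows. This is a sketch, not a reference implementation: the function names `escape_field` and `unescape_field` are hypothetical, and leaving unrecognised backslash sequences unchanged is an assumption, since the convention itself does not say how they should be handled.

```python
import re

# Special characters and their escape sequences, as listed in the table above.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
_UNESCAPES = {"\\\\": "\\", "\\t": "\t", "\\n": "\n", "\\r": "\r"}


def escape_field(value):
    """Replace backslash, tab, line feed and carriage return with escape sequences."""
    return re.sub(r"[\\\t\n\r]", lambda m: _ESCAPES[m.group(0)], value)


def unescape_field(field):
    """Reverse escape_field; unrecognised backslash sequences are left unchanged."""
    return re.sub(r"\\[\\tnr]", lambda m: _UNESCAPES[m.group(0)], field)


value = "first line\nsecond line\twith a tab"
escaped = escape_field(value)
print(escaped)
print(unescape_field(escaped) == value)
```

Both directions work in a single pass over the string. Chaining several `str.replace` calls instead can give wrong results when unescaping, because an escaped backslash followed by the letter `n` may be mistaken for an escaped line feed.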

Another common convention is to follow the CSV rules of RFC 4180 and enclose values containing tabs or newlines in double quotes. This can lead to ambiguities.[4][5]
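
One way such an ambiguity can show up is sketched below with Python's standard `csv` module. The same tab-delimited line is read twice, once with CSV-style quoting and once with quotes treated as ordinary characters. The sample line is made up for illustration.

```python
import csv
import io

# A record whose first field is wrapped in double quotes.
line = '"5.1"\tI. setosa\n'

# Reader that applies CSV-style quoting rules to tab-delimited input.
quoted = next(csv.reader(io.StringIO(line), delimiter="\t"))

# Reader that treats double quotes as ordinary data characters.
literal = next(csv.reader(io.StringIO(line), delimiter="\t", quoting=csv.QUOTE_NONE))

print(quoted)
print(literal)
```

The first reader drops the quotes and the second keeps them as part of the value, so the two programs disagree about the content of the field unless they agree on the convention beforehand.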

Line endings

Records are typically separated by a line feed, as on Unix platforms, or by a carriage return followed by a line feed, as on Microsoft platforms. Some programs may expect the latter. The de facto specification[6] states that records are separated by an EOL, but does not say which newline sequence that is.
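
A reader that has to accept both line endings could split records along the following lines. This is a sketch under the assumption that only LF and CRLF need to be handled; `split_records` is a hypothetical helper name.

```python
def split_records(text):
    """Split TSV text into record strings, accepting LF or CRLF line endings."""
    records = text.split("\n")
    # A trailing newline terminates the last record rather than starting a new one.
    if records and records[-1] == "":
        records.pop()
    # Remove the carriage return left over from a CRLF line ending.
    return [r[:-1] if r.endswith("\r") else r for r in records]


unix_text = "a\tb\nc\td\n"
windows_text = "a\tb\r\nc\td\r\n"
print(split_records(unix_text) == split_records(windows_text))
```

`str.splitlines` is avoided here because it also splits on several other characters, such as form feed and the Unicode line separator, which could legitimately appear inside a field.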
