Comparison of data-serialization formats

From Wikipedia, the free encyclopedia

This is a comparison of data serialization formats, various ways to convert complex objects to sequences of bits. It does not include markup languages used exclusively as document file formats.
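As a minimal sketch of what converting an object to a sequence of bits looks like in practice, the following Python example serializes one small object with a text format (JSON) and a binary format (Pickle), both of which are listed in the tables. The `record` object and its field names are made up for illustration.

```python
import json
import pickle

# A small nested object, invented for this illustration.
record = {"name": "A to Z", "values": [1, 2, 3], "married": True, "hobby": None}

# Text-based format: json.dumps returns a str, which is then encoded to bytes.
json_bytes = json.dumps(record).encode("utf-8")

# Binary format: pickle.dumps returns bytes directly.
pickle_bytes = pickle.dumps(record, protocol=4)

# Both byte sequences decode back to an object equal to the original.
assert json.loads(json_bytes.decode("utf-8")) == record
assert pickle.loads(pickle_bytes) == record

print(json_bytes)
print(len(json_bytes), len(pickle_bytes))
```

The JSON output can be read by any JSON parser, while the Pickle output is specific to Python and should only be loaded from trusted sources.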

Overview


Name Creator-maintainer Based on Standardized? Specification Binary? Human-readable? Supports references?e Schema-IDL? Standard APIs Supports zero-copy operations
Apache Arrow Apache Software Foundation De facto Arrow Columnar Format Yes No Yes Built-in C, C++, C#, Go, Java, JavaScript, Julia, Matlab, Python, R, Ruby, Rust, Swift Yes
Apache Avro Apache Software Foundation No Apache Avro™ Specification Yes Partialg Built-in C, C#, C++, Java, PHP, Python, Ruby
Apache Parquet Apache Software Foundation No Apache Parquet Yes No No Java, Python, C++ No
Apache Thrift Facebook (creator)
Apache (maintainer)
No Original whitepaper Yes Partialc No Built-in C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi and other languages[1]
ASN.1 ISO, IEC, ITU-T Yes ISO/IEC 8824 / ITU-T X.680 (syntax) and ISO/IEC 8825 / ITU-T X.690 (encoding rules) series. X.680, X.681, and X.683 define syntax and semantics. BER, DER, PER, OER, or custom via ECN XER, JER, GSER, or custom via ECN Yesf Built-in OER
Bencode Bram Cohen (creator)
BitTorrent, Inc. (maintainer)
De facto as BEP Part of BitTorrent protocol specification Except numbers and delimiters, being ASCII No No No No No
BSON MongoDB JSON No BSON Specification Yes No No No No No
Cap'n Proto Kenton Varda No Cap'n Proto Encoding Spec Yes Partialh No Yes No Yes
CBOR Carsten Bormann, P. Hoffman MessagePack[2] Yes RFC 8949 Yes No Yes,
through tagging
CDDL FIDO2 No
Comma-separated values (CSV) RFC author:
Yakov Shafranovich
Myriad informal variants RFC 4180
(among others)
No Yes No No No No
Common Data Representation (CDR) Object Management Group Yes General Inter-ORB Protocol Yes No Yes Yes Ada, C, C++, Java, Cobol, Lisp, Python, Ruby, Smalltalk
D-Bus Message Protocol freedesktop.org Yes D-Bus Specification Yes No No Partial
(Signature strings)
Yes
Efficient XML Interchange (EXI) W3C XML, Efficient XML Yes Efficient XML Interchange (EXI) Format 1.0 Yes XML XPointer, XPath XML Schema DOM, SAX, StAX, XQuery, XPath
Extensible Data Notation (edn) Rich Hickey / Clojure community Clojure Yes Official edn spec No Yes No No Clojure, Ruby, Go, C++, JavaScript, Java, CLR, ObjC, Python[3] No
FlatBuffers Google No Flatbuffers GitHub Yes Apache Arrow Partial
(internal to the buffer)
Yes C++, Java, C#, Go, Python, Rust, JavaScript, PHP, C, Dart, Lua, TypeScript Yes
Fast Infoset ISO, IEC, ITU-T XML Yes ITU-T X.891 and ISO/IEC 24824-1:2007 Yes No XPointer, XPath XML schema DOM, SAX, XQuery, XPath
FHIR Health Level 7 REST basics Yes Fast Healthcare Interoperability Resources Yes Yes Yes Yes Hapi for FHIR[4] JSON, XML, Turtle No
Ion Amazon JSON No The Amazon Ion Specification Yes Yes No Ion schema C, C#, Go, Java, JavaScript, Python, Rust
Java serialization Oracle Corporation Yes Java Object Serialization Yes No Yes No Yes
JSON Douglas Crockford JavaScript syntax Yes STD 90/RFC 8259
(ancillary:
RFC 6901,
RFC 6902), ECMA-404, ISO/IEC 21778:2017
No, but see BSON, Smile, UBJSON Yes JSON Pointer (RFC 6901), or alternately, JSONPath, JPath, JSPON, json:select(); and JSON-LD Partial
(JSON Schema Proposal, ASN.1 with JER, Kwalify, Rx), JSON-LD
Partial
(Clarinet, JSONQuery / RQL, JSONPath), JSON-LD
No
MessagePack Sadayuki Furuhashi JSON (loosely) No MessagePack format specification Yes No No No No Yes
Netstrings Dan Bernstein No netstrings.txt Except ASCII delimiters Yes No No No Yes
OGDL Rolf Veen ? No Specification Binary specification Yes Path specification Schema WD
OPC-UA Binary OPC Foundation No opcfoundation.org Yes No Yes No No
OpenDDL Eric Lengyel C, PHP No OpenDDL.org No Yes Yes No OpenDDL library
PHP serialization format PHP Group Yes No Yes Yes Yes No Yes
Pickle (Python) Guido van Rossum Python De facto as PEPs PEP 3154 – Pickle protocol version 4 Yes No Yes[5] No Yes No
Property list NeXT (creator)
Apple (maintainer)
? Partial Public DTD for XML format Yesa Yesb No ? Cocoa, CoreFoundation, OpenStep, GnuStep No
Protocol Buffers (protobuf) Google No Developer Guide: Encoding, proto2 specification, and proto3 specification Yes Yesd No Built-in C++, Java, C#, Python, Go, Ruby, Objective-C, C, Dart, Perl, PHP, R, Rust, Scala, Swift, Julia, Erlang, D, Haskell, ActionScript, Delphi, Elixir, Elm, GopherJS, Haxe, JavaScript, Kotlin, Lua, Matlab, Mercury, OCaml, Prolog, Solidity, TypeScript, Vala, Visual Basic No
S-expressions John McCarthy (original)
Ron Rivest (internet draft)
Lisp, Netstrings Largely de facto "S-Expressions" Internet Draft Yes, canonical representation Yes, advanced transport representation No No
Smile Tatu Saloranta JSON No Smile Format Specification Yes No Yes Partial
(JSON Schema Proposal, other JSON schemas/IDLs)
Partial
(via JSON APIs implemented with Smile backend, on Jackson, Python)
SOAP W3C XML Yes W3C Recommendations:
SOAP/1.1
SOAP/1.2
Partial
(Efficient XML Interchange, Binary XML, Fast Infoset, MTOM, XSD base64 data)
Yes Built-in id/ref, XPointer, XPath WSDL, XML schema DOM, SAX, XQuery, XPath
Structured Data eXchange Formats Max Wildgrube Yes RFC 3072 Yes No No No
UBJSON The Buzz Media, LLC JSON, BSON No ubjson.org Yes No No No No
eXternal Data Representation (XDR) Sun Microsystems (creator)
IETF (maintainer)
Yes STD 67/RFC 4506 Yes No Yes Yes Yes
XML W3C SGML Yes W3C Recommendations:
1.0 (Fifth Edition)
1.1 (Second Edition)
Partial
(Efficient XML Interchange, Binary XML, Fast Infoset, XSD base64 data)
Yes XPointer, XPath XML schema, RELAX NG DOM, SAX, XQuery, XPath
XML-RPC Dave Winer[6] XML No XML-RPC Specification No Yes No No No No
YAML Clark Evans,
Ingy döt Net,
and Oren Ben-Kiki
C, Java, Perl, Python, Ruby, Email, HTML, MIME, URI, XML, SAX, SOAP, JSON[7] No Version 1.2 No Yes Yes Partial
(Kwalify, Rx, built-in language type-defs)
No No

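The "Supports zero-copy operations" column refers to reading values directly out of the serialized buffer instead of first copying them into separate objects. The following Python sketch illustrates that general idea only; the fixed record layout (`RECORD`) is hypothetical and does not correspond to the wire format or API of any format in the table.

```python
import struct

# Hypothetical fixed layout: little-endian int32 id followed by a float64 value.
RECORD = struct.Struct("<id")

# Build a buffer holding three records back to back.
buffer = bytearray(RECORD.size * 3)
for i in range(3):
    RECORD.pack_into(buffer, i * RECORD.size, i, i * 1.5)

# A memoryview refers to the same memory; slicing it does not copy the bytes.
view = memoryview(buffer)
second = view[RECORD.size:2 * RECORD.size]

# Fields are decoded in place from the shared buffer.
rec_id, value = RECORD.unpack_from(second)
print(rec_id, value)

# Because no copy was made, a later write to the buffer is visible through the view.
RECORD.pack_into(buffer, RECORD.size, 7, 2.0)
print(RECORD.unpack_from(second))
```

Formats designed for zero-copy access arrange their encoding so that this kind of in-place read is possible for whole messages, not only for fixed-size records.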

Syntax comparison of human-readable formats


Format Null Boolean true Boolean false Integer Floating-point String Array Associative array/Object
ASN.1
(XML Encoding Rules)
<foo /> <foo>true</foo> <foo>false</foo> <foo>685230</foo> <foo>6.8523015e+5</foo> <foo>A to Z</foo>
<SeqOfUnrelatedDatatypes>
    <isMarried>true</isMarried>
    <hobby />
    <velocity>-42.1e7</velocity>
    <bookname>A to Z</bookname>
    <bookname>We said, "no".</bookname>
</SeqOfUnrelatedDatatypes>
An object (the key is a field name):
<person>
    <isMarried>true</isMarried>
    <hobby />
    <height>1.85</height>
    <name>Bob Peterson</name>
</person>

A data mapping (the key is a data value):

<competition>
    <measurement>
        <name>John</name>
        <height>3.14</height>
    </measurement>
    <measurement>
        <name>Jane</name>
        <height>2.718</height>
    </measurement>
</competition>

a

CSVb nulla
(or an empty element in the row)a
1a
truea
0a
falsea
685230
-685230a
6.8523015e+5a A to Z
"We said, ""no""."
true,,-42.1e7,"A to Z"
42,1
A to Z,1,2,3
edn nil true false 685230
-685230
6.8523015e+5 "A to Z", "A \"up to\" Z" [true nil -42.1e7 "A to Z"] {:kw 1, "42" true, "A to Z" [1 2 3]}
Ion

null
null.null
null.bool
null.int
null.float
null.decimal
null.timestamp
null.string
null.symbol
null.blob
null.clob
null.struct
null.list
null.sexp

true false 685230
-685230
0xA74AE
0b10100111010010101110
6.8523015e5 "A to Z"

'''
A
to
Z
'''
[true, null, -42.1e7, "A to Z"]
{'42': true, 'A to Z': [1, 2, 3]}
Netstringsc 0:,a
4:null,a
1:1,a
4:true,a
1:0,a
5:false,a
6:685230,a 9:6.8523e+5,a 6:A to Z, 29:4:true,0:,7:-42.1e7,6:A to Z,, 41:9:2:42,1:1,,25:6:A to Z,12:1:1,1:2,1:3,,,,a
JSON null true false 685230
-685230
6.8523015e+5 "A to Z"
[true, null, -42.1e7, "A to Z"]
{"42": true, "A to Z": [1, 2, 3]}
OGDL nulla truea falsea 685230a 6.8523015e+5a "A to Z"
'A to Z'
NoSpaces
true
null
-42.1e7
"A to Z"

(true, null, -42.1e7, "A to Z")

42
  true
"A to Z"
  1
  2
  3
42
  true
"A to Z", (1, 2, 3)
OpenDDL ref {null} bool {true} bool {false} int32 {685230}
int32 {0xA74AE}
int32 {0b10100111010010101110}
float {6.8523015e+5} string {"A to Z"} Homogeneous array:
int32 {1, 2, 3, 4, 5}

Heterogeneous array:

array
{
    bool {true}
    ref {null}
    float {-42.1e7}
    string {"A to Z"}
}
dict
{
    value (key = "42") {bool {true}}
    value (key = "A to Z") {int32 {1, 2, 3}}
}
PHP serialization format N; b:1; b:0; i:685230;
i:-685230;
d:685230.15;d
d:INF;
d:-INF;
d:NAN;
s:6:"A to Z"; a:4:{i:0;b:1;i:1;N;i:2;d:-421000000;i:3;s:6:"A to Z";} Associative array:
a:2:{i:42;b:1;s:6:"A to Z";a:3:{i:0;i:1;i:1;i:2;i:2;i:3;}}
Object:
O:8:"stdClass":2:{s:4:"John";d:3.14;s:4:"Jane";d:2.718;}d
Pickle (Python) N. I01\n. I00\n. I685230\n. F685230.15\n. S'A to Z'\n. (lI01\na(laF-421000000.0\naS'A to Z'\na. (dI42\nI01\nsS'A to Z'\n(lI1\naI2\naI3\nas.
Property list
(plain text format)[8]
<*BY> <*BN> <*I685230> <*R6.8523015e+5> "A to Z" ( <*BY>, <*R-42.1e7>, "A to Z" )
{
    "42" = <*BY>;
    "A to Z" = ( <*I1>, <*I2>, <*I3> );
}
Property list
(XML format)[9]
<true /> <false /> <integer>685230</integer> <real>6.8523015e+5</real> <string>A to Z</string>
<array>
    <true />
    <real>-42.1e7</real>
    <string>A to Z</string>
</array>
<dict>
    <key>42</key>
    <true />
    <key>A to Z</key>
    <array>
        <integer>1</integer>
        <integer>2</integer>
        <integer>3</integer>
    </array>
</dict>
Protocol Buffers true false 685230
-685230
20.0855369 "A to Z"
"sdfff2 \000\001\002\377\376\375"
"q\tqq<>q2&\001\377"
field1: "value1"
field1: "value2"
field1: "value3"
anotherfield {
  foo: 123
  bar: 456
}
anotherfield {
  foo: 222
  bar: 333
}
thing1: "blahblah"
thing2: 18923743
thing3: -44
thing4 {
  submessage_field1: "foo"
  submessage_field2: false
}
enumeratedThing: SomeEnumeratedValue
thing5: 123.456
[extensionFieldFoo]: "etc"
[extensionFieldThatIsAnEnum]: EnumValue
S-expressions NIL
nil
T
#tf
true
NIL
#ff
false
685230 6.8523015e+5 abc
"abc"
#616263#
3:abc
{MzphYmM=}
|YWJj|
(T NIL -42.1e7 "A to Z") ((42 T) ("A to Z" (1 2 3)))
YAML ~
null
Null
NULL[10]
y
Y
yes
Yes
YES
on
On
ON
true
True
TRUE[11]
n
N
no
No
NO
off
Off
OFF
false
False
FALSE[11]
685230
+685_230
-685230
02472256
0x_0A_74_AE
0b1010_0111_0100_1010_1110
190:20:30[12]
6.8523015e+5
685.230_15e+03
685_230.15
190:20:30.15
.inf
-.inf
.Inf
.INF
.NaN
.nan
.NAN[13]
A to Z
"A to Z"
'A to Z'
[y, ~, -42.1e7, "A to Z"]
- y
-
- -42.1e7
- A to Z
{"John":3.14, "Jane":2.718}
42: y
A to Z: [1, 2, 3]
XMLe and SOAP <null />a true false 685230 6.8523015e+5 A to Z
<item>true</item>
<item xsi:nil="true"/>
<item>-42.1e7</item>
<item>A to Z</item>
<map>
  <entry key="42">true</entry>
  <entry key="A to Z">
    <item val="1"/>
    <item val="2"/>
    <item val="3"/>
  </entry>
</map>
XML-RPC <value><boolean>1</boolean></value> <value><boolean>0</boolean></value> <value><int>685230</int></value> <value><double>6.8523015e+5</double></value> <value><string>A to Z</string></value>
<value><array>
  <data>
  <value><boolean>1</boolean></value>
  <value><double>-42.1e7</double></value>
  <value><string>A to Z</string></value>
  </data>
  </array></value>
<value><struct>
  <member>
    <name>42</name>
    <value><boolean>1</boolean></value>
    </member>
  <member>
    <name>A to Z</name>
    <value>
      <array>
        <data>
          <value><int>1</int></value>
          <value><int>2</int></value>
          <value><int>3</int></value>
          </data>
        </array>
      </value>
    </member>
</struct></value>

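The Netstrings entries in the table above can be reproduced with a short encoder. The sketch below assumes the nesting convention that the Netstrings row appears to use, in which a list is written as the netstring of its concatenated element netstrings; netstrings themselves only define the flat `length:data,` framing, so the helper name `netstring_list` and the list convention are assumptions made for this illustration.

```python
def netstring(payload: bytes) -> bytes:
    """Frame payload as <length>:<payload>, where length counts only the payload bytes."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def netstring_list(items) -> bytes:
    """Encode a (possibly nested) list as the netstring of its concatenated elements."""
    parts = []
    for item in items:
        if isinstance(item, bytes):
            parts.append(netstring(item))
        else:
            # Nested lists are encoded recursively.
            parts.append(netstring_list(item))
    return netstring(b"".join(parts))


# The string, array, and associative-array cells of the Netstrings row.
print(netstring(b"A to Z"))
print(netstring_list([b"true", b"", b"-42.1e7", b"A to Z"]))
print(netstring_list([[b"42", b"1"], [b"A to Z", [b"1", b"2", b"3"]]]))
```

Because every element carries its own length prefix, a reader can skip over an element without scanning its contents for delimiters.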

Comparison of binary formats


Format Null Booleans Integer Floating-point String Array Associative array/object
ASN.1
(BER, PER or OER encoding)
NULL type BOOLEAN: Template:Ubli INTEGER: Template:Ubli REAL: Template:Ubli Multiple valid types (VisibleString, PrintableString, GeneralString, UniversalString, UTF8String) Data specifications SET OF (unordered) and SEQUENCE OF (guaranteed order) User definable type
BSON \x0A
(1 byte)
True: \x08\x01
False: \x08\x00
(2 bytes)
int32: 32-bit little-endian 2's complement or int64: 64-bit little-endian 2's complement Double: little-endian binary64 UTF-8-encoded, preceded by int32-encoded string length in bytes BSON embedded document with numeric keys BSON embedded document
Concise Binary Object Representation (CBOR) \xf6
(1 byte)
Template:Ubli

(1 byte)

Template:Ubli Template:Ubli Template:Ubli Template:Ubli Template:Ubli
Efficient XML Interchange (EXI)

(Unpreserved lexical values format)

xsi:nil is not allowed in binary context. 1–2 bit integer interpreted as boolean. Boolean sign, plus arbitrary length 7-bit octets, parsed until most-significant bit is 0, in little-endian. The schema can set the zero-point to any arbitrary number.

Unsigned skips the boolean flag.

Template:Ubli Length prefixed integer-encoded Unicode. Integers may represent enumerations or string table entries instead. Length prefixed set of items. Not in protocol.
FlatBuffers Encoded as absence of field in parent object Template:Ubli

(1 byte)

Little-endian 2's complement signed and unsigned 8/16/32/64 bits Template:Ubli UTF-8-encoded, preceded by 32-bit integer length of string in bytes Vectors of any other type, preceded by 32-bit integer length of number of elements Tables (schema defined types) or Vectors sorted by key (maps / dictionaries)
Ion[14] \x0f Template:Ubli Template:Ubli Template:Ubli Template:Ubli \xbx Arbitrary length and overhead. Length in octets. Template:Ubli
MessagePack \xc0 Template:Ubli Template:Ubli Typecode (1 byte) + IEEE single/double Template:Ubli

encoding is unspecified[15]

Template:Ubli Template:Ubli
Netstrings Not in protocol. Not in protocol. Not in protocol. Not in protocol. Length-encoded as an ASCII string + ':' + data + ','

Length counts only octets between ':' and ','

Not in protocol. Not in protocol.
OGDL Binary
Property list
(binary format)
Protocol Buffers Template:Ubli Template:Ubli UTF-8-encoded, preceded by varint-encoded integer length of string in bytes Repeated value with the same tag or, for varint-encoded integers only, values packed contiguously and prefixed by tag and total byte length
Smile \x21 Template:Ubli Template:Ubli IEEE single/double, BigDecimal Length-prefixed "short" Strings (up to 64 bytes), marker-terminated "long" Strings and (optional) back-references Arbitrary-length heterogeneous arrays with end-marker Arbitrary-length key/value pairs with end-marker
Structured Data eXchange Formats (SDXF) Big-endian signed 24-bit or 32-bit integer Big-endian IEEE double Either UTF-8 or ISO 8859-1 encoded List of elements with identical ID and size, preceded by array header with int16 length Chunks can contain other chunks to arbitrary depth.
Thrift

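The Protocol Buffers row describes strings as UTF-8 bytes preceded by a varint-encoded length. A minimal sketch of that framing in Python follows; the function names are chosen for this illustration, and the field tag that precedes each value in a real Protocol Buffers message is deliberately left out.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, least-significant group first."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        group = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(group | 0x80)  # continuation bit set: more groups follow
        else:
            out.append(group)         # last group: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a varint occupying all of data (inverse of encode_varint)."""
    result = 0
    for shift, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * shift)
    return result


def encode_length_prefixed(text: str) -> bytes:
    """Encode text as a varint byte length followed by its UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint(len(data)) + data


print(encode_varint(300).hex())
print(encode_length_prefixed("A to Z"))
print(decode_varint(encode_varint(685230)))
```

Small values occupy a single byte, which is why varints are commonly used for lengths and small integers; the fixed-width little-endian integers described in the BSON and FlatBuffers rows trade that compactness for constant-size fields.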

See also

References


  1. Apache Thrift
  2. Script error: No such module "citation/CS1".
  3. Script error: No such module "citation/CS1".
  4. Script error: No such module "citation/CS1".
  5. cpython/Lib/pickle.py
  6. Script error: No such module "citation/CS1".
  7. Script error: No such module "citation/CS1".
  8. Script error: No such module "citation/CS1".
  9. Script error: No such module "citation/CS1".
  10. Script error: No such module "citation/CS1".
  11. a b Script error: No such module "citation/CS1".
  12. Script error: No such module "citation/CS1".
  13. Script error: No such module "citation/CS1".
  14. Ion Binary Encoding
  15. Script error: No such module "citation/CS1".


External links