Comparison of data-serialization formats
This is a comparison of data-serialization formats, the various ways of converting complex objects into sequences of bits. It does not include markup languages used exclusively as document file formats.
Overview
| Name | Creator-maintainer | Based on | Standardized? | Specification | Binary? | Human-readable? | Supports references?[e] | Schema-IDL? | Standard APIs | Supports zero-copy operations |
|---|---|---|---|---|---|---|---|---|---|---|
| Apache Arrow | Apache Software Foundation | — | De facto | Arrow Columnar Format | Yes | No | Yes | Built-in | C, C++, C#, Go, Java, JavaScript, Julia, Matlab, Python, R, Ruby, Rust, Swift | Yes |
| Apache Avro | Apache Software Foundation | — | No | Apache Avro™ Specification | Yes | Partial[g] | — | Built-in | C, C#, C++, Java, PHP, Python, Ruby | — |
| Apache Parquet | Apache Software Foundation | — | No | Apache Parquet | Yes | No | No | — | Java, Python, C++ | No |
| Apache Thrift | Facebook (creator), Apache (maintainer) | — | No | Original whitepaper | Yes | Partial[c] | No | Built-in | C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi and other languages[1] | — |
| ASN.1 | ISO, IEC, ITU-T | — | Yes | ISO/IEC 8824 / ITU-T X.680 (syntax) and ISO/IEC 8825 / ITU-T X.690 (encoding rules) series. X.680, X.681, and X.683 define syntax and semantics. | BER, DER, PER, OER, or custom via ECN | XER, JER, GSER, or custom via ECN | Yes[f] | Built-in | — | OER |
| Bencode | Bram Cohen (creator), BitTorrent, Inc. (maintainer) | — | De facto as BEP | Part of BitTorrent protocol specification | Except numbers and delimiters, being ASCII | No | No | No | No | No |
| BSON | MongoDB | JSON | No | BSON Specification | Yes | No | No | No | No | No |
| Cap'n Proto | Kenton Varda | — | No | Cap'n Proto Encoding Spec | Yes | Partial[h] | No | Yes | No | Yes |
| CBOR | Carsten Bormann, P. Hoffman | MessagePack[2] | Yes | RFC 8949 | Yes | No | Yes, through tagging | CDDL | FIDO2 | No |
| Comma-separated values (CSV) | RFC author: Yakov Shafranovich | — | Myriad informal variants | RFC 4180 (among others) | No | Yes | No | No | No | No |
| Common Data Representation (CDR) | Object Management Group | — | Yes | General Inter-ORB Protocol | Yes | No | Yes | Yes | Ada, C, C++, Java, Cobol, Lisp, Python, Ruby, Smalltalk | — |
| D-Bus Message Protocol | freedesktop.org | — | Yes | D-Bus Specification | Yes | No | No | Partial (Signature strings) | Yes | — |
| Efficient XML Interchange (EXI) | W3C | XML, Efficient XML | Yes | Efficient XML Interchange (EXI) Format 1.0 | Yes | XML | XPointer, XPath | XML Schema | DOM, SAX, StAX, XQuery, XPath | — |
| Extensible Data Notation (edn) | Rich Hickey / Clojure community | Clojure | Yes | Official edn spec | No | Yes | No | No | Clojure, Ruby, Go, C++, JavaScript, Java, CLR, ObjC, Python[3] | No |
| FlatBuffers | Google | — | No | Flatbuffers GitHub | Yes | Apache Arrow | Partial (internal to the buffer) | Yes | C++, Java, C#, Go, Python, Rust, JavaScript, PHP, C, Dart, Lua, TypeScript | Yes |
| Fast Infoset | ISO, IEC, ITU-T | XML | Yes | ITU-T X.891 and ISO/IEC 24824-1:2007 | Yes | No | XPointer, XPath | XML schema | DOM, SAX, XQuery, XPath | — |
| FHIR | Health Level 7 | REST basics | Yes | Fast Healthcare Interoperability Resources | Yes | Yes | Yes | Yes | Hapi for FHIR[4] JSON, XML, Turtle | No |
| Ion | Amazon | JSON | No | The Amazon Ion Specification | Yes | Yes | No | Ion schema | C, C#, Go, Java, JavaScript, Python, Rust | — |
| Java serialization | Oracle Corporation | — | Yes | Java Object Serialization | Yes | No | Yes | No | Yes | — |
| JSON | Douglas Crockford | JavaScript syntax | Yes | STD 90/RFC 8259 (ancillary: RFC 6901, RFC 6902), ECMA-404, ISO/IEC 21778:2017 | No, but see BSON, Smile, UBJSON | Yes | JSON Pointer (RFC 6901), or alternately, JSONPath, JPath, JSPON, json:select(); and JSON-LD | Partial (JSON Schema Proposal, ASN.1 with JER, Kwalify, Rx, JSON-LD) | Partial (Clarinet, JSONQuery / RQL, JSONPath), JSON-LD | No |
| MessagePack | Sadayuki Furuhashi | JSON (loosely) | No | MessagePack format specification | Yes | No | No | No | No | Yes |
| Netstrings | Dan Bernstein | — | No | netstrings.txt | Except ASCII delimiters | Yes | No | No | No | Yes |
| OGDL | Rolf Veen | ? | No | Specification | Binary specification | Yes | Path specification | Schema WD | — | |
| OPC-UA Binary | OPC Foundation | — | No | opcfoundation.org | Yes | No | Yes | No | No | — |
| OpenDDL | Eric Lengyel | C, PHP | No | OpenDDL.org | No | Yes | Yes | No | OpenDDL library | — |
| PHP serialization format | PHP Group | — | Yes | No | Yes | Yes | Yes | No | Yes | — |
| Pickle (Python) | Guido van Rossum | Python | De facto as PEPs | PEP 3154 – Pickle protocol version 4 | Yes | No | Yes[5] | No | Yes | No |
| Property list | NeXT (creator), Apple (maintainer) | ? | Partial | Public DTD for XML format | Yes[a] | Yes[b] | No | ? | Cocoa, CoreFoundation, OpenStep, GnuStep | No |
| Protocol Buffers (protobuf) | Google | — | No | Developer Guide: Encoding, proto2 specification, and proto3 specification | Yes | Yes[d] | No | Built-in | C++, Java, C#, Python, Go, Ruby, Objective-C, C, Dart, Perl, PHP, R, Rust, Scala, Swift, Julia, Erlang, D, Haskell, ActionScript, Delphi, Elixir, Elm, GopherJS, Haxe, JavaScript, Kotlin, Lua, Matlab, Mercury, OCaml, Prolog, Solidity, TypeScript, Vala, Visual Basic | No |
| S-expressions | John McCarthy (original), Ron Rivest (internet draft) | Lisp, Netstrings | Largely de facto | "S-Expressions" Internet Draft | Yes, canonical representation | Yes, advanced transport representation | No | No | — | |
| Smile | Tatu Saloranta | JSON | No | Smile Format Specification | Yes | No | Yes | Partial (JSON Schema Proposal, other JSON schemas/IDLs) | Partial (via JSON APIs implemented with Smile backend, on Jackson, Python) | — |
| SOAP | W3C | XML | Yes | W3C Recommendations: SOAP/1.1, SOAP/1.2 | Partial (Efficient XML Interchange, Binary XML, Fast Infoset, MTOM, XSD base64 data) | Yes | Built-in id/ref, XPointer, XPath | WSDL, XML schema | DOM, SAX, XQuery, XPath | — |
| Structured Data eXchange Formats | Max Wildgrube | — | Yes | RFC 3072 | Yes | No | No | No | — | |
| UBJSON | The Buzz Media, LLC | JSON, BSON | No | ubjson.org | Yes | No | No | No | No | — |
| eXternal Data Representation (XDR) | Sun Microsystems (creator), IETF (maintainer) | — | Yes | STD 67/RFC 4506 | Yes | No | Yes | Yes | Yes | — |
| XML | W3C | SGML | Yes | W3C Recommendations: 1.0 (Fifth Edition), 1.1 (Second Edition) | Partial (Efficient XML Interchange, Binary XML, Fast Infoset, XSD base64 data) | Yes | XPointer, XPath | XML schema, RELAX NG | DOM, SAX, XQuery, XPath | — |
| XML-RPC | Dave Winer[6] | XML | No | XML-RPC Specification | No | Yes | No | No | No | No |
| YAML | Clark Evans, Ingy döt Net, and Oren Ben-Kiki | C, Java, Perl, Python, Ruby, Email, HTML, MIME, URI, XML, SAX, SOAP, JSON[7] | No | Version 1.2 | No | Yes | Yes | Partial (Kwalify, Rx, built-in language type-defs) | No | No |
Syntax comparison of human-readable formats
| Format | Null | Boolean true | Boolean false | Integer | Floating-point | String | Array | Associative array/Object |
|---|---|---|---|---|---|---|---|---|
| ASN.1 (XML Encoding Rules) |
<foo />
|
<foo>true</foo>
|
<foo>false</foo>
|
<foo>685230</foo>
|
<foo>6.8523015e+5</foo>
|
<foo>A to Z</foo>
|
<SeqOfUnrelatedDatatypes>
<isMarried>true</isMarried>
<hobby />
<velocity>-42.1e7</velocity>
<bookname>A to Z</bookname>
<bookname>We said, "no".</bookname>
</SeqOfUnrelatedDatatypes>
|
An object (the key is a field name):
<person>
<isMarried>true</isMarried>
<hobby />
<height>1.85</height>
<name>Bob Peterson</name>
</person>
A data mapping (the key is a data value):
<competition>
<measurement>
<name>John</name>
<height>3.14</height>
</measurement>
<measurement>
<name>Jane</name>
<height>2.718</height>
</measurement>
</competition>
|
| CSV[b] | null[a] (or an empty element in the row)[a] |
1[a]
true[a]
|
0[a]
false[a]
|
685230
-685230[a]
|
6.8523015e+5[a]
|
A to Z
"We said, ""no""."
|
true,,-42.1e7,"A to Z"
|
42,1
A to Z,1,2,3 |
| edn | nil
|
true
|
false
|
685230
-685230
|
6.8523015e+5
|
"A to Z", "A \"up to\" Z"
|
[true nil -42.1e7 "A to Z"]
|
{:kw 1, "42" true, "A to Z" [1 2 3]}
|
| Ion |
|
true
|
false
|
685230
-685230
0xA74AE
0b111010010101110
|
6.8523015e5
|
"A to Z"'''
|
[true, null, -42.1e7, "A to Z"]
|
{'42': true, 'A to Z': [1, 2, 3]}
|
| Netstrings[c] | 0:,[a]
4:null,[a]
|
1:1,[a]
4:true,[a]
|
1:0,[a]
5:false,[a]
|
6:685230,[a]
|
9:6.8523e+5,[a]
|
6:A to Z,
|
29:4:true,0:,7:-42.1e7,6:A to Z,,
|
41:9:2:42,1:1,,25:6:A to Z,12:1:1,1:2,1:3,,,,[a]
|
| JSON | null
|
true
|
false
|
685230
-685230
|
6.8523015e+5
|
"A to Z"
|
[true, null, -42.1e7, "A to Z"]
|
{"42": true, "A to Z": [1, 2, 3]}
|
| OGDL | null[a]
|
true[a]
|
false[a]
|
685230[a]
|
6.8523015e+5[a]
|
"A to Z"
'A to Z'
NoSpaces
|
true null -42.1e7 "A to Z"
|
42 true "A to Z" 1 2 3
42 true "A to Z", (1, 2, 3) |
| OpenDDL | ref {null}
|
bool {true}
|
bool {false}
|
int32 {685230}
int32 {0x74AE}
int32 {0b111010010101110}
|
float {6.8523015e+5}
|
string {"A to Z"}
|
Homogeneous array:
int32 {1, 2, 3, 4, 5}
Heterogeneous array:
array
{
bool {true}
ref {null}
float {-42.1e7}
string {"A to Z"}
}
|
dict
{
value (key = "42") {bool {true}}
value (key = "A to Z") {int32 {1, 2, 3}}
}
|
| PHP serialization format | N;
|
b:1;
|
b:0;
|
i:685230;
i:-685230;
|
d:685230.15;[d]
d:INF;
d:-INF;
d:NAN;
|
s:6:"A to Z";
|
a:4:{i:0;b:1;i:1;N;i:2;d:-421000000;i:3;s:6:"A to Z";}
|
Associative array:
a:2:{i:42;b:1;s:6:"A to Z";a:3:{i:0;i:1;i:1;i:2;i:2;i:3;}}
Object:
O:8:"stdClass":2:{s:4:"John";d:3.14;s:4:"Jane";d:2.718;}[d]
|
| Pickle (Python) | N.
|
I01\n.
|
I00\n.
|
I685230\n.
|
F685230.15\n.
|
S'A to Z'\n.
|
(lI01\na(laF-421000000.0\naS'A to Z'\na.
|
(dI42\nI01\nsS'A to Z'\n(lI1\naI2\naI3\nas.
|
| Property list (plain text format)[8] | — | <*BY>
|
<*BN>
|
<*I685230>
|
<*R6.8523015e+5>
|
"A to Z"
|
( <*BY>, <*R-42.1e7>, "A to Z" )
|
{
"42" = <*BY>;
"A to Z" = ( <*I1>, <*I2>, <*I3> );
}
|
| Property list (XML format)[9] | — | <true />
|
<false />
|
<integer>685230</integer>
|
<real>6.8523015e+5</real>
|
<string>A to Z</string>
|
<array>
<true />
<real>-42.1e7</real>
<string>A to Z</string>
</array>
|
<dict>
<key>42</key>
<true />
<key>A to Z</key>
<array>
<integer>1</integer>
<integer>2</integer>
<integer>3</integer>
</array>
</dict>
|
| Protocol Buffers | — | true
|
false
|
685230
-685230
|
20.0855369
|
"A to Z"
|
field1: "value1"
field1: "value2"
field1: "value3"
anotherfield {
foo: 123
bar: 456
}
anotherfield {
foo: 222
bar: 333
}
|
thing1: "blahblah"
thing2: 18923743
thing3: -44
thing4 {
submessage_field1: "foo"
submessage_field2: false
}
enumeratedThing: SomeEnumeratedValue
thing5: 123.456
[extensionFieldFoo]: "etc"
[extensionFieldThatIsAnEnum]: EnumValue
|
| S-expressions | NIL
nil
|
T
#t[f]
true
|
NIL
#f[f]
false
|
685230
|
6.8523015e+5
|
abc
"abc"
#616263#
3:abc
{MzphYmM=}
|YWJj|
|
(T NIL -42.1e7 "A to Z")
|
((42 T) ("A to Z" (1 2 3)))
|
| YAML | ~
null
Null
NULL[10]
|
y
Y
yes
Yes
YES
on
On
ON
true
True
TRUE[11]
|
n
N
no
No
NO
off
Off
OFF
false
False
FALSE[11]
|
685230
+685_230
-685230
02472256
0x_0A_74_AE
0b1010_0111_0100_1010_1110
190:20:30[12]
|
6.8523015e+5
685.230_15e+03
685_230.15
190:20:30.15
.inf
-.inf
.Inf
.INF
.NaN
.nan
.NAN[13]
|
A to Z
"A to Z"
'A to Z'
|
[y, ~, -42.1e7, "A to Z"]
- y
-
- -42.1e7
- A to Z |
{"John":3.14, "Jane":2.718}
42: y
A to Z: [1, 2, 3] |
| XML[e] and SOAP | <null />[a]
|
true
|
false
|
685230
|
6.8523015e+5
|
A to Z
|
<item>true</item>
<item xsi:nil="true"/>
<item>-42.1e7</item>
<item>A to Z</item>
|
<map>
<entry key="42">true</entry>
<entry key="A to Z">
<item val="1"/>
<item val="2"/>
<item val="3"/>
</entry>
</map>
|
| XML-RPC | | <value><boolean>1</boolean></value>
|
<value><boolean>0</boolean></value>
|
<value><int>685230</int></value>
|
<value><double>6.8523015e+5</double></value>
|
<value><string>A to Z</string></value>
|
<value><array>
<data>
<value><boolean>1</boolean></value>
<value><double>-42.1e7</double></value>
<value><string>A to Z</string></value>
</data>
</array></value>
|
<value><struct>
<member>
<name>42</name>
<value><boolean>1</boolean></value>
</member>
<member>
<name>A to Z</name>
<value>
<array>
<data>
<value><int>1</int></value>
<value><int>2</int></value>
<value><int>3</int></value>
</data>
</array>
</value>
</member>
</struct></value>
|
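The Netstrings row above can be reproduced with a few lines of code. The following is a minimal sketch, not a reference implementation: the function names `netstring`, `netstring_list`, and `parse_netstring` are hypothetical, and nesting by wrapping concatenated netstrings in an outer netstring is the convention the table's array example follows rather than something the format itself mandates.

```python
def netstring(data: bytes) -> bytes:
    """Encode one byte string as <decimal length>:<data>,"""
    return str(len(data)).encode("ascii") + b":" + data + b","


def netstring_list(items: list[bytes]) -> bytes:
    """Wrap already-encoded netstrings in one outer netstring (assumed nesting convention)."""
    return netstring(b"".join(items))


def parse_netstring(buf: bytes) -> tuple[bytes, bytes]:
    """Decode the first netstring in buf; return (payload, remaining bytes)."""
    length_text, sep, rest = buf.partition(b":")
    if not sep or not length_text.isdigit():
        raise ValueError("missing or malformed length prefix")
    length = int(length_text)
    if len(rest) < length + 1 or rest[length:length + 1] != b",":
        raise ValueError("payload is not terminated by ','")
    return rest[:length], rest[length + 1:]


# The string and array cells from the table.
text = netstring(b"A to Z")
array = netstring_list([
    netstring(b"true"),
    netstring(b""),          # empty netstring standing in for null
    netstring(b"-42.1e7"),
    netstring(b"A to Z"),
])

print(text.decode("ascii"))   # 6:A to Z,
print(array.decode("ascii"))  # 29:4:true,0:,7:-42.1e7,6:A to Z,,
```

Because the length counts only the octets between ':' and ',', a reader can skip over a nested value without inspecting its contents.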
Comparison of binary formats
| Format | Null | Booleans | Integer | Floating-point | String | Array | Associative array/object |
|---|---|---|---|---|---|---|---|
| ASN.1 (BER, PER or OER encoding) | NULL type | BOOLEAN | INTEGER | REAL | Multiple valid types (VisibleString, PrintableString, GeneralString, UniversalString, UTF8String) | Data specifications SET OF (unordered) and SEQUENCE OF (guaranteed order) | User definable type |
| BSON | \x0A (1 byte) | True: \x08\x01, False: \x08\x00 (2 bytes) | int32: 32-bit little-endian 2's complement, or int64: 64-bit little-endian 2's complement | Double: little-endian binary64 | UTF-8-encoded, preceded by int32-encoded string length in bytes | BSON embedded document with numeric keys | BSON embedded document |
| Concise Binary Object Representation (CBOR) | \xf6 (1 byte) | (1 byte) | | | | | |
| Efficient XML Interchange (EXI) (Unpreserved lexical values format) | xsi:nil is not allowed in binary context. | 1–2 bit integer interpreted as boolean. | Boolean sign, plus arbitrary length 7-bit octets, parsed until most-significant bit is 0, in little-endian. The schema can set the zero-point to any arbitrary number. Unsigned skips the boolean flag. | | Length prefixed integer-encoded Unicode. Integers may represent enumerations or string table entries instead. | Length prefixed set of items. | Not in protocol. |
| FlatBuffers | Encoded as absence of field in parent object | (1 byte) | Little-endian 2's complement signed and unsigned 8/16/32/64 bits | | UTF-8-encoded, preceded by 32-bit integer length of string in bytes | Vectors of any other type, preceded by 32-bit integer length of number of elements | Tables (schema defined types) or Vectors sorted by key (maps / dictionaries) |
| Ion[14] | \x0f | | | | | \xbx Arbitrary length and overhead. Length in octets. | |
| MessagePack | \xc0 | | | Typecode (1 byte) + IEEE single/double | encoding is unspecified[15] | | |
| Netstrings | Not in protocol. | Not in protocol. | Not in protocol. | Not in protocol. | Length-encoded as an ASCII string + ':' + data + ','. Length counts only octets between ':' and ','. | Not in protocol. | Not in protocol. |
| OGDL Binary | |||||||
| Property list (binary format) | |||||||
| Protocol Buffers | | | | | UTF-8-encoded, preceded by varint-encoded integer length of string in bytes | Repeated value with the same tag or, for varint-encoded integers only, values packed contiguously and prefixed by tag and total byte length | — |
| Smile | \x21 | | | IEEE single/double, BigDecimal | Length-prefixed "short" Strings (up to 64 bytes), marker-terminated "long" Strings and (optional) back-references | Arbitrary-length heterogeneous arrays with end-marker | Arbitrary-length key/value pairs with end-marker |
| Structured Data eXchange Formats (SDXF) | | | Big-endian signed 24-bit or 32-bit integer | Big-endian IEEE double | Either UTF-8 or ISO 8859-1 encoded | List of elements with identical ID and size, preceded by array header with int16 length | Chunks can contain other chunks to arbitrary depth. |
| Thrift |
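The Protocol Buffers row describes a string as UTF-8 bytes preceded by a varint-encoded length. The sketch below illustrates that length prefix only, under stated assumptions: the helper names `encode_varint`, `decode_varint`, and `encode_string_payload` are hypothetical, and the field tag that precedes each value in a real message is deliberately left out.

```python
def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 data bits per byte, least-significant group first."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (decoded value, index of the first byte after the varint)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_string_payload(text: str) -> bytes:
    """Varint byte length followed by the UTF-8 bytes (field tag omitted)."""
    data = text.encode("utf-8")
    return encode_varint(len(data)) + data


print(encode_varint(300).hex())         # ac02
print(encode_string_payload("A to Z"))  # b'\x06A to Z'
```

The length is counted in bytes rather than characters, so a string with non-ASCII characters gets a prefix larger than its character count.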
See also
References
- ↑ Apache Thrift
- ↑ Script error: No such module "citation/CS1".
- ↑ Script error: No such module "citation/CS1".
- ↑ Script error: No such module "citation/CS1".
- ↑ cpython/Lib/pickle.py
- ↑ Script error: No such module "citation/CS1".
- ↑ Script error: No such module "citation/CS1".
- ↑ Script error: No such module "citation/CS1".
- ↑ Script error: No such module "citation/CS1".
- ↑ Script error: No such module "citation/CS1".
- ↑ a b Script error: No such module "citation/CS1".
- ↑ Script error: No such module "citation/CS1".
- ↑ Script error: No such module "citation/CS1".
- ↑ Ion Binary Encoding
- ↑ Script error: No such module "citation/CS1".