Package org.apache.hadoop.typedbytes


Typed bytes are sequences of bytes in which the first byte is a type code. They are especially useful as a simple binary format for transferring data to and from Hadoop Streaming programs.

Type Codes

Each typed bytes sequence starts with an unsigned byte that contains the type code. Possible values are:
Code  Type
0     A sequence of bytes.
1     A byte.
2     A boolean.
3     An integer.
4     A long.
5     A float.
6     A double.
7     A string.
8     A vector.
9     A list.
10    A map.
The type codes 50 to 200 are treated as aliases for 0, and can thus be used for application-specific serialization.
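
As an illustration of the code table and the alias rule, the following is a minimal sketch of a lookup from an unsigned type code to a type name. The class TypeCodeSketch and its methods are hypothetical names invented for this example and are not part of this package. The sketch also assumes that the range 50 to 200 is inclusive at both ends.

```java
/** Hypothetical helper for illustration; not part of org.apache.hadoop.typedbytes. */
final class TypeCodeSketch {

    // Names for codes 0 through 10, in the order given by the table above.
    private static final String[] NAMES = {
        "bytes", "byte", "boolean", "integer", "long", "float",
        "double", "string", "vector", "list", "map"
    };

    /** Returns the type name for an unsigned type code, or null if the code is not listed. */
    static String describe(int code) {
        if (code >= 0 && code <= 10) {
            return NAMES[code];
        }
        if (code >= 50 && code <= 200) {
            // Assumption: the alias range is inclusive; aliases behave like code 0.
            return NAMES[0];
        }
        return null;
    }

    /** Reads the leading type code of a typed bytes sequence as an unsigned value. */
    static int typeCodeOf(byte[] sequence) {
        // Java bytes are signed, so mask to obtain a value in 0..255.
        return sequence[0] & 0xFF;
    }
}
```

The mask in typeCodeOf matters for the alias range: codes above 127 would otherwise appear as negative numbers in Java.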

Subsequent Bytes

These are the subsequent bytes for the different type codes (everything is big-endian and unpadded):
Code  Subsequent Bytes
0     <32-bit signed integer> <as many bytes as indicated by the integer>
1     <signed byte>
2     <signed byte (0 = false and 1 = true)>
3     <32-bit signed integer>
4     <64-bit signed integer>
5     <32-bit IEEE floating point number>
6     <64-bit IEEE floating point number>
7     <32-bit signed integer> <as many UTF-8 bytes as indicated by the integer>
8     <32-bit signed integer> <as many typed bytes sequences as indicated by the integer>
9     <variable number of typed bytes sequences> <255 written as an unsigned byte>
10    <32-bit signed integer> <as many (key-value) pairs of typed bytes sequences as indicated by the integer>
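
To make the layout concrete, here is a minimal sketch that encodes and decodes three of the types above (integer, string, and list) using only java.io. It does not use the classes in this package; TypedBytesLayoutSketch and its methods are hypothetical names chosen for the example, and the remaining type codes are left out for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the byte layout; not part of org.apache.hadoop.typedbytes. */
final class TypedBytesLayoutSketch {

    /** Encodes an Integer, String, or List of those as one typed bytes sequence. */
    static byte[] encode(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            write(out, value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Decodes one typed bytes sequence produced by encode. */
    static Object decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return readBody(in.readUnsignedByte(), in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void write(DataOutputStream out, Object value) throws IOException {
        if (value instanceof Integer) {
            out.writeByte(3);                 // type code: integer
            out.writeInt((Integer) value);    // DataOutputStream writes big-endian
        } else if (value instanceof String) {
            byte[] utf8 = ((String) value).getBytes(StandardCharsets.UTF_8);
            out.writeByte(7);                 // type code: string
            out.writeInt(utf8.length);        // length in bytes, not characters
            out.write(utf8);
        } else if (value instanceof List) {
            out.writeByte(9);                 // type code: list
            for (Object item : (List<?>) value) {
                write(out, item);             // each element is a full typed bytes sequence
            }
            out.writeByte(255);               // list terminator
        } else {
            throw new IllegalArgumentException("type not handled by this sketch");
        }
    }

    private static Object readBody(int code, DataInputStream in) throws IOException {
        switch (code) {
            case 3:
                return in.readInt();
            case 7: {
                byte[] utf8 = new byte[in.readInt()];
                in.readFully(utf8);
                return new String(utf8, StandardCharsets.UTF_8);
            }
            case 9: {
                List<Object> items = new ArrayList<>();
                int next;
                // Read element type codes until the 255 terminator appears.
                while ((next = in.readUnsignedByte()) != 255) {
                    items.add(readBody(next, in));
                }
                return items;
            }
            default:
                throw new IOException("type code not handled by this sketch: " + code);
        }
    }
}
```

Under these assumptions, encode(1) yields the five bytes 3, 0, 0, 0, 1, and a list carries no length prefix: its end is marked only by the trailing 255 byte.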
  • Class
    Description
    The possible type codes.
    org.apache.hadoop.typedbytes.TypedBytesInput
    Provides functionality for reading typed bytes.
    org.apache.hadoop.typedbytes.TypedBytesOutput
    Provides functionality for writing typed bytes.
    org.apache.hadoop.typedbytes.TypedBytesRecordInput
    Deserializer for records that reads typed bytes.
    org.apache.hadoop.typedbytes.TypedBytesRecordOutput
    Serializer for records that writes typed bytes.
    org.apache.hadoop.typedbytes.TypedBytesWritable
    Writable for typed bytes.
    org.apache.hadoop.typedbytes.TypedBytesWritableInput
    Provides functionality for reading typed bytes as Writable objects.
    org.apache.hadoop.typedbytes.TypedBytesWritableOutput
    Provides functionality for writing Writable objects as typed bytes.