Class FSOutputSummer

java.lang.Object
java.io.OutputStream
org.apache.hadoop.fs.FSOutputSummer
All Implemented Interfaces:
Closeable, Flushable, AutoCloseable, StreamCapabilities

@LimitedPrivate("HDFS") @Unstable public abstract class FSOutputSummer extends OutputStream implements StreamCapabilities
This is a generic output stream that generates checksums for data before it is written to the underlying stream.
  • Constructor Details

    • FSOutputSummer

      protected FSOutputSummer(DataChecksum sum)
  • Method Details

    • writeChunk

      protected abstract void writeChunk(byte[] b, int bOffset, int bLen, byte[] checksum, int checksumOffset, int checksumLen) throws IOException
      Throws:
      IOException
    • checkClosed

      protected abstract void checkClosed() throws IOException
      Check if the implementing OutputStream is closed and should no longer accept writes. Implementations should do nothing if this stream is not closed, and should throw an IOException if it is closed.
      Throws:
      IOException - if this stream is already closed.
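      The contract above can be illustrated with a minimal sketch. It does not extend FSOutputSummer; ClosableSinkSketch and its written counter are hypothetical names, and throwing ClosedChannelException is only one reasonable choice of IOException subtype.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.ClosedChannelException;

/** Hypothetical stand-in stream; it only demonstrates the checkClosed contract. */
class ClosableSinkSketch extends OutputStream {
    private boolean closed;
    int written; // number of bytes accepted so far

    /** Does nothing while the stream is open; throws once it has been closed. */
    protected void checkClosed() throws IOException {
        if (closed) {
            throw new ClosedChannelException(); // a subclass of IOException
        }
    }

    @Override
    public void write(int b) throws IOException {
        checkClosed(); // reject writes after close()
        written++;
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

      Calling checkClosed at the start of every write path keeps the closed-state test in one place.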
    • write

      public void write(int b) throws IOException
      Writes one byte.
      Specified by:
      write in class OutputStream
      Throws:
      IOException
    • write

      public void write(byte[] b, int off, int len) throws IOException
      Writes len bytes from the specified byte array starting at offset off and generates a checksum for each data chunk.

      This method stores bytes from the given array in this stream's buffer before they are checksummed. The buffer is checksummed and flushed to the underlying output stream once all the data in a checksum chunk is in the buffer. If the buffer is empty and the requested length is at least as large as the next checksum chunk, this method checksums and writes the chunk directly to the underlying output stream, which avoids an unnecessary data copy.

      Overrides:
      write in class OutputStream
      Parameters:
      b - the data.
      off - the start offset in the data.
      len - the number of bytes to write.
      Throws:
      IOException - if an I/O error occurs.
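      The buffering rule described for write(byte[], int, int) can be sketched without Hadoop on the classpath. This is an illustration under assumptions, not the class's actual code: ChunkedSummerSketch, emitChunk, and the counters are hypothetical, the buffer holds a single chunk, CRC32 stands in for DataChecksum, and each chunk is followed by its 4-byte checksum in big-endian order.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.CRC32;

/** Hypothetical stand-in for a chunk-checksumming stream; not the Hadoop class. */
class ChunkedSummerSketch extends OutputStream {
    private final OutputStream out;   // underlying stream
    private final int chunkSize;      // bytes per checksum chunk (assumed fixed)
    private final byte[] buf;         // holds at most one partial chunk
    private int count;                // valid bytes currently in buf
    private final CRC32 crc = new CRC32();
    int directChunks;                 // chunks written straight from the caller's array
    int bufferedChunks;               // chunks written from buf

    ChunkedSummerSketch(OutputStream out, int chunkSize) {
        this.out = out;
        this.chunkSize = chunkSize;
        this.buf = new byte[chunkSize];
    }

    int buffered() {
        return count;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (off < 0 || len < 0 || len > b.length - off) {
            throw new IndexOutOfBoundsException();
        }
        while (len > 0) {
            if (count == 0 && len >= chunkSize) {
                // Empty buffer and a full chunk available: skip the copy.
                emitChunk(b, off);
                directChunks++;
                off += chunkSize;
                len -= chunkSize;
            } else {
                // Otherwise accumulate until the buffered chunk is complete.
                int n = Math.min(len, chunkSize - count);
                System.arraycopy(b, off, buf, count, n);
                count += n;
                off += n;
                len -= n;
                if (count == chunkSize) {
                    emitChunk(buf, 0);
                    bufferedChunks++;
                    count = 0;
                }
            }
        }
    }

    /** Writes one full chunk followed by its CRC32 (assumed layout). */
    private void emitChunk(byte[] b, int off) throws IOException {
        crc.reset();
        crc.update(b, off, chunkSize);
        int v = (int) crc.getValue();
        out.write(b, off, chunkSize);
        out.write(new byte[] {(byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v});
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ChunkedSummerSketch s = new ChunkedSummerSketch(sink, 4);
        s.write(new byte[10], 0, 10);
        System.out.println(s.directChunks + " direct, " + s.buffered() + " buffered");
    }
}
```

      With a 4-byte chunk, writing 10 bytes into an empty stream sends two chunks directly and leaves 2 bytes buffered until later writes complete the chunk.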
    • flushBuffer

      protected void flushBuffer() throws IOException
      Throws:
      IOException
    • flushBuffer

      protected int flushBuffer(boolean keep, boolean flushPartial) throws IOException
      Throws:
      IOException
    • flush

      public void flush() throws IOException
      Checksums all complete data chunks and flushes them to the underlying stream. If there is a trailing partial chunk, it is not flushed and is maintained in the buffer.
      Specified by:
      flush in interface Flushable
      Overrides:
      flush in class OutputStream
      Throws:
      IOException
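      The flush behavior (complete chunks go out, a trailing partial chunk stays buffered) can be sketched as follows. FlushSketch and its methods are hypothetical; the sketch assumes a buffer spanning several chunks, uses CRC32 in place of DataChecksum, and writes to an in-memory sink.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.CRC32;

/** Hypothetical model of a multi-chunk buffer; not the Hadoop class. */
class FlushSketch {
    private final int chunkSize;
    private final byte[] buf;
    private int count; // valid bytes in buf
    final ByteArrayOutputStream out = new ByteArrayOutputStream();

    FlushSketch(int chunkSize, int chunksPerBuffer) {
        this.chunkSize = chunkSize;
        this.buf = new byte[chunkSize * chunksPerBuffer];
    }

    /** Appends data to the buffer; this sketch rejects input that does not fit. */
    void append(byte[] data) {
        if (data.length > buf.length - count) {
            throw new IllegalArgumentException("buffer full");
        }
        System.arraycopy(data, 0, buf, count, data.length);
        count += data.length;
    }

    int buffered() {
        return count;
    }

    /** Emits every complete chunk plus its checksum; returns the chunk count. */
    int flushCompleteChunks() {
        int complete = count / chunkSize;
        CRC32 crc = new CRC32();
        for (int i = 0; i < complete; i++) {
            int off = i * chunkSize;
            crc.reset();
            crc.update(buf, off, chunkSize);
            int v = (int) crc.getValue();
            out.write(buf, off, chunkSize);
            out.write(v >>> 24); // write(int) keeps only the low 8 bits
            out.write(v >>> 16);
            out.write(v >>> 8);
            out.write(v);
        }
        // Keep the trailing partial chunk by moving it to the buffer front.
        int leftover = count - complete * chunkSize;
        System.arraycopy(buf, complete * chunkSize, buf, 0, leftover);
        count = leftover;
        return complete;
    }
}
```

      A second flush with no new data emits nothing, because only the partial chunk remains in the buffer.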
    • getBufferedDataSize

      protected int getBufferedDataSize()
      Return the number of valid bytes currently in the buffer.
      Returns:
      buffer data size.
    • getChecksumSize

      protected int getChecksumSize()
      Returns:
      the size for a checksum.
    • getDataChecksum

      protected DataChecksum getDataChecksum()
    • createWriteTraceScope

      protected org.apache.hadoop.tracing.TraceScope createWriteTraceScope()
    • convertToByteStream

      public static byte[] convertToByteStream(Checksum sum, int checksumSize)
      Converts a checksum integer value to a byte stream.
      Parameters:
      sum - the checksum.
      checksumSize - the checksum size.
      Returns:
      byte stream.
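      One plausible reading of this conversion is sketched below. ChecksumBytesSketch.toBytes is a hypothetical helper, not the Hadoop method; it assumes that the low 32 bits of Checksum.getValue() are stored big-endian in the first four bytes, and that a size of zero yields an empty array.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

class ChecksumBytesSketch {
    /** Hypothetical helper: encodes the low 32 bits of the checksum big-endian. */
    static byte[] toBytes(Checksum sum, int checksumSize) {
        byte[] bytes = new byte[checksumSize];
        if (checksumSize == 0) {
            return bytes; // nothing to encode, e.g. checksums disabled
        }
        if (checksumSize < 4) {
            throw new IllegalArgumentException("need at least 4 bytes");
        }
        int v = (int) sum.getValue();
        bytes[0] = (byte) (v >>> 24);
        bytes[1] = (byte) (v >>> 16);
        bytes[2] = (byte) (v >>> 8);
        bytes[3] = (byte) v;
        return bytes;
    }

    public static void main(String[] args) {
        CRC32 crc = new CRC32();
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        crc.update(data, 0, data.length);
        for (byte b : toBytes(crc, 4)) {
            System.out.printf("%02x", b & 0xFF);
        }
        System.out.println();
    }
}
```

      The CRC-32 of the ASCII string "123456789" is the standard check value 0xCBF43926, so under these assumptions the helper returns the bytes cb f4 39 26.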
    • setChecksumBufSize

      protected void setChecksumBufSize(int size)
      Replaces the existing buffer with a new one of the specified size.
      Parameters:
      size - the new buffer size.
    • resetChecksumBufSize

      protected void resetChecksumBufSize()
    • hasCapability

      public boolean hasCapability(String capability)
      Description copied from interface: StreamCapabilities
      Query the stream for a specific capability.
      Specified by:
      hasCapability in interface StreamCapabilities
      Parameters:
      capability - string to query the stream support for.
      Returns:
      True if the stream supports capability.