Class TFile.Reader

java.lang.Object
org.apache.hadoop.io.file.tfile.TFile.Reader
All Implemented Interfaces:
Closeable, AutoCloseable
Enclosing class:
TFile

@Evolving public static class TFile.Reader extends Object implements Closeable
TFile Reader. Users may only read TFiles by creating TFile.Reader.Scanner objects. A scanner may scan the whole TFile (createScanner()), a portion of the TFile based on byte offsets (createScannerByByteRange(long, long)), or a portion of the TFile whose keys fall in a certain key range (for sorted TFiles only; createScannerByKey(byte[], byte[]) or createScannerByKey(RawComparable, RawComparable)).
  • Constructor Details

    • Reader

      public Reader(FSDataInputStream fsdis, long fileLength, Configuration conf) throws IOException
      Constructor
      Parameters:
      fsdis - FS input stream of the TFile.
      fileLength - The length of the TFile in bytes. This is required because the actual size of the input file cannot easily be determined from the input stream itself.
      conf - configuration.
      Throws:
      IOException - raised on errors performing I/O.
  • Method Details

    • close

      public void close() throws IOException
      Close the reader. The state of the Reader object is undefined after close. Calling close() multiple times has no effect.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException - raised on errors performing I/O.
    • getComparatorName

      public String getComparatorName()
      Get the string representation of the comparator.
      Returns:
      If the TFile is not sorted by keys, an empty string will be returned. Otherwise, the actual comparator string that is provided during the TFile creation time will be returned.
    • isSorted

      public boolean isSorted()
      Is the TFile sorted?
      Returns:
      true if TFile is sorted.
    • getEntryCount

      public long getEntryCount()
      Get the number of key-value pair entries in TFile.
      Returns:
      the number of key-value pairs in TFile
    • getFirstKey

      public RawComparable getFirstKey() throws IOException
      Get the first key in the TFile.
      Returns:
      The first key in the TFile.
      Throws:
      IOException - raised on errors performing I/O.
    • getLastKey

      public RawComparable getLastKey() throws IOException
      Get the last key in the TFile.
      Returns:
      The last key in the TFile.
      Throws:
      IOException - raised on errors performing I/O.
    • getEntryComparator

      public Comparator<TFile.Reader.Scanner.Entry> getEntryComparator()
      Get a Comparator object to compare Entries. It is useful when you want to store the entries in a collection (such as PriorityQueue) and sort or compare entries based on their keys without copying out the key.
      Returns:
      An Entry Comparator.
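
      The PriorityQueue use case mentioned above can be sketched as a k-way merge. The block below is only a stand-in model that does not call Hadoop: plain byte[] keys play the role of scanner entries, Arrays.compareUnsigned is assumed as the key ordering, and EntryMergeSketch and Cursor are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Stand-in model: byte[] keys replace scanner entries, and an unsigned
// byte-wise comparison replaces the comparator a Reader would supply.
final class EntryMergeSketch {

    // Hypothetical cursor over one sorted source: its current key plus the rest.
    private static final class Cursor {
        byte[] current;
        final Iterator<byte[]> rest;

        Cursor(Iterator<byte[]> it) {
            this.rest = it;
            this.current = it.next();
        }
    }

    // Merge several individually sorted key lists into one sorted list.
    static List<byte[]> merge(List<List<byte[]>> sortedSources) {
        Comparator<byte[]> keyOrder = Arrays::compareUnsigned;
        Comparator<Cursor> byCurrentKey =
            (a, b) -> keyOrder.compare(a.current, b.current);
        PriorityQueue<Cursor> heap = new PriorityQueue<>(byCurrentKey);

        for (List<byte[]> source : sortedSources) {
            if (!source.isEmpty()) {
                heap.add(new Cursor(source.iterator()));
            }
        }

        List<byte[]> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor smallest = heap.poll();      // source holding the smallest key
            merged.add(smallest.current);
            if (smallest.rest.hasNext()) {      // advance that source and re-queue it
                smallest.current = smallest.rest.next();
                heap.add(smallest);
            }
        }
        return merged;
    }
}
```

      In real code the queue would hold scanners or entries and be ordered with the Comparator returned by this method, so keys are compared in place rather than copied out first.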
    • getComparator

      public Comparator<RawComparable> getComparator()
      Get an instance of the RawComparator that is constructed based on the string comparator representation.
      Returns:
      a Comparator that can compare RawComparable objects.
    • getMetaBlock

      public DataInputStream getMetaBlock(String name) throws IOException, MetaBlockDoesNotExist
      Stream access to a meta block.
      Parameters:
      name - The name of the meta block.
      Returns:
      The input stream.
      Throws:
      IOException - on I/O error.
      MetaBlockDoesNotExist - If the meta block with the name does not exist.
    • getRecordNumNear

      public long getRecordNumNear(long offset) throws IOException
      Get the RecordNum for the first key-value pair in a compressed block whose byte offset in the TFile is greater than or equal to the specified offset.
      Parameters:
      offset - the user supplied offset.
      Returns:
      the RecordNum of the corresponding entry. If no such entry exists, the total entry count is returned.
      Throws:
      IOException - raised on errors performing I/O.
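
      The lookup rule described above can be modeled in a few lines. This is a sketch under stated assumptions, not the Hadoop implementation: the block start offsets and per-block record counts are made-up inputs, and RecordNumNearSketch is a hypothetical name.

```java
// Model only: inputs are invented and nothing here calls Hadoop.
final class RecordNumNearSketch {

    // blockStarts[i] is the byte offset at which block i begins (ascending);
    // recordsInBlock[i] is the number of key-value pairs stored in block i.
    static long recordNumNear(long[] blockStarts, long[] recordsInBlock, long offset) {
        long recordsBefore = 0;
        for (int i = 0; i < blockStarts.length; i++) {
            if (blockStarts[i] >= offset) {
                // First record of the first block starting at or after the offset.
                return recordsBefore;
            }
            recordsBefore += recordsInBlock[i];
        }
        // No block starts at or after the offset: this is the total entry count.
        return recordsBefore;
    }
}
```

      Under this model, an offset that points into the middle of a block resolves to the first record of the next block, never to a record inside the block that contains the offset.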
    • getKeyNear

      public RawComparable getKeyNear(long offset) throws IOException
      Get a sample key that is within a block whose starting offset is greater than or equal to the specified offset.
      Parameters:
      offset - The file offset.
      Returns:
      the key that fits the requirement; or null if no such key exists (which could happen if the offset is close to the end of the TFile).
      Throws:
      IOException - raised on errors performing I/O.
    • createScanner

      public TFile.Reader.Scanner createScanner() throws IOException
      Get a scanner that can scan the whole TFile.
      Returns:
      The scanner object. A valid Scanner is always returned even if the TFile is empty.
      Throws:
      IOException - raised on errors performing I/O.
    • createScannerByByteRange

      public TFile.Reader.Scanner createScannerByByteRange(long offset, long length) throws IOException
      Get a scanner that covers a portion of TFile based on byte offsets.
      Parameters:
      offset - The beginning byte offset in the TFile.
      length - The length of the region.
      Returns:
      The actual coverage of the returned scanner tries to match the specified byte region but is always rounded up to compression block boundaries. The returned scanner may contain zero key-value pairs even if length is positive.
      Throws:
      IOException - raised on errors performing I/O.
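
      The following sketch illustrates why a positive length can still yield an empty scanner. It assumes that both ends of the requested region are rounded up to the next block start, which is consistent with the description above but is a model rather than the Hadoop code. The block offsets and the name ByteRangeSketch are hypothetical.

```java
// Model only: block start offsets are invented and nothing here calls Hadoop.
final class ByteRangeSketch {

    // Index of the first block whose start offset is >= offset,
    // or blockStarts.length if no such block exists.
    static int firstBlockAtOrAfter(long[] blockStarts, long offset) {
        for (int i = 0; i < blockStarts.length; i++) {
            if (blockStarts[i] >= offset) {
                return i;
            }
        }
        return blockStarts.length;
    }

    // Returns {firstBlock, endBlock}: the half-open range of block indices
    // covered by a request for [offset, offset + length).
    static int[] coveredBlocks(long[] blockStarts, long offset, long length) {
        int first = firstBlockAtOrAfter(blockStarts, offset);
        int end = firstBlockAtOrAfter(blockStarts, offset + length);
        return new int[] {first, end};
    }
}
```

      With block starts 0, 100 and 250, a request for offset 10 and length 50 gives the block range {1, 1}, which is empty because no block begins inside the requested region.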
    • createScanner

      @Deprecated public TFile.Reader.Scanner createScanner(byte[] beginKey, byte[] endKey) throws IOException
      Deprecated. Use createScannerByKey(byte[], byte[]) instead.
      Get a scanner that covers a portion of TFile based on keys.
      Parameters:
      beginKey - Begin key of the scan (inclusive). If null, scan from the first key-value entry of the TFile.
      endKey - End key of the scan (exclusive). If null, scan up to the last key-value entry of the TFile.
      Returns:
      The actual coverage of the returned scanner will cover all keys greater than or equal to the beginKey and less than the endKey.
      Throws:
      IOException - raised on errors performing I/O.
    • createScannerByKey

      public TFile.Reader.Scanner createScannerByKey(byte[] beginKey, byte[] endKey) throws IOException
      Get a scanner that covers a portion of TFile based on keys.
      Parameters:
      beginKey - Begin key of the scan (inclusive). If null, scan from the first key-value entry of the TFile.
      endKey - End key of the scan (exclusive). If null, scan up to the last key-value entry of the TFile.
      Returns:
      The actual coverage of the returned scanner will cover all keys greater than or equal to the beginKey and less than the endKey.
      Throws:
      IOException - raised on errors performing I/O.
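
      The half-open range semantics, including the null bounds, can be modeled without Hadoop. The sketch below assumes keys are ordered by unsigned byte-wise comparison (Arrays.compareUnsigned, Java 9 or later); a TFile created with a different comparator would order keys differently. KeyRangeSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Model only: filters an in-memory sorted key list the way the
// [beginKey, endKey) contract above describes. Nothing here calls Hadoop.
final class KeyRangeSketch {

    static List<byte[]> keysInRange(List<byte[]> sortedKeys, byte[] beginKey, byte[] endKey) {
        List<byte[]> covered = new ArrayList<>();
        for (byte[] key : sortedKeys) {
            // A null beginKey means "start from the first entry" (inclusive bound otherwise).
            boolean atOrAfterBegin =
                beginKey == null || Arrays.compareUnsigned(key, beginKey) >= 0;
            // A null endKey means "run to the last entry" (exclusive bound otherwise).
            boolean beforeEnd =
                endKey == null || Arrays.compareUnsigned(key, endKey) < 0;
            if (atOrAfterBegin && beforeEnd) {
                covered.add(key);
            }
        }
        return covered;
    }
}
```

      Because the end key is exclusive, passing the same key as both bounds selects nothing, and adjacent ranges that share a boundary key never return the same entry twice.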
    • createScanner

      @Deprecated public TFile.Reader.Scanner createScanner(RawComparable beginKey, RawComparable endKey) throws IOException
      Deprecated. Use createScannerByKey(RawComparable, RawComparable) instead.
      Get a scanner that covers a specific key range.
      Parameters:
      beginKey - Begin key of the scan (inclusive). If null, scan from the first key-value entry of the TFile.
      endKey - End key of the scan (exclusive). If null, scan up to the last key-value entry of the TFile.
      Returns:
      The actual coverage of the returned scanner will cover all keys greater than or equal to the beginKey and less than the endKey.
      Throws:
      IOException - raised on errors performing I/O.
    • createScannerByKey

      public TFile.Reader.Scanner createScannerByKey(RawComparable beginKey, RawComparable endKey) throws IOException
      Get a scanner that covers a specific key range.
      Parameters:
      beginKey - Begin key of the scan (inclusive). If null, scan from the first key-value entry of the TFile.
      endKey - End key of the scan (exclusive). If null, scan up to the last key-value entry of the TFile.
      Returns:
      The actual coverage of the returned scanner will cover all keys greater than or equal to the beginKey and less than the endKey.
      Throws:
      IOException - raised on errors performing I/O.
    • createScannerByRecordNum

      public TFile.Reader.Scanner createScannerByRecordNum(long beginRecNum, long endRecNum) throws IOException
      Create a scanner that covers a range of records.
      Parameters:
      beginRecNum - The RecordNum for the first record (inclusive).
      endRecNum - The RecordNum for the last record (exclusive). To scan the whole file, either specify endRecNum==-1 or endRecNum==getEntryCount().
      Returns:
      The TFile scanner that covers the specified range of records.
      Throws:
      IOException - raised on errors performing I/O.
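
      One way to use record ranges is to divide a file into contiguous slices for separate workers. The sketch below only does the arithmetic: it turns an entry count (as reported by getEntryCount()) into half-open [beginRecNum, endRecNum) pairs in the form this method expects. RecordSplitSketch and split are hypothetical names, and nothing here calls Hadoop.

```java
// Arithmetic-only sketch: computes record ranges, does not open any file.
final class RecordSplitSketch {

    // Divide [0, entryCount) into `parts` contiguous half-open ranges whose
    // sizes differ by at most one. Each row is {beginRecNum, endRecNum}.
    static long[][] split(long entryCount, int parts) {
        if (entryCount < 0 || parts <= 0) {
            throw new IllegalArgumentException("entryCount must be >= 0 and parts > 0");
        }
        long[][] ranges = new long[parts][2];
        long base = entryCount / parts;   // minimum size of every range
        long extra = entryCount % parts;  // the first `extra` ranges get one more record
        long begin = 0;
        for (int i = 0; i < parts; i++) {
            long size = base + (i < extra ? 1 : 0);
            ranges[i][0] = begin;         // inclusive
            ranges[i][1] = begin + size;  // exclusive
            begin += size;
        }
        return ranges;
    }
}
```

      The last range always ends at entryCount, so together the ranges cover every record exactly once. When there are more parts than records, the trailing ranges are empty.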