Class VectoredReadUtils

java.lang.Object
org.apache.hadoop.fs.VectoredReadUtils

@LimitedPrivate("Filesystems") @Unstable public final class VectoredReadUtils extends Object
Utility class which implements helper methods used in the vectored IO implementation.
  • Field Details

    • LOG_BYTE_BUFFER_RELEASED

      public static final Consumer<ByteBuffer> LOG_BYTE_BUFFER_RELEASED
      This releaser just logs at debug that the buffer was released.
  • Method Details

    • validateRangeRequest

      public static <T extends FileRange> T validateRangeRequest(T range) throws EOFException
      Validate a single range.
      Type Parameters:
      T - range type
      Parameters:
      range - range to validate.
      Returns:
      the range.
      Throws:
      IllegalArgumentException - the range length is negative, or another invalid condition is met other than those which raise EOFException or NullPointerException.
      EOFException - the range offset is negative
      NullPointerException - if the range is null.
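      The three failure modes listed above can be illustrated with a minimal sketch. The Range class and the validate method here are hypothetical stand-ins written for this example; they are not the Hadoop FileRange type or the actual implementation, and the order of the checks is an assumption.

```java
import java.io.EOFException;
import java.util.Objects;

final class ValidateRangeSketch {
  private ValidateRangeSketch() { }

  /** Hypothetical stand-in for a file range; not the Hadoop FileRange type. */
  static final class Range {
    final long offset;
    final int length;

    Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /** Returns the same range if it passes the documented checks. */
  static Range validate(Range range) throws EOFException {
    // A null range raises NullPointerException.
    Objects.requireNonNull(range, "range is null");
    // A negative length raises IllegalArgumentException.
    if (range.length < 0) {
      throw new IllegalArgumentException("length is negative: " + range.length);
    }
    // A negative offset raises EOFException.
    if (range.offset < 0) {
      throw new EOFException("offset is negative: " + range.offset);
    }
    return range;
  }

  public static void main(String[] args) {
    try {
      validate(new Range(-1, 16));
    } catch (EOFException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```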
    • validateVectoredReadRanges

      public static void validateVectoredReadRanges(List<? extends FileRange> ranges) throws EOFException
      Validate a list of vectored read ranges.
      Parameters:
      ranges - list of ranges.
      Throws:
      EOFException - any EOF exception.
    • readVectored

      public static void readVectored(PositionedReadable stream, List<? extends FileRange> ranges, IntFunction<ByteBuffer> allocate) throws EOFException
      This is the default implementation which iterates through the ranges to read each synchronously, but the intent is that subclasses can make more efficient readers. The data or exceptions are pushed into FileRange.getData().
      Parameters:
      stream - the stream to read the data from
      ranges - the byte ranges to read
      allocate - the byte buffer allocation
      Throws:
      IllegalArgumentException - if there are overlapping ranges or a range is invalid
      EOFException - the range offset is negative
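      The description above (each range read in turn, with the data or the exception delivered through the range's future) can be sketched as follows. This is an illustration under stated assumptions, not the Hadoop code: Range, Source, readAll and over are hypothetical names, and the in-memory source exists only to make the example runnable.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;

final class SequentialVectoredReadSketch {
  private SequentialVectoredReadSketch() { }

  /** Hypothetical range type holding the future that receives the result. */
  static final class Range {
    final long offset;
    final int length;
    final CompletableFuture<ByteBuffer> data = new CompletableFuture<>();

    Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /** Hypothetical positioned source; a stand-in for a positioned-read stream. */
  interface Source {
    void readFully(long position, byte[] dest, int destOffset, int length) throws IOException;
  }

  /** Reads each range in turn; the data or the failure lands in the range's future. */
  static void readAll(Source source, List<Range> ranges, IntFunction<ByteBuffer> allocate) {
    for (Range range : ranges) {
      try {
        ByteBuffer buffer = allocate.apply(range.length);
        byte[] tmp = new byte[range.length];
        source.readFully(range.offset, tmp, 0, range.length);
        buffer.put(tmp);
        buffer.flip();                       // make the data readable from position 0
        range.data.complete(buffer);
      } catch (IOException | RuntimeException e) {
        range.data.completeExceptionally(e); // the caller sees the failure via the future
      }
    }
  }

  /** Source backed by an in-memory array, for demonstration only. */
  static Source over(byte[] file) {
    return (position, dest, destOffset, length) -> {
      if (position + length > file.length) {
        throw new EOFException("read past end of data");
      }
      System.arraycopy(file, (int) position, dest, destOffset, length);
    };
  }

  public static void main(String[] args) {
    byte[] file = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    Range inside = new Range(2, 3);
    Range pastEnd = new Range(8, 5);
    readAll(over(file), Arrays.asList(inside, pastEnd), ByteBuffer::allocate);
    System.out.println("bytes read: " + inside.data.join().remaining());
    System.out.println("second range failed: " + pastEnd.data.isCompletedExceptionally());
  }
}
```

      In this sketch a failure on one range does not stop the loop; whether the real implementation behaves the same way is not stated in this page.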
    • readVectored

      public static void readVectored(PositionedReadable stream, List<? extends FileRange> ranges, IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release) throws EOFException
      Variant of readVectored(PositionedReadable, List, IntFunction) where a release() function is invoked if problems surface during reads.
      Parameters:
      stream - the stream to read the data from
      ranges - the byte ranges to read
      allocate - the function to allocate ByteBuffer
      release - the function to release a ByteBuffer.
      Throws:
      IllegalArgumentException - if any of the ranges are invalid, or they overlap.
      EOFException - the range offset is negative
    • readRangeFrom

      public static CompletableFuture<ByteBuffer> readRangeFrom(PositionedReadable stream, FileRange range, IntFunction<ByteBuffer> allocate) throws EOFException
      Synchronously reads a range from the stream, dealing with the combinations of ByteBuffer buffers and PositionedReadable streams.
      Parameters:
      stream - the stream to read from
      range - the range to read
      allocate - the function to allocate ByteBuffers
      Returns:
      the CompletableFuture that contains the read data or an exception.
      Throws:
      IllegalArgumentException - the range is invalid other than by offset or being null.
      EOFException - the range offset is negative
      NullPointerException - if the range is null.
    • readRangeFrom

      public static CompletableFuture<ByteBuffer> readRangeFrom(PositionedReadable stream, FileRange range, IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release) throws EOFException
      Synchronously reads a range from the stream, dealing with the combinations of ByteBuffer buffers and PositionedReadable streams.
      Parameters:
      stream - the stream to read from
      range - the range to read
      allocate - the function to allocate ByteBuffers
      release - the function to release a ByteBuffer.
      Returns:
      the CompletableFuture that contains the read data or an exception.
      Throws:
      IllegalArgumentException - the range is invalid other than by offset or being null.
      EOFException - the range offset is negative
      NullPointerException - if the range is null.
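      A rough sketch of how a synchronous read can still hand back a CompletableFuture, and where a release function could fit in. Everything here is hypothetical (Source, readRange, over); it assumes the allocator returns a fresh buffer of at least the requested length, and that release is called only for a buffer that was allocated before the failure.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.IntFunction;

final class ReadRangeSketch {
  private ReadRangeSketch() { }

  /** Hypothetical positioned source; a stand-in for a positioned-read stream. */
  interface Source {
    void readFully(long position, byte[] dest, int destOffset, int length) throws IOException;
  }

  /** Reads synchronously and returns an already-completed future. */
  static CompletableFuture<ByteBuffer> readRange(Source source, long offset, int length,
      IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release) {
    CompletableFuture<ByteBuffer> result = new CompletableFuture<>();
    ByteBuffer buffer = null;
    try {
      buffer = allocate.apply(length);
      if (buffer.hasArray()) {
        // Heap buffer: read straight into the backing array.
        source.readFully(offset, buffer.array(), buffer.arrayOffset(), length);
        buffer.limit(length);
      } else {
        // Direct buffer: copy through a temporary array.
        byte[] tmp = new byte[length];
        source.readFully(offset, tmp, 0, length);
        buffer.put(tmp);
        buffer.flip();
      }
      result.complete(buffer);
    } catch (IOException | RuntimeException e) {
      if (buffer != null) {
        release.accept(buffer);            // hand the unused buffer back
      }
      result.completeExceptionally(e);
    }
    return result;
  }

  /** Source backed by an in-memory array, for demonstration only. */
  static Source over(byte[] file) {
    return (position, dest, destOffset, length) -> {
      if (position + length > file.length) {
        throw new EOFException("read past end of data");
      }
      System.arraycopy(file, (int) position, dest, destOffset, length);
    };
  }

  public static void main(String[] args) {
    byte[] file = new byte[16];
    CompletableFuture<ByteBuffer> failed = readRange(over(file), 12, 8,
        ByteBuffer::allocateDirect, b -> System.out.println("released " + b.capacity() + " bytes"));
    System.out.println("failed: " + failed.isCompletedExceptionally());
  }
}
```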
    • readInDirectBuffer

      public static void readInDirectBuffer(FileRange range, ByteBuffer buffer, Function4RaisingIOE<Long,byte[],Integer,Integer,Void> operation) throws IOException
      Read bytes from stream into a byte buffer using an intermediate byte array.
           (position, buffer, buffer-offset, length): Void
           position := the position within the file to read data.
           buffer := a buffer to read fully `length` bytes into.
           buffer-offset := the offset within the buffer to write data.
           length := the number of bytes to read.
         
      The passed in function MUST block until the required length of data is read, or an exception is thrown.
      Parameters:
      range - range to read
      buffer - buffer to fill.
      operation - operation to use for reading data.
      Throws:
      IOException - any IOE.
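      The technique described above (filling a byte buffer through an intermediate byte array) can be sketched like this. ReadOperation, readInto and the TMP_BUFFER_MAX_SIZE value are assumptions made for the example; the tiny array size is chosen only so the loop runs more than once.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

final class DirectBufferReadSketch {
  private DirectBufferReadSketch() { }

  /** Hypothetical cap on the temporary array; deliberately tiny for illustration. */
  static final int TMP_BUFFER_MAX_SIZE = 8;

  /** Hypothetical operation: (position, buffer, buffer-offset, length). */
  interface ReadOperation {
    void read(long position, byte[] buffer, int bufferOffset, int length) throws IOException;
  }

  /** Fills the byte buffer with length bytes starting at file offset. */
  static void readInto(long offset, int length, ByteBuffer buffer, ReadOperation operation)
      throws IOException {
    if (length == 0) {
      return;                                     // nothing to read
    }
    byte[] tmp = new byte[Math.min(TMP_BUFFER_MAX_SIZE, length)];
    long position = offset;
    int readBytes = 0;
    while (readBytes < length) {
      int chunk = Math.min(tmp.length, length - readBytes);
      operation.read(position, tmp, 0, chunk);    // must block until chunk bytes are read
      buffer.put(tmp, 0, chunk);
      position += chunk;
      readBytes += chunk;
    }
    buffer.flip();                                // make the data readable from position 0
  }

  public static void main(String[] args) throws IOException {
    byte[] file = new byte[100];
    for (int i = 0; i < file.length; i++) {
      file[i] = (byte) i;
    }
    ByteBuffer direct = ByteBuffer.allocateDirect(25);
    readInto(10, 25, direct, (pos, buf, off, len) -> System.arraycopy(file, (int) pos, buf, off, len));
    System.out.println("remaining: " + direct.remaining());
  }
}
```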
    • isOrderedDisjoint

      public static boolean isOrderedDisjoint(List<? extends FileRange> input, int chunkSize, int minimumSeek)
      Checks whether the given input list meets all of the following conditions:
      • already sorted by offset
      • each range is more than minimumSeek apart
      • the start and end of each range is a multiple of chunkSize
      Parameters:
      input - the list of input ranges.
      chunkSize - the size of the chunks that the offset and end must align to.
      minimumSeek - the minimum distance between ranges.
      Returns:
      true if we can use the input list as is.
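      One way to read the three conditions above is sketched below. The Range class is a hypothetical stand-in, and the sketch treats a gap of exactly minimumSeek as acceptable, which is an assumption about the boundary case rather than something this page states.

```java
import java.util.Arrays;
import java.util.List;

final class OrderedDisjointSketch {
  private OrderedDisjointSketch() { }

  /** Hypothetical stand-in for a file range. */
  static final class Range {
    final long offset;
    final int length;

    Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  static boolean isOrderedDisjoint(List<Range> input, int chunkSize, int minimumSeek) {
    // Start "before" the file so that a first range at offset 0 passes the gap check.
    long previousEnd = -minimumSeek;
    for (Range range : input) {
      long end = range.offset + range.length;
      boolean aligned = range.offset % chunkSize == 0 && end % chunkSize == 0;
      // An out-of-order or overlapping range gives a negative gap and fails here too.
      boolean farEnough = range.offset - previousEnd >= minimumSeek;
      if (!aligned || !farEnough) {
        return false;
      }
      previousEnd = end;
    }
    return true;
  }

  public static void main(String[] args) {
    List<Range> ranges = Arrays.asList(new Range(0, 100), new Range(1000, 100));
    System.out.println(isOrderedDisjoint(ranges, 100, 500)); // true
    System.out.println(isOrderedDisjoint(ranges, 100, 950)); // false: the gap is only 900
  }
}
```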
    • roundDown

      public static long roundDown(long offset, int chunkSize)
      Calculates floor value of offset based on chunk size.
      Parameters:
      offset - file offset.
      chunkSize - file chunk size.
      Returns:
      floor value.
    • roundUp

      public static long roundUp(long offset, int chunkSize)
      Calculates the ceiling value of offset based on chunk size.
      Parameters:
      offset - file offset.
      chunkSize - file chunk size.
      Returns:
      ceiling value.
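      The floor and ceiling calculations of roundDown and roundUp amount to aligning an offset to a multiple of the chunk size. A minimal sketch of that arithmetic follows; treating a chunk size of 1 or less as "no rounding" is an assumption of this example.

```java
final class ChunkRoundingSketch {
  private ChunkRoundingSketch() { }

  /** Largest multiple of chunkSize that is less than or equal to offset. */
  static long roundDown(long offset, int chunkSize) {
    return chunkSize > 1 ? offset - (offset % chunkSize) : offset;
  }

  /** Smallest multiple of chunkSize that is greater than or equal to offset. */
  static long roundUp(long offset, int chunkSize) {
    if (chunkSize > 1) {
      long next = offset + chunkSize - 1;
      return next - (next % chunkSize);
    }
    return offset;
  }

  public static void main(String[] args) {
    System.out.println(roundDown(1000, 512)); // 512
    System.out.println(roundUp(1000, 512));   // 1024
    System.out.println(roundUp(1024, 512));   // 1024, already aligned
  }
}
```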
    • validateAndSortRanges

      public static List<? extends FileRange> validateAndSortRanges(List<? extends FileRange> input, Optional<Long> fileLength) throws EOFException
      Validate a list of ranges (including overlapping checks) and return the sorted list.

      Two ranges overlap when the start offset of the second is less than the end offset of the first. The end offset is calculated as start offset + length.

      Parameters:
      input - input list
      fileLength - file length if known
      Returns:
      a new sorted list.
      Throws:
      IllegalArgumentException - if there are overlapping ranges or a range element is invalid (other than with negative offset)
      EOFException - if the last range extends beyond the end of the file supplied or a range offset is negative
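      The overlap rule and the documented exceptions can be illustrated with the following sketch. Range and validateAndSort here are hypothetical and written for this example only; the order in which the checks run is an assumption. Note that two ranges which merely touch (the second starts exactly where the first ends) do not overlap under the rule above.

```java
import java.io.EOFException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class ValidateAndSortSketch {
  private ValidateAndSortSketch() { }

  /** Hypothetical stand-in for a file range. */
  static final class Range {
    final long offset;
    final int length;

    Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  static List<Range> validateAndSort(List<Range> input, Optional<Long> fileLength)
      throws EOFException {
    List<Range> sorted = new ArrayList<>(input);  // new list; the input is left untouched
    sorted.sort(Comparator.comparingLong((Range r) -> r.offset));
    Range previous = null;
    for (Range current : sorted) {
      if (current.length < 0) {
        throw new IllegalArgumentException("length is negative: " + current.length);
      }
      if (current.offset < 0) {
        throw new EOFException("offset is negative: " + current.offset);
      }
      // Overlap: the start of this range is before the end of the previous one.
      if (previous != null && current.offset < previous.offset + previous.length) {
        throw new IllegalArgumentException("overlapping ranges at offset " + current.offset);
      }
      previous = current;
    }
    // Only the last range needs checking against the file length, if it is known.
    if (previous != null && fileLength.isPresent()
        && previous.offset + previous.length > fileLength.get()) {
      throw new EOFException("range extends beyond the end of the file");
    }
    return sorted;
  }

  public static void main(String[] args) throws EOFException {
    List<Range> input = Arrays.asList(new Range(100, 50), new Range(0, 100));
    List<Range> sorted = validateAndSort(input, Optional.of(150L));
    System.out.println("first offset after sorting: " + sorted.get(0).offset); // 0
  }
}
```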
    • sortRangeList

      public static List<? extends FileRange> sortRangeList(List<? extends FileRange> input)
      Sort the input ranges by offset; no validation is done.
      Parameters:
      input - input ranges.
      Returns:
      a new list of the ranges, sorted by offset.
    • sortRanges

      @Stable public static FileRange[] sortRanges(List<? extends FileRange> input)
      Sort the input ranges by offset; no validation is done.

      This method is used externally and must be retained with the signature unchanged.

      Parameters:
      input - input ranges.
      Returns:
      a new array of the ranges, sorted by offset.
    • mergeSortedRanges

      public static List<CombinedFileRange> mergeSortedRanges(List<? extends FileRange> sortedRanges, int chunkSize, int minimumSeek, int maxSize)
      Merge sorted ranges to optimize the access from the underlying file system. The motivations are that:
      • Upper layers want to pass down logical file ranges.
      • Fewer reads have better performance.
      • Applications want callbacks as ranges are read.
      • Some file systems want to round ranges to be at checksum boundaries.
      Parameters:
      sortedRanges - already sorted list of ranges based on offset.
      chunkSize - round the start and end points to multiples of chunkSize
      minimumSeek - the smallest gap that we should seek over in bytes
      maxSize - the largest combined file range in bytes
      Returns:
      the list of sorted CombinedFileRanges that cover the input
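      To make the roles of chunkSize, minimumSeek and maxSize concrete, here is a simplified sketch of one possible merge strategy. It is not the Hadoop implementation: Range and Combined are hypothetical types, and the exact merge conditions (a gap strictly smaller than minimumSeek, a combined size of at most maxSize) are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MergeRangesSketch {
  private MergeRangesSketch() { }

  /** Hypothetical stand-in for a file range. */
  static final class Range {
    final long offset;
    final int length;

    Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /** Hypothetical combined range covering one or more input ranges. */
  static final class Combined {
    final long start;
    long end;
    final List<Range> underlying = new ArrayList<>();

    Combined(long start, long end) {
      this.start = start;
      this.end = end;
    }
  }

  static long roundDown(long offset, int chunkSize) {
    return chunkSize > 1 ? offset - (offset % chunkSize) : offset;
  }

  static long roundUp(long offset, int chunkSize) {
    return chunkSize > 1 ? roundDown(offset + chunkSize - 1, chunkSize) : offset;
  }

  /** Merges ranges that are already sorted by offset. */
  static List<Combined> merge(List<Range> sorted, int chunkSize, int minimumSeek, int maxSize) {
    List<Combined> result = new ArrayList<>();
    Combined current = null;
    for (Range range : sorted) {
      // Widen each range to chunk boundaries before comparing.
      long start = roundDown(range.offset, chunkSize);
      long end = roundUp(range.offset + range.length, chunkSize);
      boolean canMerge = current != null
          && start - current.end < minimumSeek                      // gap too small to seek over
          && Math.max(current.end, end) - current.start <= maxSize; // stays within the size cap
      if (canMerge) {
        current.end = Math.max(current.end, end);
      } else {
        current = new Combined(start, end);
        result.add(current);
      }
      current.underlying.add(range);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Range> ranges = Arrays.asList(new Range(10, 20), new Range(100, 20), new Range(5000, 10));
    for (Combined c : merge(ranges, 64, 64, 1024)) {
      System.out.println(c.start + ".." + c.end + " covers " + c.underlying.size() + " range(s)");
    }
  }
}
```

      With these inputs the first two ranges are widened to adjacent 64-byte chunks and combined, while the third is far enough away to be read separately.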
    • sliceTo

      public static ByteBuffer sliceTo(ByteBuffer readData, long readOffset, FileRange request)
      Slice the data that was read to the user's request. This function assumes that the user's request is completely subsumed by the read data. This always creates a new buffer pointing to the same underlying data but with its own mark and position fields, so that reading one buffer cannot affect the other's mark and position.
      Parameters:
      readData - the buffer with the readData
      readOffset - the offset in the file for the readData
      request - the user's request
      Returns:
      the readData buffer that is sliced to the user's request
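      The slicing described above can be sketched with the standard ByteBuffer API alone. The slice method below is a hypothetical helper that takes the request's offset and length directly instead of a FileRange; it assumes the requested bytes lie entirely within readData.

```java
import java.nio.ByteBuffer;

final class SliceSketch {
  private SliceSketch() { }

  /**
   * Returns a view of the requested bytes. readOffset is the file offset of
   * the first byte in readData.
   */
  static ByteBuffer slice(ByteBuffer readData, long readOffset, long requestOffset, int requestLength) {
    int offsetChange = (int) (requestOffset - readOffset);
    ByteBuffer view = readData.slice();   // shares content, has its own position, limit and mark
    view.position(offsetChange);
    view.limit(offsetChange + requestLength);
    return view.slice();                  // position 0 now maps to the start of the request
  }

  public static void main(String[] args) {
    byte[] bytes = new byte[20];
    for (int i = 0; i < bytes.length; i++) {
      bytes[i] = (byte) i;
    }
    // Pretend these 20 bytes were read from file offset 100.
    ByteBuffer readData = ByteBuffer.wrap(bytes);
    ByteBuffer request = slice(readData, 100, 105, 4);
    System.out.println(request.remaining()); // 4
    System.out.println(request.get());       // 5
    System.out.println(readData.position()); // 0, unaffected by reading the slice
  }
}
```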
    • hasVectorIOCapability

      public static boolean hasVectorIOCapability(String capability)
      Default vector IO probes. These are capabilities which streams that leave vector IO to the default methods should return when queried for vector capabilities.
      Parameters:
      capability - capability to probe for.
      Returns:
      true if the given capability holds for vectored IO features.