Package org.apache.hadoop.fs
Class VectoredReadUtils
java.lang.Object
org.apache.hadoop.fs.VectoredReadUtils
Utility class which implements helper methods used
in the vectored IO implementation.
-
Field Summary
Fields
static final Consumer<ByteBuffer> LOG_BYTE_BUFFER_RELEASED
    This releaser just logs at debug that the buffer was released.
-
Method Summary
static boolean hasVectorIOCapability(String capability)
    Default vector IO probes.
static boolean isOrderedDisjoint(List<? extends FileRange> input, int chunkSize, int minimumSeek)
    Checks whether the given input list is already sorted by offset, sufficiently spaced and chunk-aligned.
static List<CombinedFileRange> mergeSortedRanges(List<? extends FileRange> sortedRanges, int chunkSize, int minimumSeek, int maxSize)
    Merge sorted ranges to optimize the access from the underlying file system.
static void readInDirectBuffer(FileRange range, ByteBuffer buffer, Function4RaisingIOE<Long, byte[], Integer, Integer, Void> operation)
    Read bytes from stream into a byte buffer using an intermediate byte array.
static CompletableFuture<ByteBuffer> readRangeFrom(PositionedReadable stream, FileRange range, IntFunction<ByteBuffer> allocate)
    Synchronously reads a range from the stream dealing with the combinations of ByteBuffer buffers and PositionedReadable streams.
static CompletableFuture<ByteBuffer> readRangeFrom(PositionedReadable stream, FileRange range, IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release)
    Synchronously reads a range from the stream dealing with the combinations of ByteBuffer buffers and PositionedReadable streams.
static void readVectored(PositionedReadable stream, List<? extends FileRange> ranges, IntFunction<ByteBuffer> allocate)
    This is the default implementation which iterates through the ranges to read each synchronously, but the intent is that subclasses can make more efficient readers.
static void readVectored(PositionedReadable stream, List<? extends FileRange> ranges, IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release)
    Variant of readVectored(PositionedReadable, List, IntFunction) where a release() function is invoked if problems surface during reads.
static long roundDown(long offset, int chunkSize)
    Calculates floor value of offset based on chunk size.
static long roundUp(long offset, int chunkSize)
    Calculates the ceiling value of offset based on chunk size.
static ByteBuffer sliceTo(ByteBuffer readData, long readOffset, FileRange request)
    Slice the data that was read to the user's request.
static List<? extends FileRange> sortRangeList(List<? extends FileRange> input)
    Sort the input ranges by offset; no validation is done.
static FileRange[] sortRanges(List<? extends FileRange> input)
    Sort the input ranges by offset; no validation is done.
static List<? extends FileRange> validateAndSortRanges(List<? extends FileRange> input, Optional<Long> fileLength)
    Validate a list of ranges (including overlapping checks) and return the sorted list.
static <T extends FileRange> T validateRangeRequest(T range)
    Validate a single range.
static void validateVectoredReadRanges(List<? extends FileRange> ranges)
    Validate a list of vectored read ranges.
-
Field Details
-
LOG_BYTE_BUFFER_RELEASED
This releaser just logs at debug that the buffer was released.
-
-
Method Details
-
validateRangeRequest
Validate a single range.
- Type Parameters:
  T - range type
- Parameters:
  range - range to validate.
- Returns:
  the range.
- Throws:
  IllegalArgumentException - the range length is negative, or another invalid condition is met other than those which raise EOFException or NullPointerException.
  EOFException - the range offset is negative.
  NullPointerException - if the range is null.
-
validateVectoredReadRanges
Validate a list of vectored read ranges.
- Parameters:
  ranges - list of ranges.
- Throws:
  EOFException - any EOF exception.
-
readVectored
public static void readVectored(PositionedReadable stream, List<? extends FileRange> ranges, IntFunction<ByteBuffer> allocate) throws EOFException
This is the default implementation which iterates through the ranges to read each synchronously, but the intent is that subclasses can make more efficient readers. The data or exceptions are pushed into FileRange.getData().
- Parameters:
  stream - the stream to read the data from
  ranges - the byte ranges to read
  allocate - the byte buffer allocation
- Throws:
  IllegalArgumentException - if there are overlapping ranges or a range is invalid
  EOFException - the range offset is negative
-
readVectored
public static void readVectored(PositionedReadable stream, List<? extends FileRange> ranges, IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release) throws EOFException
Variant of readVectored(PositionedReadable, List, IntFunction) where a release() function is invoked if problems surface during reads.
- Parameters:
  stream - the stream to read the data from
  ranges - the byte ranges to read
  allocate - the function to allocate a ByteBuffer
  release - the function to release a ByteBuffer.
- Throws:
  IllegalArgumentException - if any of the ranges are invalid, or they overlap.
  EOFException - the range offset is negative
-
readRangeFrom
public static CompletableFuture<ByteBuffer> readRangeFrom(PositionedReadable stream, FileRange range, IntFunction<ByteBuffer> allocate) throws EOFException
Synchronously reads a range from the stream dealing with the combinations of ByteBuffer buffers and PositionedReadable streams.
- Parameters:
  stream - the stream to read from
  range - the range to read
  allocate - the function to allocate ByteBuffers
- Returns:
  the CompletableFuture that contains the read data or an exception.
- Throws:
  IllegalArgumentException - the range is invalid other than by offset or being null.
  EOFException - the range offset is negative
  NullPointerException - if the range is null.
-
readRangeFrom
public static CompletableFuture<ByteBuffer> readRangeFrom(PositionedReadable stream, FileRange range, IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release) throws EOFException
Synchronously reads a range from the stream dealing with the combinations of ByteBuffer buffers and PositionedReadable streams.
- Parameters:
  stream - the stream to read from
  range - the range to read
  allocate - the function to allocate ByteBuffers
  release - the function to release a ByteBuffer.
- Returns:
  the CompletableFuture that contains the read data or an exception.
- Throws:
  IllegalArgumentException - the range is invalid other than by offset or being null.
  EOFException - the range offset is negative
  NullPointerException - if the range is null.
-
readInDirectBuffer
public static void readInDirectBuffer(FileRange range, ByteBuffer buffer, Function4RaisingIOE<Long, byte[], Integer, Integer, Void> operation) throws IOException
Read bytes from stream into a byte buffer using an intermediate byte array.
    (position, buffer, buffer-offset, length): Void
    position := the position within the file to read data.
    buffer := a buffer to read fully `length` bytes into.
    buffer-offset := the offset within the buffer to write data
    length := the number of bytes to read.
The passed in function MUST block until the required length of data is read, or an exception is thrown.
- Parameters:
  range - range to read
  buffer - buffer to fill.
  operation - operation to use for reading data.
- Throws:
  IOException - any IOE.
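The copy loop described above can be sketched as follows. This is a minimal illustration and not the Hadoop source: the `ReadFully` interface is a hypothetical stand-in for `Function4RaisingIOE<Long, byte[], Integer, Integer, Void>`, the range is passed as a plain offset and length, and the `tmpSize` parameter (the size of the intermediate array, assumed positive) is an assumption made so the chunking is visible.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

public class DirectBufferReadSketch {
    /** Hypothetical stand-in for the four-argument read operation. */
    @FunctionalInterface
    public interface ReadFully {
        void read(long position, byte[] buffer, int bufferOffset, int length) throws IOException;
    }

    /**
     * Fill the target buffer by repeatedly reading into a heap array
     * and copying that array into the buffer.
     */
    public static void readInDirectBuffer(long offset, int length, ByteBuffer buffer,
                                          ReadFully operation, int tmpSize) throws IOException {
        if (length == 0) {
            return; // nothing to read
        }
        byte[] tmp = new byte[Math.min(tmpSize, length)];
        long position = offset;
        int done = 0;
        while (done < length) {
            int chunk = Math.min(tmp.length, length - done);
            operation.read(position, tmp, 0, chunk); // must block until chunk bytes are read
            buffer.put(tmp, 0, chunk);               // copy from heap array into the buffer
            position += chunk;
            done += chunk;
        }
    }
}
```

A direct buffer has no accessible backing array, which is why a stream that can only read into `byte[]` needs the intermediate copy.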
-
isOrderedDisjoint
public static boolean isOrderedDisjoint(List<? extends FileRange> input, int chunkSize, int minimumSeek)
Is the given input list:
- already sorted by offset
- each range is more than minimumSeek apart
- the start and end of each range is a multiple of chunkSize
- Parameters:
  input - the list of input ranges.
  chunkSize - the size of the chunks that the offset and end must align to.
  minimumSeek - the minimum distance between ranges.
- Returns:
  true if we can use the input list as is.
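The three conditions can be checked in a single pass, as in the sketch below. It is an illustration under assumptions, not the library code: `Range` is a hypothetical stand-in for `FileRange`, and a gap of exactly `minimumSeek` bytes is treated as acceptable, which is one reading of "the minimum distance between ranges".

```java
import java.util.List;

public class OrderedDisjointSketch {
    /** Hypothetical stand-in for FileRange: an offset and a length. */
    public static final class Range {
        public final long offset;
        public final int length;

        public Range(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    public static boolean isOrderedDisjoint(List<Range> input, int chunkSize, int minimumSeek) {
        // Start "before" the file so a first range at offset 0 passes the gap check.
        long previousEnd = -minimumSeek;
        for (Range r : input) {
            long start = r.offset;
            long end = r.offset + r.length;
            boolean aligned = start % chunkSize == 0 && end % chunkSize == 0;
            // A non-negative minimumSeek makes this check imply sorted and non-overlapping.
            boolean farEnough = start - previousEnd >= minimumSeek;
            if (!aligned || !farEnough) {
                return false;
            }
            previousEnd = end;
        }
        return true;
    }
}
```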
-
roundDown
public static long roundDown(long offset, int chunkSize)
Calculates floor value of offset based on chunk size.
- Parameters:
  offset - file offset.
  chunkSize - file chunk size.
- Returns:
  floor value.
-
roundUp
public static long roundUp(long offset, int chunkSize)
Calculates the ceiling value of offset based on chunk size.
- Parameters:
  offset - file offset.
  chunkSize - file chunk size.
- Returns:
  ceil value.
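The floor and ceiling calculations of roundDown and roundUp amount to plain integer arithmetic. The sketch below assumes non-negative offsets and treats a chunk size of 1 or less as "no rounding"; the class name `RoundingSketch` is hypothetical and the code is not copied from the library.

```java
public class RoundingSketch {
    /** Largest multiple of chunkSize that is less than or equal to offset. */
    public static long roundDown(long offset, int chunkSize) {
        if (chunkSize > 1) {
            return offset - (offset % chunkSize);
        }
        return offset;
    }

    /** Smallest multiple of chunkSize that is greater than or equal to offset. */
    public static long roundUp(long offset, int chunkSize) {
        if (chunkSize > 1) {
            long next = offset + chunkSize - 1;
            return next - (next % chunkSize);
        }
        return offset;
    }

    public static void main(String[] args) {
        System.out.println(roundDown(1000, 512)); // 512
        System.out.println(roundUp(1000, 512));   // 1024
        System.out.println(roundUp(1024, 512));   // 1024, already aligned
    }
}
```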
-
validateAndSortRanges
public static List<? extends FileRange> validateAndSortRanges(List<? extends FileRange> input, Optional<Long> fileLength) throws EOFException
Validate a list of ranges (including overlapping checks) and return the sorted list.
Two ranges overlap when the start offset of second is less than the end offset of first. End offset is calculated as start offset + length.
- Parameters:
  input - input list
  fileLength - file length if known
- Returns:
  a new sorted list.
- Throws:
  IllegalArgumentException - if there are overlapping ranges or a range element is invalid (other than with negative offset)
  EOFException - if the last range extends beyond the end of the file supplied or a range offset is negative
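The overlap rule and the exception mapping can be illustrated with a small standalone sketch. It assumes a hypothetical `Range` type in place of `FileRange`, and the order of the checks is a guess; only the conditions themselves come from the description above.

```java
import java.io.EOFException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class ValidateSortSketch {
    /** Hypothetical stand-in for FileRange: an offset and a length. */
    public static final class Range {
        public final long offset;
        public final int length;

        public Range(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    public static List<Range> validateAndSort(List<Range> input, Optional<Long> fileLength)
            throws EOFException {
        // Sort a copy so the caller's list is left untouched.
        List<Range> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparingLong((Range r) -> r.offset));
        Range previous = null;
        for (Range r : sorted) {
            if (r.length < 0) {
                throw new IllegalArgumentException("negative length at offset " + r.offset);
            }
            if (r.offset < 0) {
                throw new EOFException("negative offset " + r.offset);
            }
            // Overlap: this range starts before the previous one ends.
            if (previous != null && r.offset < previous.offset + previous.length) {
                throw new IllegalArgumentException("overlapping ranges at offset " + r.offset);
            }
            previous = r;
        }
        // Only the last range needs checking against the file length.
        if (previous != null && fileLength.isPresent()
                && previous.offset + previous.length > fileLength.get()) {
            throw new EOFException("range extends beyond end of file");
        }
        return sorted;
    }
}
```

Because the list is sorted and non-overlapping by the time the length check runs, comparing only the final range against the file length is sufficient.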
-
sortRangeList
Sort the input ranges by offset; no validation is done.
- Parameters:
  input - input ranges.
- Returns:
  a new list of the ranges, sorted by offset.
-
sortRanges
Sort the input ranges by offset; no validation is done.
This method is used externally and must be retained with the signature unchanged.
- Parameters:
  input - input ranges.
- Returns:
  a new array of the ranges, sorted by offset.
-
mergeSortedRanges
public static List<CombinedFileRange> mergeSortedRanges(List<? extends FileRange> sortedRanges, int chunkSize, int minimumSeek, int maxSize)
Merge sorted ranges to optimize the access from the underlying file system. The motivations are that:
- Upper layers want to pass down logical file ranges.
- Fewer reads have better performance.
- Applications want callbacks as ranges are read.
- Some file systems want to round ranges to be at checksum boundaries.
- Parameters:
  sortedRanges - already sorted list of ranges based on offset.
  chunkSize - round the start and end points to multiples of chunkSize
  minimumSeek - the smallest gap that we should seek over in bytes
  maxSize - the largest combined file range in bytes
- Returns:
  the list of sorted CombinedFileRanges that cover the input
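One way to picture the merge is a single pass that rounds each range to chunk boundaries and then either extends the current combined range or starts a new one. The sketch below is an assumption-laden illustration rather than the library algorithm: `Combined` is a hypothetical simplification of `CombinedFileRange`, inputs are `{offset, length}` pairs, `chunkSize` is assumed to be at least 1, and a gap smaller than `minimumSeek` is taken to mean "read through it instead of seeking".

```java
import java.util.ArrayList;
import java.util.List;

public class MergeRangesSketch {
    /** Hypothetical combined range: [start, end) plus the number of merged inputs. */
    public static final class Combined {
        public long start;
        public long end;
        public int members = 1;

        Combined(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** @param sortedRanges {offset, length} pairs, already sorted by offset. */
    public static List<Combined> merge(long[][] sortedRanges, int chunkSize,
                                       int minimumSeek, int maxSize) {
        List<Combined> result = new ArrayList<>();
        Combined current = null;
        for (long[] r : sortedRanges) {
            // Round the start down and the end up to chunk boundaries.
            long start = r[0] - (r[0] % chunkSize);
            long rawEnd = r[0] + r[1];
            long end = ((rawEnd + chunkSize - 1) / chunkSize) * chunkSize;
            boolean canMerge = current != null
                && start - current.end < minimumSeek                        // gap too small to seek over
                && Math.max(current.end, end) - current.start <= maxSize;   // stays within size cap
            if (canMerge) {
                current.end = Math.max(current.end, end);
                current.members++;
            } else {
                current = new Combined(start, end);
                result.add(current);
            }
        }
        return result;
    }
}
```

With `chunkSize` 64, `minimumSeek` 100 and `maxSize` 1000, the inputs `{0, 100}`, `{150, 100}` and `{5000, 100}` collapse into two combined ranges: the first two inputs are close enough to share one read, while the third is far enough away to justify a seek.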
-
sliceTo
Slice the data that was read to the user's request. This function assumes that the user's request is completely subsumed by the read data. This always creates a new buffer pointing to the same underlying data but with its own mark and position fields, so that reading one buffer cannot affect the other's mark and position.
- Parameters:
  readData - the buffer with the readData
  readOffset - the offset in the file for the readData
  request - the user's request
- Returns:
  the readData buffer that is sliced to the user's request
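The slicing behaviour can be demonstrated with the standard `ByteBuffer.slice()` call. This is a sketch under assumptions: the request is passed as a plain offset and length rather than a `FileRange`, and the request is taken to lie entirely within the data that was read.

```java
import java.nio.ByteBuffer;

public class SliceSketch {
    /**
     * Return a view of readData covering only the requested bytes.
     * readOffset is the file offset of the first byte in readData.
     */
    public static ByteBuffer sliceTo(ByteBuffer readData, long readOffset,
                                     long requestOffset, int requestLength) {
        int offsetChange = (int) (requestOffset - readOffset);
        // slice() shares the content but has independent position, limit and mark.
        ByteBuffer view = readData.slice();
        view.position(offsetChange);
        view.limit(offsetChange + requestLength);
        return view;
    }
}
```

Since the view has its own position and limit, several user requests served from one combined read can each consume their slice without disturbing the others.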
-
hasVectorIOCapability
Default vector IO probes. These are capabilities which streams that leave vector IO to the default methods should return when queried for vector capabilities.
- Parameters:
  capability - capability to probe for.
- Returns:
  true if the given capability holds for vectored IO features.
-