Class KnnVectorsWriter

java.lang.Object
org.apache.lucene.codecs.KnnVectorsWriter
All Implemented Interfaces:
Closeable, AutoCloseable, Accountable
Direct Known Subclasses:
BufferingKnnVectorsWriter, FlatVectorsWriter, Lucene99HnswVectorsWriter

public abstract class KnnVectorsWriter extends Object implements Accountable, Closeable
Writes vectors to an index.
  • Constructor Details

    • KnnVectorsWriter

      protected KnnVectorsWriter()
      Sole constructor
  • Method Details

    • addField

      public abstract KnnFieldVectorsWriter<?> addField(FieldInfo fieldInfo) throws IOException
      Adds a new field for indexing.
      Throws:
      IOException
    • flush

      public abstract void flush(int maxDoc, Sorter.DocMap sortMap) throws IOException
      Flushes all buffered data to disk.
      Throws:
      IOException
    • finish

      public abstract void finish() throws IOException
      Called once at the end, before close().
      Throws:
      IOException
    • mergeOneField

      public IORunnable mergeOneField(FieldInfo fieldInfo, MergeState mergeState) throws IOException
      Merges vectors for a single field, returning a runnable for any deferred work (e.g., HNSW graph construction). The default implementation naively merges the vectors and returns null (no deferred work).

      Subclasses that override this method may implement a two-phase merge strategy in which flat vectors are written in the first phase and additional indexing structures (such as HNSW graphs) are built in the second phase from the already-written flat vector data.

      Parameters:
      fieldInfo - the field to merge
      mergeState - the merge state
      Returns:
      a runnable to execute in phase 2, or null if there is no deferred work
      Throws:
      IOException - if an I/O error occurs
    • merge

      public final void merge(MergeState mergeState) throws IOException
      Merges the segment vectors for all fields using a two-phase strategy:
      1. Phase 1: Merge flat vectors for all fields by calling mergeOneField(FieldInfo, MergeState), collecting deferred work (runnables) for each field.
      2. Phase 2: Execute the deferred runnables (e.g., HNSW graph construction) using the flat vector data written in phase 1.
      Throws:
      IOException
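
The two-phase strategy described above can be illustrated with a small, self-contained sketch. It does not use Lucene classes: `DeferredWork` is a hypothetical stand-in for the runnable returned by mergeOneField, fields are modeled as plain strings, and the rule that only fields whose name starts with "hnsw" defer work is an assumption made purely for the demonstration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseMergeSketch {

    /** Hypothetical stand-in for a runnable that may throw IOException. */
    @FunctionalInterface
    public interface DeferredWork {
        void run() throws IOException;
    }

    /**
     * Phase-1 work for one field: "write" the flat vectors right away and
     * return deferred work, or null when there is nothing left to do.
     */
    static DeferredWork mergeOneField(String field, List<String> log) throws IOException {
        log.add("flat:" + field);
        if (field.startsWith("hnsw")) {
            // Assumption for this sketch: only these fields build a graph later.
            return () -> log.add("graph:" + field);
        }
        return null;
    }

    /** Runs phase 1 for every field, then phase 2 for the collected work. */
    public static List<String> merge(List<String> fields) throws IOException {
        List<String> log = new ArrayList<>();
        List<DeferredWork> deferred = new ArrayList<>();

        // Phase 1: flat vectors for all fields, collecting deferred work.
        for (String field : fields) {
            DeferredWork work = mergeOneField(field, log);
            if (work != null) {
                deferred.add(work);
            }
        }

        // Phase 2: run the deferred work once all flat data has been written.
        for (DeferredWork work : deferred) {
            work.run();
        }
        return log;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(merge(List.of("hnsw_a", "flat_b", "hnsw_c")));
    }
}
```

Running `main` prints `[flat:hnsw_a, flat:flat_b, flat:hnsw_c, graph:hnsw_a, graph:hnsw_c]`: every flat write happens before any deferred graph step, and the field that returned null contributes nothing to phase 2.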
    • mapOldOrdToNewOrd

      public static void mapOldOrdToNewOrd(DocsWithFieldSet oldDocIds, Sorter.DocMap sortMap, int[] old2NewOrd, int[] new2OldOrd, DocsWithFieldSet newDocsWithField) throws IOException
      Given old doc ids and an id mapping, maps old ordinals to new ordinals. Note: this method returns nothing; its outputs are written to the parameters.
      Parameters:
      oldDocIds - the old or current document ordinals. Must not be null.
      sortMap - the document sorting map for how to make the new ordinals. Must not be null.
      old2NewOrd - int[] that maps each old ord to its new ord
      new2OldOrd - int[] that maps each new ord to its old ord
      newDocsWithField - set of new doc ids that have a value
      Throws:
      IOException
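
As a rough illustration of the ordinal remapping, the sketch below models the inputs with plain arrays instead of Lucene types. These are assumptions for the example only: `oldDocIds` is an ascending array of the old doc ids that have a vector (position i is old ordinal i), `oldToNewDoc` stands in for the sort map, and the new doc ids are returned as an array rather than added to a DocsWithFieldSet. It is not the library's implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class OrdMappingSketch {

    /**
     * Fills old2NewOrd and new2OldOrd and returns the new doc ids in
     * ascending order. Ordinals are positions among docs that have a vector.
     *
     * @param oldDocIds   ascending old doc ids that have a vector (hypothetical model)
     * @param oldToNewDoc oldToNewDoc[oldDoc] = newDoc (hypothetical model of the sort map)
     */
    public static int[] mapOldOrdToNewOrd(
            int[] oldDocIds, int[] oldToNewDoc, int[] old2NewOrd, int[] new2OldOrd) {
        int count = oldDocIds.length;
        Map<Integer, Integer> newDocToOldOrd = new HashMap<>();
        int[] newDocIds = new int[count];

        // Translate each old doc id and remember which old ordinal it came from.
        for (int oldOrd = 0; oldOrd < count; oldOrd++) {
            int newDoc = oldToNewDoc[oldDocIds[oldOrd]];
            newDocToOldOrd.put(newDoc, oldOrd);
            newDocIds[oldOrd] = newDoc;
        }

        // New ordinals follow ascending new doc id order.
        Arrays.sort(newDocIds);
        for (int newOrd = 0; newOrd < count; newOrd++) {
            int oldOrd = newDocToOldOrd.get(newDocIds[newOrd]);
            old2NewOrd[oldOrd] = newOrd;
            new2OldOrd[newOrd] = oldOrd;
        }
        return newDocIds;
    }

    public static void main(String[] args) {
        int[] old2New = new int[3];
        int[] new2Old = new int[3];
        // Docs 0, 2 and 4 have vectors; the sort moves doc 0 -> 4, 2 -> 0, 4 -> 2.
        int[] newDocs = mapOldOrdToNewOrd(
                new int[] {0, 2, 4}, new int[] {4, 3, 0, 1, 2}, old2New, new2Old);
        System.out.println(Arrays.toString(old2New));
        System.out.println(Arrays.toString(new2Old));
        System.out.println(Arrays.toString(newDocs));
    }
}
```

With the inputs in `main`, the three printed arrays are `[2, 0, 1]`, `[1, 2, 0]` and `[0, 2, 4]`. The two ordinal arrays are inverse permutations of each other, which is a useful sanity check when the mapping is consumed elsewhere.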