This project has retired. For details please refer to its Attic page.
Lucy::Index::DataWriter – C API Documentation
Apache Lucy™

Lucy::Index::DataWriter

parcel: Lucy
class variable: LUCY_DATAWRITER
struct symbol: lucy_DataWriter
class nickname: lucy_DataWriter
header file: Lucy/Index/DataWriter.h

Name

Lucy::Index::DataWriter – Write data to an index.

Description

DataWriter is an abstract base class for writing index data, generally in segment-sized chunks. Each component of an index – e.g. stored fields, lexicon, postings, deletions – is represented by a DataWriter/DataReader pair.

Components may be specified per index by subclassing Architecture.

Functions

init
lucy_DataWriter*
lucy_DataWriter_init(
    lucy_DataWriter *self,
    lucy_Schema *schema,
    lucy_Snapshot *snapshot,
    lucy_Segment *segment,
    lucy_PolyReader *polyreader
);

Abstract initializer.

schema

The Schema for the index.

snapshot

The Snapshot that will be committed at the end of the indexing session.

segment

The Segment in progress.

polyreader

A PolyReader representing all existing data in the index. (If the index is brand new, the PolyReader will have no sub-readers.)

Methods

Add_Segment (abstract)
void
lucy_DataWriter_Add_Segment(
    lucy_DataWriter *self,
    lucy_SegReader *reader,
    lucy_I32Array *doc_map
);

Add content from an existing segment into the one currently being written.

reader

The SegReader containing content to add.

doc_map

An array of integers mapping old document ids to new. Deleted documents are mapped to 0, indicating that they should be skipped.
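
To illustrate how a doc_map might be consumed, here is a minimal sketch in plain C. It does not use the Lucy API: the helper remap_docs and the bare int32_t arrays are hypothetical stand-ins for a subclass's own logic and for lucy_I32Array. It assumes that document ids start at 1, that the map is indexed by old document id, and that slot 0 of each array is unused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Lucy: copy one value per document from
 * an old segment into the segment being written, following a doc_map.
 *
 * Assumptions: doc ids start at 1, doc_map[old_id] holds the new id, and
 * slot 0 of every array is unused. A mapped value of 0 marks a deleted
 * document, which is skipped. Returns the number of documents copied. */
static size_t
remap_docs(const int32_t *doc_map, size_t doc_map_size,
           const int32_t *old_values,
           int32_t *new_values, size_t new_values_size) {
    assert(doc_map != NULL && old_values != NULL && new_values != NULL);
    size_t copied = 0;
    for (size_t old_id = 1; old_id < doc_map_size; old_id++) {
        int32_t new_id = doc_map[old_id];
        if (new_id == 0) {
            continue; /* deleted document: skip it */
        }
        assert(new_id > 0 && (size_t)new_id < new_values_size);
        new_values[new_id] = old_values[old_id];
        copied++;
    }
    return copied;
}
```

With a map of {0, 1, 0, 2, 3}, old documents 1, 3 and 4 become new documents 1, 2 and 3, and old document 2 is dropped.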

Delete_Segment
void
lucy_DataWriter_Delete_Segment(
    lucy_DataWriter *self,
    lucy_SegReader *reader
);

Remove a segment’s data. The default implementation is a no-op, as all files within the segment directory will be automatically deleted. Subclasses which manage their own files outside of the segment system should override this method and use it as a trigger for cleaning up obsolete data.

reader

The SegReader containing content to delete, which must represent a segment that is part of the current snapshot.

Merge_Segment
void
lucy_DataWriter_Merge_Segment(
    lucy_DataWriter *self,
    lucy_SegReader *reader,
    lucy_I32Array *doc_map
);

Move content from an existing segment into the one currently being written.

The default implementation calls Add_Segment() then Delete_Segment().
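
The add-then-delete behavior described above can be modeled in plain C as follows. This is only a sketch of the dispatch pattern, not Lucy's implementation: ToyWriter, its function pointers, and the integer segment id are hypothetical stand-ins for the real class, its method table, and lucy_SegReader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a writer class; NOT the real Lucy struct. */
typedef struct ToyWriter ToyWriter;
typedef void (*toy_delete_fn)(ToyWriter *self, int seg_id);

struct ToyWriter {
    void (*add_segment)(ToyWriter *self, int seg_id, const int32_t *doc_map);
    toy_delete_fn delete_segment;
    int    external_cleanups; /* cleanups done outside the segment dir */
    char   log[8];            /* call order: 'A' = add, 'D' = delete   */
    size_t log_len;
};

/* Record one event in the call log. */
static void
toy_log(ToyWriter *self, char event) {
    assert(self->log_len + 1 < sizeof(self->log));
    self->log[self->log_len++] = event;
    self->log[self->log_len]   = '\0';
}

/* Stand-in for the abstract Add_Segment; a real subclass copies data. */
static void
toy_add_segment(ToyWriter *self, int seg_id, const int32_t *doc_map) {
    (void)seg_id;
    (void)doc_map;
    toy_log(self, 'A');
}

/* Models the default Delete_Segment: nothing to clean up. */
static void
toy_delete_segment(ToyWriter *self, int seg_id) {
    (void)seg_id;
    toy_log(self, 'D');
}

/* Models an override for a writer that keeps files elsewhere. */
static void
toy_delete_segment_external(ToyWriter *self, int seg_id) {
    (void)seg_id;
    self->external_cleanups++;
    toy_log(self, 'D');
}

/* Models the default Merge_Segment: add first, then delete. */
static void
toy_merge_segment(ToyWriter *self, int seg_id, const int32_t *doc_map) {
    self->add_segment(self, seg_id, doc_map);
    self->delete_segment(self, seg_id);
}

/* Set up a writer with the chosen delete behavior. */
static void
toy_writer_init(ToyWriter *self, toy_delete_fn delete_segment) {
    self->add_segment       = toy_add_segment;
    self->delete_segment    = delete_segment;
    self->external_cleanups = 0;
    self->log[0]            = '\0';
    self->log_len           = 0;
}
```

In this model, a writer that overrides only the delete step still gets its cleanup hook invoked by a merge, without having to override the merge step itself.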

reader

The SegReader containing content to merge, which must represent a segment that is part of the current snapshot.

doc_map

An array of integers mapping old document ids to new. Deleted documents are mapped to 0, indicating that they should be skipped.

Finish (abstract)
void
lucy_DataWriter_Finish(
    lucy_DataWriter *self
);

Complete the segment: close all streams, store metadata, etc.

Metadata
cfish_Hash* // incremented
lucy_DataWriter_Metadata(
    lucy_DataWriter *self
);

Arbitrary metadata to be serialized and stored by the Segment. The default implementation supplies a hash with a single key-value pair for “format”.

Format (abstract)
int32_t
lucy_DataWriter_Format(
    lucy_DataWriter *self
);

Every writer must specify a file format revision number, which should increment each time the format changes. Responsibility for revision checking is left to the companion DataReader.
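
As an illustration of this division of labor, the sketch below pairs a writer-side format number with a reader-side check, in plain C. All names and revision numbers are hypothetical, and the policy shown (accept any revision from the oldest one still readable up to the current one, reject anything else) is an assumption rather than a rule stated by Lucy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical revision numbers for an imaginary index component. */
enum { TOY_OLDEST_READABLE = 1, TOY_CURRENT_FORMAT = 3 };

/* Writer side: report the revision written into new segments. */
static int32_t
toy_writer_format(void) {
    return TOY_CURRENT_FORMAT;
}

/* Reader side: decide whether a stored revision can be read.
 * Assumed policy: revisions older than TOY_OLDEST_READABLE or newer
 * than TOY_CURRENT_FORMAT are rejected. */
static bool
toy_reader_supports(int32_t stored_format) {
    assert(TOY_OLDEST_READABLE <= TOY_CURRENT_FORMAT);
    return stored_format >= TOY_OLDEST_READABLE
        && stored_format <= TOY_CURRENT_FORMAT;
}
```

A reader built this way would refuse a segment written by a newer writer instead of misinterpreting its data.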

Get_Snapshot
lucy_Snapshot*
lucy_DataWriter_Get_Snapshot(
    lucy_DataWriter *self
);

Accessor for “snapshot” member var.

Get_Segment
lucy_Segment*
lucy_DataWriter_Get_Segment(
    lucy_DataWriter *self
);

Accessor for “segment” member var.

Get_PolyReader
lucy_PolyReader*
lucy_DataWriter_Get_PolyReader(
    lucy_DataWriter *self
);

Accessor for “polyreader” member var.

Get_Schema
lucy_Schema*
lucy_DataWriter_Get_Schema(
    lucy_DataWriter *self
);

Accessor for “schema” member var.

Get_Folder
lucy_Folder*
lucy_DataWriter_Get_Folder(
    lucy_DataWriter *self
);

Accessor for “folder” member var.

Inheritance

Lucy::Index::DataWriter is a Clownfish::Obj.