parcel | Lucy |
class variable | LUCY_DATAWRITER |
struct symbol | lucy_DataWriter |
class nickname | lucy_DataWriter |
header file | Lucy/Index/DataWriter.h |
Lucy::Index::DataWriter – Write data to an index.
DataWriter is an abstract base class for writing index data, generally in segment-sized chunks. Each component of an index – e.g. stored fields, lexicon, postings, deletions – is represented by a DataWriter/DataReader pair.
Components may be specified per index by subclassing Architecture.
lucy_DataWriter*
lucy_DataWriter_init(
lucy_DataWriter *self,
lucy_Schema *schema,
lucy_Snapshot *snapshot,
lucy_Segment *segment,
lucy_PolyReader *polyreader
);
Abstract initializer.
snapshot: The Snapshot that will be committed at the end of the indexing session.
segment: The Segment in progress.
polyreader: A PolyReader representing all existing data in the index. (If the index is brand new, the PolyReader will have no sub-readers.)
void
lucy_DataWriter_Add_Segment(
lucy_DataWriter *self,
lucy_SegReader *reader,
lucy_I32Array *doc_map
);
Add content from an existing segment into the one currently being written.
reader: The SegReader containing content to add.
doc_map: An array of integers mapping old document ids to new. Deleted documents are mapped to 0, indicating that they should be skipped.
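The doc_map convention can be illustrated without the Lucy headers. The following is a minimal sketch, not Lucy's implementation: `DocMapSketch`, `remap_doc_id` and `collect_surviving` are hypothetical names, a plain `int32_t` array stands in for `lucy_I32Array`, and the sketch assumes document ids start at 1, which is consistent with 0 being reserved as the skip marker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for lucy_I32Array: a plain array indexed by the
 * old document id.  Slot 0 is unused because ids are assumed to start
 * at 1. */
typedef struct {
    const int32_t *ids;
    size_t         size;
} DocMapSketch;

/* Return the new id for old_id, or 0 if the document was deleted (or
 * old_id is out of range) and should be skipped. */
static int32_t
remap_doc_id(const DocMapSketch *map, int32_t old_id) {
    assert(map != NULL);
    if (old_id <= 0 || (size_t)old_id >= map->size) { return 0; }
    return map->ids[old_id];
}

/* Walk old ids 1..doc_max and store the new id of every surviving
 * document in out, which must have room for doc_max entries.  Returns
 * the number of documents kept. */
static size_t
collect_surviving(const DocMapSketch *map, int32_t doc_max, int32_t *out) {
    size_t kept = 0;
    for (int32_t old_id = 1; old_id <= doc_max; old_id++) {
        int32_t new_id = remap_doc_id(map, old_id);
        if (new_id == 0) { continue; }  /* deleted: skip */
        out[kept++] = new_id;
    }
    return kept;
}
```

An Add_Segment() override would typically perform a loop of this shape over the incoming segment's documents, writing each surviving document under its new id.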
void
lucy_DataWriter_Delete_Segment(
lucy_DataWriter *self,
lucy_SegReader *reader
);
Remove a segment’s data. The default implementation is a no-op, as all files within the segment directory will be automatically deleted. Subclasses which manage their own files outside of the segment system should override this method and use it as a trigger for cleaning up obsolete data.
reader: The SegReader containing the content to delete, which must represent a segment that is part of the current snapshot.
void
lucy_DataWriter_Merge_Segment(
lucy_DataWriter *self,
lucy_SegReader *reader,
lucy_I32Array *doc_map
);
Move content from an existing segment into the one currently being written.
The default implementation calls Add_Segment() then Delete_Segment().
reader: The SegReader containing content to merge, which must represent a segment that is part of the current snapshot.
doc_map: An array of integers mapping old document ids to new. Deleted documents are mapped to 0, indicating that they should be skipped.
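The relationship between the three segment methods can be sketched with a small function-pointer table. This is an illustration under stated assumptions rather than Lucy's actual dispatch mechanism: `WriterSketch`, `logging_add`, `logging_delete` and `merge_segment` are hypothetical names, segments are identified by a plain string, and the doc_map argument is omitted for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical miniature writer; not the real lucy_DataWriter layout. */
typedef struct WriterSketch WriterSketch;
struct WriterSketch {
    void (*add_segment)(WriterSketch *self, const char *seg_name);
    void (*delete_segment)(WriterSketch *self, const char *seg_name);
    char log[128];  /* records the order in which methods were called */
};

/* Append "what(seg_name);" to the writer's call log. */
static void
log_call(WriterSketch *self, const char *what, const char *seg_name) {
    size_t used = strlen(self->log);
    snprintf(self->log + used, sizeof(self->log) - used,
             "%s(%s);", what, seg_name);
}

static void
logging_add(WriterSketch *self, const char *seg_name) {
    log_call(self, "add", seg_name);
}

/* Stands in for a subclass override that cleans up its own files. */
static void
logging_delete(WriterSketch *self, const char *seg_name) {
    log_call(self, "delete", seg_name);
}

/* Mirrors the documented default: add the content, then delete the
 * source segment's data. */
static void
merge_segment(WriterSketch *self, const char *seg_name) {
    self->add_segment(self, seg_name);
    self->delete_segment(self, seg_name);
}
```

Because the default merge is composed from the other two methods, a subclass that overrides only Add_Segment() and Delete_Segment() gets a working merge without further code.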
void
lucy_DataWriter_Finish(
lucy_DataWriter *self
);
Complete the segment: close all streams, store metadata, etc.
cfish_Hash* // incremented
lucy_DataWriter_Metadata(
lucy_DataWriter *self
);
Arbitrary metadata to be serialized and stored by the Segment. The default implementation supplies a hash with a single key-value pair for “format”.
int32_t
lucy_DataWriter_Format(
lucy_DataWriter *self
);
Every writer must specify a file format revision number, which should increment each time the format changes. Responsibility for revision checking is left to the companion DataReader.
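A reader-side revision check might look like the sketch below. All names here (`SKETCH_CURRENT_FORMAT`, `sketch_writer_format`, `sketch_reader_can_read`) are hypothetical, and the policy of accepting every revision from 1 up to the current one is an assumption made for illustration; a real DataReader may support a narrower range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical revision number for an imaginary index component. */
enum { SKETCH_CURRENT_FORMAT = 3 };

/* What the writer side would report from its Format() method. */
static int32_t
sketch_writer_format(void) {
    return SKETCH_CURRENT_FORMAT;
}

/* Reader-side check (assumed policy): accept data written at any
 * revision from 1 through the one this reader understands, and reject
 * anything newer or non-positive. */
static bool
sketch_reader_can_read(int32_t stored_format) {
    return stored_format >= 1 && stored_format <= SKETCH_CURRENT_FORMAT;
}
```

The stored value would come from the segment metadata, where the default Metadata() implementation records it under the "format" key.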
lucy_Snapshot*
lucy_DataWriter_Get_Snapshot(
lucy_DataWriter *self
);
Accessor for “snapshot” member var.
lucy_Segment*
lucy_DataWriter_Get_Segment(
lucy_DataWriter *self
);
Accessor for “segment” member var.
lucy_PolyReader*
lucy_DataWriter_Get_PolyReader(
lucy_DataWriter *self
);
Accessor for “polyreader” member var.
lucy_Schema*
lucy_DataWriter_Get_Schema(
lucy_DataWriter *self
);
Accessor for “schema” member var.
lucy_Folder*
lucy_DataWriter_Get_Folder(
lucy_DataWriter *self
);
Accessor for “folder” member var.
Lucy::Index::DataWriter is a Clownfish::Obj.
Copyright © 2010-2015 The Apache Software Foundation, Licensed under the
Apache License, Version 2.0.