parcel | Lucy |
class variable | LUCY_SEGWRITER |
struct symbol | lucy_SegWriter |
class nickname | lucy_SegWriter |
header file | Lucy/Index/SegWriter.h |
Lucy::Index::SegWriter – Write one segment of an index.
SegWriter is a conduit through which information fed to Indexer passes. It manages Segment and Inverter, invokes the Analyzer chain, and feeds low level DataWriters such as PostingListWriter and DocWriter.
The sub-components of a SegWriter are determined by Architecture. DataWriter components which are added to the stack of writers via Add_Writer() have Add_Inverted_Doc() invoked for each document supplied to SegWriter’s Add_Doc().
void
lucy_SegWriter_Register(
lucy_SegWriter *self,
cfish_String *api,
lucy_DataWriter *component // decremented
);
Register a DataWriter component with the SegWriter. Note that registration only makes the component available via Fetch(); for it to receive documents, you may also want to call Add_Writer().
api: The name of the DataWriter API which the component implements.
component: A DataWriter.
cfish_Obj*
lucy_SegWriter_Fetch(
lucy_SegWriter *self,
cfish_String *api
);
Retrieve a registered component.
api: The name of the DataWriter API which the component implements.
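The difference between registering a component and looking it up again can be pictured with a small stand-alone model. This is only a sketch: `Registry`, `registry_register` and `registry_fetch` are hypothetical names that do not belong to Lucy, plain C strings stand in for cfish_String, and returning NULL for an unknown name is a choice made for the sketch rather than documented behaviour of Fetch().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a SegWriter's component registry.
 * None of these names belong to the Lucy API. */
#define MAX_COMPONENTS 8

typedef struct {
    const char *api;        /* name of the API the component implements */
    void       *component;  /* the registered component itself */
} RegistryEntry;

typedef struct {
    RegistryEntry entries[MAX_COMPONENTS];
    size_t        count;
} Registry;

/* Make a component available for later lookup under `api`. */
static void
registry_register(Registry *reg, const char *api, void *component) {
    assert(reg->count < MAX_COMPONENTS);
    reg->entries[reg->count].api       = api;
    reg->entries[reg->count].component = component;
    reg->count++;
}

/* Return the component registered under `api`, or NULL if there is none. */
static void*
registry_fetch(const Registry *reg, const char *api) {
    for (size_t i = 0; i < reg->count; i++) {
        if (strcmp(reg->entries[i].api, api) == 0) {
            return reg->entries[i].component;
        }
    }
    return NULL;
}
```

In this model, registering a component does nothing except make it findable by name; feeding it documents is a separate step.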
void
lucy_SegWriter_Add_Writer(
lucy_SegWriter *self,
lucy_DataWriter *writer // decremented
);
Add a DataWriter to the SegWriter’s stack of writers.
void
lucy_SegWriter_Add_Doc(
lucy_SegWriter *self,
lucy_Doc *doc,
float boost
);
Add a document to the segment. Inverts doc, increments the Segment’s internal document id, then calls Add_Inverted_Doc(), feeding all sub-writers.
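The fan-out performed by Add_Doc() can be modelled without Lucy itself. The sketch below is an illustration under stated assumptions: `ModelSegWriter`, `SubWriter`, `model_add_writer`, `model_add_doc` and `count_docs` are hypothetical names, the inversion step is left out entirely, and document ids are assumed to start at 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WRITERS 8

/* Hypothetical sub-writer: a callback plus its private state. */
typedef struct {
    void (*add_inverted_doc)(void *state, int32_t doc_id);
    void  *state;
} SubWriter;

/* Hypothetical stand-in for a SegWriter and its stack of writers. */
typedef struct {
    SubWriter writers[MAX_WRITERS];
    size_t    num_writers;
    int32_t   doc_count;   /* last document id handed out */
} ModelSegWriter;

/* Push a sub-writer onto the stack. */
static void
model_add_writer(ModelSegWriter *sw, SubWriter writer) {
    assert(sw->num_writers < MAX_WRITERS);
    sw->writers[sw->num_writers++] = writer;
}

/* Assign the next document id, then feed every writer on the stack.
 * Returns the id that was assigned. */
static int32_t
model_add_doc(ModelSegWriter *sw) {
    int32_t doc_id = ++sw->doc_count;
    for (size_t i = 0; i < sw->num_writers; i++) {
        sw->writers[i].add_inverted_doc(sw->writers[i].state, doc_id);
    }
    return doc_id;
}

/* Example sub-writer callback: counts the documents it has been fed. */
static void
count_docs(void *state, int32_t doc_id) {
    (void)doc_id;
    (*(int*)state)++;
}
```

Each writer on the stack sees every document exactly once, in the order the writers were added.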
void
lucy_SegWriter_Add_Segment(
lucy_SegWriter *self,
lucy_SegReader *reader,
lucy_I32Array *doc_map
);
Add content from an existing segment into the one currently being written.
reader: The SegReader containing content to add.
doc_map: An array of integers mapping old document ids to new. Deleted documents are mapped to 0, indicating that they should be skipped.
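To make the doc_map convention concrete, here is a minimal sketch that builds such a map with plain C arrays instead of lucy_I32Array. The function name `build_doc_map`, the `deleted` flag array and the `offset` parameter are hypothetical; the sketch also assumes that document ids start at 1, so index 0 of the map is unused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill doc_map[0..doc_max] from an array of deletion flags.
 * Both arrays are indexed by old document id; index 0 is unused because
 * ids are assumed to start at 1. Surviving documents receive consecutive
 * new ids starting at offset + 1; deleted documents map to 0.
 * Returns the number of surviving documents. */
static int32_t
build_doc_map(const uint8_t *deleted, int32_t doc_max, int32_t offset,
              int32_t *doc_map) {
    assert(doc_max >= 0 && offset >= 0);
    int32_t survivors = 0;
    doc_map[0] = 0;
    for (int32_t old_id = 1; old_id <= doc_max; old_id++) {
        if (deleted[old_id]) {
            doc_map[old_id] = 0;                  /* skip this document */
        } else {
            survivors++;
            doc_map[old_id] = offset + survivors; /* next free new id */
        }
    }
    return survivors;
}
```

A consumer of the map would then walk the old ids and copy only those documents whose mapped id is non-zero.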
void
lucy_SegWriter_Merge_Segment(
lucy_SegWriter *self,
lucy_SegReader *reader,
lucy_I32Array *doc_map
);
Move content from an existing segment into the one currently being written.
The default implementation calls Add_Segment() then Delete_Segment().
reader: The SegReader containing content to merge, which must represent a segment which is part of the current snapshot.
doc_map: An array of integers mapping old document ids to new. Deleted documents are mapped to 0, indicating that they should be skipped.
void
lucy_SegWriter_Delete_Segment(
lucy_SegWriter *self,
lucy_SegReader *reader
);
Remove a segment’s data. The default implementation is a no-op, as all files within the segment directory will be automatically deleted. Subclasses which manage their own files outside of the segment system should override this method and use it as a trigger for cleaning up obsolete data.
reader: The SegReader for the segment whose data is being removed, which must represent a segment which is part of the current snapshot.
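The relationship between Merge_Segment(), Add_Segment() and Delete_Segment() can be sketched with function pointers standing in for overridable methods. Everything here is hypothetical (`ModelWriter`, `model_merge_segment` and the callbacks are not Lucy names), and a segment is reduced to a name string; the point is only to show the default merge calling add then delete, and a subclass-style override using the delete step as a cleanup trigger.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical writer whose add and delete steps can be overridden. */
typedef struct ModelWriter ModelWriter;
struct ModelWriter {
    void (*add_segment)(ModelWriter *self, const char *seg_name);
    void (*delete_segment)(ModelWriter *self, const char *seg_name);
    int added;    /* number of segments whose content was added */
    int cleaned;  /* number of external cleanups performed */
};

/* Stand-in for copying a segment's content into the current one. */
static void
default_add_segment(ModelWriter *self, const char *seg_name) {
    (void)seg_name;
    self->added++;
}

/* Default delete step: a no-op, since files inside the segment
 * directory are removed without the writer's help. */
static void
default_delete_segment(ModelWriter *self, const char *seg_name) {
    (void)self;
    (void)seg_name;
}

/* Override for a writer that keeps data outside the segment directory. */
static void
external_delete_segment(ModelWriter *self, const char *seg_name) {
    (void)seg_name;
    self->cleaned++;  /* a real writer would remove its own files here */
}

/* Default merge: add the old segment's content, then delete the old data. */
static void
model_merge_segment(ModelWriter *self, const char *seg_name) {
    assert(self->add_segment != NULL && self->delete_segment != NULL);
    self->add_segment(self, seg_name);
    self->delete_segment(self, seg_name);
}
```

Only the writer with the overriding delete step performs any cleanup; the merge logic itself is unchanged.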
void
lucy_SegWriter_Finish(
lucy_SegWriter *self
);
Complete the segment: close all streams, store metadata, etc.
cfish_Hash* // incremented
lucy_SegWriter_Metadata(
lucy_SegWriter *self
);
Arbitrary metadata to be serialized and stored by the Segment. The default implementation supplies a hash with a single key-value pair for “format”.
int32_t
lucy_SegWriter_Format(
lucy_SegWriter *self
);
Every writer must specify a file format revision number, which should increment each time the format changes. Responsibility for revision checking is left to the companion DataReader.
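As an illustration of how a companion reader might act on the revision number, the following sketch pairs a writer-side constant with a reader-side check. The names `MODEL_FORMAT`, `model_writer_format` and `model_reader_accepts` are hypothetical, and the policy shown (accept any revision from 1 up to the highest one the reader supports) is an assumption, not something the text above prescribes.

```c
#include <assert.h>
#include <stdint.h>

/* Revision written by the hypothetical writer; bump on format changes. */
#define MODEL_FORMAT 3

/* Writer side: report the file format revision. */
static int32_t
model_writer_format(void) {
    return MODEL_FORMAT;
}

/* Reader side: return 1 if a reader that understands revisions up to
 * `supported` can read data stored with revision `stored`, else 0. */
static int
model_reader_accepts(int32_t stored, int32_t supported) {
    assert(supported >= 1);
    return stored >= 1 && stored <= supported;
}
```

With this split, the writer never validates anything; a reader built against an older revision simply refuses data written in a newer one.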
lucy_Snapshot*
lucy_SegWriter_Get_Snapshot(
lucy_SegWriter *self
);
Accessor for “snapshot” member var.
lucy_Segment*
lucy_SegWriter_Get_Segment(
lucy_SegWriter *self
);
Accessor for “segment” member var.
lucy_PolyReader*
lucy_SegWriter_Get_PolyReader(
lucy_SegWriter *self
);
Accessor for “polyreader” member var.
lucy_Schema*
lucy_SegWriter_Get_Schema(
lucy_SegWriter *self
);
Accessor for “schema” member var.
lucy_Folder*
lucy_SegWriter_Get_Folder(
lucy_SegWriter *self
);
Accessor for “folder” member var.
Lucy::Index::SegWriter is a Lucy::Index::DataWriter is a Clownfish::Obj.
Copyright © 2010-2015 The Apache Software Foundation, Licensed under the
Apache License, Version 2.0.
Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The
Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their
respective owners.