This project has retired. For details please refer to its Attic page.
Lucy::Search::Compiler – C API Documentation
Apache Lucy™

Lucy::Search::Compiler

parcel Lucy
class variable LUCY_COMPILER
struct symbol lucy_Compiler
class nickname lucy_Compiler
header file Lucy/Search/Compiler.h

Name

Lucy::Search::Compiler – Query-to-Matcher compiler.

Description

The purpose of the Compiler class is to take a specification in the form of a Query object and compile a Matcher object that can do real work.

The simplest Compiler subclasses – such as those associated with constant-scoring Query types – might simply implement a Make_Matcher() method which passes along information verbatim from the Query to the Matcher’s constructor.

However, it is common for the Compiler to perform some calculations which affect its “weight” – a floating point multiplier that the Matcher will factor into each document’s score. If that is the case, then the Compiler subclass may wish to override Get_Weight(), Sum_Of_Squared_Weights(), and Apply_Norm_Factor().

Compiling a Matcher is a two stage process.

The first stage takes place during the Compiler’s construction, which is where the Query object meets a Searcher object for the first time. Searchers operate on a specific document collection and they can tell you certain statistical information about the collection – such as how many total documents are in the collection, or how many documents in the collection a particular term is present in. Lucy’s core Compiler classes plug this information into the classic TF/IDF weighting algorithm to adjust the Compiler’s weight; custom subclasses might do something similar.

The second stage of compilation is the Make_Matcher() method, which is where the Compiler meets a SegReader object. SegReaders are associated with a single segment within a single index on a single machine, and are thus lower-level than Searchers, which may represent a document collection spread out over a search cluster (comprising several indexes and many segments). The Compiler object can use new information supplied by the SegReader – such as whether a term is missing from the local index even though it is present within the larger collection represented by the Searcher – when figuring out what to feed to the Matcher’s constructor, or whether Make_Matcher() should return a Matcher at all.
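The second stage can be sketched in the same spirit. The types below (ToySegReader, ToyCompiler, ToyMatcher) and toy_make_matcher() are hypothetical stand-ins rather than Lucy's classes; the sketch only illustrates the decision to return NULL when the local segment cannot match anything.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a SegReader: per-segment term statistics. */
typedef struct {
    int32_t seg_doc_freq;  /* documents in this segment containing the term */
} ToySegReader;

/* Hypothetical stand-in for a Compiler holding a precomputed weight. */
typedef struct {
    float weight;
} ToyCompiler;

/* Hypothetical stand-in for a Matcher. */
typedef struct {
    float weight;      /* factored into each score when scoring is needed */
    bool  need_score;
} ToyMatcher;

/* Stage two: return NULL when this segment cannot match any document;
 * otherwise return a new Matcher that the caller must free(). */
static ToyMatcher*
toy_make_matcher(const ToyCompiler *compiler, const ToySegReader *reader,
                 bool need_score) {
    if (reader->seg_doc_freq == 0) { return NULL; }
    ToyMatcher *matcher = malloc(sizeof(*matcher));
    if (matcher == NULL) { return NULL; }
    matcher->weight     = need_score ? compiler->weight : 0.0f;
    matcher->need_score = need_score;
    return matcher;
}
```

A caller would check the result for NULL and skip the segment in that case, even if the term exists elsewhere in the wider collection.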

Functions

init
lucy_Compiler*
lucy_Compiler_init(
    lucy_Compiler *self,
    lucy_Query *parent,
    lucy_Searcher *searcher,
    lucy_Similarity *similarity,
    float boost
);

Abstract initializer.

parent

The parent Query.

searcher

A Lucy::Search::Searcher, such as an IndexSearcher.

similarity

A Similarity.

boost

An arbitrary scoring multiplier. Defaults to the boost of the parent Query.

Methods

Make_Matcher (abstract)
lucy_Matcher* // incremented
lucy_Compiler_Make_Matcher(
    lucy_Compiler *self,
    lucy_SegReader *reader,
    bool need_score
);

Factory method returning a Matcher.

reader

A SegReader.

need_score

Indicate whether the Matcher must implement Score().

Returns: a Matcher, or NULL if the Matcher would have matched no documents.

Get_Weight
float
lucy_Compiler_Get_Weight(
    lucy_Compiler *self
);

Return the Compiler’s numerical weight, a scoring multiplier. By default, returns the object’s boost.

Get_Similarity
lucy_Similarity*
lucy_Compiler_Get_Similarity(
    lucy_Compiler *self
);

Accessor for the Compiler’s Similarity object.

Get_Parent
lucy_Query*
lucy_Compiler_Get_Parent(
    lucy_Compiler *self
);

Accessor for the Compiler’s parent Query object.

Sum_Of_Squared_Weights
float
lucy_Compiler_Sum_Of_Squared_Weights(
    lucy_Compiler *self
);

Compute and return a raw weighting factor, which Normalize() uses to derive the normalization multiplier. By default, simply returns 1.0.

Apply_Norm_Factor
void
lucy_Compiler_Apply_Norm_Factor(
    lucy_Compiler *self,
    float factor
);

Apply a floating point normalization multiplier. For a TermCompiler, this involves multiplying its own weight by the supplied factor; combining classes such as ORCompiler would apply the factor recursively to their children.

The default implementation is a no-op; subclasses may wish to multiply their internal weight by the supplied factor.

factor

The multiplier.
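The difference between a leaf compiler and a combining compiler can be shown with a toy tree. ToyNode and toy_apply_norm_factor() are hypothetical names, not part of Lucy; the sketch assumes a leaf multiplies its own weight while a combining node only forwards the factor to its children.

```c
#include <stddef.h>

/* Hypothetical compiler node: a leaf (term-like) node has no children;
 * a combining (OR-like) node owns an array of child pointers. */
typedef struct ToyNode {
    float            weight;        /* meaningful for leaf nodes */
    struct ToyNode **children;      /* NULL for leaf nodes */
    size_t           num_children;  /* 0 for leaf nodes */
} ToyNode;

/* Leaves scale their own weight; combining nodes recurse. */
static void
toy_apply_norm_factor(ToyNode *node, float factor) {
    if (node->num_children == 0) {
        node->weight *= factor;
        return;
    }
    for (size_t i = 0; i < node->num_children; i++) {
        toy_apply_norm_factor(node->children[i], factor);
    }
}
```

Applying a factor of 0.5 to a combining node whose two leaves carry weights 2.0 and 4.0 leaves the leaves at 1.0 and 2.0; the combining node's own weight field is untouched.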

Normalize
void
lucy_Compiler_Normalize(
    lucy_Compiler *self
);

Take a newly minted Compiler object and apply query-specific normalization factors. Should be invoked by Query subclasses during Make_Compiler() for top-level nodes.

For a TermQuery, the scoring formula is approximately:

(tf_d * idf_t / norm_d) * (tf_q * idf_t / norm_q)

Normalize() is theoretically concerned with applying the second half of that formula to the Compiler’s weight. What actually happens depends on how the Compiler and Similarity methods called internally are implemented.

Equals
bool
lucy_Compiler_Equals(
    lucy_Compiler *self,
    cfish_Obj *other
);

Indicate whether two objects are the same. By default, compares the memory address.

other

Another Obj.

To_String
cfish_String* // incremented
lucy_Compiler_To_String(
    lucy_Compiler *self
);

Generic stringification: “ClassName@hex_mem_address”.

Methods inherited from Lucy::Search::Query

Make_Compiler (abstract)
lucy_Compiler* // incremented
lucy_Compiler_Make_Compiler(
    lucy_Compiler *self,
    lucy_Searcher *searcher,
    float boost,
    bool subordinate
);

Abstract factory method returning a Compiler derived from this Query.

searcher

A Searcher.

boost

A scoring multiplier.

subordinate

Indicates whether the Query is a subquery (as opposed to a top-level query). If false, the implementation must invoke Normalize() on the newly minted Compiler object before returning it.
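The contract around the subordinate flag can be modelled in a few lines. ToyCompiler, toy_normalize() and toy_make_compiler() are hypothetical; the normalization step here only counts its invocations so that the control flow is visible.

```c
#include <stdbool.h>

/* Hypothetical compiler that records how often it was normalized. */
typedef struct {
    float weight;
    int   normalize_calls;
} ToyCompiler;

/* Placeholder for the real normalization work. */
static void
toy_normalize(ToyCompiler *compiler) {
    compiler->normalize_calls++;
}

/* Top-level queries normalize the new compiler before returning it;
 * subordinate queries leave that to the enclosing top-level query. */
static ToyCompiler
toy_make_compiler(float boost, bool subordinate) {
    ToyCompiler compiler = { boost, 0 };
    if (!subordinate) {
        toy_normalize(&compiler);
    }
    return compiler;
}
```

In this model a top-level call normalizes exactly once, while a compiler built for a subquery is returned untouched so that the factor can later be pushed down from the top.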

Set_Boost
void
lucy_Compiler_Set_Boost(
    lucy_Compiler *self,
    float boost
);

Set the Query’s boost.

Get_Boost
float
lucy_Compiler_Get_Boost(
    lucy_Compiler *self
);

Get the Query’s boost.

Dump
cfish_Obj* // incremented
lucy_Compiler_Dump(
    lucy_Compiler *self
);

Load
cfish_Obj* // incremented
lucy_Compiler_Load(
    lucy_Compiler *self,
    cfish_Obj *dump
);

Inheritance

Lucy::Search::Compiler is a Lucy::Search::Query is a Clownfish::Obj.