This project has retired. For details please refer to its Attic page.

NAME

Lucy::Search::Searcher - Base class for searching collections of documents.

SYNOPSIS

# Abstract base class.

DESCRIPTION

Abstract base class for objects which search. Core subclasses include IndexSearcher and PolySearcher.

CONSTRUCTORS

new

package MySearcher;
use base qw( Lucy::Search::Searcher );
sub new {
    my $self = shift->SUPER::new;
    ...
    return $self;
}

Abstract constructor.

  • schema - A Schema.

ABSTRACT METHODS

doc_max

my $int = $searcher->doc_max();

Return the maximum number of docs in the collection represented by the Searcher, which is also the highest possible internal doc id. Documents which have been marked as deleted but not yet purged are included in this count.

doc_freq

my $int = $searcher->doc_freq(
    field => $field,  # required
    term  => $term,   # required
);

Return the number of documents which contain the term in the given field.

  • field - Field name.
  • term - The term to look up.
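As a rough illustration of doc_max and doc_freq, the sketch below calls both on an IndexSearcher, one of the core subclasses. The in-memory RAMFolder index, the un-analyzed `tag` field, and the three toy documents are assumptions made up for this example. They are not part of the Searcher interface.

```perl
use strict;
use warnings;
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Store::RAMFolder;
use Lucy::Index::Indexer;
use Lucy::Search::IndexSearcher;

# Assumed toy schema: one un-analyzed field, so terms are stored verbatim.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'tag', type => Lucy::Plan::StringType->new );

# Throwaway in-memory index holding three made-up documents.
my $folder  = Lucy::Store::RAMFolder->new;
my $indexer = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $folder,
    create => 1,
);
$indexer->add_doc( { tag => $_ } ) for qw( perl perl search );
$indexer->commit;

my $searcher = Lucy::Search::IndexSearcher->new( index => $folder );

# Size of the collection, and how many docs carry the term 'perl' in 'tag'.
my $doc_max  = $searcher->doc_max;
my $doc_freq = $searcher->doc_freq(
    field => 'tag',
    term  => 'perl',
);
print "doc_max=$doc_max doc_freq=$doc_freq\n";
```

With an analyzed field type, the term passed to doc_freq would presumably have to match the form the analyzer produced at index time. The un-analyzed StringType sidesteps that here.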

collect

$searcher->collect(
    query     => $query,      # required
    collector => $collector,  # required
);

Iterate over hits, feeding them into a Collector.

  • query - A Query.
  • collector - A Collector.
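One way to see collect in action is to feed hits into a BitCollector, which sets one bit per matching doc id. The sketch below assumes an IndexSearcher over a made-up in-memory index. The `tag` field, the sample documents, and the choice of BitCollector are illustrative only.

```perl
use strict;
use warnings;
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Store::RAMFolder;
use Lucy::Index::Indexer;
use Lucy::Search::IndexSearcher;
use Lucy::Search::TermQuery;
use Lucy::Search::Collector::BitCollector;
use Lucy::Object::BitVec;

# Assumed toy index: one un-analyzed field, three made-up documents.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'tag', type => Lucy::Plan::StringType->new );
my $folder  = Lucy::Store::RAMFolder->new;
my $indexer = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $folder,
    create => 1,
);
$indexer->add_doc( { tag => $_ } ) for qw( perl perl search );
$indexer->commit;

my $searcher = Lucy::Search::IndexSearcher->new( index => $folder );
my $query    = Lucy::Search::TermQuery->new( field => 'tag', term => 'perl' );

# One bit per possible doc id; doc_max is the highest possible id.
my $bit_vec = Lucy::Object::BitVec->new(
    capacity => $searcher->doc_max + 1,
);
my $collector = Lucy::Search::Collector::BitCollector->new(
    bit_vector => $bit_vec,
);
$searcher->collect(
    query     => $query,
    collector => $collector,
);

# Walk the set bits to recover the ids of the matching documents.
my @doc_ids;
for ( my $id = $bit_vec->next_hit(0); $id != -1; $id = $bit_vec->next_hit( $id + 1 ) ) {
    push @doc_ids, $id;
}
my $num_matches = $bit_vec->count;
print "matched $num_matches docs: @doc_ids\n";
```

Unlike hits, this path does no ranking or paging. The collector simply sees every document that matches the query.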

fetch_doc

my $hit_doc = $searcher->fetch_doc($doc_id);

Retrieve a document. Throws an error if the doc id is out of range.

  • doc_id - A document id.
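Because fetch_doc throws on an out-of-range id, callers that cannot guarantee a valid id may want to trap the error. The following is a minimal sketch using an IndexSearcher. The toy index, the `tag` field, and the deliberately oversized id are assumptions for illustration.

```perl
use strict;
use warnings;
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Store::RAMFolder;
use Lucy::Index::Indexer;
use Lucy::Search::IndexSearcher;

# Assumed toy index: one un-analyzed, stored field and three made-up docs.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'tag', type => Lucy::Plan::StringType->new );
my $folder  = Lucy::Store::RAMFolder->new;
my $indexer = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $folder,
    create => 1,
);
$indexer->add_doc( { tag => $_ } ) for qw( perl perl search );
$indexer->commit;

my $searcher = Lucy::Search::IndexSearcher->new( index => $folder );

# A valid id: stored fields are available as hash elements of the result.
my $hit_doc = $searcher->fetch_doc(1);
print "doc 1 has tag '$hit_doc->{tag}'\n";

# An id well past doc_max: trap the exception instead of dying.
my $bad_id  = $searcher->doc_max + 100;
my $fetched = eval { $searcher->fetch_doc($bad_id); 1 };
print "fetch_doc($bad_id) ", ( $fetched ? "succeeded" : "threw an error" ), "\n";
```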

METHODS

glean_query

my $query = $searcher->glean_query($query);
my $query = $searcher->glean_query();  # default: undef

If the supplied object is a Query, return it; if it’s a query string, create a QueryParser and parse it to produce a query against all indexed fields.

hits

my $hits = $searcher->hits(
    query      => $query,       # required
    offset     => $offset,      # default: 0
    num_wanted => $num_wanted,  # default: 10
    sort_spec  => $sort_spec,   # default: undef
);

Return a Hits object containing the top results.

  • query - Either a Query object or a query string.
  • offset - The number of most-relevant hits to discard, typically used when “paging” through hits N at a time. Setting offset to 20 and num_wanted to 10 retrieves hits 21-30, assuming that 30 hits can be found.
  • num_wanted - The number of hits you would like to see after offset is taken into account.
  • sort_spec - A SortSpec, which will affect how results are ranked and returned.
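The relationship between offset, num_wanted, and the range of hits returned can be sketched in plain Perl. The helpers `paging_args` and `hit_range` below are hypothetical and not part of Lucy. They only reproduce the arithmetic described above.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a 1-based page number and a page size into
# the offset/num_wanted pair that hits() accepts.
sub paging_args {
    my ( $page, $per_page ) = @_;
    die "page must be >= 1\n"     if $page < 1;
    die "per_page must be >= 1\n" if $per_page < 1;
    return (
        offset     => ( $page - 1 ) * $per_page,
        num_wanted => $per_page,
    );
}

# Hypothetical helper: the 1-based range of hits a window covers, clipped to
# the number of hits actually found. Returns an empty list when the offset
# lies past the last hit.
sub hit_range {
    my ( $offset, $num_wanted, $total_hits ) = @_;
    my $first = $offset + 1;
    return () if $first > $total_hits;
    my $last = $offset + $num_wanted;
    $last = $total_hits if $last > $total_hits;
    return ( $first, $last );
}

# Third page at ten hits per page, assuming 30 hits can be found.
my %args = paging_args( 3, 10 );
my ( $first, $last ) = hit_range( $args{offset}, $args{num_wanted}, 30 );
print "offset=$args{offset} num_wanted=$args{num_wanted} hits $first-$last\n";
```

Since `paging_args` returns a flat key/value list, it could be spliced directly into a call such as `$searcher->hits( query => $query, paging_args( 3, 10 ) )`.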

get_schema

my $schema = $searcher->get_schema();

Accessor for the object’s schema member.

INHERITANCE

Lucy::Search::Searcher isa Clownfish::Obj.