Lucy::Analysis::PolyAnalyzer – Apache Lucy Documentation

NAME

Lucy::Analysis::PolyAnalyzer - Multiple Analyzers in series.

SYNOPSIS

my $schema = Lucy::Plan::Schema->new;
# @analyzers holds Analyzer objects in the order they should run.
my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new(
    analyzers => \@analyzers,
);
my $type = Lucy::Plan::FullTextType->new(
    analyzer => $polyanalyzer,
);
$schema->spec_field( name => 'title',   type => $type );
$schema->spec_field( name => 'content', type => $type );

A PolyAnalyzer is a series of Analyzers, each of which will be called upon to “analyze” text in turn. You can either provide the Analyzers yourself, or you can specify a supported language, in which case a PolyAnalyzer consisting of a CaseFolder, a RegexTokenizer, and a SnowballStemmer will be generated for you.

The language parameter is DEPRECATED. Use EasyAnalyzer instead.

Supported languages:

en => English,
da => Danish,
de => German,
es => Spanish,
fi => Finnish,
fr => French,
hu => Hungarian,
it => Italian,
nl => Dutch,
no => Norwegian,
pt => Portuguese,
ro => Romanian,
ru => Russian,
sv => Swedish,
tr => Turkish,
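
Since the language parameter is deprecated, a schema that used to rely on it can be set up with EasyAnalyzer instead. The following is a minimal sketch, assuming Lucy is installed; the field name 'content' simply mirrors the synopsis. Note that EasyAnalyzer is built from a StandardTokenizer, a Normalizer, and a SnowballStemmer, so it may tokenize slightly differently from the CaseFolder/RegexTokenizer chain described above.

```perl
use strict;
use warnings;
use Lucy::Plan::Schema;
use Lucy::Plan::FullTextType;
use Lucy::Analysis::EasyAnalyzer;

# Pick any ISO code from the list of supported languages.
my $analyzer = Lucy::Analysis::EasyAnalyzer->new( language => 'en' );

# Use it wherever a PolyAnalyzer built from "language" was used before.
my $schema = Lucy::Plan::Schema->new;
my $type   = Lucy::Plan::FullTextType->new( analyzer => $analyzer );
$schema->spec_field( name => 'content', type => $type );
```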

CONSTRUCTORS

new

my $tokenizer    = Lucy::Analysis::StandardTokenizer->new;
my $normalizer   = Lucy::Analysis::Normalizer->new;
my $stemmer      = Lucy::Analysis::SnowballStemmer->new( language => 'en' );
my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new(
    analyzers => [ $tokenizer, $normalizer, $stemmer, ], );

Create a new PolyAnalyzer.

language - An ISO code from the list of supported languages. DEPRECATED, use EasyAnalyzer instead.
analyzers - A reference to an array of Analyzers, which are applied in the order given. The order matters. Don’t put a SnowballStemmer before a RegexTokenizer (a stemmer works on individual words, not on whole documents or paragraphs), or a SnowballStopFilter after a SnowballStemmer (stemmed forms such as “themselv” will not appear in a stoplist). In general, the sequence should be: tokenize, normalize, remove stopwords, stem.
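
The effect of ordering can be seen without Lucy at all. The sketch below uses hypothetical toy stages written as plain Perl closures (they are not Lucy classes, and the suffix-stripping "stemmer" is a crude stand-in for Snowball) to show why a stoplist has to run before stemming.

```perl
use strict;
use warnings;

# Hypothetical toy stages, NOT Lucy's implementations. Each one takes an
# array ref of strings and returns a new array ref.
my %stoplist = map { $_ => 1 } qw( the by themselves );

my $tokenize  = sub { [ map { $_ =~ /(\w+)/g } @{ $_[0] } ] };
my $normalize = sub { [ map { lc $_ } @{ $_[0] } ] };
my $stop      = sub { [ grep { !$stoplist{$_} } @{ $_[0] } ] };
my $stem      = sub {
    # Crude suffix stripping, only for illustration.
    [ map { my $w = $_; $w =~ s/(?:es|s)\z//; $w } @{ $_[0] } ];
};

# Run the stages in series, much as a PolyAnalyzer runs its analyzers.
sub run_chain {
    my ( $text, @stages ) = @_;
    my $tokens = [$text];
    $tokens = $_->($tokens) for @stages;
    return $tokens;
}

my $text = 'The cats fed themselves';

my $good = run_chain( $text, $tokenize, $normalize, $stop, $stem );
my $bad  = run_chain( $text, $tokenize, $normalize, $stem, $stop );

print "stop then stem: @$good\n";    # stop then stem: cat fed
print "stem then stop: @$bad\n";     # stem then stop: cat fed themselv
```

In the second chain the stemmer has already turned "themselves" into "themselv", which the stoplist no longer recognizes, so the word leaks into the output.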

METHODS

get_analyzers

my $arrayref = $polyanalyzer->get_analyzers();

Getter for “analyzers” member.
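
As an illustration, the getter can be used to check the order of a chain after construction. This is a sketch assuming Lucy is installed; the four-stage chain follows the tokenize, normalize, stop-filter, stem sequence recommended for the analyzers parameter, and the variable names are only for this example.

```perl
use strict;
use warnings;
use Lucy::Analysis::PolyAnalyzer;
use Lucy::Analysis::StandardTokenizer;
use Lucy::Analysis::Normalizer;
use Lucy::Analysis::SnowballStopFilter;
use Lucy::Analysis::SnowballStemmer;

my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new(
    analyzers => [
        Lucy::Analysis::StandardTokenizer->new,
        Lucy::Analysis::Normalizer->new,
        Lucy::Analysis::SnowballStopFilter->new( language => 'en' ),
        Lucy::Analysis::SnowballStemmer->new( language => 'en' ),
    ],
);

# List the class of each analyzer in the order it will be applied.
my @classes = map { ref $_ } @{ $polyanalyzer->get_analyzers };
print "$_\n" for @classes;
```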

transform

my $inversion = $polyanalyzer->transform($inversion);

Takes a single Inversion as input and returns an Inversion: either the same one (presumably transformed in some way) or a new one.

inversion - The Inversion to transform.
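
For experimenting with a chain, it is often easier to start from a plain string than from an Inversion. The sketch below assumes Lucy is installed and relies on transform_text() and split(), which PolyAnalyzer inherits from Lucy::Analysis::Analyzer; the sample text is arbitrary.

```perl
use strict;
use warnings;
use Lucy::Analysis::PolyAnalyzer;
use Lucy::Analysis::StandardTokenizer;
use Lucy::Analysis::Normalizer;
use Lucy::Analysis::SnowballStemmer;

my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new(
    analyzers => [
        Lucy::Analysis::StandardTokenizer->new,
        Lucy::Analysis::Normalizer->new,
        Lucy::Analysis::SnowballStemmer->new( language => 'en' ),
    ],
);

# transform_text() creates an Inversion from the string and runs it
# through every analyzer in the chain.
my $inversion = $polyanalyzer->transform_text('The Quick Brown Foxes');
while ( my $token = $inversion->next ) {
    print $token->get_text, "\n";
}

# split() is a shortcut that returns just the token texts.
my $texts = $polyanalyzer->split('The Quick Brown Foxes');
print scalar(@$texts), " tokens\n";
```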

INHERITANCE

Lucy::Analysis::PolyAnalyzer isa Lucy::Analysis::Analyzer isa Clownfish::Obj.

Copyright © 2010-2015 The Apache Software Foundation, Licensed under the Apache License, Version 2.0.
Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.