PhraseHunter::Phrase Class Reference

A phrasal token consists of several words or tokens.

#include <token.h>

Inherits PhraseHunter::MutableToken.

Public Types

enum  Direction { LeftToRight, RightToLeft }
 The direction in which two Tokens are merged into a Phrase, i.e. which Token's occurrences are scanned first. See the Member Enumeration Documentation for details.

Public Member Functions

size_t length () const
 Returns the length of the Phrase as it is in the text repository.
unsigned int numTokens () const
 Returns the number of tokens that make up this Phrase (the value of m_tokencount).

Static Public Member Functions

template<Direction d>
static TokenPtr getAdjacent (TokenPtr left, TokenPtr right)
 Merge two Tokens and specify the merging Direction. Usually, you want to call mergeTokens() instead.
static TokenPtr mergeTokens (TokenPtr left, TokenPtr right)
 Merge two Token objects into a Phrase.

Private Member Functions

 Phrase (const char *token, int tokencount)
 Phrase (schma::UnicodePtr token, int tokencount)

Private Attributes

int m_tokencount

Classes

struct  adja

Detailed Description

A phrasal token consists of several words or tokens.

Definition at line 263 of file token.h.


Member Enumeration Documentation

enum PhraseHunter::Phrase::Direction

The direction in which two Tokens are merged into a Phrase. Merging two Tokens into a Phrase requires walking the entire occurrence matrix of one Token and checking whether each position is adjacent to an occurrence of the other Token. This can be quite time-consuming, but it may be sped up considerably by starting from the Token with fewer occurrences. Phrase::Direction indicates which Token to start with: if the left Token is less frequent, call getAdjacent<LeftToRight>(TokenPtr left, TokenPtr right); otherwise, call getAdjacent<RightToLeft>(TokenPtr left, TokenPtr right).

Enumerator:
LeftToRight 
RightToLeft 

Definition at line 276 of file token.h.
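
The reason for starting from the less frequent Token can be illustrated with a small, self-contained sketch. It is not the PhraseHunter implementation: the Positions type, the adjacentStarts() function, and the assumption that every token spans exactly one word position are hypothetical simplifications, and the real occurrence matrix may be organised differently.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a token's occurrence data: the word positions
// at which the token occurs. Not the real PhraseHunter occurrence matrix.
using Positions = std::set<std::size_t>;

// Returns the positions at which `left` is immediately followed by `right`,
// assuming (for illustration only) that each token spans one position.
std::vector<std::size_t> adjacentStarts(const Positions& left,
                                        const Positions& right)
{
    std::vector<std::size_t> result;
    if (left.size() <= right.size()) {
        // Left token is rarer: scan it and probe the right token.
        for (std::size_t p : left)
            if (right.count(p + 1))
                result.push_back(p);
    } else {
        // Right token is rarer: scan it and probe the left token.
        for (std::size_t p : right)
            if (p > 0 && left.count(p - 1))
                result.push_back(p - 1);
    }
    return result;
}
```

In this sketch the loop always runs over the smaller set, so the number of lookups is bounded by the occurrence count of the rarer token, whichever side it is on.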


Constructor & Destructor Documentation

PhraseHunter::Phrase::Phrase ( const char *  token,
int  tokencount 
) [inline, private]

Definition at line 287 of file token.h.

Referenced by getAdjacent().

PhraseHunter::Phrase::Phrase ( schma::UnicodePtr  token,
int  tokencount 
) [inline, private]

Definition at line 290 of file token.h.


Member Function Documentation

template<Phrase::Direction d>
TokenPtr PhraseHunter::Phrase::getAdjacent ( TokenPtr  left,
TokenPtr  right 
) [static]

Merge two Tokens and specify the merging Direction. Usually, you want to call mergeTokens() instead.

Definition at line 172 of file token.cpp.

References PhraseHunter::MutableToken::addOccurrence(), PhraseHunter::EmptyToken::instance(), PhraseHunter::Token::isEmpty(), and Phrase().

TokenPtr PhraseHunter::Phrase::mergeTokens ( TokenPtr  left,
TokenPtr  right 
) [static]

Merge two Token objects into a Phrase.

Definition at line 210 of file token.cpp.
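
The description of Direction suggests that a caller picks the template argument by comparing occurrence counts. The sketch below shows one way such a dispatch could look. It is an assumption, not the body of mergeTokens(): TokenStub, getAdjacentStub(), and mergeTokensStub() are hypothetical names, and the stub only reports the chosen direction instead of building a Phrase.

```cpp
#include <cstddef>

enum Direction { LeftToRight, RightToLeft };

// Hypothetical minimal token: only the occurrence count matters here.
struct TokenStub {
    std::size_t occurrences;
};

// Stands in for the real merge work; it only reports the direction
// that was selected at compile time through the template argument.
template<Direction d>
Direction getAdjacentStub(const TokenStub&, const TokenStub&)
{
    return d;
}

// Chooses the direction as the Direction documentation describes:
// LeftToRight if the left token is less frequent, RightToLeft otherwise.
Direction mergeTokensStub(const TokenStub& left, const TokenStub& right)
{
    if (left.occurrences < right.occurrences)
        return getAdjacentStub<LeftToRight>(left, right);
    return getAdjacentStub<RightToLeft>(left, right);
}
```

Because the direction is a template parameter, each variant can be compiled as a separate function without a runtime branch inside the scanning loop; the only runtime decision is the single comparison in the wrapper.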

size_t PhraseHunter::Phrase::length (  )  const [inline, virtual]

Returns the length of the Phrase as it is in the text repository.

Reimplemented from PhraseHunter::Token.

Definition at line 306 of file token.h.

References m_tokencount, and PhraseHunter::Token::m_tokenstring.

unsigned int PhraseHunter::Phrase::numTokens (  )  const [inline, virtual]

Returns the number of tokens that make up this Phrase (the value of m_tokencount).

Reimplemented from PhraseHunter::Token.

Definition at line 312 of file token.h.

References m_tokencount.


Member Data Documentation

int PhraseHunter::Phrase::m_tokencount [private]

Definition at line 285 of file token.h.

Referenced by length(), and numTokens().


The documentation for this class was generated from the following files:
token.h
token.cpp

Generated on Thu Dec 21 16:14:44 2006 for The Phrasehunter by  doxygen 1.5.1