org.eml.sir.rel.disc
Class RelatednessRanker

java.lang.Object
  extended by org.eml.sir.rel.disc.RelatednessRanker

public class RelatednessRanker
extends java.lang.Object

An instance of this class compares a SlimDiscourse (the profile) to a corpus of other SlimDiscourses (the descriptions) and computes a ranking, which is represented as a RelatednessResult.

Author:
Hendrik Niederlich (EML-R)

Field Summary
static int EQU_COMPARISONS
          Equation type of RelatednessRanker.
static int EQU_QUERY_WORDS
          Equation type of RelatednessRanker.
 WordRelatednessMap wrm
           
 
Constructor Summary
RelatednessRanker()
          Create a new RelatednessRanker instance.
 
Method Summary
 int getEquation()
          Return the type number of the equation used to compute the discourse relatedness.
 int getOnlyPositiveProfFeatures()
          Return whether to use only positive professional features or not.
 Stemmer getStemmer()
          Return the stemmer.
 double getThreshold()
          Word relatedness values that are smaller than the threshold will not be used to compute the discourse relatednesses.
 void initSlimCorpus(java.lang.String corpusPath)
          Creates a new corpus of SlimDiscourses whose relatedness values to another SlimDiscourse will be computed.
 void initWordRelatednessMap(java.lang.String profilesPath, WordRelatednessComparator wrc)
          Creates a new instance of WordRelatednessMap.
 boolean isIgnoreIrrelevantClauses()
          Return whether to ignore irrelevant clauses or not.
 boolean isUsingStopWords()
          Return whether stop words are used.
 boolean loadSlimCorpus(java.lang.String filePath)
          Load an already computed and serialized instance of a corpus.
 boolean loadWordRelatednessMap(java.lang.String filePath)
          Load an already computed and serialized instance of a WordRelatednessMap.
 RelatednessResult rank(SlimDiscourse profile)
           
 boolean saveSlimCorpus(java.lang.String filePath)
          Serialize and save an already computed instance of a corpus.
 boolean saveWordRelatednessMap(java.lang.String filePath)
          Serialize and save an already computed instance of a WordRelatednessMap.
 void setEquation(int equationNumber)
          Set the type number of the equation used to compute the discourse relatedness.
 void setIgnoreIrrelevantClauses(boolean ignoreIrrelevantClauses)
          Set whether to ignore irrelevant clauses or not.
 void setOnlyPositiveProfFeatures(int onlyPositiveProfFeatures)
          Set whether to use only positive professional features or not.
 void setStemmer(Stemmer stemmer)
          Set the stemmer.
 void setThreshold(double threshold)
          Word relatedness values that are smaller than the threshold will not be used to compute the discourse relatednesses.
 void setUseStopWords(boolean useStopWords)
          Set whether to use stop words or not.
 java.lang.String toString()
          Return a String that contains important information about this instance of RelatednessRanker.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

EQU_COMPARISONS

public static final int EQU_COMPARISONS
Equation type of RelatednessRanker. rel_docs = sum_rel_words / ( n(doc words) * n(query words) )

See Also:
Constant Field Values

EQU_QUERY_WORDS

public static final int EQU_QUERY_WORDS
Equation type of RelatednessRanker. rel_docs = sum_rel_words / n(query words)

See Also:
Constant Field Values
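
The two equation types described above can be illustrated with a small, self-contained sketch. It is not the RelatednessRanker implementation: the class name, the method name, and the numeric values assigned to the two constants here are assumptions made for illustration only (the real values are listed under Constant Field Values). The parameter sumRelWords stands for the already computed sum of word relatedness values, and rejecting non-positive word counts is a choice of this sketch.

```java
// Illustrative sketch only; names and constant values are hypothetical.
final class RelatednessEquationsSketch {
    // Stand-ins for the EQU_* constants; the actual values are not given here.
    static final int EQU_QUERY_WORDS = 0;
    static final int EQU_COMPARISONS = 1;

    /**
     * Computes rel_docs from the summed word relatedness.
     *
     * @param equation    one of the EQU_* stand-ins above
     * @param sumRelWords sum of the word relatedness values (sum_rel_words)
     * @param docWords    n(doc words)
     * @param queryWords  n(query words)
     */
    static double discourseRelatedness(int equation, double sumRelWords,
                                       int docWords, int queryWords) {
        if (docWords <= 0 || queryWords <= 0) {
            // Assumption of this sketch: empty discourses are rejected.
            throw new IllegalArgumentException("word counts must be positive");
        }
        switch (equation) {
            case EQU_QUERY_WORDS:
                // rel_docs = sum_rel_words / n(query words)
                return sumRelWords / queryWords;
            case EQU_COMPARISONS:
                // rel_docs = sum_rel_words / ( n(doc words) * n(query words) )
                return sumRelWords / ((double) docWords * queryWords);
            default:
                throw new IllegalArgumentException("unknown equation: " + equation);
        }
    }

    public static void main(String[] args) {
        System.out.println(discourseRelatedness(EQU_QUERY_WORDS, 3.0, 5, 4));
        System.out.println(discourseRelatedness(EQU_COMPARISONS, 3.0, 5, 4));
    }
}
```

EQU_COMPARISONS normalizes by the number of word-to-word comparisons, so with the same sum it yields a smaller value for longer documents than EQU_QUERY_WORDS does.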

wrm

public WordRelatednessMap wrm

Constructor Detail

RelatednessRanker

public RelatednessRanker()
Create a new RelatednessRanker instance.

Method Detail

getStemmer

public Stemmer getStemmer()
Return the stemmer.


setStemmer

public void setStemmer(Stemmer stemmer)
Set the stemmer.


isIgnoreIrrelevantClauses

public boolean isIgnoreIrrelevantClauses()
Return whether to ignore irrelevant clauses or not.


setIgnoreIrrelevantClauses

public void setIgnoreIrrelevantClauses(boolean ignoreIrrelevantClauses)
Set whether to ignore irrelevant clauses or not. Only needed if a new WordRelatednessMap is going to be created. Loading a WordRelatednessMap sets this property appropriately.

Parameters:
ignoreIrrelevantClauses - The ignoreIrrelevantClauses to set.

getOnlyPositiveProfFeatures

public int getOnlyPositiveProfFeatures()
Return whether to use only positive professional features or not. Possible values: SirDiscourse.FEAT_NOT (0) or SirDiscourse.FEAT_ALL (5).


setOnlyPositiveProfFeatures

public void setOnlyPositiveProfFeatures(int onlyPositiveProfFeatures)
Set whether to use only positive professional features or not. Only needed if a new WordRelatednessMap is created.

Parameters:
onlyPositiveProfFeatures - The onlyPositiveProfFeatures to set.

isUsingStopWords

public boolean isUsingStopWords()
Return whether stop words are used.


setUseStopWords

public void setUseStopWords(boolean useStopWords)
Set whether to use stop words or not.


getThreshold

public double getThreshold()
Word relatedness values that are smaller than the threshold will not be used to compute the discourse relatednesses.

Returns:
Returns the threshold.

setThreshold

public void setThreshold(double threshold)
Word relatedness values that are smaller than the threshold will not be used to compute the discourse relatednesses.

Parameters:
threshold - The threshold to set.
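
The effect of the threshold can be pictured with a minimal sketch. It assumes that the word relatedness values are available as a plain array and that values equal to the threshold are still used (the description only excludes values that are smaller). The class and method names are hypothetical and not part of RelatednessRanker.

```java
// Illustrative sketch only; not the RelatednessRanker implementation.
final class ThresholdSketch {
    /**
     * Sums the word relatedness values, skipping every value that is
     * smaller than the threshold.
     */
    static double sumAboveThreshold(double[] wordRelatedness, double threshold) {
        double sum = 0.0;
        for (double value : wordRelatedness) {
            if (value >= threshold) { // values below the threshold are not used
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] values = {0.1, 0.5, 0.9};
        System.out.println(sumAboveThreshold(values, 0.5));
    }
}
```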

getEquation

public int getEquation()
Return the type number of the equation used to compute the discourse relatedness.

Returns:
Returns the equation type.

setEquation

public void setEquation(int equationNumber)
Set the type number of the equation used to compute the discourse relatedness.

Parameters:
equationNumber - The equation to set.

initSlimCorpus

public void initSlimCorpus(java.lang.String corpusPath)
Creates a new corpus of SlimDiscourses whose relatedness values to another SlimDiscourse will be computed. Call setIgnoreIrrelevantClauses() and setOnlyPositiveProfFeatures() first to make sure you get the corpus you want.

Parameters:
corpusPath - path of the directory that contains the discourses of the corpus.

loadSlimCorpus

public boolean loadSlimCorpus(java.lang.String filePath)
Load an already computed and serialized instance of a corpus. May change the ignoreIrrelevantClauses and onlyPositiveProfFeatures properties.

Parameters:
filePath -
Returns:
Success.

saveSlimCorpus

public boolean saveSlimCorpus(java.lang.String filePath)
Serialize and save an already computed instance of a corpus.

Parameters:
filePath -
Returns:
Success.
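
The save and load methods report success as a boolean. A minimal sketch of that pattern with standard Java serialization could look as follows. How RelatednessRanker actually stores its corpus is not documented here, so the class CorpusStoreSketch, its methods, and the use of a list of strings in place of SlimDiscourses are all assumptions made for illustration.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; a list of strings stands in for the corpus.
final class CorpusStoreSketch {
    private ArrayList<String> corpus = new ArrayList<>();

    void setCorpus(List<String> documents) {
        corpus = new ArrayList<>(documents);
    }

    List<String> getCorpus() {
        return corpus;
    }

    /** Serializes the corpus to filePath; returns true on success. */
    boolean save(String filePath) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(filePath))) {
            out.writeObject(corpus);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Replaces the corpus with the one stored at filePath; returns true on success. */
    boolean load(String filePath) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(filePath))) {
            Object read = in.readObject();
            if (!(read instanceof ArrayList)) {
                return false;
            }
            @SuppressWarnings("unchecked")
            ArrayList<String> loaded = (ArrayList<String>) read;
            corpus = loaded;
            return true;
        } catch (IOException | ClassNotFoundException e) {
            return false;
        }
    }
}
```

In this sketch a failed load leaves the previously held corpus untouched, so callers should check the returned flag before relying on the loaded data.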

initWordRelatednessMap

public void initWordRelatednessMap(java.lang.String profilesPath,
                                   WordRelatednessComparator wrc)
Creates a new instance of WordRelatednessMap. Call setStemmer() first and initialize or load a slim corpus to make sure you get the map you want.

Parameters:
profilesPath -
wrc -

loadWordRelatednessMap

public boolean loadWordRelatednessMap(java.lang.String filePath)
Load an already computed and serialized instance of a WordRelatednessMap. May change the stemmer of this RelatednessRanker.


saveWordRelatednessMap

public boolean saveWordRelatednessMap(java.lang.String filePath)
Serialize and save an already computed instance of a WordRelatednessMap.

Parameters:
filePath -
Returns:
Success.

rank

public RelatednessResult rank(SlimDiscourse profile)
Parameters:
profile - unstemmed. The profile is stemmed internally and, where applicable, stripped of stop words.
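
The internal preprocessing of the profile (stemming and, where applicable, stop-word removal) can be sketched as follows. This is only an illustration under several assumptions: the profile is treated as a list of words, the stemmer is modelled as a java.util.function.UnaryOperator rather than the Stemmer type, stop words are removed before stemming, and the removeStopWords flag is hypothetical. None of these names are part of RelatednessRanker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

// Illustrative sketch only; not the RelatednessRanker implementation.
final class ProfilePreprocessingSketch {
    /**
     * Returns the stemmed words of an unstemmed profile, optionally
     * dropping stop words first.
     */
    static List<String> preprocess(List<String> words,
                                   UnaryOperator<String> stemmer,
                                   Set<String> stopWords,
                                   boolean removeStopWords) {
        List<String> result = new ArrayList<>();
        for (String word : words) {
            if (removeStopWords && stopWords.contains(word)) {
                continue; // stop word: not passed on to the stemmer
            }
            result.add(stemmer.apply(word));
        }
        return result;
    }
}
```

A caller would pass the raw profile words together with whatever stemming function and stop-word list are in use; the input list itself is left unchanged.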

toString

public java.lang.String toString()
Return a String that contains important information about this instance of RelatednessRanker.