org.datamanager.passiveentityvalue
Class WordFrequencyMapEntityValue

java.lang.Object
  |
  +--org.datamanager.passiveentityvalue.WordFrequencyMapEntityValue
All Implemented Interfaces:
EntityValue, PassiveEntityValue, Serializable

public class WordFrequencyMapEntityValue
extends Object
implements PassiveEntityValue

This class maintains a Map from each word to its frequency of occurrence within a given Document.
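
As a rough illustration of the idea, the following sketch builds a word-to-frequency map from a String corpus. It is not the actual implementation: the class name WordFrequencySketch, the method countWords, and the delimiter set ASSUMED_DELIMITERS are hypothetical, and the real value of DELIMITERS is not shown on this page. It also assumes words are counted case-sensitively.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

class WordFrequencySketch {
    // Hypothetical delimiter set; the real DELIMITERS constant may differ.
    static final String ASSUMED_DELIMITERS = " \t\n\r.,;:!?";

    // Counts how often each token occurs in the corpus.
    // Counts are stored as doubles to mirror the double-returning accessors.
    static Map<String, Double> countWords(String corpus) {
        Map<String, Double> frequencies = new HashMap<>();
        StringTokenizer tokenizer = new StringTokenizer(corpus, ASSUMED_DELIMITERS);
        while (tokenizer.hasMoreTokens()) {
            // merge() inserts 1.0 for a new word, otherwise adds 1.0 to the old count.
            frequencies.merge(tokenizer.nextToken(), 1.0, Double::sum);
        }
        return frequencies;
    }

    public static void main(String[] args) {
        Map<String, Double> frequencies = countWords("the cat saw the dog.");
        System.out.println("the -> " + frequencies.get("the"));
        System.out.println("unique words -> " + frequencies.size());
    }
}
```
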

Version:
$Revision: 1.8 $
Author:
Team Helium
See Also:
Serialized Form

Field Summary
static String ATTRIBUTE_NAME
          This constant represents the attribute name that should be used for the Entity whose EntityValue is this WordFrequencyMapEntityValue.
static String DELIMITERS
          Delimiters used when tokenizing a String corpus.
 
Constructor Summary
WordFrequencyMapEntityValue(String corpus)
           
 
Method Summary
 boolean contains(String word)
           
 boolean equals(Object object)
          Returns true if the given object is a WordFrequencyMapEntityValue whose underlying Map and corpus are both .equals() to those of this instance.
 double getFrequencyOf(String word)
           
 double getNumberOfUniqueWords()
           
 Map getSynchronizedWordFrequencyMap()
           
 double getTotalNumberOfWords()
          Returns a double representing the total number of words in this word frequency map.
 Map getWordFrequencyMap()
           
 Set getWordSet()
           
 int hashCode()
          Returns the hashCode() of the String corpus.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DELIMITERS

public static final String DELIMITERS
Delimiters used when tokenizing a String corpus.

See Also:
Constant Field Values

ATTRIBUTE_NAME

public static final String ATTRIBUTE_NAME
This constant represents the attribute name that should be used for the Entity whose EntityValue is this WordFrequencyMapEntityValue.

See Also:
Constant Field Values
Constructor Detail

WordFrequencyMapEntityValue

public WordFrequencyMapEntityValue(String corpus)
Method Detail

hashCode

public int hashCode()
Returns the hashCode() of the String corpus.

Specified by:
hashCode in interface PassiveEntityValue
Overrides:
hashCode in class Object

equals

public boolean equals(Object object)
Returns true if the given object is a WordFrequencyMapEntityValue whose underlying Map and corpus are both .equals() to those of this instance.

Specified by:
equals in interface PassiveEntityValue
Overrides:
equals in class Object
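
The documented hashCode() and equals() behavior can be sketched as follows. This is only an illustration under stated assumptions: the class CorpusValueSketch, its field names, and its whitespace-only tokenization are hypothetical stand-ins, not the real class. Because equal instances must have equal corpora, hashing the corpus alone keeps hashCode() consistent with equals().

```java
import java.util.HashMap;
import java.util.Map;

final class CorpusValueSketch {
    private final String corpus;                        // hypothetical field name
    private final Map<String, Double> frequencies = new HashMap<>();

    CorpusValueSketch(String corpus) {
        this.corpus = corpus;
        // Simplified tokenization (whitespace only), assumed for this sketch.
        for (String word : corpus.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                frequencies.merge(word, 1.0, Double::sum);
            }
        }
    }

    // Mirrors the documented behavior: the hash of the String corpus.
    @Override
    public int hashCode() {
        return corpus.hashCode();
    }

    // Equal only to another instance of the same type whose corpus and map are both equal.
    @Override
    public boolean equals(Object object) {
        if (this == object) {
            return true;
        }
        if (!(object instanceof CorpusValueSketch)) {
            return false;
        }
        CorpusValueSketch other = (CorpusValueSketch) object;
        return corpus.equals(other.corpus) && frequencies.equals(other.frequencies);
    }

    public static void main(String[] args) {
        CorpusValueSketch first = new CorpusValueSketch("a b a");
        CorpusValueSketch second = new CorpusValueSketch("a b a");
        System.out.println(first.equals(second));
        System.out.println(first.hashCode() == second.hashCode());
    }
}
```
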

getWordFrequencyMap

public Map getWordFrequencyMap()

getSynchronizedWordFrequencyMap

public Map getSynchronizedWordFrequencyMap()

getWordSet

public Set getWordSet()

contains

public boolean contains(String word)

getFrequencyOf

public double getFrequencyOf(String word)

getNumberOfUniqueWords

public double getNumberOfUniqueWords()

getTotalNumberOfWords

public double getTotalNumberOfWords()
Returns a double representing the total number of words in this word frequency map. This is the total count of all words in the map, not the number of unique words. A double is returned because these values are often used to calculate probabilities, and this spares the application an explicit cast.
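
To make the distinction concrete, the sketch below computes a total word count, a unique word count, and a word's relative frequency from a plain Map. It assumes the total is the sum of all per-word frequencies and the unique count is the number of keys. The class WordProbabilitySketch and its methods are hypothetical helpers, not part of this API.

```java
import java.util.HashMap;
import java.util.Map;

class WordProbabilitySketch {
    // Total number of words: the sum of every word's frequency (assumed semantics).
    static double totalWords(Map<String, Double> frequencies) {
        double total = 0.0;
        for (double count : frequencies.values()) {
            total += count;
        }
        return total;
    }

    // Number of unique words: one per key in the map.
    static double uniqueWords(Map<String, Double> frequencies) {
        return frequencies.size();
    }

    // Relative frequency of a word. Both operands are doubles, so no cast is
    // needed and the division does not truncate as int division would (2 / 5 == 0).
    static double probabilityOf(String word, Map<String, Double> frequencies) {
        double total = totalWords(frequencies);
        if (total == 0.0) {
            return 0.0; // empty map: avoid dividing by zero
        }
        return frequencies.getOrDefault(word, 0.0) / total;
    }

    // Frequencies for the sample corpus "the cat saw the dog".
    static Map<String, Double> sample() {
        Map<String, Double> frequencies = new HashMap<>();
        frequencies.put("the", 2.0);
        frequencies.put("cat", 1.0);
        frequencies.put("saw", 1.0);
        frequencies.put("dog", 1.0);
        return frequencies;
    }

    public static void main(String[] args) {
        Map<String, Double> frequencies = sample();
        System.out.println("total = " + totalWords(frequencies));
        System.out.println("unique = " + uniqueWords(frequencies));
        System.out.println("P(the) = " + probabilityOf("the", frequencies));
    }
}
```
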



See the Helium Website