Set the minimum allowable standard deviation for normal density calculation.
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz)
- See Also:
- Serialized Form
Constructor Summary
EM() | Constructor.
Method Summary
void | buildClusterer(Instances data) | Generates a clusterer.
double | densityForInstance(Instance inst) | Computes the density for a given instance.
double[] | distributionForInstance(Instance inst) | Predicts the cluster memberships for a given instance.
boolean | getDebug() | Get debug mode
int | getMaxIterations() | Get the maximum number of iterations
double | getMinStdDev() | Get the minimum allowable standard deviation.
int | getNumClusters() | Get the number of clusters
java.lang.String[] | getOptions() | Gets the current settings of EM.
int | getSeed() | Get the random number seed
java.lang.String | globalInfo() | Returns a string describing this clusterer
java.util.Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(java.lang.String[] argv) | Main method for testing this class.
java.lang.String | maxIterationsTipText() | Returns the tip text for this property
java.lang.String | minStdDevTipText() | Returns the tip text for this property
int | numberOfClusters() | Returns the number of clusters.
java.lang.String | numClustersTipText() | Returns the tip text for this property
protected void | resetOptions() | Reset to default options
java.lang.String | seedTipText() | Returns the tip text for this property
void | setDebug(boolean v) | Set debug mode - verbose output
void | setMaxIterations(int i) | Set the maximum number of iterations to perform
void | setMinStdDev(double m) | Set the minimum value for standard deviation when calculating normal density.
void | setNumClusters(int n) | Set the number of clusters (-1 to select by CV).
void | setOptions(java.lang.String[] options) | Parses a given list of options.
void | setSeed(int s) | Set the random number seed
java.lang.String | toString() | Outputs the generated clusters into a string.
protected double[] | weightsForInstance(Instance inst) | Returns the weights (indicating cluster membership) for a given instance
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
EM
public EM()
- Constructor.
globalInfo
public java.lang.String globalInfo()
- Returns a string describing this clusterer
- Returns:
- a description of the clusterer suitable for
displaying in the explorer/experimenter gui
listOptions
public java.util.Enumeration listOptions()
- Returns an enumeration describing the available options.
Valid options are:
-V
Verbose.
-N
Specify the number of clusters to generate. If omitted,
EM will use cross validation to select the number of clusters
automatically.
-I
Terminate after this many iterations if EM has not converged.
-S
Specify random number seed.
-M
Set the minimum allowable standard deviation for normal density
calculation.
- Specified by:
listOptions
in interface OptionHandler
- Returns:
- an enumeration of all the available options
setOptions
public void setOptions(java.lang.String[] options)
throws java.lang.Exception
- Parses a given list of options.
- Specified by:
setOptions
in interface OptionHandler
- Parameters:
options
- the list of options as an array of strings
- Throws:
java.lang.Exception
- if an option is not supported
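To make the flag layout concrete, the sketch below walks an options array of the shape described under listOptions and pulls out the five flags -V, -N, -I, -S and -M. It is a standalone illustration and not the parser behind setOptions: the class EmOptionsSketch, its fields, and the sample values are hypothetical, and no defaults are assumed.

```java
// Hypothetical stand-in for an options holder; not part of Weka.
final class EmOptionsSketch {
    boolean verbose;        // -V
    Integer numClusters;    // -N (null here means "not given")
    Integer maxIterations;  // -I
    Integer seed;           // -S
    Double minStdDev;       // -M

    /** Reads flag/value pairs from an options array; rejects unknown flags. */
    static EmOptionsSketch parse(String[] options) {
        EmOptionsSketch o = new EmOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            String flag = options[i];
            if (flag.equals("-V")) {
                o.verbose = true;   // the only flag that takes no value
                continue;
            }
            if (i + 1 >= options.length) {
                throw new IllegalArgumentException("Missing value for " + flag);
            }
            String value = options[++i];
            switch (flag) {
                case "-N": o.numClusters = Integer.valueOf(value); break;
                case "-I": o.maxIterations = Integer.valueOf(value); break;
                case "-S": o.seed = Integer.valueOf(value); break;
                case "-M": o.minStdDev = Double.valueOf(value); break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + flag);
            }
        }
        return o;
    }
}
```

An array such as {"-N", "3", "-I", "100", "-S", "42", "-M", "1e-6", "-V"} fills every field; leaving out -N leaves numClusters unset, which corresponds to the case where the number of clusters is selected by cross validation.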
minStdDevTipText
public java.lang.String minStdDevTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setMinStdDev
public void setMinStdDev(double m)
- Set the minimum value for standard deviation when calculating
normal density. Flooring the standard deviation can help prevent arithmetic
overflow resulting from multiplying large densities (arising from small
standard deviations) when there are many singleton or near singleton
values; a larger minimum caps those densities at a lower value.
- Parameters:
m
- minimum value for standard deviation
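The effect of the floor can be seen in a small standalone calculation. The sketch below is an illustration only, not code from this class: the class MinStdDevSketch, its method names, the floor of 1e-6, and the count of 30 attributes are all assumptions chosen for the example.

```java
// Hypothetical illustration of a standard-deviation floor; not part of Weka.
final class MinStdDevSketch {
    /** Normal density at x, with the standard deviation floored at minStdDev. */
    static double density(double x, double mean, double stdDev, double minStdDev) {
        double sd = Math.max(stdDev, minStdDev);
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    /** Multiplies the same per-attribute density numAttributes times. */
    static double product(double perAttributeDensity, int numAttributes) {
        double p = 1.0;
        for (int i = 0; i < numAttributes; i++) {
            p *= perAttributeDensity;
        }
        return p;
    }

    public static void main(String[] args) {
        double tiny = 1e-12;  // near-singleton attribute: almost no spread
        double loose = density(5.0, 5.0, tiny, 0.0);    // floor has no effect
        double capped = density(5.0, 5.0, tiny, 1e-6);  // floor of 1e-6 (illustrative)
        System.out.println("no floor:   " + product(loose, 30));
        System.out.println("with floor: " + product(capped, 30));
    }
}
```

At the mean, the density is 1 / (sd * sqrt(2 * pi)), so a standard deviation of 1e-12 gives a density near 4e11 and the product over 30 such attributes overflows a double to Infinity. With the floor at 1e-6 the per-attribute density is near 4e5 and the same product stays finite.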
getMinStdDev
public double getMinStdDev()
- Get the minimum allowable standard deviation.
- Returns:
- the minimum allowable standard deviation
seedTipText
public java.lang.String seedTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setSeed
public void setSeed(int s)
- Set the random number seed
- Parameters:
s
- the seed
getSeed
public int getSeed()
- Get the random number seed
- Returns:
- the seed
numClustersTipText
public java.lang.String numClustersTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setNumClusters
public void setNumClusters(int n)
throws java.lang.Exception
- Set the number of clusters (-1 to select by CV).
- Parameters:
n
- the number of clusters
- Throws:
java.lang.Exception
- if n is 0
getNumClusters
public int getNumClusters()
- Get the number of clusters
- Returns:
- the number of clusters.
maxIterationsTipText
public java.lang.String maxIterationsTipText()
- Returns the tip text for this property
- Returns:
- tip text for this property suitable for
displaying in the explorer/experimenter gui
setMaxIterations
public void setMaxIterations(int i)
throws java.lang.Exception
- Set the maximum number of iterations to perform
- Parameters:
i
- the number of iterations
- Throws:
java.lang.Exception
- if i is less than 1
getMaxIterations
public int getMaxIterations()
- Get the maximum number of iterations
- Returns:
- the number of iterations
setDebug
public void setDebug(boolean v)
- Set debug mode - verbose output
- Parameters:
v
- true for verbose output
getDebug
public boolean getDebug()
- Get debug mode
- Returns:
- true if debug mode is set
getOptions
public java.lang.String[] getOptions()
- Gets the current settings of EM.
- Specified by:
getOptions
in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions()
resetOptions
protected void resetOptions()
- Reset to default options
toString
public java.lang.String toString()
- Outputs the generated clusters into a string.
- Overrides:
toString
in class java.lang.Object
numberOfClusters
public int numberOfClusters()
throws java.lang.Exception
- Returns the number of clusters.
- Overrides:
numberOfClusters
in class Clusterer
- Returns:
- the number of clusters generated for a training dataset.
- Throws:
java.lang.Exception
- if number of clusters could not be returned
successfully
buildClusterer
public void buildClusterer(Instances data)
throws java.lang.Exception
- Generates a clusterer. Has to initialize all fields of the clusterer
that are not being set via options.
- Overrides:
buildClusterer
in class Clusterer
- Parameters:
data
- set of instances serving as training data
- Throws:
java.lang.Exception
- if the clusterer has not been
generated successfully
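How the seed, the iteration limit, and the minimum standard deviation interact during training can be sketched with a generic textbook EM loop. The code below fits a mixture of one-dimensional normal distributions and is not the code behind buildClusterer: the class EmLoopSketch, the convergence tolerance of 1e-6, and the choice to start every cluster at a randomly drawn data point are assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical one-dimensional EM loop; not the implementation of this class.
final class EmLoopSketch {
    double[] priors;
    double[] means;
    double[] stdDevs;
    int iterationsRun;

    static double normal(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    void fit(double[] data, int k, int maxIterations, int seed, double minStdDev) {
        int n = data.length;
        // Overall spread of the data, used as every cluster's starting width.
        double mean = 0.0;
        for (double x : data) mean += x / n;
        double var = 0.0;
        for (double x : data) var += (x - mean) * (x - mean) / n;
        double startSd = Math.max(Math.sqrt(var), minStdDev);

        Random rnd = new Random(seed);  // same seed, same starting points
        priors = new double[k];
        means = new double[k];
        stdDevs = new double[k];
        for (int c = 0; c < k; c++) {
            priors[c] = 1.0 / k;
            means[c] = data[rnd.nextInt(n)];
            stdDevs[c] = startSd;
        }

        double[][] w = new double[n][k];
        double previous = Double.NEGATIVE_INFINITY;
        iterationsRun = 0;
        while (iterationsRun < maxIterations) {
            iterationsRun++;
            double logLik = 0.0;
            for (int i = 0; i < n; i++) {  // E step: membership weights
                double total = 0.0;
                for (int c = 0; c < k; c++) {
                    w[i][c] = priors[c] * normal(data[i], means[c], stdDevs[c]);
                    total += w[i][c];
                }
                for (int c = 0; c < k; c++) w[i][c] /= total;
                logLik += Math.log(total);
            }
            for (int c = 0; c < k; c++) {  // M step: re-estimate parameters
                double sumW = 0.0, sumX = 0.0, sumSq = 0.0;
                for (int i = 0; i < n; i++) {
                    sumW += w[i][c];
                    sumX += w[i][c] * data[i];
                }
                means[c] = sumX / sumW;
                for (int i = 0; i < n; i++) {
                    double d = data[i] - means[c];
                    sumSq += w[i][c] * d * d;
                }
                stdDevs[c] = Math.max(Math.sqrt(sumSq / sumW), minStdDev);
                priors[c] = sumW / n;
            }
            if (logLik - previous < 1e-6) break;  // converged (assumed tolerance)
            previous = logLik;
        }
    }
}
```

The loop stops either when the log likelihood stops improving or when maxIterations is reached, whichever comes first, and every re-estimated standard deviation is clamped to minStdDev before the next pass.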
densityForInstance
public double densityForInstance(Instance inst)
throws java.lang.Exception
- Computes the density for a given instance.
- Overrides:
densityForInstance
in class DistributionClusterer
- Parameters:
inst
- the instance to compute the density for
- Returns:
- the density.
- Throws:
java.lang.Exception
- if the density could not be computed
successfully
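One common reading of "density" for a mixture model is the prior-weighted sum of the per-cluster densities. The sketch below computes that quantity for a single numeric value; it is an assumption about the general technique, not a statement of what densityForInstance does internally, and the class MixtureDensitySketch and its parameters are hypothetical.

```java
// Hypothetical mixture density for one numeric value; not part of Weka.
final class MixtureDensitySketch {
    static double normal(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    /** Sum over clusters of prior[c] times the cluster's normal density at x. */
    static double density(double x, double[] priors, double[] means, double[] stdDevs) {
        double total = 0.0;
        for (int c = 0; c < priors.length; c++) {
            total += priors[c] * normal(x, means[c], stdDevs[c]);
        }
        return total;
    }
}
```

Note that a density is not a probability: for a narrow cluster the value can exceed 1, which is why the standard-deviation floor matters when many such values are multiplied.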
distributionForInstance
public double[] distributionForInstance(Instance inst)
throws java.lang.Exception
- Predicts the cluster memberships for a given instance.
- Overrides:
distributionForInstance
in class DistributionClusterer
- Parameters:
inst
- the instance to be assigned a cluster
- Returns:
- an array containing the estimated membership
probabilities of the test instance in each cluster (this
should sum to at most 1)
- Throws:
java.lang.Exception
- if distribution could not be
computed successfully
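Membership probabilities of this kind are typically obtained by normalizing per-cluster weights so they sum to 1. The sketch below shows that step in isolation; the class MembershipSketch and its methods are hypothetical, and returning all zeros when every weight is zero is an assumption made here so that the result still sums to at most 1.

```java
// Hypothetical normalization of cluster weights; not part of Weka.
final class MembershipSketch {
    /** Scales non-negative weights so they sum to 1 (or stay all zero). */
    static double[] normalize(double[] weights) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        double[] probs = new double[weights.length];
        if (total <= 0.0) {
            return probs;  // no evidence for any cluster: all zeros
        }
        for (int i = 0; i < weights.length; i++) {
            probs[i] = weights[i] / total;
        }
        return probs;
    }

    /** Index of the cluster with the highest membership probability. */
    static int mostLikely(double[] probs) {
        int best = 0;
        for (int i = 1; i < probs.length; i++) {
            if (probs[i] > probs[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

For weights {2.0, 6.0} the result is {0.25, 0.75}, and mostLikely picks the second cluster (index 1).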
weightsForInstance
protected double[] weightsForInstance(Instance inst)
throws java.lang.Exception
- Returns the weights (indicating cluster membership) for a given instance
- Parameters:
inst
- the instance to be assigned a cluster
- Returns:
- an array of weights
- Throws:
java.lang.Exception
- if weights could not be computed
main
public static void main(java.lang.String[] argv)
- Main method for testing this class.
- Parameters:
argv
- should contain the following arguments:
-t training file [-T test file] [-N number of clusters] [-S random seed]