weka.classifiers
Class NaiveBayes

java.lang.Object
  |
  +--weka.classifiers.Classifier
        |
        +--weka.classifiers.DistributionClassifier
              |
              +--weka.classifiers.NaiveBayes
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable, WeightedInstancesHandler

public class NaiveBayes
extends DistributionClassifier
implements OptionHandler, WeightedInstancesHandler

Class for a Naive Bayes classifier using estimator classes. Numeric estimator precision values are chosen based on analysis of the training data. For this reason, the classifier is not an UpdateableClassifier (updateable classifiers are typically initialized with zero training instances). If you need the UpdateableClassifier functionality, create an empty class such as the following:


 public class NaiveBayesUpdateable extends NaiveBayes 
     implements UpdateableClassifier {

 }
 
This classifier will use a default precision of 0.1 for numeric attributes when buildClassifier is called with zero training instances.
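The description above says only that the precision is derived from the training data and falls back to 0.1 when there is none. As a minimal sketch of how such a value might be derived, the hypothetical helper below (PrecisionSketch and precisionFor are illustrative names, not part of this class) takes the average gap between consecutive distinct sorted values. That heuristic is an assumption, and the sketch also assumes the array contains no missing values.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Weka: derives a precision for one numeric attribute. */
final class PrecisionSketch {

    /** Fallback used when no spacing can be measured (the documented default of 0.1). */
    static final double DEFAULT_NUM_PRECISION = 0.1;

    /**
     * Returns the mean gap between consecutive distinct values, or the default
     * when there are fewer than two distinct values. The heuristic is an assumption.
     */
    static double precisionFor(double[] values) {
        double[] sorted = values.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        double deltaSum = 0.0;
        int distinctGaps = 0;
        for (int i = 1; i < sorted.length; i++) {
            if (sorted[i] != sorted[i - 1]) {
                deltaSum += sorted[i] - sorted[i - 1];
                distinctGaps++;
            }
        }
        return distinctGaps > 0 ? deltaSum / distinctGaps : DEFAULT_NUM_PRECISION;
    }

    public static void main(String[] args) {
        System.out.println(precisionFor(new double[] {1.0, 1.5, 1.5, 3.0}));
        System.out.println(precisionFor(new double[0]));
    }
}
```

With zero training values there is no gap to measure, so the sketch returns the default, which matches the behaviour described for buildClassifier.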

For more information on Naive Bayes classifiers, see

George H. John and Pat Langley (1995). Estimating Continuous Distributions in Bayesian Classifiers. Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence. pp. 338-345. Morgan Kaufmann, San Mateo.

Valid options are:

-K
Use kernel estimation for modelling numeric attributes rather than a single normal distribution.
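To illustrate what the -K option changes, the following sketch contrasts a single fitted normal distribution with a Gaussian kernel density estimate. DensitySketch, its method names, and the explicit bandwidth parameter are assumptions made for illustration; they are not the estimator classes this classifier uses. The sketch assumes a non-empty sample.

```java
/** Hypothetical illustration, not part of Weka: two ways to model a numeric attribute. */
final class DensitySketch {

    /** Gaussian probability density with the given mean and standard deviation. */
    static double gaussian(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    /** Density at x under one normal distribution fitted to the whole sample. */
    static double normalDensity(double[] sample, double x) {
        double mean = 0.0;
        for (double v : sample) {
            mean += v;
        }
        mean /= sample.length;
        double variance = 0.0;
        for (double v : sample) {
            variance += (v - mean) * (v - mean);
        }
        variance /= sample.length;
        // Floor the deviation so a constant sample does not divide by zero.
        double sd = Math.max(Math.sqrt(variance), 1e-6);
        return gaussian(x, mean, sd);
    }

    /** Density at x as the average of Gaussians centred on each sample value. */
    static double kernelDensity(double[] sample, double x, double bandwidth) {
        double sum = 0.0;
        for (double v : sample) {
            sum += gaussian(x, v, bandwidth);
        }
        return sum / sample.length;
    }

    public static void main(String[] args) {
        double[] bimodal = {-0.1, 0.0, 0.1, 9.9, 10.0, 10.1};
        System.out.println("normal at 5: " + normalDensity(bimodal, 5.0));
        System.out.println("kernel at 5: " + kernelDensity(bimodal, 5.0, 0.5));
    }
}
```

On a bimodal sample like the one in main, the single normal places most of its mass between the two clusters, where no data lies, while the kernel estimate follows the clusters. This is the kind of attribute for which kernel estimation may be worth trying.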

Author:
Len Trigg (trigg@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
protected static double DEFAULT_NUM_PRECISION
          The precision parameter used for numeric attributes
protected  Estimator m_ClassDistribution
          The class estimator.
protected  Estimator[][] m_Distributions
          The attribute estimators.
protected  Instances m_Instances
          The dataset header for the purposes of printing out a semi-intelligible model
protected  int m_NumClasses
          The number of classes (or 1 for numeric class)
protected  boolean m_UseKernelEstimator
          Whether to use a kernel density estimator rather than a normal distribution for numeric attributes
 
Constructor Summary
NaiveBayes()
           
 
Method Summary
 void buildClassifier(Instances instances)
          Generates the classifier.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 java.lang.String[] getOptions()
          Gets the current settings of the classifier.
 boolean getUseKernelEstimator()
          Gets whether the kernel estimator is being used.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options
static void main(java.lang.String[] argv)
          Main method for testing this class.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setUseKernelEstimator(boolean v)
          Sets whether the kernel estimator is to be used.
 java.lang.String toString()
          Returns a description of the classifier.
 void updateClassifier(Instance instance)
          Updates the classifier with the given instance.
 
Methods inherited from class weka.classifiers.DistributionClassifier
classifyInstance
 
Methods inherited from class weka.classifiers.Classifier
forName, makeCopies
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_Distributions

protected Estimator[][] m_Distributions
The attribute estimators.

m_ClassDistribution

protected Estimator m_ClassDistribution
The class estimator.

m_UseKernelEstimator

protected boolean m_UseKernelEstimator
Whether to use a kernel density estimator rather than a normal distribution for numeric attributes

m_NumClasses

protected int m_NumClasses
The number of classes (or 1 for numeric class)

m_Instances

protected Instances m_Instances
The dataset header for the purposes of printing out a semi-intelligible model

DEFAULT_NUM_PRECISION

protected static final double DEFAULT_NUM_PRECISION
The precision parameter used for numeric attributes

Constructor Detail

NaiveBayes

public NaiveBayes()

Method Detail

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.
Overrides:
buildClassifier in class Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

updateClassifier

public void updateClassifier(Instance instance)
                      throws java.lang.Exception
Updates the classifier with the given instance.
Parameters:
instance - the new training instance to include in the model
Throws:
java.lang.Exception - if the instance could not be incorporated in the model.
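Because the class implements WeightedInstancesHandler, an incremental update presumably folds each instance's weight into the relevant estimators. A minimal sketch of that idea for a nominal attribute follows. DiscreteEstimatorSketch is a hypothetical stand-in, and the add-one initial counts are an assumption rather than documented behaviour.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a nominal-attribute estimator; not the Weka class. */
final class DiscreteEstimatorSketch {
    private final double[] counts;
    private double total;

    /** Starts every symbol at a count of 1 (add-one smoothing, an assumption). */
    DiscreteEstimatorSketch(int numSymbols) {
        counts = new double[numSymbols];
        Arrays.fill(counts, 1.0);
        total = numSymbols;
    }

    /** Incorporates one observation, scaled by the instance weight. */
    void addValue(int symbol, double weight) {
        counts[symbol] += weight;
        total += weight;
    }

    /** Current smoothed probability estimate for the symbol. */
    double getProbability(int symbol) {
        return counts[symbol] / total;
    }

    public static void main(String[] args) {
        DiscreteEstimatorSketch estimator = new DiscreteEstimatorSketch(2);
        estimator.addValue(0, 2.0);   // one instance of symbol 0 with weight 2
        System.out.println(estimator.getProbability(0));
    }
}
```

An update of this form needs only the running counts, not the earlier training instances, which is what makes a per-instance update method cheap.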

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.
Overrides:
distributionForInstance in class DistributionClassifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if there is a problem generating the prediction
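For readers unfamiliar with the underlying calculation, a Naive Bayes class distribution is the class prior multiplied by the per-attribute likelihoods, normalised to sum to one. The sketch below shows that arithmetic in log space. PosteriorSketch and its parameters are hypothetical names, and the code is not taken from this class. It assumes every prior and likelihood is strictly positive.

```java
/** Hypothetical illustration of the Naive Bayes posterior; not the Weka implementation. */
final class PosteriorSketch {

    /**
     * @param priors      priors[c] is the prior probability of class c
     * @param likelihoods likelihoods[c][a] is the probability (or density) of the
     *                    observed value of attribute a given class c
     * @return normalised class membership probabilities
     */
    static double[] distribution(double[] priors, double[][] likelihoods) {
        int numClasses = priors.length;
        double[] logScores = new double[numClasses];
        double max = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < numClasses; c++) {
            double logScore = Math.log(priors[c]);
            for (double likelihood : likelihoods[c]) {
                logScore += Math.log(likelihood);   // independence assumption: product of terms
            }
            logScores[c] = logScore;
            max = Math.max(max, logScore);
        }
        double[] probs = new double[numClasses];
        double sum = 0.0;
        for (int c = 0; c < numClasses; c++) {
            probs[c] = Math.exp(logScores[c] - max); // subtract the max to avoid underflow
            sum += probs[c];
        }
        for (int c = 0; c < numClasses; c++) {
            probs[c] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] probs = distribution(
                new double[] {0.5, 0.5},
                new double[][] {{0.2, 0.5}, {0.1, 0.5}});
        System.out.println(probs[0] + " " + probs[1]);
    }
}
```

Working in log space is a design choice for this sketch: with many attributes, a direct product of small likelihoods can underflow to zero before normalisation.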

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options
Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-K
Use kernel estimation for modelling numeric attributes rather than a single normal distribution.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the classifier.
Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions
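The setOptions and getOptions contracts together imply a round trip: whatever getOptions returns can be passed back to setOptions to reproduce the same configuration. The sketch below mimics that contract for the single -K flag. OptionSketch is a hypothetical class, and rejecting any unrecognised string with IllegalArgumentException is an assumption based on the documented "if an option is not supported" clause.

```java
/** Hypothetical mimic of the option-handling contract; not the Weka implementation. */
final class OptionSketch {
    private boolean useKernelEstimator;

    /** Parses the option list; -K switches on kernel estimation, its absence switches it off. */
    void setOptions(String[] options) {
        boolean kernel = false;
        for (String option : options) {
            if ("-K".equals(option)) {
                kernel = true;
            } else {
                // Assumption: unknown options are rejected outright.
                throw new IllegalArgumentException("Unsupported option: " + option);
            }
        }
        useKernelEstimator = kernel;
    }

    /** Returns the current settings in a form suitable for passing to setOptions. */
    String[] getOptions() {
        return useKernelEstimator ? new String[] {"-K"} : new String[0];
    }

    boolean getUseKernelEstimator() {
        return useKernelEstimator;
    }

    public static void main(String[] args) {
        OptionSketch original = new OptionSketch();
        original.setOptions(new String[] {"-K"});
        OptionSketch copy = new OptionSketch();
        copy.setOptions(original.getOptions());   // round trip
        System.out.println(copy.getUseKernelEstimator());
    }
}
```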

toString

public java.lang.String toString()
Returns a description of the classifier.
Overrides:
toString in class java.lang.Object
Returns:
a description of the classifier as a string.

getUseKernelEstimator

public boolean getUseKernelEstimator()
Gets whether the kernel estimator is being used.
Returns:
Value of m_UseKernelEstimator.

setUseKernelEstimator

public void setUseKernelEstimator(boolean v)
Sets whether the kernel estimator is to be used.
Parameters:
v - Value to assign to m_UseKernelEstimator.

main

public static void main(java.lang.String[] argv)
Main method for testing this class.
Parameters:
argv - the options