weka.classifiers
Class Logistic

java.lang.Object
  |
  +--weka.classifiers.Classifier
        |
        +--weka.classifiers.DistributionClassifier
              |
              +--weka.classifiers.Logistic
All Implemented Interfaces:
java.lang.Cloneable, OptionHandler, java.io.Serializable

public class Logistic
extends DistributionClassifier
implements OptionHandler

Class for building and using a two-class logistic regression model with a ridge estimator.

This class uses a globally convergent Newton's method adapted from Numerical Recipes in C. Reference: le Cessie, S. and van Houwelingen, J.C. (1992). Ridge Estimators in Logistic Regression. Applied Statistics, Vol. 41, No. 1, pp. 191-201.

Missing values are replaced using a ReplaceMissingValuesFilter, and nominal attributes are transformed into numeric attributes using a NominalToBinaryFilter.
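
The two preprocessing steps can be pictured with a small stand-alone sketch. It does not use the Weka filters themselves: the class PreprocessSketch and its methods are hypothetical, NaN stands in for a missing value, and the sketch assumes mean imputation and one 0/1 indicator column per nominal value. The actual filters may behave differently.

```java
final class PreprocessSketch {

    /** Replaces NaN entries (standing in for missing values) with the mean of the observed entries. */
    static double[] replaceMissingWithMean(double[] column) {
        double sum = 0.0;
        int count = 0;
        for (double v : column) {
            if (!Double.isNaN(v)) {
                sum += v;
                count++;
            }
        }
        // Fall back to 0 when every value is missing (an arbitrary choice for this sketch).
        double mean = count > 0 ? sum / count : 0.0;
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            out[i] = Double.isNaN(column[i]) ? mean : column[i];
        }
        return out;
    }

    /** Encodes nominal value indices as rows of 0/1 indicators, one column per nominal value. */
    static double[][] nominalToBinary(int[] valueIndices, int numValues) {
        double[][] out = new double[valueIndices.length][numValues];
        for (int i = 0; i < valueIndices.length; i++) {
            out[i][valueIndices[i]] = 1.0;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] filled = replaceMissingWithMean(new double[] {1.0, Double.NaN, 3.0});
        System.out.println(java.util.Arrays.toString(filled));
        double[][] encoded = nominalToBinary(new int[] {0, 2, 1}, 3);
        System.out.println(java.util.Arrays.deepToString(encoded));
    }
}
```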

Valid options are:

-D
Turn on debugging output.

Author:
Len Trigg (trigg@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz), Tony Voyle (tv6@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
protected  int m_ClassIndex
          The index of the class attribute
protected  boolean m_Debug
          Debugging output
protected  double m_LL
          The log-likelihood of the built model
protected  double m_LLn
          The log-likelihood of the null model
protected  int m_NumPredictors
          The number of attributes in the model
protected  double[] m_Par
          The coefficients of the model
protected  double m_Ridge
          The ridge parameter.
 
Constructor Summary
Logistic()
           
 
Method Summary
 void buildClassifier(Instances train)
          Builds the classifier
protected  double calculateLogLikelihood(double[][] X, double[] Y, Matrix jacobian, double[] deltas)
          Calculates the log likelihood of the current set of coefficients (stored in m_Par), given the data.
 double[] distributionForInstance(Instance instance)
          Computes the distribution for a given instance
protected  double evaluateProbability(double[] instDat)
          Evaluate the probability for this point using the current coefficients
 boolean getDebug()
          Gets whether debugging output will be printed.
 java.lang.String[] getOptions()
          Gets the current settings of the classifier.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options
 void lnsrch(int n, double[] xold, double fold, double[] g, double[] p, double[] x, double stpmax, double[][] X, double[] Y)
          Finds a new point x in the direction p from a point xold at which the value of the function has decreased sufficiently.
static void main(java.lang.String[] argv)
          Main method for testing this class.
protected static double Norm(double z)
          Returns probability.
 void setDebug(boolean debug)
          Sets whether debugging output will be printed.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 java.lang.String toString()
          Gets a string describing the classifier.
 
Methods inherited from class weka.classifiers.DistributionClassifier
classifyInstance
 
Methods inherited from class weka.classifiers.Classifier
forName, makeCopies
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_LL

protected double m_LL
The log-likelihood of the built model

m_LLn

protected double m_LLn
The log-likelihood of the null model

m_Par

protected double[] m_Par
The coefficients of the model

m_NumPredictors

protected int m_NumPredictors
The number of attributes in the model

m_ClassIndex

protected int m_ClassIndex
The index of the class attribute

m_Ridge

protected double m_Ridge
The ridge parameter.

m_Debug

protected boolean m_Debug
Debugging output

Constructor Detail

Logistic

public Logistic()

Method Detail

lnsrch

public void lnsrch(int n,
                   double[] xold,
                   double fold,
                   double[] g,
                   double[] p,
                   double[] x,
                   double stpmax,
                   double[][] X,
                   double[] Y)
            throws java.lang.Exception
Finds a new point x in the direction p from a point xold at which the value of the function has decreased sufficiently.
Parameters:
n - number of variables
xold - old point
fold - value at that point
g - gradient at that point
p - direction
x - new value along direction p from xold
stpmax - maximum step length
X - instance data
Y - class values
Throws:
java.lang.Exception - if an error occurs
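
The idea of a "sufficient decrease" along a search direction can be sketched as a backtracking line search. This is a simplified illustration, not the method's actual code: the class LineSearchSketch is hypothetical, the objective is passed in as a function instead of being computed from X and Y, and the step is simply halved where the Numerical Recipes routine interpolates. The constant ALPHA = 1e-4 is an assumed value for the sufficient-decrease test.

```java
import java.util.function.ToDoubleFunction;

final class LineSearchSketch {

    /** Sufficient-decrease constant (assumed value for this sketch). */
    static final double ALPHA = 1e-4;

    /**
     * Writes into x a point xold + lambda * p whose function value satisfies
     * f(x) <= fold + ALPHA * lambda * (g . p), and returns that function value.
     */
    static double backtrack(double[] xold, double fold, double[] g, double[] p,
                            double[] x, double stpmax, ToDoubleFunction<double[]> f) {
        int n = xold.length;

        // Shrink the direction if the full step would exceed the maximum step length.
        double norm = 0.0;
        for (int i = 0; i < n; i++) {
            norm += p[i] * p[i];
        }
        norm = Math.sqrt(norm);
        double scale = norm > stpmax ? stpmax / norm : 1.0;

        // Directional derivative along the (scaled) direction; must be negative for descent.
        double slope = 0.0;
        for (int i = 0; i < n; i++) {
            slope += g[i] * p[i] * scale;
        }
        if (slope >= 0.0) {
            throw new IllegalArgumentException("p is not a descent direction");
        }

        double lambda = 1.0;
        while (lambda > 1e-10) {
            for (int i = 0; i < n; i++) {
                x[i] = xold[i] + lambda * scale * p[i];
            }
            double fx = f.applyAsDouble(x);
            if (fx <= fold + ALPHA * lambda * slope) {
                return fx; // decreased sufficiently
            }
            lambda *= 0.5; // otherwise try a shorter step
        }

        // No acceptable step found: stay at the old point.
        System.arraycopy(xold, 0, x, 0, n);
        return fold;
    }

    public static void main(String[] args) {
        double[] x = new double[1];
        double fx = backtrack(new double[] {3.0}, 9.0, new double[] {6.0},
                new double[] {-6.0}, x, 100.0, v -> v[0] * v[0]);
        System.out.println("x = " + x[0] + ", f(x) = " + fx);
    }
}
```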

Norm

protected static double Norm(double z)
Returns probability.

evaluateProbability

protected double evaluateProbability(double[] instDat)
Evaluate the probability for this point using the current coefficients
Parameters:
instDat - the instance data
Returns:
the probability for this instance
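
A minimal sketch of how such a probability is typically obtained in logistic regression: a linear combination of the coefficients and the instance values is passed through the logistic function. The class ProbabilitySketch is hypothetical, and the coefficient layout (intercept at index 0, followed by one coefficient per predictor) is an assumption, not something this page specifies.

```java
final class ProbabilitySketch {

    /** Standard logistic function, mapping any real z to a value in (0, 1). */
    static double logistic(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /**
     * Evaluates the probability for one instance.
     * Assumed layout: par[0] is the intercept and par[k + 1] multiplies instDat[k].
     */
    static double evaluateProbability(double[] par, double[] instDat) {
        double z = par[0];
        for (int k = 0; k < instDat.length; k++) {
            z += par[k + 1] * instDat[k];
        }
        return logistic(z);
    }

    public static void main(String[] args) {
        double[] par = {1.0, 2.0};
        System.out.println(evaluateProbability(par, new double[] {-0.5}));
        System.out.println(evaluateProbability(par, new double[] {2.0}));
    }
}
```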

calculateLogLikelihood

protected double calculateLogLikelihood(double[][] X,
                                        double[] Y,
                                        Matrix jacobian,
                                        double[] deltas)
Calculates the log likelihood of the current set of coefficients (stored in m_Par), given the data.
Parameters:
X - the instance data
Y - the class values for each instance
jacobian - the matrix that will contain the Jacobian after the method returns
deltas - an array which will contain the parameter adjustments after the method returns
Returns:
the log likelihood of the data.
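
For orientation, the textbook ridge-penalized log-likelihood of a two-class logistic model can be sketched as follows. This is an illustration under stated assumptions rather than the method's implementation: the class RidgeLogLikelihoodSketch is hypothetical, it neither fills in a Jacobian nor computes parameter adjustments, the intercept is assumed to sit at index 0 and to be left unpenalized, and the sign and scaling conventions of the actual method may differ.

```java
final class RidgeLogLikelihoodSketch {

    /**
     * Returns sum_i [ y_i * log(p_i) + (1 - y_i) * log(1 - p_i) ] - ridge * sum_{k >= 1} par[k]^2,
     * where p_i is the logistic function of par[0] + sum_k par[k + 1] * X[i][k].
     */
    static double logLikelihood(double[][] X, double[] Y, double[] par, double ridge) {
        double ll = 0.0;
        for (int i = 0; i < X.length; i++) {
            double z = par[0];
            for (int k = 0; k < X[i].length; k++) {
                z += par[k + 1] * X[i][k];
            }
            double p = 1.0 / (1.0 + Math.exp(-z));
            ll += (Y[i] == 1.0) ? Math.log(p) : Math.log(1.0 - p);
        }
        // Ridge penalty on the non-intercept coefficients (assumed convention).
        double penalty = 0.0;
        for (int k = 1; k < par.length; k++) {
            penalty += par[k] * par[k];
        }
        return ll - ridge * penalty;
    }

    public static void main(String[] args) {
        double[][] X = {{1.0}, {2.0}, {3.0}, {4.0}};
        double[] Y = {0.0, 0.0, 1.0, 1.0};
        System.out.println(logLikelihood(X, Y, new double[] {0.0, 0.0}, 1e-8));
        System.out.println(logLikelihood(X, Y, new double[] {-2.5, 1.0}, 1e-8));
    }
}
```

With all coefficients at zero every instance has probability 0.5, so the value reduces to n * log(0.5); this corresponds to a model that ignores the predictors.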

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options
Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-D
Turn on debugging output.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the classifier.
Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions
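
The round trip between getOptions and setOptions can be sketched without the Weka classes. The class DebugOptionSketch below is hypothetical and handles only the -D flag documented on this page; how the real class treats empty or unrecognized entries is an assumption here.

```java
final class DebugOptionSketch {

    private boolean debug;

    /** Parses the option list; "-D" turns debugging output on. */
    void setOptions(String[] options) {
        boolean sawDebug = false;
        for (String option : options) {
            if ("-D".equals(option)) {
                sawDebug = true;
            } else if (!option.isEmpty()) {
                // Assumed behaviour: reject anything that is not a known flag.
                throw new IllegalArgumentException("Unsupported option: " + option);
            }
        }
        debug = sawDebug;
    }

    /** Returns the current settings in a form that setOptions accepts. */
    String[] getOptions() {
        return debug ? new String[] {"-D"} : new String[0];
    }

    boolean getDebug() {
        return debug;
    }

    public static void main(String[] args) {
        DebugOptionSketch first = new DebugOptionSketch();
        first.setOptions(new String[] {"-D"});

        // Copy the settings to a second object through the option array.
        DebugOptionSketch second = new DebugOptionSketch();
        second.setOptions(first.getOptions());
        System.out.println(second.getDebug());
    }
}
```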

setDebug

public void setDebug(boolean debug)
Sets whether debugging output will be printed.
Parameters:
debug - true if debugging output should be printed

getDebug

public boolean getDebug()
Gets whether debugging output will be printed.
Returns:
true if debugging output will be printed

buildClassifier

public void buildClassifier(Instances train)
                     throws java.lang.Exception
Builds the classifier
Overrides:
buildClassifier in class Classifier
Parameters:
train - the training data to be used for generating the logistic regression model.
Throws:
java.lang.Exception - if the classifier could not be built successfully

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Computes the distribution for a given instance
Overrides:
distributionForInstance in class DistributionClassifier
Parameters:
instance - the instance for which distribution is computed
Returns:
the distribution
Throws:
java.lang.Exception - if the distribution can't be computed successfully
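
For a two-class model, the distribution has two entries that sum to one. The sketch below shows one way to build it from a single probability and to pick the predicted class from it. The class DistributionSketch is hypothetical, and placing the modelled probability at index 1 is an assumption about class ordering that this page does not confirm.

```java
final class DistributionSketch {

    /** Builds a two-class distribution from the probability p of the class at index 1 (assumed ordering). */
    static double[] distribution(double p) {
        return new double[] {1.0 - p, p};
    }

    /** Returns the index of the most probable class; ties go to the lower index. */
    static int mostLikelyClass(double[] dist) {
        int best = 0;
        for (int i = 1; i < dist.length; i++) {
            if (dist[i] > dist[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] dist = distribution(0.8);
        System.out.println(java.util.Arrays.toString(dist));
        System.out.println(mostLikelyClass(dist));
    }
}
```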

toString

public java.lang.String toString()
Gets a string describing the classifier.
Overrides:
toString in class java.lang.Object
Returns:
a string describing the classifier built.

main

public static void main(java.lang.String[] argv)
Main method for testing this class.
Parameters:
argv - should contain the command line arguments to the scheme (see Evaluation)