weka.filters
Class InstanceFilter

java.lang.Object
  |
  +--weka.filters.Filter
        |
        +--weka.filters.InstanceFilter
All Implemented Interfaces:
OptionHandler, java.io.Serializable

public class InstanceFilter
extends Filter
implements OptionHandler

Filters instances according to the value of an attribute.

Valid filter-specific options are:

-C num
Choose attribute to be used for selection (default last).

-S num
Numeric value to be used for selection on a numeric attribute. Instances with values smaller than the given value will be selected. (default 0)

-L index1,index2-index4,...
Range of label indices to be used for selection on a nominal attribute. The keywords first and last are valid indexes. (default all values)

-M
Missing values count as a match. This setting is independent of the -V option. (default missing values don't match)

-V
Invert matching sense.

-H
When selecting on nominal attributes, removes header references to excluded values.
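The interaction of -S, -M and -V on a numeric attribute can be sketched as a single predicate. This is a minimal model of the option descriptions above, not Weka's source code: the class and method names are hypothetical, Double.NaN stands in for a missing value, and "independent of the -V option" is read here as "inversion is not applied to missing values".

```java
// Illustrative model only; names are hypothetical and this is not Weka's implementation.
final class NumericSelectionSketch {

    /**
     * Decides whether a numeric value is selected.
     *
     * @param value        attribute value; Double.NaN marks "missing" (assumption)
     * @param splitPoint   the value given with -S
     * @param matchMissing true if -M is set
     * @param invert       true if -V is set
     */
    static boolean isSelected(double value, double splitPoint,
                              boolean matchMissing, boolean invert) {
        if (Double.isNaN(value)) {
            // -M is documented as independent of -V, so no inversion here.
            return matchMissing;
        }
        // "Instances with values smaller than the given value will be selected."
        boolean matches = value < splitPoint;
        return invert ? !matches : matches;
    }
}
```

With a split point of 2.0, a value of 1.0 is selected and a value of 2.0 is not; setting the invert flag swaps those two outcomes but leaves the treatment of missing values unchanged.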

Author:
Eibe Frank (eibe@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
protected  int m_Attribute
          Stores which attribute is to be used for filtering
protected  int m_AttributeSet
          Stores the attribute setting
protected  boolean m_Inverse
          Inverse of test to be used?
protected  boolean m_MatchMissingValues
          True if missing values should count as a match
protected  boolean m_ModifyHeader
          Modify header for nominal attributes?
protected  int[] m_NominalMapping
          If m_ModifyHeader, stores a mapping from old to new indexes
protected  double m_Value
          Stores which value of a numeric attribute is to be used for filtering.
protected  Range m_Values
          Stores which values of a nominal attribute are to be used for filtering.
 
Fields inherited from class weka.filters.Filter
m_NewBatch
 
Constructor Summary
InstanceFilter()
          Default constructor
 
Method Summary
 int getAttributeIndex()
          Get the attribute to be used for selection (-1 for last)
 boolean getInvertSelection()
          Get whether the selected values are to be removed or kept
 boolean getMatchMissingValues()
          Gets whether missing values are counted as a match.
 boolean getModifyHeader()
          Gets whether the header will be modified when selecting on nominal attributes.
 java.lang.String getNominalIndices()
          Get the set of nominal value indices that will be used for selection
 java.lang.String[] getOptions()
          Gets the current settings of the filter.
 double getSplitPoint()
          Get the split point used for numeric selection
 boolean input(Instance instance)
          Input an instance for filtering.
 boolean isNominal()
          Returns true if selection attribute is nominal.
 boolean isNumeric()
          Returns true if selection attribute is numeric.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options
static void main(java.lang.String[] argv)
          Main method for testing this class.
 void setAttributeIndex(int attribute)
          Sets attribute to be used for selection
 boolean setInputFormat(Instances instanceInfo)
          Sets the format of the input instances.
 void setInvertSelection(boolean invert)
          Set whether selected values should be removed or kept.
 void setMatchMissingValues(boolean newMatchMissingValues)
          Sets whether missing values are counted as a match.
 void setModifyHeader(boolean newModifyHeader)
          Sets whether the header will be modified when selecting on nominal attributes.
 void setNominalIndices(java.lang.String rangeList)
          Set which nominal labels are to be included in the selection.
 void setNominalIndicesArr(int[] values)
          Set which values of a nominal attribute are to be used for selection.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setSplitPoint(double value)
          Sets the split point to be used for selection on a numeric attribute.
 
Methods inherited from class weka.filters.Filter
batchFilterFile, batchFinished, bufferInput, copyStringValues, copyStringValues, filterFile, flushInput, getInputFormat, getInputStringIndex, getOutputFormat, getOutputStringIndex, getStringIndices, inputFormat, isOutputFormatDefined, numPendingOutput, output, outputFormat, outputFormatPeek, outputPeek, push, resetQueue, setOutputFormat, useFilter
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_AttributeSet

protected int m_AttributeSet
Stores the attribute setting

m_Attribute

protected int m_Attribute
Stores which attribute is to be used for filtering

m_Values

protected Range m_Values
Stores which values of a nominal attribute are to be used for filtering.

m_Value

protected double m_Value
Stores which value of a numeric attribute is to be used for filtering.

m_Inverse

protected boolean m_Inverse
Inverse of test to be used?

m_MatchMissingValues

protected boolean m_MatchMissingValues
True if missing values should count as a match

m_ModifyHeader

protected boolean m_ModifyHeader
Modify header for nominal attributes?

m_NominalMapping

protected int[] m_NominalMapping
If m_ModifyHeader, stores a mapping from old to new indexes

Constructor Detail

InstanceFilter

public InstanceFilter()
Default constructor

Method Detail

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options
Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

-C num
Choose attribute to be used for selection (default last).

-S num
Numeric value to be used for selection on a numeric attribute. Instances with values smaller than the given value will be selected. (default 0)

-L index1,index2-index4,...
Range of label indices to be used for selection on a nominal attribute. The keywords first and last are valid indexes. (default all values)

-M
Missing values count as a match. This setting is independent of the -V option. (default missing values don't match)

-V
Invert matching sense.

-H
When selecting on nominal attributes, removes header references to excluded values.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
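As a rough illustration of the "array of strings" form, the sketch below assembles an options array for a numeric selection. The helper class and method are hypothetical, and it assumes that each flag and its argument occupy separate array elements. Whether the number given to -C counts from 0 or from 1 is not stated on this page, so the helper passes it through unchanged.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Weka.
final class InstanceFilterOptionsSketch {

    /** Builds an options array such as {"-C", "3", "-S", "2.5", "-V"}. */
    static String[] numericOptions(int attributeNumber, double splitPoint,
                                   boolean matchMissing, boolean invert) {
        List<String> opts = new ArrayList<>();
        opts.add("-C");
        opts.add(Integer.toString(attributeNumber)); // passed through as given
        opts.add("-S");
        opts.add(Double.toString(splitPoint));
        if (matchMissing) {
            opts.add("-M"); // flag without an argument
        }
        if (invert) {
            opts.add("-V"); // flag without an argument
        }
        return opts.toArray(new String[0]);
    }
}
```

An array built this way has the shape that setOptions is documented to accept, and getOptions is documented to return an array of the same kind.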

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the filter.
Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions

setInputFormat

public boolean setInputFormat(Instances instanceInfo)
                       throws java.lang.Exception
Sets the format of the input instances.
Overrides:
setInputFormat in class Filter
Parameters:
instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
Throws:
UnsupportedAttributeTypeException - if the specified attribute is neither numeric nor nominal.

input

public boolean input(Instance instance)
Input an instance for filtering. Ordinarily the instance is processed and made available for output immediately. Some filters require all instances be read before producing output.
Overrides:
input in class Filter
Parameters:
instance - the input instance
Returns:
true if the filtered instance may now be collected with output().
Throws:
java.lang.IllegalStateException - if no input format has been set.
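The calling protocol described here (set the format first, then feed instances and collect whatever becomes available) can be modelled with a small queue-backed class. This is a simplified stand-in, not Weka's Filter: instances are reduced to a single double, the "format" is reduced to a split point, and the class name is hypothetical. The method names mirror input, output and numPendingOutput from the listings on this page.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Simplified stand-in for the streaming protocol; not Weka's implementation.
final class StreamingFilterSketch {
    private boolean formatSet = false;
    private double splitPoint = 0.0;
    private final Queue<Double> pending = new ArrayDeque<>();

    /** Stands in for setInputFormat: must be called before input(). */
    void setInputFormat(double splitPoint) {
        this.splitPoint = splitPoint;
        this.formatSet = true;
        this.pending.clear();
    }

    /** Returns true if the value was accepted and can be collected with output(). */
    boolean input(double value) {
        if (!formatSet) {
            throw new IllegalStateException("No input format has been set");
        }
        if (value < splitPoint) {
            pending.add(value);
            return true;
        }
        return false;
    }

    /** Returns the next pending value, or null if none is waiting. */
    Double output() {
        return pending.poll();
    }

    int numPendingOutput() {
        return pending.size();
    }
}
```

A caller would typically invoke input for each instance and call output whenever input returns true.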

isNominal

public boolean isNominal()
Returns true if selection attribute is nominal.
Returns:
true if selection attribute is nominal

isNumeric

public boolean isNumeric()
Returns true if selection attribute is numeric.
Returns:
true if selection attribute is numeric

getModifyHeader

public boolean getModifyHeader()
Gets whether the header will be modified when selecting on nominal attributes.
Returns:
true if so.

setModifyHeader

public void setModifyHeader(boolean newModifyHeader)
Sets whether the header will be modified when selecting on nominal attributes.
Parameters:
newModifyHeader - true if so.
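When excluded labels are dropped from the header, the remaining labels need new indexes, which is presumably what m_NominalMapping records. The sketch below shows one way such an old-to-new mapping could be built. The class name is hypothetical, and using -1 for removed labels is an assumption rather than documented behaviour.

```java
// Hypothetical illustration of an old-to-new label index mapping; not Weka's code.
final class NominalMappingSketch {

    /**
     * @param keep keep[i] is true if label i stays in the header
     * @return mapping[i] is the new index of label i, or -1 if it was removed (assumed sentinel)
     */
    static int[] buildMapping(boolean[] keep) {
        int[] mapping = new int[keep.length];
        int next = 0; // next free index in the reduced header
        for (int i = 0; i < keep.length; i++) {
            mapping[i] = keep[i] ? next++ : -1;
        }
        return mapping;
    }
}
```

For four labels where only the second is excluded, the kept labels are renumbered 0, 1, 2 in their original order.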

getAttributeIndex

public int getAttributeIndex()
Get the attribute to be used for selection (-1 for last)
Returns:
the attribute index

setAttributeIndex

public void setAttributeIndex(int attribute)
Sets attribute to be used for selection
Parameters:
attribute - the attribute's index (-1 for last)
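The -1 convention implies that the stored setting is resolved against the input format at some point; the pairing of the m_AttributeSet and m_Attribute fields suggests this, though the page does not say so explicitly. A minimal sketch of such a resolution step, assuming 0-based indexes and using a hypothetical helper class:

```java
// Hypothetical helper showing how a "-1 for last" setting could be resolved; not Weka's code.
final class AttributeIndexSketch {

    /**
     * @param attributeSetting the stored setting; -1 means "last attribute"
     * @param numAttributes    number of attributes in the input format
     * @return a concrete 0-based attribute index (0-based is an assumption)
     */
    static int resolve(int attributeSetting, int numAttributes) {
        if (numAttributes <= 0) {
            throw new IllegalArgumentException("Input format has no attributes");
        }
        if (attributeSetting == -1) {
            return numAttributes - 1;
        }
        if (attributeSetting < 0 || attributeSetting >= numAttributes) {
            throw new IllegalArgumentException("Attribute index out of range: " + attributeSetting);
        }
        return attributeSetting;
    }
}
```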

getSplitPoint

public double getSplitPoint()
Get the split point used for numeric selection
Returns:
the numeric split point

setSplitPoint

public void setSplitPoint(double value)
Sets the split point to be used for selection on a numeric attribute.
Parameters:
value - the split point

getMatchMissingValues

public boolean getMatchMissingValues()
Gets whether missing values are counted as a match.
Returns:
true if missing values are counted as a match.

setMatchMissingValues

public void setMatchMissingValues(boolean newMatchMissingValues)
Sets whether missing values are counted as a match.
Parameters:
newMatchMissingValues - true if missing values are counted as a match.

getInvertSelection

public boolean getInvertSelection()
Get whether the selected values are to be removed or kept
Returns:
true if the selected values will be kept

setInvertSelection

public void setInvertSelection(boolean invert)
Set whether selected values should be removed or kept. If true the selected values are kept and unselected values are deleted.
Parameters:
invert - the new invert setting

getNominalIndices

public java.lang.String getNominalIndices()
Get the set of nominal value indices that will be used for selection
Returns:
a string representing the list of nominal indices.

setNominalIndices

public void setNominalIndices(java.lang.String rangeList)
Set which nominal labels are to be included in the selection.
Parameters:
rangeList - a string representing the list of nominal indices, e.g. first-3,5,6-last
Throws:
InvalidArgumentException - if an invalid range list is supplied
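To make the range-list syntax concrete, the sketch below expands a string such as first-3,5,6-last into a per-label selection mask. It is a stand-alone illustration, not Weka's Range class: the class name is hypothetical, indexes in the string are assumed to be 1-based, and invalid input is reported with the standard IllegalArgumentException.

```java
// Stand-alone illustration of the range-list syntax; not Weka's Range class.
final class LabelRangeSketch {

    /** Expands a list such as "first-3,5,6-last" into a mask over numLabels labels. */
    static boolean[] parse(String rangeList, int numLabels) {
        boolean[] selected = new boolean[numLabels];
        for (String part : rangeList.split(",")) {
            String token = part.trim();
            int dash = token.indexOf('-');
            int from;
            int to;
            if (dash < 0) {
                from = to = index(token, numLabels);           // single index
            } else {
                from = index(token.substring(0, dash).trim(), numLabels);
                to = index(token.substring(dash + 1).trim(), numLabels);
            }
            if (from > to) {
                throw new IllegalArgumentException("Invalid range: " + token);
            }
            for (int i = from; i <= to; i++) {
                selected[i - 1] = true;                        // 1-based to 0-based
            }
        }
        return selected;
    }

    /** Converts "first", "last" or a number into a 1-based index (assumed convention). */
    private static int index(String token, int numLabels) {
        int value;
        if (token.equalsIgnoreCase("first")) {
            value = 1;
        } else if (token.equalsIgnoreCase("last")) {
            value = numLabels;
        } else {
            // NumberFormatException is itself an IllegalArgumentException.
            value = Integer.parseInt(token);
        }
        if (value < 1 || value > numLabels) {
            throw new IllegalArgumentException("Index out of range: " + token);
        }
        return value;
    }
}
```

For an attribute with seven labels, first-3,5,6-last selects every label except the fourth.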

setNominalIndicesArr

public void setNominalIndicesArr(int[] values)
Set which values of a nominal attribute are to be used for selection.
Parameters:
values - an array containing indexes of values to be used for selection
Throws:
InvalidArgumentException - if an invalid set of ranges is supplied

main

public static void main(java.lang.String[] argv)
Main method for testing this class.
Parameters:
argv - should contain arguments to the filter: use -h for help