jncc20
Class MdlDiscretizer

java.lang.Object
  extended by jncc20.MdlDiscretizer

 class MdlDiscretizer
extends java.lang.Object

Implements recursive MDL-based supervised discretization (Fayyad and Irani, 1993).

The constructor receives three arguments: a vector containing the numerical values of the feature to be discretized, across all classes;
a vector containing the corresponding classes;
and an integer giving the number of classes in the problem.

The constructor discretizes the feature directly; the resulting cut points can be retrieved via getCutPoints.


Nested Class Summary
private static class MdlDiscretizer.Pair
          Helper class for MdlDiscretizer, which effectively stores feature-class pairs
 
Field Summary
private  java.util.ArrayList<java.lang.Double> cutPoints
          Cut points (if any) delimiting the discretization intervals identified by the algorithm
private  int numClasses
          Total number of classes (immutable, hence final)
private  MdlDiscretizer.Pair[] pairVector
          Vector of feature/class pairs
private  java.util.ArrayList<java.lang.Double> possibleCutPoints
          Numerical values of the feature that constitute possible discretization boundaries (i.e., possible cutPoints)
private  java.util.ArrayList<java.lang.Integer> possibleCutPointsIdxInPairVector
          Indexes of the possible cutPoints within pairVector
 
Constructor Summary
MdlDiscretizer(java.util.ArrayList<java.lang.Double> SuppliedFeatureValues, java.util.ArrayList<java.lang.Integer> SuppliedClassValues, int suppliedNumClasses)
          Computes the discretization intervals from the supplied class and feature vectors; stores them in cutPoints
 
Method Summary
private  double[] computeEntropy(int lowerBound, int upperBound)
          Computes the entropy of the partition of pairVector between the indexes lowerBound and upperBound
private  void computePossibleCutPoints()
          Identifies the feature values that can constitute possible cutPoints (i.e., possible discretization intervals)
private  java.util.ArrayList<java.lang.Integer> getClassList(int index)
          Returns the list of classes corresponding to a given numerical value of the feature, within pairVector.
 java.util.ArrayList<java.lang.Double> getCutPoints()
          Returns the cut points computed by the constructor
private  boolean recursiveMDLDiscretization(int lowerIdx, int upperIdx)
          Discretizes the variable and instantiates cutPoints; called from the constructor.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

cutPoints

private java.util.ArrayList<java.lang.Double> cutPoints
Cut points (if any) delimiting the discretization intervals identified by the algorithm


numClasses

private final int numClasses
Total number of classes (immutable, hence final)


pairVector

private MdlDiscretizer.Pair[] pairVector
Vector of feature/class pairs


possibleCutPoints

private java.util.ArrayList<java.lang.Double> possibleCutPoints
Numerical values of the feature that constitute possible discretization boundaries (i.e., possible cutPoints)


possibleCutPointsIdxInPairVector

private java.util.ArrayList<java.lang.Integer> possibleCutPointsIdxInPairVector
Indexes of the possible cutPoints within pairVector

Constructor Detail

MdlDiscretizer

MdlDiscretizer(java.util.ArrayList<java.lang.Double> SuppliedFeatureValues,
               java.util.ArrayList<java.lang.Integer> SuppliedClassValues,
               int suppliedNumClasses)
Computes the discretization intervals from the supplied class and feature vectors; stores them in cutPoints
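
The way the two supplied vectors might be combined into a vector of feature/class pairs can be sketched as follows. This is only an illustration under stated assumptions: PairSketch, its nested Pair class and buildSortedPairs are hypothetical names, and sorting the pairs by feature value is assumed here because the algorithm scans values in order; the actual MdlDiscretizer.Pair and constructor logic are not shown in this documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class PairSketch {
    // Hypothetical stand-in for MdlDiscretizer.Pair: one feature value and its class.
    static final class Pair {
        final double value;
        final int classIdx;

        Pair(double value, int classIdx) {
            this.value = value;
            this.classIdx = classIdx;
        }
    }

    // Zips the two parallel lists into pairs and sorts them by feature value.
    // Sorting is an assumption of this sketch, not a documented behaviour.
    static Pair[] buildSortedPairs(List<Double> featureValues, List<Integer> classValues) {
        if (featureValues.size() != classValues.size()) {
            throw new IllegalArgumentException("feature and class lists must have the same length");
        }
        Pair[] pairs = new Pair[featureValues.size()];
        for (int i = 0; i < pairs.length; i++) {
            pairs[i] = new Pair(featureValues.get(i), classValues.get(i));
        }
        Arrays.sort(pairs, Comparator.comparingDouble((Pair p) -> p.value));
        return pairs;
    }

    public static void main(String[] args) {
        Pair[] pairs = buildSortedPairs(
                new ArrayList<>(Arrays.asList(2.5, 1.0, 3.0)),
                new ArrayList<>(Arrays.asList(1, 0, 1)));
        for (Pair p : pairs) {
            System.out.println(p.value + " -> class " + p.classIdx);
        }
    }
}
```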

Method Detail

computeEntropy

private double[] computeEntropy(int lowerBound,
                                int upperBound)
Computes the entropy of the partition of pairVector between the indexes lowerBound and upperBound
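
As an illustration of the quantity involved, a minimal sketch of class entropy over an index range is given here. EntropySketch and its entropy method are hypothetical; the sketch assumes inclusive bounds and class labels in 0..numClasses-1, and it returns a single double, whereas computeEntropy returns a double[] whose layout is not documented.

```java
final class EntropySketch {
    // Shannon entropy (in bits) of the class labels in classes[lower..upper].
    // Assumptions of this sketch: both bounds are inclusive and every label
    // lies in 0..numClasses-1.
    static double entropy(int[] classes, int lower, int upper, int numClasses) {
        int[] counts = new int[numClasses];
        for (int i = lower; i <= upper; i++) {
            counts[classes[i]]++;
        }
        int n = upper - lower + 1;
        double h = 0.0;
        for (int c : counts) {
            if (c > 0) {
                double p = (double) c / n;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    public static void main(String[] args) {
        int[] classes = {0, 0, 1, 1};
        System.out.println(entropy(classes, 0, 3, 2)); // two equally frequent classes
        System.out.println(entropy(classes, 0, 1, 2)); // a pure range
    }
}
```

A range containing two equally frequent classes has an entropy of one bit, while a range containing a single class has an entropy of zero.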


computePossibleCutPoints

private void computePossibleCutPoints()
Identifies the feature values that can constitute possible cutPoints (i.e., possible discretization intervals)
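
One common way to identify candidate cut points is to keep only the boundaries between adjacent distinct feature values whose classes differ. The sketch below illustrates that idea; CandidateCutsSketch and candidateCuts are hypothetical names, the input is assumed to be sorted by feature value with non-negative class labels, and placing the cut at the midpoint is an assumption (the class may instead store the feature values themselves).

```java
import java.util.ArrayList;
import java.util.List;

final class CandidateCutsSketch {
    // values must be sorted ascending; classes[i] is the (non-negative) class of values[i].
    // A boundary between two adjacent distinct values is kept unless all instances
    // at both values share one and the same class.
    static List<Double> candidateCuts(double[] values, int[] classes) {
        List<Double> cuts = new ArrayList<>();
        int prevClass = -2;            // -2: no previous group yet; -1: previous group was mixed
        double prevValue = Double.NaN;
        int i = 0;
        while (i < values.length) {
            // Scan the group of instances sharing the same feature value.
            int j = i;
            int groupClass = classes[i];
            while (j < values.length && values[j] == values[i]) {
                if (classes[j] != classes[i]) {
                    groupClass = -1;   // more than one class at this value
                }
                j++;
            }
            boolean hasPrev = prevClass != -2;
            if (hasPrev && (prevClass == -1 || groupClass == -1 || prevClass != groupClass)) {
                cuts.add((prevValue + values[i]) / 2.0); // midpoint: an assumption of this sketch
            }
            prevClass = groupClass;
            prevValue = values[i];
            i = j;
        }
        return cuts;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, 3.0, 4.0};
        int[] classes = {0, 0, 1, 1};
        System.out.println(candidateCuts(values, classes));
    }
}
```

Restricting the search to such boundaries reduces the number of entropy evaluations without excluding the optimal cut, since a cut that separates instances of the same class cannot minimize the class entropy.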


getClassList

private java.util.ArrayList<java.lang.Integer> getClassList(int index)
Returns the list of classes corresponding to a given numerical value of the feature, within pairVector.


getCutPoints

public java.util.ArrayList<java.lang.Double> getCutPoints()
Returns the cut points computed by the constructor.

recursiveMDLDiscretization

private boolean recursiveMDLDiscretization(int lowerIdx,
                                           int upperIdx)
Discretizes the variable and instantiates cutPoints; called from the constructor.
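
The acceptance test at the heart of the Fayyad and Irani method can be sketched as below: a candidate split is kept only if its information gain exceeds an MDL-derived threshold, and the procedure then recurses on each side. MdlCriterionSketch and its methods are hypothetical names operating on per-class counts; how recursiveMDLDiscretization organizes the recursion, and what its boolean return value signifies, is not documented here.

```java
final class MdlCriterionSketch {
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // Entropy in bits of a histogram of class counts.
    static double entropy(int[] counts) {
        int n = 0;
        for (int c : counts) n += c;
        double h = 0.0;
        for (int c : counts) {
            if (c > 0) {
                double p = (double) c / n;
                h -= p * log2(p);
            }
        }
        return h;
    }

    // Number of classes actually present in the histogram.
    static int distinct(int[] counts) {
        int k = 0;
        for (int c : counts) if (c > 0) k++;
        return k;
    }

    // MDL acceptance test: left[c] and right[c] are the counts of class c
    // on each side of the candidate cut.
    static boolean acceptSplit(int[] left, int[] right) {
        int[] all = new int[left.length];
        int n1 = 0, n2 = 0;
        for (int c = 0; c < left.length; c++) {
            all[c] = left[c] + right[c];
            n1 += left[c];
            n2 += right[c];
        }
        if (n1 == 0 || n2 == 0) return false; // not a real split
        int n = n1 + n2;
        double ent = entropy(all), ent1 = entropy(left), ent2 = entropy(right);
        double gain = ent - ((double) n1 / n) * ent1 - ((double) n2 / n) * ent2;
        int k = distinct(all), k1 = distinct(left), k2 = distinct(right);
        double delta = log2(Math.pow(3, k) - 2) - (k * ent - k1 * ent1 - k2 * ent2);
        return gain > (log2(n - 1) + delta) / n;
    }

    public static void main(String[] args) {
        System.out.println(acceptSplit(new int[] {10, 0}, new int[] {0, 10})); // clean separation
        System.out.println(acceptSplit(new int[] {5, 5}, new int[] {5, 5}));   // no gain
    }
}
```

A cut that separates the classes cleanly passes the test, whereas a cut that leaves the class proportions unchanged on both sides yields zero gain and is rejected, which stops the recursion on that partition.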