jncc20
Class NaiveCredalClassifier2

java.lang.Object
  extended by jncc20.NaiveClassifier
      extended by jncc20.NaiveCredalClassifier2

 class NaiveCredalClassifier2
extends NaiveClassifier

Implementation of the Naive Credal Classifier 2 (NCC2).


Nested Class Summary
private static class NaiveCredalClassifier2.PartitionPoint
          Helper class for the Naive Credal Classifier that stores crossing points and minimizing tuples; it is used to deal with missing data in the NonMar part of the testing instances.
 
Nested classes/interfaces inherited from class jncc20.NaiveClassifier
NaiveClassifier.Feature, NaiveClassifier.OutputClass
 
Field Summary
private  int alpha
          overall occurrences of class c1
private  java.util.ArrayList<java.lang.Integer> alphaArr
          alphaArr is defined for NonMar features only; it contains the conditional count after having dropped missing data
private  int beta
          overall occurrences of class c2
private  java.util.ArrayList<java.lang.Integer> betaArr
          betaArr is defined for NonMar features only; it contains the conditional count to which the number of missing records for the given feature is added
private  java.util.ArrayList<java.lang.Integer> deltaArr
          delta array is defined for Mar features only; it contains conditional count with respect to class c2 after having dropped missing data
private  java.util.ArrayList<java.lang.Integer> deltaTildeArr
          Sum of occurrences of class c2, considering only those instances of the learning set where the NonMar feature is non missing.
private  java.util.ArrayList<java.lang.Integer> gammaArr
          gamma array is defined for Mar features only; it contains conditional count with respect to class c1 after having dropped missing data
private  java.util.ArrayList<java.lang.Integer> gammaTildeArr
          Sum of occurrences of class c1, considering only those instances of the learning set where the NonMar feature is non missing.
private  int k
          Number of NonMar features in training
private  java.util.ArrayList<java.lang.Integer> nonMarTestingIdx
          Indexes of nonMarFeature in testing
private  java.util.ArrayList<java.lang.Integer> nonMarTrainingIdx
          Indexes of nonMarFeature in training
private  java.util.ArrayList<java.lang.Integer> numClassesNonMarTesting
          Number of classes of each NonMar variable in Testing
private  java.util.ArrayList<NaiveCredalClassifier2.PartitionPoint> partitionPoints
          Partition points, used when classifying instances with missing units in the NonMar part.
private  int[][] predictions
          Stores NCC predictions; as every prediction can be imprecise and hence contain several values, it is implemented as a matrix.
private  int s
          Number of "hidden" observations, which governs the weight of the prior relative to the likelihood.
 
Fields inherited from class jncc20.NaiveClassifier
featureSet, numClasses, numFeatures, numValues, outputClasses, pcClass, pcCond, pcUncond, probabilities, trainInstances
 
Constructor Summary
NaiveCredalClassifier2(java.util.ArrayList<int[]> TrainingSet, java.util.ArrayList<java.lang.String> FeatureNames, java.util.ArrayList<java.lang.String> classNames, java.util.ArrayList<java.lang.Integer> numClassForEachFeature, java.util.ArrayList<java.lang.Integer> SuppliedNonMarInTraining, java.util.ArrayList<java.lang.Integer> SuppliedNonMarInTesting, java.util.ArrayList<java.lang.Integer> SuppliedNumClassesNonMarTesting)
          Builds the feature set and the output classes, and computes the relevant counts for Mar and NonMar features
 
Method Summary
private  double checkCredalDominanceCIR(int c1, int c2, int[] currentInstance, double xmin, double xmax)
          Computes the CIR test of dominance between class c1 and c2 (if the returned value is >1, c1 dominates c2)
private  int[] classifyInstance(int[] CurrentInstance)
          Classifies a single instance, returning the list of predicted classes
(package private)  void classifyInstances(java.util.ArrayList<int[]> TestingSet)
          Classifies all the instances of the supplied TestingSet and stores the predictions in CredalPredictedInstances
private  double computeDeriv2LnHxCIR(double x)
          Computes the second derivative of Ln(Hx) (see Corani and Zaffalon, 2007)
private  double computeDerivLnHxCIR(double x)
          Computes the derivative of Ln(Hx) (see Corani and Zaffalon, 2007)
private  double computeHxCIR(double x)
          Computes Hx for a given value of x, alpha, beta, etc.
private  int findMinimizingValue(int FeatureIdx, int NumValues, int c1, int c2, double xmin, double xmax)
          Given a sub-partition (xmin, xmax) of [0, s], returns the value of feature FeatureIdx that minimizes the ratio (lowercount(feature,c1)/(uppercount(feature,c2)+x)) in the interval.
private  boolean findPartitionPoints(int c1, int c2, java.util.ArrayList<java.lang.Integer> MissingNonMarIdx)
          If there are missing data in the NonMar part of the units to be classified, this function identifies the intervals in which the range [0,s] has to be sub-partitioned.
private  double findZeroCIR(double x1, double x2)
          Numerical approximation of the minimum of Ln(Hx) via the Newton-Raphson method.
(package private)  int[][] getPredictions()
          Returns the matrix of the predictions
(package private)  int[] getPredictions(int idx)
          Returns the vector containing the prediction for the instance at position idx
 
Methods inherited from class jncc20.NaiveClassifier
buildFeatureSet, buildOutputClasses, gammaln, getOutputClasses, saveProbabilities
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

alpha

private int alpha
overall occurrences of class c1


alphaArr

private java.util.ArrayList<java.lang.Integer> alphaArr
alphaArr is defined for NonMar features only; it contains the conditional count after having dropped missing data


beta

private int beta
overall occurrences of class c2


betaArr

private java.util.ArrayList<java.lang.Integer> betaArr
betaArr is defined for NonMar features only; it contains the conditional count to which the number of missing records for the given feature is added


deltaArr

private java.util.ArrayList<java.lang.Integer> deltaArr
delta array is defined for Mar features only; it contains conditional count with respect to class c2 after having dropped missing data


deltaTildeArr

private java.util.ArrayList<java.lang.Integer> deltaTildeArr
Sum of occurrences of class c2, considering only those instances of the learning set where the NonMar feature is non missing. A different value for every feature.


gammaArr

private java.util.ArrayList<java.lang.Integer> gammaArr
gamma array is defined for Mar features only; it contains conditional count with respect to class c1 after having dropped missing data


gammaTildeArr

private java.util.ArrayList<java.lang.Integer> gammaTildeArr
Sum of occurrences of class c1, considering only those instances of the learning set where the NonMar feature is non missing. A different value for every feature.


k

private int k
Number of NonMar features in training


nonMarTestingIdx

private java.util.ArrayList<java.lang.Integer> nonMarTestingIdx
Indexes of nonMarFeature in testing


nonMarTrainingIdx

private java.util.ArrayList<java.lang.Integer> nonMarTrainingIdx
Indexes of nonMarFeature in training


numClassesNonMarTesting

private java.util.ArrayList<java.lang.Integer> numClassesNonMarTesting
Number of classes of each NonMar variable in Testing


partitionPoints

private java.util.ArrayList<NaiveCredalClassifier2.PartitionPoint> partitionPoints
Partition points, used when classifying instances with missing units in the NonMar part.


predictions

private int[][] predictions
Stores NCC predictions; as every prediction can be imprecise and hence contain several values, it is implemented as a matrix.
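
A set-valued prediction fits naturally into a jagged int[][] with one row per instance. The sketch below assumes that each row lists the indexes of the classes returned for that instance, which this page does not spell out; the class name PredictionMatrixSketch and both methods are hypothetical and not part of jncc20.

```java
// Hypothetical illustration, not part of jncc20.
final class PredictionMatrixSketch {

    /** A prediction is determinate when it contains exactly one class. */
    static boolean isDeterminate(int[] prediction) {
        return prediction.length == 1;
    }

    /** Fraction of instances that received a single-class prediction. */
    static double determinacy(int[][] predictions) {
        if (predictions.length == 0) {
            return 0.0; // convention chosen for this sketch
        }
        int determinate = 0;
        for (int[] prediction : predictions) {
            if (isDeterminate(prediction)) {
                determinate++;
            }
        }
        return (double) determinate / predictions.length;
    }

    public static void main(String[] args) {
        // Rows may have different lengths: instance 1 is classified imprecisely.
        int[][] predictions = { {0}, {1, 2}, {2} };
        System.out.println("determinacy = " + determinacy(predictions));
    }
}
```

Rows of different length are what make a matrix (rather than a flat int[]) necessary here.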


s

private int s
Number of "hidden" observations, which governs the weight of the prior relative to the likelihood. Setting it to 1 or 2 is a safe choice.
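
To give a feel for how s trades the prior off against the data, the sketch below computes lower and upper class probabilities with the standard imprecise Dirichlet model bounds, n(c)/(N+s) and (n(c)+s)/(N+s). It assumes that these are the bounds relevant to s here; the class IdmBounds and its methods are hypothetical and not part of jncc20.

```java
// Hypothetical helper, not part of jncc20.
final class IdmBounds {

    /** Lower probability of a class observed count times among n instances. */
    static double lower(int count, int n, int s) {
        return (double) count / (n + s);
    }

    /** Upper probability: all s hidden observations are assigned to the class. */
    static double upper(int count, int n, int s) {
        return (double) (count + s) / (n + s);
    }

    public static void main(String[] args) {
        // A class seen in 3 of 10 training instances, for s = 1 and s = 2.
        for (int s = 1; s <= 2; s++) {
            System.out.printf("s=%d: [%.4f, %.4f]%n", s, lower(3, 10, s), upper(3, 10, s));
        }
    }
}
```

The width of the interval is s/(N+s), so a larger s gives more cautious (wider) bounds, and the effect fades as the training set grows.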

Constructor Detail

NaiveCredalClassifier2

NaiveCredalClassifier2(java.util.ArrayList<int[]> TrainingSet,
                       java.util.ArrayList<java.lang.String> FeatureNames,
                       java.util.ArrayList<java.lang.String> classNames,
                       java.util.ArrayList<java.lang.Integer> numClassForEachFeature,
                       java.util.ArrayList<java.lang.Integer> SuppliedNonMarInTraining,
                       java.util.ArrayList<java.lang.Integer> SuppliedNonMarInTesting,
                       java.util.ArrayList<java.lang.Integer> SuppliedNumClassesNonMarTesting)
Builds the feature set and the output classes, and computes the relevant counts for Mar and NonMar features

Method Detail

checkCredalDominanceCIR

private double checkCredalDominanceCIR(int c1,
                                       int c2,
                                       int[] currentInstance,
                                       double xmin,
                                       double xmax)
Computes the CIR test of dominance between class c1 and c2 (if the returned value is >1, c1 dominates c2)


classifyInstance

private int[] classifyInstance(int[] CurrentInstance)
Classifies a single instance, returning the list of predicted classes
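
One way to read the link between the pairwise dominance test and the returned list of classes is that a class is kept when no other class dominates it. The sketch below assumes that rule and treats the pairwise test as a hypothetical DominanceTest interface whose value exceeds 1 when the first class dominates the second; the actual private method may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of jncc20.
final class NonDominatedSketch {

    /** Stand-in for a pairwise test: a value > 1 means c1 dominates c2. */
    interface DominanceTest {
        double ratio(int c1, int c2);
    }

    /** Returns the indexes of the classes that no other class dominates. */
    static int[] nonDominated(int numClasses, DominanceTest test) {
        boolean[] dominated = new boolean[numClasses];
        for (int c1 = 0; c1 < numClasses; c1++) {
            for (int c2 = 0; c2 < numClasses; c2++) {
                if (c1 != c2 && test.ratio(c1, c2) > 1.0) {
                    dominated[c2] = true;
                }
            }
        }
        List<Integer> kept = new ArrayList<>();
        for (int c = 0; c < numClasses; c++) {
            if (!dominated[c]) {
                kept.add(c);
            }
        }
        return kept.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        // Toy ratios: classes 0 and 2 each dominate class 1, but not each other.
        double[][] r = { {1.0, 2.0, 0.9}, {0.4, 1.0, 0.8}, {0.95, 1.5, 1.0} };
        int[] result = nonDominated(3, (a, b) -> r[a][b]);
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

When more than one class survives, the prediction is imprecise, which is why a single instance can map to several class indexes.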


classifyInstances

void classifyInstances(java.util.ArrayList<int[]> TestingSet)
Classifies all the instances of the supplied TestingSet and stores the predictions in CredalPredictedInstances

Specified by:
classifyInstances in class NaiveClassifier

computeDeriv2LnHxCIR

private double computeDeriv2LnHxCIR(double x)
Computes the second derivative of Ln(Hx) (see Corani and Zaffalon, 2007)


computeDerivLnHxCIR

private double computeDerivLnHxCIR(double x)
Computes the derivative of Ln(Hx) (see Corani and Zaffalon, 2007)


computeHxCIR

private double computeHxCIR(double x)
Computes Hx for a given value of x, alpha, beta, etc. (see Corani and Zaffalon, 2007)


findMinimizingValue

private int findMinimizingValue(int FeatureIdx,
                                int NumValues,
                                int c1,
                                int c2,
                                double xmin,
                                double xmax)
Given a sub-partition (xmin, xmax) of [0, s], returns the value of feature FeatureIdx that minimizes the ratio (lowercount(feature,c1)/(uppercount(feature,c2)+x)) in the interval.
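
As a rough picture of this search, the sketch below evaluates the ratio at the midpoint of the interval for every candidate value and returns the index of the smallest one. It assumes that the ranking of the candidate values does not change inside the interval (which is what a partition at crossing points would guarantee) and that xmax > xmin >= 0; the arrays lowerC1 and upperC2 are hypothetical stand-ins for the lower and upper counts, not jncc20 fields.

```java
// Hypothetical illustration, not part of jncc20.
final class MinimizingValueSketch {

    /**
     * Returns the index v minimizing lowerC1[v] / (upperC2[v] + x),
     * with x taken at the midpoint of (xmin, xmax).
     */
    static int argMinRatio(int[] lowerC1, int[] upperC2, double xmin, double xmax) {
        double x = 0.5 * (xmin + xmax); // strictly positive when xmax > xmin >= 0
        int best = -1;
        double bestRatio = Double.POSITIVE_INFINITY;
        for (int v = 0; v < lowerC1.length; v++) {
            double ratio = lowerC1[v] / (upperC2[v] + x);
            if (ratio < bestRatio) {
                bestRatio = ratio;
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] lowerC1 = {3, 1};
        int[] upperC2 = {4, 0};
        // The two ratios 3/(4+x) and 1/x cross at x = 2, so the minimizer changes there.
        System.out.println(argMinRatio(lowerC1, upperC2, 0.0, 1.0));
        System.out.println(argMinRatio(lowerC1, upperC2, 2.0, 4.0));
    }
}
```

The toy data shows why the interval matters: the same feature has a different minimizing value on either side of the crossing point.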


findPartitionPoints

private boolean findPartitionPoints(int c1,
                                    int c2,
                                    java.util.ArrayList<java.lang.Integer> MissingNonMarIdx)
If there are missing data in the NonMar part of the units to be classified, this function identifies the intervals into which the range [0, s] has to be sub-partitioned. Later, the test of dominance has to be performed separately on each sub-partition.
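
The sub-partitioning can be pictured as sorting the crossing points that fall strictly inside [0, s] and running the test on each resulting interval. The sketch below assumes that what matters in the end is the smallest test value over all intervals; the class PartitionSketch, the IntervalTest interface and the plain double[] of crossing points are hypothetical and do not reflect how PartitionPoint is actually laid out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration, not part of jncc20.
final class PartitionSketch {

    /** Stand-in for a dominance test restricted to one sub-interval. */
    interface IntervalTest {
        double value(double xmin, double xmax);
    }

    /** Splits [0, s] at the crossing points lying strictly inside it. */
    static List<double[]> subIntervals(double s, double[] crossingPoints) {
        TreeSet<Double> cuts = new TreeSet<>(); // sorted, duplicates removed
        cuts.add(0.0);
        cuts.add(s);
        for (double p : crossingPoints) {
            if (p > 0.0 && p < s) {
                cuts.add(p);
            }
        }
        List<double[]> intervals = new ArrayList<>();
        Double previous = null;
        for (double cut : cuts) {
            if (previous != null) {
                intervals.add(new double[] {previous, cut});
            }
            previous = cut;
        }
        return intervals;
    }

    /** Smallest test value over all sub-intervals (+infinity if there are none). */
    static double minOverIntervals(List<double[]> intervals, IntervalTest test) {
        double min = Double.POSITIVE_INFINITY;
        for (double[] interval : intervals) {
            min = Math.min(min, test.value(interval[0], interval[1]));
        }
        return min;
    }

    public static void main(String[] args) {
        List<double[]> parts = subIntervals(2.0, new double[] {1.5, 0.5, 3.0, 0.5});
        for (double[] part : parts) {
            System.out.println("[" + part[0] + ", " + part[1] + "]");
        }
    }
}
```

Points outside (0, s) and repeated points are ignored, so the sub-intervals always cover [0, s] without overlap.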


findZeroCIR

private double findZeroCIR(double x1,
                           double x2)
Numerical approximation of the minimum of Ln(Hx) via the Newton-Raphson method.
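
Minimizing a smooth function with Newton-Raphson amounts to finding a zero of its first derivative, using the second derivative to size each step. The sketch below shows this on a stand-in function, since the form of Ln(Hx) is given in Corani and Zaffalon (2007) and is not reproduced on this page. The class name, the starting point, the tolerance, the iteration cap and the clamping to [x1, x2] are all assumptions.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical illustration, not part of jncc20.
final class NewtonMinSketch {

    /**
     * Approximates a stationary point of a function on [x1, x2],
     * given its first derivative d1 and second derivative d2.
     */
    static double argMin(DoubleUnaryOperator d1, DoubleUnaryOperator d2, double x1, double x2) {
        double x = 0.5 * (x1 + x2); // start from the midpoint
        for (int i = 0; i < 100; i++) {
            double curvature = d2.applyAsDouble(x);
            if (curvature == 0.0) {
                break; // Newton step undefined; return the current estimate
            }
            double next = x - d1.applyAsDouble(x) / curvature;
            next = Math.max(x1, Math.min(x2, next)); // keep the iterate in [x1, x2]
            if (Math.abs(next - x) < 1e-10) {
                return next;
            }
            x = next;
        }
        return x;
    }

    public static void main(String[] args) {
        // Stand-in function f(x) = x - ln(x): f'(x) = 1 - 1/x, f''(x) = 1/x^2, minimum at x = 1.
        double xMin = argMin(x -> 1.0 - 1.0 / x, x -> 1.0 / (x * x), 0.1, 3.0);
        System.out.println("argmin ~ " + xMin);
    }
}
```

The method only finds a stationary point; it is a minimum when the second derivative is positive there, as it is for the convex stand-in function used above.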


getPredictions

int[][] getPredictions()
Returns the matrix of the predictions


getPredictions

int[] getPredictions(int idx)
Returns the vector containing the prediction for the instance at position idx