jncc20
Class NaiveClassifier

java.lang.Object
  extended by jncc20.NaiveClassifier
Direct Known Subclasses:
NaiveBayes, NaiveCredalClassifier2

public abstract class NaiveClassifier
extends java.lang.Object

Abstract superclass for naive classifiers.


Nested Class Summary
protected  class NaiveClassifier.Feature
          Helper class for naive classifiers that implements MAR and non-MAR features.
protected  class NaiveClassifier.OutputClass
          Helper class for naive classifiers that implements the output class of the classification problem.
 
Field Summary
protected  NaiveClassifier.Feature[] featureSet
          Array of Feature objects representing the feature set of the classifier
protected  int numClasses
          number of classes
protected  int numFeatures
          number of features
protected  java.lang.Integer[] numValues
          number of categories for categorical features, and number of bins for numerical (discretized) features.
protected  NaiveClassifier.OutputClass[] outputClasses
          Array of OutputClass objects representing the possible output classes of the problem
protected  double pcClass
          prior counts for classes
protected  double[] pcCond
          prior counts for conditional frequencies
protected  double[] pcUncond
          prior counts for unconditional frequencies
protected  double[][] probabilities
          Probabilities estimated for each class, for each instance
protected  int trainInstances
           
 
Constructor Summary
NaiveClassifier(java.util.ArrayList<int[]> TrainingSet, java.util.ArrayList<java.lang.String> FeatureNames, java.util.ArrayList<java.lang.String> classNames, java.util.ArrayList<java.lang.Integer> numClassForEachFeature, int priorType)
          Initializes all features and output classes; computes all the relevant conditional frequencies (conditionalFreq) on the training set, using the specified prior (0: zero; 1: Laplace; 2: uniform).
 
Method Summary
protected  void buildFeatureSet(java.util.ArrayList<int[]> TrainingSet, java.util.ArrayList<java.lang.String> FeatureNames, java.util.ArrayList<java.lang.Integer> NumClassForEachFeature)
          Instantiates the FeatureSet by computing all the relevant conditional frequencies (conditionalFreq) of all features on the training set. A Laplace prior can be introduced by setting the parameter priorCounts to a nonzero value; priorCounts is then added to each computed count.
protected  void buildOutputClasses(java.util.ArrayList<int[]> TrainingSet, java.util.ArrayList<java.lang.String> ClassNames)
          Instantiates the class names and conditional frequencies (conditionalFreq) of the OutputClass; the prior is defined by the parameter priorType (0: zero; 1: Laplace; 2: uniform).
(package private) abstract  void classifyInstances(java.util.ArrayList<int[]> TestingSet)
          Abstract method; subclasses implement the classification of the instances in the testing set.
(package private)  double gammaln(double xx)
          The log-gamma function is needed to compute the marginal likelihood.
 NaiveClassifier.OutputClass[] getOutputClasses()
           
(package private)  void saveProbabilities(java.lang.String fileAddress)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

featureSet

protected NaiveClassifier.Feature[] featureSet
Array of Feature objects representing the feature set of the classifier


numClasses

protected int numClasses
number of classes


numFeatures

protected int numFeatures
number of features


numValues

protected java.lang.Integer[] numValues
number of categories for categorical features, and number of bins for numerical (discretized) features. Each position refers to a different feature.


outputClasses

protected NaiveClassifier.OutputClass[] outputClasses
Array of OutputClass objects representing the possible output classes of the problem


pcClass

protected double pcClass
prior counts for classes


pcCond

protected double[] pcCond
prior counts for conditional frequencies


pcUncond

protected double[] pcUncond
prior counts for unconditional frequencies


probabilities

protected double[][] probabilities
Probabilities estimated for each class, for each instance


trainInstances

protected int trainInstances


Constructor Detail

NaiveClassifier

NaiveClassifier(java.util.ArrayList<int[]> TrainingSet,
                java.util.ArrayList<java.lang.String> FeatureNames,
                java.util.ArrayList<java.lang.String> classNames,
                java.util.ArrayList<java.lang.Integer> numClassForEachFeature,
                int priorType)
Initializes all features and output classes; computes all the relevant conditional frequencies (conditionalFreq) on the training set, using the specified prior (0: zero; 1: Laplace; 2: uniform).

Method Detail

buildFeatureSet

protected void buildFeatureSet(java.util.ArrayList<int[]> TrainingSet,
                               java.util.ArrayList<java.lang.String> FeatureNames,
                               java.util.ArrayList<java.lang.Integer> NumClassForEachFeature)
Instantiates the FeatureSet by computing all the relevant conditional frequencies (conditionalFreq) of all features on the training set. A Laplace prior can be introduced by setting the parameter priorCounts to a nonzero value; priorCounts is then added to each computed count. In particular, it computes:

the bivariate counts n(a_i,c_j), which correspond to the occurrences ignoring missing data for NBC, and to the lower counts for NCC;

for each output class, the number of missing values of the current feature, needed to compute the upper counts for NCC.

The priorType defines the prior to be used (0: zero; 1: Laplace; 2: uniform).
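As a rough illustration of the two quantities listed above, the following sketch counts n(a_i,c_j) for a single feature while skipping missing values, and tallies the missing values per output class. It is not the jncc20 implementation: the class name CountSketch, the convention that the class label is the last element of each int[] row, and the use of -1 to encode a missing value are all assumptions made for this example.

```java
// Hypothetical sketch; CountSketch is not part of jncc20.
final class CountSketch {
    // Assumed encoding of a missing feature value (not confirmed by the source).
    static final int MISSING = -1;

    // counts[a][c] = number of rows with feature value a and class c;
    // rows whose feature value is missing are skipped.
    static int[][] bivariateCounts(java.util.ArrayList<int[]> trainingSet,
                                   int featureIdx, int numValues, int numClasses) {
        int[][] counts = new int[numValues][numClasses];
        for (int[] row : trainingSet) {
            int value = row[featureIdx];
            int cls = row[row.length - 1]; // assumed: class label in last column
            if (value != MISSING) {
                counts[value][cls]++;
            }
        }
        return counts;
    }

    // missing[c] = number of rows of class c whose feature value is missing.
    static int[] missingPerClass(java.util.ArrayList<int[]> trainingSet,
                                 int featureIdx, int numClasses) {
        int[] missing = new int[numClasses];
        for (int[] row : trainingSet) {
            if (row[featureIdx] == MISSING) {
                missing[row[row.length - 1]]++;
            }
        }
        return missing;
    }
}
```

Calling both methods with the same featureIdx gives, for each class, the observed counts and the number of rows that were left out of them.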


buildOutputClasses

protected void buildOutputClasses(java.util.ArrayList<int[]> TrainingSet,
                                  java.util.ArrayList<java.lang.String> ClassNames)
Instantiates the class names and conditional frequencies (conditionalFreq) of the OutputClass; the prior is defined by the parameter priorType (0: zero; 1: Laplace; 2: uniform).
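To make the role of prior counts concrete, the sketch below estimates class probabilities from a training set after adding the same prior count to every class count (additive smoothing). How each priorType value maps to actual prior counts is not specified on this page, so the priorCount parameter, the class name ClassPriorSketch, and the class-label-in-last-column layout are assumptions for illustration only.

```java
// Hypothetical sketch; ClassPriorSketch is not part of jncc20.
final class ClassPriorSketch {
    // Returns (n(c) + priorCount) / (N + priorCount * numClasses) for each class c.
    // With priorCount = 0 these are plain relative frequencies; with an empty
    // training set and priorCount = 0 the result is NaN (division of 0 by 0).
    static double[] classProbabilities(java.util.ArrayList<int[]> trainingSet,
                                       int numClasses, double priorCount) {
        double[] counts = new double[numClasses];
        for (int[] row : trainingSet) {
            counts[row[row.length - 1]] += 1.0; // assumed: class label in last column
        }
        double total = trainingSet.size() + priorCount * numClasses;
        double[] probs = new double[numClasses];
        for (int c = 0; c < numClasses; c++) {
            probs[c] = (counts[c] + priorCount) / total;
        }
        return probs;
    }
}
```

With three training rows labelled 0, 0 and 1, a prior count of 1 yields probabilities 3/5 and 2/5, while a prior count of 0 yields 2/3 and 1/3.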


classifyInstances

abstract void classifyInstances(java.util.ArrayList<int[]> TestingSet)
Abstract method; subclasses implement the classification of the instances in the testing set.


gammaln

double gammaln(double xx)
The log-gamma function is needed to compute the marginal likelihood. Code taken from StatsconLib.java; see www.symbolicnet.org/conferences/iamc02/IAMCNosal.pdf. Function gammaln returns the value of ln(gamma(xx)) for xx > 0.
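For readers who want to see what such a function can look like, here is a minimal sketch of ln(gamma(xx)) based on the well-known Lanczos approximation with six coefficients. It is not copied from StatsconLib.java and may differ from the code jncc20 actually uses; the class name LogGammaSketch and the argument check are assumptions added for the example.

```java
// Hypothetical sketch; LogGammaSketch is not part of jncc20.
final class LogGammaSketch {
    // Coefficients of a six-term Lanczos approximation.
    private static final double[] COF = {
        76.18009172947146, -86.50532032941677, 24.01409824083091,
        -1.231739572450155, 0.1208650973866179e-2, -0.5395239384953e-5
    };

    // Returns ln(gamma(xx)) for xx > 0.
    static double gammaln(double xx) {
        if (!(xx > 0.0)) {
            throw new IllegalArgumentException("xx must be > 0");
        }
        double y = xx;
        double tmp = xx + 5.5;
        tmp -= (xx + 0.5) * Math.log(tmp);
        double ser = 1.000000000190015;
        for (double c : COF) {
            y += 1.0;
            ser += c / y;
        }
        // 2.5066282746310005 is sqrt(2 * pi).
        return -tmp + Math.log(2.5066282746310005 * ser / xx);
    }
}
```

Since gamma(n + 1) = n! for non-negative integers n, gammaln(n + 1) gives ln(n!) without overflowing, which is why a log-gamma routine is convenient when evaluating likelihood terms built from factorials or gamma functions.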


getOutputClasses

public NaiveClassifier.OutputClass[] getOutputClasses()

saveProbabilities

void saveProbabilities(java.lang.String fileAddress)