jncc20
Class Mjncc.MP

java.lang.Object
  extended by jncc20.Mjncc.MP
Enclosing class:
Mjncc

private static class Mjncc.MP
extends java.lang.Object


Field Summary
private  java.lang.String ArffExportFile
           
private  java.util.ArrayList<java.lang.String[]> CategoryNames
          Matrix of Strings with rows of different length: each row corresponds to a different feature, and different features can have different numbers of categories.
private  java.util.ArrayList<java.lang.String> ClassNames
          Names of the output classes.
private  java.lang.String DatasetName
          Dataset Name as read from the field "@relation" in the Arff file
private  java.util.ArrayList<java.lang.String> FeatureNames
          Names of input features
private  java.util.ArrayList<java.lang.Integer> MarFeatureIndex
          Indexes of the features over which the MP has to work as missing-at-random.
private  java.util.ArrayList<java.lang.String> MarFeatureNames
          Names of the features over which the MP has to work as missing-at-random.
private  java.lang.String missingness
           
private  java.lang.Double MissingnessProb
          Probability, for each feature, that a value is turned into a missing one.
private  int[][] NonMarFeatureClasses
          Classes to be affected by NonMar MP for each feature
private  java.util.ArrayList<java.lang.Integer> NonMarFeatureIndex
          Indexes of the features over which the MP has to work as not-missing-at-random (NonMar).
private  java.util.ArrayList<java.lang.String> NonMarFeatureNames
          Names of all NonMar features.
private  java.util.ArrayList<double[]> RawDataset
          Copy of the data read from the Arff file, with -9999 as the marker for missing data and category names replaced by the corresponding indexes.
private  java.lang.String UserArffFile
           
private  java.lang.String WorkingPath
          Path where the files for the given experiment (Arff files, NonMar.txt) are found, and where output files will be saved.
 
Constructor Summary
Mjncc.MP(java.lang.String[] SuppliedArguments)
          Constructor which expects two arguments: the working path, and the Arff file from which to load the original data set.
 
Method Summary
(package private)  void exportArff()
          Exports to Arff file the data set (supposed to contain the missing data generated by the MP).
(package private)  void generateMissingness()
          Generates missingness independently on each feature, according to the user specifications.
 java.util.ArrayList<double[]> getRawDataset()
          Simple getter
(package private)  void readArffFile(java.lang.String UserSuppliedArffName)
          Reads Arff file (using objects and methods of the AParser class)
(package private)  void readPars(java.lang.String LearningStage)
          Reads the file mprocess.training.txt, or mprocess.testing or mprocess.training or mprocess (depending on how the Learning Stage is set: either as "training" or "testing" or "none") containing the specifications for the MP.
(package private)  void setPars(java.lang.String LearningStage)
          Sets the parameters for missingness generation: 5% probability, the first half of the values of all features in training, the second half of the values of all features in testing.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ArffExportFile

private java.lang.String ArffExportFile

CategoryNames

private java.util.ArrayList<java.lang.String[]> CategoryNames
Matrix of Strings with rows of different length: each row corresponds to a different feature, and different features can have different numbers of categories.


ClassNames

private java.util.ArrayList<java.lang.String> ClassNames
Names of the output classes.


DatasetName

private java.lang.String DatasetName
Dataset Name as read from the field "@relation" in the Arff file


FeatureNames

private java.util.ArrayList<java.lang.String> FeatureNames
Names of input features


MarFeatureIndex

private java.util.ArrayList<java.lang.Integer> MarFeatureIndex
Indexes of the features over which the MP has to work as missing-at-random.


MarFeatureNames

private java.util.ArrayList<java.lang.String> MarFeatureNames
Names of the features over which the MP has to work as missing-at-random.


missingness

private java.lang.String missingness

MissingnessProb

private java.lang.Double MissingnessProb
Probability, for each feature, that a value is turned into a missing one.


NonMarFeatureClasses

private int[][] NonMarFeatureClasses
Classes to be affected by NonMar MP for each feature


NonMarFeatureIndex

private java.util.ArrayList<java.lang.Integer> NonMarFeatureIndex
Indexes of the features over which the MP has to work as not-missing-at-random (NonMar).


NonMarFeatureNames

private java.util.ArrayList<java.lang.String> NonMarFeatureNames
Names of all NonMar features. Names are checked for correspondence against the names contained in FeatureNames.


RawDataset

private java.util.ArrayList<double[]> RawDataset
Copy of the data read from the Arff file, with -9999 as the marker for missing data and category names replaced by the corresponding indexes.
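
As an illustration of the encoding described above, the following is a minimal sketch, not the actual AParser or Mjncc.MP code. It assumes that category indexes are zero-based positions in the corresponding row of CategoryNames and that a missing value appears in the Arff file as "?" (the usual Arff convention). The class and method names are hypothetical.

```java
import java.util.List;

/** Illustrative sketch only; these names are not part of Mjncc.MP. */
final class CategoryEncodingSketch {
    /** Marker for missing data, as stated in the RawDataset description. */
    static final double MISSING = -9999.0;

    /** Maps one Arff token to its category index, or to MISSING for "?". */
    static double encode(String token, String[] categories) {
        if (token.equals("?")) {
            return MISSING;
        }
        for (int i = 0; i < categories.length; i++) {
            if (categories[i].equals(token)) {
                return i; // assumption: zero-based index
            }
        }
        throw new IllegalArgumentException("Unknown category: " + token);
    }

    /** Encodes a whole row; categoryNames holds one String[] per column. */
    static double[] encodeRow(String[] tokens, List<String[]> categoryNames) {
        if (tokens.length != categoryNames.size()) {
            throw new IllegalArgumentException("Row length does not match the number of columns");
        }
        double[] row = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            row[i] = encode(tokens[i], categoryNames.get(i));
        }
        return row;
    }
}
```

Rows of different length in CategoryNames are handled naturally here, because each column is looked up in its own String[].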


UserArffFile

private java.lang.String UserArffFile

WorkingPath

private java.lang.String WorkingPath
Path where the files for the given experiment (Arff files, NonMar.txt) are found, and where output files will be saved.

Constructor Detail

Mjncc.MP

Mjncc.MP(java.lang.String[] SuppliedArguments)
Constructor which expects two arguments: the working path, and the Arff file from which to load the original data set.

Method Detail

exportArff

void exportArff()
Exports to Arff file the data set (supposed to contain the missing data generated by the MP).


generateMissingness

void generateMissingness()
Generates missingness independently on each feature, according to the user specifications.
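
A minimal sketch of how per-feature missingness could be generated is given below. It is not the actual implementation. It assumes that rows are stored as double[] (as in RawDataset), that -9999 marks a missing value, and that the class index of each instance sits in a known column of the row. All class, method and parameter names are hypothetical.

```java
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; none of these names come from Mjncc.MP. */
final class MissingnessSketch {
    /** Marker for missing data, as mentioned in the RawDataset description. */
    static final double MISSING = -9999.0;

    /** MAR-style: every row may lose the value in column feature, with probability p. */
    static int applyMar(List<double[]> rows, int feature, double p, Random rng) {
        int changed = 0;
        for (double[] row : rows) {
            if (rng.nextDouble() < p) {
                row[feature] = MISSING;
                changed++;
            }
        }
        return changed;
    }

    /**
     * NonMar-style: only rows whose class index appears in affectedClasses are eligible.
     * Assumption: the class index is stored in column classColumn of each row.
     */
    static int applyNonMar(List<double[]> rows, int feature, int classColumn,
                           int[] affectedClasses, double p, Random rng) {
        int changed = 0;
        for (double[] row : rows) {
            boolean eligible = false;
            for (int c : affectedClasses) {
                if ((int) row[classColumn] == c) {
                    eligible = true;
                    break;
                }
            }
            if (eligible && rng.nextDouble() < p) {
                row[feature] = MISSING;
                changed++;
            }
        }
        return changed;
    }
}
```

Each call handles a single feature, so calling it once per configured feature keeps the features independent of one another. Passing a seeded Random makes a run reproducible.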


getRawDataset

public java.util.ArrayList<double[]> getRawDataset()
Simple getter


readArffFile

void readArffFile(java.lang.String UserSuppliedArffName)
Reads Arff file (using objects and methods of the AParser class)


readPars

void readPars(java.lang.String LearningStage)
Reads the file mprocess.training.txt, or mprocess.testing or mprocess.training or mprocess (depending on how the Learning Stage is set: either as "training" or "testing" or "none") containing the specifications for the MP.

The syntax for each row of the file is as follows:

FeatureName ClassesList(or "MAR") Degree

where:
- FeatureName is the name of the feature as specified in the original Arff file.
- ClassesList is the list of affected classes, separated by commas (classes not listed won't be affected by the missingness process). If "MAR" is provided instead of ClassesList, the missingness process will affect all the values of the feature with the same degree of probability.
- All the fields are separated by a white space.
- If no file is found, the program terminates.
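
The row syntax above could be parsed along the following lines. This is a sketch under stated assumptions, not the code of readPars: it assumes that Degree is a number readable by Double.parseDouble, and the MpSpecRow class with its fields is hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical holder for one parsed row; not part of Mjncc.MP. */
final class MpSpecRow {
    final String featureName;
    final boolean mar;          // true when the second field is "MAR"
    final List<String> classes; // affected classes; empty when mar is true
    final double degree;        // assumption: Degree is numeric

    private MpSpecRow(String featureName, boolean mar, List<String> classes, double degree) {
        this.featureName = featureName;
        this.mar = mar;
        this.classes = classes;
        this.degree = degree;
    }

    /** Parses "FeatureName ClassesList(or MAR) Degree", fields separated by white space. */
    static MpSpecRow parse(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length != 3) {
            throw new IllegalArgumentException(
                    "Expected 3 fields but found " + fields.length + ": " + line);
        }
        boolean mar = fields[1].equals("MAR");
        List<String> classes = mar
                ? Collections.<String>emptyList()
                : Arrays.asList(fields[1].split(","));
        return new MpSpecRow(fields[0], mar, classes, Double.parseDouble(fields[2]));
    }
}
```

For example, MpSpecRow.parse("outlook yes,no 0.05") would yield a NonMar row affecting the classes "yes" and "no", while MpSpecRow.parse("humidity MAR 0.1") would yield a MAR row. The feature and class names in these calls are made up for illustration.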


setPars

void setPars(java.lang.String LearningStage)
Sets the parameters for missingness generation: 5% probability, the first half of the values of all features in training, the second half of the values of all features in testing.