at.tuwien.ifs.somtoolbox.reportgenerator.output
Class SOMDescriptionLATEX

java.lang.Object
  extended by at.tuwien.ifs.somtoolbox.reportgenerator.output.SOMDescriptionLATEX
Direct Known Subclasses:
SOMGGDescriptionLATEX

public class SOMDescriptionLATEX
extends Object

This class is the base class for generating the part of the reports that describes a SOM. It creates output containing information about basic properties of the learning process and the created SOM, as well as about the distribution of the input data on this SOM. This class implements the description of a standard GrowingSOM and is subclassed to create proper descriptions of other SOMs.

Version:
$Id: SOMDescriptionLATEX.java 3914 2010-11-04 14:28:18Z mayer $
Author:
Sebastian Skritek (0226286, Sebastian.Skritek@gmx.at), Martin Waitzbauer (0226025)

Field Summary
protected  DatasetInformation dataset
          contains all available information about the input data
protected  String imgDir
          the directory to which the created images shall be saved
static String imgSubdir
          the name of the directory where the images shall be saved, relative to the baseDir path given in the constructor
protected  TestRunResult testrun
          encapsulates all available information about the testrun; query it for any detail needed
protected  ReportFileWriter writer
          all strings that shall be in the output are sent to this object
 
Constructor Summary
SOMDescriptionLATEX(ReportFileWriter writer, DatasetInformation dataset, TestRunResult testrun, String baseDir)
          creates a new instance
 
Method Summary
private  String classDistInCluster(int level, int numbInputs)
          returns a formatted string that contains information about the classes present in the given cluster
private  String getInputCoords(InputQEContainer value)
          formats a list of input vectors for use in the quantization error list. The created format is: "on input vector(s) "id" on unit at [x,y], "id2" on unit at [x2,y2], ..."
protected  String getUnitCoords(UnitQEContainer value)
          formats a list of units for use in the quantization error list. The created format is: "on unit(s) at [x,y] - z vectors mapped, [x2,y2] - z2 vectors mapped, ..."
protected  void printClusterInfos()
          this function prints some information about the possible clusters that can be found on the SOM.
 void printDataDistribution()
           
protected  void printDistributionDetailTable(Hashtable<String,Vector<InputDatum>> lookup, boolean classInformationAvailable)
          creates and outputs one of two possible tables. The first table contains, for each unit, only the number of input vectors mapped to it and a pie chart image of the class distribution within this unit.
protected  void printLearningRate()
          prints how the learning rate changes (its type) and the initial learning rate (taken from the MySOMMapDescription, provided by the TestRunResult object)
protected  void printMapLayout(boolean classInfoAvailable)
          Creates output describing the layout of the created SOM. This includes tables showing the distribution of the input vectors on the SOM, as well as (if available) the distribution of the classes on the SOM.
protected  void printMetricUsed()
          prints the metric used to calculate the distance between two vectors (taken from the MySOMMapDescription, provided by the TestRunResult object)
protected  void printNeighbourhoodFunction()
          Prints the neighbourhood type and the initial neighbourhood range used for training (taken from the MySOMMapDescription, provided by the TestRunResult object)
protected  void printNumberOfIterations()
          prints the number of iterations used in the training process (taken from the MySOMMapDescription, provided by the TestRunResult object)
protected  void printQuantizationErrorReport()
          prints a list with different quantization errors.
protected  void printRandomSeed()
          prints the random seed used for the initialization of the SOM (taken from the MySOMMapDescription, provided by the TestRunResult object)
protected  void printSigma()
          prints the value of sigma, one of the learning parameters (taken from the SOMProperty object)
 void printSOMDescription()
          initiates the creation of the output. Creates the description of the SOM and training properties.
protected  void printSOMDimensions()
          Prints the dimensions of the SOM, that is, the number of units in x and y direction (for a Growing SOM this is enough). The values are taken from the MySOMMapDescription, provided by the TestRunResult object.
protected  void printSOMProperties()
          prints a list of properties describing the training process and the generated SOM. Among others, this list contains: the type and topology of the SOM; the dimensions of the SOM; different training parameters; the neighbourhood function; ...
protected  void printTau()
          prints the value of tau, one of the learning parameters (taken from the SOMProperty object)
protected  void printTopographicErrorReport()
          adds information about the topographic error on the map to the report.
protected  void printTopologyOfSOM()
          Prints the topology and type of the SOM (unit shape, type of SOM, ...). The output depends on the value keyTopoology specified by the TestRunResult object.
protected  void printTrainingDate()
          if available, prints the time and date of the training (taken from the MySOMMapDescription, provided by the TestRunResult object)
protected  void printTrainingTime()
          if available, prints the time needed to train the SOM (taken from the MySOMMapDescription, provided by the TestRunResult object)
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

imgSubdir

public static final String imgSubdir
the name of the directory where the images shall be saved, relative to the baseDir path given in the constructor

See Also:
Constant Field Values

writer

protected ReportFileWriter writer
all strings that shall be in the output are sent to this object


testrun

protected TestRunResult testrun
encapsulates all available information about the testrun; query it for any detail needed


dataset

protected DatasetInformation dataset
contains all available information about the input data


imgDir

protected String imgDir
the directory to which the created images shall be saved

Constructor Detail

SOMDescriptionLATEX

public SOMDescriptionLATEX(ReportFileWriter writer,
                           DatasetInformation dataset,
                           TestRunResult testrun,
                           String baseDir)
creates a new instance

Parameters:
writer - object that handles how the created string is written to a file
dataset - object storing information about input data
testrun - object storing information about testrun results
baseDir - path to the directory where created images shall be stored
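
The relationship between baseDir, imgSubdir and imgDir suggested by the field descriptions can be sketched as follows. This is only an illustration under assumptions: the value "images" for the subdirectory constant is hypothetical (the real value is listed under Constant Field Values), and whether the class actually resolves the path this way is not confirmed by this page.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ImageDirSketch {
    // Hypothetical stand-in for the imgSubdir constant; the real value is not shown here.
    static final String IMG_SUBDIR = "images";

    /** Resolves the image directory relative to the given base directory. */
    static Path imageDir(String baseDir) {
        return Paths.get(baseDir).resolve(IMG_SUBDIR);
    }

    public static void main(String[] args) {
        // Prints the resolved image directory using the platform's path separator.
        System.out.println(imageDir("report"));
    }
}
```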
Method Detail

printSOMDescription

public void printSOMDescription()
initiates the creation of the output. Creates the description of the SOM and training properties.


printSOMProperties

protected void printSOMProperties()
prints a list of properties describing the training process and the generated SOM. Among others, this list contains: the type and topology of the SOM; the dimensions of the SOM; different training parameters; the neighbourhood function; ...


printTopologyOfSOM

protected void printTopologyOfSOM()
Prints the topology and type of the SOM (unit shape, type of SOM, ...). The output depends on the value keyTopoology specified by the TestRunResult object.


printSOMDimensions

protected void printSOMDimensions()
Prints the dimensions of the SOM, that is, the number of units in x and y direction (for a Growing SOM this is enough). The values are taken from the MySOMMapDescription, provided by the TestRunResult object.


printSigma

protected void printSigma()
prints the value of sigma, one of the learning parameters (taken from the SOMProperty object)


printTau

protected void printTau()
prints the value of tau, one of the learning parameters (taken from the SOMProperty object)


printMetricUsed

protected void printMetricUsed()
prints the metric used to calculate the distance between two vectors (taken from the MySOMMapDescription, provided by the TestRunResult object)


printNumberOfIterations

protected void printNumberOfIterations()
prints the number of iterations used in the training process (taken from the MySOMMapDescription, provided by the TestRunResult object)


printTrainingDate

protected void printTrainingDate()
if available, prints the time and date of the training (taken from the MySOMMapDescription, provided by the TestRunResult object)


printTrainingTime

protected void printTrainingTime()
if available, prints the time needed to train the SOM (taken from the MySOMMapDescription, provided by the TestRunResult object)


printRandomSeed

protected void printRandomSeed()
prints the random seed used for the initialization of the SOM (taken from the MySOMMapDescription, provided by the TestRunResult object)


printNeighbourhoodFunction

protected void printNeighbourhoodFunction()
Prints the neighbourhood type and the initial neighbourhood range used for training (taken from the MySOMMapDescription, provided by the TestRunResult object)


printLearningRate

protected void printLearningRate()
prints how the learning rate changes (its type) and the initial learning rate (taken from the MySOMMapDescription, provided by the TestRunResult object)


printDataDistribution

public void printDataDistribution()

printMapLayout

protected void printMapLayout(boolean classInfoAvailable)
Creates output describing the layout of the created SOM. This includes tables showing the distribution of the input vectors on the SOM, as well as (if available) the distribution of the classes on the SOM. In addition, if the user selected input items in order to get their position on the trained SOM, this information is also generated by this function.

Parameters:
classInfoAvailable - true if class information is available (and therefore a pie chart should be inserted), false otherwise

printDistributionDetailTable

protected void printDistributionDetailTable(Hashtable<String,Vector<InputDatum>> lookup,
                                            boolean classInformationAvailable)
creates and outputs one of two possible tables. The first table (lookup == null) contains, for each unit, only the number of input vectors mapped to it and a pie chart image of the class distribution within this unit. If no class information is available, this table is not created.
The second table additionally contains the ids of the input vectors selected by the user, shown within the unit they are mapped to. For this, lookup must be a hashtable that contains, for each unit (key = "x_y"), a list of InputDatum objects specifying the input elements mapped to this unit. If there is no list for a key, nothing is mapped to that unit. If no class information is available, the table does not contain the number of input vectors mapped to each unit.

Parameters:
lookup - a hashtable with the content specified above to map input vectors to units
classInformationAvailable - true if class information (and therefore pie chart diagrams for the units) is available, false otherwise
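
The shape of the lookup table described above can be sketched as follows. This is a minimal illustration, not the toolbox's own code: plain String ids stand in for InputDatum objects, and the helper names unitKey and addInput are hypothetical.

```java
import java.util.Hashtable;
import java.util.Vector;

public class LookupSketch {
    /** Builds the "x_y" key for the unit at coordinates (x, y). */
    static String unitKey(int x, int y) {
        return x + "_" + y;
    }

    /** Appends an input id to the list stored for the unit at (x, y), creating the list if needed. */
    static void addInput(Hashtable<String, Vector<String>> lookup, int x, int y, String inputId) {
        lookup.computeIfAbsent(unitKey(x, y), k -> new Vector<>()).add(inputId);
    }

    public static void main(String[] args) {
        // String ids are a stand-in for InputDatum objects (assumption for this sketch).
        Hashtable<String, Vector<String>> lookup = new Hashtable<>();
        addInput(lookup, 0, 0, "input-17");
        addInput(lookup, 0, 0, "input-42");
        addInput(lookup, 2, 1, "input-03");

        System.out.println(lookup.get(unitKey(0, 0)));
        // A unit without an entry has nothing mapped to it.
        System.out.println(lookup.containsKey(unitKey(1, 1)));
    }
}
```

Using a missing key to mean "nothing mapped" keeps the table sparse, which matches the description that a key without a list denotes an empty unit.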

printQuantizationErrorReport

protected void printQuantizationErrorReport()
prints a list of different quantization errors. This includes: the map mean quantization error; the map mean mean quantization error; the min/max unit quantization error; the min/max unit mean quantization error; the min/max quantization error of an input vector (as it is mapped to the SOM); and the min/max quantization error of an input vector with respect to its best matching unit. The last item is listed separately because it could happen that an input vector is mapped to a unit that is its best matching unit at that point in training, but the unit is later pulled away from this input, so that in the end another unit would match better.
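
As an illustration of the unit-level quantities in this list, the sketch below computes a unit quantization error and a unit mean quantization error. The definitions used (summed distance of the mapped inputs to the unit's weight vector, and that sum divided by the number of mapped inputs) and the Euclidean metric are assumptions made for this example; the metric actually used is the one reported by printMetricUsed, and the class and method names here are hypothetical.

```java
public class QuantizationErrorSketch {
    /** Euclidean distance between two vectors of equal length (assumed metric). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Unit quantization error: summed distance between the weight vector and its mapped inputs. */
    static double unitQE(double[] weights, double[][] mappedInputs) {
        double qe = 0.0;
        for (double[] input : mappedInputs) {
            qe += distance(weights, input);
        }
        return qe;
    }

    /** Unit mean quantization error: unit QE divided by the number of mapped inputs (0 if none). */
    static double unitMQE(double[] weights, double[][] mappedInputs) {
        if (mappedInputs.length == 0) {
            return 0.0;
        }
        return unitQE(weights, mappedInputs) / mappedInputs.length;
    }

    public static void main(String[] args) {
        double[] weights = {0.0, 0.0};
        double[][] inputs = {{3.0, 4.0}, {0.0, 1.0}};
        System.out.println("unit QE  = " + unitQE(weights, inputs));
        System.out.println("unit MQE = " + unitMQE(weights, inputs));
    }
}
```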


printTopographicErrorReport

protected void printTopographicErrorReport()
adds information about the topographic error on the map to the report. Besides a list containing the topographic error of the map and the min/max topographic error on the units, an image visualizing the distribution of the topographic error on the map is output.


getUnitCoords

protected String getUnitCoords(UnitQEContainer value)
formats a list of units for use in the quantization error list. The created format is: "on unit(s) at [x,y] - z vectors mapped, [x2,y2] - z2 vectors mapped, ..."

Parameters:
value - the container from which the information about the number of units can be picked
Returns:
A string formatted as stated above
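
The output format stated above can be reproduced with a short sketch. It is not the method's actual implementation: a plain int array with rows {x, y, mappedVectors} stands in for the UnitQEContainer, whose API is not shown on this page, and the name formatUnits is hypothetical.

```java
public class UnitCoordsSketch {
    /** Formats units given as rows {x, y, numberOfMappedVectors} in the documented style. */
    static String formatUnits(int[][] units) {
        StringBuilder sb = new StringBuilder("on unit(s) at ");
        for (int i = 0; i < units.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append('[').append(units[i][0]).append(',').append(units[i][1]).append("] - ")
              .append(units[i][2]).append(" vectors mapped");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[][] units = {{0, 1, 5}, {2, 3, 7}};
        // Prints: on unit(s) at [0,1] - 5 vectors mapped, [2,3] - 7 vectors mapped
        System.out.println(formatUnits(units));
    }
}
```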

getInputCoords

private String getInputCoords(InputQEContainer value)
formats a list of input vectors for use in the quantization error list. The created format is: "on input vector(s) "id" on unit at [x,y], "id2" on unit at [x2,y2], ..."

Parameters:
value - a container storing all information required to create the output
Returns:
a string formatted as stated above

printClusterInfos

protected void printClusterInfos()
This function prints some information about the possible clusters that can be found on the SOM. It tries to find good, or stable, clusters using a simple heuristic; for a description of this heuristic, see TestRunResult.getStableClusters2. (It works at least quite well with the iris and the animal datasets; more tests are still to be done.) The number of units on the SOM is chosen as the maximal number of clusters. Besides the tree of available clusters, the ten "best" clusters (according to this heuristic) are listed and visualized. An image of the UMatrix of the SOM is also created and attached to the report. Note: I also wanted to run some tests with k-nearest neighbour clustering, but could not find any implementation of it.


classDistInCluster

private String classDistInCluster(int level,
                                  int numbInputs)
returns a formatted string that contains information about the classes present in the given cluster

Parameters:
level - the level of the cluster of interest
numbInputs - the number of input vectors mapped to this cluster
Returns:
a formatted string containing some information about this cluster