Vienna University of Technology
Institute of Software Technology and Interactive Systems
Data Mining with the Java SOMToolbox

Java SOMToolbox Applications & Tools

This page contains reference documentation for the applications and tools available in the Java SOMToolbox.
An overview of the existing applications is also printed when running the somtoolbox.sh script (somtoolbox.bat on Windows systems); detailed help for each application is available by running somtoolbox.sh <applicationName> --help (somtoolbox.bat <applicationName> --help on Windows systems).
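
As a quick illustration of the two invocation forms, the following sketch prints the overview and per-application help commands and runs them only if the launcher is found. It assumes the launcher script sits in the current directory, which this page does not state; GrowingSOM is used merely as an example application name.

```shell
# Assumption: the launcher lives in the current directory (use somtoolbox.bat on Windows).
LAUNCHER=./somtoolbox.sh
APP=GrowingSOM   # any application name listed on this page

OVERVIEW_CMD="$LAUNCHER"
HELP_CMD="$LAUNCHER $APP --help"
echo "overview: $OVERVIEW_CMD"
echo "help:     $HELP_CMD"

# Run the commands only when the launcher is actually present and executable.
if [ -x "$LAUNCHER" ]; then
  "$LAUNCHER"
  "$LAUNCHER" "$APP" --help
fi
```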

Training Applications
GHSOM The Growing Hierarchical SOM grows a hierarchy of maps, depending on the structure of the data set.
GrowingSOM Provides a Growing Grid and a static SOM (with the growth parameters disabled)
MnemonicSOM MnemonicSOM has a grid which is not fully occupied with units.
SOMTrainer Graphical Interface to train a SOM
Viewer Applications
SOMViewer An interactive viewer for exploring SOMs, using different visualisations
UnitFileViewer Plots a unit file; especially useful for plotting a 3D-SOM
Utils Applications
AttendeeMapper Writes an HTML output of the map, highlighting certain data items
DataMapper Maps inputs to an already trained SOM.
DataSetViewer
HTMLOutputter Creates an HTML representation of the Map.
RhythmPatternsMatrixVisualisationSaver Save rhythm patterns matrix visualisation of an input file as images to a file.
SOMMerger Combines the weight vectors of one or more SOM maps to an input vector file
SimilarityRetrieval Performs similarity retrieval in a given database (vector file)
SimilarityRetrievalGUI GUI for similarity retrieval
SimilarityRetrievalWriter Performs similarity retrieval in a given database (vector file)
SomFilePacker Packer to create autonomous SomFiles
TemplateVectorComparator Compares the contents of two template vectors
TrajectoryOutputter Generates a graphical representation of a trajectory of the given points over the map
VisualisationImageSaver Save Visualisations of a map as images to a file.
Helper Applications
ClassSubsetGenerator Generates a subset and class assignment based on the given class file
DataWinnerMappingWriter Writes a data winner mapping file from a trained map
DatasetRandomiser Randomises data sets
DistanceMatrixWriter Writes a distance matrix for the given data
InputDataFileFormatConverter Converts between various file formats for input data.
LabelSOM Implements the LabelSOM labelling method
LagusKeywordLabeler Implements the LagusKeyword labelling method
MapFileFormatConverter Converts between various file formats for trained SOMs.
MapRotator Rotates a map by the given degrees, and writes a new unit- and weight-vector file
MnemonicSOMGenerator Creates an arbitrarily shaped SOM (unit file)
QualityMeasureComputer Wrapper for the individual Quality Measures
RGBPaletteConverter Converts palettes given as RGB values to the XML format used in the SOMToolbox.
SOMLibInputConcatenator Merges two or more SOMLib Input files, i.e. vector and template files
SOMLibVectorNormalization Handles the normalization of vector files in SOMLib format
SOMLibZeroVectorRemover Removes zero vectors (i.e. with 0 in all their components) from files in SOMLib format
StringReplacer Replaces strings by a replacement in the given file
VectorFile2DatabaseImporter Imports input and template vector files to a database
VectorFileAppender Append SOMLibVectorFiles
VectorFileConcatenator Merge SOMLibVectorFiles
VectorFileRewriter Replace labels in SOMLibVectorFiles

Training Applications

GHSOM

The Growing Hierarchical SOM grows a hierarchy of maps, depending on the structure of the data set.

Usage: java at.tuwien.ifs.somtoolbox.models.GHSOM [-h] [-l <labeling>] [-n <numberLabels>] [--numberWinners <numberWinners>] [--skipDWM] <properties>

Options:
  -h
        Generate HTML output.

  -l <labeling>
        Labeling algorithm to use.

  -n <numberLabels>
        Number of labels per unit. Useless if no labeling algorithm is specified.

  --numberWinners <numberWinners>
        Number of winners to write. default: 300

  --skipDWM
        Skip writing the data winner mapping file

  <properties>
        Name of property file.

GrowingSOM

Provides a Growing Grid (Bernd Fritzke, 1995), i.e. a Self-Organising Map that can dynamically grow by inserting whole rows or columns. When the growth-controlling parameters are set to disable growth, it can also be used to train a standard SOM.

Usage: java at.tuwien.ifs.somtoolbox.models.GrowingSOM [-h] [-l <labeling>] [-n <numberLabels>] [-w <weightVectorFile>] [-m <mapDescriptionFile>] [--skipDWM] [--numberWinners <numberWinners>] <properties> [--cpus <cpus>]

Options:
  -h
        Generate HTML output.

  -l <labeling>
        Labeling algorithm to use.

  -n <numberLabels>
        Number of labels per unit. Useless if no labeling algorithm is specified.

  -w <weightVectorFile>
        Weight vector file used for initialization of the map.

  -m <mapDescriptionFile>
        Map description file describing a mapped SOM/GHSOM.

  --skipDWM
        Skip writing the data winner mapping file

  --numberWinners <numberWinners>
        Number of winners to write. default: 300

  <properties>
        Name of property file.

  --cpus <cpus>
        Number of CPUs to use. default: 1
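
A possible GrowingSOM call, sketched under assumptions: mymap.prop is a hypothetical property file, LabelSOM is assumed to be an accepted value for -l (this page lists LabelSOM as a labelling method but does not enumerate the valid values), and the launcher is assumed to be in the current directory. The arguments follow the order of the usage line.

```shell
# Hypothetical property file; replace with your own.
PROPS=mymap.prop

# Arguments in the order of the usage line: write 5 labels per unit,
# skip the data winner mapping file, and train on 2 CPUs.
set -- GrowingSOM -l LabelSOM -n 5 --skipDWM "$PROPS" --cpus 2
TRAIN_CMD="./somtoolbox.sh $*"
echo "$TRAIN_CMD"

# Run only if the launcher script is present.
if [ -x ./somtoolbox.sh ]; then ./somtoolbox.sh "$@"; fi
```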

MnemonicSOM

MnemonicSOM has a grid which is not fully occupied with units.

Usage: java at.tuwien.ifs.somtoolbox.models.MnemonicSOM [-h] [-l <labeling>] [-n <numberLabels>] --dim <dimension> -m <mapDescriptionFile> -u <unitDescriptionFile> [--numberWinners <numberWinners>] [--skipDWM] <properties>

Options:
  -h
        Generate HTML output.

  -l <labeling>
        Labeling algorithm to use.

  -n <numberLabels>
        Number of labels per unit. Useless if no labeling algorithm is specified.

  --dim <dimension>
        The dimension of the input data.

  -m <mapDescriptionFile>
        Map description file describing a mapped SOM/GHSOM.

  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  --numberWinners <numberWinners>
        Number of winners to write. default: 300

  --skipDWM
        Skip writing the data winner mapping file

  <properties>
        Name of property file.

SOMTrainer

The SOMTrainer provides a graphical interface to create a SOM based on different SOM models.

Usage: java at.tuwien.ifs.somtoolbox.apps.trainer.SOMTrainer 

Options:

Viewer Applications

SOMViewer

An interactive viewer for exploring SOMs, using different visualisations

Usage: java at.tuwien.ifs.somtoolbox.apps.viewer.SOMViewer -u <unitDescriptionFile> -w <weightVectorFile> [--dw <dataWinnerMappingFile>] [-m <mapDescriptionFile>] [(-t|--tv) <templateVectorFile>] [-v <inputVectorFile>] [-c <classInformationFile>] [-r <regressionInformationFile>] [-i <dataInformationFile>] [-d <dataNamesFile>] [--linkage <linkageMapFile>] [--corrections <inputCorrections>] [--colours <classColours>] [-p <fileNamePrefix>] [-s <fileNameSuffix>] [--imagePrefix <imagePrefix>] [--imageSuffix <imageSuffix>] [--appdir <applicationDirectory>] [--datadir <viewerWorkingDirectory>] [--vis <initialVisualisation>] [--visParams <initialVisParams>] [--palette <initialPalette>] [(-2|--secondSOM) <secondSOMPrefix>] [-o|--documenMode] [--noplayer] [--pDdecode <probabilityDecode>] [--decodeDir <decodedOutputDir>]

Options:
  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  -w <weightVectorFile>
        Weight vector file describing a SOM/GHSOM.

  --dw <dataWinnerMappingFile>
        Unit description file describing the winners mapped onto a unit for a SOM/GHSOM.

  -m <mapDescriptionFile>
        Map description file describing a mapped SOM/GHSOM.

  -t, --tv <templateVectorFile>
        Template vector file containing vector element labels.

  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -c <classInformationFile>
        Class information file containing the class for each data item.

  -r <regressionInformationFile>
        Regression information file containing the predicted values for each data item.

  -i <dataInformationFile>
        Data information file containing information such as location of each data item.

  -d <dataNamesFile>
        File containing the names of data items to be highlighted.

  --linkage <linkageMapFile>
        File containing data item linkage mapping.

  --corrections <inputCorrections>
        Name of the file containing input corrections.

  --colours <classColours>
        File listing the RGB values of the colours to be used for the class-based visualisations (pie charts, thematic
        class map).

  -p <fileNamePrefix>
        Prefix for the relative path to the files. This option is only to be used if no absolute paths are available.

  -s <fileNameSuffix>
        Suffix to the filenames aka file endings, e.g. ".mp3". This option is only to be used if no absolute paths are
        available.

  --imagePrefix <imagePrefix>
        Prefix to prepend to the labels to find the images

  --imageSuffix <imageSuffix>
        Suffix to append to the labels to find the images

  --appdir <applicationDirectory>
        Directory containing the SOMViewer application start file and the somviewer.prop file.

  --datadir <viewerWorkingDirectory>
        Directory containing the input data items needed for the visualisations.

  --vis <initialVisualisation>
        The name of the initial visualisation.

  --visParams <initialVisParams>
        Parameters for the initial visualisation. Currently only implemented for SmoothedDataHistograms visualisations.

  --palette <initialPalette>
        The name of the initial palette to be used.

  -2, --secondSOM <secondSOMPrefix>
        Prefix for the set of files representing second SOM

  -o, --documenMode
        Activates the Document mode, which hides the PlaySOM toolbar for exporting playlists and instead activates a
        document preview. For this to work properly, you have to specify the document path prefix denoting the base path
        for the files, e.g. 'file:///c:/somemap/files/', and the suffix, like .html.

  --noplayer
        Don't use the internal player PlaySOMPlayer, create the classic PlaySOMPanel

  --pDdecode <probabilityDecode>
        When using multi-channel audio playback: probability of decoding an MP3 file to WAV before playing.

  --decodeDir <decodedOutputDir>
        When using multi-channel audio playback: decoded MP3 files will be saved to and loaded from this directory.
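
A minimal SOMViewer call needs only the two mandatory files, -u and -w. The sketch below adds an input vector file and a class information file as well. All file names are hypothetical (only the .unit and .wgt extensions are mentioned elsewhere on this page), and the launcher location is an assumption.

```shell
# Hypothetical file names; MAP is the common prefix of the trained map's files.
MAP=mymap
set -- SOMViewer -u "$MAP.unit" -w "$MAP.wgt" -v mydata.vec -c mydata.cls
VIEW_CMD="./somtoolbox.sh $*"
echo "$VIEW_CMD"

# Open the viewer only if the launcher script is present.
if [ -x ./somtoolbox.sh ]; then ./somtoolbox.sh "$@"; fi
```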

UnitFileViewer

Plots a unit file; especially useful for plotting a 3D-SOM

Usage: java at.tuwien.ifs.somtoolbox.apps.UnitFileViewer -u <unitDescriptionFile> [--showLabels <showLabels>] [--verbose]

Options:
  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  --showLabels <showLabels>
        How many labels per unit should be displayed. default: 0

  --verbose
        Be more verbose...

Utils Applications

AttendeeMapper

Writes an HTML output of the map, highlighting certain data items

Usage: java at.tuwien.ifs.somtoolbox.output.AttendeeMapper -d <dataNamesFile> [-l <labeling>] -u <unitDescriptionFile> <htmlFile>

Options:
  -d <dataNamesFile>
        File containing the names of data items to be highlighted.

  -l <labeling>
        Labeling algorithm to use.

  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  <htmlFile>
        Name of HTML file to write.

DataMapper

Maps inputs to an already trained SOM. If a unit file is given, the data items are added to the loaded map; without a unit file, the mapping starts with an empty map.

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.DataMapper -w <weightVectorFile> [-m <mapDescriptionFile>] -v <inputVectorFile> [-u <unitDescriptionFile>] [-c <classInformationFile>] [--classlist classList1:classList2:...:classListN ] [-l <labeling>] [-n <numberLabels>] [--numberWinners <numberWinners>] [--skipDWM] [<output>]

Options:
  -w <weightVectorFile>
        Weight vector file describing a SOM/GHSOM.

  -m <mapDescriptionFile>
        Map description file describing a mapped SOM/GHSOM.

  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  -c <classInformationFile>
        Class information file containing the class for each data item.

  --classlist classList1:classList2:...:classListN 
        A List of class names

  -l <labeling>
        Labeling algorithm to use.

  -n <numberLabels>
        Number of labels per unit. Useless if no labeling algorithm is specified.

  --numberWinners <numberWinners>
        Number of winners to write. default: 300

  --skipDWM
        Skip writing the data winner mapping file

  <output>
        Name of the output file.
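
To illustrate the two modes described for DataMapper, this sketch maps a new vector file onto an existing map and passes the unit file so that the items are added to the loaded map. The file names and the output name "mapped" are hypothetical, and the launcher location is an assumption.

```shell
# Hypothetical names: an already trained map and a new vector file to map onto it.
MAP=mymap
NEW_DATA=newdata.vec

# With -u, items are added to the loaded map; leave out -u to start from an empty map.
set -- DataMapper -w "$MAP.wgt" -v "$NEW_DATA" -u "$MAP.unit" --skipDWM mapped
MAP_CMD="./somtoolbox.sh $*"
echo "$MAP_CMD"

if [ -x ./somtoolbox.sh ]; then ./somtoolbox.sh "$@"; fi
```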

DataSetViewer

Usage: java at.tuwien.ifs.somtoolbox.apps.DataSetViewer [-v <inputVectorFile>]

Options:
  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

HTMLOutputter

Creates an HTML representation of the Map. The representation displays a hit histogram, and shows the names of the mapped inputs.

Usage: java at.tuwien.ifs.somtoolbox.output.HTMLOutputter [--metric <metric>] [--normalized] [-v <inputVectorFile>] [--dense] [-l <labeling>] [--ignoreZero] [-n <numberLabels>] [(-t|--tv) <templateVectorFile>] [-w <weightVectorFile>] [-u <unitDescriptionFile>] [-m <mapDescriptionFile>] <htmlFile>

Options:
  --metric <metric>
        Name of the metric to be used for distance calculation in input space. default:
        at.tuwien.ifs.somtoolbox.layers.metrics.L2Metric

  --normalized
        Set, if vectors are normalized to unit length. At the moment this option is not crucial.

  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  --dense
        Set if input data vectors are densely populated.

  -l <labeling>
        Labeling algorithm to use.

  --ignoreZero
        Ignore labels with zero mean value and quantization error (qe).

  -n <numberLabels>
        Number of labels per unit. Useless if no labeling algorithm is specified.

  -t, --tv <templateVectorFile>
        Template vector file containing vector element labels.

  -w <weightVectorFile>
        Weight vector file describing a SOM/GHSOM.

  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  -m <mapDescriptionFile>
        Map description file describing a mapped SOM/GHSOM.

  <htmlFile>
        Name of HTML file to write.

RhythmPatternsMatrixVisualisationSaver

Provides a batch mode to save rhythm patterns matrix visualisation of all inputs in a vector file to image files.

Usage: java at.tuwien.ifs.feature.evaluation.RhythmPatternsMatrixVisualisationSaver -v <inputVectorFile> [--width <width>] [--type <filetype>] [(-g|--unitGrid) <unitGrid>] [(-o|--output) <basename>]

Options:
  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  --width <width>
        The width of a unit. default: 10

  --type <filetype>
        default: png

  -g, --unitGrid <unitGrid>
        Whether to draw the grid of units. default: true

  -o, --output <basename>

SOMMerger

Combines the weight vectors of one or more SOM maps to an input vector file

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.SOMMerger [--mode <mode>] [--skipConversion] [--mapSize <mapSize>] [--inputDir <inputDir>] <output> [maps1 maps2 ... mapsN]

Options:
  --mode <mode>
        The merging mode to use, i.e. Union, Intersection, or a number of vectors in which the term occurs.
        If no mode is provided, all possible combinations will be generated. default: All

  --skipConversion
        Skip conversion of map files to input vector files, if you already did that before.

  --mapSize <mapSize>
        The size of the map to be used to write the properties files, e.g. 4x5.
        If not specified, a default map size will be computed, depending on the number of input vectors.

  --inputDir <inputDir>
        Path to the input directory.

  <output>
        Name of the output file.

  maps1 maps2 ... mapsN
        Prefix for the SOM maps to merge i.e. file name w/o .unit/.wgt extension
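
A sketch of a SOMMerger call that merges two maps in Union mode, using the 4x5 map size from the option description. The map prefixes mapA and mapB and the output name "merged" are hypothetical; the launcher location is an assumption.

```shell
# Map prefixes are given without the .unit/.wgt extension, as described above.
set -- SOMMerger --mode Union --mapSize 4x5 merged mapA mapB
MERGE_CMD="./somtoolbox.sh $*"
echo "$MERGE_CMD"

if [ -x ./somtoolbox.sh ]; then ./somtoolbox.sh "$@"; fi
```

Leaving out --mode would, according to the option description, generate all possible combinations instead.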

SimilarityRetrieval

Performs similarity retrieval in a given database (vector file)

Usage: java at.tuwien.ifs.feature.evaluation.SimilarityRetrieval -v <inputVectorFile> -l <inputLabel> [(-n|--numberNeighbours) <numberNeighbours>] [--metric <metric>]

Options:
  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -l <inputLabel>
        Name/label of the input vector

  -n, --numberNeighbours <numberNeighbours>
        Number of neighbours to find.

  --metric <metric>
        Name of the metric to be used for distance calculation in input space. default:
        at.tuwien.ifs.somtoolbox.layers.metrics.L2Metric

SimilarityRetrievalGUI

Provides a graphical interface for similarity retrieval. Allows loading multiple vector files.

Usage: java at.tuwien.ifs.feature.evaluation.SimilarityRetrievalGUI 

Options:

SimilarityRetrievalWriter

Performs similarity retrieval in a given database (vector file), and writes the results in one file per data item

Usage: java at.tuwien.ifs.feature.evaluation.SimilarityRetrievalWriter -v <inputVectorFile> [(-n|--numberNeighbours) <numberNeighbours>] [--metric <metric>] --outputDir <outputDirectory> [--startIndex <startIndex>] [--numberItems <numberItems>]

Options:
  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -n, --numberNeighbours <numberNeighbours>
        Number of neighbours to find. default: 1000

  --metric <metric>
        Name of the metric to be used for distance calculation in input space. default:
        at.tuwien.ifs.somtoolbox.layers.metrics.L1Metric

  --outputDir <outputDirectory>
        Name of the output directory.

  --startIndex <startIndex>
        The start index. default: 0

  --numberItems <numberItems>
        Number of items
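
The --startIndex and --numberItems options suggest that a large vector file can be processed in batches. The sketch below plans such batches. It assumes that --numberItems limits how many items are processed starting at --startIndex, which the option text does not spell out; the file name, the output directory, and the item count of 2500 are hypothetical.

```shell
TOTAL=2500   # hypothetical number of vectors in mydata.vec
BATCH=1000   # items per run
START=0
BATCH_PLAN=""

while [ "$START" -lt "$TOTAL" ]; do
  # The last batch may be smaller than BATCH.
  REMAIN=$((TOTAL - START))
  if [ "$REMAIN" -lt "$BATCH" ]; then COUNT=$REMAIN; else COUNT=$BATCH; fi

  set -- SimilarityRetrievalWriter -v mydata.vec -n 10 --outputDir results \
         --startIndex "$START" --numberItems "$COUNT"
  echo "./somtoolbox.sh $*"
  BATCH_PLAN="${BATCH_PLAN:+$BATCH_PLAN }$START:$COUNT"

  if [ -x ./somtoolbox.sh ]; then ./somtoolbox.sh "$@"; fi
  START=$((START + BATCH))
done
echo "batches (start:count): $BATCH_PLAN"
```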

SomFilePacker

Packer to create autonomous SomFiles

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.SomFilePacker <output> -u <unitDescriptionFile> -w <weightVectorFile> [-d <dataNamesFile>] [-c <classInformationFile>] [-r <regressionInformationFile>] [-m <mapDescriptionFile>] [-i <dataInformationFile>] [-p <fileNamePrefix>] [-s <fileNameSuffix>] [--dw <dataWinnerMappingFile>] [-v <inputVectorFile>] [(-t|--tv) <templateVectorFile>] [--linkage <linkageMapFile>] [--colours <classColours>] [--corrections <inputCorrections>]

Options:
  <output>
        Name of the output file.

  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  -w <weightVectorFile>
        Weight vector file describing a SOM/GHSOM.

  -d <dataNamesFile>
        File containing the names of data items to be highlighted.

  -c <classInformationFile>
        Class information file containing the class for each data item.

  -r <regressionInformationFile>
        Regression information file containing the predicted values for each data item.

  -m <mapDescriptionFile>
        Map description file describing a mapped SOM/GHSOM.

  -i <dataInformationFile>
        Data information file containing information such as location of each data item.

  -p <fileNamePrefix>
        Prefix for the relative path to the files. This option is only to be used if no absolute paths are available.

  -s <fileNameSuffix>
        Suffix to the filenames aka file endings, e.g. ".mp3". This option is only to be used if no absolute paths are
        available.

  --dw <dataWinnerMappingFile>
        Unit description file describing the winners mapped onto a unit for a SOM/GHSOM.

  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -t, --tv <templateVectorFile>
        Template vector file containing vector element labels.

  --linkage <linkageMapFile>
        File containing data item linkage mapping.

  --colours <classColours>
        File listing the RGB values of the colours to be used for the class-based visualisations (pie charts, thematic
        class map).

  --corrections <inputCorrections>
        Name of the file containing input corrections.

TemplateVectorComparator

Compares the contents of two template vectors

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.TemplateVectorComparator <templateVectorFile> <templateVectorFile2>

Options:
  <templateVectorFile>
        First template vector file.

  <templateVectorFile2>
        Second template vector file.

TrajectoryOutputter

Generates a graphical representation of a trajectory of the given points over the map

Usage: java at.tuwien.ifs.somtoolbox.output.TrajectoryOutputter -d <dataNamesFile> -u <unitDescriptionFile> [-m <mapDescriptionFile>] [-l|--drawLines] <imageFile>

Options:
  -d <dataNamesFile>
        File containing the names of data items to be highlighted.

  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  -m <mapDescriptionFile>
        Map description file describing a mapped SOM/GHSOM.

  -l, --drawLines
        Draw trajectory lines on map.

  <imageFile>
        Name of image file to write. No suffix needed; the image is written as both PNG and EPS.

VisualisationImageSaver

Provides a batch mode to save several/all visualisations of a map to image files.

Usage: java at.tuwien.ifs.somtoolbox.apps.VisualisationImageSaver -u <unitDescriptionFile> -w <weightVectorFile> [-v <inputVectorFile>] [(-t|--tv) <templateVectorFile>] [--dw <dataWinnerMappingFile>] [-c <classInformationFile>] [(-o|--output) <basename>] [--width <width>] [--height <height>] [--type <filetype>] [(-g|--unitGrid) <unitGrid>] [vis1 vis2 ... visN]

Options:
  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  -w <weightVectorFile>
        Weight vector file describing a SOM/GHSOM.

  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -t, --tv <templateVectorFile>
        Template vector file containing vector element labels.

  --dw <dataWinnerMappingFile>
        Unit description file describing the winners mapped onto a unit for a SOM/GHSOM.

  -c <classInformationFile>
        Class information file containing the class for each data item.

  -o, --output <basename>

  --width <width>
        The width of a unit. default: 10

  --height <height>
        The height of a unit.

  --type <filetype>
        default: png

  -g, --unitGrid <unitGrid>
        Whether to draw the grid of units. default: true

  vis1 vis2 ... visN
        The visualisations to create.
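
A sketch of a batch export with VisualisationImageSaver. It assumes that leaving out the visualisation names saves all visualisations, which the description ("several/all") hints at but does not state explicitly. The map prefix and the output basename are hypothetical; the launcher location is an assumption.

```shell
# Hypothetical map prefix; images are written using the given basename.
MAP=mymap
set -- VisualisationImageSaver -u "$MAP.unit" -w "$MAP.wgt" -o "$MAP-vis" --width 20 --type png
SAVE_CMD="./somtoolbox.sh $*"
echo "$SAVE_CMD"

if [ -x ./somtoolbox.sh ]; then ./somtoolbox.sh "$@"; fi
```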

Helper Applications

ClassSubsetGenerator

Generates a subset and class assignment based on the given class file

Usage: java at.tuwien.ifs.feature.ClassSubsetGenerator -c <classInformationFile> [(-z|--gzip) <gzip>] <input> <output>

Options:
  -c <classInformationFile>
        Class information file containing the class for each data item.

  -z, --gzip <gzip>
        Whether or not to gzip the output. default: true

  <input>
        Name of input vector file to be read.

  <output>
        Name of the output file.

DataWinnerMappingWriter

Writes a data winner mapping file from a trained map

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.DataWinnerMappingWriter -v <inputVectorFile> -w <weightVectorFile> -u <unitDescriptionFile> [-m <mapDescriptionFile>] --numberWinners <numberWinners> <output> [--outputDir <outputDirectory>]

Options:
  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -w <weightVectorFile>
        Weight vector file describing a SOM/GHSOM.

  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  -m <mapDescriptionFile>
        Map description file describing a mapped SOM/GHSOM.

  --numberWinners <numberWinners>
        Number of winners to write. default: 300

  <output>
        Name of the output file.

  --outputDir <outputDirectory>
        Name of the output directory.

DatasetRandomiser

Randomises data sets by swapping the order of columns (features/attributes) and/or rows (vectors)

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.DatasetRandomiser -v <inputVectorFile> [(-t|--tv) <templateVectorFile>] [--variants <variants>] [--interleave <interleave>] [--startIndex <startIndex>] [--preserveFeatureOrder] [--preserveVectorOrder] [(-z|--gzip) <gzip>] <output>

Options:
  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -t, --tv <templateVectorFile>
        Template vector file containing vector element labels.

  --variants <variants>
        Number of variants to generate. default: 1

  --interleave <interleave>
        Interleave between the indices. default: 1

  --startIndex <startIndex>
        The start index. default: 1

  --preserveFeatureOrder
        Whether or not to preserve the order of features.

  --preserveVectorOrder
        Whether or not to preserve the order of vectors.

  -z, --gzip <gzip>
        Whether or not to gzip the output. default: true

  <output>
        Name of the output file.

DistanceMatrixWriter

Writes a distance matrix for the given data, in ASCII or binary format

Usage: java at.tuwien.ifs.somtoolbox.data.distance.DistanceMatrixWriter -v <inputVectorFile> [-c <classInformationFile>] [--metric <metric>] [--metricParams <metricParams>] <output> [--outputFormat <outputFormat>]

Options:
  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -c <classInformationFile>
        Class information file containing the class for each data item.

  --metric <metric>
        Name of the metric to be used for distance calculation in input space. default:
        at.tuwien.ifs.somtoolbox.layers.metrics.L2Metric

  --metricParams <metricParams>
        Parameters for the metric.

  <output>
        Name of the output file.

  --outputFormat <outputFormat>
        Format of the output file, valid values are: SOMLib, plain, Binary, Orange default: SOMLib

InputDataFileFormatConverter

Converts between various file formats for input data. Currently supported input formats are [randomAccess, simpleMatrix, SOMPak, Marsyas0.2ARFF, SOMLib, ESOM, ARFF]; supported output formats are [SOMLib, ARFF, randomAccess, ESOM, SOMPak, Orange, CSV, vowpal].

Usage: java at.tuwien.ifs.somtoolbox.data.InputDataFileFormatConverter [--inputFormat <inputFormat>] <input> [(-t|--tv) <templateVectorFile>] [-c <classInformationFile>] [--outputFormat <outputFormat>] [(-z|--gzip) <gzip>] <output> [--skipInstanceNames] [--skipInputsWithoutClass] [--tabSeparated]

Options:
  --inputFormat <inputFormat>
        Format of the input file, valid values are: randomAccess, simpleMatrix, SOMPak, Marsyas0.2ARFF, SOMLib, ESOM,
        ARFF
        If not specified, the format will be determined from the file extension.

  <input>
        Name of input vector file to be read.

  -t, --tv <templateVectorFile>
        Template vector file containing vector element labels.

  -c <classInformationFile>
        Class information file containing the class for each data item.

  --outputFormat <outputFormat>
        Format of the output file, valid values are: SOMLib, ARFF, randomAccess, ESOM, SOMPak, Orange, CSV, vowpal
        If not specified, the format will be determined from the file extension.

  -z, --gzip <gzip>
        Whether or not to gzip the output. default: true

  <output>
        Name of the output file.

  --skipInstanceNames
        Skip writing instance names to the ARFF file.

  --skipInputsWithoutClass
        Skip writing instances without an assigned class to the ARFF file.

  --tabSeparated
        Write the class-file tab separated.
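
As an example of a conversion, the following sketch turns an ARFF file into SOMLib format without gzip compression. The format names are taken from the option descriptions above; the file names iris.arff and iris are hypothetical, and the launcher location is an assumption.

```shell
# Input and output formats are given explicitly rather than derived from the file extension.
set -- InputDataFileFormatConverter --inputFormat ARFF iris.arff \
       --outputFormat SOMLib --gzip false iris
CONVERT_CMD="./somtoolbox.sh $*"
echo "$CONVERT_CMD"

if [ -x ./somtoolbox.sh ]; then ./somtoolbox.sh "$@"; fi
```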

LabelSOM

Implements the LabelSOM labelling method

Usage: java at.tuwien.ifs.somtoolbox.output.labeling.LabelSOM -v <inputVectorFile> (-t|--tv) <templateVectorFile> -w <weightVectorFile> -u <unitDescriptionFile> [-n <numberLabels>] [--dense] [--ignoreZero] [-m <mapDescriptionFile>]

Options:
  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -t, --tv <templateVectorFile>
        Template vector file containing vector element labels.

  -w <weightVectorFile>
        Weight vector file describing a SOM/GHSOM.

  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  -n <numberLabels>
        Number of labels per unit. default: 5

  --dense
        Set if input data vectors are densely populated.

  --ignoreZero
        Ignore labels with zero mean value and quantization error (qe).

  -m <mapDescriptionFile>
        Map description file describing a mapped SOM/GHSOM.

LagusKeywordLabeler

Implements the LagusKeyword labelling method

Usage: java at.tuwien.ifs.somtoolbox.output.labeling.LagusKeywordLabeler -v <inputVectorFile> (-t|--tv) <templateVectorFile> -w <weightVectorFile> -u <unitDescriptionFile> [-n <numberLabels>] [--dense] [-m <mapDescriptionFile>] --inputDir <inputDir>

Options:
  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -t, --tv <templateVectorFile>
        Template vector file containing vector element labels.

  -w <weightVectorFile>
        Weight vector file describing a SOM/GHSOM.

  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  -n <numberLabels>
        Number of labels per unit. default: 5

  --dense
        Set if input data vectors are densely populated.

  -m <mapDescriptionFile>
        Map description file describing a mapped SOM/GHSOM.

  --inputDir <inputDir>
        Path to the input directory.

MapFileFormatConverter

Converts between various file formats for trained SOMs. Currently supported formats are [SOMLib, SOMPak, ESOM]

Usage: java at.tuwien.ifs.somtoolbox.input.MapFileFormatConverter --inputFormat <inputFormat> <input> [-v <inputVectorFile>] [(-t|--tv) <templateVectorFile>] [-u <unitDescriptionFile>] [--outputDir <outputDirectory>] --outputFormat <outputFormat> [(-z|--gzip) <gzip>] <output>

Options:
  --inputFormat <inputFormat>
        Format of the input file, valid values are: SOMLib, SOMPak, ESOM

  <input>
        Name of input vector file to be read.

  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -t, --tv <templateVectorFile>
        Template vector file containing vector element labels.

  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  --outputDir <outputDirectory>
        Name of the output directory.

  --outputFormat <outputFormat>
        Format of the output file, valid values are: SOMLib, SOMPak, ESOM

  -z, --gzip <gzip>
        Whether or not to gzip the output. default: true

  <output>
        Name of the output file.

MapRotator

Rotates a map by the given degrees, and writes a new unit- and weight-vector file

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.MapRotator -u <unitDescriptionFile> -w <weightVectorFile> [--dw <dataWinnerMappingFile>] <output> [(-r|--rotation) <rotation>] [(-f|--flip) <flip>]

Options:
  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  -w <weightVectorFile>
        Weight vector file describing a SOM/GHSOM.

  --dw <dataWinnerMappingFile>
        Unit description file describing the winners mapped onto a unit for a SOM/GHSOM.

  <output>
        Name of the output file.

  -r, --rotation <rotation>
        Rotation of the new map, values are: 90, 180, 270.

  -f, --flip <flip>
        Flip the map, values are horizontal or vertical.
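
Since the rotation accepts only 90, 180, and 270, all rotated variants of a map can be produced in a short loop. This is a sketch with a hypothetical map prefix and hypothetical output names; the launcher location is an assumption.

```shell
MAP=mymap   # hypothetical prefix of the map files
LAST_CMD=""

for DEG in 90 180 270; do
  # One output per rotation, e.g. mymap-rot90.
  set -- MapRotator -u "$MAP.unit" -w "$MAP.wgt" "$MAP-rot$DEG" -r "$DEG"
  LAST_CMD="./somtoolbox.sh $*"
  echo "$LAST_CMD"
  if [ -x ./somtoolbox.sh ]; then ./somtoolbox.sh "$@"; fi
done
```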

MnemonicSOMGenerator

MnemonicSOMGenerator creates an arbitrarily shaped SOM (unit file)

Usage: java at.tuwien.ifs.somtoolbox.util.mnemonic.MnemonicSOMGenerator <backgroundImage> [(-n|--nodes) <totalNodes>] [(-r|--rows) <rows>] [(-c|--columns) <cols>]

Options:
  <backgroundImage>
        The background image (PNG or JPG).

  -n, --nodes <totalNodes>

  -r, --rows <rows>

  -c, --columns <cols>

QualityMeasureComputer

Wrapper for the individual Quality Measures

Usage: java at.tuwien.ifs.somtoolbox.apps.QualityMeasureComputer -w <weightVectorFile> -m <mapDescriptionFile> -u <unitDescriptionFile> -v <inputVectorFile> [(-t|--tv) <templateVectorFile>] [--dw <dataWinnerMappingFile>] --qualityClass <qualityMeasureClass> --qualityVariant <qualityMeasureVariant> [-k <k>] <output> [<properties>]

Options:
  -w <weightVectorFile>
        Weight vector file describing a SOM/GHSOM.

  -m <mapDescriptionFile>
        Map description file describing a mapped SOM/GHSOM.

  -u <unitDescriptionFile>
        Unit description file describing a mapped SOM/GHSOM.

  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -t, --tv <templateVectorFile>
        Template vector file containing vector element labels.

  --dw <dataWinnerMappingFile>
        Unit description file describing the winners mapped onto a unit for a SOM/GHSOM.

  --qualityClass <qualityMeasureClass>
        Quality measure class.

  --qualityVariant <qualityMeasureVariant>
        Quality measure variant.

  -k <k>

  <output>
        Name of the output file.

  <properties>
        Name of property file.

RGBPaletteConverter

Converts palettes given as RGB values to the XML format used in the SOMToolbox.

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.RGBPaletteConverter -f <inputFile> [--name <name>] [--shortName <shortName>] [--description <description>] <output>

Options:
  -f <inputFile>
        Name of input file to be read.

  --name <name>
        Name of the palette.

  --shortName <shortName>
        Short name of the palette.

  --description <description>
        Description of the palette.

  <output>
        Name of the output file.

SOMLibInputConcatenator

Merges two or more SOMLib input files, i.e. vector and template files. Template vectors can be of different dimensionality and may contain different features, but some features may also overlap. Different merge strategies are available: union of all feature sets, intersection of the feature sets, and strategies in between, which retain a feature if it appears in at least x sets

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.SOMLibInputConcatenator [--mode <mode>] [--inputDir <inputDir>] <output> [inputs1 inputs2 ... inputsN]

Options:
  --mode <mode>
        The merging mode to use, i.e. Union, Intersection, or the number of vectors in which a term has to occur.
        If no mode is provided, all possible combinations will be generated. default: All

  --inputDir <inputDir>
        Path to the input directory.

  <output>
        Name of the output file.

  inputs1 inputs2 ... inputsN
        Prefix for the input files to merge i.e. file name w/o .tv/.vec extension
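
The merge strategies described above can be viewed as a single rule with a threshold: keep a feature if it occurs in at least x of the input feature sets. The Java sketch below illustrates that rule only. It is not SOMToolbox code, and the class FeatureMerge and the method retain are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the merge strategies, not taken from the SOMToolbox.
final class FeatureMerge {

    // Returns every feature that occurs in at least minSets of the given feature sets.
    // minSets == 1 yields the union, minSets == featureSets.size() the intersection.
    static List<String> retain(List<Set<String>> featureSets, int minSets) {
        // Count in how many sets each feature appears (a Set holds each feature once).
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Set<String> featureSet : featureSets) {
            for (String feature : featureSet) {
                counts.merge(feature, 1, Integer::sum);
            }
        }
        // Keep the features that reach the threshold.
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minSets) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }
}
```

With three feature sets {a, b, c}, {b, c, d} and {c, d, e}, a threshold of 1 keeps all five features, a threshold of 2 keeps b, c and d, and a threshold of 3 keeps only c.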

SOMLibVectorNormalization

Handles the normalization of vector files in SOMLib format

Usage: java at.tuwien.ifs.somtoolbox.data.SOMLibVectorNormalization [-m <method>] <input> <output>

Options:
  -m <method>
        Normalization method.
        UNIT_LEN normalises the vectors in the input file to unit length.
        MIN_MAX normalises each attribute to the range between 0 and 1.
        STANDARD_SCORE normalises each attribute to a mean of 0, and a max value of the standard deviation.
        default: UNIT_LEN

  <input>
        Name of input vector file to be read.

  <output>
        Name of new vector file to be created.
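
For orientation, the three methods can be sketched in Java as follows. This shows the textbook definitions, not the SOMToolbox code: the class NormalizationSketch is hypothetical, the standard-score variant uses the population standard deviation as an assumption, and the treatment of zero vectors and constant attributes (left at 0) is also an assumption. The input is assumed to be a non-empty, rectangular array with one row per vector.

```java
// Hypothetical sketch of the three normalisation methods (textbook definitions).
final class NormalizationSketch {

    // UNIT_LEN: scales a single vector to Euclidean length 1 (a zero vector stays zero).
    static double[] unitLength(double[] v) {
        double sumOfSquares = 0;
        for (double x : v) {
            sumOfSquares += x * x;
        }
        double norm = Math.sqrt(sumOfSquares);
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = norm == 0 ? 0 : v[i] / norm;
        }
        return out;
    }

    // MIN_MAX: rescales each attribute (column) over all vectors to the range [0, 1].
    static double[][] minMax(double[][] data) {
        int dim = data[0].length;
        double[][] out = new double[data.length][dim];
        for (int j = 0; j < dim; j++) {
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (double[] row : data) {
                min = Math.min(min, row[j]);
                max = Math.max(max, row[j]);
            }
            for (int i = 0; i < data.length; i++) {
                out[i][j] = max == min ? 0 : (data[i][j] - min) / (max - min);
            }
        }
        return out;
    }

    // STANDARD_SCORE: subtracts the attribute mean and divides by the
    // (population) standard deviation of that attribute.
    static double[][] standardScore(double[][] data) {
        int dim = data[0].length;
        double[][] out = new double[data.length][dim];
        for (int j = 0; j < dim; j++) {
            double mean = 0;
            for (double[] row : data) {
                mean += row[j];
            }
            mean /= data.length;
            double variance = 0;
            for (double[] row : data) {
                variance += (row[j] - mean) * (row[j] - mean);
            }
            double std = Math.sqrt(variance / data.length);
            for (int i = 0; i < data.length; i++) {
                out[i][j] = std == 0 ? 0 : (data[i][j] - mean) / std;
            }
        }
        return out;
    }
}
```

Note that UNIT_LEN works on each vector separately, while MIN_MAX and STANDARD_SCORE work on each attribute across all vectors.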

SOMLibZeroVectorRemover

Removes zero vectors (i.e. with 0 in all their components) from files in SOMLib format

Usage: java at.tuwien.ifs.somtoolbox.data.SOMLibZeroVectorRemover <input> <output>

Options:
  <input>
        Name of input vector file to be read.

  <output>
        Name of new vector file to be created.
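
The filtering criterion itself is simple and can be sketched in Java as below. The class ZeroVectorFilter is a hypothetical name for this illustration and does not read or write SOMLib files; it only shows the test "all components are 0" applied to vectors already held in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the zero-vector criterion, not SOMToolbox code.
final class ZeroVectorFilter {

    // A vector counts as a zero vector if every component equals 0.
    static boolean isZeroVector(double[] vector) {
        for (double component : vector) {
            if (component != 0.0) {
                return false;
            }
        }
        return true;
    }

    // Returns a new list containing only the non-zero vectors, in their original order.
    static List<double[]> removeZeroVectors(List<double[]> vectors) {
        List<double[]> kept = new ArrayList<>();
        for (double[] vector : vectors) {
            if (!isZeroVector(vector)) {
                kept.add(vector);
            }
        }
        return kept;
    }
}
```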

StringReplacer

Replaces strings by a replacement in the given file

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.StringReplacer --replace <replace> [--replacement <replacement>] [-f <inputFile>] [--inputDir <inputDir>]

Options:
  --replace <replace>
        The string to be replaced.

  --replacement <replacement>
        The string to replace with.

  -f <inputFile>
        Name of input file to be read.

  --inputDir <inputDir>
        Path to the input directory.

VectorFile2DatabaseImporter

Imports input and template vector files to a database

Usage: java at.tuwien.ifs.somtoolbox.database.VectorFile2DatabaseImporter -v <inputVectorFile> (-t|--tv) <templateVectorFile> --dbName <databaseName> --tablePrefix <databaseTableNamePrefix> [--server <databaseServerAddress>] [--user <databaseUser>] [--password <databasePassword>]

Options:
  -v <inputVectorFile>
        Input file containing the input vectors to be mapped.

  -t, --tv <templateVectorFile>
        Template vector file containing vector element labels.

  --dbName <databaseName>
        Name of the database.

  --tablePrefix <databaseTableNamePrefix>
        Prefix for the tables in the database.

  --server <databaseServerAddress>
        Server name or IP address of the database server. default: localhost

  --user <databaseUser>
        User name for the database access. default: root

  --password <databasePassword>
        Password for the database access. default: (empty password)

VectorFileAppender

Appends multiple VectorFiles containing the same type of features into one vector file

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.VectorFileAppender <output> input1 input2 ... inputN

Options:
  <output>
        Name of new vector file to be created.

  input1 input2 ... inputN
        The input files

VectorFileConcatenator

Merges two or more VectorFiles containing different features of the same data into one vector file

Usage: java at.tuwien.ifs.somtoolbox.apps.helper.VectorFileConcatenator <output> input1 input2 ... inputN [(-q|--weights) weights1:weights2:...:weightsN ] [--outputFormat <outputFormat>] [--tv]

Options:
  <output>
        Name of new vector file to be created.

  input1 input2 ... inputN
        The input files

  -q, --weights weights1:weights2:...:weightsN 
        Apply different weights when normalising the vectors. No normalisation if skipped, missing values default to 1

  --outputFormat <outputFormat>
        Format of the output file, valid values are: SOMLib, ARFF, randomAccess, ESOM, SOMPak, Orange, CSV, vowpal
        If not specified, the format will be determined from the file extension.

  --tv
        Create and write an appropriate TemplateVector file: <outfile>.tv
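
The difference between VectorFileAppender and VectorFileConcatenator can be pictured on plain arrays: appending stacks more data items (rows) that have the same features, while concatenating joins different features (columns) of the same data items. The Java sketch below illustrates only this distinction. The class VectorCombine is hypothetical, it assumes that the rows of both inputs are already in the same order, and it leaves out the weighting and normalisation controlled by -q.

```java
// Hypothetical illustration of appending vs. concatenating, not SOMToolbox code.
final class VectorCombine {

    // Appending: both inputs have the same features; the result holds the rows of a
    // followed by the rows of b.
    static double[][] append(double[][] a, double[][] b) {
        double[][] out = new double[a.length + b.length][];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i].clone();
        }
        for (int i = 0; i < b.length; i++) {
            out[a.length + i] = b[i].clone();
        }
        return out;
    }

    // Concatenating: both inputs describe the same data items (assumed to be in the
    // same order); each result row is the row of a followed by the row of b.
    static double[][] concatenate(double[][] a, double[][] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Inputs must contain the same number of vectors");
        }
        double[][] out = new double[a.length][];
        for (int i = 0; i < a.length; i++) {
            out[i] = new double[a[i].length + b[i].length];
            System.arraycopy(a[i], 0, out[i], 0, a[i].length);
            System.arraycopy(b[i], 0, out[i], a[i].length, b[i].length);
        }
        return out;
    }
}
```

Appending two inputs with n and m vectors of dimension d gives n + m vectors of dimension d. Concatenating two inputs with n vectors each, of dimensions d1 and d2, gives n vectors of dimension d1 + d2.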

VectorFileRewriter

Replaces labels in SOMLibVectorFiles by means of a mapping file; labels in the mapping file must be separated by a tab