[elan@student elan]$ mkdir demo
[elan@student elan]$ mv pressetext.email demo/
[elan@student elan]$ java somlib.textrepresentation.emailexc -i demo/pressetext.email -o demo/mails -u demo/plain -r demo/descr
Creating directory: demo/mails
Mailfile: start extracting...
Files Created. Max ID: 811
Mailfile: end.
Creating directory: demo/plain
Creating directory: demo/descr
EMails: Creating files for parser...
EMails with unknown types: 0.
EMails: end.
[elan@student elan]$ java somlib.textrepresentation.wordsexc -i demo/plain/ -o demo/words
Creating directory: demo/words
Start Words:
Files: 811
Done.
[elan@student elan]$ java somlib.textrepresentation.templatevectorexc -i demo/words/ -o demo/tv
Start: CreateTemplate:
Reading 811 Files.
End: CreateTemplate.
[elan@student elan]$ java somlib.textrepresentation.reducerexc -i demo/tv -o demo/rtv -n 0.02 -x 0.8
$files: 811
Reducer: start
words: 38222
max limit: 648
min limit: 16
words - reduced: 1379
Reducer: done.
[elan@student elan]$ java somlib.textrepresentation.extractorexc -i demo/words/ -j demo/rtv -o demo/demo -t f -b f -f t
making TFxIDF.
Extractor: start
Extractor: done.
done TFxIDF.
[elan@student elan]$ ls demo
descr demo.tfxidf demo.tv mails plain pressetext.email rtv tv words
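
The "min limit" and "max limit" values printed by the reducer step look consistent with the -n and -x fractions being multiplied by the number of files and rounded down. This is an inference from the numbers in the transcript, not documented behaviour of the tool. The sketch below checks that arithmetic; `df_limit` is a hypothetical helper name introduced only for illustration.

```shell
# Hypothetical helper (not part of somlib): compute a document-frequency
# limit as floor(number_of_files * fraction).
# Assumption: the reducer derives its limits this way; the transcript
# only shows the resulting numbers.
df_limit() {
    awk -v n="$1" -v f="$2" 'BEGIN { printf "%d\n", int(n * f) }'
}

df_limit 811 0.02   # lower limit for -n 0.02
df_limit 811 0.8    # upper limit for -x 0.8
```

With 811 files this prints 16 and 648, matching the limits reported in the transcript.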