TUMKitchen

From Chorus
Revision as of 15:59, 29 November 2010 by Frank (Talk | contribs)
TUMKitchen
Domain Human actions
Media Image, Video
Size
Instances
File Format JPEG, XVID, MPEG-4
Creation Date
Task recognition, segmentation
Copyright
URL http://ias.cs.tum.edu/download/kitchen-activity-data


Domain

  • The TUM Kitchen Data Set is provided to foster research in the areas of markerless human motion capture, motion segmentation and human activity recognition.

Comments

  • It should aid researchers in these fields by providing a comprehensive collection of sensory input data that can be used to test and verify their algorithms. It is also meant to serve as a benchmark for comparative studies, given the manually annotated "ground truth" labels of the underlying actions. The recorded activities were selected to provide realistic, natural-looking motions, and consist of everyday manipulation activities in a natural kitchen environment.

Media (image, video, mixed, …)

  • Multi-modal sensor data:

- Video data from four fixed, overhead cameras (384x288 pixels RGB color or 780x582 pixels raw Bayer pattern, at 25Hz)
- Motion capture data (*.bvh file format) extracted from the videos using our markerless full-body MeMoMan tracker
- RFID tag readings from three fixed readers embedded in the environment (sample rate 2Hz)
- Magnetic (reed) sensors detecting when a door or drawer is opened (sample rate 10Hz)
- Action labels (the data is labeled separately for the left hand, the right hand, and the trunk of the person)
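
Because the modalities are sampled at different rates (25Hz video, 2Hz RFID, 10Hz reed sensors), relating a video frame to the sensor readings needs a common time base. The following is a minimal sketch of that mapping, not the data set's own tooling. It assumes that all streams start at the same instant and are evenly sampled, which this page does not confirm; the function names are hypothetical.

```python
# Nominal sample rates taken from the sensor list above (in Hz).
VIDEO_HZ = 25
RFID_HZ = 2
REED_HZ = 10


def sample_index_for_frame(frame_index, sensor_hz, video_hz=VIDEO_HZ):
    """Index of the most recent sensor sample at or before a video frame.

    Assumption: both streams start at t = 0 and are evenly sampled.
    Integer arithmetic avoids floating-point rounding at exact boundaries.
    """
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    return (frame_index * sensor_hz) // video_hz


def align_frame(frame_index):
    """Map a video frame index to its nominal time and sensor sample indices."""
    return {
        "time_s": frame_index / VIDEO_HZ,
        "rfid_index": sample_index_for_frame(frame_index, RFID_HZ),
        "reed_index": sample_index_for_frame(frame_index, REED_HZ),
    }
```

Under these assumptions, `align_frame(30)` refers to 1.2 seconds into the recording, RFID sample 2, and reed sample 12. If the real recordings carry explicit timestamps, those should be preferred over nominal rates.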

Size (no images, in GB, …)

Source (FlickR, Corel)

Annotation type (free text, structured, …)

Ground truth

  • Manually annotated "ground truth" labels of the underlying actions
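
For motion segmentation experiments, per-frame labels are often easier to compare once they are collapsed into contiguous segments. The sketch below shows one way to do that. The layout of the label files is not described on this page, so the input is assumed to be a plain list with one label per frame, and the label names used in the usage note are invented for illustration.

```python
from itertools import groupby


def labels_to_segments(frame_labels):
    """Collapse a per-frame label sequence into (label, start, end) runs.

    start and end are inclusive frame indices.
    """
    segments = []
    index = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of consecutive frames with this label
        segments.append((label, index, index + length - 1))
        index += length
    return segments
```

For example, the hypothetical sequence `["Idle", "Idle", "Reach", "Reach", "Reach", "Idle"]` yields `[("Idle", 0, 1), ("Reach", 2, 4), ("Idle", 5, 5)]`. The same function could be applied separately to the left-hand, right-hand, and trunk label streams.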

Event or project

Task (retrieval, recognition, …)

  • Areas of markerless human motion capture, motion segmentation and human activity recognition

Format

  • The video data is available in three different formats:

- avi: xvid encoded RGB color video of size 384x288 pixels
- raw: mpeg4 (lavc) encoded video of the original raw camera input stream (monochrome Bayer pattern RGGB) of size 780x582 pixels
- jpg: gzipped tar archive of jpeg files for each frame (size 384x288 pixels)
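
The raw format stores a monochrome Bayer mosaic rather than color images, so color has to be reconstructed after decoding. The following is a minimal sketch of the simplest approach: each 2x2 RGGB cell becomes one RGB pixel, which halves the resolution. It assumes that a frame has already been decoded into a flat, row-major list of intensities and that the pattern starts with red in the top-left corner; decoding the video itself is not shown, and the function name is hypothetical. Real pipelines usually interpolate to full resolution instead.

```python
def demosaic_rggb_half(raw, width, height):
    """Convert an RGGB Bayer mosaic to a half-resolution RGB image.

    raw is a flat, row-major sequence of width * height intensities.
    Each 2x2 cell
        R G
        G B
    becomes one (r, g, b) tuple; the two green samples are averaged.
    Returns a list of rows, each a list of (r, g, b) tuples.
    """
    if width % 2 or height % 2:
        raise ValueError("width and height must be even")
    if len(raw) != width * height:
        raise ValueError("raw does not match width * height")

    rgb_rows = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            r = raw[y * width + x]
            g1 = raw[y * width + x + 1]
            g2 = raw[(y + 1) * width + x]
            b = raw[(y + 1) * width + x + 1]
            row.append((r, (g1 + g2) // 2, b))
        rgb_rows.append(row)
    return rgb_rows
```

Applied to a 780x582 frame, this produces a 390x291 color image. Since the raw stream is compressed with a lossy codec, the mosaic may not be preserved exactly, so results should be checked visually.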

Quality (resolution)

Creation date

Copyright

  • Technische Universität München

URL
