BlogData

BlogData
Domain: Blog posts
Media: Text
Size: 27 GB (compressed)
Instances: 44 million blog posts
File Format: XML
Creation Date:
Task: Retrieval
Copyright:
URL: http://groups.google.com/group/icwsm-data

Domain

  • A set of blog posts, including the posted text and metadata such as the blog's homepage, timestamps, etc.

Comments

Media (image, video, mixed, ...)

Size (no images, in GB, ...)

  • 27 GB compressed (142 GB uncompressed)

Source (FlickR, Corel)

Annotation type (free text, structured, ...)

Ground truth

Event or project

Task (retrieval, recognition, ...)

Format

  • 14 tiers of XML documents (44 million blog posts)
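Because the collection is far too large to load into memory at once, the XML files are best read as a stream. The following is a minimal sketch of that approach, not the dataset's documented schema: the helper name count_elements and the <item> tag are assumptions made for illustration, and the real element names should be checked against the dataset files.

```python
import io
import xml.etree.ElementTree as ET


def count_elements(source, tag):
    """Count elements named `tag` in an XML stream, parsing incrementally."""
    count = 0
    # iterparse yields each element as soon as its closing tag is read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            elem.clear()  # drop the element's children and text once counted
    return count


# Tiny in-memory stand-in for one dataset file; <item> is a hypothetical tag.
sample = io.BytesIO(
    b"<dataset><item><title>a</title></item><item><title>b</title></item></dataset>"
)
print(count_elements(sample, "item"))
```

The same function accepts an open file object or a path, so it could be pointed at a decompressed tier file once the actual per-post element name is known.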

Quality (resolution)

Creation date

Copyright

  • Permitted uses are described in the file icwsm-spinn3r.pdf, located in the dataset path

URL
