ImageCLEFwikipedia2010-2011

ImageCLEFwikipedia2010-2011
Domain Web
Media Image, Text
Size 21 GB
Instances 237,434 images
File Format JPEG, PNG
Creation Date 2010
Task Retrieval
Copyright
URL http://www.imageclef.org/wikidata


Description

The Wikipedia Retrieval 2010 collection consists of 237,434 images, their associated user-generated textual annotations (i.e., the images' textual descriptions extracted from the Wikimedia Commons files and the images' captions in the Wikipedia article(s) that contain them), and the Wikipedia articles containing the images.

The collection was built to cover similar topics in English, German and French. The images' textual annotations are distributed across these languages as follows:

- English only: 70,127
- German only: 50,291
- French only: 28,461
- English and German: 26,880
- English and French: 20,747
- German and French: 9,646
- English, German and French: 22,899
- Language undetermined: 8,144
- No textual annotation: 239
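
As a quick consistency check, the nine categories above can be summed to confirm that they account for all 237,434 images, and the per-language totals can be derived from them. The following is a minimal sketch; it assumes the categories are mutually exclusive (as the "only" labels suggest), and the language codes and variable names are illustrative rather than part of the collection.

```python
# Language distribution of the textual annotations, copied from the list above.
# Each entry is (languages present in the annotation, number of images).
distribution = [
    (("en",), 70127),
    (("de",), 50291),
    (("fr",), 28461),
    (("en", "de"), 26880),
    (("en", "fr"), 20747),
    (("de", "fr"), 9646),
    (("en", "de", "fr"), 22899),
    ((), 8144),  # language undetermined
    ((), 239),   # no textual annotation
]

# Assumption: the categories are disjoint, so their sum is the collection size.
total = sum(count for _, count in distribution)

# Images with at least some annotation in each language.
per_language = {
    lang: sum(count for langs, count in distribution if lang in langs)
    for lang in ("en", "de", "fr")
}

print(total)         # 237434
print(per_language)  # {'en': 140653, 'de': 109716, 'fr': 81753}
```

Under that assumption, the categories sum exactly to the stated collection size, and English is the best-covered language, followed by German and French.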

The topics are descriptions of multimedia information needs that contain textual and visual hints. There were 70 topics in 2010 and 50 topics in 2011.

Quality

Source

ImageCLEF

Ground Truth Annotation

The ground truth for the 70 topics of 2010 and the 50 topics of 2011 was created by assuming binary relevance (relevant vs. non-relevant) and by assessing only the images in the pools formed from the images retrieved in the runs that participants submitted each year. A pool depth of 100 was used in both 2010 and 2011.
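
The pooling step described above can be sketched as follows. This is a minimal illustration of depth-k pooling in general, not the organizers' actual tooling: the function name `build_pool`, the run names and the image identifiers are all hypothetical.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int = 100) -> Set[str]:
    """Return the union of the top-`depth` image IDs of every run for one topic.

    `runs` maps a (hypothetical) run name to its ranked list of image IDs,
    best match first. Only images in the returned pool would be assessed.
    """
    pool: Set[str] = set()
    for ranked_images in runs.values():
        pool.update(ranked_images[:depth])
    return pool


# Toy example with a depth of 2 instead of 100, to keep it readable.
toy_runs = {
    "run_a": ["img1", "img2", "img3"],
    "run_b": ["img2", "img4", "img5"],
}
toy_pool = build_pool(toy_runs, depth=2)
print(sorted(toy_pool))  # ['img1', 'img2', 'img4']
```

Images that no run ranked within the pool depth (here `img3` and `img5`) are never assessed, and in pooled evaluations of this kind they are typically treated as non-relevant.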


Features

The CIME, TLEP, SURF and CEDD features are available for the images in the collection and for the topic example images.


Licensing / Copyright

The content of the collection is extracted from Wikipedia and Wikimedia Commons. Licenses for texts and images in the collection are those used in the original sources.


Citation


External Links

http://www.imageclef.org/wikidata
