ImageCLEFwikipedia2010-2011
Domain | Web |
Media | Image, Text |
Size | 21 GB |
Instances | 237,434 images |
File Format | JPEG, PNG |
Creation Date | 2010 |
Task | Retrieval |
Copyright | |
URL | http://www.imageclef.org/wikidata |
Description
The Wikipedia Retrieval 2010 collection consists of 237,434 images, their associated user-generated textual annotations (i.e., the images' textual descriptions extracted from the Wikimedia Commons files and the images' captions in the Wikipedia article(s) that contain them), and the Wikipedia articles containing the images.
The collection was built to cover similar topics in English, German and French, with the following language distribution for their associated textual annotations:
- English only: 70,127
- German only: 50,291
- French only: 28,461
- English and German: 26,880
- English and French: 20,747
- German and French: 9,646
- English, German and French: 22,899
- Language undetermined: 8,144
- No textual annotation: 239
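As a quick sanity check, the distribution above can be tallied to confirm that the categories partition the collection and to derive how many images carry an annotation in each language. The following is a minimal sketch; the dictionary layout, the language codes, and the helper name `images_with_language` are illustrative assumptions and are not part of the collection's distribution format.

```python
# Counts copied from the language distribution listed above.
# Keys are tuples of language codes (hypothetical labels chosen for this sketch).
annotation_counts = {
    ("en",): 70_127,
    ("de",): 50_291,
    ("fr",): 28_461,
    ("en", "de"): 26_880,
    ("en", "fr"): 20_747,
    ("de", "fr"): 9_646,
    ("en", "de", "fr"): 22_899,
    ("undetermined",): 8_144,
    (): 239,  # no textual annotation
}

# The nine categories are mutually exclusive, so their sum is the collection size.
total = sum(annotation_counts.values())


def images_with_language(lang):
    """Count images whose annotations include the given language."""
    return sum(n for langs, n in annotation_counts.items() if lang in langs)


per_language = {lang: images_with_language(lang) for lang in ("en", "de", "fr")}

print(total)
print(per_language)
```

The total comes to 237,434, matching the stated number of images. Under this tally, 140,653 images have an English annotation, 109,716 a German one, and 81,753 a French one.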
The topics are descriptions of multimedia information needs that contain textual and visual hints. There were 70 topics in 2010 and 50 topics in 2011.
Quality
Source
Ground Truth Annotation
The ground truth for the 2010 and 2011 topics was created by assuming binary relevance (relevant vs. non-relevant) and by assessing only the images in the pools formed from the runs submitted by the participants each year; a pool depth of 100 was used in both 2010 and 2011.
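The pooling step described above can be sketched as follows: for each topic, the top-ranked images of every submitted run are merged into one set, and only the images in that set are assessed. This is a minimal illustration of depth-k pooling in general, not the organizers' actual tooling; the function name `build_pools`, the run representation (a dict mapping topic id to a ranked list of image ids), and the toy data are assumptions made for the example.

```python
from collections import defaultdict


def build_pools(runs, depth=100):
    """Merge the top `depth` images of every run into one pool per topic.

    `runs` is an iterable of dicts mapping a topic id to a ranked list of
    image ids (best first). This layout is hypothetical, for illustration.
    """
    pools = defaultdict(set)
    for run in runs:
        for topic_id, ranked_images in run.items():
            pools[topic_id].update(ranked_images[:depth])
    return dict(pools)


# Toy example with two runs and a pool depth of 2 (the task used 100).
run_a = {"t1": ["img1", "img2", "img3"]}
run_b = {"t1": ["img2", "img4", "img5"]}
pools = build_pools([run_a, run_b], depth=2)

# Binary relevance: each pooled image is judged relevant (1) or non-relevant (0).
# The set of relevant images here is made up for the example.
assessed_relevant = {"img2"}
judgments = {img: int(img in assessed_relevant) for img in sorted(pools["t1"])}
print(judgments)
```

In the toy example, `img3` and `img5` fall below the pool depth in both runs, so they are never assessed.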
Features
The CIME, TLEP, SURF, and CEDD features are available for the images in the collection and for the topic examples.
Licensing / Copyright
The content of the collection is extracted from Wikipedia and Wikimedia Commons. Licenses for texts and images in the collection are those used in the original sources.
Citation
- Theodora Tsikrika, Adrian Popescu, and Jana Kludas. Overview of the Wikipedia Image Retrieval task at ImageCLEF 2011. In the Working Notes for the CLEF 2011 Labs and Workshop, 19-22 September, Amsterdam, The Netherlands, 2011.
- Adrian Popescu, Theodora Tsikrika, and Jana Kludas. Overview of the Wikipedia Retrieval task at ImageCLEF 2010. In the Working Notes for the CLEF 2010 Workshop, 20-23 September, Padova, Italy, 2010.
External Links
http://www.imageclef.org/wikidata