BBCnews
From Chorus
| Domain | News Media | 
| Media | Image | 
| Size | ~255 MB (compressed) |
| Instances | |
| File Format | XHTML, XML | 
| Creation Date | |
| Task | retrieval | 
| Copyright | |
| URL | http://mlg.ucd.ie/datasets/bbc.html | 
Domain
- News media
Comments
- Cross-media dataset combining images and text
- BBC News HTML pages sorted into 11 categories and split into two sets
Media (image, video, mixed, …)
- Images
Size (no images, in GB, …)
- ~255 MB compressed
Source (FlickR, Corel)
- Joao Magalhaes (crawled from the internet)
Annotation type (free text, structured, …)
Ground truth
Event or project
Task (retrieval, recognition, …)
- Retrieval
Format
- XHTML pages, XML files containing metadata and images extracted from the XHTML pages
Quality (resolution)
Creation date
Copyright
- Joao Magalhaes
