Claims to Passage Task 2012


Topics in this task are sets of claims extracted from actual patent application documents. Participants are asked to return passages that are relevant to the topic claims. The passages must occur in the documents in the CLEF-IP 2012 collection. No other data is allowed to be used in preparing for this task.

These sets of claims were chosen based on existing search reports for the patent applications under consideration. We often extracted more than one topic from a single patent application document.

The topics (defined below) also contain a pointer to the original patent application file. The content of that xml file (other than the claims selected as topics) may be used as you like.


You can read further clarifications on this task here.

Input

A topic in the 'Claims to Passage' task contains the following SGML tags:

<tid>topic_id</tid>
<tfile>topic_file.xml</tfile>
<tclaims>xpaths_to_claims</tclaims>
   

where

  • 'tid' contains the id of the topic
  • 'tfile' contains the name of the xml file from which the topic claims are extracted
  • 'tclaims' contains the xpaths, in the xml file, to the claims selected as topics. The xpaths are separated by spaces.

Example (taken from the set of training topics):

<tid>tPSG-5</tid>
<tfile>EP-1480263-A1.xml</tfile>
<tclaims>/patent-document/claims/claim[1] /patent-document/claims/claim[2] 
/patent-document/claims/claim[3] /patent-document/claims/claim[16] 
/patent-document/claims/claim[17] /patent-document/claims/claim[18] </tclaims>
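
For illustration, here is a minimal sketch of how a topic could be read and its claim xpaths resolved against the topic file. It assumes Python with the lxml package; the collection path and the in-memory topic structure are our own choices, not part of the task definition:

from lxml import etree

# One topic, transcribed from the SGML example above (first three claims).
topic = {
    "tid": "tPSG-5",
    "tfile": "EP-1480263-A1.xml",
    "tclaims": [
        "/patent-document/claims/claim[1]",
        "/patent-document/claims/claim[2]",
        "/patent-document/claims/claim[3]",
    ],
}

# 'collection/' is a hypothetical location of the CLEF-IP 2012 files.
tree = etree.parse("collection/" + topic["tfile"])
for xpath in topic["tclaims"]:
    for claim in tree.xpath(xpath):
        # Concatenate all text nested inside the <claim> element.
        text = " ".join(claim.itertext()).strip()
        print(topic["tid"], xpath, text[:80])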
	


Output

The retrieval results should be returned in a text file with 6 columns, as described below (based on the TREC format):

   
topic_id Q0 doc_id rel_psg_xpath psg_rank psg_score

where:

  • topic_id is the identifier of a topic
  • Q0 is a value maintained for historical reasons
  • doc_id is the identifier of the patent document (i.e. file name WITHOUT extension) in which the relevant passages occur
  • rel_psg_xpath is the xpath identifying the relevant passage in the doc_id document
  • psg_rank is the rank of the passage in the overall list of relevant passages
  • psg_score is the score of the passage in the (complete) list of relevant passages

We allow only one xpath per line in the result files. If several passages are considered relevant for a topic, they must be placed on separate lines.
The length of a result file is limited: it may contain at most 100 distinct doc_ids (counted while ignoring the xpaths).

Example (taken from the qrels; both psg_rank and psg_score values are therefore fictional):

   
...
tPSG-5 Q0 WO-2002015251-A1 /patent-document/claims/claim 5 1.34
tPSG-5 Q0 WO-2002015251-A1 /patent-document/description/p[22] 6 1.11
tPSG-5 Q0 WO-2002015251-A1 /patent-document/description/p[23] 7 0.87
tPSG-5 Q0 WO-2002015251-A1 /patent-document/description/p[34] 8 0.80
...
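
As a sketch of producing a well-formed run file in this format (Python; the in-memory result list, the group name in the file name, and our per-topic reading of the 100-doc_id limit are assumptions, not given by the task):

# Hypothetical in-memory results: (topic_id, doc_id, xpath, score).
results = [
    ("tPSG-5", "WO-2002015251-A1", "/patent-document/description/p[22]", 1.11),
    ("tPSG-5", "WO-2002015251-A1", "/patent-document/description/p[23]", 0.87),
]

with open("mygroup-run1-PSG.txt", "w") as out:  # illustrative file name
    seen = {}   # topic_id -> set of doc_ids already written
    rank = {}   # topic_id -> rank of the last passage written
    # One xpath per line, ordered per topic by descending score.
    for topic_id, doc_id, xpath, score in sorted(
            results, key=lambda r: (r[0], -r[3])):
        docs = seen.setdefault(topic_id, set())
        if doc_id not in docs and len(docs) >= 100:
            continue  # we read the limit as 100 distinct doc_ids per topic
        docs.add(doc_id)
        rank[topic_id] = rank.get(topic_id, 0) + 1
        out.write(f"{topic_id} Q0 {doc_id} {xpath} {rank[topic_id]} {score}\n")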


Run Submission

Each participant is allowed to submit up to 8 run files. Each run should be submitted compressed. As in previous years, the run files should be named using the following schema: participantID-runID-taskID.extension (for example, mygroup-run1-PSG.zip, where the group and run names are only illustrative).
   participantID identifies your institution/group
   runID identifies the different runs you submit
   taskID should be PSG
   extension is tgz, gz, zip, or another extension produced by your compression program.

As seen above, the topics also contain a pointer to the original patent application file. Participants in the task are allowed to use the content of this file as they see fit.


Evaluation Measures

There are two levels at which we compute measures on the submitted runs: the document level and the passage level.

At the Document Level

The main measure we will report is PRES at the 20 and 100 cut-offs. PRES rewards systems that return relevant documents earlier in the retrieval list.
In order to apply PRES to the submitted runs, the runs will be stripped of the passage information while the document ranking is kept. For instance, the example run

   
...
tPSG-16 Q0 WO-2000078185-A2 /patent-document/abstract[1]/p 1 2.53
tPSG-16 Q0 WO-2000078185-A2 /patent-document/abstract[2]/p 2 2.2
tPSG-16 Q0 WO-2000078185-A2 /patent-document/description/p[41] 3 1.89
tPSG-16 Q0 WO-2000078185-A2 /patent-document/description/p[42] 4 1.75
tPSG-16 Q0 WO-2000078185-A2 /patent-document/description/p[43] 5 1.5
tPSG-16 Q0 WO-2000078185-A2 /patent-document/description/p[44] 6 1.02
tPSG-16 Q0 WO-2000078185-A2 /patent-document/description/p[45] 7 0.9
tPSG-16 Q0 WO-2000078185-A2 /patent-document/description/p[46]  8 0.8
tPSG-16 Q0 WO-2000078185-A2 /patent-document/description/p[47]  9 0.7
tPSG-16 Q0 WO-1997007715-A1 /patent-document/abstract[1]/p 10 0.66
tPSG-16 Q0 WO-1997007715-A1 /patent-document/abstract[2]/p 11 0.60
tPSG-16 Q0 WO-1997007715-A1 /patent-document/description/p[43] 12 0.5
tPSG-16 Q0 WO-1997007715-A1 /patent-document/description/p[44] 13 0.42
tPSG-16 Q0 WO-1997007715-A1 /patent-document/description/p[45] 14 0.42
tPSG-16 Q0 WO-1997007715-A1 /patent-document/description/p[46] 15 0.42
...
will be processed into the following:
   
...
tPSG-16 Q0 WO-2000078185-A2 1 2.53
tPSG-16 Q0 WO-1997007715-A1 2 0.66
...
and given as input to the script computing the PRES score. (Note that the psg_score column - the last one - is ignored in the PRES computation.)
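
The stripping step can be pictured as follows. The PRES function below is our reading of Magdy & Jones (SIGIR 2010), offered only as a hedged sketch; the exact formula is not given on this page:

def collapse_to_documents(run_lines):
    """Keep the first occurrence of each (topic, doc) pair, preserving order."""
    seen, collapsed = set(), []
    for topic_id, q0, doc_id, xpath, rank, score in run_lines:
        if (topic_id, doc_id) not in seen:
            seen.add((topic_id, doc_id))
            collapsed.append((topic_id, doc_id))
    return collapsed

def pres(ranked_docs, relevant, n_max=100):
    """PRES = 1 - (mean rank of relevant docs - (n+1)/2) / N_max; relevant
    documents missing from the top N_max get worst-case ranks after N_max."""
    found = {doc: i + 1 for i, doc in enumerate(ranked_docs[:n_max])}
    ranks, missing = [], 0
    for doc in relevant:
        if doc in found:
            ranks.append(found[doc])
        else:
            missing += 1
            ranks.append(n_max + missing)
    n = len(relevant)
    return 1.0 - (sum(ranks) / n - (n + 1) / 2.0) / n_max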

At the Passage Level

The main measure reported here will be MAgP, an adaptation of the measure used in the INEX Ad-hoc track, Relevant in Context task (see this article). If time permits we will also compute a 'Best Entry Point' measure, which assesses how well the first retrieved paragraph in a document matches the first paragraph of that document selected by the patent expert.


Evaluation Results

You can now download the evaluation results. The file also shows the evaluation per language (EN/DE/FR).

We have computed three measures at the document level -- PRES@100, RECALL@100, and MAP@100 -- and two measures at the passage level -- MAP(D) and Precision(D).

MAP(D): For each of the relevant documents we compute the AP. To get a score for a topic, we average the AP over all of the topic's relevant documents. The final score is obtained by averaging over all topics.

Precision(D): For each of the relevant documents we compute the set-based precision. The per-topic scores and the final score are computed as for MAP(D).
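
Read literally, these definitions can be sketched as follows (Python; the run/qrels data structures are our own, and AP is computed over a document's ranked passages against its relevant passages):

def average_precision(retrieved_xpaths, relevant_xpaths):
    """AP over the ranked passages retrieved within one relevant document."""
    hits, precision_sum = 0, 0.0
    for i, xp in enumerate(retrieved_xpaths, start=1):
        if xp in relevant_xpaths:
            hits += 1
            precision_sum += hits / i
    return precision_sum / len(relevant_xpaths) if relevant_xpaths else 0.0

def set_precision(retrieved_xpaths, relevant_xpaths):
    """Set-based precision of the passages retrieved within one document."""
    if not retrieved_xpaths:
        return 0.0
    hits = sum(1 for xp in retrieved_xpaths if xp in relevant_xpaths)
    return hits / len(retrieved_xpaths)

def macro_average(run, qrels, scorer):
    """run: {topic: {doc: [ranked xpaths]}}; qrels: {topic: {doc: set of xpaths}}.
    Average scorer per relevant document, then per topic, then over topics."""
    topic_scores = []
    for topic, rel_docs in qrels.items():
        doc_scores = [scorer(run.get(topic, {}).get(doc, []), xpaths)
                      for doc, xpaths in rel_docs.items()]
        topic_scores.append(sum(doc_scores) / len(doc_scores))
    return sum(topic_scores) / len(topic_scores)

# MAP(D) and Precision(D) then differ only in the per-document scorer:
# macro_average(run, qrels, average_precision)  -> MAP(D)
# macro_average(run, qrels, set_precision)      -> Precision(D)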

Before running the evaluation scripts we removed the xpaths referring to headings from the participants' runs.
You can get the evaluation script here and the qrels here. To run an evaluation on your own data, use the following command:

 
         perl PRESeval.CLEF2012.pl -d qrels.file Results.file 100


Training data

We have created a set of 51 training topics, together with their relevance judgements; of these, 18 are in German, 21 in English, and 12 in French. The language of a topic is given by the language of the application document from which the topic claims were extracted. Note that the language of the relevant passages may differ from the language of the topic (e.g. the topic language is German, while the relevant passages are in French).

Download the training data here.
If you downloaded the training data before May 9th, please download it again: some xpaths pointing into documents that do not exist in the released CLEF-IP files had slipped into the training qrels.


Test data

The set of test topics contains 105 topics, 35 in each EPO language. You can download it here.

When looking at the topics in the training set, you will surely have noticed that all XPaths are relative to A-level documents (i.e. application documents). The same is true for the topics in the test set. This is because search reports always refer to application documents as relevant citations. Therefore, all results will come from "A" documents only.
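
Since only "A" documents can be relevant, a run can be filtered accordingly. A minimal sketch, assuming doc_ids carry the kind code as a suffix as in the examples above, and reusing the illustrative run file name from earlier:

# Keep only lines whose doc_id (3rd column) is an application ("A") document,
# e.g. 'WO-2002015251-A1'.
with open("mygroup-run1-PSG.txt") as f:
    a_lines = [line for line in f
               if line.split()[2].rsplit("-", 1)[-1].startswith("A")]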


Contact

For questions, suggestions, and anything else regarding this task, please contact Mihai Lupu (lupu at ifs.tuwien.ac.at) or Florina Piroi (piroi at ifs.tuwien.ac.at).
