The generation process is complete at this point. The meaning of a user dialog has been interpreted as well as possible and values have been chosen. But was the categorisation right, and were good values picked? This can be tested by scrutinising the resulting document, that is, the answer the server returns for the query.
Of course, whether a query counts as successful depends mainly on the actual intent. If the aim is merely to give an impression of how the interactive dialog continues, an answer saying just about anything is sufficient. Even so, the answer should not be an error message or a simple note that no information was found for the query. The properties used here to indicate the success of a query are based on experience and by no means claim to be mathematically sound.
Size - One expressive yet easily obtained property is the size of a document. Assuming that error messages and replies stating that no information was found are very short compared to documents containing a lot of information, larger replies can be taken to be the better ones.
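The size criterion can be illustrated with a minimal sketch. The function name `pick_larger_reply` and the sample reply texts are hypothetical and only serve the illustration; size is measured here as the UTF-8 byte length of the reply body, which is one plausible reading of "size" and not necessarily the measure used for the samples.

```python
def pick_larger_reply(replies):
    """Return the name of the reply whose body has the most bytes.

    `replies` maps a hypothetical query name to the text the server returned.
    On a tie, the reply that was inserted first wins.
    """
    if not replies:
        raise ValueError("no replies to compare")
    # Size is taken as the UTF-8 byte length of the body (an assumption).
    return max(replies, key=lambda name: len(replies[name].encode("utf-8")))


# Made-up reply texts for illustration only.
replies = {
    "query_a": "No entries found.",
    "query_b": "Train 4711 departs 08:15 from platform 3 and arrives 09:42 ...",
}
best = pick_larger_reply(replies)  # "query_b", the longer document
```

Under the hypothesis above, the name returned is the query whose values are assumed to have been chosen better.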
To put this hypothesis to the test, samples were taken. Altogether 40 queries were sent to different servers in 20 pairs, one query of each pair causing an expressive answer, the other provoking an error message or a statement that nothing matching was found. For 17 to 19 of those 20 probes the hypothesis holds: the document containing the expressive answer was indeed bigger than the error message.
The range is due to 2 rather special cases, in which navigation on the corresponding web-sites was realised by a single selection field. A policy that keeps only the page assessed as the most expressive would probably lose an integral part of the web-site. For those cases, every single page should therefore be obtained.
As these special cases can be neglected, it can be concluded that the size of a returned document is indeed a decisive criterion for about 95 percent of the sampled queries (19 of the 20 probes).
Complexity - Another approach is the complexity of a returned web-page. Here, the complexity of a page refers to the number of interactive forms on it and the number of fields they contain. A query can be considered a request for information. A web-page containing information only will have a low level of complexity in this sense. A page with many interaction fields, on the other hand, suggests that the server was not able to extract the appropriate information from the query and in turn poses additional questions in order to obtain the information sought.
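One way to obtain such a complexity measure is to count form and field tags in the returned HTML. The following is a minimal sketch using Python's standard `html.parser`; the names `FormCounter` and `page_complexity` are hypothetical, and the decision to ignore hidden inputs (on the grounds that the user never interacts with them) is an assumption of this sketch, not part of the definition above. For simplicity, fields are counted wherever they occur on the page.

```python
from html.parser import HTMLParser


class FormCounter(HTMLParser):
    """Counts <form> elements and interactive fields on a page."""

    FIELD_TAGS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.forms = 0
        self.fields = 0

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.forms += 1
        elif tag in self.FIELD_TAGS:
            # Assumption: hidden inputs do not count as interaction fields.
            input_type = (dict(attrs).get("type") or "").lower()
            if tag == "input" and input_type == "hidden":
                return
            self.fields += 1


def page_complexity(html):
    """Return (number of forms, number of visible fields) for an HTML string."""
    counter = FormCounter()
    counter.feed(html)
    counter.close()
    return counter.forms, counter.fields


sample = (
    '<form action="/search">'
    '<input type="text" name="q">'
    '<select name="lang"><option>en</option></select>'
    '<input type="hidden" name="session">'
    "</form>"
)
complexity = page_complexity(sample)  # (1, 2)
```

Two replies can then be compared by their tuples, the one with fewer forms and fields being the less complex page in the sense described above.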
This hypothesis could not be verified on the same samples that were drawn to test the former hypothesis on document size. 15 of the 20 probes had exactly the same interaction forms on informative documents as on error messages. For 2 cases the opposite of the hypothesis was true; consequently it holds for only 3 cases. Even though the hypothesis holds for the one sample for which size is not a decisive criterion, combining the two approaches does not appear promising.
Others - There are certainly other criteria for assessing the "correctness" of a query. The most direct solution, of course, is to work on the content of the document and try to understand its meaning. Simple methods such as searching for key phrases ("invalid", "no entries") offer only a limited degree of understanding. At the other extreme, very complex methods can be used. The gain in accuracy of the Referee's assessment needs to be weighed against the loss of performance that is to be expected as the complexity of the method rises.
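The simple key-phrase method mentioned above might look like the following sketch. The function name `looks_like_failure` is hypothetical; the phrase list contains only the two examples given in the text and would need to be extended in practice. The match is a plain case-insensitive substring test, which also shows why such a method understands very little: an informative page that merely mentions the word "invalid" is flagged as well.

```python
# The two example phrases from the text; any real list would be longer.
FAILURE_PHRASES = ("invalid", "no entries")


def looks_like_failure(text, phrases=FAILURE_PHRASES):
    """Return True if the reply contains one of the failure phrases."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in phrases)


flag_error = looks_like_failure("Sorry, no entries match your request.")  # True
flag_ok = looks_like_failure("Train 4711 departs 08:15 from platform 3.")  # False
```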