Every field can be assigned to one of n manually defined categories. Every form is thus represented as a vector of n dimensions, where each dimension corresponds to a specific category. This key-vector has a value of 1.0 in the i-th dimension if the form contains a field of category i.
Since the categories are predefined, some fields cannot be assigned to any of them, either because they are not recognised or because they simply do not fit. A special undefined category is introduced for those fields. Hidden fields that cannot be assigned to a category are also placed in it.
After having transformed the representation of a form into a vector, we can compare two forms by comparing their respective key-vectors.
The undefined category is not part of the key-vector.
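The construction of a key-vector described above can be sketched as follows. This is only an illustration under stated assumptions: the category names in CATEGORIES and the function name key_vector are hypothetical, fields of the undefined category are represented by None, and dimensions whose category does not occur in the form are assumed to hold 0.0, which the text does not state explicitly.

```python
from typing import Iterable, List, Optional

# Hypothetical category list (assumption); the real categories are
# defined manually and are not given in the text.
CATEGORIES = ["username", "password", "email", "search"]


def key_vector(field_categories: Iterable[Optional[str]]) -> List[float]:
    """Build the key-vector of one form.

    field_categories holds one entry per field of the form: a category
    name, or None for a field that falls into the undefined category.
    """
    # The undefined category (None) has no dimension, so it is dropped here.
    present = {c for c in field_categories if c is not None}
    # 1.0 if the form has a field of that category; 0.0 otherwise (assumed).
    return [1.0 if c in present else 0.0 for c in CATEGORIES]


# A form with a username field, a password field and one unassignable field.
login_form = ["username", "password", None]
print(key_vector(login_form))  # [1.0, 1.0, 0.0, 0.0]
```

In this sketch a form with several fields of the same category still yields 1.0 in that dimension, since only the presence of a category is recorded.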
The dot product is used as a measure of distance between two forms: