evaluation_retrieval module

evaluation_retrieval.py: Evaluation of the KNN task using information retrieval measures.

Usage:

python evaluation_retrieval.py [1] [-s [2]] [-ov [3]] [-cfg [4]] [--help]

where:

  • [1] : input similarity matrix (unnormalized similarities or pre-treated MCL format). The script expects an ‘exp_configuration.ini’ file in the same folder, usually generated when using main.py.
  • [2] -s: number of samples to evaluate (the s first samples of the ground-truth). If -1, the whole set is used. Defaults to -1.
  • [3] -ov: If non-negative, assume the input similarity matrix was obtained in OVA (one-versus-all) mode for the sample of index ov. Defaults to -1.
  • [4] -cfg: provide a custom configuration file to replace ‘exp_configuration.ini’.
  • -h, --help: show the help message and exit.
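
For example, assuming a similarity matrix stored as ‘co_occurrences.npy’ (file name hypothetical) with its ‘exp_configuration.ini’ in the same folder, evaluating the first 500 ground-truth samples would look like:

python evaluation_retrieval.py co_occurrences.npy -s 500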

This outputs the results of the neighbour retrieval evaluation on the given matrix.

evaluation_retrieval.evaluate(data_type, co_occ, label_to_index, index_to_label, ground_truth, output_folder, samples=-1, ova=-1, writing=True, idtf=None, suffix=None)

Evaluates the KNN task on the given similarity matrix and ground-truth.

Args:
  • data_type (str): Dataset used in the experiments (for ground-truth parsing).
  • co_occ (ndarray): co-occurrence matrix.
  • label_to_index (dict): Reverse mapping of index_to_label.
  • index_to_label (list): list mapping an index to the corresponding named entity; used to generate a readable clustering.
  • ground_truth (dict): ground truth clustering to compare against.
  • output_folder (str): path to the output folder.
  • samples (int, optional): evaluation restricted to the first samples queries (if -1, all queries are evaluated). Defaults to -1.
  • ova (int, optional): if non-negative, evaluation on the sample of index ova only. Defaults to -1.
  • writing (bool, optional): if True, outputs the resulting measures in a file. Defaults to True.
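
A minimal sketch of a programmatic call, assuming the matrix and mappings were produced elsewhere (e.g. by main.py). All concrete values below are hypothetical, and the exact structure of ground_truth depends on the dataset parser, so the dict shown is purely illustrative:

import numpy as np
import evaluation_retrieval

# Hypothetical inputs: a tiny co-occurrence matrix and its entity labels.
co_occ = np.array([[1.0, 0.8, 0.1],
                   [0.8, 1.0, 0.2],
                   [0.1, 0.2, 1.0]])
index_to_label = ['Paris', 'London', 'Tokyo']
label_to_index = {label: i for i, label in enumerate(index_to_label)}
ground_truth = {'Paris': 'city', 'London': 'city', 'Tokyo': 'city'}  # illustrative only

evaluation_retrieval.evaluate(
    'AQUAINT',        # data_type: dataset name, hypothetical value
    co_occ,           # similarity / co-occurrence matrix
    label_to_index,
    index_to_label,
    ground_truth,
    'output/',        # output_folder
    samples=-1,       # evaluate all queries
    writing=True)     # write the resulting measures to a file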