utils.classification_scripts package

Submodules

utils.classification_scripts.basic_hmm module

basic_hmm.py. Generates a basic initial HMM for HTK training.

utils.classification_scripts.basic_hmm.generate_basic_hmm(features_type, components, name, output_folder, n_state=12, hmm_type=1)

On-the-fly generation of an initial HMM.

Args:

features_type (str): HTK features target kind.
components (int): number of base components.
name (str): HMM/model name.
output_folder (str): folder in which to write the HMM.
n_state (int, optional): number of emitting states in the HMM (not counting initial and final states). Defaults to 12.
hmm_type (int, optional): determines the HMM topology to use (1: basic left/right; 2: left/right1/right2). Defaults to 1.
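
To make the two topologies concrete, here is a minimal sketch of the transition matrices they might correspond to. It is not the module's code: the helper name `left_right_transitions`, the uniform initial probabilities, and the reading of `hmm_type=2` as "self-loop, next state, or skip one state" are all assumptions. The matrix includes the non-emitting entry and exit states that HTK places around the emitting ones.

```python
def left_right_transitions(n_state=12, hmm_type=1):
    """Return a transition matrix for a left-to-right HMM (illustrative only)."""
    total = n_state + 2  # emitting states plus non-emitting entry and exit
    # Assumption: type 1 reaches the next state only, type 2 may also skip one.
    max_jump = 1 if hmm_type == 1 else 2
    matrix = [[0.0] * total for _ in range(total)]
    matrix[0][1] = 1.0  # the entry state always moves to the first emitting state
    for i in range(1, total - 1):
        targets = [j for j in range(i, i + max_jump + 1) if j < total]
        for j in targets:
            matrix[i][j] = 1.0 / len(targets)  # uniform initial guess
    return matrix  # the exit row stays all zeros


# Print the matrix for a small model with 3 emitting states.
for row in left_right_transitions(n_state=3, hmm_type=1):
    print(" ".join(f"{p:.2f}" for p in row))
```

A matrix like this is what an HTK prototype file lists under its `<TransP>` entry; training then re-estimates the non-zero values.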

utils.classification_scripts.classification_CRF module

classify_CRF.py. For training and applying a wapiti CRF.

utils.classification_scripts.classification_CRF.label_CRF(model, test, test_entities_indices, classification_params, verbose=1)

Labels a testing set using a wapiti CRF classifier and returns the resulting entities.

Args:

model: model built from training the classifier.
test: formatted testing set.
test_entities_indices (list): location of the interesting entities in the test dataset.
classification_params (dict): additional classification parameters.
verbose (int, optional): controls verbosity level. Defaults to 1.

Returns:

result_iter: a generator over the resulting entities.
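
Because the return value is a generator, predictions are only produced when the caller iterates over it. The sketch below illustrates that pattern on column-formatted output in which each non-empty line holds one token and the predicted label is assumed to come last. The function name `iter_entity_labels` and the idea that `test_entities_indices` holds positions in the flat token sequence are hypothetical; the module's actual index structure is not documented here.

```python
def iter_entity_labels(labeled_lines, test_entities_indices):
    """Return a generator of (index, label) pairs for the tokens of interest."""
    # Blank lines separate sequences in column-formatted output; skip them.
    labels = [line.split()[-1] for line in labeled_lines if line.strip()]
    return ((i, labels[i]) for i in test_entities_indices)


output = ["Paris NNP LOC", "is VBZ O", "", "Alice NNP PER", "smiled VBD O"]
result_iter = iter_entity_labels(output, [0, 2])
print(list(result_iter))
```

A generator can be consumed only once, so callers that need the results twice should store them in a list first.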

utils.classification_scripts.classification_CRF.train_CRF(n, train, temp_folder, classification_params, verbose=1, debug=False)

Trains a CRF classifier with wapiti and returns the resulting model.

Args:

n (int): step number.
train: annotated training set (structure may depend on the classifier).
temp_folder (str): path to the directory for storing temporary files.
classification_params (dict): additional classification parameters.
verbose (int, optional): controls verbosity level. Defaults to 1.
debug (bool, optional): if False, removes the temporary files that were created. Defaults to False.

Returns:

model: model built by the classifier from the given training set.
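
As a rough illustration of what training involves, wapiti reads column-formatted text: one token per line, whitespace-separated features with the gold label last, and a blank line between sequences. The sketch below serializes a training set in that layout and builds a `wapiti train` command line without running it. The helper names, the file names, the in-memory structure of `train`, and the use of a pattern file are assumptions; the documentation above does not specify them.

```python
import os


def format_wapiti_data(sequences):
    """Format sequences of (feature..., label) rows as wapiti column text."""
    lines = []
    for sequence in sequences:
        for row in sequence:
            lines.append(" ".join(str(col) for col in row))
        lines.append("")  # blank line ends the sequence
    return "\n".join(lines) + "\n"


def wapiti_train_command(n, temp_folder, pattern_file):
    """Return argv for `wapiti train` plus the model path it would write."""
    data_path = os.path.join(temp_folder, f"train_{n}.txt")   # hypothetical name
    model_path = os.path.join(temp_folder, f"model_{n}.crf")  # hypothetical name
    argv = ["wapiti", "train", "-p", pattern_file, data_path, model_path]
    return argv, model_path


train = [[("Paris", "NNP", "LOC"), ("is", "VBZ", "O")]]
print(format_wapiti_data(train), end="")
print(wapiti_train_command(1, "tmp", "patterns.txt")[0])
```

Once wapiti is installed, the returned `argv` could be passed to `subprocess.run`; building the list separately keeps the command easy to log when `verbose` is set.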

utils.classification_scripts.classification_DT module

classify.py. For training and applying a weka decision tree classifier on the artificially annotated data set.

utils.classification_scripts.classification_DT.label_DT(model, test, test_entities_indices, verbose=1)

Labels a testing set using a weka decision tree and returns the resulting entities.

Args:

model: model built from training the classifier.
test: formatted testing set.
test_entities_indices (list): location of the interesting entities in the test dataset.
verbose (int, optional): controls verbosity level. Defaults to 1.

Returns:

result_iter: a generator over the resulting entities.

utils.classification_scripts.classification_DT.train_DT(n, train, temp_folder, classification_params, verbose=1)

Trains a decision tree classifier with weka and returns the resulting model.

Args:

n (int): step number.
train: annotated training set (structure may depend on the classifier).
temp_folder (str): path to the directory for storing temporary files.
classification_params (list): additional classification parameters.
verbose (int, optional): controls verbosity level. Defaults to 1.

Returns:

model: model built by the classifier from the given training set.
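
Weka classifiers typically read ARFF files, so training a decision tree usually starts by serializing the data. The sketch below shows one way to do that and to build a command line for Weka's J48 tree learner without running it. The choice of J48, the helper names, the attribute layout, and the file names are assumptions, since the documentation above does not say which Weka classifier or options `train_DT` uses.

```python
def to_arff(relation, attributes, rows):
    """Serialize rows to ARFF text.

    attributes is a list of (name, spec) pairs where spec is "numeric" or a
    list of nominal values; the last attribute is treated as the class.
    """
    lines = [f"@relation {relation}", ""]
    for name, spec in attributes:
        kind = spec if isinstance(spec, str) else "{" + ",".join(spec) + "}"
        lines.append(f"@attribute {name} {kind}")
    lines += ["", "@data"]
    lines += [",".join(str(value) for value in row) for row in rows]
    return "\n".join(lines) + "\n"


def j48_train_command(arff_path, model_path, extra_options=()):
    """Return argv that trains J48 on arff_path and saves the model."""
    # Assumes weka.jar is already on the Java classpath.
    return ["java", "weka.classifiers.trees.J48",
            "-t", arff_path, "-d", model_path, *extra_options]


attributes = [("length", "numeric"), ("label", ["yes", "no"])]
print(to_arff("entities", attributes, [(3, "yes"), (7, "no")]), end="")
print(j48_train_command("train.arff", "tree.model", ["-C", "0.25"]))
```

Here `extra_options` plays the role that a list-valued `classification_params` might play: extra flags appended unchanged to the classifier's command line.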

utils.classification_scripts.classification_HTK module

classify.py. For training and applying a classifier on the artificially annotated data set.

utils.classification_scripts.classification_HTK.label_HTK(n, hmmdef_file, hmmlist_file, wnet_file, dic_file, test, test_entities_indices, temp_folder, verbose=1, debug=False)

Labels a testing set using an HTK HMM classifier and returns the resulting entities.

Args:

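n (int): step number.
hmmdef_file: HMMs master file (see the hmmdef value returned by train_HTK).
hmmlist_file: list of HMMs in the model (see the hmmlist value returned by train_HTK).
wnet_file: word network file.
dic_file: dictionary file.
test: formatted testing set.
test_entities_indices (list): location of the interesting entities in the test dataset.
temp_folder (str): path to the directory for storing temporary files.
verbose (int, optional): controls verbosity level. Defaults to 1.
debug (bool, optional): if True, some outputs are kept in the temporary directory. Defaults to False.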

Returns:

result_iter: a generator over the resulting entities.
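
The file arguments map naturally onto HTK's `HVite` decoder, which takes HMM definitions, a word network, a dictionary, and an HMM list. The sketch below builds such a command line without executing it. Whether `label_HTK` calls `HVite` this way is an assumption, and the helper name, the script file, and the output label file names are hypothetical.

```python
import os


def hvite_command(n, hmmdef_file, hmmlist_file, wnet_file, dic_file, temp_folder):
    """Return argv for an HVite decoding run plus the output label file path."""
    script_file = os.path.join(temp_folder, f"test_{n}.scp")   # hypothetical name
    output_mlf = os.path.join(temp_folder, f"result_{n}.mlf")  # hypothetical name
    argv = [
        "HVite",
        "-H", hmmdef_file,  # HMM definitions (master file)
        "-S", script_file,  # script file listing the inputs to decode
        "-i", output_mlf,   # write all transcriptions to one master label file
        "-w", wnet_file,    # word network constraining the decoding
        dic_file,           # dictionary
        hmmlist_file,       # list of HMM names
    ]
    return argv, output_mlf


argv, mlf = hvite_command(1, "hmmdefs", "hmmlist", "wnet", "dict", "tmp")
print(" ".join(argv))
```

Keeping the per-step files under `temp_folder` matches the documented role of that argument and makes it easy to keep or delete them depending on `debug`.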

utils.classification_scripts.classification_HTK.train_HTK(n, train, temp_folder, classification_params, verbose=1, debug=False)

Trains an HMM classifier with HTK and returns the resulting model.

Args:

n (int): step number.
train: annotated training set (structure may depend on the classifier).
temp_folder (str): path to the directory for storing temporary files.
classification_params (list): additional classification parameters.
verbose (int, optional): controls verbosity level. Defaults to 1.
debug (bool, optional): if True, some outputs are kept in the temporary directory. Defaults to False.

Returns:

hmmdef: HMMs master file.
hmmlist: list of HMMs in the model.

Module contents
