Contains the functions for data processing.
Preprocessing functions
Preprocess data
- cyanure.data_processing.preprocess(X, centering=False, normalize=True, columns=False)
Preprocess features training data.
Perform in-place centering or normalization, either of columns or rows of the input matrix X.
- Parameters
- X (numpy array or scipy sparse CSR matrix):
Input matrix
- centering (boolean): default=False
Perform a centering operation
- normalize (boolean): default=True
Perform an l2-normalization
- columns (boolean): default=False
Operate on rows (False) or columns (True).
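The effect of the three flags can be illustrated with plain NumPy. The function below is a hypothetical, dense-only sketch and not Cyanure's implementation: `preprocess_sketch` is an invented name, it returns a copy instead of working in place, and it assumes that centering is applied before normalization.

```python
import numpy as np


def preprocess_sketch(X, centering=False, normalize=True, columns=False):
    """Hypothetical dense-only illustration; not Cyanure's implementation."""
    X = np.array(X, dtype=np.float64)  # copy, unlike the in-place original
    axis = 0 if columns else 1         # reduce over rows for columns, and vice versa
    if centering:
        # Subtract the mean of each row (or column).
        X -= X.mean(axis=axis, keepdims=True)
    if normalize:
        # Divide each row (or column) by its l2 norm.
        norms = np.linalg.norm(X, axis=axis, keepdims=True)
        norms[norms == 0] = 1.0        # leave all-zero rows/columns unchanged
        X /= norms
    return X


X = np.array([[3.0, 4.0], [0.0, 2.0]])
rows = preprocess_sketch(X)                # [[0.6, 0.8], [0.0, 1.0]]
cols = preprocess_sketch(X, columns=True)  # every column has unit l2 norm
```

With the default arguments each row is scaled to unit l2 norm; passing `columns=True` applies the same operation to each column.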
Input verification functions
These functions are not necessary for normal use of the library.
- cyanure.data_processing.check_labels(labels, estimator)
Verify the format of labels depending on the type of the estimator.
Can convert labels in some cases.
- Parameters
- labels (numpy array or scipy sparse CSR matrix):
Numpy array containing labels
- estimator (ERM):
The estimator which will be fitted
- Returns
- labels (numpy array or scipy sparse CSR matrix):
Converted labels if required by the estimator.
- label_encoder (sklearn.LabelEncoder):
Convert text labels if needed
- Raises
- ValueError:
Format of the labels does not respect the format supported by Cyanure classifiers.
- ValueError:
Labels have a non-finite value
- ValueError:
Problem has only one class
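Two of the documented failure cases, plus the conversion of text labels, can be sketched as follows. This is an assumption-laden illustration rather than Cyanure's code: `check_labels_sketch` is a hypothetical name, it ignores the estimator argument, and it uses `numpy.unique` as a stand-in for the `sklearn.LabelEncoder` mentioned above.

```python
import numpy as np


def check_labels_sketch(labels):
    """Hypothetical classifier-side label checks; not Cyanure's implementation."""
    labels = np.asarray(labels)
    # Non-finite values can only occur in floating-point labels.
    if np.issubdtype(labels.dtype, np.floating) and not np.all(np.isfinite(labels)):
        raise ValueError("Labels have a non-finite value")
    # Map every distinct label to an integer code (stand-in for LabelEncoder).
    classes, encoded = np.unique(labels, return_inverse=True)
    if classes.size < 2:
        raise ValueError("Problem has only one class")
    return encoded, classes


encoded, classes = check_labels_sketch(["cat", "dog", "cat"])
# encoded holds the integer codes, classes the sorted distinct labels
```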
- cyanure.data_processing.check_input_type(X, labels, estimator)
Verify the format of labels and features depending on the type of the estimator.
Can convert labels in some cases.
- Parameters
- X (numpy array or scipy sparse CSR matrix):
Numpy array containing features
- labels (numpy array or scipy sparse CSR matrix):
Numpy array containing labels
- estimator (ERM):
The estimator which will be fitted
- Returns
- X (numpy array or scipy sparse CSR matrix):
Converted features if required by the estimator.
- labels (numpy array or scipy sparse CSR matrix):
Converted labels if required by the estimator.
- label_encoder (sklearn.LabelEncoder):
Convert text labels if needed
- Raises
- ValueError:
Data are complex
- ValueError:
Data contain a non-finite value
- TypeError:
Sparse features are not in CSR format
- TypeError:
Sparse labels are not in CSR format
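The feature-side checks listed above might look like the sketch below. It is a hypothetical illustration, not Cyanure's implementation: `check_features_sketch` is an invented name, label handling and estimator-specific conversions are left out, and the error messages are paraphrased from the list above.

```python
import numpy as np
import scipy.sparse as sp


def check_features_sketch(X):
    """Hypothetical feature checks; not Cyanure's implementation."""
    if sp.issparse(X):
        if X.format != "csr":
            raise TypeError("Sparse features are not in CSR format")
        values = X.data  # only the explicitly stored entries need checking
    else:
        X = np.asarray(X)
        values = X
    if np.iscomplexobj(values):
        raise ValueError("Data are complex")
    if not np.all(np.isfinite(values)):
        raise ValueError("Data contain a non-finite value")
    return X


dense_ok = check_features_sketch([[1.0, 2.0], [3.0, 4.0]])
sparse_ok = check_features_sketch(sp.csr_matrix(np.eye(2)))
```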
- cyanure.data_processing.check_positive_parameter(parameter, message)
Check that a parameter is a number and is positive.
- Parameters
- parameter (Any):
Parameter to verify
- message (string):
Message of the exception
- Raises
- ValueError:
Parameter is not a number
- ValueError:
Parameter is not positive
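A check of this kind can be sketched in a few lines of standard-library Python. The function below is hypothetical and not Cyanure's implementation. In particular, it assumes that zero is accepted and only strictly negative values are rejected, which the description above does not specify.

```python
from numbers import Real


def check_positive_parameter_sketch(parameter, message):
    """Hypothetical re-creation of the documented behaviour."""
    if not isinstance(parameter, Real):
        # Strings, None, lists and complex numbers all end up here.
        raise ValueError(message)
    if parameter < 0:
        # Assumption: zero is allowed, only negative values are rejected.
        raise ValueError(message)


check_positive_parameter_sketch(0.5, "lambda_1 should be a positive number")
```

Here `lambda_1` in the message is only an illustrative parameter name.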
- cyanure.data_processing.check_parameters(estimator)
Verify that the different parameters of an estimator respect the constraints.
- Parameters
- estimator (ERM):
Estimator to verify
- cyanure.data_processing.check_input_fit(X, labels, estimator)
Check the different input arrays required for training according to the estimator type.
Can convert data if necessary.
- Parameters
- X (numpy array or scipy sparse CSR matrix):
Numpy array containing features
- labels (numpy array or scipy sparse CSR matrix):
Numpy array containing labels
- estimator (ERM):
The estimator which will be fitted
- Returns
- X (numpy array or scipy sparse CSR matrix):
Converted features if required by the estimator.
- labels (numpy array or scipy sparse CSR matrix):
Converted labels if required by the estimator.
- label_encoder (sklearn.LabelEncoder):
Convert text labels if needed
- Raises
- ValueError:
There is only one feature.
- ValueError:
There is no sample.
- ValueError:
An observation has no label.
- ValueError:
Feature array has no feature
- ValueError:
Features and labels do not have the same number of observations.
- ValueError:
There is only one sample.
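Some of the shape-related conditions above can be expressed with NumPy as follows. This is a partial, hypothetical sketch and not Cyanure's implementation: `check_fit_shapes_sketch` is an invented name, it assumes a dense 2-D feature array, and it covers only the "no sample", "no feature" and mismatched-length cases.

```python
import numpy as np


def check_fit_shapes_sketch(X, labels):
    """Hypothetical subset of the training-input checks."""
    X = np.asarray(X)
    labels = np.asarray(labels)
    n_samples, n_features = X.shape  # assumes X is two-dimensional
    if n_samples == 0:
        raise ValueError("There is no sample.")
    if n_features == 0:
        raise ValueError("Feature array has no feature")
    if labels.shape[0] != n_samples:
        raise ValueError(
            "Features and labels do not have the same number of observations."
        )
    return X, labels


X_ok, y_ok = check_fit_shapes_sketch(np.ones((4, 3)), np.array([0, 1, 0, 1]))
```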
- cyanure.data_processing.check_input_inference(X, estimator)
Check the format of the array which will be used for inference. Input array can be converted.
- Parameters
- X (numpy array or scipy sparse CSR matrix):
Array which will be used for inference
- estimator (ERM):
Estimator which will be used
- Returns
- X (numpy array or scipy sparse CSR matrix):
Input array, possibly converted to numpy.float64
- Raises
- ValueError:
One of the values is not finite
- ValueError:
Shape of features is not correct
- ValueError:
Shape of features does not correspond to the estimator's shape
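The inference-time checks can be sketched in the same way. The function below is a hypothetical illustration, not Cyanure's implementation: it is dense-only, and the `n_features` argument is an assumed stand-in for whatever feature count the fitted estimator stores.

```python
import numpy as np


def check_inference_sketch(X, n_features):
    """Hypothetical inference-input checks; n_features stands in for the estimator."""
    X = np.asarray(X, dtype=np.float64)  # convert to float64 if needed
    if X.ndim != 2:
        raise ValueError("Shape of features is not correct")
    if not np.all(np.isfinite(X)):
        raise ValueError("One of the values is not finite")
    if X.shape[1] != n_features:
        raise ValueError(
            "Shape of features does not correspond to the estimator's shape"
        )
    return X


X_checked = check_inference_sketch([[1, 2, 3], [4, 5, 6]], n_features=3)
# The integer input comes back as a float64 array of shape (2, 3).
```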