PyRBP.evaluateClassifiers
These two functions evaluate the quality of the sequence representations obtained in the preceding steps.
- PyRBP.evaluateClassifiers.evaluateDLclassifers(features, labels, file_path='', shuffle=True, folds=5)
PyRBP integrates four classical deep learning models (CNN, RNN, MLP and ResNet), cross-validates each of them on the representation matrix, and stores the final performance metrics obtained for each model in DL_evalution_metrics.csv.
- Parameters:
- features: numpy array, required
Sequence feature matrix for training the four deep learning models.
- labels: numpy array, required
The label of each sequence, indicating whether that sequence is a target sequence of the RBP.
- file_path: str, default=''
Path for storing cross-validation result files.
- shuffle: bool, default=True
Whether to shuffle the samples before dividing them into the subsets used for cross-validation.
- folds: int, default=5
Number of cross-validation folds. The training set is divided into folds subsets (5 by default). In each round, one subset serves as the validation set and the other folds - 1 subsets form the training set. Each subset is used as the validation set exactly once.
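The effect of the shuffle and folds parameters can be illustrated with a minimal sketch of k-fold index splitting. This is not PyRBP's implementation: the helper name k_fold_indices and its seed argument are hypothetical, and plain Python lists stand in for the numpy arrays the function expects.

```python
import random


def k_fold_indices(n_samples, folds=5, shuffle=True, seed=0):
    """Return a list of (train_idx, val_idx) pairs, one per fold.

    Hypothetical helper for illustration only; `seed` is an assumption
    added here to make the shuffle reproducible.
    """
    indices = list(range(n_samples))
    if shuffle:
        # Shuffle once, before the samples are divided into subsets.
        random.Random(seed).shuffle(indices)

    # Subset k takes every `folds`-th index starting at position k.
    parts = [indices[k::folds] for k in range(folds)]

    splits = []
    for k in range(folds):
        val_idx = parts[k]
        # The remaining folds - 1 subsets together form the training set.
        train_idx = [i for j, part in enumerate(parts) if j != k for i in part]
        splits.append((train_idx, val_idx))
    return splits


splits = k_fold_indices(10, folds=5, shuffle=True)
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))
```

Each sample index appears in exactly one validation subset, so every subset serves as the validation set once.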
- PyRBP.evaluateClassifiers.evaluateMLclassifers(features, labels, file_path='', shuffle=True, folds=5)
PyRBP integrates eleven classical machine learning models (Logistic Regression, K-Nearest Neighbor, Decision Tree, GaussianNB, Bagging, Random Forest, AdaBoost, Gradient Boosting, SVM, LDA and Extra Trees), cross-validates each of them on the representation matrix, and stores the final performance metrics obtained for each model in ML_evalution_metrics.csv.
- Parameters:
- features: numpy array, required
Sequence feature matrix for training the machine learning models.
- labels: numpy array, required
The label of each sequence, indicating whether that sequence is a target sequence of the RBP.
- file_path: str, default=''
Path for storing cross-validation result files.
- shuffle: bool, default=True
Whether to shuffle the samples before dividing them into the subsets used for cross-validation.
- folds: int, default=5
Number of cross-validation folds. The training set is divided into folds subsets (5 by default). In each round, one subset serves as the validation set and the other folds - 1 subsets form the training set. Each subset is used as the validation set exactly once.
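The sketch below shows how several models can be cross-validated on the same feature matrix and reduced to one score per model. It is not PyRBP's implementation. The two stand-in "models", the evaluate_models helper, and the choice of accuracy as the only metric are all assumptions made to keep the example self-contained, and the actual metrics written to the CSV file are not reproduced here.

```python
from collections import Counter


def majority_model(train_x, train_y, val_x):
    """Hypothetical stand-in: predict the most common training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return [most_common for _ in val_x]


def always_zero_model(train_x, train_y, val_x):
    """Hypothetical stand-in: predict the negative class for every sample."""
    return [0 for _ in val_x]


def evaluate_models(models, features, labels, folds=5):
    """Return {model name: mean validation accuracy over all folds}."""
    n = len(labels)
    results = {}
    for name, fit_predict in models.items():
        fold_scores = []
        for k in range(folds):
            # Fold k validates on every `folds`-th sample (no shuffling here).
            val_idx = [i for i in range(n) if i % folds == k]
            train_idx = [i for i in range(n) if i % folds != k]
            preds = fit_predict(
                [features[i] for i in train_idx],
                [labels[i] for i in train_idx],
                [features[i] for i in val_idx],
            )
            correct = sum(p == labels[i] for p, i in zip(preds, val_idx))
            fold_scores.append(correct / len(val_idx))
        # One aggregated score per model, averaged across the folds.
        results[name] = sum(fold_scores) / folds
    return results


# Toy data: ten one-dimensional feature vectors with binary labels.
features = [[float(i)] for i in range(10)]
labels = [1, 1, 1, 0, 1, 1, 1, 0, 1, 1]
results = evaluate_models(
    {"majority": majority_model, "always_zero": always_zero_model},
    features,
    labels,
)
print(results)
```

Evaluating every model on identical folds keeps the per-model scores directly comparable.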