Naive Bayes Classification with Python
This project was completed as Project I of my Bachelor’s program in the Winter 2021 semester.
Summary
Naive Bayes (NB) is a supervised machine learning classification model. Its mathematical foundation is Bayes’ theorem, which expresses the probability of an event given observations in terms of conditional probabilities. In this project, two Naive Bayes models were implemented, one for each of two types of explanatory variables: Categorical NB for nominal categorical variables, and Gaussian NB for continuous numeric variables, under the assumption that these variables follow a normal distribution. The project was implemented in Python using the NumPy and SciPy libraries. Experiments were carried out on three datasets: Iris [1], Breast Cancer [2], and Wine [3]. The predictions of the implemented models were consistent with those of the Scikit-Learn Naive Bayes models [4].
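To make the Gaussian NB idea concrete, the following is a minimal sketch rather than the project's actual code. It assumes each feature is normally distributed within each class and predicts the class that maximizes log P(c) + sum over features of log N(x_j | mean, std). The class name GaussianNaiveBayes, the small constant added to the standard deviation, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import norm


class GaussianNaiveBayes:
    """Minimal Gaussian NB sketch (hypothetical class, not the project's code)."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # Log class priors log P(c), estimated from class frequencies.
        self.log_prior_ = np.log(np.array([np.mean(y == c) for c in self.classes_]))
        # Per-class, per-feature mean and standard deviation.
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # A small constant keeps the std strictly positive (assumed safeguard).
        self.std_ = np.array([X[y == c].std(axis=0) for c in self.classes_]) + 1e-9
        return self

    def predict_log_joint(self, X):
        X = np.asarray(X, dtype=float)
        # Broadcasting gives an array of shape (n_samples, n_classes, n_features).
        log_lik = norm.logpdf(X[:, None, :], loc=self.mean_[None], scale=self.std_[None])
        # Naive independence assumption: sum the per-feature log-likelihoods.
        return self.log_prior_ + log_lik.sum(axis=2)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_log_joint(X), axis=1)]


# Synthetic two-class data, generated only to exercise the sketch.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)), rng.normal(5.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

model = GaussianNaiveBayes().fit(X, y)
accuracy = np.mean(model.predict(X) == y)
print(f"training accuracy: {accuracy:.2f}")
```

Working in log space avoids numerical underflow when many per-feature probabilities are multiplied together, which is why the sketch sums log-densities instead of multiplying densities.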
Project Tasks and Knowledge Acquired
Project tasks:
- Implementing a Naive Bayes model with the Python libraries NumPy and SciPy.
- Conducting data preparation, model training, model inference, and model evaluation.
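The four steps in the second task can be sketched end to end for the categorical case. This is an illustrative sketch under stated assumptions, not the project's implementation: the tiny dataset is made up, the train/test split is a fixed slice, and Laplace smoothing with alpha = 1 is an assumption that the project description does not mention.

```python
import numpy as np

# Tiny made-up dataset (illustration only): columns are outlook and wind.
raw_X = np.array([
    ["sunny", "weak"], ["sunny", "strong"], ["rainy", "weak"], ["rainy", "strong"],
    ["overcast", "weak"], ["overcast", "strong"], ["sunny", "weak"], ["rainy", "strong"],
])
raw_y = np.array(["yes", "no", "yes", "no", "yes", "yes", "yes", "no"])

# 1. Data preparation: map each category to an integer code, column by column.
X = np.empty(raw_X.shape, dtype=int)
n_categories = []
for j in range(raw_X.shape[1]):
    categories, codes = np.unique(raw_X[:, j], return_inverse=True)
    X[:, j] = codes
    n_categories.append(len(categories))
classes, y = np.unique(raw_y, return_inverse=True)

# Hold out the last two rows for evaluation (fixed split, so the run is deterministic).
X_train, y_train, X_test, y_test = X[:6], y[:6], X[6:], y[6:]

# 2. Training: log priors and smoothed per-feature category probabilities.
alpha = 1.0  # Laplace smoothing strength (an assumption, not from the project)
log_prior = np.log(np.bincount(y_train, minlength=len(classes)) / len(y_train))
log_likelihood = []  # one (n_classes, n_categories) table per feature
for j, k in enumerate(n_categories):
    counts = np.zeros((len(classes), k))
    np.add.at(counts, (y_train, X_train[:, j]), 1)
    smoothed = (counts + alpha) / (counts.sum(axis=1, keepdims=True) + alpha * k)
    log_likelihood.append(np.log(smoothed))


# 3. Inference: add the log-likelihood of each observed category to the log prior.
def predict(X_new):
    scores = np.tile(log_prior, (len(X_new), 1))
    for j, table in enumerate(log_likelihood):
        scores += table[:, X_new[:, j]].T
    return np.argmax(scores, axis=1)


# 4. Evaluation: accuracy on the held-out rows.
y_pred = predict(X_test)
accuracy = np.mean(y_pred == y_test)
print("predicted:", classes[y_pred], "accuracy:", accuracy)
```

Smoothing keeps a category that never co-occurs with a class in the training rows from receiving zero probability, which would otherwise make the log-likelihood negative infinity for that class.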
Knowledge acquired:
- Understanding of machine learning concepts.
- Understanding of the Naive Bayes algorithm.
- Understanding of machine learning model development.
References
[1] Fisher RA. The use of multiple measurements in taxonomic problems. Annals of Eugenics. 1936 Sep;7(2):179-88.
[2] Wolberg W, Mangasarian O, Street N, Street W. Breast Cancer Wisconsin (Diagnostic) [dataset]. 1993. UCI Machine Learning Repository. Available from: https://doi.org/10.24432/C5DW2B.
[3] Cortez P, Cerdeira A, Almeida F, Matos T, Reis J. Wine Quality [dataset]. 2009. UCI Machine Learning Repository. Available from: https://doi.org/10.24432/C56S3T.
[4] Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research. 2011 Nov 1;12:2825-30.
