The Naive Bayes model uses probabilities to predict an outcome. It is a supervised machine learning technique, i.e. it requires labelled data for training. It is used for classification and is based on Bayes' Theorem. The core assumption of this model is independence among the features, i.e. each feature is unaffected by any other feature.
## Bayes' Theorem
Bayes' theorem is given by:
$$
P(a|b) = \frac{P(b|a)*P(a)}{P(b)}
$$
where:
- $P(a|b)$ is the posterior probability, i.e. probability of 'a' given that 'b' is true,
- $P(b|a)$ is the likelihood probability i.e. probability of 'b' given that 'a' is true,
- $P(a)$ is the prior probability of 'a', and $P(b)$ is the probability of 'b' (the evidence), each considered without reference to the other.
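As a quick numerical illustration, the posterior can be computed directly from the theorem. The probabilities below are assumed purely for the example, not taken from any dataset:

```python
# Illustrative probabilities (assumed values, not from real data)
p_a = 0.3            # P(a): prior probability of 'a'
p_b_given_a = 0.8    # P(b|a): likelihood of 'b' given 'a'
p_b = 0.5            # P(b): probability of 'b'

# Bayes' theorem: P(a|b) = P(b|a) * P(a) / P(b)
p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)   # ≈ 0.48
```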
## Applications
The Naive Bayes classifier has numerous applications, including spam filtering, text and document classification, and sentiment analysis. A classic teaching example is the "play tennis" dataset: given the class priors and the conditional probabilities of each weather feature, the classifier can conclude that tennis can be played when the outlook is overcast and the wind is weak; a small worked example follows.
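The sketch below uses hypothetical probabilities (illustrative values only, not estimated from a real dataset) to show how the two class scores are compared:

```python
# Hypothetical class priors and feature likelihoods (illustrative values only)
p_yes, p_no = 0.64, 0.36                     # P(play = yes), P(play = no)
p_overcast_yes, p_overcast_no = 0.44, 0.10   # P(outlook = overcast | class)
p_weak_yes, p_weak_no = 0.67, 0.40           # P(wind = weak | class)

# Under the independence assumption, each class score is the prior times the
# product of the feature likelihoods; the evidence P(overcast, weak) is the
# same for both classes, so it can be ignored when comparing them.
score_yes = p_yes * p_overcast_yes * p_weak_yes
score_no = p_no * p_overcast_no * p_weak_no
print("play:", score_yes, "don't play:", score_no)  # "play" has the higher score
```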
## Types of Naive Bayes classifier
### Gaussian Naive Bayes
It is used when the dataset has **continuous data**. It assumes that the data is normally distributed (also known as a Gaussian distribution).
A Gaussian distribution can be characterized by a bell-shaped curve.
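Concretely, Gaussian Naive Bayes models the likelihood of each continuous feature $x_i$ given a class $y$ with the normal density, using the mean $\mu_y$ and variance $\sigma_y^2$ of that feature estimated from the training samples of class $y$:
$$
P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\left(-\frac{(x_i - \mu_y)^2}{2\sigma_y^2}\right)
$$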
**Continuous data features :** Features which can take any real value within a certain range. These features have an infinite number of possible values. They are generally measured, not counted.
e.g. weight, height, temperature, etc.
**Code**
```python
# import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn import metrics
from sklearn.metrics import confusion_matrix
# read data (pd.read_csv already returns a DataFrame)
df = pd.read_csv("data.csv")
# columns 1-6 are used as features, column 7 as the label
X = df.iloc[:, 1:7]
y = df.iloc[:, 7]   # 1-D Series, as scikit-learn expects for the target
# splitting X and y into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
# training the model on training set
obj = GaussianNB()
obj.fit(X_train, y_train)
# making predictions on the testing set
y_pred = obj.predict(X_test)
# comparing y_test and y_pred
print("Gaussian Naive Bayes model accuracy:", metrics.accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```
We can conclude that Naive Bayes can be limited in some cases by its assumption that the features are independent of each other, but it remains reliable in many practical settings. Naive Bayes is an efficient classifier and works well even on small datasets.