Cost functions, also known as loss functions, play a crucial role in training machine learning models. They measure how well the model performs on the training data by quantifying the difference between predicted and actual values. Different types of cost functions are used depending on the problem domain and the nature of the data.
## Types of Cost Functions
### 1. Mean Squared Error (MSE)
**Explanation:**
MSE is one of the most commonly used cost functions, particularly in regression problems. It calculates the average squared difference between the predicted and actual values.
**Advantages:**
- Differentiable everywhere, which makes gradient-based optimization straightforward.
**Disadvantages:**
- Squaring the errors makes it sensitive to outliers, which can dominate the loss.
**Python Implementation:**
```python
import numpy as np
def mean_squared_error(y_true, y_pred):
    # Average of the squared differences between actual and predicted values
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)
```
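As a quick sanity check, MSE can also be computed directly with NumPy on illustrative values:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# Errors are 0.5, 0.0, -1.5, -1.0; their squares average to 0.875
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.875
```

Note how the largest error (1.5) contributes 2.25 of the 3.5 total squared error — the squaring is what makes MSE outlier-sensitive.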
### 3. Cross-Entropy Loss (Binary)
**Explanation:**
Cross-entropy loss is commonly used in binary classification problems. It measures the dissimilarity between the true and predicted probability distributions.
**Disadvantages:**
- Undefined for predicted probabilities of exactly 0 or 1, so predictions must be clipped; confidently wrong predictions can dominate the loss.
**Python Implementation:**
```python
import numpy as np
def binary_cross_entropy(y_true, y_pred):
    # Clip predictions away from 0 and 1 to avoid log(0)
    eps = 1e-15
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
```
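A short worked example with illustrative labels and predicted probabilities (the clipping constant `1e-15` is a common convention, not prescribed by the text):

```python
import numpy as np

y_true = np.array([1, 0, 1, 1])
y_pred = np.array([0.9, 0.1, 0.8, 0.6])

# Clip to keep log() finite on degenerate predictions
eps = 1e-15
p = np.clip(y_pred, eps, 1 - eps)
bce = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
print(round(bce, 4))  # ≈ 0.2362
```

The least confident correct prediction (0.6 for a positive label) contributes the largest term, illustrating how the loss rewards well-calibrated confidence.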
### 6. Huber Loss
**Explanation:**
Huber loss combines MSE and MAE: it is quadratic for errors smaller than a threshold parameter delta and linear for larger ones. This makes it less sensitive to outliers than MSE while, unlike MAE, remaining smooth and differentiable at zero.
**Disadvantages:**
- Requires tuning the extra threshold hyperparameter delta.
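A minimal NumPy sketch of this piecewise definition (the default `delta=1.0` is an illustrative choice, not taken from the text):

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    # Quadratic for |error| <= delta, linear beyond it
    error = np.asarray(y_true) - np.asarray(y_pred)
    quadratic = 0.5 * error ** 2
    linear = delta * (np.abs(error) - 0.5 * delta)
    return np.mean(np.where(np.abs(error) <= delta, quadratic, linear))
```

For example, `huber_loss(np.array([1.0, 2.0, 10.0]), np.array([1.5, 2.0, 4.0]))` returns 1.875: the outlier error of 6.0 is charged only linearly (5.5) instead of quadratically (18.0).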
### 7. Log-Cosh Loss
**Explanation:**
Log-cosh loss is a smooth approximation of the MAE and is less sensitive to outliers than MSE. Like Huber loss, it transitions from quadratic behavior for small errors to linear behavior for large errors, but it does so smoothly and without a threshold hyperparameter.
**Disadvantages:**
- Computationally more expensive than simpler losses like MSE.
**Python Implementation:**
```python
import numpy as np
def logcosh_loss(y_true, y_pred):
    # log(cosh(x)) behaves like x**2 / 2 for small x and |x| - log(2) for large x
    error = y_true - y_pred
    return np.mean(np.log(np.cosh(error)))
```
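On a small illustrative batch, the loss can be evaluated directly to see the near-linear treatment of the one large error:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 5.0])

error = y_true - y_pred  # [-0.5, 0.0, -2.0]
loss = np.mean(np.log(np.cosh(error)))
print(round(loss, 4))  # ≈ 0.4817
```

The error of 2.0 contributes about 1.33 (close to |2.0| - log(2) ≈ 1.31), far less than the 2.0 that MSE would charge for the same residual.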
These implementations cover cost functions for a range of machine learning tasks. Each involves trade-offs, so the right choice depends on the problem domain and the characteristics of the data.