Update cost-functions.md

pull/726/head
Ankit Mahato 2024-06-02 04:26:45 +05:30 committed by GitHub
parent c7746086b9
commit f125cf4a33
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
1 changed file with 19 additions and 22 deletions


@@ -14,9 +14,9 @@ MSE is one of the most commonly used cost functions, particularly in regression
 The MSE is defined as:
 $$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
 Where:
-- \( n \) is the number of samples.
-- \( y_i \) is the actual value.
-- \( y^i\) is the predicted value.
+- `n` is the number of samples.
+- $y_i$ is the actual value.
+- $\hat{y}_i$ is the predicted value.
 **Advantages:**
 - Sensitive to large errors due to squaring.
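To make the MSE formula in the hunk above concrete, here is a minimal NumPy sketch (not part of the diff; the function name `mse_loss` and the NumPy dependency are assumptions):

```python
import numpy as np

def mse_loss(y_true, y_pred):
    # Mean of squared differences between actual and predicted values
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# Example: errors of 0.5 on both samples give an MSE of 0.25
print(mse_loss([3.0, 5.0], [2.5, 5.5]))  # 0.25
```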
@@ -43,9 +43,9 @@ MAE is another commonly used cost function for regression tasks. It measures the
 The MAE is defined as:
 $$MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$
 Where:
-- \( n \) is the number of samples.
-- \( y_i \) is the actual value.
-- \( y^i\) is the predicted value.
+- `n` is the number of samples.
+- $y_i$ is the actual value.
+- $\hat{y}_i$ is the predicted value.
 **Advantages:**
 - Less sensitive to outliers compared to MSE.
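A matching sketch for the MAE formula above, under the same assumptions (illustrative NumPy code, not taken from the file):

```python
import numpy as np

def mae_loss(y_true, y_pred):
    # Mean of absolute differences between actual and predicted values
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))

print(mae_loss([3.0, 5.0], [2.5, 5.5]))  # 0.5
```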
@@ -76,9 +76,9 @@ For binary classification, the cross-entropy loss is defined as:
 $$\text{Cross-Entropy} = -\frac{1}{n} \sum_{i=1}^{n} [y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i)]$$
 Where:
-- \( n \) is the number of samples.
-- \( y_i \) is the actual class label (0 or 1).
-- \( y^i\) is the predicted probability of the positive class.
+- `n` is the number of samples.
+- $y_i$ is the actual class label (0 or 1).
+- $\hat{y}_i$ is the predicted probability of the positive class.
 **Advantages:**
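A minimal sketch of the binary cross-entropy formula above (illustrative only; the clipping constant `eps` is an assumption added to avoid log(0)):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Clip predicted probabilities away from 0 and 1 to avoid log(0)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8]))  # ~0.14
```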
@@ -109,11 +109,10 @@ The multiclass cross-entropy loss is defined as:
 $$\text{Cross-Entropy} = -\frac{1}{n} \sum_{i=1}^{n} \sum_{c=1}^{C} y_{i,c} \log(\hat{y}_{i,c})$$
 Where:
-- \( n \) is the number of samples.
-- \( C \) is the number of classes.
-- \( y_{i,c} \) is the indicator function for the true class of sample \( i \).
-- (y^i,c) is the predicted probability of sample \( i \) belonging to class \( c \).
+- `n` is the number of samples.
+- `C` is the number of classes.
+- $y_{i,c}$ is the indicator function for the true class of sample `i`.
+- $\hat{y}_{i,c}$ is the predicted probability of sample `i` belonging to class `c`.
 **Advantages:**
 - Handles multiple classes effectively.
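A sketch of the multiclass version, assuming one-hot encoded labels and a probability matrix of shape (n, C) (illustrative; the function name and `eps` clipping are assumptions):

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    # y_true: one-hot labels, shape (n, C); y_pred: predicted probabilities, shape (n, C)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0)
    # Sum over classes, then average over samples
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

y_true = [[1, 0, 0], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
print(categorical_cross_entropy(y_true, y_pred))  # ~0.29
```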
@@ -143,9 +142,9 @@ For binary classification, the hinge loss is defined as:
 $$\text{Hinge Loss} = \frac{1}{n} \sum_{i=1}^{n} \max(0, 1 - y_i \cdot \hat{y}_i)$$
 Where:
-- \( n \) is the number of samples.
-- \( y_i \) is the actual class label (-1 or 1).
-- \( \hat{y}_i \) is the predicted score for sample \( i \).
+- `n` is the number of samples.
+- $y_i$ is the actual class label (-1 or 1).
+- $\hat{y}_i$ is the predicted score for sample `i`.
 **Advantages:**
 - Encourages margin maximization in SVMs.
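A sketch of the hinge loss as defined above, assuming labels in {-1, +1} and raw decision scores rather than probabilities (illustrative only):

```python
import numpy as np

def hinge_loss(y_true, y_pred):
    # y_true: labels in {-1, +1}; y_pred: raw decision scores (not probabilities)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.maximum(0.0, 1.0 - y_true * y_pred))

print(hinge_loss([1, -1, 1], [0.8, -2.0, -0.5]))  # ~0.57
```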
@@ -182,8 +181,8 @@ $$\text{Huber Loss} = \frac{1}{n} \sum_{i=1}^{n} \left\{
 \right.$$
 Where:
-- \( n \) is the number of samples.
-- \(delta\) is a threshold parameter.
+- `n` is the number of samples.
+- $\delta$ is a threshold parameter.
 **Advantages:**
 - Provides a smooth loss function.
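The piecewise cases of the Huber formula fall outside this hunk's context, so the sketch below assumes the standard definition (quadratic for residuals within $\delta$, linear beyond it); the default `delta=1.0` is also an assumption:

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    # Quadratic for residuals within delta, linear for larger residuals
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    quadratic = 0.5 * residual ** 2
    linear = delta * residual - 0.5 * delta ** 2
    return np.mean(np.where(residual <= delta, quadratic, linear))

print(huber_loss([3.0, 5.0], [2.5, 8.0]))  # small error penalized quadratically, large error linearly
```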
@@ -214,7 +213,7 @@ The Log-Cosh loss is defined as:
 $$\text{Log-Cosh Loss} = \frac{1}{n} \sum_{i=1}^{n} \log(\cosh(y_i - \hat{y}_i))$$
 Where:
-- \( n \) is the number of samples.
+- `n` is the number of samples.
 **Advantages:**
 - Smooth and differentiable everywhere.
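The file's own `logcosh_loss` implementation is referenced in the final hunk below, but its body is not visible in this excerpt; a minimal sketch consistent with that signature might look as follows (the numerically stable rewrite of log(cosh(x)) is an assumption, not the author's code):

```python
import numpy as np

def logcosh_loss(y_true, y_pred):
    # log(cosh(x)) rewritten as |x| + log1p(exp(-2|x|)) - log(2) to avoid overflow for large x
    residual = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    x = np.abs(residual)
    return np.mean(x + np.log1p(np.exp(-2.0 * x)) - np.log(2.0))

print(logcosh_loss([3.0, 5.0], [2.5, 5.5]))  # ~0.12
```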
@@ -234,5 +233,3 @@ def logcosh_loss(y_true, y_pred):
 ```
 These implementations provide various options for cost functions suitable for different machine learning tasks. Each function has its advantages and disadvantages, making them suitable for different scenarios and problem domains.
----