
Anomaly Detection

Note

When using anomaly detection (AD) as a secondary model to filter data for the primary model, you need not use the same input features for both models.

Let

  • \(\mathcal{X}, y \in \mathcal{D}\) be all the data you have
  • \(\mathcal{X}_a\) be the features used for the primary model
  • \(\mathcal{X}_b\) be the features used for anomaly detection

Then, all of the following are perfectly reasonable:

  • \(\vert \mathcal{X}_a \vert = \vert \mathcal{X}_b \vert\)
  • \(\vert \mathcal{X}_a \vert > \vert \mathcal{X}_b \vert\)
  • \(\vert \mathcal{X}_a \vert < \vert \mathcal{X}_b \vert\)

Density Estimation


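Density estimation for anomaly detection can be sketched as follows: fit a Gaussian to each feature using only non-anomalous training data, compute the joint density \(p(x)\) as the product of the per-feature densities, and flag samples with \(p(x) < \epsilon\). A minimal sketch, assuming independent per-feature Gaussians; the data and the threshold \(\epsilon\) are illustrative:

```python
import numpy as np

def fit_gaussian(X):
    """Estimate per-feature mean and variance from non-anomalous training data."""
    return X.mean(axis=0), X.var(axis=0)

def density(X, mu, var):
    """p(x) = product over features j of N(x_j; mu_j, var_j)."""
    p = np.exp(-((X - mu) ** 2) / (2 * var)) / np.sqrt(2 * np.pi * var)
    return p.prod(axis=1)

rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(500, 2))  # non-anomalous samples
mu, var = fit_gaussian(X_train)

x_normal = np.array([[0.1, -0.2]])
x_outlier = np.array([[6.0, 6.0]])
eps = 1e-4  # threshold; in practice tuned on a labelled validation set

print(density(x_normal, mu, var) > eps)   # in-distribution sample passes
print(density(x_outlier, mu, var) > eps)  # far-out sample is flagged
```

In practice \(\epsilon\) is chosen by sweeping values on the validation set and picking the one that maximizes a metric such as \(F_1\), since anomalies are rare and accuracy is misleading.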

| Procedure  | Methodology                                                 |
|------------|-------------------------------------------------------------|
| Training   | Only non-anomalous samples                                  |
| Validation | Verify with known values, then validate, then update model  |
| Testing    | Verify with known values, then test                         |

Anomaly Detection vs Classification

|                                            | Anomaly Detection                | Classification                        |
|--------------------------------------------|----------------------------------|---------------------------------------|
| Anomalous training samples requirement     | None (only required for tuning)  | Large                                 |
| Non-anomalous training samples requirement | Large                            | Large                                 |
| Can handle novelties                       | ✅                               | ❌                                    |
| Example                                    | Unseen defects, fraud            | Known defects (scratches), spam mail  |

Feature Engineering

Include features that take on very small or very large values for anomalies.

If no single feature has such values for anomalies, try to find a combination of features, such as \(x_1 \cdot x_2\) or \(x_1 / x_2\), that does.
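A toy sketch of such a combination (the feature names and values are made up): each feature looks unremarkable on its own, but their ratio exposes the anomalous sample.

```python
import numpy as np

# Hypothetical features: individually, x1 and x2 stay in a normal range,
# but their ratio takes an extreme value for the anomalous last sample.
x1 = np.array([0.5, 0.6, 0.55, 0.9])
x2 = np.array([0.5, 0.6, 0.55, 0.1])

combined = x1 / x2  # engineered feature x1 / x2
print(combined)     # last sample stands out with a much larger value
```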

Dealing with Non-Gaussian Features

Apply the same transformation to the training, validation, and test sets.


If a feature \(x\) contains zero values, \(\log(x)\) fails because \(\log(0)\) is undefined. Instead, use \(\log(x+c)\), where \(c>0\).

Last Updated: 2024-05-14 ; Contributors: AhmedThahir
