Anomaly detection is a technique for spotting abnormal behavior in datasets: it identifies rare observations that stand out from the normal ones. Detecting fraud in credit card transactions, spotting anomalous behavior to thwart cyber-attacks, detecting faults in mechanical components, and identifying fraudulent healthcare claims are some of the areas where anomaly detection has become significant.

Why use anomaly detection?

Consider a fraud detection program at a financial institution. Detecting fraudulent transactions hinges on the ability to spot patterns in data that point to suspicious behavior. With data flowing in fast, changing constantly, and transaction volumes ever increasing, such patterns have to be identified quickly if fraud is to be thwarted. Anomaly detection driven by machine learning can help analyze huge data loads and enable real-time fraud detection as the activity is recorded.

Which anomaly detection methods to use?

The right method depends on which of the following types of anomaly is being looked for.

Contextual anomaly – A data point is anomalous only in a specific context. For instance, power consumption by residents of a hot city peaks in May compared with September, and that is a normal pattern. But when consumption in September rises well above the level of September the previous year, that is a contextual anomaly.

Contextual anomalies are most relevant to time series data. Sales of a shampoo brand dropping drastically between the 1st and 30th of September, compared with the same period the previous year, is one example.
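The year-over-year comparison described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `contextual_anomalies`, the 25% tolerance, and the consumption figures are all hypothetical and not taken from any real dataset.

```python
def contextual_anomalies(previous_year, current_year, tolerance=0.25):
    """Return periods whose value changed by more than `tolerance`
    (as a fraction) relative to the same period of the previous year."""
    flagged = {}
    for period, baseline in previous_year.items():
        current = current_year.get(period)
        if current is None or baseline == 0:
            continue  # nothing to compare against
        change = (current - baseline) / baseline
        if abs(change) > tolerance:
            flagged[period] = round(change, 2)
    return flagged


# Hypothetical monthly power consumption (arbitrary units) for a hot city.
previous_year = {"May": 950, "June": 980, "September": 600}
current_year = {"May": 990, "June": 1010, "September": 870}

print(contextual_anomalies(previous_year, current_year))
```

In this made-up data, the September reading of 870 is lower than the May reading of 950, so a single global threshold would not flag it. It stands out only when compared with its own context, which is September of the previous year.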

Collective anomaly – In a collective anomaly, no single data instance points to an anomaly. Rather, a group of related data points together reveals one. Inventory levels of wool coats, waterproof boots, gloves, and trench coats over a given period may show no marked deviation from normal when taken individually, but when combined they constitute a single anomaly that reveals a very high inventory holding cost.
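A rough sketch of the inventory example follows, assuming a simple z-score test with a threshold of 2. The helper names, the threshold, and all inventory figures are hypothetical. The same test is applied to each item on its own and then to the combined inventory.

```python
from statistics import mean, stdev


def z_score(history, value):
    """How many sample standard deviations `value` lies from the mean of `history`."""
    return (value - mean(history)) / stdev(history)


def collective_check(history, current, threshold=2.0):
    """Return (per-item flags, combined flag) for the current inventory levels."""
    individual = {
        item: abs(z_score(levels, current[item])) > threshold
        for item, levels in history.items()
    }
    # Total inventory per past week, summed across all items.
    totals = [sum(week) for week in zip(*history.values())]
    combined = abs(z_score(totals, sum(current.values()))) > threshold
    return individual, combined


# Hypothetical weekly inventory levels (units in stock) over six weeks.
history = {
    "wool_coat": [100, 120, 100, 120, 100, 120],
    "waterproof_boot": [120, 100, 120, 100, 120, 100],
    "gloves": [100, 120, 100, 120, 100, 120],
    "trench_coat": [120, 100, 124, 96, 120, 104],
}
current = {"wool_coat": 125, "waterproof_boot": 125, "gloves": 125, "trench_coat": 128}

individual, combined = collective_check(history, current)
print(individual)  # per-item flags
print(combined)    # flag for the combined inventory
```

In this made-up history the items tend to offset one another, so the weekly total barely moves. When all four items run slightly high at once, none is flagged on its own, yet the combined total lies far outside its usual range.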

Point anomaly – A customer makes online purchases on weekends of no more than $1,500. When a weekend purchase goes beyond $10,000, it is a point anomaly and points in turn to possible fraud. Among this customer's weekend online purchases, this single transaction (point) is the anomaly.
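One simple way to flag such a transaction is a modified z-score based on the median and the median absolute deviation (MAD), which is less distorted by extreme values than a mean-based score. The sketch below is only illustrative: the purchase history is hypothetical, and the 3.5 cutoff is a commonly used rule of thumb rather than a fixed rule.

```python
from statistics import median


def is_point_anomaly(history, amount, threshold=3.5):
    """Flag `amount` if its modified z-score against `history` exceeds `threshold`."""
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        # All historical values are (nearly) identical: any different value stands out.
        return amount != med
    modified_z = 0.6745 * (amount - med) / mad
    return abs(modified_z) > threshold


# Hypothetical weekend purchase amounts (in dollars) for one customer.
weekend_purchases = [420, 980, 1500, 760, 1130, 640, 1290, 870]

print(is_point_anomaly(weekend_purchases, 10000))  # the $10,000 purchase
print(is_point_anomaly(weekend_purchases, 1200))   # a typical purchase
```

With this history, the $10,000 purchase is flagged while the $1,200 purchase is not.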

Supervised anomaly detection vs unsupervised anomaly detection

Supervised anomaly detection classifies data points as anomalous or non-anomalous, which requires labeled examples of anomalies. It is a viable option only when there is enough anomalous data. Challenges arise because anomalous events are usually very few and because novel forms of anomaly keep appearing. Since future anomalies may differ in type from past ones, predicting them from past data alone is a major concern.

Anomaly detection using unsupervised learning, on the other hand, can also detect anomalies that have not been seen in the past. It works on unlabeled data: the model is fitted to the data, which is mostly normal, and points that do not fit the model are identified as anomalous.

In credit card fraud detection, supervised anomaly detection is relevant when fraud follows patterns already seen in past transactions. Because fraudsters keep adopting new approaches, unsupervised anomaly detection is a fitting way to unearth fraudulent activity of a kind that has never occurred before.
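The contrast can be illustrated with a toy sketch. It assumes a 1-nearest-neighbour classifier as the supervised model and a k-th-nearest-neighbour distance score as the unsupervised one. Both are stand-ins chosen for brevity, not a recommendation for production use. The transactions, the two features (amount in hundreds of dollars, purchases in the same hour), and the "5 × median" threshold are all hypothetical, and the features are left unscaled for simplicity.

```python
from math import dist
from statistics import median


def nearest_label(labeled, point):
    """Supervised: copy the label of the closest labeled example (1-NN)."""
    _, label = min(labeled, key=lambda pair: dist(pair[0], point))
    return label


def kth_neighbor_distance(points, point, k=2):
    """Unsupervised score: distance from `point` to its k-th nearest neighbour."""
    return sorted(dist(p, point) for p in points)[k - 1]


def fit_threshold(points, k=2, factor=5.0):
    """Hypothetical heuristic: `factor` times the median score of the training points."""
    scores = []
    for i, p in enumerate(points):
        others = points[:i] + points[i + 1:]  # leave the point itself out
        scores.append(kth_neighbor_distance(others, p, k))
    return factor * median(scores)


# (amount in hundreds of dollars, purchases in the same hour), label 1 = known fraud
labeled = [
    ((1.2, 1), 0), ((0.8, 1), 0), ((2.5, 2), 0), ((1.9, 1), 0),
    ((3.1, 2), 0), ((0.5, 1), 0), ((2.2, 1), 0), ((1.5, 2), 0),
    ((95.0, 1), 1), ((120.0, 1), 1),
]
points = [features for features, _ in labeled]  # labels discarded for the unsupervised model
threshold = fit_threshold(points)

known_style_fraud = (110.0, 1)  # resembles past fraud: one very large purchase
novel_fraud = (0.3, 25)         # new pattern: many tiny purchases within an hour

print(nearest_label(labeled, known_style_fraud))
print(nearest_label(labeled, novel_fraud))
print(kth_neighbor_distance(points, novel_fraud) > threshold)
```

In this toy data the supervised model recognizes the large purchase because it resembles labeled fraud, but assigns the novel pattern the "normal" label of its nearest neighbour. The unsupervised score flags the novel pattern simply because it lies far from everything seen before.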

Anomaly detection helps catch potential problems before they erupt into big ones. By extracting patterns of interest and patterns that deviate from usual behavior in the data, it alerts organizations to suspicious activity and uncovers unseen opportunities to boost profits, cut losses, and prevent issues from having a major impact on the business.