
DeepLTK Tutorial #3.6: Visual Anomaly Detection


Overview

This blog post provides a comprehensive overview of image anomaly detection with the DeepLTK toolkit.

Here we present an implementation of an anomaly detection method that offers several advantages:

  1. Unsupervised learning: No dataset annotations are required.

  2. Only “good” samples needed for training: Anomalous samples are unnecessary, which is especially beneficial since collecting such data can be challenging.

  3. Small dataset requirement: This method can achieve near-perfect accuracy with as few as 60 training samples.

  4. High speed: Training typically completes in under a minute, with inference times around 10 ms per image.

By covering the key stages of training, evaluation, and testing, this example can serve as a reference for those working in the field of anomaly detection.


Anomaly detection involves identifying anomalous instances and localizing defects in samples that deviate from expected patterns. In image-based anomaly detection, the goal is to detect unusual visual patterns that differ from the normal image samples, such as defects in manufacturing or anomalies in medical scans. DeepLTK helps solve the task of visual anomaly detection.

There are various techniques for solving image anomaly detection tasks, including Autoencoders, Feature Extraction Methods, Isolation Forests, One-Class SVMs, Gaussian Mixture Models (GMMs), Generative Adversarial Networks (GANs), and Deep Belief Networks (DBNs). Here we describe how to implement PatchCore, one of the most successful feature-extraction-based methods, with the help of DeepLTK.


PatchCore Method


PatchCore is a state-of-the-art method for image anomaly detection. One of its key advantages is that it requires only normal images for training, making it especially suitable for scenarios where obtaining abnormal samples is challenging and costly, which is the case in most anomaly detection projects.

PatchCore operates by extracting deep feature representations (embeddings) from image patches using a pre-trained convolutional neural network, e.g. Wide ResNet. These patch features from normal (anomaly-free) images are stored in a memory bank. To make the method efficient, PatchCore employs a coreset subsampling technique to reduce the size of the memory bank while retaining essential information. Additionally, a random projection technique is used to reduce the dimensionality of the features, which further shrinks the memory bank and improves execution speed during the training and inference stages.
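To make coreset subsampling concrete, below is a minimal NumPy sketch of the greedy (farthest-point) selection idea; it is an illustration only, not the Add-On’s actual implementation. The returned trace of maximum distances corresponds to the kind of information visualized later in the training VI’s Coreset_Max_Distances graph.

```python
import numpy as np

def coreset_subsample(features, sampling_ratio=0.1, start_idx=-1, seed=0):
    """Greedy farthest-point (k-center) coreset selection.

    features: (N, D) array of patch embeddings from normal images.
    Returns the indices of the selected coreset and the trace of maximum
    distances as the coreset grows.
    """
    rng = np.random.default_rng(seed)
    n_select = max(1, int(len(features) * sampling_ratio))
    idx = int(rng.integers(len(features))) if start_idx < 0 else start_idx

    selected = [idx]
    # Distance of every feature to its nearest already-selected center.
    min_dist = np.linalg.norm(features - features[idx], axis=1)
    max_dist_trace = []
    for _ in range(n_select - 1):
        idx = int(np.argmax(min_dist))  # farthest point from the current coreset
        max_dist_trace.append(float(min_dist[idx]))
        selected.append(idx)
        min_dist = np.minimum(min_dist, np.linalg.norm(features - features[idx], axis=1))
    return np.array(selected), max_dist_trace

# Toy usage: 1000 random 128-D "patch features", keep 10% in the memory bank.
feats = np.random.randn(1000, 128).astype(np.float32)
coreset_idx, trace = coreset_subsample(feats, sampling_ratio=0.1)
memory_bank = feats[coreset_idx]
```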

During inference, PatchCore compares patch features from a test image to their nearest neighbors in the memory bank. Anomalies are detected and localized based on a score, which is effectively the Euclidean distance between each patch feature and its closest counterpart (nearest neighbor) in the memory bank: the larger the distance, the more likely a patch is anomalous.
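For illustration, here is a hedged NumPy sketch of this nearest-neighbor scoring step, with the image-level score simplified to the maximum patch score (the Nearest_Neighbours = 1 case; the PatchCore paper additionally reweights the image score):

```python
import numpy as np

def anomaly_scores(test_patches, memory_bank):
    """Score test patches by the Euclidean distance to their nearest
    neighbor in the memory bank; larger distance means more anomalous."""
    # Pairwise distances, shape (num_test_patches, memory_bank_size).
    d = np.linalg.norm(test_patches[:, None, :] - memory_bank[None, :, :], axis=-1)
    patch_scores = d.min(axis=1)      # per-patch (pixel-level) scores
    image_score = patch_scores.max()  # image-level score: the worst patch
    return patch_scores, image_score
```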


PatchCore Method Overview. Source: PatchCore paper

DeepLTK-Based Visual Anomaly Detection

Below we describe the DeepLTK-based implementation of the PatchCore method for anomaly detection.

PatchCore Reference Project

The PatchCore method is implemented as an Add-On library for DeepLTK and is provided as part of a reference project. The project contains all the necessary files and resources to train and deploy a custom PatchCore-based anomaly detection model.


PatchCore Reference Project

A high-level description of the project’s content is provided below:

  • “Models” folder contains the feature extractor and a pretrained example of a PatchCore model.

  • “PatchCore_AddOn” folder contains the DeepLTK-based PatchCore library.

  • “0_PatchCore_Train.vi” - trains a PatchCore model based on normal images by extracting features and building a memory bank.

  • “1_PatchCore_Eval.vi” - evaluates the performance of the trained model on a dataset of normal and anomalous images, generating image-level and pixel-level metrics and statistics. This VI also suggests optimal detection thresholds based on the calculated statistics.

  • “2_PatchCore_Inference(Image).vi” - demonstrates how to use the API for implementing inference for a single image and visualizes the results of anomaly detection.

  • “3_PatchCore_Report(Dataset).vi” - similar to the inference example, but iterates through the images in a dataset, predicts and visualizes the results, and stores the results as images in a separate folder.


Dataset Preparation

To begin with image anomaly detection, a dataset consisting of train and test sets is required. The training set should contain only images of normal samples, while the test set should contain both normal and anomalous images with corresponding ground truth masks.

The example project presented here is based on the MVTec AD dataset, a reference dataset for benchmarking anomaly detection methods, and the dataset reader VI in the project is specifically designed for the structure of this dataset. When customizing this project for a specific dataset, you will need to either adapt your dataset to the structure of the MVTec dataset, as shown in the snapshot below, or modify the dataset reader accordingly. The MVTec AD dataset can be downloaded from the following link.


MVTec AD dataset structure
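For reference, a single category in the MVTec AD dataset is laid out roughly as follows (shown for the “bottle” class; the defect folder names vary per class):

```
bottle/
├── train/
│   └── good/            # normal images only (used for training)
├── test/
│   ├── good/            # normal test images
│   └── broken_large/    # anomalous test images, one folder per defect type
└── ground_truth/
    └── broken_large/    # binary masks for the corresponding defective test images
```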

Training

The training process is implemented in “0_PatchCore_Train.vi”.


DeepLTK Based Anomaly Detection Project’s Training VI

Front Panel of the Training VI

The Front Panel of “0_PatchCore_Train.vi” is shown below.

Training VI Front Panel

The NNFE_Paths cluster takes the paths to the feature extractor’s (model’s) configuration and binary files. The reference project contains a pretrained feature extractor model (a customized Wide ResNet 50), which is located in the “Models/NN_Feauture_Extractors/Patchcore_WRN_50_2_L2L3” folder of the project.


Wide ResNet 50 feature extractor model location in the project

The path to the training dataset's folder should be specified in the Dataset_Path(Train) control.

The destination folder for saving the training outcomes is specified in the PCore_Dest_Path(Folder) control. After the training, the paths of the generated files will be displayed in the Generated_File_Paths indicator.

Under the specified controls there are three high-level configuration clusters: NNFE_Custom_CFG, PatchCore_Init_CFG, and Coreset_Params.

The NNFE_Custom_CFG cluster configures the NN-based feature extractor, specifically the input dimensions, the mini-batch size, and the device on which the computations will be performed (CPU or GPU).

The PatchCore_Init_CFG specifies the backbone network architecture used for feature extraction, currently supporting only WRN_50_2_L2L3 (Wide ResNet 50 trimmed to layer 3). The Out_Layer_Names and Patch_Size parameters in PatchCore_Init_CFG are disabled and grayed out since the WRN_50_2_L2L3 backbone already selects the second and third layers with a Patch_Size of 3.

The Coreset_Params cluster defines the parameters for coreset subsampling. The Sampling_Ratio parameter specifies the portion of features to be retained in the memory bank after subsampling and must fall between 0 and 1. The Start_idx parameter specifies the starting index for the iterative coreset selection; if set to -1, a random index will be chosen. This parameter is provided for reproducibility purposes. Random_Proj? specifies whether random projection should be applied to reduce the dimensionality of the features. The target dimensionality after random projection can be specified with either the Proj._Dim. or the Epsilon parameter. If Proj._Dim. has a positive value, the specified dimensionality will be used for the projection. Otherwise (if Proj._Dim. is negative), the target projection dimensionality will be derived from the Epsilon parameter using the Johnson-Lindenstrauss lemma.
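For intuition, the Johnson-Lindenstrauss bound and a Gaussian random projection can be sketched in a few lines of NumPy (a conceptual illustration, not the Add-On’s internal code):

```python
import numpy as np

def jl_min_dim(n_samples, eps):
    """Johnson-Lindenstrauss bound: smallest target dimensionality that
    preserves pairwise distances of n_samples points within (1 +/- eps)."""
    return int(np.ceil(4 * np.log(n_samples) / (eps**2 / 2 - eps**3 / 3)))

def random_project(features, out_dim, seed=0):
    """Gaussian random projection of (N, D) features down to (N, out_dim)."""
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((features.shape[1], out_dim)) / np.sqrt(out_dim)
    return features @ proj

# E.g. with eps = 0.9 (the value used in the Performance section below) and
# 100,000 patch features, the bound gives a target dimensionality of 285.
print(jl_min_dim(100_000, eps=0.9))   # -> 285
```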

PCore_Device specifies the device for PatchCore computations, either “CPU” or “GPU (CuLab)”. The GPU-based implementation of PatchCore relies on CuLab (GPU Toolkit for LabVIEW), which is an optional dependency of this example project. To learn how to set up the LabVIEW project to utilize GPUs, please refer to “Appendix: Accelerating Computations on GPUs with CuLab”.

Alongside the configuration parameters, the Coreset_Max_Distances graph on the right displays how the maximum distance between features changes with the coreset (memory bank) size. This graph provides qualitative guidance for choosing an appropriate coreset size at the training stage.

Block Diagram of the Training VI

The training process involves the following steps: dataset reading, PatchCore initialization, embedding generation, coreset subsampling, and saving of the training results. The block diagram of the training VI is shown below.


Training VI Block Diagram

The bottom part of the code, highlighted in the snapshot below, is responsible for dataset reading.


Dataset Preprocessing Process

Besides reading the images, the dataset reading incorporates a number of preprocessing stages: image resampling with the specified Interpolation Type, and image normalization according to the selected Normalization Type. Normalization ensures that input images have consistent scales, enabling more accurate feature extraction, which in turn enhances the model’s ability to detect anomalies. Currently there are two normalization options: Standardize(DS_Stats), which uses the precomputed mean and variance of the entire dataset for standardization, and Standardize(Image_Stats), which standardizes each image independently based on its own mean and variance. Experimental results have generally shown that Standardize(DS_Stats) yields more accurate results.

PreProcessing Options: Normalization Types
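Conceptually, the two normalization options behave as in the following NumPy sketch (illustrative only; the toolkit’s exact per-channel handling may differ):

```python
import numpy as np

def standardize_ds_stats(img, ds_mean, ds_std):
    """Standardize(DS_Stats): use the mean/std precomputed over the
    entire training dataset, so all images share one scale."""
    return (img - ds_mean) / ds_std

def standardize_image_stats(img):
    """Standardize(Image_Stats): each image is standardized with its
    own statistics, independently of the rest of the dataset."""
    return (img - img.mean()) / (img.std() + 1e-8)
```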

The main training process begins with “NNPC_Init(Train).vi”, which configures and initializes a PatchCore instance. This VI initializes the backbone neural network responsible for feature extraction and incorporates all the configuration information. Following this, “NNPC_Generate_Embedding.vi” is employed to generate embeddings by extracting features from the dataset. The coreset is then created using “NNPC_Patchcore_Coreset_Subsampler.vi”. Upon completion of the training, the subsampled coreset (serving as a memory bank), along with the updated backbone network and PatchCore configuration parameters, is saved to the designated folder using “NNPC_Save.vi”. Finally, “NNPC_Destroy.vi” is called to release all resources allocated during the training process.

The files generated at this stage will be used in the model’s evaluation and deployment stages.


Evaluation

After the training, the next step is to evaluate the model’s accuracy and determine the optimal thresholds for identifying anomalies, which is implemented in “1_PatchCore_Eval.vi”.


DeepLTK Based Anomaly Detection Project’s Evaluation VI

During inference, embeddings from test images are compared against the features stored in the memory bank (generated from normal images). However, even embeddings from new good samples will not match the memory bank exactly. To effectively distinguish between normal and anomalous samples, a thresholding approach is necessary: the calculated scores are compared to a predefined threshold, and depending on whether a score is higher or lower than this threshold, the sample is labeled as anomalous or normal.

Optimal image-wise and pixel-wise thresholds are determined during the evaluation phase. The evaluation process includes testing on both normal and anomalous images, displaying their score distributions, and calculating key image-wise and pixel-wise performance metrics, such as the maximum F1 and IoU scores, the thresholds at which these maxima are achieved, and AUROC values.
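As a rough illustration of how such metrics can be derived from raw anomaly scores (using scikit-learn here; this is not the Add-On’s code), consider:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, roc_auc_score

def eval_image_scores(scores_good, scores_bad):
    """Derive the F1-maximizing threshold and AUROC from image-level
    anomaly scores of good (label 0) and bad (label 1) samples."""
    y_true = np.concatenate([np.zeros(len(scores_good)), np.ones(len(scores_bad))])
    y_score = np.concatenate([scores_good, scores_bad])

    precision, recall, thresholds = precision_recall_curve(y_true, y_score)
    f1 = 2 * precision * recall / np.clip(precision + recall, 1e-12, None)
    best = int(np.argmax(f1[:-1]))   # the last PR point has no threshold
    return {
        "F1_Max": float(f1[best]),
        "Threshold(F1_Max)": float(thresholds[best]),
        "AUROC": float(roc_auc_score(y_true, y_score)),
    }
```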


Front Panel of the Evaluation VI

The Front Panel of 1_PatchCore_Eval.vi is shown below.

Evaluation VI Front Panel

The files generated during the training stage (the feature extractor network’s configuration and weights files, the PatchCore configuration file, and the generated MemBank) are provided in the NNFE_Paths and PatchCore_Paths controls, respectively. The Validation_Dataset_Folder_Path(Good,_Bad) and Ground_Truth_Folder_Path controls specify the test dataset’s images and ground truth masks, respectively. The destination for saving the evaluation results is specified in the optional Eval_Stats_Dest_Folder control. If this path is left empty, the results will be saved next to the PatchCore configuration file.

The NNFE_Custom_CFG cluster, as in the training stage, specifies custom configuration parameters for the feature extractor. At the evaluation stage it is recommended to leave Input3D_Custom_Dims equal to {-1,-1,-1}, so that the same input image dimensions (resolution) as in the training are used.

The PCore_Device control specifies the device for PatchCore computations, either CPU or GPU (CuLab). To learn how to set up the LabVIEW project to utilize GPUs for PatchCore computations, please refer to “Appendix: Accelerating Computations on GPUs with CuLab”.

The Feature_Scorer_Params cluster defines the parameters for the image-wise and pixel-wise anomaly score calculations. Nearest_Neighbours is a smoothing parameter for the image-wise score calculation; only a value of 1 is supported in this version of the PatchCore Add-On library. If Calc._Anomaly_Map is set to TRUE, the anomaly map will be generated and provided at the output. The parameters in the Gaussian_Blur cluster define the Gaussian-kernel-based smoothing settings, which are applied after the anomaly map has been upsampled to the original input resolution.
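The anomaly map generation can be pictured with the following SciPy sketch (a hedged illustration; grid_hw, the patch-score grid size, is an assumed helper parameter, not a control of the Add-On):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def anomaly_map(patch_scores, grid_hw, image_hw, sigma=4.0):
    """Reshape per-patch scores to the feature grid, upsample to the
    input resolution, then apply Gaussian smoothing."""
    amap = patch_scores.reshape(grid_hw)
    scale = (image_hw[0] / grid_hw[0], image_hw[1] / grid_hw[1])
    amap = zoom(amap, scale, order=1)        # bilinear-style upsampling
    return gaussian_filter(amap, sigma=sigma)
```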

On the right side of the Front Panel, the distribution graphs and evaluation statistics of image-wise and pixel-wise anomaly scores are displayed. The distribution graphs for normal (good) and anomalous (bad) images help to qualitatively assess the performance of the model. The evaluation statistics, represented by the Eval_Stats cluster, contain the following information: Scores(Good) and Scores(Bad) indicate the minimum and maximum scores calculated for the good and bad samples, respectively. F1_Max represents the maximum F1 score, and Threshold (F1_Max) is the anomaly score threshold at which this maximum F1 score is achieved. IoU_Max indicates the maximum Intersection over Union (IoU) value, and Threshold (IoU_Max) is the anomaly score threshold at which the maximum IoU is achieved. The AUROC (Area Under the Receiver Operating Characteristic curve) evaluates the model’s ability to distinguish between normal and anomalous samples across a range of thresholds, providing an overall measure of detection performance.

The path where the evaluation statistics file was saved is displayed in the Eval_Stats_File_Path indicator.


Block Diagram of the Evaluation VI

The evaluation process includes PatchCore initialization for inference, dataset reading, anomaly score prediction for the good and bad datasets, separation of anomalous regions in bad images using ground truth masks, evaluation of image-wise and pixel-wise statistics, and saving of the statistics to a separate file for further use during inference. The block diagram of the Evaluation VI is shown below.


Evaluation VI Block Diagram

To begin the evaluation, “NNPC_Init(Inference).vi” is used to initialize a PatchCore instance for inference; the initialized instance contains the backbone and the inference parameters. After initialization, the training and testing datasets are read and preprocessed. It is recommended to use the same Preprocessing_Options as those used during the training.

The training and testing datasets are separately passed to “NNPC_Predict.vi”, which computes image-wise and pixel-wise anomaly scores. Following this, “NNPC_Bad_Pixel_Scores_By_GT.vi” is used to identify bad pixel scores using the ground truth masks. Once the image-level and pixel-level scores are calculated, they are processed by “NNPC_Eval_Scores_And_Stats.vi”, which generates score distributions for the normal (good) and anomalous (bad) image samples, along with the calculated statistics. Upon completion, the evaluation statistics for image-wise and pixel-wise anomaly detection are saved to the specified path using “NNPC_Stats(Write).vi”. Finally, “NNPC_Destroy.vi” is called to release all resources allocated during the evaluation process.

The file generated at this stage will be used at the inference stage.


Inference

Once the model has been trained and evaluated and the thresholds have been determined, the model can be used for inference, which is implemented in “2_PatchCore_Inference(Image).vi”.


Reference Project’s Inference VI

Front Panel of the Inference VI

The Front Panel of the Inference VI is shown below.

Inference VI Front Panel

The files generated during the training (the feature extractor network’s configuration and weights, the PatchCore configuration, and the memory bank) and evaluation (statistics) stages are provided in the NNFE_Paths, PatchCore_Paths, and Eval_Stats_File_Path controls, respectively. The Test_Dataset_Folder_Path and Ground_Truth_Folder_Path controls specify the test dataset’s folders containing the images and the ground truth masks, respectively, on which the trained model will be tested.

The NNFE_Custom_CFG cluster, as in the training and evaluation stages, specifies custom configuration parameters for the feature extractor. At the inference stage it is recommended to leave Input3D_Custom_Dims equal to {-1,-1,-1}, so that the same input dimensions as in the training are used.

The Feature_Scorer_Params cluster, as in the evaluation stage, defines the parameters for calculating image-wise and pixel-wise anomaly scores. It is recommended to use the same Feature_Scorer_Params as during the evaluation stage to ensure consistency.

The PCore_Device control specifies the device for PatchCore computations, either CPU or GPU (CuLab). To learn how to set up the LabVIEW project to utilize GPUs, please refer to “Appendix: Accelerating Computations on GPUs with CuLab”.

The Image_idx control selects a test image from the dataset specified in Test_Dataset_Folder_Path, and the path of the selected image is displayed in the Image_File_Path indicator.

The Image_Thrshld(Custom) and Pixel_Thrshld(Custom) controls set the respective thresholds for the inference; if set to -1, the optimal threshold at the F1 maximum (extracted from the statistics file) will be used. The Image_Thrshld(Eval) and Pixel_Thrshld(Eval) indicators display the optimal image-wise and pixel-wise thresholds calculated during the evaluation stage. The predicted image score is displayed in the Img_Score indicator, and the pixel-wise maximum and minimum scores are displayed in Pixel_Score(Max) and Pixel_Score(Min), respectively.
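The threshold fallback behavior can be summarized in a tiny sketch (the numeric values here are made up for illustration):

```python
def resolve_threshold(custom_thr, eval_thr_f1max):
    """A custom threshold of -1 falls back to the F1-max threshold
    loaded from the evaluation statistics file."""
    return eval_thr_f1max if custom_thr == -1 else custom_thr

# Hypothetical values: evaluation suggested 2.1 as the F1-max threshold.
img_score = 2.7
threshold = resolve_threshold(custom_thr=-1, eval_thr_f1max=2.1)
print("anomalous" if img_score > threshold else "normal")   # -> anomalous
```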

The Heatmap_Config cluster configures the settings for generating the anomaly map. ColorMap_Type specifies the color scheme used for the heatmap, and Alpha is the blending value that determines the transparency of the color-mapped heatmap when it is overlaid on the original image.
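The overlay logic corresponds roughly to this NumPy/Matplotlib sketch (a hedged illustration; image_rgb is assumed to be a float RGB array in [0, 1]):

```python
import numpy as np
import matplotlib

def overlay_heatmap(image_rgb, anomaly_map, alpha=0.5, cmap="jet"):
    """Color-map the normalized anomaly map and alpha-blend it onto
    the original image."""
    amap = (anomaly_map - anomaly_map.min()) / (np.ptp(anomaly_map) + 1e-8)
    heat = matplotlib.colormaps[cmap](amap)[..., :3]   # drop the alpha channel
    return (1 - alpha) * image_rgb + alpha * heat
```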

The Report_Image_Dims control sets the X and Y resolutions of the resulting anomaly detection images. The Anomaly_Detection_Results image combines the original image, ground truth, heatmap, predicted mask, and segmentation result into a single image.


Block Diagram of the Inference VI

The inference process includes PatchCore initialization, reading the dataset path, reading the evaluation statistics, image selection and preprocessing, prediction of anomaly scores, thresholding to identify anomalies, and displaying the anomaly detection results. The block diagram of “2_PatchCore_Inference(Image).vi” is shown below.

Inference VI Block Diagram

To begin the inference, “NNPC_Init(Inference).vi” is used to initialize the PCore_Instance for inference, containing the backbone and the inference parameters. “NNPC_Stats(Read).vi” is used to load the image-level and pixel-level statistics calculated during the evaluation stage.

The main inference process starts by selecting the sample indicated by Image_idx from the dataset and preprocessing it. When reading the image, as in the evaluation stage, it is recommended to use the same Preprocessing_Options as those applied during the training. The preprocessed image is then passed to “NNPC_Predict.vi” to generate image-wise and pixel-wise anomaly scores, and the predicted image-wise anomaly score is compared with the image threshold.

Subsequently, “Combine_Anomaly_Detection_Results.vi” and “Render_AD_Results.vi” assemble the input image, the ground truth, and the predictions, and present them as a single large image.

When the inference is halted with the Stop button, the “NNPC_Destroy.vi” is called to release all resources allocated during the inference process.



Report Generation

The project contains a utility VI, “3_PatchCore_Report(Dataset).vi”, which runs the inference process and generates detection results for the whole dataset.

Reference Project’s Report(Dataset) VI

The front panel and block diagram of the “3_PatchCore_Report(Dataset).vi” are shown in the snapshots below, respectively.

Report Generation VI Front Panel

Report Dataset Block Diagram

The implementation of this VI is very similar to that of the inference VI, with the key difference being that the user must provide a destination path in "Results_Destination_Folder" where the results will be saved.


Performance

The anomaly detection performance was evaluated on the MVTec AD dataset, with the PatchCore method configured for 224x224 input image dimensions, a 10% coreset subsampling ratio, and a projection dimension calculated using the Johnson-Lindenstrauss lemma with an epsilon of 0.9. The benchmark was conducted on an NVIDIA GeForce RTX 2080 Ti GPU. The table below summarizes the anomaly detection results.

It is important to note that the parameters were not specifically tuned for optimal performance on each dataset class; thus, these results represent the model’s effectiveness under default settings. As can be seen, with this approach it is possible to train a model in less than a minute, while the average inference time is 9.61 ms per image, demonstrating the suitability of this approach for real-time applications. While most classes achieve accuracies close to 1, there is room for improvement for some specific classes (e.g. “screw”). We will cover the results of a detailed investigation and parameter tuning strategies in one of our upcoming blog posts.


A video demonstration highlighting the anomaly detection process in action.



Conclusion

In conclusion, the reference project serves as a practical example of how to use the PatchCore Add-On library to address visual anomaly detection tasks. It is available for download at the GitHub link.


Appendix: Accelerating Computations on GPUs with CuLab

Two portions of the code can be executed on GPUs: the neural network related part (the backbone feature extractor) and the PatchCore computations. The neural network part is entirely DeepLTK-based, which simplifies switching the computation target (CPU or GPU). The PatchCore-related part requires the CuLab toolkit in order to be executed on the GPU; this dependency is optional. To avoid broken VIs on machines where CuLab is not installed, switching PatchCore to the GPU is controlled by Conditional Disable Symbols.

To enable GPU support for PatchCore computations, CuLab (GPU Toolkit for LabVIEW) should first be installed. After the installation, a conditional disable symbol called “USE_CULAB” should be created in Project Properties, and its value should be set to “TRUE”, as shown in the snapshot below. Once configured, the GPU can be selected via the PCore_Device parameter when initializing the PatchCore instance.


Setting Conditional Disable Symbol “USE_CULAB” to True

