- Introduction: Why grayscale images may affect anomaly detection.
- Anomaly detection, grayscale images: A quick recap of the two main topics covered in this article.
- Experiment setting: What we compare and how.
- Performance results: How grayscale images affect model performance.
- Speed results: How grayscale images affect inference speed.
- Conclusion
1. Introduction
In this article, we'll explore how grayscale images affect the performance of anomaly detection models and examine how this choice influences inference speed.
In computer vision, it's well established that fine-tuning pre-trained classification models on grayscale images can lead to degraded performance. But what about anomaly detection models? These models don't require fine-tuning, but they use pre-trained classification models such as WideResNet or EfficientNet as feature extractors. This raises an important question: do these feature extractors produce less relevant features when applied to a grayscale image?
This question isn't just academic, but one with real-world implications for anyone working on automating industrial visual inspection in manufacturing. For example, you might find yourself wondering whether a color camera is necessary or a cheaper grayscale one would be sufficient. Or you might have concerns about inference speed and want to use every opportunity to increase it.
2. Anomaly detection, grayscale images
If you're already familiar with both anomaly detection in computer vision and the basics of digital image representation, feel free to skip this section. Otherwise, it provides a brief overview and links for further exploration.
Anomaly detection
In computer vision, anomaly detection is a fast-evolving subfield of deep learning that focuses on identifying unusual patterns in images. Typically, these models are trained using only defect-free images, allowing the model to learn what "normal" looks like. During inference, the model can flag images that deviate from this learned representation as abnormal. Such anomalies often correspond to various defects that may appear in a production environment but were not seen during training. For a more detailed introduction, see this link.
Grayscale images
For humans, color and grayscale images look quite similar (apart from the lack of color). But for computers, an image is an array of numbers, so it becomes a little more complicated. A grayscale image is a two-dimensional array of numbers, typically ranging from 0 to 255, where each value represents the intensity of a pixel, with 0 being black and 255 being white.
In contrast, color images typically consist of three such separate grayscale images (called channels) stacked together to form a three-dimensional array. Each channel (red, green, and blue) describes the intensity of the respective color, and their combination creates a color image. You can learn more about this here.
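To make the channel structure concrete, here is a minimal NumPy sketch (not part of the article's experiment code) that converts a tiny RGB array to grayscale using the standard BT.601 luma weights, the same formula PIL's "L" mode uses:

```python
import numpy as np

# A tiny synthetic 2x2 RGB image: shape (H, W, 3), values 0-255.
rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

# BT.601 luma weights for the red, green, and blue channels.
weights = np.array([0.299, 0.587, 0.114])

# Weighted sum over the channel axis collapses (H, W, 3) to (H, W).
gray = np.round(rgb @ weights).astype(np.uint8)

print(rgb.shape)   # (2, 2, 3) - three channels
print(gray.shape)  # (2, 2)    - a single intensity channel
print(gray[1, 1])  # the white pixel maps to 255
```

Note how the green channel contributes the most to the perceived intensity, which is why the weights are not simply 1/3 each.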
3. Experiment setting
Models
We'll use four state-of-the-art anomaly detection models: PatchCore, Reverse Distillation, FastFlow, and GLASS. These models represent different types of anomaly detection algorithms and, at the same time, are widely used in practical applications due to their fast training and inference speed. The first three models use the implementation from the Anomalib library; for GLASS, we use the official implementation.

Dataset
For our experiments, we use the VisA dataset with 12 categories of objects, which provides a variety of images and has no color-dependent defects.

Metrics
We'll use image-level AUROC to see whether the whole image was classified correctly without the need to pick a particular threshold, and pixel-level AUPRO, which shows how well we localize defective areas in the image. Speed will be evaluated using the frames-per-second (FPS) metric. For all metrics, higher values correspond to better results.
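As a quick illustration of why AUROC is threshold-free, here is a minimal sketch with hypothetical image-level anomaly scores (invented for illustration, not taken from the experiments), using scikit-learn:

```python
from sklearn.metrics import roc_auc_score

# Hypothetical ground truth: 1 = defective image, 0 = normal image.
labels = [0, 0, 0, 1, 1, 1]

# Hypothetical image-level anomaly scores produced by a model.
scores = [0.10, 0.25, 0.40, 0.35, 0.80, 0.90]

# AUROC only depends on how the scores rank defective vs. normal
# images, so no decision threshold needs to be chosen.
auroc = roc_auc_score(labels, scores)
print(f"image-level AUROC: {auroc:.3f}")
```

A perfect ranking would give 1.0; here one defective image (score 0.35) is ranked below one normal image (score 0.40), which lowers the AUROC.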
Grayscale conversion
To convert an image to grayscale, we will use torchvision transforms.

For one channel, we also modify the feature extractors using the in_chans parameter in the timm library.

The code for adapting Anomalib to use one channel is available here.
4. Performance results
RGB
These are regular images with red, green, and blue channels.

Grayscale, three channels
Images were converted to grayscale using the torchvision Grayscale transform with three output channels.

Grayscale, one channel
Images were converted to grayscale using the same torchvision Grayscale transform with one output channel.

Comparison
We can see that PatchCore and Reverse Distillation achieve similar results across all three experiments for both image- and pixel-level metrics. FastFlow becomes somewhat worse, and GLASS becomes noticeably worse. Results are averaged across the 12 categories of objects in the VisA dataset.
What about results per category of objects? Maybe some of them perform worse and others better, causing the averaged results to look the same? Here is the visualization of results for PatchCore across all three experiments, showing that results are quite stable within categories as well.

The same visualization for GLASS shows that some categories become slightly better while others become strongly worse. However, this isn't necessarily caused by the grayscale transformation alone; some of it may be ordinary result fluctuation due to how the model is trained. The averaged results show a clear tendency: for this model, RGB images produce the best results, grayscale with three channels noticeably worse ones, and grayscale with one channel the worst.

Bonus
How do results change per category? It's possible that some categories are simply better suited to RGB or grayscale images, even when there are no color-dependent defects.
Here is the visualization of the difference between RGB and grayscale with one channel for all the models. We can see that only the pipe_fryum category becomes slightly (or strongly) worse for every model. The rest of the categories become worse or better, depending on the model.

Extra bonus
If you're curious what this pipe_fryum looks like, here are a couple of examples with GLASS model predictions.

5. Speed results
The number of channels affects only the first layer of the model; the rest remains unchanged. The speed improvement appears to be negligible, highlighting how first-layer feature extraction is only a small part of the computation performed by these models. GLASS shows a somewhat noticeable improvement, but at the same time it shows the worst decline in metrics, so caution is required if you want to speed it up by switching to one channel.
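For reference, here is a simple way to estimate FPS on dummy input. This is a rough CPU sketch, not the exact benchmarking setup used in the experiments; the toy model is a stand-in:

```python
import time
import torch

def measure_fps(model, input_shape, n_warmup=10, n_runs=100):
    """Rough FPS estimate for a model on random dummy input."""
    x = torch.rand(*input_shape)
    model.eval()
    with torch.no_grad():
        for _ in range(n_warmup):      # warm-up runs are excluded from timing
            model(x)
        start = time.perf_counter()
        for _ in range(n_runs):
            model(x)
        elapsed = time.perf_counter() - start
    return n_runs / elapsed

# Example with a trivial single-channel stand-in model:
toy = torch.nn.Conv2d(1, 8, kernel_size=3)
print(f"{measure_fps(toy, (1, 1, 64, 64)):.1f} FPS")
```

On GPU, the timing loop would additionally need `torch.cuda.synchronize()` before reading the clock, since CUDA kernels run asynchronously.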

6. Conclusion
So how does using grayscale images affect visual anomaly detection? It depends, but RGB seems to be the safer bet. The impact varies with the model and the data. PatchCore and Reverse Distillation generally handle grayscale inputs well, but you should be more careful with FastFlow and especially GLASS, which shows some speed improvement but also the most significant drop in performance metrics. If you want to use grayscale input, it's best to test it and compare it with RGB on your specific data.
The Jupyter notebook with the Anomalib code: link.
Follow the author on LinkedIn for more on industrial visual anomaly detection.
References
1. C. Hughes, Transfer Learning on Greyscale Images: How to Fine-Tune Pretrained Models (2022), towardsdatascience.com
2. S. Wehkamp, A practical guide to image-based anomaly detection using Anomalib (2022), blog.ml6.eu
3. A. Baitieva, Y. Bouaouni, A. Briot, D. Ameln, S. Khalfaoui, and S. Akcay, Beyond Academic Benchmarks: Critical Analysis and Best Practices for Visual Industrial Anomaly Detection (2025), CVPR Workshop on Visual Anomaly and Novelty Detection (VAND)
4. Y. Zou, J. Jeong, L. Pemula, D. Zhang, and O. Dabeer, SPot-the-Difference Self-Supervised Pre-training for Anomaly Detection and Segmentation (2022), ECCV
5. S. Akcay, D. Ameln, A. Vaidya, B. Lakshmanan, N. Ahuja, and U. Genc, Anomalib (2022), ICIP

