This study evaluates Patch Distribution Modeling (PaDiM) for label-free visual anomaly detection in industrial quality inspection systems. It reports consistent generalisation across three benchmark datasets, with pixel-level anomaly maps that are operationally interpretable.
The study uses the PaDiM framework, which takes features from pre-trained CNNs (ResNet-18 and Wide-ResNet-50-2), models the distribution of normal patches, and scores test patches by Mahalanobis distance. Controlled experiments were run on three public industrial benchmarks (MVTec AD, VisA, and BTAD) that span a range of acquisition difficulty. Evaluation uses AUROC at the image level and the pixel level to measure anomaly detection and localisation performance.
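The scoring step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the study's implementation: it assumes feature maps have already been extracted from a pre-trained CNN and stacked into arrays, and it uses synthetic random arrays in their place. The function names, the array layout `(N, H, W, D)`, and the regularisation constant `eps` are all assumptions made for this example.

```python
import numpy as np


def fit_patch_gaussians(train_feats, eps=0.01):
    """Fit one Gaussian per patch position from normal-only features.

    train_feats: array of shape (N, H, W, D), assumed to hold CNN features
    of N defect-free images. Returns the per-position mean (H, W, D) and
    inverse covariance (H, W, D, D). eps is an assumed regulariser that
    keeps every covariance matrix invertible.
    """
    n, h, w, d = train_feats.shape
    mean = train_feats.mean(axis=0)
    centered = train_feats - mean
    # Sample covariance at each (h, w) position.
    cov = np.einsum("nhwi,nhwj->hwij", centered, centered) / (n - 1)
    cov = cov + eps * np.eye(d)
    return mean, np.linalg.inv(cov)


def anomaly_map(test_feats, mean, cov_inv):
    """Mahalanobis distance of each patch to its position's Gaussian.

    test_feats: array of shape (H, W, D). Returns an (H, W) map in which
    larger values indicate patches that are less like the normal data.
    """
    delta = test_feats - mean
    sq_dist = np.einsum("hwi,hwij,hwj->hw", delta, cov_inv, delta)
    return np.sqrt(np.maximum(sq_dist, 0.0))


# Synthetic stand-in for CNN features: 50 "normal" images, 4x4 patches, 8 channels.
rng = np.random.default_rng(0)
normal_feats = rng.normal(size=(50, 4, 4, 8))
mean, cov_inv = fit_patch_gaussians(normal_feats)

# A test image whose patch at row 1, column 2 is shifted far from normal.
test_feats = rng.normal(size=(4, 4, 8))
test_feats[1, 2] += 10.0
amap = anomaly_map(test_feats, mean, cov_inv)
image_score = amap.max()  # one common choice for an image-level score
```

In a real pipeline the map would be upsampled to the input resolution before being rendered as an overlay; that step is omitted here.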
Generated 23 May 2026 · paper_summarizer_v1.0
Visual quality inspection in manufacturing environments requires detection methods that are label-efficient, spatially discriminative, and capable of producing outputs that support human decision-making at the point of inspection. Supervised defect classifiers address discriminative accuracy but depend on extensive annotated fault catalogues that are seldom available under operational conditions. Unsupervised one-class learning methods circumvent the annotation constraint but typically yield scalar image-level anomaly scores that convey no spatial information to inspection operators. The present study evaluates Patch Distribution Modeling (PaDiM) as an integrated framework addressing both requirements: a training-label-free method whose Mahalanobis distance scoring mechanism inherently produces pixel-level anomaly maps that can be rendered directly as colour overlay visualisations on inspection images. Controlled experiments were conducted on three publicly available industrial benchmarks representing a spectrum of acquisition difficulty: MVTec AD, VisA, and BTAD. On MVTec AD, a ResNet-18 backbone achieved a mean image-level AUROC of 0.8688 and pixel-level AUROC of 0.9730 across three evaluated product categories. Substituting a Wide-ResNet-50-2 backbone with 256 randomly projected features elevated image-level AUROC on the bottle category to 1.0000. On VisA, mean pixel-level AUROC attained 0.9829, the highest value recorded across all three datasets, indicating reliable sub-region defect localisation under elevated intra-class appearance variability. On BTAD, whose images were collected on an operational production line, mean image-level AUROC reached 0.9350, with two of three product categories exceeding 0.98 and one achieving 1.00; pixel-level AUROC remained above 0.95 across all categories.
These results indicate that PaDiM generalises consistently across datasets of markedly different character, and that the heatmap overlays produced by the pipeline provide spatially accurate, operationally interpretable defect indicators without requiring fault labels at any stage.
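The two metrics reported above differ only in what is being ranked: one score per image, or one score per pixel. The sketch below illustrates this with a rank-based AUROC written in plain NumPy. It is an illustration under stated assumptions, not the study's evaluation code: the function names are hypothetical, the image-level score is assumed to be the maximum of each anomaly map, and the data are tiny hand-made arrays.

```python
import numpy as np


def auroc(scores, labels):
    """Rank-based AUROC (Mann-Whitney U statistic), with ties given average ranks.

    scores: anomaly scores, higher meaning more anomalous.
    labels: 1 (or True) for anomalous, 0 (or False) for normal.
    """
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels).ravel().astype(bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Average 1-based rank for each group of tied scores.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    u_stat = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def image_auroc(anomaly_maps, image_labels):
    """Image-level AUROC, assuming each image is scored by its map maximum."""
    maps = np.asarray(anomaly_maps, dtype=float)
    image_scores = maps.reshape(maps.shape[0], -1).max(axis=1)
    return auroc(image_scores, image_labels)


def pixel_auroc(anomaly_maps, masks):
    """Pixel-level AUROC over all pixels of all images pooled together."""
    return auroc(np.asarray(anomaly_maps), np.asarray(masks))


# Two toy 2x2 anomaly maps: the first is defect-free, the second has one defect pixel.
toy_maps = np.array([
    [[0.1, 0.1], [0.1, 0.1]],
    [[0.1, 0.9], [0.1, 0.1]],
])
toy_masks = np.array([
    [[0, 0], [0, 0]],
    [[0, 1], [0, 0]],
])
toy_image_labels = np.array([0, 1])

img_auc = image_auroc(toy_maps, toy_image_labels)
pix_auc = pixel_auroc(toy_maps, toy_masks)
```

Pooling pixels across images, as done here, weights large defects more heavily than small ones. That is one reason image-level and pixel-level AUROC can diverge on the same dataset.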