Transformer-Based and Generative AI Framework for Automated Defect Detection in Semiconductor Wafer Inspection
Balachandar Jeganathan*
ASML, San Jose, CA, USA.
*Author to whom correspondence should be addressed.
Abstract
The continuous downscaling of semiconductor feature sizes and the introduction of complex multilayer architectures have made wafer defect detection increasingly critical to maintaining high device yield and process stability. Conventional automated inspection systems that rely on rule-based image analysis or convolutional neural networks (CNNs) often fail to generalize when exposed to subtle, pattern-dependent variations or unseen defect morphologies. Such limitations hinder rapid feedback and adaptive process optimization in modern fabrication environments. To address these challenges, I propose a hybrid Vision Transformer–Variational Autoencoder–Generative Adversarial Network (ViT–VAE-GAN) framework that integrates discriminative and generative learning for robust, interpretable wafer defect detection.
The Vision Transformer (ViT) leverages global self-attention mechanisms to model long-range spatial dependencies, enabling the identification of subtle and context-sensitive wafer anomalies that CNNs typically overlook. Complementing this, the VAE-GAN branch performs unsupervised learning of the defect-free wafer distribution, supporting both anomaly detection and realistic defect synthesis for data augmentation. Together, these components provide a comprehensive framework that balances classification accuracy and model interpretability.
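As an illustration of the global self-attention idea described above, the following is a minimal single-head sketch in plain Python, not the implementation used in this work. The function names, the toy patch count, the embedding size, and the random weights are all assumptions chosen for readability. The point it shows is that every patch embedding receives a weighted contribution from every other patch, which is how long-range spatial context enters the representation.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(patches, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over patch embeddings.

    patches: one embedding (list of floats) per image patch.
    w_q, w_k, w_v: hypothetical projection matrices (dim x dim).
    Returns the attended embeddings and the patch-to-patch attention weights.
    """
    q = matmul(patches, w_q)
    k = matmul(patches, w_k)
    v = matmul(patches, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    # Each row holds one patch's attention distribution over ALL patches.
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy setup (assumed values): 4 patches with 3-dimensional embeddings.
random.seed(0)
num_patches, dim = 4, 3
patches = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_patches)]
w_q, w_k, w_v = (
    [[random.gauss(0, 1) for _ in range(dim)] for _ in range(dim)] for _ in range(3)
)
out, attn = self_attention(patches, w_q, w_k, w_v)
```

Each row of `attn` sums to one and spans all patches, so a patch at the wafer edge can draw on evidence from the wafer center in a single layer. Rows of this kind are also the raw material for attention visualizations.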
Experimental evaluations were conducted using the WM-811K public wafer map dataset and a synthetically generated optical/SEM-style wafer defect dataset. The proposed method achieved a 12.8% improvement in classification accuracy and a 26% reduction in false alarm rate (FAR) compared with strong CNN and pure ViT baselines. Qualitative analyses using attention visualizations revealed that the model not only identifies known defect types but also generalizes effectively to novel and complex anomalies. The hybrid ViT–VAE-GAN framework demonstrates strong adaptability across diverse operational conditions, providing a scalable foundation for future deployment in real-time fab-level manufacturing environments. The results indicate that transformer-driven architectures, when coupled with generative learning, can significantly enhance automated defect inspection pipelines. This research contributes toward the development of adaptive, self-learning inspection systems for next-generation semiconductor fabs, improving both productivity and yield sustainability.
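For readers unfamiliar with the false alarm rate metric, the sketch below shows one common way to compute it and to express a reduction relative to a baseline. It assumes FAR is defined as false positives divided by all defect-free samples, and that the reduction is measured relative to the baseline; the abstract does not state either convention explicitly. The labels and predictions are toy values for illustration only, not results from this work.

```python
def false_alarm_rate(y_true, y_pred):
    """FAR = FP / (FP + TN), with 0 = defect-free and 1 = defective (assumed convention)."""
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    negatives = false_pos + true_neg
    return false_pos / negatives if negatives else 0.0


def relative_reduction(baseline, proposed):
    """Fractional reduction of a rate relative to the baseline value."""
    return (baseline - proposed) / baseline


# Toy labels: four defect-free wafers followed by two defective ones.
y_true = [0, 0, 0, 0, 1, 1]
baseline_pred = [1, 1, 0, 0, 1, 1]  # hypothetical baseline: two false alarms
proposed_pred = [1, 0, 0, 0, 1, 1]  # hypothetical improved model: one false alarm

far_baseline = false_alarm_rate(y_true, baseline_pred)       # 0.5
far_proposed = false_alarm_rate(y_true, proposed_pred)       # 0.25
reduction = relative_reduction(far_baseline, far_proposed)   # 0.5
```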
Keywords: Vision Transformer (ViT), wafer inspection, VAE-GAN, semiconductor manufacturing, machine vision, anomaly detection, yield optimization