Improving Visual Quality Inspection

Researching highly compact deep neural network architecture for visual quality inspection in high throughput manufacturing

AMD has submitted this post.

An example of a surface defect visual inspection system. (Image courtesy of AMD)

A critical aspect of the manufacturing process is the visual quality inspection of manufactured components for defects and flaws. Human-only visual inspection can be time-consuming and is a significant bottleneck in high-throughput manufacturing scenarios.

Given significant advances in the field of deep learning, automated visual quality inspection can enable highly efficient and reliable detection of defects and flaws during the manufacturing process. However, deep learning-driven visual inspection methods require significant computational resources, which is itself a barrier to widespread adoption in smart factories.

In this study, we investigated a machine-driven design exploration approach to create TinyDefectNet, a highly compact deep convolutional network architecture tailored for visual quality inspection in high-throughput manufacturing.

TinyDefectNet comprises just ∼427K parameters and has a computational complexity of ∼97M FLOPs, yet achieves the detection accuracy of a state-of-the-art architecture on the task of surface defect detection on the NEU defect benchmark dataset.

As such, TinyDefectNet achieves the same level of detection performance at 52× lower architectural complexity and 11× lower computational complexity. Furthermore, TinyDefectNet was deployed on an AMD EPYC 7R32 processor, achieving 7.6× faster throughput in the native TensorFlow environment and 9× faster throughput with the AMD ZenDNN accelerator library.

Finally, an explainability-driven performance validation strategy was conducted to ensure that TinyDefectNet exhibits correct decision-making behaviour, improving trust in its usage by operators and inspectors.

The promise of a new era of machine learning has motivated different industries to augment and enhance the capabilities and efficiency of their existing workforce with artificial intelligence (AI) applications that automate highly repetitive and time-consuming processes. In particular, advances in deep learning have led to promising results in applications ranging from computer vision tasks such as image classification and video object segmentation to natural language processing tasks such as language translation and question answering.

However, success in adopting deep learning in real-world applications has largely been limited to newer technology sectors such as e-commerce and social media, where tasks are inherently automation-friendly and run without human intervention in controlled environments (e.g., product/content recommendation). Adoption remains very limited in traditional industrial sectors such as manufacturing, where tasks are currently performed manually by human agents in unconstrained environments.

As an example, the visual quality inspection of manufactured components to identify defects and flaws is a critically important manufacturing task that is laborious and time-consuming (see Figure 1 for an example of a surface defect visual inspection system). Automating this process would therefore yield significant cost savings and greatly improved production efficiency. However, a number of key challenges make it very difficult to adopt deep learning for visual quality inspection in a real-world production setting, including:

Small data problem: the availability of annotated data is limited, which makes building highly accurate deep learning models a challenge.

Highly constrained operational requirements: designing high-performance deep learning models that satisfy tight constraints on inference speed and model size is very challenging, especially in high-throughput manufacturing scenarios.

A common practice in developing custom deep learning models for new industrial applications is to take an off-the-shelf generic model published in the research literature, such as a ResNet or MobileNet architecture, and apply transfer learning to learn a new task from the available training data. While this approach enables rapid prototyping to assess the feasibility of the task at hand, such generic off-the-shelf models are not tailored to specific industrial tasks and are often unable to meet the speed and size requirements of real-world industrial deployment. This makes it very challenging to adopt off-the-shelf generic models for visual quality inspection applications under real-world manufacturing scenarios.
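The transfer-learning baseline described above can be sketched in Keras as follows. This is a minimal illustration, not the article's actual setup: the head design is assumed, and `weights=None` is used here to keep the snippet self-contained (in practice one would load `weights="imagenet"` and freeze the backbone for the first epochs). The class count (6) and input size (200) match the NEU-Det description later in the article.

```python
import tensorflow as tf

def build_transfer_model(num_classes: int = 6, input_size: int = 200) -> tf.keras.Model:
    # Off-the-shelf backbone; weights=None avoids a download in this sketch.
    backbone = tf.keras.applications.ResNet50(
        include_top=False, weights=None,
        input_shape=(input_size, input_size, 3))
    # Hypothetical classification head for the new task.
    x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(backbone.input, outputs)

model = build_transfer_model()
```

Note that even before training, the size problem is visible: the ResNet-50 backbone alone carries tens of millions of parameters, far beyond the footprint TinyDefectNet targets.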

A very promising strategy for the creation of highly customized deep learning models tailored for manufacturing visual quality inspection applications is machine-driven design exploration, where the goal is to automatically identify deep neural network architecture designs based on operational requirements.  

One path towards machine-driven design exploration is neural architecture search (NAS), where the problem is formulated as a large-scale, high-dimensional search problem and solved using algorithms such as reinforcement learning and evolutionary algorithms. While such methods have shown promising results in designing new deep neural network architectures, they are very computationally intensive and require large-scale computing resources over long search times.

More recently, another path towards machine-driven design exploration is the concept of generative synthesis, where the problem is formulated as a constrained optimization problem and an approximate solution is found iteratively using a generator-inquisitor pair. This approach has been demonstrated to successfully generate deep neural network architectures tailored to different types of tasks across different fields and applications.

In this study, we explored the efficacy of machine-driven design exploration for designing deep neural network architectures tailored for high-throughput manufacturing visual quality inspection. Leveraging this strategy and a set of operational requirements, we introduce TinyDefectNet, a highly compact deep convolutional network architecture automatically tailored for visual surface quality inspection. Furthermore, we evaluate the capability of the ZenDNN accelerator library for AMD processors to further reduce the run-time latency of TinyDefectNet for high-throughput inspection.

Methodology  

The macro- and micro-architectures of the proposed TinyDefectNet are designed by leveraging the concept of generative synthesis for the purpose of machine-driven design exploration. The concept of generative synthesis revolves around the formulation of the design exploration problem as a constrained optimization problem.  

More specifically, we wish to find an optimal generator G∗(·) which, given a set of seeds S, generates network architectures {Ns | s ∈ S} that maximize a universal performance function U, subject to constraints defined by a predefined set of operational requirements formulated via an indicator function 1g(·):

G∗ = max_G U(G(s))  s.t.  1g(G(s)) = 1, ∀s ∈ S.  (1)

Finding G∗(·) directly is computationally infeasible. As such, we find an approximate solution to G∗(·) through an iterative optimization process in which, at each step, the previous generator solution Ḡ(·) is evaluated by an inquisitor I via its newly generated architectures Ns, and this evaluation is used to produce a new generator solution. This iterative process is initialized with a prototype φ, U, and 1g(·). In this study, we define a residual design prototype φ based on the principles proposed in [1], define U based on [15], and define the indicator function 1g(·) with the following operational constraint: the number of floating-point operations (FLOPs) must be within 5% of 100M FLOPs, to account for high-throughput manufacturing visual inspection scenarios. The network architecture of TinyDefectNet is shown in Figure 2, and two key observations are worth highlighting in more detail.
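To make the formulation concrete, the constrained search can be caricatured in a few lines of Python: a toy "generator" enumerates candidate channel widths for a small fixed-depth conv net, the indicator function 1g(·) rejects any candidate whose FLOPs fall outside the 5%-of-100M budget, and a stand-in score ranks the survivors. The width choices, FLOPs model, and scoring function are illustrative assumptions only, not the actual generative-synthesis machinery.

```python
import itertools

def conv_flops(widths, spatial=200, k=3):
    # Rough FLOPs for a chain of 3x3 same-padding convs on a grayscale-as-RGB
    # input: 2 * k * k * c_in * c_out multiply-accumulates per output pixel.
    flops, c_in = 0, 3
    for c_out in widths:
        flops += 2 * k * k * c_in * c_out * spatial * spatial
        c_in = c_out
    return flops

def indicator(widths, budget=100e6, tol=0.05):
    # 1_g(.): the FLOPs of a candidate must land within tol of the budget.
    return abs(conv_flops(widths) - budget) / budget <= tol

def search(choices=(4, 8, 16, 24), depth=4):
    # Enumerate candidates, keep only those satisfying the constraint,
    # and rank survivors with a toy stand-in for the performance function U.
    best = None
    for widths in itertools.product(choices, repeat=depth):
        if not indicator(widths):
            continue  # violates the operational requirement
        score = sum(widths)  # stand-in score: prefer wider surviving nets
        if best is None or score > best[0]:
            best = (score, list(widths))
    return best

best = search()
```

The real process replaces brute-force enumeration with the iterative generator-inquisitor loop described above, but the role of the indicator function is the same: every architecture it emits already respects the operational budget.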

First, it can be observed that the micro- and macro-architecture designs of the proposed TinyDefectNet possess heterogeneous, lightweight characteristics that strike a strong balance between representational capacity and model efficiency. Second, it can also be observed that TinyDefectNet has a shallower macro-architecture design to facilitate low latency, making it well-suited for high-throughput inspection scenarios. These architectural traits highlight the efficacy of leveraging a machine-driven design exploration strategy to create high-performance, customized deep neural network architectures tailored to the task and operational requirements of a given application.

Results

The generated TinyDefectNet architecture is evaluated on the NEU-Det benchmark dataset for surface defect detection, with the off-the-shelf ResNet-50 architecture also evaluated for comparison. Both architectures were implemented in Keras with a TensorFlow backend. The NEU-Det benchmark used in this study is a metallic surface defect dataset comprising six defect types: rolled-in scale (RS), patches (Pa), crazing (Cr), pitted surface (PS), inclusion (In) and scratches (Sc).

The dataset contains 1,800 grayscale images with an equal number of samples per class; of the 300 images in each class, 240 are assigned for training and 60 for testing. The input image size to the evaluated deep learning models is 200 × 200.

Figure 3. Examples of different surface defect types: (a) Scratches, (b) Pitted surface, (c) Rolled-in scale, (d) Patches, (e) Inclusion and (f) Crazing.
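As a quick sanity check, the split described above works out as follows (mirroring the numbers in the text; nothing here is new data):

```python
# NEU-Det split: six defect classes, 300 grayscale 200x200 images each,
# with 240 training and 60 test images per class.
NUM_CLASSES, PER_CLASS, TRAIN_PER_CLASS = 6, 300, 240

total_images = NUM_CLASSES * PER_CLASS        # 1,800 images in total
train_images = NUM_CLASSES * TRAIN_PER_CLASS  # 1,440 training images
test_images = total_images - train_images     # 360 test images
test_per_class = test_images // NUM_CLASSES   # 60 test images per class
```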

Performance Analysis  

Table 1 shows the quantitative performance of the proposed TinyDefectNet. It can be observed that the TinyDefectNet network architecture comprises only ∼427K parameters, which is 56× smaller than an off-the-shelf ResNet-50 architecture. Furthermore, in terms of computational complexity, TinyDefectNet requires only ∼97M FLOPs to process the input data, compared to the 1.1B FLOPs required by the ResNet-50 architecture at the same input size, and thus requires 11× fewer FLOPs. This efficiency is highly desirable, especially given that TinyDefectNet matches the accuracy of the ResNet-50 architecture. To further evaluate its efficiency, the run-time latency of TinyDefectNet is examined (at a batch size of 1024) on an AMD CPU (in this study, an AMD EPYC 7R32 processor) and compared with the ResNet-50 architecture.

The proposed TinyDefectNet architecture can process an input image in 2.5 ms, a 7.6× speed gain over the ResNet-50 architecture, which needs 19 ms to process the same image. The significant speed gains and complexity reductions demonstrated by TinyDefectNet over off-the-shelf architectures, while achieving high accuracy, make it highly suited for high-throughput manufacturing inspection scenarios and speak to the efficacy of leveraging a machine-driven design exploration strategy to produce deep neural network architectures tailored specifically for industrial tasks and applications.
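The per-image latencies quoted above imply the following back-of-envelope throughput figures (a simple worked calculation, assuming a single sequential inference stream):

```python
# Per-image latencies reported in the text, in milliseconds.
tiny_ms, resnet_ms = 2.5, 19.0

speedup = resnet_ms / tiny_ms      # 7.6x, matching the reported gain
tiny_ips = 1_000.0 / tiny_ms       # ~400 images per second
resnet_ips = 1_000.0 / resnet_ms   # ~53 images per second
```

In a batched deployment (e.g., the batch size of 1024 used in the benchmark), effective throughput is higher still, since per-image overheads amortize across the batch.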

ZenDNN is a run-time accelerator library that is easy to use, as it requires no model re-compilation or conversion and can be applied to different models and tasks. To further reduce the run-time latency of TinyDefectNet and improve throughput, we perform inference of TinyDefectNet within the ZenDNN environment.
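The article does not show the ZenDNN setup itself. As a rough sketch, CPU inference runs of this kind are typically tuned with standard OpenMP threading controls set before TensorFlow is imported; the specific values below are illustrative assumptions, not ZenDNN-specific flags.

```python
import os

# Standard OpenMP environment controls (not ZenDNN-specific). These must
# be set before TensorFlow loads so the runtime picks them up.
os.environ["OMP_NUM_THREADS"] = "16"      # e.g., one thread per physical core
os.environ["GOMP_CPU_AFFINITY"] = "0-15"  # pin worker threads to cores 0..15
```

With a ZenDNN-enabled TensorFlow build, inference code is then unchanged, which is precisely the "no re-compilation or conversion" property described above.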

Explainability-driven Performance Validation  

The proposed TinyDefectNet was audited using an explainability-driven performance validation strategy to gain deeper insights into its decision-making behavior when conducting visual quality inspection and ensure that its decisions are driven by relevant visual indicators associated with surface defects.  

In particular, we leverage a quantitative explainability strategy that has been shown to provide quantitative explanations which better reflect the decision-making process than other approaches in the literature, and to be effective not only for model auditing but also for identifying hidden data issues.

Based on the quantitative explanation, it can be observed that TinyDefectNet correctly decided that this particular surface image exhibits scratch defects by leveraging the actual scratch defects found on the surface during its decision-making process. Beyond confirming correct model behaviour, this explainability-driven performance validation process helps to improve trust in the model's deployment and usage by human operators and inspectors.
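The quantitative explainability method used in this study is not spelled out here. As a much simpler stand-in, an occlusion-sensitivity check (sketched below in NumPy with a toy scoring function in place of a real model) illustrates the same idea: verify which image regions actually drive a prediction by measuring how much the score drops when each region is blanked out.

```python
import numpy as np

def occlusion_map(image, score_fn, patch=25, stride=25):
    """Score drop when each patch is blanked out; a larger drop means
    that region mattered more to the prediction."""
    base = score_fn(image)
    h, w = image.shape
    heat = np.zeros(((h - patch) // stride + 1, (w - patch) // stride + 1))
    for i in range(heat.shape[0]):
        for j in range(heat.shape[1]):
            occluded = image.copy()
            occluded[i*stride:i*stride+patch, j*stride:j*stride+patch] = 0.0
            heat[i, j] = base - score_fn(occluded)
    return heat

# Toy stand-in "model": scores an image by its mean intensity, so the
# single bright patch below should dominate the occlusion map.
img = np.zeros((200, 200))
img[50:75, 100:125] = 1.0
heat = occlusion_map(img, lambda x: float(x.mean()))
```

For a real inspection model, `score_fn` would be the predicted probability of the defect class, and a trustworthy model should show its largest score drops over the defect regions themselves, not over background texture.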

Conclusion

In this study, we introduced TinyDefectNet, a highly effective and efficient deep neural network architecture for high-throughput manufacturing visual quality inspection. We took advantage of generative synthesis for machine-driven design exploration to design the micro- and macro-architectures of TinyDefectNet in an automated manner. Experimental results show that the proposed model is highly efficient, running 9× faster with the ZenDNN accelerator library on an AMD CPU than an off-the-shelf ResNet-50 architecture in a native TensorFlow environment. This is especially desirable since TinyDefectNet achieves 98 percent accuracy, on par with the ResNet-50 architecture.

An explainability-driven performance validation strategy was also conducted to ensure that TinyDefectNet exhibits correct decision-making behavior, improving trust in its deployment and usage. Future work involves leveraging this machine-driven design exploration strategy to produce high-performing, highly efficient deep neural network architectures for other critical manufacturing tasks such as component localization and defect segmentation.