From training to inference: creating neural networks for image recognition

Traditional image processing software relies on task-specific algorithms, while deep learning software uses a neural network that the user trains to distinguish good images or regions from bad ones.
Fortunately, the emergence of specialized algorithms and graphical user interface (GUI) tools for training neural networks lets manufacturers train networks more conveniently, more quickly, and more affordably. What can manufacturers expect from deep learning GUI tools, and what is it like to use them?
1. Training: creating a model for deep learning
Training is the process of "teaching" a deep neural network (DNN) to perform a required task (such as classifying images or converting speech to text) by feeding it data it can learn from. The DNN predicts what the data represents. The error in that prediction is then fed back to the network to update the connection strengths between its artificial neurons. The more data the DNN receives, the more it learns, until its predictions reach the expected level of accuracy.


For example, consider training a DNN to assign an image to one of three categories: human, car, or mechanical gear.
Normally, data scientists working with a DNN assemble a training dataset in advance. It consists of thousands of images, each labeled "human", "car", or "gear". This can be an off-the-shelf dataset such as Google's Open Images, which includes 9 million images, nearly 60 million image-level labels, and more.
Annotation types in Google Open Images: image-level labels, bounding boxes, instance segmentation, and visual relationships.
If the data scientist's application is too specialized for existing datasets, it may be necessary to build a custom training dataset by collecting and labeling the images that best represent what the DNN needs to learn.
During training, each image is passed to the DNN, which then predicts (or infers) what the image represents. Each error is fed back to the network to improve the accuracy of the next prediction.
Here the neural network predicts that an image of a "car" is a "gear". The error is then sent back through the DNN, and the connections in the network are updated to correct it. The next time the same image is presented, the DNN is likely to make a more accurate prediction.
This training process continues: an image is fed to the DNN, the weights are updated, the error is corrected, and the cycle repeats dozens or thousands of times until the DNN predicts with the required accuracy. At that point the DNN is considered "trained", and the resulting model can be used to classify new images.
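The predict, measure error, update weights cycle described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: it uses a single-layer softmax classifier on synthetic 2-D points instead of a real DNN on images, and the function names, learning rate, and data are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(x, y, n_classes, epochs=200, lr=0.1):
    """Gradient-descent training of a single-layer softmax classifier."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.01, size=(x.shape[1], n_classes))
    b = np.zeros(n_classes)
    one_hot = np.eye(n_classes)[y]
    for _ in range(epochs):
        probs = softmax(x @ w + b)       # forward pass: predict
        error = probs - one_hot          # prediction error per class
        w -= lr * x.T @ error / len(x)   # feed the error back into the weights
        b -= lr * error.mean(axis=0)
    return w, b

def predict(x, w, b):
    return np.argmax(x @ w + b, axis=1)

# Synthetic stand-in for image features: three well-separated clusters,
# playing the role of the "human", "car" and "gear" classes.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 4.0], [4.0, 0.0], [-4.0, -4.0]])
labels = np.repeat(np.arange(3), 50)
features = centers[labels] + rng.normal(scale=0.5, size=(150, 2))

w, b = train(features, labels, n_classes=3)
accuracy = (predict(features, w, b) == labels).mean()
print(f"training accuracy: {accuracy:.2f}")
```

A real image classifier has many layers and propagates the error backward through all of them, but the loop has the same shape: predict, compare with the label, adjust the weights, repeat.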
2. Adjusting the neural network
The number of inputs, hidden layers, and outputs of a neural network depends heavily on the problem you want to solve and on the specific design of the network. During training, the data scientist tries to guide the DNN model toward the expected accuracy. This usually means running many, sometimes hundreds of, experiments with different DNN designs that vary in the number of neurons and layers.
Between the input and the output lie the neurons and their connections: the hidden layers. For many deep learning projects, 1 to 5 layers of neurons are sufficient because only a few features are evaluated for the prediction. More complex tasks, with more variables and considerations, require more layers. Processing image or speech data may require networks with tens to hundreds of layers (each performing a specific function) and millions or billions of weights connecting the neurons.
A simplified example of a multi-layer DNN, showing the type of task each layer might perform.
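To get a feel for how quickly the number of weights grows with layer count and input size, the following sketch counts the parameters of a fully connected network. The layer sizes are hypothetical, and real image networks usually rely on convolutional layers, which share weights and are far more economical than the dense layers assumed here.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units per layer, input first, output last.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total

# Hypothetical sizes: a small tabular model versus a dense model
# fed directly with 224x224 RGB pixels.
small = count_parameters([10, 32, 32, 3])
image = count_parameters([224 * 224 * 3, 1024, 1024, 3])
print(small, image)
```

The small model has about 1,500 parameters, while the dense image model already exceeds 150 million, almost all of them in the first layer.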
3. Start with sample collection
Generally, hundreds or even thousands of manually classified images are needed to train the system and create a highly predictive object classification model. Collecting and annotating such complex datasets has proved to be a major obstacle in development, one that hinders the adoption of deep learning in mainstream vision systems.
Deep learning is well suited to environments where variations in lighting, noise, shape, color, and texture are common. A practical example of its advantages is detecting scratches on textured surfaces such as brushed metal. Some scratches are faint, with a contrast close to that of the textured background itself. Traditional techniques therefore cannot reliably locate these defects, especially when shape, brightness, and contrast vary from sample to sample. Figure 1 illustrates scratch detection on a metal sheet. The defect is clearly revealed by the heat map, in which the pixels at the defect location are highlighted.
Surface inspection: the brushed metal plate with scratches is shown on the left, and the heat map output by the classification algorithm, showing the defects, is on the right. The classification algorithm is generated automatically when the neural network is trained with the input samples. Note that we have added a yellow circle to show the correspondence between the original image and the heat map.
A deep neural network trained from scratch usually requires hundreds or even thousands of image samples. Today's deep learning software, however, is usually pre-trained, so users may only need to collect a few dozen additional samples to adapt the system to a specific application.
Inspection applications built with conventional classification require both "good" and "bad" images for training. With newer classification algorithms such as anomaly detection, by contrast, users can train on good samples only and need just a small number of bad samples for final testing.
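The idea of training on good samples only and producing a heat map of deviations can be illustrated with a deliberately simple statistical stand-in. The sketch below is not the algorithm used by any particular product: it assumes aligned grayscale images, models each pixel by its mean and spread over the good samples, and uses a hypothetical threshold.

```python
import numpy as np

def fit_normal_model(good_images):
    """Learn per-pixel mean and spread from defect-free images only."""
    stack = np.stack(good_images).astype(np.float64)
    return stack.mean(axis=0), stack.std(axis=0) + 1e-6  # avoid division by zero

def anomaly_heat_map(image, mean, std):
    """Per-pixel deviation from the 'good' model, in standard deviations."""
    return np.abs(image.astype(np.float64) - mean) / std

rng = np.random.default_rng(0)
# Synthetic stand-in for a textured surface: grey level 100 plus noise.
good = [100 + rng.normal(scale=5, size=(32, 32)) for _ in range(20)]
mean, std = fit_normal_model(good)

# A test image with a bright synthetic "scratch" along row 10.
test = 100 + rng.normal(scale=5, size=(32, 32))
test[10, 5:25] += 100

heat = anomaly_heat_map(test, mean, std)
threshold = 10.0  # hypothetical cut-off; in practice tuned on a few bad samples
defect_rows = np.unique(np.argwhere(heat > threshold)[:, 0])
print("rows flagged as defective:", defect_rows)
```

No bad image is needed to build the model; bad samples only serve to choose and validate the threshold, which mirrors the split between training and final testing described above.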
Although there is no shortcut for collecting image samples, the process keeps getting simpler. To capture images, technicians can use Sapera LT, a free image acquisition and control software development kit (SDK) for Teledyne DALSA's 2D/3D cameras and frame grabbers. Astrocyte, the GUI tool used for training neural networks, interfaces with Sapera LT to acquire images from the camera. For example, when collecting images of PCB components in manual mode, users move the PCB by hand and change the position, angle, and distance relative to the camera to generate a series of views of the components.
4. Training neural networks with visual tools
Once the images have been acquired, the user trains the neural network. In Astrocyte, clicking the "Train" button starts the training process with the default hyperparameters. Users can modify the hyperparameters so that the final model achieves higher accuracy.
To verify accuracy, users can test the model on a different set of images and use diagnostic tools such as the confusion matrix for classification models. The confusion matrix is an N x N table (where N is the number of categories) that shows the success rate for each category. In this example (see Figure 2), color coding represents the precision/recall of the model, and green indicates that precision/recall exceeds 90%.
In the confusion matrix, double-clicking a result field opens the related images in the Astrocyte Image tab for further investigation.
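For readers who want to see how such a table and its precision/recall figures are derived, here is a minimal pure-Python sketch. The labels and predictions are made-up test results, and the function names are illustrative only.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Build an N x N table: rows are true classes, columns are predictions."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix

def precision_recall(matrix, k):
    """Precision and recall for class index k."""
    true_pos = matrix[k][k]
    predicted_k = sum(row[k] for row in matrix)  # column sum
    actual_k = sum(matrix[k])                    # row sum
    precision = true_pos / predicted_k if predicted_k else 0.0
    recall = true_pos / actual_k if actual_k else 0.0
    return precision, recall

classes = ["human", "car", "gear"]
truth = ["human", "human", "car", "car", "car", "gear", "gear", "gear"]
preds = ["human", "human", "car", "gear", "car", "gear", "gear", "car"]

cm = confusion_matrix(truth, preds, classes)
for name, row in zip(classes, cm):
    print(f"{name:>5}: {row}")
for k, name in enumerate(classes):
    p, r = precision_recall(cm, k)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Off-diagonal cells are the interesting ones: here one car was predicted as a gear and one gear as a car, so both classes fall to a precision and recall of about 0.67 while "human" stays at 1.00.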
Another very important diagnostic tool is the heat map. In anomaly detection, for example, the heat map highlights the location of the defect. By examining the heat map, the user can judge whether an image was classified as good or bad for the right reasons. If a good image has been classified as bad, the heat map provides more detail about why. The neural network simply follows the input the user provides.
A good example is the use of heat maps in a screw inspection application:
Astrocyte shows the defect, circled at the top right, through the corresponding heat map. The top left shows a defect-free image.
The heat map can also reveal that the model is focusing on image details or features unrelated to the intended analysis of the target scene or object. Depending on the Astrocyte module, different heat map generation algorithms are available.
5. Running the GUI tool
The best way to explain how deep learning GUI tools work is to demonstrate one. Training an anomaly detection model covers the basics of neural network training, so a short step-by-step tutorial on anomaly detection with Astrocyte is provided here.
Step 1: Start the Astrocyte application and select the Anomaly Detection module in the startup interface.
Step 2: In the Dataset tab, right-click and select Add Dataset.
Step 3: Enter the dataset name and description, then right-click in the Databases panel and select Add Database.
Step 4: In the "Add" location dialog box, navigate to the folder containing the training image dataset. Select both the normal (good) and abnormal (bad) directories, and then click OK.
Step 5: For each directory, use the drop-down list to assign a class label: Normal or Abnormal. Then click Generate to add the dataset to the internal Astrocyte server.
After the generation process is complete, the Image Size Distribution Analyzer dialog box is displayed if the images in the dataset differ in size. If the images are all the same size, they are automatically resized to the specified maximum image size and the dialog box is not displayed. If necessary, use the Image Correction dialog box to correct the images.
Step 6: In the Image tab, use the Dataset drop-down list to select the desired dataset. Then verify the dataset images and labels and make any necessary changes. If you modify the dataset, click Save to update and save it on the Astrocyte server.
Step 7: In the Training tab, select the dataset and click Train. The training loss and metric charts are updated at the end of each batch, and the training statistics are displayed.
Step 8: When training is complete, a prompt to save the model is displayed. Click Yes, enter the model name and description, and then click OK. The model can now be used for testing. Astrocyte guides users through that process with the same intuitive GUI.
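Outside a GUI, the folder selection and label assignment in Steps 4 and 5 amount to mapping each directory to a class label. The sketch below shows that mapping with the Python standard library only; the folder names, file names, and the `build_dataset` helper are hypothetical and have nothing to do with how Astrocyte stores datasets internally.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}

def build_dataset(class_dirs):
    """Return (path, label) pairs from a {directory: label} mapping."""
    samples = []
    for directory, label in class_dirs.items():
        for path in sorted(Path(directory).iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples

# Demo with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as root:
    for folder, count in (("good", 3), ("bad", 1)):
        (Path(root) / folder).mkdir()
        for i in range(count):
            (Path(root) / folder / f"part_{i}.png").touch()
    dataset = build_dataset({
        Path(root) / "good": "Normal",
        Path(root) / "bad": "Abnormal",
    })
    label_counts = {}
    for _, label in dataset:
        label_counts[label] = label_counts.get(label, 0) + 1
    print(label_counts)
```

Counting the labels right after loading is a cheap sanity check: for anomaly detection one expects many Normal images and only a handful of Abnormal ones.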
6. Do we need to refine the trained model to optimize inference?
Once training has reached an acceptable level of accuracy, the result is a weighted neural network, essentially a huge database. It works well, but it may not be optimal in terms of speed and power consumption. Some applications cannot tolerate high latency, for example intelligent transportation systems and autonomous vehicles. Autonomous drones and other battery-powered systems may need to operate within a tight power budget to meet flight-time requirements.
The larger and more complex the DNN, the more computation, memory, and energy it consumes for training and operation. It may therefore not be suitable for a given application or device. In that case the DNN has to be simplified after training to reduce power consumption and latency, even if the simplification slightly reduces prediction accuracy.
In deep learning, this kind of optimization is a relatively new field. Chip and AI accelerator vendors usually provide SDKs to help users with such tasks, using software tailored to their specific architectures. The range of chips involved is very wide and includes GPUs, CPUs, FPGAs, and neural processors, each with its own advantages. For example, NVIDIA's TensorRT builds on the company's expertise in GPU cores. Xilinx's Vitis AI, in contrast, supports the company's SoCs, such as Versal, which combine CPUs, FPGAs, and neural processors.
Vendors usually offer two approaches: pruning and quantization. Pruning removes the parts of the neural network that contribute little to the final result. It reduces the size and complexity of the network without significantly affecting output accuracy. Quantization reduces the number of bits used for each weight (for example, replacing FP32 with FP16, or quantizing to INT8/4/2). Performing simpler calculations can increase speed and/or reduce the hardware resources required.
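Both techniques can be illustrated on a single weight matrix. The sketch below assumes the simplest variants, global magnitude pruning and symmetric per-tensor INT8 quantization, applied to random weights; vendor toolchains use more elaborate schemes (calibration data, fine-tuning after pruning), and the function names here are illustrative.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def quantize_int8(weights):
    """Symmetric per-tensor quantization of FP32 weights to INT8."""
    max_abs = float(np.abs(weights).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map INT8 values back to approximate FP32 weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

pruned = prune_by_magnitude(w, sparsity=0.5)
q, scale = quantize_int8(w)
max_error = np.abs(dequantize(q, scale) - w).max()

print("fraction of zero weights after pruning:", np.mean(pruned == 0))
print("bytes FP32 vs INT8:", w.nbytes, q.nbytes)
print("largest round-trip error:", max_error)
```

The INT8 copy needs a quarter of the memory of the FP32 original, and the round-trip error per weight is bounded by about half the quantization step. Whether that loss of precision is acceptable has to be checked against the accuracy of the whole model, not a single layer.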
7. Into production: moving to inference
Once the DNN model has been trained and optimized, it can be put to work making predictions on previously unseen data. As in training, images are fed to the DNN as input and the DNN attempts to classify them.
Teledyne DALSA offers two software packages, Sapera Processing and Sherlock, which include a suite of image processing tools and an inference engine for running AI models.
Teledyne DALSA AI training and inference software packages
Users can run inference on PCs or embedded devices using a GPU or CPU. Depending on the size, weight, and power (SWaP) requirements of the application, users can choose among various technologies for deep learning inference on embedded devices, such as GPUs, FPGAs, and dedicated neural processors.
8. Deep learning: increasingly simplified?
In essence, a neural network is a complex and powerful tool. There are almost unlimited opportunities to tune and optimize each network to solve your specific problem in the best way. The sheer breadth of optimization options and the rapid pace of new research and tools can be overwhelming, even for experienced practitioners.
But this does not mean you cannot bring the advantages of these tools into your next vision system. The shift to GUI tools is democratizing deep learning in vision systems. With software that frees users from the steep demands of learning AI and programming, manufacturers can use deep learning to analyze images better than any traditional algorithm can. In the near future, GUI tools of this kind may even outperform traditional inspectors.
