Duet3D FDM Anomaly Detection Research
This project was carried out by Yasas Wijetilake as a part of his master’s programme in AI & Robotics. Conceptually it started as a continuation of some of Duet3D's early work and has also been inspired by other research such as 1, 2, and 3.
Duet3D’s original work uses a 2D, top down approach, comparing images with G-code renders layer by layer. However, this approach is relatively slow as the camera needs to capture multiple images directly from the top of each layer taking time to pause the print, move the head away, and to build up an array of images. It also relies on mounting a camera as a separate tool or otherwise to the printer gantry, rather than having a side view used commonly for 3d printer timelapses.
The research focus is based on rendering an image of an object from a G-code file and comparing it to the actual print in real time to detect print anomalies. The idea was to generalise anomaly detection where an anomaly is defined as any instance the actual print is different to the expected output in the G-code.
For example, the following shows a sample image and its rendered G-code from the same camera poses at the same layer height and bed position.
Intuitively the G-code provides information about the correct outcome. What we found surprising was that the current anomaly detection tools we looked at are not using this information. Most tools are equipped with machine learning algorithms that were trained only to detect a specific type of error or errors, summarised as unwanted extrusion or "spaghetti". These approaches are really good at being resource efficient and fast. However, they can be ineffective at times when the print has gone wrong with none of the anomalies for which the algorithm was trained are there. For instance when there are significant dimensional inaccuracies without any spaghetti, the algorithm may fail to notice that there is an anomaly.
We sought to discover if a more generalised approach for 3D print anomaly detection, using the G-code toolpaths, was practical to implement.
The tools/approaches that were used to verify this idea are as follows:
|A 3D printer with a network interface to read coordinates (XYZ)||Ender 3 printer equipped with RRF and a Duet3D board|
|A calibrated/pose estimated camera setup used for imaging of 3D prints||Raspberry Pi/MotionEyeOS connected Logitech C270 (640x480 streamed)|
|A tool that can recreate a digital twin of the image from the camera using the G-code file and printer XYZ parameters||Blender for rendering and visualising; Blender headless for rendering and ThreeJS for visualising|
|An algorithm that can compare two images of the same print and determine if there are any anomalies||Machine learning - Siamese Networks|
B. Conceptual Workflow
There are many methods to perform image comparisons, for instance, approaches such as measuring the Euclidean distance between raw pixel intensities can be effective in the case of a simple task.
For a complex task such as comparing images of an actual print and a digital render taken under varying lighting conditions and imperfect pose-estimations, an approach involving machine learning is far more suitable. Therefore an approach based on Siamese neural networks, an architecture that can be used for comparing two images to retrieve a similarity measure between them, was chosen.
Siamese networks use multiple (2 or 3 depending on the case) parallel sub-networks consisting of feature extraction layers (usually a CNN-convolutional neural network model). The outputs from each branch are used for determining the image similarity between the input images using a loss function such as “contrastive loss”.
The following figure is of a Siamese network with 2 subnetworks.
Siamese Network with 2 subnetworks
As shown in the above example, actual and rendered images (preprocessed) can be the inputs of this Siamese network.
As the method was to train a machine learning model following the Siamese architecture, a dataset with matching image pairs was needed for the neural network to learn weights to find a correlation between an anchor image and a candidate image. So far 3 datasets have been constructed using images of actual 3D prints and their corresponding G-code. In this post, we will be discussing the first two datasets.
For the creation of the dataset, images were sourced both by printing and from images of prints available in the community. Image renders were generated using the model G-codes, constructing an image pair with the same camera pose.
When creating the dataset, multiple images taken from different angles of each model were used. An intuitive approach of creating a dataset would be to capture multiple images of the print while in progress, in a similar manner to how the model would be used when deployed. However this approach was not taken as there is a risk of imparting unwanted bias as early parts of the print would be added many times, whereas later parts would be added less frequently.
Image augmentation processes were performed by splitting and rotating the images to increase generalisability and the ability to discover anomalies in a focused area. The source images (both actual and render) were divided into 2x2, 4x4, 8x8 segments so the network can capture and train to identify anomalies at different resolutions. (For example, during an actual print as the bed moves close to and away from the camera, the size of print in the image will not be consistent). This provides the ability to make comparisons of much smaller areas of the print instead of comparing the whole print at once.
Once the images were augmented, they were of different resolutions but in general above 100x100. The image size for training was chosen as 100x100 pixels for each dataset and the augmented images were scaled down to this resolution.
A higher image resolution such as 100x100 allows for a substantially higher level of detail in data relative to a 24x24 or 48x48 image. For instance, image sizes such as 24x24 can lead to significant information loss during the downscaling process. However, 100x100 retains more of the original image's context and content allowing the capturing of fine details which can be valuable when identifying subtle differences. Moreover, this can lead to more accurate similarity assessments, reducing the likelihood of false positives or negatives.
|Dataset||Image pair count||G-code render method||Model reference list|
|v1||1455||Blender||Ananas Vase; London Telephone Box; Radcliffe; Torre de Belem; Chromatic Vase; Lattice Bowl|
|v2||3168||Blender/ThreeJS||Crag Vase; Lattice Bowl; Minion Pot; Owl Pot; Darth Vader Holder; Darth Vader Bust; Moa Vase; Zephyr Vase; Cat; Tiki Mug|
Dataset v1 (1455 image pairs) This version of the dataset was built using Blender and manually aligning with the actual image of workpieces. This approach was used to quickly meet the academic timelines, disregarding the lack of scalability of the approach.
Dataset v2 (3168 image pairs) When building this dataset the previous freehand approach was replaced with the workflow that follows later in this blog. This allowed much easier parametric manipulation of the 3D model’s poses to align with the actual image. Despite the ease, there are still some manual steps involved as custom images were used instead of creating the dataset from calibrated-printer camera images
When using Siamese networks it is possible to leverage well-known architectures such as Resnet, VGG and use their pretrained weights. This approach often is preferred as with pretrained models it is only needed to customise the model for the task. In our case we compared a ground up design with an architecture based on Resnet. Keras with TensorFlow backend was used as the library of choice due to familiarity with these tools.
|Ground up Design||Pretrained Resnet|
|Used pretrained weights||No||Yes|
|Number of channels the model requires||1||3|
|Number of actual channels in data||1||1 *|
*Since we've used contours, the images have only black and white channels. So those channels were duplicated into two more channels with the same information.
The code for these models will be provided in a future blog post.
The models described in the previous section needed to be integrated into a workflow for detecting anomalies in real-time from a video/series of still images.
This workflow was constructed as a web application where a browser-run client application sends images of the workpiece and the digital twin to the backend which then performs anomaly detection tasks and returns a labelled image. The backend was also used for returning a rendered GBL file of the G-code upon request.
The browser-run client application was developed with a printer/camera calibration and anomaly detection features. These two are detailed below.
One of the prerequisites of this project was to have a pose estimated camera. The initial attempt was to lay a calibration pattern such as a checkerboard (OpenCV-Github) on the print bed and calculate the pose and position of the camera with respect to a fixed point on the printer. However, this task required too many manual measurement steps so a simpler and a more reliable method was developed using a three-part calibration sequence.
Calibration Step 1 - Pose Estimation
Calibration Step 2 - Base Height Estimation
First the area of interest in the image was divided into much smaller segments for increased resolution. Anomaly detection was then performed on each segment, where the similarity score was assigned. If the score was greater than 0.5, it was considered as similar, while one below 0.5 was considered to have an anomaly. (The value 0.5 was chosen as it was when the actual results aligned mostly with the expected outcome, anomaly/not an anomaly, of the model as opposed to 0.3 and 0.7 which yielded more false positives and missed true positives respectively.)
The following are a series of such segments from the actual and rendered image pairs.
Overall Anomaly detection workflow
The previously discussed models were trained on different datasets and evaluated using augmented test data and against an actual print timelapse.
The test dataset was augmented using the images from the same dataset and rotating them to ensure that the data was not seen by the model previously while maintaining the overall tone of the data.
For the real-world test scenario, a pre-recorded timelapse with image pairs was used.
Up to the point of creation of Dataset v2, the workflow heavily depended on Blender and a backend server to generate GLB files and visualising using ThreeJS. However, from the next iteration this is to be streamlined by using the visualiser developed by Juan Rosario and already integrated into Duet3D's Machine control UI: DuetWebControl.
As the current workflow is browser-dependent, in the event the browser crashes or closes, the user has to reload the calibration configs and resume the process manually.
The existing workflow is written as a web application, which means that a web browser must continuously run to perform anomaly detection. As the architecture requires a backend anyway to provide the analysis, this could be leveraged to provide headless functionality, as long as a suitable method to send the images to the backend and send control commands back to the printer was developed.
When looking for anomalies, the AI model tended to find shadows or highlighted areas in the actual image with varying colour intensities as dissimilar to the digital render. This is a significant issue as it contributes to quite a bit of false positives.
This work is continuing and in a future blog post or posts we will describe the results with Dataset 3 and share the code for the ML models.