Closing the Edge AI Gap: From Model Training to Real-World Deployment

Jun 12, 2026 | Length: 01:07:41

This on-demand webinar walks through a complete edge AI workflow, from model training and optimization to containerized deployment and fleet management, using a real-world industrial inspection application built on the Digi ConnectCore 95 System-on-Module (SOM).

Industrial teams face significant challenges when moving AI models from development into real-world edge deployments. This webinar, presented by Digi International and RBZ Robot Design, uses a live optical inspection system from a bakery in Valencia, Spain to demonstrate how the full edge AI lifecycle can be managed end to end.

Topics include dataset preparation, model training and quantization, inference across multiple compute targets (CPU, internal NPU, and the external RBZ Ara240 module), and containerized deployment using Docker and LXC.

The webinar also explores cloud-based fleet management with over-the-air model updates, device monitoring and retraining loops, all running on the Digi ConnectCore 95 System-on-Module (SOM) and powered by Digi ConnectCore Cloud Services.

Q&A: Closing the Edge AI Gap: From Model Training to Real-World Deployment

The following Q&A took place at the close of the webinar, moderated by Keith Kreisher, Executive Director of the IoT M2M Council, with presenters Andreas Burghart, Senior Product Manager at Digi International, and Daniel Amor, Founder and Innovation Manager at RBZ Embedded Logics.

Daniel, you mentioned that a loss in accuracy can be recovered after quantization. How often does that happen — is it the exception, or is it just part of the process?

Daniel: It happens all the time. Once you move from a floating-point to a quantized model, there is always a loss of accuracy. Sometimes you're lucky, and it's around 1% or 2%, but that's really unusual. You always need to plan for it.
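
As a rough illustration of why some loss should always be expected, the sketch below applies symmetric per-tensor int8 quantization to a list of random weights and measures the round-trip error. The scheme, the function names, and the values are assumptions chosen for illustration; they are not the toolchain or the model used in the webinar.

```python
import random

def quantize_int8(weights):
    """Symmetric per-tensor quantization: one scale maps floats onto [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the integer codes back to approximate float values."""
    return [v * scale for v in q]

random.seed(0)
weights = [random.gauss(0.0, 0.5) for _ in range(1000)]  # hypothetical weights

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding to the nearest integer code loses at most half a step per weight.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(f"scale={scale:.6f}  max round-trip error={max_error:.6f}")
```

Each weight is off by at most half a quantization step. Those small per-weight errors accumulate through the layers of a real network, which is one reason accuracy has to be re-measured after quantization rather than assumed.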

What happens if a model update causes a regression in the field? Can you roll back a container, and if so, how quickly?

Andreas: Yes, you can roll back, and you use the same OTA mechanism as the forward update — you just push the prior container version back to the fleet. Because the containers are versioned and the previous image typically still sits in the repository, you can roll back quickly. There is no special undo feature. It's simply deploying that earlier version after the new one caused problems. You can also use the template feature I showed to define a specific container version for your fleet and trigger an update to the entire fleet at once through that template infrastructure. You declare the desired state, and the system enforces it. That way, you're not chasing individual devices — you're defining compliance, and the platform reconciles all deviating devices automatically.
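
The desired-state idea Andreas describes can be sketched in a few lines. This is a minimal model of the reconciliation step only, assuming each device reports the container version it is running. The Device class, plan_reconciliation, and the version strings are hypothetical names for illustration and are not part of the ConnectCore Cloud Services API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Device:
    device_id: str
    running_version: str

def plan_reconciliation(desired_version: str, fleet: list[Device]) -> list[str]:
    """Return the IDs of devices whose running version deviates from the desired one."""
    return [d.device_id for d in fleet if d.running_version != desired_version]

# Hypothetical fleet: two devices already took the 1.5.0 update.
fleet = [
    Device("line-1", "1.4.0"),
    Device("line-2", "1.5.0"),
    Device("line-3", "1.5.0"),
]

# A rollback is the same operation as an update: declare the earlier version
# as the desired state and push it to every device that deviates.
to_update = plan_reconciliation("1.4.0", fleet)
print(to_update)
```

In this model there is no separate undo path. Changing the declared version is the only input, and the set of devices to touch falls out of the comparison.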

Once a model is deployed at scale, how does ConnectCore help detect if a model is starting to drift or degrade? And what does that retraining trigger actually look like in practice?

Andreas: ConnectCore Cloud Services handles the device side of that loop. Your inference container reports metrics such as confidence scores, latency, and prediction distribution through the data streams in ConnectCore Cloud Services, and that data is collected across the fleet. The drift detection logic itself sits in your monitoring stack above that layer. When degradation crosses a threshold, the monitoring stack can trigger your retraining pipeline externally. You then produce a new container image, and that image gets pushed back over the air via ConnectCore Cloud Services. So ConnectCore Cloud Services is more of the execution layer for that response. The drift detection intelligence sits above it, and that's more in Daniel's territory.

Daniel: Once something is flagged on the platform through those metrics, you need to pull information from the device — video frames, image samples — and cross-check against your staging environment. Maybe there's a problem with illumination, or something changed in the product you're looking at. Then you need to collect more data and go back to the training loop. The training loop takes a lot of time, so it's not run continuously. You need to trigger it deliberately, especially if you're using a pay-per-use training system.
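
One way the threshold check in the monitoring stack could look is sketched below. It compares the class distribution of recent predictions against a baseline window using total variation distance. The metric, the 0.2 threshold, and the label names are assumptions for illustration; the webinar does not specify which drift measure was used.

```python
from collections import Counter

def class_distribution(labels):
    """Turn a list of predicted labels into a label -> share mapping."""
    if not labels:
        raise ValueError("need at least one prediction")
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def total_variation(p, q):
    """Half the L1 distance between two distributions (0 = identical, 1 = disjoint)."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(l, 0.0) - q.get(l, 0.0)) for l in labels)

def should_flag_drift(baseline_labels, recent_labels, threshold=0.2):
    """Flag when the recent prediction mix has moved too far from the baseline."""
    distance = total_variation(
        class_distribution(baseline_labels), class_distribution(recent_labels)
    )
    return distance > threshold, distance

# Hypothetical windows: the defect share jumps from 10% to 40%.
baseline = ["ok"] * 90 + ["defect"] * 10
recent = ["ok"] * 60 + ["defect"] * 40

flagged, distance = should_flag_drift(baseline, recent)
print(flagged, round(distance, 3))
```

A flag here only says the prediction mix has shifted. As Daniel notes, someone still has to pull frames from the device to decide whether the cause is lighting, a product change, or a real rise in defects before starting a retraining run.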

How did you handle class imbalance during training for the bakery? Did you use augmentation, synthetic data, or something else?

Daniel: We did a lot of data augmentation to improve localization of the different bread types. For the defect side, we had to go to the bakery and physically create defective breads with the people working there, so we could rebalance the classes. Once we had the dataset, we checked how the classes were distributed, and when we found imbalance, we went back to the bakery and worked with their team to intentionally produce the defects we needed. Some breads had to go to waste in the process, unfortunately, but it gave us control over the environment and let us generate the right defect samples.
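
The distribution check Daniel mentions can be as simple as counting labels and comparing each class against the largest one. The sketch below assumes a flat list of per-sample labels. The class names, the counts, and the 5:1 ratio limit are hypothetical and are not taken from the bakery dataset.

```python
from collections import Counter

def imbalance_report(labels, max_ratio=5.0):
    """Count samples per class and list classes that are too small
    relative to the largest class."""
    counts = Counter(labels)
    largest = max(counts.values())
    underrepresented = sorted(
        label for label, n in counts.items() if largest / n > max_ratio
    )
    return counts, underrepresented

# Hypothetical label list for an inspection dataset.
labels = (
    ["baguette_ok"] * 500
    + ["roll_ok"] * 450
    + ["roll_deformed"] * 120
    + ["baguette_burnt"] * 40
)

counts, needs_more = imbalance_report(labels)
print(dict(counts))
print("collect more samples for:", needs_more)
```

The output of a check like this is a collection list: the classes it names are the ones for which more samples, in this case more deliberately produced defects, are needed.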

You showed response times dropped. What did the optimization workflow actually look like, and where did you hit the most friction moving from a cloud-trained model to the neural processing unit?

Daniel: The first model in floating point was working well. Once we moved to quantization, we saw a significant accuracy loss and had to go back to training. That meant hours of work tuning what are called hyperparameters, the configuration settings that govern how the model trains, until we found a setup that performed well for the target accelerator. We had a model working cleanly in floating point, and then spent several weeks getting the quantized version to match that performance on the hardware.

Andreas, the ConnectCore 95 is a powerful platform — but what if an application doesn't need that level of performance? What are the options for lower-demand AI use cases?

Andreas: Digi has a scalable SOM portfolio with lower-performance, lower-cost options. For example, the ConnectCore MP25 and ConnectCore 93 both include NPUs and are well-suited for use cases that don't need the full power of the ConnectCore 95 — things like single-camera setups, lower resolutions, people counters, and similar applications. The good news is that all these SOMs share the same Digi Embedded Yocto software ecosystem and the same ConnectCore tools and services. So depending on your requirements, you can scale up or down within the same platform without starting over.

Are there real cases where you'd run hybrid pipelines across the NPU and Ara240 at the same time?

Daniel: This was a demo context, so switching on the fly like that wouldn't be typical in production. But the concept is very real. We've built applications where a mid-size model runs on the internal accelerator, and when something interesting is detected, it hands off to the external accelerator for deeper analysis. Then a final refinement step runs on the CPU, using a small, targeted model trained for a narrow set of cases. There are also setups where a small model on the CPU acts as a trigger — only activating the heavier models when something actually changes in front of the camera, so you're not doing the heavy lifting every frame.
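
The trigger-and-handoff pattern can be sketched as a small cascade. In this sketch a frame is a flat list of pixel intensities, and the detect and analyze callables are plain stand-ins for models that would run on the internal NPU and the external accelerator. All names, thresholds, and frame values are assumptions for illustration.

```python
def frame_changed(prev, curr, threshold=0.1):
    """Cheap CPU-side trigger: mean absolute pixel difference between frames."""
    diff = sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)
    return diff > threshold

def run_cascade(frames, detect, analyze):
    """Run the heavier stages only when the cheaper stage says it is worth it."""
    results = []
    prev = frames[0]
    for curr in frames[1:]:
        if not frame_changed(prev, curr):
            results.append("skipped")           # nothing moved, no model runs
        elif detect(curr):
            results.append(analyze(curr))       # hand off for deeper analysis
        else:
            results.append("nothing detected")  # mid-size model found nothing
        prev = curr
    return results

# Stand-ins for the two models (hypothetical logic, not real inference).
detect = lambda frame: max(frame) > 0.8
analyze = lambda frame: "defect" if sum(frame) / len(frame) > 0.5 else "ok"

frames = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],  # identical to the previous frame
    [0.5, 0.5, 0.5, 0.5],  # scene changed, but below the detect threshold
    [0.9, 0.9, 0.9, 0.9],  # scene changed and the detector fires
]

results = run_cascade(frames, detect, analyze)
print(results)
```

The ordering is the design choice here: the cheapest check runs on every frame, and each more expensive stage runs only on the frames that pass the stage before it.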

For a factory environment like the bakery — what are the connectivity options if there's no reliable Ethernet network?

Andreas: That's actually not uncommon. ConnectCore 95 includes Wi-Fi 6E on the module, so you're not dependent on wired Ethernet, which is often impractical in food production or kitchen environments. Wi-Fi 6E also offers higher throughput and lower latency than older standards, so it covers most factory environments that have a wireless infrastructure. If Wi-Fi isn't available and you need a more resilient fallback, cellular connectivity is an option through our XBee cellular product line, which provides LTE connectivity and comes with global data plans. The edge device never stops running inference when connectivity drops, because the model runs locally on the device at all times. The cloud connection is for management, telemetry collection, and OTA updates, not for the inference itself. So the system keeps doing its job regardless of connection state, and syncs when it can.
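
The "keeps running and syncs when it can" behavior is commonly built as a store-and-forward buffer on the device. The sketch below is one minimal way to do that, assuming a bounded in-memory queue and a send callable that returns True on success. The TelemetryBuffer class and its methods are hypothetical and do not represent how ConnectCore Cloud Services implements syncing.

```python
from collections import deque

class TelemetryBuffer:
    """Bounded local buffer: results queue up while offline and flush on reconnect."""

    def __init__(self, max_items=1000):
        # When the buffer is full, the oldest entries are dropped first.
        self._queue = deque(maxlen=max_items)

    def record(self, item):
        self._queue.append(item)

    def flush(self, send):
        """Send queued items in order; stop at the first failure and keep the rest."""
        sent = 0
        while self._queue:
            if not send(self._queue[0]):
                break
            self._queue.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._queue)

buffer = TelemetryBuffer(max_items=3)
for i in range(5):  # inference keeps producing results while offline
    buffer.record({"frame": i, "label": "ok"})

sent_offline = buffer.flush(lambda item: False)  # link is down, nothing leaves
pending = len(buffer)

uploaded = []
def send_online(item):
    uploaded.append(item)
    return True

sent_online = buffer.flush(send_online)  # link is back, backlog drains in order
print(sent_offline, pending, sent_online, [u["frame"] for u in uploaded])
```

Bounding the buffer trades completeness for predictable memory use: during a long outage the oldest telemetry is lost, but inference on the device is never blocked by the backlog.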

Is the ONNX Runtime part of Digi Embedded Yocto?

Andreas: Digi Embedded Yocto provides the underlying system that everything runs on. For the runtime specifically — Daniel, can you take that?

Daniel: The Yocto recipe layer includes components from NXP. ONNX and TFLite are both part of that offering, with the connectors for the accelerators included.

Andreas: And Yocto is flexible and modular, so you can enable that recipe and have that runtime baked directly into your OS image. You have full control over what's included.
