Design & Reuse

NanoEdge AI Studio v5, 1st AutoML tool with Synthetic Data Generation

NanoEdge AI Studio v5 is the first AutoML tool for STM32 microcontrollers capable of generating anomaly data from typical logs, thanks to a new feature we call Synthetic Data Generation. Additionally, the latest version makes it easier to create datasets, configure AI libraries, visualize information, run data logging sessions, and more. In a nutshell, as data remains at the center of the AI revolution, NanoEdge AI Studio v5 helps not only to process quality data but also to generate it. If machine learning algorithms are only as good as the data that shapes the neural network, then NanoEdge AI Studio v5 has just made creating AI at the edge significantly easier and more accurate by solving one of the biggest development challenges.

blog.st.com, Sept. 22, 2025

Would the real AI challenges please stand up?

The unfair burden

The question is no longer which categories of embedded systems use AI, but rather which ones don’t. Avid readers of the ST Blog would be hard-pressed to name one market that isn’t at least minimally influenced by smart sensors, predictive maintenance, data analysis, automated decision-making, and more. The problem is that engineers must now be experts in so many new fields. In a typical application, teams must understand data science, neural network architectures, inference optimization, and other related concepts. Before they can even create an application, they must gather data, select the appropriate models, work with new tools, and even learn programming languages unrelated to their primary activities. In other words, engineers must venture out of their comfort zone.

The exorbitant costs

The crisis is real. According to a 2020 report by Anaconda, dedicated data scientists spend more than 65% of their time loading, cleaning, and visualizing data. Even if machine learning algorithms at the edge are less complex, they are still costly to create from scratch. Smaller companies may, therefore, feel pressure to adopt AI at the edge but lack the budget to hire talent and invest in the necessary resources, and the extra workload can rapidly overwhelm existing teams. For instance, a speech recognition project shared by the Barcelona Supercomputing Center required logging 2,250 hours of speech “from approximately 20 thousand distinct speakers.” That scale is out of reach for nearly all small to medium-sized enterprises.

The data impasse

Another issue is the gathering of quality data. In industrial applications, it is sometimes impossible to replicate certain conditions. For instance, no company knows how a motor will break, but they must find a way to replicate that situation, or no one will have data to train AI models. This can lead engineers down an impossible path where they have to find a way to detect a failure without knowing what it will look like. And trying to imagine an adverse event, or simulating one, can be time- and resource-consuming, with very little to show for it if the real-world event is significantly different from the training data.

Stayin’ Alive: 3 practical solutions from NanoEdge AI Studio v5 to fight the office night fever

 

Synthetic Data Generation

It’s precisely those issues that drove ST to work on an application like NanoEdge AI Studio v5. For instance, Synthetic Data Generation solves the challenge of replicating anomalies by analyzing nominal datasets and then simulating an issue by adding noise or a vibration drift. After working on anomaly detection for more than five years, since the first release of NanoEdge AI Studio, ST has acquired expertise that enables us to anticipate abnormal behaviors and create synthetic data based on real-world use cases we’ve encountered. This explains why we are the only ones to offer this feature for time series applications. The tool currently focuses on vibration data, but future releases will target other applications.
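
To make the idea of deriving anomalies from nominal data more concrete, here is a minimal Python sketch. It is not how NanoEdge AI Studio implements Synthetic Data Generation, which the article does not detail. The function names, the sine-wave stand-in for a logged signal, and the choice of Gaussian noise and a linear baseline drift are all assumptions made purely for illustration.

```python
import math
import random


def nominal_signal(n_samples, freq_hz=50.0, sample_rate_hz=1000.0):
    """Hypothetical stand-in for a logged nominal vibration signal (a clean sine)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(n_samples)]


def add_noise(signal, noise_std, rng):
    """Simulate an anomaly by adding zero-mean Gaussian noise to every sample."""
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def add_drift(signal, total_drift):
    """Simulate an anomaly with an offset that grows linearly up to total_drift."""
    last = max(len(signal) - 1, 1)  # avoid dividing by zero on 1-sample signals
    return [x + total_drift * i / last for i, x in enumerate(signal)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
nominal = nominal_signal(256)
noisy = add_noise(nominal, noise_std=0.3, rng=rng)   # illustrative noise level
drifting = add_drift(nominal, total_drift=0.5)       # illustrative drift amount
```

In a real project, the nominal signal would come from logged sensor data rather than a generated sine wave, and the perturbations would need to reflect failure modes actually observed in the field.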
