What Is tinyML?

Tiny machine learning (tinyML) is a subset of machine learning focused on deploying models to microcontrollers and other low-power edge devices. It brings AI to the edge of a networked system, enabling real-time, low-latency, and energy-efficient inference directly on the device without relying on cloud connectivity. Unlike broader Edge AI, which can encompass powerful edge servers and IoT devices, tinyML targets devices at the smallest end of the spectrum, often running within milliwatt power budgets. Engineers in this area are primarily concerned with optimizing algorithms and models so that they maintain performance while minimizing power consumption and memory footprint, which makes intelligent features possible in the smallest devices and sensors.

The essential stages of the tinyML workflow are:

Model development and training: Train the chosen model on preprocessed data, using techniques such as transfer learning or data augmentation to reach the desired accuracy within the limits of the target device.
Model optimization and evaluation: Make the trained model more resource-efficient with techniques such as quantization, pruning, projection, and data type conversion, which reduce memory and computational requirements without a significant loss of accuracy.
Deployment: Deploy the optimized model to the target device and confirm that it can perform real-time inference with low latency.
Testing and validation: Test and validate the deployed model on the target device with representative data to verify its performance in real-world scenarios and to identify potential issues or limitations.
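
To make the optimization stage above more concrete, the following is a minimal sketch of symmetric per-tensor int8 quantization, one common form of the quantization mentioned in the list. It is written in plain Python for illustration and is not taken from any MathWorks tool; the function names and the weight values are hypothetical.

```python
# Illustrative sketch of symmetric int8 quantization (hypothetical values).

def quantize_int8(weights):
    """Map floats to int8 codes using one shared scale for the whole tensor."""
    max_abs = max(abs(w) for w in weights)
    # Avoid division by zero when every weight is zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float values from the int8 codes."""
    return [c * scale for c in codes]

weights = [0.82, -0.31, 0.05, -1.27, 0.64]  # hypothetical trained weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 codes:", codes)
print(f"scale={scale:.5f}, max round-trip error={max_error:.5f}")
```

Each weight is stored in 1 byte instead of the 4 bytes of a 32-bit float, and for values inside the clamped range the round-trip error is bounded by half the scale. Whether that error is acceptable has to be judged by evaluating the quantized model on representative data.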

[Figure: A mobile robotic manipulator platform that employs real-time, AI-enabled decision-making on edge devices, potentially enabled by the tinyML workflow.]
MATLAB and Simulink support the entire tinyML workflow, enabling design, testing, and deployment of AI-based systems at the edge.

[Figure: A deep learning Simulink block with generated code connected by an imaginary wire to a microcontroller, representing the process of deploying tinyML applications.]
Automatic code generation from MATLAB and Simulink enables rapid prototyping and deployment of tinyML applications on embedded devices, bridging the gap between theory and practice.

tinyML with MATLAB and Simulink

MATLAB® provides a high-level programming environment for prototyping and experimenting with machine learning algorithms. Simulink® offers a block diagram environment for designing and simulating models of systems, facilitating iteration and validation before moving to hardware. The details below describe some capabilities of MATLAB and Simulink that enable the tinyML workflow.

Model Development and Training
To develop and train tinyML networks, you can use MATLAB and Simulink, which provide machine learning and deep learning capabilities through apps, a high-level language, and a block diagram modeling environment. With Deep Learning Toolbox™, you can import networks from TensorFlow™, PyTorch®, and ONNX to speed up network development and training.
Model Optimization
To optimize your machine learning models for resource-constrained edge devices, you can use Deep Learning Toolbox. MATLAB and Simulink include tools for model quantization, projection, pruning, and data type conversion that reduce the memory footprint and computational requirements of your models while maintaining acceptable accuracy, which enables efficient execution on low-power devices.
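
As a rough illustration of what pruning does, the sketch below zeroes the smallest-magnitude weights of a layer. Magnitude pruning is used here only because it is simple to show; it is an assumption for illustration and not necessarily the algorithm that Deep Learning Toolbox applies. The function name, weights, and sparsity level are hypothetical.

```python
# Illustrative sketch of unstructured magnitude pruning (hypothetical values).

def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

weights = [0.82, -0.31, 0.05, -1.27, 0.64, 0.02, -0.09, 0.4]  # hypothetical
pruned = prune_by_magnitude(weights, sparsity=0.5)
nonzero = sum(1 for w in pruned if w != 0.0)

print(pruned)
print(f"{nonzero} of {len(weights)} weights remain")
```

Zeroed weights reduce memory or computation only if the storage format or runtime exploits the sparsity. Structured pruning, which removes whole filters or channels, avoids that dependency because the pruned layers are simply smaller.
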
Code Generation and Deployment
You can generate optimized C/C++ code from your trained models using Embedded Coder®. The generated code can include processor-specific optimizations and device drivers, and it can be deployed directly to microcontrollers or embedded systems. MathWorks works with partner semiconductor companies to support a wide range of popular microcontroller platforms, which makes it easier to target your specific hardware.
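
Before deploying to a microcontroller, it can help to estimate whether the model parameters fit in the available memory at all. The sketch below is a back-of-the-envelope calculation in plain Python, not part of any code generation tool. The layer sizes and the 8 KiB budget are hypothetical, and it treats every parameter as the same data type, which is a simplification (quantized models often keep biases at higher precision).

```python
# Illustrative parameter-memory estimate for a small dense network.

BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "int8": 1}

def parameter_bytes(layer_shapes, dtype):
    """Bytes needed for weights and biases of dense layers.

    `layer_shapes` is a list of (inputs, outputs) pairs, one per layer.
    """
    per_element = BYTES_PER_ELEMENT[dtype]
    total = 0
    for n_in, n_out in layer_shapes:
        total += (n_in * n_out + n_out) * per_element  # weights + biases
    return total

layers = [(64, 32), (32, 16), (16, 4)]  # hypothetical network
budget_bytes = 8 * 1024                 # hypothetical 8 KiB parameter budget

for dtype in ("float32", "int8"):
    size = parameter_bytes(layers, dtype)
    verdict = "fits" if size <= budget_bytes else "too large"
    print(f"{dtype}: {size} bytes ({verdict})")
```

In this hypothetical case the 2676 parameters exceed the budget as 32-bit floats and fit comfortably as 8-bit integers. A real estimate would also account for activation buffers, code size, and any runtime overhead.
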
Real-Time Testing and Verification
Hardware-in-the-loop (HIL) simulation enables you to simulate and test your tinyML models in real time. You can validate the performance of your models in a virtual real-time environment that represents your physical system before deployment to hardware. Through targeted hardware support packages (HSPs), MATLAB and Simulink integrate simulation with deployment, which helps ensure reliable and accurate results.
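
One basic check that this kind of verification typically includes is comparing outputs logged from the target against reference outputs from simulation. The sketch below shows such a comparison in plain Python. It is not a HIL setup or a MathWorks API, and the function name, data, and tolerance are hypothetical.

```python
# Illustrative comparison of reference outputs with outputs logged from a target.

def compare_outputs(reference, measured, tolerance):
    """Return (passed, worst_error, worst_index) for two equal-length lists."""
    if len(reference) != len(measured):
        raise ValueError("output lists must have the same length")
    if not reference:
        raise ValueError("output lists must not be empty")
    errors = [abs(r - m) for r, m in zip(reference, measured)]
    # Locate the sample with the largest absolute deviation.
    worst_index = max(range(len(errors)), key=errors.__getitem__)
    worst_error = errors[worst_index]
    return worst_error <= tolerance, worst_error, worst_index

reference = [0.10, 0.75, 0.33, 0.98]  # hypothetical simulation outputs
measured = [0.11, 0.74, 0.35, 0.97]   # hypothetical outputs from the device

passed, worst_error, worst_index = compare_outputs(reference, measured, tolerance=0.05)
print(f"passed={passed}, worst error={worst_error:.3f} at sample {worst_index}")
```

The tolerance should come from the application requirements. A classifier might instead be checked by how often the predicted class matches the reference.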

Examples and How To

Software Reference

Deep Network Quantizer - Documentation
Embedded Coder Support Package for STMicroelectronics STM32 Processors - File Exchange
Embedded Coder Support Package for Infineon AURIX TC4x Microcontrollers - Hardware Support
Generate Generic C/C++ Code for Deep Learning Networks - Documentation