
Monday, November 15, 2021

USART with DMA on STM32

I've been working with many projects that use the USART, and no two were alike although the hardware resources were pretty similar.

So I sat down and decided to make a boilerplate USART-with-DMA implementation that uses binary semaphores to signal when data arrives and buffers the output, creating as little delay as possible and leaving as much CPU as possible for the rest of the system.

For this demo I'll be using the STM32F446 Nucleo-64.

By default, it has the USART2 pins connected to the on-board ST-Link, so it's possible to just open a terminal, watch logs and send commands to the MCU with minimal effort.

Once we have the basics set up in the IDE and USART2 enabled in Asynchronous mode, we'll go ahead and add DMA channels:

One for receive, one for transmit, both set to Normal mode.

Enable global interrupts:

We then go ahead and add FreeRTOS, so we can demo a general application:

And enable USE_NEWLIB_REENTRANT so we can use printf:

And lastly, in Project Manager, we'll mark 'Generate peripheral initialization as a pair of .c/.h files per peripheral', just to keep our application a bit cleaner:

A known bug (1,2,3,4) in HAL-generated projects is that the DMA is not initialized in the right order. A simple fix is to duplicate the DMA initialization call in the 'USER CODE BEGIN SysInit' section of main.c, so the change won't get lost whenever the project is regenerated.

  /* USER CODE BEGIN SysInit */
  MX_DMA_Init();
  /* USER CODE END SysInit */

Once our project is generated, we'll add a circular buffer of choice; in this case I've chosen Tilen Majerle's lwrb - Lightweight ring buffer manager.

Next, in our usart.c, we'll add two semaphores for the tx and rx paths, two aligned buffers for the DMA and two ring buffers for rx and tx. We'll put them in the 'USER CODE BEGIN 0' section so they're kept when the project is regenerated through STM32CubeMX/IDE.

Feel free to change the buffer sizes, though for my needs I didn't see a reason to go higher.


#include <cmsis_os.h>
#include "lwrb/lwrb.h"

#define TX_DMA_BUFFER_SIZE 64 /* adjust to taste */
#define RX_DMA_BUFFER_SIZE 64

static SemaphoreHandle_t readSemaphore;
static osSemaphoreId writeSemaphore;
osSemaphoreDef(WRITESEM);

__aligned(32) uint8_t TX_DMA_buffer[TX_DMA_BUFFER_SIZE];

__aligned(32) uint8_t RX_DMA_buffer[RX_DMA_BUFFER_SIZE];

lwrb_t rx_buffer;
uint8_t rx_buffer_container[255];

lwrb_t tx_buffer;
uint8_t tx_buffer_container[255];

void initialize_buffers(void) {
	/* Binary semaphore signals the reader when new data arrives */
	readSemaphore = xSemaphoreCreateBinary();
	/* Semaphore (max count 1) guards the tx ring buffer */
	writeSemaphore = osSemaphoreCreate(osSemaphore(WRITESEM), 1);

	if (readSemaphore == NULL) {
		Error_Handler();
	}
	if (lwrb_init(&rx_buffer, rx_buffer_container, sizeof(rx_buffer_container)) != 1) {
		Error_Handler();
	}
	if (lwrb_init(&tx_buffer, tx_buffer_container, sizeof(tx_buffer_container)) != 1) {
		Error_Handler();
	}
}


Note that we also declared our buffer initialization routine in the header.

Next we'll add the DMA start in our MX_USART2_UART_Init function in usart.c:

  /* USER CODE BEGIN USART2_Init 2 */
  HAL_UARTEx_ReceiveToIdle_DMA(&huart2, RX_DMA_buffer, RX_DMA_BUFFER_SIZE);
  __HAL_DMA_DISABLE_IT(&hdma_usart2_rx, DMA_IT_HT);
  /* USER CODE END USART2_Init 2 */

Thanks for the tip about DMA_IT_HT from ControllersTech.

Next we'll add our USART tx/rx functions in usart.c. If you're wondering about the xSemaphoreGiveFromISR call in the receive callback, it's used to notify the waiting thread about new data, rather than continuous polling that would either waste CPU time or introduce a delay between bytes being received and the thread noticing them.


static int tx_next_chunk(void) {
	/* Pull the next chunk out of the ring buffer into the DMA buffer */
	int number_of_items_in_tx_buffer = lwrb_read(&tx_buffer, TX_DMA_buffer, TX_DMA_BUFFER_SIZE);
	if (number_of_items_in_tx_buffer > 0) {
		if (HAL_UART_Transmit_DMA(&huart2, TX_DMA_buffer,
				number_of_items_in_tx_buffer) != HAL_OK) {
			Error_Handler();
		}
	}
	return number_of_items_in_tx_buffer;
}

void HAL_UARTEx_RxEventCallback(UART_HandleTypeDef *huart, uint16_t Size) {
	if (huart->Instance == USART2) {
		if (lwrb_write(&rx_buffer, RX_DMA_buffer, Size) != Size) {
			/* buffer overrun */
		}
		/* Re-arm reception and keep the half-transfer interrupt disabled */
		HAL_UARTEx_ReceiveToIdle_DMA(huart, RX_DMA_buffer, RX_DMA_BUFFER_SIZE);
		__HAL_DMA_DISABLE_IT(&hdma_usart2_rx, DMA_IT_HT);

		/* Wake up the thread waiting in get_rx_data() */
		BaseType_t xHigherPriorityTaskWoken = pdFALSE;
		xSemaphoreGiveFromISR(readSemaphore, &xHigherPriorityTaskWoken);
		portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
	}
}

int get_rx_data(uint8_t *buffer, size_t buffer_length, uint32_t timeout) {
	xSemaphoreTake(readSemaphore, pdMS_TO_TICKS(timeout));
	return lwrb_read(&rx_buffer, buffer, buffer_length);
}

void HAL_UART_TxCpltCallback(UART_HandleTypeDef *huart) {
	if (huart->Instance == USART2) {
		/* Previous chunk went out; push the next one, if any */
		tx_next_chunk();
	}
}

void put_tx_data_with_wait(uint8_t *buffer, size_t buffer_length) {
	int retries = 1000;
	while (retries > 0) {
		int pushed_bytes = put_tx_data(buffer, buffer_length);
		buffer_length -= pushed_bytes;
		buffer += pushed_bytes;
		if (buffer_length <= 0) {
			break;
		}
		retries--;
		osDelay(1);
	}
}

int put_tx_data(uint8_t *buffer, size_t buffer_length) {
	int ret = 0;
	/* osSemaphoreWait returns the number of available tokens, or -1 on error */
	if (osSemaphoreWait(writeSemaphore, osWaitForever) > 0) {
		ret = lwrb_write(&tx_buffer, buffer, buffer_length);
		osSemaphoreRelease(writeSemaphore);
	}
	/* Kick off transmission if the UART is idle */
	if (huart2.gState == HAL_UART_STATE_READY) {
		tx_next_chunk();
	}
	return ret;
}

And our function prototypes in usart.h:

/* USER CODE BEGIN Prototypes */
void put_tx_data_with_wait(uint8_t *buffer, size_t buffer_length);
int put_tx_data(uint8_t *buffer, size_t buffer_length);
int get_rx_data(uint8_t *buffer, size_t buffer_length, uint32_t timeout);
/* USER CODE END Prototypes */

And lastly we'll create our echo demo in StartDefaultTask in our freertos.c:

void StartDefaultTask(void const * argument)
{
  /* USER CODE BEGIN StartDefaultTask */
	while (1) {
		uint8_t temp_buffer[64];
		size_t read_bytes;
		read_bytes = get_rx_data(temp_buffer, sizeof(temp_buffer), 100);
		if (read_bytes > 0) {
			put_tx_data_with_wait(temp_buffer, read_bytes);
		}
	}
  /* USER CODE END StartDefaultTask */
}

What the demo does is essentially wait for up to 64 bytes or 100 ms and transmit back what it got. So this thread is waiting most of the time, the DMA does most of the work, and the ring buffers are just there to make sure everything plays together nicely.
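If you want to see why the chunked draining keeps up, here is a host-side sketch of the TX path; the ring buffer and chunk size are stand-ins for lwrb and TX_DMA_BUFFER_SIZE, and all the names here are hypothetical:

```c
/* Host-side sketch of the TX path: bytes are queued in a ring buffer
 * and drained in DMA-sized chunks. Stands in for lwrb plus the real
 * DMA transfer; not the firmware code itself. */
#include <assert.h>
#include <string.h>

#define RING_SIZE 256
#define TX_CHUNK  32 /* stand-in for TX_DMA_BUFFER_SIZE */

static unsigned char ring[RING_SIZE];
static size_t head, tail; /* head = write index, tail = read index */

static size_t ring_write(const unsigned char *data, size_t len) {
    size_t written = 0;
    while (written < len && (head + 1) % RING_SIZE != tail) {
        ring[head] = data[written++];
        head = (head + 1) % RING_SIZE;
    }
    return written;
}

static size_t ring_read(unsigned char *out, size_t max) {
    size_t n = 0;
    while (n < max && tail != head) {
        out[n++] = ring[tail];
        tail = (tail + 1) % RING_SIZE;
    }
    return n;
}

/* Mimics tx_next_chunk(): drain at most one DMA buffer's worth */
static size_t tx_next_chunk_sim(unsigned char *out) {
    return ring_read(out, TX_CHUNK);
}
```

A 100-byte message comes out as chunks of 32, 32, 32 and 4 bytes, exactly the way successive HAL_UART_Transmit_DMA calls would see it.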

The demo project can be found here:



Tuesday, November 5, 2019

Introduction to ESP32 Debugging

Note: This article is a prelude to a talk I'm having about ESP32 Unit Testing and Debugging on November 27th 2019.

If you've read any of my previous articles, you could probably guess I'm not a big fan of debugging. I truly believe that production should not be debugged (with exceptions), and therefore it's better to change one's thought process and build beneficial logging abilities.

But probably the most widely used debugging 'technology' is printf: by locating the crash or stack trace, the magical printf can tell us the state which led to the bug or crash, and we can fix it. If we're thorough, we'll probably add a unit test to avoid that bug in the future.

As much as I would like it to be, in embedded systems logging is not always realistic: it can affect timing, occupy UARTs and wear out eMMC.

So what are our other options?

Sometimes we only need an indication that something is happening, say an "if" branch we're not sure is ever taken; how about using a GPIO to turn an LED on?

Two more relatively fast options are I²C and SPI; we can use a very simple program that dumps the values being sent.

To further improve the logging abilities of these facilities, you can encode only the values, rather than a text log message.
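For example, here is a sketch of what such a compact binary frame could look like; the format, sync byte and names are made up for illustration:

```c
/* Sketch of a compact binary log frame instead of a text message:
 * [sync byte][event id][4-byte little-endian value]. Six bytes on the
 * wire instead of a 30+ character printf line. Format is hypothetical. */
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define LOG_SYNC 0xA5

/* Encodes one event into out[6]; returns the number of bytes written */
static size_t log_encode(uint8_t event_id, uint32_t value, uint8_t out[6]) {
    out[0] = LOG_SYNC;
    out[1] = event_id;
    out[2] = (uint8_t)(value & 0xFF);
    out[3] = (uint8_t)((value >> 8) & 0xFF);
    out[4] = (uint8_t)((value >> 16) & 0xFF);
    out[5] = (uint8_t)((value >> 24) & 0xFF);
    return 6;
}
```

The dumper on the other side only needs the frame layout to turn the capture back into readable events.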

Debugging Port

JTAG has been around for quite some time (1990!), and these days most MCUs have a debugging port, be it JTAG, SWD or debugWIRE (AVR).
JTAG in particular is very capable; it's designed to be chained across all the chips, processors and DSPs on a board, so a single port can be used to debug many components.


While printf debugging can provide a short-term or localized debugging option, as developers we need to consider longer-term and production problem solving, and these solutions keep state changes either in rolling log files or in some sort of circular buffer of logging messages.

While logging is a pretty straightforward thing to implement, the ESP32 logging facilities provide a few interesting points:
1. Logs are divided by TAGs.
2. Logging can be turned on/off and the logging level set per tag.
3. Internal esp-idf components also have a log tag.
4. printf is always sent to UART0.
5. Logs can be captured. This is one of the more interesting features, since it allows you, as a developer, to have a misbehaving device in the field, turn on logging remotely and ask for the log files.

While not directly related to logging, the ESP32 and FreeRTOS provide a few more interesting mechanisms for debugging problems:
1. Get reset reason. This is very important; think of it as extra information you can write to your logs when the device starts. Did it reboot because of a power failure? Brownout? Watchdog?
2. Core dump. When the device is in the field, who is going to monitor the stack trace? It cannot be written to the log file and nothing is usually monitoring the UARTs, so where does it go? You can configure a core dump to be placed on the flash, so next time you're asking for logs, you can also retrieve the core dump and analyze it.
3. FreeRTOS memory analysis: heap corruption, maximum stack use, maximum heap use and even memory tracing similar to crtdbg.
4. FreeRTOS CPU utilization.
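On the core dump front, esp-idf can be told to store crash dumps in a dedicated flash partition; roughly, the relevant sdkconfig option and partition table entry look like this (the option name is from esp-idf of that era, the partition size is just an example):

```
# sdkconfig
CONFIG_ESP32_ENABLE_COREDUMP_TO_FLASH=y

# partitions.csv - a dedicated coredump partition is required
# Name,     Type, SubType,  Offset, Size
coredump,   data, coredump, ,       64K
```

After a crash, the dump can be pulled off the flash and analyzed offline with the esp-idf core dump tooling.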

PlatformIO Unified Debugging

PlatformIO became my favorite development platform; its simplicity, near-zero configuration and easy extensibility let you do almost anything with very little effort.
In the ESP32's case, openocd-esp32 and esp-idf are integrated with its unified debugger, making it so simple I just had to add one line to platformio.ini, depending on which probe you use:
debug_tool = esp-prog
debug_tool = jlink

But that will only get you so far: if you start the debugger, the ESP32 debugging configuration is missing, so you'll need to add a debug env as well, with build flags that add debug symbols to the firmware:
build_flags = -ggdb 
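For reference, a minimal platformio.ini along these lines might look like this; the env name and board are just examples, adjust for your setup:

```
[env:esp32dev]
platform = espressif32
framework = espidf
board = esp32dev
debug_tool = esp-prog   ; or jlink
build_flags = -ggdb
```

With that in place, the PIO Debug launch configuration picks up both the probe and the debug symbols.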


JTAG is a standard debugging port, common on most modern systems. It can help you physically test a board using boundary scan, stop and start CPU cores, read and write variables and memory, add breakpoints, read and write registers, execute code and commands, and even write firmware.

Unfortunately the ESP32 does not provide boundary scan capabilities, but you can achieve that and more if you have your test fixture flash MicroPython, script logic-analyzer commands and analyze the results.


The rest of the features JTAG enables are great, and on top of that the Tensilica TRAX module enhances the debugging facilities by adding real-time log tracing and even FreeRTOS event tracing.

So what is TRAX?

TRAX, the TRace Analyzer for Xtensa, is a module that the CPU and JTAG share to transfer data between the host and the Tensilica processor.

With that in mind, we can use that data for almost anything. Espressif provided us with two interesting examples, trace logs and FreeRTOS events, but the sky is the limit.

Getting Started

This is the ESP32-DevKitC, one of the most popular ESP32 development kits. It's a low-footprint board with the essentials and comes with either a WROOM or WROVER module. Its drawback is the lack of a JTAG connector, but you can add one by wiring directly to the pins.

Get yours here

On the other hand, this adapter exposes the JTAG pins in both the 10-pin esp-prog format and the 20-pin standard JTAG / Segger J-Link format. It can stack between the DevKit and your breadboard or development PCB. It made me a lot less lazy about connecting the debugger; whether that's a positive or a negative thing, you decide.

At the end of this article you can find other options from Espressif.

As a side note, I've experienced different problems with different debuggers: the Segger J-Link would freeze every once in a while, needing a complete disconnect and power-down of both the debugger and the devkit, and the FT2232-based debuggers would succeed in uploading the sketch through the J-Link interface, but it was an inconsistent experience.

So how to debug?

1. Compile and upload the firmware using the -ggdb flag.
2. In VSCode, go to the Debug view, click PIO Debug (skip Pre-Debug), wait about 10-20 seconds and your first breakpoint will be caught.

We have a few interesting points here.
1. The top-left PIO Debug button starts the debugger; you should switch to the lower-right Debug Console tab to see progress and execute debugger commands.
2. A debugger-specific sidebar where Variables, Watch, Call Stack, Breakpoints etc. are visible.
3. The gutter in the editor can set a breakpoint or conditional breakpoint. Note that since conditional breakpoints are implemented in the debugger, execution is paused and the condition evaluated each time the breakpoint is hit, which affects timing and performance.
4. The top-right bar shows debugger controls: Continue, Step Over, Step In, etc. Note that they might not work if no hardware breakpoint is available.

The Debug Console view in VSCode exposes GDB. I'm saving it for my next article, which is going to be about ESP32 log tracing and event tracing abilities. Exciting!

Debugging Supported ESP32 Development Kits

1. ESP-WROVER-KIT - JTAG on board (using FT2232HL chip)
ESP-WROVER-KIT-VB is a highly integrated ultra-low-power development board which includes Flash and PSRAM with dual-core 240 MHz CPU.
Create Internet cameras, smart displays or Internet radios by connecting LCDs, microphones and codecs to it.


2. ESP32-LyraTD-MSC - JTAG connector
Designed for smart speakers and AI applications. Supports Acoustic Echo Cancellation (AEC), Automatic Speech Recognition (ASR), Wake-up Interrupt and Voice Interaction.

3. ESP32-Ethernet-Kit - JTAG on board (using FT2232HL chip)
Consists of two development boards, the Ethernet board A and the PoE board B


Wednesday, July 17, 2019

Is ESP32 Ready for some AI?

IoT is a passion of mine for quite some time, so imagine how happy I was to receive a gift from Semix, the all new ESP-EYE v2.1!

Semix specializes in representation and distribution of world leading manufacturers of Electronic Components, Modules and Integrated solutions in Israel, in this case Espressif and Manica.

While the board looks pretty well thought out (ferrite beads all over it!), it does lack GPIO connections, looking like it was made to demonstrate the ESP32's capabilities rather than be a maker's Swiss army knife. At the end of this article there are some other options if you're curious about combining these capabilities with your other crazy ideas. :-)

I started by looking up information, videos, reference designs and anything else I could find on that module, and eventually I cloned the esp-who project.

I've followed a few getting started examples but I really love what PlatformIO did with Visual Studio Code so I had to set it up to compile in PlatformIO. BTW, PlatformIO already has the esp-idf framework, which makes it very easy to use with ESP32!

You'd be surprised how much faster a good IDE can help you understand a project structure!

Eventually I got curious enough to see how they did it, so I began to dig a bit. The face recognition part seems to be based on MTCNN, which is actually 3 separate networks integrated with algorithmic glue.

The audio keyword recognition seems very similar to a TensorFlow demo I've seen, but during my research I came across a few other examples that gave me the impression Espressif did not use TensorFlow.

However, TensorFlow lite could be used for another project I was doing research for, so I've decided to take the plunge and see if I can compile it for ESP32.

Getting started is pretty straightforward: download, compile, run and of course learn. But to really get started you need to get your feet wet and test hardware compatibility, since TensorFlow Lite was not specifically ported to the ESP32. So what do you do? You run the test suite on the ESP32.

Let me assure you, all the tests passed; some did take time to complete, but that's because it wasn't optimized for Tensilica yet.

I've also run the micro speech demo, but since it wasn't optimized, it took 360 ms to process 100 ms of audio. I did find some optimizations for the inference engine and boom, this thing is fast! (80 ms for FFT + inference!)

If you'd like, you can find more pretrained models and examples on the TensorFlow website.

To me this little exploration opened up a whole new world of ideas for AI on the ESP32. If you had any doubts, you should definitely check out TensorFlow Lite on the ESP32!


Please note that there are other variations of the ESP-EYE (or ESP-CAM) with different capabilities:

1. ESP32-CAM
The ESP32-CAM also has 4MB of external PSRAM; it exposes some GPIOs for extensibility and even has an SD card slot, but no mic.
Notice there's no USB plug, so you'll need an external USB-TTL adapter to program this device.

2. M5Stack Official ESP32 Camera Development Board

It almost looks like a copy; I couldn't find any reference to external PSRAM, so if anyone knows, please leave a comment.

There are empty footprints for MPU6050, BME280, Mic and lithium battery connection, so it can be easily used for your wearable projects.

Notice it has a USB-C connector.

3. TTGO T-Camera Plus

That thing is sweet!
8MB of PSRAM(!!!)
1.3 inch LCD
SD card slot
Battery connection/charger
and a USB connection.

Another notable module is very similar to the T-Camera Plus.
We all know that video/imaging takes power, so how are we expected to write low-power applications when our MCU takes most of our power budget? Well, if your particular application doesn't require you to always scan the camera, you can put the MCU to sleep and wake it up only when there's movement, using a simple PIR sensor.


Friday, March 8, 2019

A Million Times

A while ago someone at work approached me with an idea to build a replica of "A Million Times" by Humans Since 1982. While the project did eventually die off as far as I know, the idea looked very interesting: many clocks, synchronized to display animation, text and time. What can be bad about it? Or as the original creators wrote:

"Metaphorically speaking, we liberated the clock from its sole function of measuring and reporting the time by taking the clock hands out of their 'administrative' roles and turning them into dancers." – Humans since 1982

A Million Times at Changi, 2014-2018
While they did not expose much of the design for their work, David Cox, the engineer of this project, shared a few hints on his Facebook:


From what I could deduce, the project probably uses 2 types of MCUs: one to control each of the motors (such as an ATtiny85) and another to control the whole block (an ATmega of sorts), connected via USB to a PC that controls the entire assembly.

This article has been collecting links and paragraphs for quite some time (since July 2017!), I've decided to finish it after I've started to learn more about PCB design and actually took it off the breadboard.

Assuming we would like to design our own, my first thought was that I'd need pipes, gears, Plexiglas and plenty of patience, just like Cornelius Franz tried:

He actually implemented what I had thought of making. I've had a single motor with the same driver; I had to add a step-up converter (since the driver needed a minimum of 8 V and I used an ATmega328 instead of the STM32F103), I had to use a higher voltage than actually needed since the motor would otherwise miss steps, and the maximum speed wasn't great since I used AccelStepper, which isn't very efficient. I've also thought about using the micro-switches for addressing, but I don't think it's a good use of the available pins, even with a resistor ladder.

I've thought about the following options for controllers:
- ATmega328 for board MCU, connected to a driver, hall effect sensor and canbus, which can be used without a transceiver on short distances.
- ATmega328 for board MCU, connected to a driver, hall effect sensor and i2c/spi bus
- ATmega32u2/4 for USB connectivity to i2c/spi

My colleague Allan Schwartz from whatimade.today suggested the i2c route. I had doubts it would work over long distances, but as it turns out i2c has been used over long distances with repeaters (such as the PCA9515). I still didn't get around to testing the long-distance repeater solution, and while the datasheet does specify you may not use more than one repeater in series, does that also include parallel repeaters? Using i2c as a bus for the entire assembly makes things simpler than communicating with 20 or more USB virtual COM ports.
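To give an idea of how little traffic this needs, a tiny command frame a module could accept over that bus might look like this; the format is entirely hypothetical:

```c
/* Hypothetical 4-byte command for a clock module on the shared i2c bus:
 * [hand index][position high][position low][checksum]. The module's i2c
 * address selects the board; the frame selects a hand and a target step
 * position (e.g. 0..4095 for 12-bit resolution). */
#include <assert.h>
#include <stdint.h>

static void encode_hand_command(uint8_t hand, uint16_t position, uint8_t out[4]) {
    out[0] = hand;
    out[1] = (uint8_t)(position >> 8);
    out[2] = (uint8_t)(position & 0xFF);
    out[3] = (uint8_t)(out[0] ^ out[1] ^ out[2]); /* simple XOR checksum */
}
```

Four bytes per hand update is well within reach of a 100 kHz i2c bus, even with dozens of modules hanging off it.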

I considered different layouts: either a PCB per motor, or a PCB for 4 motors, which makes things a lot simpler on one hand but won't work if I wanted to add more visual effects such as addressable LEDs.

For drivers, the following options:
- No driver: these motors do not consume much power (about 20 mA, to be verified), so in theory the Arduino can drive them directly. However, when I tried it the motor produced inconsistent movement at various speeds, I suspect because I didn't implement micro-stepping in my source code. I did find out that Wojtek Kosak did make it work without any driver, so it might have been my fault it did not work.
- 2 x ULN2003 ($0.2) or DRV8836 ($1.5) per motor (there are many alternatives, just an h-bridge)
- 2 x A3967SLB ($2) per motor
- DRV8821 ($4.5), minimum 8 V, which might complicate things
- L298N ($1) might work
- X12.017 stepper driver / VID6606BY8920 / AX1201728SG ($1), should be able to control 4 motors (or two dual-shaft motors) - source. I've tested the AX1201728SG and it was very stable even at high speeds, as long as acceleration control is implemented.

After testing a few drivers, the one that worked best is the AX1201728SG.

And for motors, it turns out there are dual-shaft stepper motors; the following look pretty promising:
- X40 8798 Stepper Motor ($6.8) - datasheet
- Sonceboz 6407 (27€)
- vid28-05 Stepper Motor
- BKA30D-R5 ($3.8) with a stop. But it turns out the manufacturer already realized people would like to use them without a stop, so they started manufacturing them without one!!

Last but not least is to use a real clock and modify the circuit; the movement is a Lavet-type stepping motor, and someone made a crazy clock with it, I love it!

All in all, I think Cornelius Franz did some amazing work; it seems like he's on a good path to a working replica!


I've also found out that Wojtek Kosak Główczewski actually completed a replica, you might want to look at his schematic and his project:


Eventually I wanted to build my own, so I went with parts I could source from AliExpress, since they were available and didn't cost $50 to ship, unlike packages from DigiKey or Mouser.

I got a recommendation to check out the VID28-05, but it was harder to find; it seems they are either no longer manufactured, or perhaps I didn't look very hard after finding the BKA30D-R5 replica. It's also a plus that the manufacturer is on AliExpress, and it's a drop-in replacement anyway.

The original BKA30D-R5 had a hard stop, which should be removed if you want to rotate it 360 degrees.

However, the manufacturer took it upon themselves to supply motors without stops!

The VID28 series comes with a thorough datasheet explaining how to drive the motors, the pinout, measurements and a lot more; if you're planning to use these motors, it's definitely worth a read!

I made a small breadboard setup with the motor, an Arduino and a DRV8825, and it kind of worked; I had to jack up the voltage to 12 V (outside the specs) so it wouldn't miss any steps. I even tried the A4988, but it produced a high-pitched noise.

So Allan attempted the same thing using a shift register (74HC595); from a video he sent me, I saw it was missing some steps and made a lot of noise, I suspect due to the lack of micro-stepping.

To zero the hands, I chose a hall effect sensor + 2 mm magnets. I attempted to use the SS49E, but it turned out to be not sensitive enough (1.8 mV/G), so I'm now attempting the SS495; while it's a bit more sensitive (3.3 mV/G), it's a lot more expensive, so perhaps a larger magnet or a sort of magnetic flux concentrator would be a better solution.
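Some back-of-the-envelope arithmetic shows why the sensitivity matters; the 2.5 V quiescent output is typical for these ratiometric sensors on a 5 V supply, and the 50 G field strength is just a guess for a small magnet a few millimeters away:

```c
/* Back-of-the-envelope output swing of a ratiometric hall sensor,
 * assuming a 2.5 V (2500 mV) quiescent output. The field strength at
 * the sensor is a guess; a 2 mm magnet a few mm away may only deliver
 * tens of gauss. */
#include <assert.h>

static double hall_output_mv(double sensitivity_mv_per_gauss, double field_gauss) {
    return 2500.0 + sensitivity_mv_per_gauss * field_gauss;
}
```

At 50 G, the SS49E only swings about 90 mV away from quiescent while the SS495 swings about 165 mV, which is why a bigger magnet or a flux concentrator makes detection so much easier.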

It did work properly on one side; I'm not sure if it's the N or S pole, so I'm thinking about building a 3D magnetic sensor (using the MLX90393) to diagnose the problem more precisely.

Alternative methods can be to use a reflective optical sensor or a reed switch, but the magnet needed for the reed switch is too big and heavy to fit on the hands.

I think the research we did on this project makes it relatively simple to implement hardware-wise; you can find software other developers wrote at the end of this article. In my opinion, probably the easiest build would be a combination of the original PCB form factor, an X12.017 driver, a hall sensor and an ATmega328P, wiring a few assemblies together with i2c and to a PC via USB.

I've started to design a more modular PCB; you can shape it into cubes, spheres, towers, whatever your imagination can conjure.

The first revision was a partial success, the motor fits perfectly, the lights work, the homing more or less works.

A few design issues were discovered during the first test: the brownout detector trips when either the LEDs turn on or the motor driver resets. Although I covered most of the power requirements by adjusting the trace widths, you can guess what the problem was: missing decoupling capacitors, for example, and a main bulk capacitor.

I've also used relatively heavy acrylic hands with magnets on them, so the motors missed steps here and there. I rewrote AccelStepper to use interrupts and an s-curve instead of linear acceleration, which was a lot of fun and should probably help the overall life of the motors, but it made no noticeable difference except for top speed.
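For the curious, one common way to shape an s-curve is the smoothstep polynomial; this is only a sketch of the idea, not my actual rewrite:

```c
/* Sketch of an s-curve speed ramp using the smoothstep polynomial
 * 3t^2 - 2t^3: commanded speed eases in and out instead of changing
 * slope abruptly the way a linear (trapezoidal) ramp does, which is
 * gentler on heavy hands. Illustration only. */
#include <assert.h>

/* t in [0,1] is the fraction of the ramp completed; returns the
 * fraction of maximum speed to command at that point. */
static double s_curve(double t) {
    if (t < 0.0) t = 0.0;
    if (t > 1.0) t = 1.0;
    return t * t * (3.0 - 2.0 * t);
}
```

The ramp starts and ends with zero slope (no acceleration jump) and is steepest in the middle, which is exactly the property linear acceleration lacks.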

I've also added the famous WS2812B addressable LEDs to see what else this board can do.

Eventually I decided to learn KiCad, and the redesign was done from the ground up while learning. At first I hated it, but now I'm really enjoying myself designing PCBs!

I'm not sure where this little project is going, but my key takeaways are the interesting research (the world of stepper motors is not a complete stranger to me, but these aren't NEMA motors), magnetism and of course KiCad!

Humans since 1982 are Bastian Bischoff (b. 1982, Germany) and Per Emanuelsson (b. 1982, Sweden). Since meeting as postgraduate students at HDK Göteborg in 2008, the duo have produced works that defy easy categorisation, situated between visual art and product design. Creating objects and experiential installations, their work manages to be analytical with a healthy dose of escapism.
Facebook: https://www.facebook.com/HumansSince1982/

If you want to buy one and aren't interested in the engineering part, just head to MoMA Design; you can also get a black one!

Another interesting and somewhat related idea is Clock by Christiaan Postma; I'm not sure if they are connected in any way, but it's still worth a look.

If you're looking only for source code:
Conor Hunt was also inspired by Humans since 1982 and wrote a javascript demo and shared the source code.
Jos Fabre was also inspired and made this demo.
Dmitry Yakimenko wrote an iOS app and published the source code and a demo.
Carlos Cabo wrote a webGL demo and shared the source code.
Takahashi Yuto wrote a demo in elm and shared the source code.
Nicolas Daniel wrote a demo and shared the source code.
Malte Wessel wrote a demo and shared the source code.
Ubayd Rahmonzoda wrote a demo and shared the source code.

There are already existing projects for trying to build a replica:

You might want to take a look at a discussion on mikrocontroller.net; it's pretty old, but there is some progress there and people are sharing their experiences.

Also, there's a single motor with a breakout board that someone sells on Tindie.


Wednesday, February 13, 2019

ESP32 IoT Device Management using LWM2M

Device Management means to connect, configure, control, monitor and update devices, individually or collectively.

The challenge can range from managing a single device or multiple devices in the same location, to managing thousands spread all over the world. To complicate things further, devices can connect in various ways: Ethernet, WiFi, cellular, LoRa, SMS and numerous others. A device can use a slow or fast network, and it can be connected all the time or just ping your servers once a day to save power; so many things to consider.

Around 2015, the Open Mobile Alliance recognized these challenges and released the first version of the Lightweight Machine-to-Machine (LWM2M) standard. The standard describes device management using the CoAP protocol with UDP/DTLS or SMS transports; TCP/TLS was recently added, and even MQTT, though it's not part of the standard yet.


So how does LWM2M help you with your device management needs?

LWM2M standardizes the way your devices talk to your servers. The clients are very lean and designed to work on constrained devices, and the connectivity has very low bandwidth needs and doesn't even require the devices to stay connected. That fits a very broad spectrum of products: it can help you track temperature sensors throughout your facility, or even a buoy in the middle of the ocean. Just imagine being able to plan a firmware update, with maximum safety, for a device you'll never see again!

I chose to implement firmware updates because I had an itch to try it on the ESP32. While doing it, I also implemented a basic device that can turn a light on or off and report the current time and a few other properties.

The way I did it was to port wakaama and tinydtls to the ESP32; about 90% of it worked without any modification. I added WiFiManager and an NTP client to the mix and it just worked. (Not very efficiently in terms of size, but good enough for my experiment: the firmware was about 1.2 MB in case you're wondering, of which DTLS and Wakaama were about 100-200 kB each and took about 6-10 kB of RAM, leaving me with about 210 kB of RAM, which is not bad.)

As soon as it started connecting to my local Leshan server, I hooked the update function I wrote into the firmware object, most of it based on the esp-idf OTA with custom certificate validation, and I had a complete solution: the firmware and its signature are sent over HTTP/HTTPS, the downloaded firmware is validated against the stored certificate, and if that succeeds, the device attempts to boot into the new firmware. A few diagnostics are then executed, and if the device achieves connectivity and stability, it marks the new firmware as valid.

The fun part started when I sent the package URI (/5/0/1) to the device through the Leshan server and executed the update (/5/0/2): the firmware was downloaded from a local web server, the device validated it and finally rebooted into the new firmware.

Why is firmware over-the-air so important?

Let's agree that time to market is a critical business need. More than once we hear about superior devices and software being ditched because a significant player already got a grip on the market, and trying to push a new product is hard if not impossible at times. Many times it's the first product or the first player that wins the game.

To achieve a significantly short time to market, compromises need to be made; sometimes a scaled-down product, fewer features and even a thinner layer of security are essential to get the product out the door. Once a product gains significant traction and more resources become available, a better product can be developed: a more robust firmware, new features, better security and fixes for security vulnerabilities all need to be deployed while there is no physical access to the devices. Moreover, a device recall can kill a business, and most companies try to avoid it at any cost. Literally.

This is where firmware over-the-air comes in: assuming the device can connect to some kind of network, whether WiFi, GPRS/3G or even SMS and LoRa, it should be able to pull a firmware update when needed or when one becomes available.

How does firmware over-the-air work?

Once a device has connectivity, it can either pull a firmware, for example from an HTTP server, or be pushed one, for example through CoAP blockwise transfers; let's assume this is relatively simple to implement or that a library is already available.

To support firmware updates more than once, the device keeps track of two partitions and switches between them every time the firmware is updated. This way the old firmware is kept until the device determines the new firmware is good enough to switch to permanently.

Let's drill down into the specifics of one way OTA can be implemented:

  1. A device is configured with two application partitions: one holds the factory application, and the second is left empty for a future update.
  2. The device is notified of an available firmware, and the OTA process can start.
  3. The device determines the next OTA partition to use: if it just came out of the factory, the second partition is empty; if an OTA was already completed successfully, the first partition is available for the next update.
  4. The firmware is downloaded directly to the available partition.
  5. The device validates the new firmware, for example with a checksum and/or a certificate.
  6. The device boots into the new partition and makes sure it works as designed: it checks the various sensors and that network connectivity can still be achieved, then marks the new firmware as valid or invalid. If it crashes, either intentionally or through a watchdog, it reboots back into the old partition.
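The partition bookkeeping behind these steps can be sketched in a few lines. This is a minimal illustration only, not tied to any particular SDK; the slot names and the `state` dictionary are hypothetical stand-ins for what would really live in flash or RTC memory:

```python
# Minimal A/B partition bookkeeping sketch (slot names are hypothetical).
SLOTS = ("app0", "app1")

def next_update_slot(state):
    """Return the slot the next firmware should be written to:
    always the one we are NOT currently running from."""
    return SLOTS[1] if state["running"] == SLOTS[0] else SLOTS[0]

def on_boot(state):
    """Called early at boot: if a new firmware was booted but never
    confirmed (step 6), roll back to the previous slot."""
    if state.get("pending") and not state.get("confirmed"):
        # New image crashed or never marked itself valid: revert.
        state["running"] = next_update_slot(state)
        state["pending"] = False
    return state["running"]

def mark_valid(state):
    """Diagnostics passed: make the new firmware permanent."""
    state["confirmed"] = True
    state["pending"] = False
```

The esp-idf hides exactly this logic behind its OTA partition APIs; the point here is only how little state is needed to make rollback safe.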
I would argue that adding secure validation of the new firmware is very important, even if only so the device will not become another zombie in a large botnet.

The way I've implemented security in my experiment is as follows:

1. Generate a self signed certificate:
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365
2. Extract the public key:
openssl x509 -pubkey -noout -in cert.pem > pubkey.pem
3. Copy the public key into a certificate.h file
4. Create a new firmware and sign it:
openssl dgst -sha256 -sign key.pem -out firmware.signature firmware.bin
5. Copy both files to a web server, set the Package URI and execute the Update.
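The RSA signature check itself needs a crypto library (mbedTLS on the ESP32, or openssl on a host), but the checksum variant of the validation in step 5 of the OTA flow fits in a few lines. A minimal host-side sketch, using only the standard library; the chunk size and the idea of a digest published next to the image are assumptions for illustration:

```python
import hashlib

def firmware_digest(data: bytes) -> str:
    """SHA-256 digest of a firmware image, computed in chunks so a
    large image never has to be hashed in one go."""
    h = hashlib.sha256()
    for i in range(0, len(data), 4096):  # 4 KB at a time, like a flash page
        h.update(data[i:i + 4096])
    return h.hexdigest()

def is_firmware_valid(data: bytes, expected_hex: str) -> bool:
    """Compare against the digest published alongside the image."""
    return firmware_digest(data) == expected_hex
```

On the device the same loop would read back from the OTA partition instead of a buffer, which is why hashing in chunks matters.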

Though if you want to implement truly secure firmware on the ESP32, you can do it "by the book".

OMA LWM2M offers many device management options, including on-boarding procedures, which becomes very important once your production reaches a certain number of devices, since you won't be able to do it manually anymore.

You can monitor, configure and control your devices; you can request to be notified of changes using the Observe mechanism; and you can query the device state, send commands so the device turns a certain actuator on or off, reset it to factory defaults or even remote-wipe the device if it comes to it.

And we can't really talk about LWM2M without mentioning IPSO Smart Objects as well: a list of objects defining the structure of various sensors and actuators, so you won't have to do it yourself.

LWM2M is a lot of fun, you should definitely consider it in your next product!

Saturday, November 3, 2018

Christopher Avery - The Responsibility Process®

I had the honor of attending Christopher Avery's The Responsibility Process® workshop on October 28-29th, 2018. We all cringe when we hear excuses or blame, but it eludes some of us that Shame and Obligation are not the nirvana of responsibility either, though they feel almost as bad; to some people, Shame and Obligation ARE responsible behavior. Christopher worked with all of the participants on understanding how Responsibility differs from Obligation, Shame, Justifying and Blame, and how team members and management can start working on their own responsibility and inspire responsibility in others.

Christopher started the workshop by asking us to find the "Taxes" and "Dividends" of responsible teams and members, helping us grasp what responsibility is and what the costs of not achieving it are, in our teams and in our lives.

He then proceeded to explain the difference between these states and how they hurt us from a personal point of view, as team members, and especially in management positions.

He explained why more accountability in organizations does not equal more responsibility, how to inspire responsible behavior and the 3 keys to responsibility:

  • Clear Intention
  • Focused Attention
  • Effective Action
We discussed the circles of control and power, how clarity leads to trust, and how power comes from being able to stop-think-act (I'm paraphrasing from the scuba diving rule book), while the control circle actually hurts decision making: it gives the illusion of fixing the problem while doing nothing to really solve anything, other than feeding the stressful behavior.

We talked about why responsible teams are more productive and why collaboration produces much better results than winning. But what is winning, if not collaborating?... (Did you ever hear "it's not our problem" or "you can't touch that" from your supervisor?)

Later we discussed trust issues between teams and members, how to build and rebuild trust, and how to avoid losing it, since it's so hard to rebuild.

At the end of the workshop we all understood why responsible people and teams do a better job than any other team, and how we can build our own teams and companies to be responsible for all of our successes.

I would like to say thanks to Practical Agile for enabling my attendance. Thank you Lior, Ilan and Dalit.



Tuesday, June 12, 2018

Tiny Model for MNIST Dataset

The MNIST database is a database of handwritten digits. It's used as an ideal beginner dataset for learning how to do simple image classification, and as it only contains 10 classes, it's relatively easy to work with.

It has 60,000 training examples and 10,000 testing examples, which is sufficiently large. As it consists of 60,000 grayscale images of 28 x 28 pixels, it takes about 50MB, so it adds no complexity for batch generation since it can all fit in memory.

Like I said, ideal.

I think the first lesson I did on image classification used the MNIST database as well. It was very amusing to see a 1.2 million parameter model for a 50MB dataset, so I decided to see how low I could go.

Let's start with the big one.

1.2m parameters - LeNet

CNN Error: 0.80%
train_loss 0.0082
train_acc 0.9984
val_loss: 0.0552
The graph does look like it's overfitting by a bit.

My first experiment used huge convolutions; I managed to train to 99% with 300k parameters, but I was not satisfied. Surely there is a better way.

This model reaches 0.992 accuracy with only 36k parameters (!!)

CNN Error: 0.72%
train_loss 0.0224
train_acc 0.9928
val_loss 0.0255
val_acc 0.9928

Only 36k? Well, that's still huge. I looked around and found out it's possible with under 4k parameters, so I set up for the challenge and came up with this model.

CNN Error: 0.90%
train_loss: 0.0369
train_acc: 0.9880
val_loss: 0.0285
val_acc: 0.9910

The model is under 4k parameters (3.8k).

To summarise what I learned from this exercise: larger models learn faster but also overfit faster; smaller models need more training to find a better fit.
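The parameter counts above come from simple arithmetic: a conv layer has (kh*kw*cin + 1)*cout weights and a dense layer (nin + 1)*nout, the +1 being the bias. A tiny helper makes the point; the layer shapes below are illustrative, not my exact models:

```python
def conv_params(kh, kw, cin, cout):
    """Weights of a 2D convolution: one kh*kw*cin kernel plus a bias, per filter."""
    return (kh * kw * cin + 1) * cout

def dense_params(nin, nout):
    """Weights of a fully connected layer: nin inputs plus a bias, per unit."""
    return (nin + 1) * nout

# A dense layer straight off a 28x28 feature map dominates the budget:
print(dense_params(28 * 28, 128))   # 100480 - most of a "big" model lives here
print(conv_params(3, 3, 1, 32))     # 320 - convolutions are comparatively cheap
```

This is why shrinking MNIST models is mostly about avoiding large dense layers after the flatten, not about the convolutions.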


I would like to say thank you to EliteDataScience.com for getting this little exercise started.

My 36k model:

My 4k model:


Friday, April 27, 2018

Practical Sensing - RF

We, as humans, are so used to knowing who we are and where we are that we sometimes forget it comes at a cost. A person sees where they are and has a general awareness of their location in a room or on the street; they can also see a step (or feel it, if they are blind), and so they know where obstacles are, can get from place to place and can plan routes around obstacles.


What can computers do? 

Radio Sensing

Radio sensing was invented in 1904, and while at first it was only capable of detecting presence, it has since evolved into many other kinds of sensors.

RF radar has a few implementations, among them presence, distance and movement, but the principle is the same: an RF wave is transmitted, it is then reflected (or not), and the returned radio waves are detected. In more advanced scenarios, the received signal goes through an FFT to extract the reflection's timing/phase, which is further processed to get the distance and/or speed of one or more objects.

Another implementation of radio sensing is localization: in its simplest form triangulation, and in a more advanced form GPS, which uses clocks to indicate when each signal was sent in order to better localize the receiver. BLE has been used for indoor navigation as well.

A different method of obtaining a general location, as long as you have network connectivity, is using a service such as Google's geolocation API, which uses your IP address and nearby WiFi networks to guess the location.

An attempt was made to discover the hackability of Bosch radars (diydrones, mikrocontroller), but so far without success.

Another form of electromagnetic sensing is the Geiger counter, which can be used to detect radiation; very useful if you want to detect radon in your basement or take a relatively safe hike near Chernobyl.

Lastly, radio can be used as a cheap way to find out whether a certain device is near another device, or even to communicate a secret of some sort.


Doppler - Doppler-type sensors detect change or movement; one such cheap sensor is the HB100.
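The HB100's intermediate-frequency output is a low-frequency beat whose frequency is the Doppler shift, so converting it to speed is one line of physics. A sketch, assuming the HB100's nominal 10.525 GHz transmit frequency (check your module's datasheet):

```python
C = 299_792_458.0   # speed of light, m/s
F0 = 10.525e9       # HB100 nominal transmit frequency, Hz (assumed)

def doppler_speed(f_shift_hz: float) -> float:
    """Radial speed (m/s) of the target from the measured Doppler shift.
    The factor of 2 accounts for the round trip of the reflected wave."""
    return f_shift_hz * C / (2 * F0)

# A ~100 Hz shift corresponds to roughly walking speed (~1.4 m/s):
print(round(doppler_speed(100.0), 2))
```

In practice you would measure the beat frequency with a comparator and a timer input capture, then apply this conversion.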


Distance - Sensors such as the FM24-NP100 ($110) provide the distance to the biggest reflection, but also spectrum data which can be used to monitor multiple objects. These sensors measure the phase difference between two wavelengths.

Presence - Modules like the HW-MS03 (about $2) are in essence a Doppler radar combined with a timer to switch a pin/relay on or off.

Some radar modules (such as the CFK024-5A, for about $50) have FMCW tuning capabilities, which is very useful if you want to do a sweep; a sweep can be used to detect the distance of multiple objects, but it requires more than basic knowledge.
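With FMCW, distance falls out of the beat frequency between the transmitted chirp and its echo: the chirp slope is B/T, and a target at range R delays the echo by 2R/c. A sketch with made-up chirp parameters (the 100 MHz bandwidth and 1 ms sweep are illustrative, not the CFK024-5A's datasheet values):

```python
C = 299_792_458.0  # speed of light, m/s

def fmcw_range(f_beat_hz, bandwidth_hz, sweep_time_s):
    """Target distance from the beat frequency of a linear FMCW sweep.
    The echo lags by 2R/c, producing a beat of slope * 2R/c."""
    slope = bandwidth_hz / sweep_time_s  # Hz per second
    return f_beat_hz * C / (2 * slope)

# e.g. a 100 MHz sweep over 1 ms: a 10 kHz beat is a target at ~15 m
print(round(fmcw_range(10e3, 100e6, 1e-3), 1))
```

Each peak in the FFT of the received signal is one such beat frequency, which is how a single sweep resolves multiple objects.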


GPS receivers determine location by triangulating the timestamps and signals received from satellites; the more satellites, the more accurate the location will be. But GPS technology is limited by atmospheric conditions, which limit the achievable accuracy.

GPS receivers have advanced over the years (GPS L1 and L2, GLONASS, Galileo, BeiDou and more), but the accuracy has stayed more or less the same; at the moment the peak is around 2.5 meters for private use.

To overcome the accuracy limitations, a few augmentations were developed: some are over the air, such as SBAS and QZSS, and some are based on static base stations, like DGPS and RTK.

Commercial RTK solutions are provided by drotek and Emlid to name a few.


BLE beacons are low-energy (hence the LE) devices which transmit a message once in a while. By reading the received power level (RSSI), it is possible to estimate the distance to the beacon, and by knowing where the beacons are, it is possible to triangulate (technically, trilaterate) the location of the receiver.

proximi.io (no affiliation) is one of the companies that does that kind of indoor positioning.
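Both steps of that pipeline can be sketched in a few lines: a log-distance path-loss model turns RSSI into an approximate distance, and linearizing three circle equations turns three distances into a position. A minimal 2D illustration; the -59 dBm calibration power at 1 m and the path-loss exponent n=2 are assumed values, and real RSSI is far noisier than this:

```python
import math

def rssi_to_distance(rssi_dbm, tx_power_dbm=-59, n=2.0):
    """Log-distance path-loss model: tx_power_dbm is the RSSI expected
    at 1 m (assumed calibration value), n the path-loss exponent."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * n))

def trilaterate(b1, b2, b3):
    """2D position from three (x, y, distance) beacon readings,
    by subtracting circle 1 from circles 2 and 3 to get a linear system."""
    (x1, y1, d1), (x2, y2, d2), (x3, y3, d3) = b1, b2, b3
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21  # beacons must not be collinear
    return ((c1 * a22 - c2 * a12) / det, (a11 * c2 - a21 * c1) / det)

# Beacons at three corners of a 10x10 m room, receiver at (3, 4):
print(trilaterate((0, 0, 5.0), (10, 0, math.hypot(7, 4)), (0, 10, math.hypot(3, 6))))
```

Production systems average many RSSI samples and use more than three beacons with least squares, but the geometry is exactly this.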

Nuclear Radiation

Following the Fukushima disaster and Chernobyl tourism, personal radiation detection devices have become more and more popular.

Dosimeters are available as film badges, MOSFETs and Geiger-Müller tubes, to name a few.


Wireless identity devices are devices that contain a chip with a small coil; the coil is used to power up the chip and transmit data. Two such device families are RFID and NFC.

Some devices hold static data, some can store custom data, some can encrypt and authenticate, but the principle is the same.

The more common RFID types are 125kHz and 13.56MHz, and they contain a 20-byte ID and anywhere between 0 and 64 bytes of custom data.

13.56MHz module
125kHz module

One of the more interesting things about RFID is the range of tag sizes.

NFC works in much the same way; memory capacity is between 48 bytes and 32KB.


While radio communication is not a sensor per se, it is a way to communicate and locate, and it can be used to sense location, state and various data.

Standard modules come in various frequencies and modulations, some even implement protocols, error correction and buffers.

Among the more popular ones are 315MHz, 433MHz, 868MHz and 915MHz, depending on the country and local regulations; these are considered ISM bands and can be used for anything from car remotes to multirotor telemetry and various remote switches.

433MHz RF transmitter and receiver

XBee, Bluetooth and WiFi are also in ISM bands, but more robust implementations are available and they are mostly used for higher-bandwidth applications.


Some of the more popular 2.4GHz modules are the NRF24L01, A7105 and CC2500, which are used in RC toys.

FrSky Taranis Q X7S

LoRa, LoRaWAN, Sigfox and NB-IoT are mostly used for smart appliances such as water meters and power meters, where relatively long range and wide coverage are required, and they offer very low bandwidth.

Cellular (2G/3G and up) is used nowadays for anything from messaging to video playback.



