Let’s C if we can do bare-metal programming


Remember blinking the LED on Arduino? It’s almost magical how easy it is. That’s because Arduino abstracts all the low-level stuff from us. This is extremely helpful for a beginner, but surely there must be a trade-off, right?

That’s where bare-metal programming comes in. Now there’s no abstraction, so we configure everything manually, from setting up registers to toggling ports directly. In my projects, I don’t need bare-metal programming, but it’s a good chance to learn some basics and see for myself if it’s really that much more performant 🤔

The world of ESP

The ESP family is a line of microcontroller chips made by a company called Espressif Systems. These chips are special because they come with built-in Wi-Fi (and sometimes Bluetooth)—which makes them perfect for connecting devices to the internet. That’s a big deal for modern electronics, especially in the world of IoT.

When you search for ESP, you’ll mostly see ESP8266 and ESP32, the two most popular chips in the ESP family. You may want to upgrade from an Arduino UNO to an ESP8266 to get built-in Wi-Fi, more memory, and a much faster processor. The ESP32 is an upgrade from the ESP8266, adding Bluetooth support, even more speed, and additional features.

We will use the LilyGO TTGO T-Base ESP8266, a nice development board that we can plug straight into our computer.

Blinking that LED like a normal person

After connecting the board, open Arduino IDE. Go to File > Preferences > “Additional Boards Manager URLs”, and paste this link: http://arduino.esp8266.com/stable/package_esp8266com_index.json

Now you will be able to install esp8266 from Tools > Board > Boards Manager.

We will turn the built-in LED on for 0.1 seconds then turn it off and wait for a second. This is our code:

void setup() {
 pinMode(LED_BUILTIN, OUTPUT);
}

void loop() {
 digitalWrite(LED_BUILTIN, LOW);  // LED on
 delay(100);
 digitalWrite(LED_BUILTIN, HIGH); // LED off
 delay(1000);
}

Before we get into bare-metal, did you notice something odd? Shouldn’t HIGH mean ON and LOW mean OFF?

On an Arduino UNO, the LED is wired like this: PIN → LED → GND. When we set the pin to HIGH, current flows through the LED, and it lights up. On many ESP8266 boards, it’s wired the other way around: VCC → LED → PIN. When we set the pin to HIGH, both sides of the LED are at the same voltage, so there is no voltage difference and no current, and the LED stays off. But when we set the pin to LOW, there’s a voltage difference (3.3V at the top, 0V at the pin), so current flows, and the LED lights up! This is called active-low, meaning the component becomes active (ON) when the control signal is LOW.
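The two wirings can be summed up in a tiny helper. This is only a sketch to make the logic explicit; the names led_wiring_t and led_is_lit are made up for illustration and do not come from the Arduino core.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    LED_ACTIVE_HIGH, /* PIN -> LED -> GND (Arduino UNO style) */
    LED_ACTIVE_LOW   /* VCC -> LED -> PIN (many ESP8266 boards) */
} led_wiring_t;

/* Returns true when the LED is lit for a given pin level (true = HIGH). */
static bool led_is_lit(led_wiring_t wiring, bool pin_high)
{
    /* Active-high: lit when the pin is HIGH.
       Active-low:  lit when the pin is LOW. */
    return (wiring == LED_ACTIVE_HIGH) ? pin_high : !pin_high;
}
```

So led_is_lit(LED_ACTIVE_LOW, false) is true, which is exactly why digitalWrite(LED_BUILTIN, LOW) turns the LED on in the sketch above.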

Now that we know what’s going on in the code, how can we measure its performance without an oscilloscope? Maybe we can print the execution time?

void setup() {
 pinMode(LED_BUILTIN, OUTPUT);
 Serial.begin(115200);  // Open serial monitor to see the output
}

void loop() {
 unsigned long start = micros();
 digitalWrite(LED_BUILTIN, LOW);
 delay(100);
 digitalWrite(LED_BUILTIN, HIGH);
 unsigned long end = micros();
 Serial.print("Time to toggle LED ON then OFF: ");
 Serial.print(end - start);
 Serial.println(" µs");
 delay(1000);  // Wait a second before repeating

}

micros() returns the current time in microseconds (1/1,000,000 of a second 🤯), counted from when the board started running. We store it in start, then run the code that turns the LED on, waits for 100 ms, and turns it off. After that, we call micros() again to get the end time, and subtract the two to see how long the entire operation took.
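One detail worth knowing about the subtraction: the counter is a 32-bit unsigned value, so it rolls over to zero after 2^32 µs (roughly 71 minutes). The sketch below, with a made-up helper name elapsed_us, shows why end - start still gives the right answer across one rollover.

```c
#include <stdint.h>

/* Elapsed microseconds between two readings of a 32-bit counter.
   Unsigned subtraction wraps modulo 2^32, so the result stays correct
   even if the counter overflowed once between the two readings. */
static uint32_t elapsed_us(uint32_t start, uint32_t end)
{
    return end - start;
}
```

For example, a start of 0xFFFFFFF0 and an end of 0x00000010 still yields 32 µs, not a huge negative-looking number.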

Serial.begin() and Serial.print() are new

Serial.begin(115200); → starts the serial communication between the microcontroller and the computer. 115200 is the baud rate, which here basically means the communication line runs at 115200 bits per second. You can open the Serial Monitor from the top right of the Arduino IDE interface.

If you select a different baud rate in the Serial Monitor than the one in your code, you will see garbled characters like this: �L��a�:FL��a�. It happens because your computer tries to “listen” at the wrong speed and misinterprets the bits flying by.

115200 is the usual “fast but safe” choice among the standard baud rates for most serial setups, which is why I use it in the code.
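To get a feel for what a baud rate means in practice, here is a small back-of-the-envelope helper. It assumes the common 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits on the wire per byte); the function name uart_tx_time_us is made up for this sketch.

```c
#include <stdint.h>

/* Bits on the wire per byte with 8N1 framing: 1 start + 8 data + 1 stop. */
#define BITS_PER_BYTE_8N1 10u

/* Approximate time, in microseconds, to shift n_bytes out at a given baud rate. */
static uint32_t uart_tx_time_us(uint32_t n_bytes, uint32_t baud)
{
    /* 64-bit intermediate so large byte counts do not overflow. */
    return (uint32_t)(((uint64_t)n_bytes * BITS_PER_BYTE_8N1 * 1000000u) / baud);
}
```

Under that assumption, a 40-character line takes about 3.5 ms at 115200 baud and about 41.7 ms at 9600 baud.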

Let’s check the serial monitor:

We were expecting 100,000 microseconds for each interval, since we used delay(100), but instead we are seeing 100,033. Where is this extra 33 microseconds coming from?

  • micros() itself takes ~3–5 µs to execute
  • digitalWrite() is also relatively slow on the ESP8266 (because it’s a high-level function), and may take ~4–6 µs per call
  • delay(100) doesn’t guarantee exactly 100,000 µs; it guarantees at least 100 ms
  • Serial printing also takes time, but here it runs after the second micros() call, so it falls outside the measured interval

We need micros() and Serial.print() to see the result, so let’s get rid of the delay() call and compare only digitalWrite() against its bare-metal version.

Timing the LED Toggle Only

Let’s only calculate the time for LED on and off:

#define LED_PIN LED_BUILTIN  

void setup() {
 pinMode(LED_PIN, OUTPUT);
 Serial.begin(115200);

}

void loop() {
 uint32_t start = micros();  // Arduino's microsecond timer
 digitalWrite(LED_PIN, LOW);   // Turn LED on
 digitalWrite(LED_PIN, HIGH);  // Turn LED off
 uint32_t end = micros();
 Serial.print("digitalWrite toggle-only time: ");
 Serial.print(end - start);
 Serial.println(" µs");
 delay(1000);  // Wait 1 second before repeating
}
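A single on/off pair is so short that the cost of reading the timer can dominate the result. One common workaround, which is not what the sketch above does, is to time many toggles between one pair of timer reads and divide. The version below is a PC-side illustration only: now_us() and toggle_once() are stand-ins for micros() and the two digitalWrite() calls, and the costs baked into them are invented numbers, not measurements.

```c
#include <stdint.h>

/* Stand-ins so the sketch compiles on a PC. The 4 us and 2 us costs
   are invented for illustration; they are NOT measured values. */
static uint32_t fake_clock_us;
static uint32_t now_us(void)      { fake_clock_us += 4; return fake_clock_us; }
static void     toggle_once(void) { fake_clock_us += 2; }

/* Time n toggles with a single pair of timer reads.
   Returns the average in nanoseconds per toggle (0 if n is 0). */
static uint32_t avg_toggle_ns(uint32_t n)
{
    if (n == 0) {
        return 0;
    }
    uint32_t start = now_us();
    for (uint32_t i = 0; i < n; i++) {
        toggle_once();
    }
    uint32_t end = now_us();
    return (uint32_t)(((uint64_t)(end - start) * 1000u) / n);
}
```

With these pretend costs, one toggle appears to take 6000 ns because the timer read is included, while the average over 1000 toggles comes out at 2004 ns, much closer to the pretend 2 µs cost.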

So how can we convert it to bare-metal?

Let’s try with this version:

extern "C" {
 #include "user_interface.h"
 uint32_t system_get_time(void);  // <-- required for Arduino
}

#define LED_PIN 2  // GPIO2 on ESP8266 boards

uint32_t read_ccount() {
 uint32_t ccount;
 asm volatile ("rsr.ccount %0" : "=a" (ccount));
 return ccount;

}

void setup() {
 pinMode(LED_PIN, OUTPUT);
 Serial.begin(115200);

}

void loop() {
 uint32_t start_time = system_get_time();
 uint32_t start_cycles = read_ccount();
 GPOC = (1 << LED_PIN);  // LOW (LED ON)
 GPOS = (1 << LED_PIN);  // HIGH (LED OFF)
 uint32_t end_cycles = read_ccount();
 uint32_t end_time = system_get_time();
 uint32_t cycles = end_cycles - start_cycles;
 float time_us = (float)cycles / 52.0;
 Serial.print("Toggle: ");
 Serial.print(cycles);
 Serial.print(" cycles ≈ ");
 Serial.print(time_us, 3);
 Serial.print(" µs (system time: ");
 Serial.print(end_time - start_time);
 Serial.println(" µs)");
 delay(1000);

}

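We divide the cycle count by the CPU clock in MHz to get microseconds, and the code above assumes 52 MHz. Treat the µs figure with some caution: as far as I can tell, 52 MHz is the clock the chip runs at straight out of the boot ROM with a 26 MHz crystal, while the Arduino core normally sets the CPU to 80 MHz (160 MHz is optional). If the sketch is running at 80 MHz, the same cycle count works out to a smaller number of microseconds. Either way, the cycle count itself is the number to trust.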

Toggle: 19 cycles ≈ 0.365 µs (system time: 0 µs)
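The conversion from cycles to time depends on the actual CPU clock, so a safer habit is to pass the clock in rather than hard-code it. A minimal sketch (the helper name cycles_to_us is made up; in an Arduino sketch the clock value could come from ESP.getCpuFreqMHz()):

```c
#include <stdint.h>

/* Convert a CCOUNT difference to microseconds for a CPU clock given in MHz.
   One MHz is one cycle per microsecond, so this is a plain division. */
static double cycles_to_us(uint32_t cycles, uint32_t cpu_mhz)
{
    return (double)cycles / (double)cpu_mhz;
}
```

With 19 cycles this gives about 0.365 µs at 52 MHz and about 0.24 µs at 80 MHz, so the clock assumption changes the headline number quite a bit.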

Wow, even getting rid of digitalWrite makes a difference! But if bare-metal means no abstraction, doesn’t including libraries and using setup() and loop() break that rule? In that case, the previous code isn’t really bare-metal, it’s barely-metal. (I’m sorry about the joke)
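Part of the reason GPOS and GPOC are so cheap is that they are "write 1 to set" and "write 1 to clear" registers: writing a mask changes only the bits that are 1, so there is no read-modify-write on our side. The snippet below models that behaviour on a plain integer; the function names are made up and nothing here touches real hardware.

```c
#include <stdint.h>

/* Model of a write to the "set" register (GPOS):
   bits that are 1 in mask go HIGH, all other output bits are untouched. */
static uint32_t after_w1ts(uint32_t out, uint32_t mask)
{
    return out | mask;
}

/* Model of a write to the "clear" register (GPOC):
   bits that are 1 in mask go LOW, all other output bits are untouched. */
static uint32_t after_w1tc(uint32_t out, uint32_t mask)
{
    return out & ~mask;
}
```

For instance, if GPIO4 is already HIGH (0x10) and we set GPIO2, the output becomes 0x14; clearing GPIO2 again brings it back to 0x10 without disturbing GPIO4.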

C code, linker scripts and makefiles

Thankfully, I stumbled across Tom Trebisky’s GitHub. Now we have a roadmap for bare-metal programming in C.

  1. We will install the compiler, the Xtensa toolchain. It turns our code into something the ESP8266 can understand. https://espressif-docs.readthedocs-hosted.com/projects/esp8266-rtos-sdk/en/v3.4/get-started/index.html#get-started-get-esp-idf
  2. We will take a look at Tom’s bare-1-hello folder. It has:
     • a single C file called hello.c (the actual program)
     • a tiny linker script called esp.lds, which specifies where the code lives in the chip’s memory
     • a Makefile, which is a build automation file. Instead of typing lots of commands in the terminal, we just run make image, make flash…

In esp.lds and the Makefile I didn’t change anything except some configuration. Here’s the final C file:

#include <stdint.h>

#define UART_CLK_FREQ   (52 * 1000000)   // 52 MHz: clock at boot, before any SDK reconfigures it (26 MHz crystal)
#define LED_PIN         2
#define LED_MASK        (1U << LED_PIN)
#define GPIO_BASE       0x60000300
#define GPIO_OUT_W1TS   (*(volatile unsigned int *)(GPIO_BASE + 0x04))  // Set HIGH
#define GPIO_OUT_W1TC   (*(volatile unsigned int *)(GPIO_BASE + 0x08))  // Set LOW
#define GPIO_ENABLE_W1TS (*(volatile unsigned int *)(GPIO_BASE + 0x10)) // Enable pin output (register at 0x60000310)

extern void uart_div_modify(int uart_no, unsigned int div);
extern void ets_delay_us(unsigned int us);
extern void ets_printf(const char *fmt, ...);

// Read ESP8266 CPU cycle counter
static inline uint32_t read_ccount(void) {
   uint32_t ccount;
   __asm__ __volatile__("rsr.ccount %0" : "=a"(ccount));
   return ccount;
}

void gpio_init(void) {
   GPIO_ENABLE_W1TS = LED_MASK;  // Set LED pin as output
}

void call_user_start(void)
{
   uart_div_modify(0, UART_CLK_FREQ / 115200);  // UART setup
   ets_delay_us(500000);                        // Let USB settle
   gpio_init();                                 // LED pin output enable
   ets_printf("\nESP8266 LED Cycle Timing\n");

   for (;;) {
       uint32_t start = read_ccount();
       GPIO_OUT_W1TC = LED_MASK;   // LED ON (active-low on ESP boards)
       GPIO_OUT_W1TS = LED_MASK;   // LED OFF
       uint32_t end = read_ccount();
       uint32_t cycles = end - start;
       ets_printf("LED ON->OFF: %u cycles\n", cycles);
       ets_delay_us(1000000);      // 1 second delay
   }

}
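The uart_div_modify() line is worth a quick sanity check. The C file passes UART_CLK_FREQ / 115200 as the divider, and integer division drops the fraction, so the real baud rate is slightly off. The sketch below reproduces that arithmetic on a PC, assuming the resulting baud rate is simply clock / divider; the helper names are made up.

```c
#include <stdint.h>

/* Integer divider as computed in the C file: clock / desired baud (fraction dropped). */
static uint32_t uart_divider(uint32_t clk_hz, uint32_t baud)
{
    return clk_hz / baud;
}

/* Baud rate that a divider produces, assuming baud = clock / divider. */
static uint32_t actual_baud(uint32_t clk_hz, uint32_t divider)
{
    return clk_hz / divider;
}
```

With a 52 MHz clock the divider is 451, which gives about 115299 baud. That is less than 0.1% away from 115200, comfortably close enough for a serial terminal.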

To make this code work we have 3 steps.

  1. make clean all: This deletes any old compiled files and builds everything from scratch. It ensures we’re uploading the latest version of our code.
  2. make image: This converts our compiled code into a binary format (.bin) that the ESP8266 can understand and flash to memory.
  3. make flash: This uploads the binary to our ESP8266 over USB, so the chip can run our program the next time it powers on.

After that we open our terminal and type: screen /dev/cu.usbserial-2120 115200

10 cycles? That’s just incredible. I’m not saying this is production-ready firmware. I’m just thrilled the LED turned on and I didn’t brick my ESP8266 in the process. If you’re into bare-metal and want to help refine this or stop me from committing embedded crimes, please do. Still, it seems that going from digitalWrite() to GPOS, and then from C++ to C, shaved off CPU cycles along the way. Maybe on our little ESP8266 it doesn’t matter much, but in the world of low-power or real-time systems every cycle counts.
