
Successful Debugging Requires Bridging the Hardware-Software Gap


Don’t ignore the default MCU settings


Debugging is an important part of embedded design; one that requires bridging the hardware/software gap. At a system level, the functionality of an embedded system is increasingly defined by firmware, so avoiding bugs requires engineers with disparate disciplines to work closely together during development.

The functionality and configurability offered by embedded components such as MCUs keeps growing, but many of the features they provide aren’t required in every design. Often, these extra functions can simply be ignored and rarely cause problems.

These features are typically controlled by registers that can be modified through software. Each register has a default setting at power-up and, if left unchanged, the device will continue to operate under those defaults. In many cases this presents no problem.
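The pattern can be sketched in a few lines of C. The register name, bit layout and reset value below are illustrative assumptions, not taken from any real part, and the register is modelled as a plain variable so the fragment compiles on a host rather than at a memory-mapped address:

```c
#include <stdint.h>

/* Hypothetical peripheral control register, modelled as a plain variable so
 * this sketch runs on a host; on real hardware it would be a fixed
 * memory-mapped address taken from the vendor's device header. */
static volatile uint32_t UART_CTRL = 0x0005u; /* assumed power-up default */

#define UART_CTRL_ENABLE (1u << 0)
#define UART_CTRL_PARITY (1u << 2)

/* Firmware that never writes UART_CTRL keeps running with the reset value,
 * parity included, whether or not the design expects it. */
int uart_parity_enabled(void)
{
    return (UART_CTRL & UART_CTRL_PARITY) != 0u;
}

/* Writing the register explicitly during init removes any dependence on the
 * power-up default. */
void uart_init_explicit(void)
{
    UART_CTRL = UART_CTRL_ENABLE; /* parity deliberately cleared */
}
```

The point is not the specific peripheral but the habit: initialisation code that writes every field it cares about, rather than trusting whatever the silicon wakes up with.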

However, if these features remain unused or untested, there is a chance that their impact will be felt in some unforeseen way. Bugs may develop in the system, caused by perfectly legitimate features that have perhaps been overlooked.

Finding faults can be difficult, time-consuming and costly even under ideal conditions. If the cause of a fault is a low-level feature that hasn’t been initialised correctly then finding it could become even more challenging. Understanding how the initial state of the hardware platform could impact an entire design requires a much higher appreciation of the overall system.

For example, consider an SPI bus on an MCU accessing a serial Flash memory, a relatively simple arrangement used in many embedded systems. An error detected in the stored value would naturally suggest that the memory, rather than the MCU, was at fault. This was one customer’s experience when successive reads from the Status register of a Flash memory showed it was detecting read/write errors.
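A status-register poll of this kind might look like the sketch below. The 0x05 Read Status Register command is common across many serial Flash parts, but the error-flag bit positions are illustrative assumptions, and the SPI driver is a host-side stub that simulates the spurious error the team observed:

```c
#include <stdint.h>

#define CMD_READ_STATUS   0x05u       /* common serial-Flash command */
#define STATUS_ERASE_FAIL (1u << 5)   /* assumed erase-fail flag */
#define STATUS_PROG_FAIL  (1u << 6)   /* assumed program-fail flag */

/* Stub standing in for a real SPI driver so the sketch runs on a host; it
 * returns a status byte with the program-fail flag set, mimicking the
 * spurious errors reported in the case study. */
static uint8_t spi_transfer(uint8_t cmd)
{
    (void)cmd;
    return STATUS_PROG_FAIL;
}

/* Poll the status register and report whether the device claims an error.
 * Seeing these flags set naturally points suspicion at the memory, even
 * when, as in this case, the real cause is elsewhere on the bus. */
int flash_reports_error(void)
{
    uint8_t status = spi_transfer(CMD_READ_STATUS);
    return (status & (STATUS_PROG_FAIL | STATUS_ERASE_FAIL)) != 0u;
}
```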

The engineers believed these symptoms pointed to the serial memory failing, even though it was still well within its specified endurance limit, having only completed around 60K write cycles. When the serial Flash memory device was returned to Dialog for further tests, no fault was found, even after over 300K write cycles were executed.

In order to track down the real fault, Dialog engineers investigated the customer’s application and probed the SPI signals. What appeared to be a fault with the memory device actually turned out to be a system noise issue and one that could be easily corrected.

Even though it appeared to be a PCB or circuit design issue, the noise was in fact overshoot and undershoot on the SPI signals, caused by the excessive drive strength of the MCU GPIO pins interfacing with the Flash memory.

It was discovered that the design was based on a relatively new MCU that allowed the drive strength of the I/O pins to be modified in firmware, and whose default was the maximum drive level. Reducing the drive strength of the signals was enough to eradicate the overshoot and undershoot on the SPI lines, effectively removing the source of system-level noise.
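The fix amounts to a read-modify-write of a pad-control register. The register name, the 2-bit drive-strength field and its encodings below are hypothetical (real parts document these in the datasheet), and the register is again modelled as a variable so the sketch runs on a host; the power-up value mirrors the maximum-drive default described above:

```c
#include <stdint.h>

/* Hypothetical GPIO pad-control register for the SPI pins, modelled as a
 * plain variable so the sketch runs on a host. The 2-bit drive-strength
 * field and its encodings are illustrative, not from a real datasheet. */
static volatile uint32_t PAD_CTRL_SPI = 0x3u; /* assumed default: MAX drive */

#define DRIVE_SHIFT 0u
#define DRIVE_MASK  (0x3u << DRIVE_SHIFT)
#define DRIVE_LOW   (0x1u << DRIVE_SHIFT)

/* Read-modify-write that lowers only the drive-strength field, leaving any
 * other pad settings in the register untouched. */
void spi_pads_reduce_drive(void)
{
    uint32_t v = PAD_CTRL_SPI;
    v = (v & ~DRIVE_MASK) | DRIVE_LOW;
    PAD_CTRL_SPI = v;
}

uint32_t spi_pads_drive(void)
{
    return (PAD_CTRL_SPI & DRIVE_MASK) >> DRIVE_SHIFT;
}
```

Masking rather than assigning the whole register is the safer habit here: it changes only the field under suspicion while preserving whatever else the pad configuration carries.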

An important point here is that although the Flash memory device was doing its best to contend with a significant amount of system noise, a configurable feature on an MCU introduced effects that were easily interpreted as faults in a separate part of the design.

In this instance, the fault was detected through a robust approach to design and was resolved through the diligence of Dialog engineers working with the customer design teams. Even though default settings are meant to help, they should be verified.

The working relationships between hardware and software engineers, as well as between customer and supplier, were strong enough to meet the challenges that designing with the latest technology can present.

Read the full case study, “How system level noise in digital interfaces can lead to spurious errors in serial flash memory”.