Enhancing the reliability of your embedded Linux updates - OTA
The update of embedded Linux product fleets represents a strategic imperative. Risks and solutions - discover how to enhance reliability and standardize your update process
25/03/2024
Nathanaël Landais
Updating Embedded Linux Systems: A Strategic Imperative
Whether you're an established industrial company or a startup, you inevitably face the challenge of updating your products that embed Linux. Once deployed, these systems need regular updates to fix bugs, enhance performance, add features, or bolster security.
But how do you efficiently manage updates for these embedded Linux systems? What off-the-shelf solutions are available in the market? What are the advantages and disadvantages of each?
In this article, I will present the various challenges and solutions related to updating embedded Linux systems. We'll particularly explore how to enhance the robustness of your update processes to reduce risks, detect bugs, and continuously improve the quality of updates.
Managing Embedded Linux Fleets
The Challenges of Updating an Embedded Systems Fleet
Keeping your products up to date is crucial, both to ensure the security and reliability of your systems and to allow your customers to benefit from the latest features and improvements.
However, updating a fleet of embedded Linux systems is also risky: the worst outcome is a bricked system that requires human intervention to repair.
Having a few units bricked is a problem in itself, but if tens or hundreds of thousands of systems are bricked, or even just malfunctioning, it becomes a complete disaster.
Therefore, it is essential to do everything necessary to prevent this scenario from occurring and to minimize the consequences if it does.
Ensuring Update Reliability
The first essential element is to ensure the reliability of the update process. The system must be robust, capable of withstanding unexpected scenarios: network outages, power failures, transmission errors.
If an update is interrupted, or turns out to be non-functional once the process completes, the system should revert to its previous state without human intervention.
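As an illustration of guarding against transmission errors, here is a minimal Python sketch that checks a received image against a digest announced alongside it before anything is installed. The function names (`sha256_of`, `is_image_intact`) and the idea of shipping a SHA-256 digest with the image are assumptions made for this example, not the API of any particular update tool.

```python
import hashlib
import hmac


def sha256_of(chunks):
    """Return the hex SHA-256 digest of an iterable of byte chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        # Hash incrementally so a large image never has to fit in memory.
        digest.update(chunk)
    return digest.hexdigest()


def is_image_intact(chunks, expected_hex):
    """Return True only if the received data matches the announced digest.

    Hypothetical helper: `expected_hex` is assumed to arrive with the
    update, for example in a manifest.
    """
    return hmac.compare_digest(sha256_of(chunks), expected_hex.lower())
```

A device would refuse to install an image for which `is_image_intact` returns `False` and simply retry the download. Note that a plain digest only detects corruption; proving that an image comes from a trusted source requires a signature, which is a separate mechanism.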
Deploying Incrementally
The second element aims to mitigate risks in case of major malfunction of the update process. To achieve this, the deployment strategy must be incremental, limiting the number of systems exposed to a potentially problematic update.
This is called canary testing.
It is an effective strategy to measure the impact of a new version in the field with limited risk.
Gathering statistics, especially successes or failures, helps validate the update before deploying it to all users.
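To make the role of these statistics concrete, here is a small Python sketch of a gate that decides whether a canary wave may be expanded. The `WaveReport` type, the `next_action` function, and the thresholds (20 reports, 2% failures) are all hypothetical values chosen for illustration; real thresholds depend on your fleet and your risk tolerance.

```python
from dataclasses import dataclass


@dataclass
class WaveReport:
    """Outcome counts for one deployment wave (hypothetical structure)."""
    succeeded: int
    failed: int
    pending: int = 0


def next_action(report, min_reports=20, max_failure_rate=0.02):
    """Return "wait", "halt", or "expand" for the current wave.

    Both thresholds are illustrative assumptions, not recommendations.
    """
    done = report.succeeded + report.failed
    if done < min_reports:
        # Too few devices have reported back to judge the update.
        return "wait"
    failure_rate = report.failed / done
    return "halt" if failure_rate > max_failure_rate else "expand"
```

For example, `next_action(WaveReport(succeeded=99, failed=1))` returns `"expand"`, while ten failures out of a hundred reports returns `"halt"` and keeps the rest of the fleet on the previous version.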
The Benefits of These Strategies
These strategies offer several advantages for managing fleets of Linux-based products:
- They allow easy rollback, reverting the update if it proves problematic.
- They reduce risks associated with updates by limiting the number of users exposed to potential errors or malfunctions.
Proprietary Solutions for Updating Embedded Linux Systems
Many manufacturers develop their own solutions internally to manage updates for their Linux-based products. These solutions are often specific to each project and rely on proprietary protocols and data formats.
The reasons for this choice vary: the age of the solution (it may predate robust off-the-shelf alternatives), unfamiliarity with existing tools, or the search for a tailor-made answer to problems that seem very specific to the company's use cases.
However, these solutions may have significant limitations, especially in terms of cost, flexibility, security, and reliability. Indeed, they are often:
- Difficult to adapt to changes in context, hardware, or software. Developing new products may require significant adaptations to the internal tool, which is often not a formal tool at all but a collection of more or less reusable components.
- Expensive to maintain and evolve, as a consequence of the previous point.
- Limited to essential use cases. Custom solutions rarely offer updates that are resilient to unexpected scenarios such as network outages, power outages, or transmission errors. Any of these scenarios can leave a system bricked.
So, if you want to improve your update process reliability and reduce costs, how do you go about it?
Open-Source Solutions for Updating Embedded Linux Systems
If you know Lenewt, you've seen this coming: I want to talk to you about open-source update deployment solutions. There are several mature solutions today, actively supported by their communities, that are more comprehensive, flexible, and robust than most custom-developed solutions.
SWUpdate, Mender, RAUC
Among the most popular are SWUpdate, Mender, and RAUC. These solutions are based on the principle of atomic updates: an update is either applied in its entirety or canceled.
Implementing Atomic Update
In detail, these systems operate with two partitions (or images) for the system: an active partition, which contains the currently used system, and a passive partition, which receives the new version of the system.
During the update, the new image is written to the passive partition, then the system reboots into that partition. If the update fails or the new system does not function correctly, the device can fall back to the previously active partition, which still contains the old version of the system. This mechanism avoids the risk of an unbootable system or a corrupted update.
The robustness of this update strategy eliminates the need for someone to be physically present at the machine and allows for simple deployment of updates across large fleets of Linux-based products.
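The active/passive mechanism described in this section can be sketched as a toy Python model. This is a simplified simulation under stated assumptions, not the implementation used by SWUpdate, Mender, or RAUC: the `Device` class, the slot names "A" and "B", and the `write_ok` and `boots_ok` flags are hypothetical, and on real hardware the fallback decision is typically the bootloader's job.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Toy model of a device with two system slots (hypothetical)."""
    slots: dict = field(default_factory=lambda: {"A": "v1", "B": None})
    active: str = "A"

    def passive(self):
        """Name of the slot that is not currently running."""
        return "B" if self.active == "A" else "A"

    def running(self):
        """Version stored in the active slot."""
        return self.slots[self.active]

    def update(self, version, write_ok=True, boots_ok=True):
        """Install `version` into the passive slot and try to switch to it.

        `write_ok` and `boots_ok` simulate an interrupted write and a
        new system that fails to start. Returns True on success.
        """
        target = self.passive()
        if not write_ok:
            # The passive slot is left invalid; the active slot is untouched.
            self.slots[target] = None
            return False
        self.slots[target] = version
        previous = self.active
        self.active = target  # attempt to boot the new system
        if not boots_ok:
            self.active = previous  # fall back to the old system
            return False
        return True
```

Whatever goes wrong, the slot that was running before the update is never modified, which is why the device always has a known-good system to return to.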
Too Specific Use Cases?
Do you find your specific case incompatible with these generic solutions? These tools are more flexible than you might think. For example, if you need to update the firmware of another component that communicates with your Linux system via UART, SWUpdate lets you implement what it calls a handler, which lets you integrate your own proprietary tools and processes into your update packages.
These tools are used in large industrial projects with very diverse use cases; your specific use case is likely to be configurable in these tools.
And if it's not? It's open-source; you can contribute to enriching the tool of your choice, with the bonus of community support that can help you integrate your needs into the tool for the benefit of everyone.
Solutions for Progressive Deployment
And what about progressive deployment? How do you perform canary testing to check the stability of your updates and mitigate the risk of a major malfunction?
All three tools support progressive deployment in one way or another.
- Mender allows you to create update campaigns and track deployment and performance statistics.
- SWUpdate and RAUC can be paired with hawkBit for similar results.
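To show what a progressive rollout involves under the hood, here is a minimal Python sketch that assigns each device to a stable bucket so that a campaign can be widened from, say, 5% to 25% to 100% of the fleet. This is not how Mender or hawkBit select devices; the functions `rollout_bucket` and `in_phase`, the device identifiers, and the hashing scheme are assumptions made for this example.

```python
import hashlib


def rollout_bucket(device_id, release):
    """Map a device to a stable bucket in the range 0..99 for a release.

    Hashing the release name together with the device ID means the same
    devices are not always the first to receive every release.
    """
    digest = hashlib.sha256(f"{release}:{device_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100


def in_phase(device_id, release, percent):
    """Return True if the device belongs to the first `percent` of the fleet."""
    return rollout_bucket(device_id, release) < percent
```

Because a device's bucket never changes for a given release, every device included at 5% is still included at 25%, so widening a campaign only ever adds devices.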
Conclusion
Updating embedded Linux systems is a strategic challenge for companies, which must ensure the quality, security, and reliability of their products, while meeting the expectations and needs of their customers. The consequences of failed update deployments can be significant both financially and in terms of reputation.
While inertia and existing processes may discourage abandoning internally developed update solutions, they can also represent a real barrier to product evolution. Modern open-source solutions improve the user experience and make new use cases possible by enabling remote (OTA) updates of entire fleets.
It's a turning point best taken with expert guidance. Contact us to benefit from our expert advice on embedded Linux topics.