Three years ago, Graham Christensen sent a cold email on behalf of NixOS: “Look, I need a lot of hardware—for free.”
“There was probably a one in 100 chance of that working,” says Christensen. “I sent it to a dozen different companies. Only one replied: It was Packet, and it was an exuberant yes.”
NixOS, a Linux distribution, is built around a declarative configuration manager that also provides atomic upgrades and rollbacks. The open source project allows for software reproducibility by compiling and building software hundreds of thousands of times each week, ranging from minor packages to mission-critical software.
“You can point NixOS at pretty much anything and get it to work,” says Christensen, whose team of volunteers operates the NixOS build farm and the build servers.
When NixOS started in 2005, it relied on grants to fund the infrastructure used to manage all the builds. “We had a set amount of hardware,” explains Christensen, who only had a small budget. “It worked, it was managed manually, and it didn’t change very often.”
At a time when the project’s servers were becoming unstable, Christensen says Packet gave him “a blank check” for as much hardware as the team needed, which allowed it to evolve NixOS’s infrastructure from a static style of servers reminiscent of the early 2000s to a more modern, cloud native one.
It was through Packet that Christensen also discovered Works on Arm, a collaboration between Packet and Arm that encourages the adoption of the Armv8 architecture in data centers.
Christensen calls NixOS and Works on Arm “a natural fit.”
“Works on Arm cares a lot about the software ecosystem and is focused on making various software work on Arm,” says Christensen. “And NixOS—both the community and the companies behind it—are quite committed: Once you have a defined problem, we’re very good about attacking that problem.”
Within a few months of getting Works on Arm sponsorship, the NixOS infrastructure team had 20,000 to 30,000 packages building on Arm and ready to go for its user base. Christensen also leveraged the Packet API to create servers, set up monitoring, and import systems into the NixOS infrastructure.
The results were immediately clear. “One of the hard things with a project the size of NixOS is that with tens of thousands of software packages, it’s very difficult for somebody to casually maintain it,” says Christensen. “If something breaks, it can take several days to see if a fix actually worked when you’re on small, cheap hardware. With the generosity of the Works on Arm project, instead of taking several days to validate a fix, we tackle the order in a matter of minutes to hours.”
While NixOS started with an academic focus, commercial IoT and edge use cases are now increasingly in play.
In European cities where bike culture is thriving, there is designated parking for cyclists as well as electronic signs throughout the city that show how many spots are available in each lot. The company behind that service uses Nix and NixOS to build, deploy, and manage all of its infrastructure. “They have hundreds of embedded Arm devices running NixOS using software built by these machines provided by Works on Arm that help bicyclists park their bike,” says Christensen.
And for edge deployments with a high degree of variability, such as deploying software to various cities, NixOS allows users to establish a base level of configuration that keeps systems as coherent, simple, and consistent as possible. “NixOS really excels in terms of its ability to deal with various types of systems,” says Christensen. So even if a project lacks identical hardware for executing one function everywhere, “one of the things that NixOS lets you do is write down and keep track of the basic job and then tune it for every type of device that you have in the field so that it works correctly.”
And if it doesn’t, users know prior to production. “Usually you don’t find out that your package is broken until you’re actually deploying it to production,” says Christensen. But with NixOS, “even if you have 10 different types of systems or 100 that you’re deploying to, you’re able to build and test that locally in your office or on your laptop, without touching anything in production.”
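The idea of one base configuration tuned per device type, and checked before anything reaches production, can be sketched in a few lines. This is only an illustrative analogy in Python: NixOS itself expresses configuration in the Nix language, and the device names, settings, and the `merge` and `validate` helpers below are all hypothetical.

```python
from copy import deepcopy


def merge(base, override):
    """Return a new dict: `base` with `override` applied recursively."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


def validate(config):
    """Hypothetical local check, run before any device is touched."""
    if config["services"]["sensor"]["interval_s"] <= 0:
        raise ValueError("sensor interval must be positive")
    if not config["network"]["interface"]:
        raise ValueError("a network interface is required")


# The "basic job", written down once (all values are made up).
BASE = {
    "services": {"sensor": {"enable": True, "interval_s": 60}},
    "network": {"interface": "eth0"},
}

# Per-device tuning: only the differences from the base are recorded.
DEVICE_OVERRIDES = {
    "board-a": {},
    "board-b": {"network": {"interface": "wlan0"}},
    "board-c": {"services": {"sensor": {"interval_s": 300}}},
}

# Build and validate every variant locally.
configs = {}
for name, override in DEVICE_OVERRIDES.items():
    config = merge(BASE, override)
    validate(config)
    configs[name] = config
```

Because each device records only its differences, a change to the base reaches every variant, and a broken variant fails validation on the laptop rather than in the field.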
Another use case is a company that monitors shipping channels near a sizable country using physical systems such as small servers and embedded devices located on piers, on buoys, and in the ocean itself.
When the internet connection isn’t optimal, “they found that these remote servers were easy to break,” says Christensen. If an update was interrupted or restarted halfway through while the server was not in a good state, problems arose and “they would have to physically fly people to this remote place and ship them out on a ferry to figure out what happened. In some cases it could take well over a week to just get to the system they needed to fix.”
The model that NixOS uses to deploy and manage software and servers reserves the most dangerous or risky work of the deployment until the very end. “Our model is much safer for things like edge deployments, IoT, things that are embedded and in the field that maybe don’t have a reliable internet connection,” says Christensen.
After the NixOS team introduced its deployment model, including the safe rollback of a failed deployment, which the Works on Arm machines help facilitate, “the client ended up giving NixOS a try,” he says. “Now all of their monitoring stations in the ocean run NixOS to make sure the shipping channels are safe.”
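The principle of leaving the risky step until the very end can be illustrated with a small sketch. It is a simplified analogy, not NixOS’s actual implementation: each release is written in full to its own directory, and only then is a `current` symlink repointed in a single atomic rename. The directory layout and function names are assumptions made for this example, and it presumes a POSIX system.

```python
import os
import tempfile


def build_generation(root, number, files):
    """Safe phase: write a complete new generation into its own directory."""
    gen_dir = os.path.join(root, f"generation-{number}")
    os.makedirs(gen_dir)
    for name, content in files.items():
        with open(os.path.join(gen_dir, name), "w") as f:
            f.write(content)
    return gen_dir


def activate(root, gen_dir):
    """Risky phase, kept last: atomically repoint 'current' at gen_dir."""
    link = os.path.join(root, "current")
    tmp_link = link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clear a leftover from an interrupted run
    os.symlink(os.path.basename(gen_dir), tmp_link)
    os.replace(tmp_link, link)  # a rename is atomic on POSIX filesystems


def current_generation(root):
    """Return the name of the generation that 'current' points to."""
    return os.readlink(os.path.join(root, "current"))


root = tempfile.mkdtemp()

gen1 = build_generation(root, 1, {"app.conf": "version=1\n"})
activate(root, gen1)

# If power or connectivity is lost while generation 2 is being written,
# 'current' still points at the complete, working generation 1.
gen2 = build_generation(root, 2, {"app.conf": "version=2\n"})
activate(root, gen2)

# Rolling back is the same cheap operation pointed at the old generation.
activate(root, gen1)
```

An interruption at any point before the final rename leaves the running system untouched, which is what makes this style of deployment attractive for devices that are expensive to reach.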
Given this valuable access to Arm servers, says Christensen, “our community has been working on improving Arm, and it has grown to well over 100 people who are actively working on it.”
Now if Packet has an experimental kit of hardware or they inherit some sleds from a manufacturer, the NixOS community is ready and willing to put them to the test.
“We have a community that understands Arm and is able to dive into the nitty gritty of how Linux works, how processors work, how hardware works. We could find a way to put it into our build farm and run 100,000 jobs on it within a month, or any high number of jobs that really stresses the machine to see if it’s behaving and performs well,” says Christensen.
With a more stable infrastructure, NixOS has been able to start experimental research projects such as Mobile NixOS, an initiative to get NixOS running on cell phones. “We have at least one person working on it full time, and that’s also been enabled by the Works on Arm hardware.”
With the support from Packet and Works on Arm, Christensen sees the potential of NixOS as limitless. “Ultimately I would like to see NixOS running in space,” he says. “That’s my pie-in-the-sky dream.”