A Few Notes About macOS CI
Update 2020-12-03 17:26
After searching a little and with help from Matt I was able to find the resource that stated that GitHub Actions is using MacStadium. The current version was wiped of that information, but thanks to the Internet Archive I was able to find the actual information as written out and disclosed by them.
If you haven’t yet, I would recommend donating to the Internet Archive. I just sent them $25.85.
Following the Amazon AWS announcements that they will be joining the small circle of providers offering Macs in their datacentres, the topic of Mac hosting, macOS CI and which provider to pick has been widely discussed once again. The overlap with the release of M1-based Macs, and the still only half-to-poorly answered question of how to virtualize another OS on these Macs, led to various very interesting blog posts and public conversations on Twitter. Peter Steinberger sent me a DM asking me to fact-check his article, and I figured I’d do this in public for others to read as well. I think he nailed pretty much everything but I wanted to add a few things, starting by addressing his conclusion.
There’s no one-size fits-all solution when it comes to running macOS in the cloud. Both virtualization technology and bare metal are valid choices depending on organizational structure and requirements, but we hope this has given you a good overview of what’s possible.
He is absolutely correct here and missing critical information at the same time. There is no one-size-fits-all solution to this, but there will always be one reason to choose virtualization: ephemeral builds.
These two magic words make anybody dealing with CI/CD infrastructure very excited for a very simple reason: predictability. It’s something our industry has seemingly been chasing for decades now, and thanks to modern container technologies this goal is within reach, or has been partially reached, on the Linux side of things by the big CI-as-a-service providers. An ephemeral build means nothing more than always starting with the same environment. It provides a clean, predictable environment for your tests to run or your software to be packaged up and deployed. No lingering artifacts, crashed simulators or other leftovers should exist that could disturb your fragile CI/CD pipeline. On the macOS side of things it’s our Linux on the Desktop. Next year will be the year, I am certain this time.
The way to achieve ephemeral builds “fairly” easily is by virtualizing the OS. A new VM is created every time a new build is started, based on a copy of an environment that was set up specifically for your needs beforehand. I know that this setup is possible with VMware vSphere as well as with KVM (MacStadium Orka). I helped maintain CircleCI’s VMware setup and built my own crazy little setup with KVM prior to the release of Orka, though I didn’t use it for long. I think this would be possible with Veertu Anka as well but I have not tried it (yet).
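To make the clone, build, destroy cycle a bit more concrete, here is a minimal sketch in Python. It is only an analogy: a directory copy stands in for cloning a VM from a prepared base image, and it does not call vSphere, Orka or Anka. The names `ephemeral_environment` and `run_build` are made up for illustration.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_environment(template: Path):
    """Yield a fresh copy of `template` and throw it away afterwards.

    The directory copy is a stand-in for cloning a VM from a base image.
    """
    workdir = Path(tempfile.mkdtemp(prefix="build-"))
    clone = workdir / "env"
    try:
        shutil.copytree(template, clone)
        yield clone
    finally:
        # Destroy the clone even if the build failed, so nothing lingers.
        shutil.rmtree(workdir, ignore_errors=True)


def run_build(env: Path) -> None:
    # Hypothetical build step that leaves an artifact behind in its environment.
    (env / "build-output.txt").write_text("done")


if __name__ == "__main__":
    # Prepare a throwaway "base image" once, then run two builds against it.
    base = Path(tempfile.mkdtemp(prefix="base-image-"))
    (base / "toolchain.txt").write_text("preinstalled tools")
    for _ in range(2):
        with ephemeral_environment(base) as env:
            run_build(env)
    # The base image never sees the artifacts of earlier builds.
    print(sorted(p.name for p in base.iterdir()))
    shutil.rmtree(base)
```

The important property is that every build starts from the same untouched template and that whatever the build leaves behind is deleted with the clone, whether the build succeeded or not.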
A lot of MacStadium customers were interested in having this exact setup but had no idea how to get there or how to maintain it and ended up using virtualization without an ephemeral build system. They basically gave up before they even got there or a few months into it. The answer as to why mostly comes down to missing tooling, especially from Apple. You are playing Mac admin on extra hard mode and if that isn’t your day-to-day job it may be very hard to find the motivation to keep things running.
I had only started working on it when I left MacStadium and got very distracted since, but I am still absolutely convinced that ephemeral builds are possible on a bare metal system with APFS snapshots. Maybe I am wrong about it, but in theory this should be possible. If you know more I’d love to hear from you!
Virtualizing macOS will always present you with weird bugs since you are interacting with macOS in a way which it absolutely hates. The OS fundamentally wants to be operated by a human with a keyboard and a mouse attached to it, not by various automation tricks and scripts. Look at the state of macOS Automation and tell me that I am wrong. The guy that ran that entire thing for decades was fired (if I recall correctly) and left to join Omni and do cool things there. It perfectly shows that Apple’s leadership has no idea what to do with their best talent even if they already live in the Bay Area and have been working for them for years knowing all the ins and outs of how to navigate Apple internally.
Just don’t. You will be so much happier if you just don’t do that.
From Peter’s post:
I haven’t found a single writeup that takes price into consideration when discussing macOS virtualization. This is in some ways understandable, as most articles are from large companies, and engineers aren’t included in their price decisions. However, for smaller teams without venture capital, it’s an important metric.
Total cost is difficult to measure, since the promise of virtualization is less ongoing work, which should translate to reduced ongoing maintenance costs, and often employees are the most expensive cost factor.
Price mostly comes down to what your time is worth to you and/or whether or not you can find somebody capable and willing to maintain this infrastructure for you. Can you find somebody if that person leaves your company? What does it take to retain that talent and is there room for personal growth in this area?
To give you a real-world example of my own: after giving a talk about exactly this topic at Otto, which is a MacStadium customer, I was approached by engineers from four different companies within 15 minutes asking if I could maintain their infrastructure for them as a contractor. There is a very good reason why this market is as underserved as it is.
Comparing Anka and Orka
The biggest (potential) upside to me with Anka is that you have the ability to turn all your VMs off and run a big build on bare metal macOS. This would require a CI runner setup like the one GitLab has had for a long time, or Buildkite, and a sane CI tagging system. Maybe this would even be a way for some companies to run Android builds on the same host and get around nested virtualization, because after all I can’t stress enough that you should absolutely not try nested virtualization.
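What I mean by a sane tagging system can be sketched in a few lines of Python. This is a generic illustration and not the GitLab or Buildkite syntax: a job lists the tags it needs, and only runners carrying all of those tags may pick it up. The runner names, job names and tags below are all made up.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Runner:
    name: str
    tags: frozenset[str]


@dataclass(frozen=True)
class Job:
    name: str
    required_tags: frozenset[str]


def eligible_runners(job: Job, runners: list[Runner]) -> list[Runner]:
    """Return the runners that carry every tag the job asks for."""
    return [r for r in runners if job.required_tags <= r.tags]


if __name__ == "__main__":
    # Hypothetical fleet: one runner inside a VM and one on the host itself.
    runners = [
        Runner("mini-01-vm", frozenset({"macos", "vm"})),
        Runner("mini-01-metal", frozenset({"macos", "bare-metal"})),
    ]
    jobs = [
        Job("unit-tests", frozenset({"macos"})),
        Job("release-build", frozenset({"macos", "bare-metal"})),
    ]
    for job in jobs:
        names = [r.name for r in eligible_runners(job, runners)]
        print(job.name, "can run on", names)
```

With tags like these, the big build can be pinned to the bare metal runner while everyday jobs go to whichever runner is free. Actually shutting the VMs down before the bare metal build starts is a separate problem that this sketch does not cover.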
Fully Managed Services
Travis CI runs on machines at MacStadium and so does GitHub Actions. This was previously disclosed publicly by Microsoft when they ran their weird CI system, but I would bet a lot of money that all of that has since been rolled into GitHub Actions after the acquisition.
There is chatter about the changed macOS EULA and how this relates to these services and my uneducated guess is this: They aren’t going to go anywhere. Companies will continue renting entire machines and will offer build minutes and Apple will keep looking the other way. Maybe Apple will ask for some changes but the fundamental services will not go away.
All of these are expensive for various reasons, the biggest being the manual labour involved in getting hosts online. AWS is not going to bend over backwards to get you anything and is not solving any of the hard problems with maintaining Mac build infrastructure.
What about the Mac Pro?
While I adore the Mac Pro and the solution Apple came up with to sorta-kinda-resurrect the Xserve, I think the underlying product is too expensive. If you don’t need the extra RAM or really know what you’re doing with a virtualized setup I would not recommend it. My idea of leveraging Anka comes into play again, but I have not tried this yet or compared any numbers.
The current Mac Pro is overly expensive and over-engineered in ways that make it a very bad fit for datacentre use. I am assuming that the 19" rack mount version only exists for the music and video industry and that Apple gave little to no thought to it being used as a CI machine or living in a datacentre. In the words of a friend of mine at Apple about the Pro Display XDR: “This is not for you”.
This announcement from AWS left me a little confused. MacStadium has great compliance certifications, so I doubt that most customers were waiting for AWS to join this space for these reasons. The prices are also outrageous and individual support is probably a lot worse compared to MacStadium. Unless you have to go with AWS for whatever reason, I see no point in signing up with AWS because, as usual, their offering isn’t better than anything else by other vendors but they charge a 10x premium on it. I would recommend that you run macOS bare metal on Mac minis hosted by MacStadium if you can and use Buildkite to automatically kick off any builds. At that point you still haven’t solved any of the actually hard problems of dealing with macOS and CI, but I am sure that Peter will keep posting interesting things.