Cross product

Sun, 31 Oct 2021 15:47:10 +0000

I had cause this afternoon to remember the Monad Tutorial Fallacy, which has been summarised as saying that when you finally understand monads, you lose the ability to explain them to others.

I hypothesise that the same is probably true of cross-compilation in the Nix package system, and therefore I present these notes not as a superior alternative to any of the existing documentation, but because I wrote them down to aid my understanding and now need to put them somewhere I can refer back to them.

So. Let's suppose we're building NixWRT. Most likely we're building on an x86-64 system, to produce an image which will run on a MIPS device. In the event that there are any programs in that image which generate code (which is unlikely as we're not shipping compilers), we want them also to generate MIPS code. Thus, in the standard terminology we have:

build: x86-64
host: MIPS
target: MIPS

(This naming convention comes from Autoconf, and so we are stuck with it. To make it make sense, consider the built product rather than the build process: we are describing a thing that was built on x86-64, is hosted on MIPS, and would - if it emitted any code - emit code that runs on MIPS)
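
For concreteness, here is a minimal sketch of how that triple might be requested from Nixpkgs and then read back. The MIPS config string is an assumption chosen for illustration, not necessarily what NixWRT itself passes. localSystem and crossSystem are the usual arguments to Nixpkgs, and buildPlatform, hostPlatform and targetPlatform are the usual stdenv attributes.

```nix
let
  pkgs = import <nixpkgs> {
    # The machine doing the building.
    localSystem = { system = "x86_64-linux"; };
    # The machine the built software will run on.
    # ASSUMPTION: an illustrative MIPS triple, not taken from NixWRT.
    crossSystem = { config = "mips-unknown-linux-musl"; };
  };
in
{
  # For ordinary packages in this set, target defaults to the same as host.
  build = pkgs.stdenv.buildPlatform.config;
  host = pkgs.stdenv.hostPlatform.config;
  target = pkgs.stdenv.targetPlatform.config;
}
```

Saved as, say, platforms.nix (a made-up filename), this should be inspectable with nix-instantiate --eval --strict platforms.nix.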

However, not all of the software we create (or depend on) will be needed on the MIPS system - some of it (e.g. build tools, compilers and other kinds of translators) will have to run on x86-64. So how do we keep track of it all?

Let's look at some examples:

Package A contains source code which is translated into some other form using programs provided by package B (e.g. B provides an SVG to PNG converter). The programs in B must run on the build machine: thus, the host for B is the build for A. Provided that B is not generating executable code - i.e. we don't have to worry about the target - we represent this by including B in the nativeBuildInputs attribute of A's derivation.
Package A contains source code which is compiled using programs provided by package B for execution on the build system. For example, B is a C compiler which we are using to build nconf, which we will then run to create the .config file that the Linux kernel build process uses. In this case the program in B must run on the build system of A and also must target the build system of A. We represent this by putting B in the depsBuildBuild attribute of A's derivation.
If we have a package A which depends at runtime on another package B (e.g. the build for A creates a shell script, one of whose commands is provided by B), both those derivations have the same host. In this case B doesn't care about the target, so we add it to the buildInputs for A (if it does care, that's more complicated). As the developers of A we must ensure that programs in B are reachable from A, either by embedding the full pathname of B into the script or by using a wrapper that sets $PATH.
If A requires at run-time some source code contained in B (e.g. A is a script for some interpreter, and B is a source-distributed library for it) then B has no host to speak of. If there is any native code component in that library, though, it must be code that runs on the same system as A's host - so buildInputs again. See abcde for an example. Note also the wrapProgram call which sets PERL5LIB to ensure that the code in A can find the code in B at runtime.
If A depends at build time on source code contained in B (suppose the build invokes a Ruby script, and B is a gem required by that script) then B must be runnable on the build system of A. Host(B) = Build(A) implies nativeBuildInputs unless there are some target shenanigans. Consulting the manual, it seems that for some interpreters there is support for adding the files in B to the interpreter's search path while A is built.
If A is a program that runs on the host, and is linked to binary static libraries provided by B, the host for B must be the same as for A, so my reading is that this is buildInputs. Note that A must be able to find B at compile time, which is handled by the CC Wrapper adding appropriate flags.
If A is a program that runs on the host, and depends on binary shared libraries provided by B, the host for B must be the same as for A, so this is similar to the previous case. The absolute pathname of the shared library provided by B will be embedded into the binary of A.
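
A rough sketch of how several of these cases might look side by side in one derivation follows. Package A here is entirely hypothetical (name, source and installed script included), and the choice of librsvg, openssl and coreutils as stand-ins for "package B" is mine, for illustration only. It assumes the file is loaded with callPackage, so that each dependency list receives the variant built for the right platform.

```nix
{ lib, stdenv, buildPackages, makeWrapper, librsvg, openssl, coreutils }:

stdenv.mkDerivation {
  pname = "example-a";   # hypothetical package A
  version = "0.1";
  src = ./.;             # placeholder source

  # Tools that run on A's build machine and emit no executable code:
  # an SVG to PNG converter, and the helper that provides wrapProgram.
  nativeBuildInputs = [ librsvg makeWrapper ];

  # A C compiler that runs on the build machine AND emits code for the
  # build machine, e.g. to compile a helper that is run during the build.
  depsBuildBuild = [ buildPackages.stdenv.cc ];

  # Libraries A links against: same host as A.
  buildInputs = [ openssl ];

  # Runtime dependency on programs from another package: make them
  # reachable by wrapping the (hypothetical) installed script.
  postInstall = ''
    wrapProgram "$out/bin/example-a" \
      --prefix PATH : ${lib.makeBinPath [ coreutils ]}
  '';
}
```

The point of the sketch is only which attribute each kind of dependency lands in. Whether a real package needs all of them at once is another matter.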

Why am I caring about this right now? I rearranged bits of NixWRT and updated it to a recent Nixpkgs revision, causing OCaml to stop building for reasons I didn't fully understand.

So, here is what I think is happening:

The kernel is being built on x86-64 (build) for execution on MIPS (host). It doesn't generate code when it's run (er, as far as I know - at any rate it can't be configured to generate code for some other system than the one it's running on) so I don't care about target.
To create the kernel source tree we run Coccinelle on x86-64 - so Coccinelle's host is the kernel's build. Coccinelle produces C source files, so again I don't care about target. This means we should use nativeBuildInputs to declare it as a dependency.
Coccinelle is written in OCaml, which is a compiler and generates code for some target system according to its build options. This means that OCaml's host and target are both Coccinelle's build system, so we use depsBuildBuild to declare it as a dependency.
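
Written out as Nix, the reading above might look something like the following. Both derivations are hypothetical stand-ins (names and source paths are placeholders), and this is a sketch of my reasoning rather than a description of how Nixpkgs actually packages Coccinelle or the kernel.

```nix
{ stdenv, ocaml, coccinelle }:

{
  # Hypothetical stand-in for a tool like Coccinelle: per the reading
  # above, its compiler runs on, and targets, the tool's build system.
  toolWrittenInOCaml = stdenv.mkDerivation {
    pname = "tool-written-in-ocaml";
    version = "0";
    src = ./tool;          # placeholder path
    depsBuildBuild = [ ocaml ];
  };

  # Hypothetical derivation that prepares the kernel source tree.
  # Coccinelle runs on the kernel's build machine and emits only C
  # source, so target does not matter: nativeBuildInputs.
  kernelTree = stdenv.mkDerivation {
    pname = "kernel-tree";
    version = "0";
    src = ./linux;         # placeholder path
    nativeBuildInputs = [ coccinelle ];
    dontBuild = true;
    # The semantic patches would be applied before this copy.
    installPhase = ''
      cp -r . $out
    '';
  };
}
```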

Clear? If this doesn't help, I invite you to consider the possibility that cross-compilation is like a burrito.