Cross product
Sun, 31 Oct 2021 15:47:10 +0000
I had cause this afternoon to remember the Monad Tutorial Fallacy, which has been summarised as saying that when you finally understand monads, you lose the ability to explain them to others.
I hypothesise that the same is probably true of cross-compilation in the Nix package system, and therefore I present these notes not as a superior alternative to any of the existing documentation, but because I wrote them down to aid my understanding and now need to put them somewhere I can refer back to them.
So. Let's suppose we're building NixWRT. Most likely we're building on an x86-64 system, to produce an image which will run on a MIPS device. In the event that there are any programs in that image which generate code (which is unlikely as we're not shipping compilers), we want them also to generate MIPS code. Thus, in the standard terminology we have:
- build: x86-64
- host: MIPS
- target: MIPS
(This naming convention comes from Autoconf, and so we are stuck with it. To make it make sense, consider the built product rather than the build process: we are describing a thing that was built on x86-64, is hosted on MIPS, and would - if it emitted any code - emit code that runs on MIPS)
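As a rough sketch of how these three platforms show up when you ask Nixpkgs for a cross package set: the expression below assumes a `<nixpkgs>` channel is available, and the MIPS triple is an illustrative guess rather than whatever NixWRT actually configures.

```nix
# Sketch only: the crossSystem triple is an assumption for illustration.
let
  pkgs = import <nixpkgs> {
    crossSystem = { config = "mips-unknown-linux-musl"; };
  };
in {
  # Where the build itself runs (the machine evaluating this).
  build = pkgs.stdenv.buildPlatform.config;
  # Where the packages in this set will run.
  host = pkgs.stdenv.hostPlatform.config;
  # What those packages would emit code for, if they emit code at all.
  target = pkgs.stdenv.targetPlatform.config;
}
```

Within such a package set, `pkgs.buildPackages` is the corresponding set of packages that run on the build platform.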
However, not all of the software we create (or depend on) will be needed on the MIPS system - some of it (e.g. build tools, compilers and other kinds of translators) will have to run on x86-64. So how do we keep track of it all?
Let's look at some examples:
- Package A contains source code which is translated into some other form using programs provided by package B (e.g. B provides an SVG to PNG converter). The programs in B must run on the build machine: thus, the `host` for B is the `build` for A. Provided that B is not generating executable code - i.e. we don't have to worry about the target - then we represent this by including B in the `nativeBuildInputs` attribute of A's derivation.
- Package A contains source code which is compiled using programs provided by package B for execution on the build system. For example, B is a C compiler which we are using to build `nconf`, which we will then run to create the `.config` file that the Linux kernel build process uses. In this case the program in B must run on the build system of A and also must target the build system of A. We represent this by putting B in the `depsBuildBuild` attribute of A's derivation.
- If we have a package A which depends at runtime on another package B (e.g. the build for A creates a shell script, one of whose commands is provided by B), both those derivations have the same `host`. In this case B doesn't care about the target, so we add it to the `buildInputs` for A (if it does care, that's more complicated). As the developers of A we must ensure that programs in B are reachable from A, either by embedding the full pathname of B into the script or using a wrapper that sets $PATH.
- If A requires at run-time some source code contained in B (e.g. A is a script for some interpreter, and B is a source-distributed library for it) then B has no `host` to speak of. If there is any native code component in that library, though, it must be code that runs on the same system as A's host - so `buildInputs` again. See abcde for an example. Note also the `wrapProgram` call which sets `PERL5LIB` to ensure that the code in A can find the code in B at runtime.
- If A depends when it is built on source code contained in B (suppose: the build invokes a Ruby script, and B is a gem required by that script) then B must be runnable on the build system of A. Host(B) = Build(A) implies `nativeBuildInputs` unless there are some target shenanigans. Consulting the manual, it seems that for some interpreters there is support for adding the files in B to the interpreter's search path while A is built.
- If A is a program that runs on the host, and is linked to binary static libraries provided by B, the `host` for B must be the same as for A, so my reading is that this is `buildInputs`. Note that A must be able to find B at compile time, which is handled by the CC Wrapper adding appropriate flags.
- If A is a program that runs on the host, and depends on binary shared libraries provided by B, the `host` for B must be the same as for A, so this is similar to the previous case. The absolute pathname of the shared library provided by B will be embedded into the binary of A.
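To make the cases above concrete, here is a minimal sketch of a derivation for a made-up package A. Everything specific in it is an assumption for illustration: the package name, the choice of `librsvg` as the SVG-to-PNG converter, `zlib` as the linked library, `coreutils` as the runtime command provider, and the existence of an installed `$out/bin/example-a` script.

```nix
# Hypothetical package A; dependency choices are illustrative, not from NixWRT.
{ lib, stdenv, buildPackages, makeWrapper, librsvg, zlib, coreutils }:

stdenv.mkDerivation {
  pname = "example-a";
  version = "0.1";
  src = ./.;

  # Runs on the build system AND targets the build system: a C compiler
  # used for helper programs that are executed during the build.
  depsBuildBuild = [ buildPackages.stdenv.cc ];

  # Run on the build system; the target does not matter for these.
  nativeBuildInputs = [ librsvg makeWrapper ];

  # Run on (or are linked for) the same host as A.
  buildInputs = [ zlib coreutils ];

  # Make the runtime dependency reachable: wrap the (assumed) script so
  # that the host-platform coreutils is on its PATH.
  postFixup = ''
    wrapProgram $out/bin/example-a \
      --prefix PATH : ${lib.makeBinPath [ coreutils ]}
  '';
}
```

The wrapper here takes the "set $PATH" route; substituting the absolute store path into the script at build time would be the other option mentioned above.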
Why am I caring about this right now? I rearranged bits of NixWRT and updated it to a recent Nixpkgs revision, causing OCaml to stop building for reasons I didn't fully understand.
So, here is what I think is happening:
- The kernel is being built on x86-64 (build) for execution on MIPS (host). It doesn't generate code when it's run (er, as far as I know - at any rate it can't be configured to generate code for some other system than the one it's running on) so I don't care about target.
- To create the kernel source tree we run Coccinelle on x86-64 - so Coccinelle's `host` is the kernel's `build`. Coccinelle produces C source files, so again I don't care about `target`. This means we should use `nativeBuildInputs` to declare it as a dependency.
- Coccinelle is written in OCaml, which is a compiler and generates code for some target system according to its build options. This means that OCaml's `host` and `target` are both Coccinelle's `build` system, so we use `depsBuildBuild` to declare it as a dependency.
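The kernel side of that reasoning might look something like the following. This is a sketch under stated assumptions, not NixWRT's actual code: the derivation name and the `linuxSrc` and `semanticPatch` arguments are hypothetical, and it only covers the Coccinelle step, not the OCaml dependency.

```nix
# Hypothetical derivation that prepares a patched kernel source tree.
# linuxSrc and semanticPatch are made-up arguments for illustration.
{ stdenv, coccinelle, linuxSrc, semanticPatch }:

stdenv.mkDerivation {
  pname = "patched-kernel-tree";
  version = "0";
  src = linuxSrc;

  # Coccinelle runs on the build system. When this file is loaded with
  # callPackage, listing it here selects the build-platform variant.
  nativeBuildInputs = [ coccinelle ];

  dontConfigure = true;

  # Apply the semantic patch to the unpacked tree in place.
  buildPhase = ''
    spatch --sp-file ${semanticPatch} --in-place --dir .
  '';

  installPhase = ''
    cp -r . $out
  '';
}
```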
Clear? If this doesn't help, I invite you to consider the possibility that cross-compilation is like a burrito.