Tuesday, December 30, 2008

Interesting discussion about Linux and the GPL...

I was reading up on binary-only modules for Linux (e.g. the nVidia drivers), and I wondered:

The Linux headers are GPL'ed. So what happens if I write a program that uses those headers but is not a kernel module?

This led me to the following 2003 discussion on KernelTrap:

Linux: The GPL And Binary Modules

What caught my attention most was Linus' closing statement.

From: Linus Torvalds [email blocked]
Subject: Re: Linux GPL and binary module exception clause?
Date: Thu, 4 Dec 2003 17:58:18 -0800 (PST)

On Thu, 4 Dec 2003, Larry McVoy wrote:
> >
> > linux/COPYING says: This copyright does *not* cover user programs
> > that use kernel services by normal system calls - this is merely
> > considered normal use of the kernel, and does *not* fall under
> > the heading of "derived work".
> Yeah, and the GPL specifically invalidates that statement. We're on thin
> ice here. Linus is making up the rules, which is cool (since I tend to
> like his rules) but the reality is that the GPL doesn't allow you to
> extend the GPL. It's the GPL or nothing.

Larry, you are wrong.

The license _IS_ the GPL. There's no issue about that. The GPL rules apply.

But a license only covers what it _can_ cover - derived works. The fact
that Linux is under the GPL simply _cannot_matter_ to a user program, if
the author can show that the user program is not a derived work.

And the linux/COPYING addition is not an addition to the license itself
(indeed, it cannot be, since the GPL itself is a copyrighted work, and so
by copyright law you aren't allowed to just take it and change it).

No, the note at the top of the copying file is something totally
different: it's basically a statement to the effect that the copyright
holder recognizes that there are limits to a derived work, and spells out
one such limit that he would never contest in court.

See? It's neither a license nor a contract, but it actually does have
legal meaning: look up the legal meaning of "estoppel" (google "define:"
is quite good). Trust me, it's got _tons_ of legal precedent.


So yes, userspace programs *ARE* allowed to #include the kernel headers.
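
As a minimal sketch of what "normal use of the kernel" looks like from user space: the program below reaches the kernel only through a system call, using a syscall number that ultimately comes from the kernel's exported headers. It assumes a Linux system with glibc; raw_getpid and pid_self_check are names I made up for the example.

```c
/* Sketch only: assumes Linux + glibc. The helper names are hypothetical. */
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>   /* SYS_getpid; the numbers come from the kernel's headers */
#include <sys/types.h>
#include <unistd.h>        /* syscall(), getpid() */

/* Ask the kernel for our process ID through the generic syscall() wrapper. */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}

/* Sanity check: the raw system call and the libc wrapper should agree. */
void pid_self_check(void)
{
    assert(raw_getpid() == getpid());
}
```

Nothing here links against kernel code; the program only asks the running kernel for a service, which is the case the COPYING note describes.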

Later, some anonymous guy added this brilliant piece of insight:

Including Kernel Header Files
December 8, 2003 - 6:23am

It seems that most of the people on this list have forgotten their compiler design course. If the kernel file only contains things like variable declarations and function prototypes then it will NOT end up in the object code after compilation. It will be used to inform the compiler of things like how much memory to set aside for a particular variable or to make sure that the correct number of arguments are passed to a function in the correct order. None of that type of "code" from a ".h" file ends up in the executable (unless debugging code is left in - but that's a little bit of a different story).

Where things get gooey, as was being pointed out in one of the comments in the original story, is when you start to include ".h" files that contain things like macro definitions and inline functions. Those two things DO end up in the compiled object/executable code that the compiler produces. When a macro is used in the source code and a compiler comes across it it will be expanded; in other words, the macro in the source will be expanded, or replaced, by the macro's definition as was given in the ".h" file and then it will be compiled and then the object code will be written to the output file. At that point the argument can be made that the GPLed "code" in the kernel's ".h" file made it into the final user space program.

A similar thing happens with inline functions. By definition of inline, the function is expanded inline. This means that when the compiler comes across the function call it is expanded inline; it is NOT called in the normal way that a function is called. A normal function is called by placing the parameters, or variables that you are sending to the function, on the stack, along with the return address and a few other things, and then jumping to the location in memory where the function is stored (by placing a new value in the IP, or Instruction Pointer, register in the CPU). When the function has completed it will issue a "ret" to return to where the function was called from. This is accomplished by looking at the return address that you previously stored on the stack and loading it into the IP register on the CPU.

As you can see this whole process is kind of a pain in the ass; in computer speak: it takes a long time. Macros and inline functions were employed as a way to avoid this process for very simple functions. First came macros which worked well but were sometimes difficult to predict exactly how the compiler would expand them. Further they provided no way to type check the variables that were being sent to them. The solution was inline functions. Instead of having one copy of the function in memory like a normal function, the "code" of an inline function is placed in the program everywhere it is called from. This makes the program a little bit larger but makes it execute much faster because it doesn't have to put a bunch of stuff on the stack and jump all over the place in memory. And macros are expanded and then compiled and put in the code everywhere they are called as well.

This is where the apprehension of including the kernel's ".h" files comes from. If the ".h" file only has function prototypes and variable declarations (and the debugging code is stripped out) then none of it will be in the compiled program. It is used by the compiler and then discarded. But, if the kernel's ".h" file has things like macros and inline functions then some of the kernel's GPLed code will make it into your compiled program and your program will become a derived work and thus must be licensed under the GPL.
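
To make the commenter's distinction concrete, here is a small sketch in C. Everything in it is hypothetical (the demo_ names are mine, not from any kernel header): the first part stands in for what a ".h" file might contain, the second part for the user program that includes it.

```c
#include <assert.h>
#include <stddef.h>

/* ---- stand-in for a hypothetical header, "demo.h" ---- */

/* Declarations: describe a type and a signature, emit no code themselves. */
struct demo_point { int x; int y; };
int demo_largest_clamped(const int *values, size_t count, int low, int high);

/* Macro: the preprocessor pastes this text into every place it is used. */
#define DEMO_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Inline function: the compiler may copy this body into each caller. */
static inline int demo_clamp(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}

/* ---- stand-in for the user program ---- */

/* Clamp each value into [low, high] and return the largest result,
 * or low if the array is empty. Both the macro text and (possibly) the
 * inline body end up inside this function's object code. */
int demo_largest_clamped(const int *values, size_t count, int low, int high)
{
    int best = low;
    for (size_t i = 0; i < count; i++) {
        int clamped = demo_clamp(values[i], low, high);
        best = DEMO_MAX(best, clamped);
    }
    return best;
}
```

Running the file through gcc -E shows the DEMO_MAX text already substituted into demo_largest_clamped before compilation proper starts. Whether demo_clamp is really expanded in place is up to the compiler: inline is a request, and gcc typically ignores it when optimization is turned off, so comparing the gcc -S output at -O0 and -O2 is the easiest way to see the difference.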


Interesting, isn't it?
