Tuesday, December 30, 2008

Interesting discussion about Linux and the GPL...

I was browsing the web reading about binary modules for Linux (e.g. the nVidia drivers), and I wondered:

The Linux kernel headers are GPL'd. So what happens if I write a program that makes use of those headers, but is not a kernel module?

This led me to the following 2003 discussion on KernelTrap:

Linux: The GPL And Binary Modules

What caught my attention most was Linus's closing statement.

From: Linus Torvalds [email blocked]
Subject: Re: Linux GPL and binary module exception clause?
Date: Thu, 4 Dec 2003 17:58:18 -0800 (PST)

On Thu, 4 Dec 2003, Larry McVoy wrote:
> >
> > linux/COPYING says: This copyright does *not* cover user programs
> > that use kernel services by normal system calls - this is merely
> > considered normal use of the kernel, and does *not* fall under
> > the heading of "derived work".
> Yeah, and the GPL specifically invalidates that statement. We're on thin
> ice here. Linus is making up the rules, which is cool (since I tend to
> like his rules) but the reality is that the GPL doesn't allow you to
> extend the GPL. It's the GPL or nothing.

Larry, you are wrong.

The license _IS_ the GPL. There's no issue about that. The GPL rules apply.

But a license only covers what it _can_ cover - derived works. The fact
that Linux is under the GPL simply _cannot_ matter to a user program, if
the author can show that the user program is not a derived work.

And the linux/COPYING addition is not an addition to the license itself
(indeed, it cannot be, since the GPL itself is a copyrighted work, and so
by copyright law you aren't allowed to just take it and change it).

No, the note at the top of the copying file is something totally
different: it's basically a statement to the effect that the copyright
holder recognizes that there are limits to a derived work, and spells out
one such limit that he would never contest in court.

See? It's neither a license nor a contract, but it actually does have
legal meaning: look up the legal meaning of "estoppel" (google "define:"
is quite good). Trust me, it's got _tons_ of legal precedent.


So yes, userspace programs *ARE* allowed to #include the kernel headers.

Later, some anonymous guy added this brilliant piece of insight:

Including Kernel Header Files
December 8, 2003 - 6:23am

It seems that most of the people in this discussion have forgotten their compiler design course. If the kernel header file only contains things like variable declarations and function prototypes, then it will NOT end up in the object code after compilation. It will be used to inform the compiler of things like how much memory to set aside for a particular variable, or to make sure that the correct number of arguments are passed to a function in the correct order. None of that type of "code" from a ".h" file ends up in the executable (unless debugging code is left in - but that's a little bit of a different story).

Where things get gooey, as was pointed out in one of the comments on the original story, is when you start to include ".h" files that contain things like macro definitions and inline functions. Those two things DO end up in the object/executable code that the compiler produces. When the compiler comes across a macro in the source code, the macro is expanded - that is, replaced by its definition as given in the ".h" file - then compiled, and the resulting object code is written to the output file. At that point the argument can be made that the GPLed "code" in the kernel's ".h" file made it into the final user-space program.

A similar thing happens with inline functions. By definition, an inline function is expanded in place: when the compiler comes across the call, it is NOT called in the normal way. A normal function is called by placing the parameters - the variables you are sending to the function - on the stack, along with the return address and a few other things, and then jumping to the location in memory where the function is stored (by placing a new value in the IP, or Instruction Pointer, register of the CPU). When the function has completed, it issues a "ret" to return to where it was called from, which is accomplished by taking the return address previously stored on the stack and loading it back into the IP register.

As you can see, this whole process is kind of a pain in the ass; in computer speak: it takes a long time. Macros and inline functions were introduced as a way to avoid this overhead for very simple functions. Macros came first; they worked well, but it was sometimes difficult to predict exactly how the compiler would expand them, and they provided no way to type-check the variables being passed in. The solution was inline functions. Instead of having one copy of the function in memory like a normal function, the "code" of an inline function is placed in the program everywhere it is called from. This makes the program a little bit larger, but much faster, because it doesn't have to push a bunch of stuff onto the stack and jump all over memory. Macros, likewise, are expanded, compiled, and placed in the code everywhere they are used.

This is where the apprehension of including the kernel's ".h" files comes from. If the ".h" file only has function prototypes and variable declarations (and the debugging code is stripped out) then none of it will be in the compiled program. It is used by the compiler and then discarded. But, if the kernel's ".h" file has things like macros and inline functions then some of the kernel's GPLed code will make it into your compiled program and your program will become a derived work and thus must be licensed under the GPL.


Interesting, isn't it?

Thursday, December 18, 2008

Ubuntu Virtualbox problems.

I hate it! Since I installed Ubuntu I've had nothing but trouble.

Let me show you what my problem is: I need to install VirtualBox at work to do some VM tests. My first choice was installing the newly-released 2.1, with support for OpenGL and whatnot.

The problem? VirtualBox 2.1 requires libqt-network >= 4.4.3. The version available in my distro (Hardy) is 4.4.0.

So I needed to install an earlier version of VirtualBox; 1.5.6-OSE seemed fine to me. So I ran it, and what happened? The vboxdrv module wasn't there. Alright, I browsed the web and found out I need to run /etc/init.d/vboxdrv setup

* Usage: /etc/init.d/vboxdrv {start|stop|restart|status}

WTF? Where's the setup command? Whatever, I was told to "apt-get install linux-headers-`uname -r`"

But guess what: my kernel version is 2.6.24-22-generic, and it turns out the available packages DON'T cover 2.6.24-22, just up to 2.6.24-21.

Is it because I installed Hardy and not Intrepid? But Hardy was supposed to be supported until 2010, why is this happening?

In MEPIS I never had these problems, even with the annoying beta bugs. I could install and run VirtualBox there with no hassle. I keep wondering why Ubuntu is so hyped as "the next big thing in Linux".

If I keep having these problems, I'll download and install Debian.


It seems the problem will be solved by adding the "proposed" packages to your repository options (I did it via Synaptic). Now it will install the 2.6.24-23-generic kernel (let's hope nothing gets screwed up).

Also, I found a post in the Ubuntu Hardy launchpad page regarding the 2.6.24-22 bug:

Steven Willis wrote on 2008-12-05:

It's even simpler than that:

sudo apt-get install virtualbox-ose-source
sudo module-assistant auto-install virtualbox-ose-source
sudo /etc/init.d/vboxdrv start

(the last step basically just loads the module with modprobe, but it also does a little bit of house keeping)

And you might only need to run the last two steps from above; according to the module-assistant man page:

"auto-install is followed by one or more packages desired for installation. It will run prepare to configure your system to build packages, get the package source, try to build it for the current kernel and install it."
Clem wrote on 2008-12-05:

Thanks Steven, it works!

Let's see what happens after I finish installing this stuff.

Monday, December 8, 2008

Python: Not for real software development

Some guys at work wanted to deploy some software they made in Python, and that's when the problems started. Turns out they want to embed all the libraries they used - include every one of them so that the end user won't have to lift a finger. Just run the binary installer, and voila (just like on Windows).

But guess what: it turns out Python doesn't let you choose which path certain libraries are loaded from (not the .py plugins, but the .so files the plugins depend on). They tried setting LD_LIBRARY_PATH (or whatever it's called, I don't remember) and other environment variables, and nothing worked.
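For what it's worth, the real culprit here is the dynamic linker: it reads LD_LIBRARY_PATH only once, at process startup, so setting it from inside an already-running Python does nothing for .so lookup. The usual trick is for the launcher to set the variable and re-exec itself before importing the bindings. A minimal sketch (all names here are hypothetical, not what the team actually used):

```python
import os
import sys

def relaunch_with_libs(libdir, exec_fn=os.execve):
    """Prepend libdir to LD_LIBRARY_PATH and re-exec the interpreter.

    The dynamic linker reads LD_LIBRARY_PATH only at process start,
    so a running Python can't change where its .so files come from;
    re-exec'ing with the variable set is the standard workaround.
    exec_fn is injectable so the logic can be exercised without
    actually replacing the process.
    """
    if os.environ.get("_APP_RELAUNCHED") == "1":
        return False  # second pass: the libs are already on the path
    env = dict(os.environ)
    old = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = libdir + (":" + old if old else "")
    env["_APP_RELAUNCHED"] = "1"  # guard against an exec loop
    exec_fn(sys.executable, [sys.executable] + sys.argv, env)
    return True
```

You would call relaunch_with_libs("/opt/myapp/lib") at the very top of the launcher script, before any import that drags in the native libraries.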

All because this particular program depends on some Python bindings to native libraries.

My question is, what the hell was Python made for? For quick-and-dirty configuration scripts to come by default in Linux distros? Maybe. For students to learn programming? Probably. For real software development? IN YOUR DREAMS.

See, everything is perfect in Python (metaphorically speaking, of course - NOTHING is ever perfect in Python) until you face the problem of deploying your Python program on a variety of machines running different Linux distros, and you want the program to run WITHOUT HAVING THE USER LIFT A FINGER. Turns out you can't. If the user needs to open the command line, you know you've failed.

So what should I compare Python to?

A sandbox for kids to make their sand castles. Sure, they can be wonderful castles - they can have bells and whistles! Even better, they can be LEGO castles! But try to move them away from the sandbox (your development environment), and they'll crumble.

A sandbox. That's what Python really is about, isn't it? Hey, at least in Java you could embed everything your program needed!

My Solution

My solution is simple: copy the most-used Python functions, classes, etc., and their parameters. And why stop there? You can copy the most useful functions from PHP too. Add a variant class - it's simple.

Then bundle all those functions and classes in "libeasycpp".

Voila. You get all the rapid prototyping you had in Python, and it will work on any setup because it's compiled to a binary executable!

There! Was it that difficult?

Wednesday, December 3, 2008

On Python, unit tests and braiiiiiiiiiiinssss

This has been a hectic week. I've been staying at work far too long because there's some... UGH EEW python UGH! ...work that I needed to finish.

And because bugs are really hard to catch in Python, I've even been losing sleep trying to fix at home what I didn't fix at work.

A few surprises about Python.

  • The good: There's a command line. Just type "python" and a prompt appears. Wow. If C++ had a command line to test your own files, it'd be neat.

  • The bad: At the office, the team began googling for ways to deliver python binaries with all the dependencies resolved. So we learned about python eggs and easy_install. Congratulations to the developer who did that - but guess what? C++ already solves your dependencies for free. Why can't I just write my C++ program? Whatever, the pay is worth it ;-)

  • The ugly: Python's archaic attempt at error checking makes it NECESSARY that you not only use the interpreter I mentioned above, but also write unit tests to catch all the runtime errors that might come up in your program.
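To make "the ugly" concrete, here's the kind of bug I mean (a made-up toy, not code from the office): Python happily accepts a misspelled name at startup, and only blows up when execution actually reaches it.

```python
def report(items):
    if not items:
        # misspelled helper name: Python won't complain
        # until this branch actually runs
        return handle_empty_list()
    return len(items)

print(report([1, 2, 3]))   # works fine: prints 3
try:
    report([])             # only NOW does the typo surface
except NameError as e:
    print("caught at runtime:", e)
```

A C++ compiler would reject the misspelled name before the program ever ran; here the broken branch sits unnoticed until some input triggers it.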
That leads me to my second topic for today: Unit tests.

Coding without Unit Tests is like playing Jenga(TM).

I've also realized that the way I've programmed all my life has been much less structured than I thought. Sure, my knowledge of separation of concerns, patterns, etc. has helped me a lot - making my code heavily resistant to exceptions, memory leaks, and so on. However, I haven't always been fond of writing tests for my programs, which leads to writing HUGE chunks of code that are difficult to debug.

So what happens if you DON'T write unit tests? You'll end up adding temporary chunks of code like this one:

curpos = curpos + calculate_something
# print "curpos right now is %d" % curpos
if curpos > len(s):
    # print "there was an error in here!"
    return None
some test code
additional test code
# commented test code
# more commented test code

# here's a huge chunk of temporary "debugging" code
here's a huge chunk of temporary "debugging" code
# here's a huge chunk of temporary "debugging" code
here's a huge chunk of temporary "debugging" code
here's a huge chunk of temporary "debugging" code
more code
more code
here's a huge chunk of temporary "debugging" code
# here's a huge chunk of temporary "debugging" code
more code
more code
# here's a huge chunk of temporary "debugging" code
# here's a huge chunk of temporary "debugging" code
more code
more code
more code
# here's a huge chunk of temporary code
# here's a huge chunk of temporary code
here's a chunk of temporary "debugging" code
here's a chunk of temporary "debugging" code
more code
more code
more code
more code
more code
more code
# another temporary test line
even more code
even more code
even more code
even more code
even more code
# here's old code, just in case
# here's old code, just in case
# here's old code, just in case
# here's old code, just in case
# here's old code, just in case
# here's old code, just in case
even more test code
even more test code
even more test code
clippy("hello there! Looks like you're \
trying to debug some code! Want some help?")

So THAT's what happens when you're not accustomed to writing unit tests. I know it very well, because that's how I've been programming for YEARS!
Just because you remove all that test code at the end doesn't make it good code.

So, how do you write good code? I found out just yesterday.

When I rewrote the Python program I was coding (a lightweight JSON parser, no less - which turned out to be completely unnecessary, since I could have written the configuration data in plain and simple .ini files, but I digress), I suddenly decided to write simple use cases (about 10 or 20 sets of three-liners, which made up around 40% of the lines in my code) to see what was going wrong.

Here's more or less what I wrote:

def myfunc():
    small chunks of code
    small chunks of code
    small chunks of code
    small chunks of code

def myfunc2():
    small chunks of code
    small chunks of code
    small chunks of code
    small chunks of code

def myfunc3():
    small chunks of code
    small chunks of code
    small chunks of code
    small chunks of code

def myhugefunc():
    huge chunks of code
    huge chunks of code
    huge chunks of code
    if blablablah:
        for blablablah:
            huge chunks of code
    huge chunks of code
# And that's it!

def unittests():
    # These are primitive tests that have to be examined
    # by hand - but they're still light years ahead
    # of the ugly Jenga(TM) code I posted above.
    print "Testing KNOWN_INPUT"
    print myfunc("KNOWN_INPUT")

    print "Testing KNOWN_INPUT2"
    print myfunc("KNOWN_INPUT2")

    print "Testing KNOWN_INPUT3"
    print myfunc("KNOWN_INPUT3")

    print "Testing KNOWN_INPUT"
    print myfunc2("KNOWN_INPUT")
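For what it's worth, Python's standard unittest module automates the "examine by hand" part. A minimal sketch of the same idea - myfunc here is a made-up stand-in, not my actual parser code:

```python
import unittest

def myfunc(s):
    # hypothetical small, testable chunk: normalize a token
    s = s.strip()
    if not s:
        return None
    return s.upper()

class MyFuncTests(unittest.TestCase):
    def test_known_input(self):
        self.assertEqual(myfunc("known_input"), "KNOWN_INPUT")

    def test_strips_whitespace(self):
        self.assertEqual(myfunc("  hi  "), "HI")

    def test_empty_returns_none(self):
        self.assertEqual(myfunc("   "), None)

# run the tests with:  python -m unittest <this_file>
# (or call unittest.main() from a __main__ guard)
```

Each test states its expected output, so instead of eyeballing printed values you get an automatic pass/fail per case.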

So I ran the unit tests. Wham! Poof! Beef! Zonk! Suddenly, one after another, a horde of lemming-like runtime errors started appearing before my eyes. Wheeeeeeeeeeeeeeeeeee!

I also added some assertions (you knew there was a Python AssertionError exception, didn't you? Ah, God bless the Code::Blocks IDE autocomplete - it's shown me some stuff about Python I didn't even know) and was finally able to get my parser going. And guess what: it turns out the configuration data I had written in the first place had bad JSON syntax, and my program caught it, pointing at the exact line and position.
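The assertions themselves were nothing fancy. A sketch of the style (parse_digits is an invented stand-in, not my real parser):

```python
def parse_digits(s):
    # fail loudly, up front, instead of misbehaving later
    assert isinstance(s, str), "expected a string, got %r" % (s,)
    assert s.isdigit(), "non-digit input: %r" % (s,)
    return int(s)

print(parse_digits("123"))   # prints 123
```

Feeding it garbage like "12a" raises AssertionError immediately, with a message that names the bad input, instead of letting a wrong value propagate through the rest of the parse.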

The moral of the story?

  1. Python sucks. Sorry, had to say it :P
  2. Write unit tests. They're easy to write, and they'll save you HOURS of debugging. No, I'm not kidding. I speak from experience.
  3. Use assertions. They look ugly in your code, but they make your code act pretty - which is what matters.
  4. Divide your code into small chunks that can be unit-tested. If those chunks are used in various parts of your code, no matter how easy they are to write, put them in separate functions so you can test them with the unit tests. Yes, I'm writing it in bold because it's THAT important.
  5. No more Jenga(TM) coding! Hurray!!
Now I need to go to sleep - my brain's entering zombie mode. I've been sleeping five hours a day for the whole week.


Monday, December 1, 2008

I hate python, part trois: The mysterious case of the anorexic syntax checker

One thing that's really annoying me about python: Python sucks at syntax checking.

Let's say I create a class in python, named myclass:

class myclass:
    def __init(self):
        print "Hello world!"

And I'm using this class for some part of my program. Well, everything goes fine UNTIL I instantiate the class. Then I get this message:
TypeError: this constructor takes no arguments

I have two problems with this:

First, the message really doesn't tell me ANYTHING. It doesn't say "myclass constructor not defined".
Second, it doesn't even tell me when I START the script! Why, oh why, doesn't Python have a proper syntax checker? It should complain right when I start the script: "Hey, you! You forgot to add the trailing __ to myclass.__init, duh!"
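For the record, the fix is just the two trailing underscores. A quick check (using parenthesized print so it runs anywhere; the greeting attribute is just for illustration):

```python
class myclass:
    def __init__(self):      # trailing __ makes this the real constructor
        self.greeting = "Hello world!"

class broken:
    def __init(self):        # the typo: just an ordinary, never-called method
        self.greeting = "Hello world!"

print(myclass().greeting)             # prints Hello world!
print(hasattr(broken(), "greeting"))  # prints False: __init never ran
```

Note the silent failure mode: broken() constructs just fine with no arguments, the misspelled method simply never runs; the TypeError only shows up once you pass constructor arguments.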

I'm really starting to miss C++. If I had compiled this program in C++, I would have gotten an error about an undefined constructor *RIGHT ON THE SPOT*. There shouldn't be any need for a unit test when the problem is a SYNTAX ERROR!

I don't know who the moron was who said he loved Python "because I can start writing the unit tests faster, so my development cycle is more efficient". That's bullcrap.

The speed you gain by omitting the "C++ formalities" is lost tenfold debugging simple syntax errors... which only appear minutes after you've started your application.

Let me say it again: A trivial error like this would be spotted by a C++ compiler INSTANTLY!