Real Time Kernel On Nvidia Jetson TX2

This is still under construction, but I wanted to publish it fast so a friend could duplicate the work…

So you want to build a real time kernel on your TX2, eh? Shouldn’t be much of an issue, right? Eh… It’s a little annoying, but here’s how I did it.

Before I begin, I’d like to thank the guys over at Jetson Hacks, because they made all of this MUCH easier for me. Now, let’s get started.

Getting Started

Start by cloning the Jetson Hacks build Jetson TX2 Kernel repo from here:
https://github.com/jetsonhacks/buildJetsonTX2Kernel

git clone https://github.com/jetsonhacks/buildJetsonTX2Kernel.git

Since I’m using Linux for Tegra 28.1, cd into the repo and check out their vL4T28.1 release tag.

cd buildJetsonTX2Kernel
git checkout vL4T28.1

Run the get kernel sources script (note: this will take a while):

sudo ./getKernelSources.sh

Make sure that loadable kernel modules are enabled, and go ahead and write the kernel configuration file to .config, then let’s try to build the vanilla kernel to see if we have any issues to begin with.

sudo ./makeKernel.sh

Everything builds fine for me, so let’s get to patching the kernel with the PREEMPT_RT patch.

Kernel builds are typically located in the /usr/src directory, so let’s cd to where this kernel build is occurring:
cd /usr/src/kernel/kernel-4.4

Get the PREEMPT_RT patch that EXACTLY matches our Linux kernel version (4.4.38 here):

wget https://www.kernel.org/pub/linux/kernel/projects/rt/4.4/older/patch-4.4.38-rt49.patch.xz
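Since the patch has to match the kernel version exactly, it is worth confirming the version of the sources before patching. Here is a minimal sketch that reads it from the top-level kernel Makefile; kernel_version_from_makefile is a hypothetical helper name of my own, and it assumes the usual VERSION / PATCHLEVEL / SUBLEVEL lines at the top of the Makefile.

```shell
# Print VERSION.PATCHLEVEL.SUBLEVEL as recorded in a kernel Makefile.
# kernel_version_from_makefile is a hypothetical helper, not part of the
# Jetson Hacks scripts.
kernel_version_from_makefile() {
    awk '
        /^VERSION *=/    { v = $3 }
        /^PATCHLEVEL *=/ { p = $3 }
        /^SUBLEVEL *=/   { s = $3 }
        END { print v "." p "." s }
    ' "$1"
}

# Example:
#   kernel_version_from_makefile /usr/src/kernel/kernel-4.4/Makefile
```

If this prints something other than 4.4.38, look for the rt patch that matches whatever it does print.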

Make sure we have the xz-utils package to unpack .xz files:
sudo apt install xz-utils
unxz patch-4.4.38-rt49.patch.xz

Dry Run The Patch

Now, let’s do a patch dry run to see what we’re getting ourselves into:
patch -p1 --dry-run <patch-4.4.38-rt49.patch | grep FAIL

Ok, so we get some hunks that fail, but it doesn’t look like it will be anything intractable. Let’s just go ahead and patch it, redirect the output to a log file and see what we have to manually fix.

patch -p1 <patch-4.4.38-rt49.patch > patch.log
grep FAILED patch.log > patch_fail.log
cat patch_fail.log
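Another way to see what needs manual attention is to look for the .rej files that patch leaves next to each file it could not fully patch. A small sketch (list_rejects is a made-up helper name):

```shell
# List every reject file under a directory (default: current directory).
# list_rejects is a hypothetical helper for illustration.
list_rejects() {
    find "${1:-.}" -name '*.rej' | sort
}

# Example, run from the top of the kernel source tree:
#   list_rejects .
```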

So, it looks like we have three files that have been rejected, with rejection details being saved to the corresponding .rej files. Let’s take a look at the first, cpu.c.rej.

cd kernel
ls

We see four files that we are interested in: cpu.c, which is the patched file, cpu.c.orig, which is the original, unpatched file, cpu.c.rej, which shows the rejected patch attempt, and cpu.o, which is the created object file.

Let’s open up cpu.c.rej and see what the issue is:

The exact line numbering that’s indicated by the reject file for kernel patches on this system has yet to make sense to me. It appears to be indicating that the issue in the original file starts at line 740 and persists for 9 lines, whereas the issue in the new file starts at line 1056 and goes for 14 lines. However, if we search for where the changes are we get to line 429… I’m assuming that it’s just a calling function that is somehow being caught at line 740, but in any case, let’s continue.
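Because the line numbers in the reject file may not line up with the file on disk, I find it easier to search for the context text itself. A rough sketch, assuming the reject file uses unified-diff style hunk headers (lines starting with @@); both helper names are my own:

```shell
# Show the hunk headers of a reject file, with their line numbers in the .rej.
show_hunk_headers() {
    grep -n '^@@' "$1"
}

# Find where a piece of hunk context actually lives in the source file.
# Usage: find_context FILE STRING   (STRING is matched literally)
find_context() {
    grep -n -F -- "$2" "$1"
}

# Example:
#   show_hunk_headers cpu.c.rej
#   find_context cpu.c 'out_release:'
```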

Begin The Manual Patching…

So, if we open up all three files, cpu.c, cpu.c.rej and cpu.c.orig, we can pretty easily see what the issue is: there is a trace_sched_cpu_hotplug() function call in there that the patch file wasn’t expecting.

Indeed, if we check the patch file and search for out_release, we find:

This entry in the patch file indicates that it wants to add cpu_unplug_done(cpu); and out_cancel: after out_release and before cpu_hotplug_done(), but there is an extra trace_sched_cpu_hotplug() in there messing things up. Since the patch file didn’t expect this, it fails because it doesn’t quite know what to do. Let’s manually patch it by placing this inside of out_cancel:

Let’s save cpu.c and consider this file patched!

Moving on to the next rejected file, suspend.c.rej, we see that this patch failed in two places:

Let’s search for this area in the patched file to see what the issue is.

So, here we see the first issue: there is a pm_suspend_marker() call in between return -EINVAL; and error =.
Let’s manually patch this part and find the second issue.

Finding the second issue, it looks like the line pm_suspend_marker("exit") is unexpected, so let’s manually patch this like we did the first issue.

Save it and consider it patched!

Now, on to the third one: /net/ipv4/tcp_ipv4.c.rej

And proceeding like we did before to find these same spots in the output patched file:

We again see that there is an unexpected line (the .uid method). Manually add the lock and unlock function calls:

Finding the second section that was rejected, we also see it’s an issue involving locks:

Save it, and we should (hopefully) be done patching!

Building After The Manual Patch

Now, let’s pick up where the Jetson Hacks scripts have left off:

cd /usr/src/kernel/kernel-4.4
make xconfig

Before changing any of the configuration parameters, let’s verify that things compile without tweaking the preemption model. Just go ahead and save the default values as .config.
Continuing on with the Jetson Hacks stuff, let’s go back to the git directory, run the makeKernel.sh script and see if it compiles.

Ok, shit blows up, which is not what we were hoping to see. It looks like this may be an issue with the way NVIDIA prefers you to build kernels. Going back and seeing how good our GoogleFu skills are, we find this discussion:

https://devtalk.nvidia.com/default/topic/1014729/how-to-compile-the-tx2-l4t-kernel-source-/

It looks like we need to set some environment variables and select the make output directory. Let’s make these changes in the jetson hacks makeKernel.sh shell script.
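For reference, here is roughly what those settings look like. The values below are my assumptions for a native build on the TX2, with the output directory being an out/ folder inside the kernel source tree; adjust them to your own layout.

```shell
# Assumed values, not copied from the Jetson Hacks script.
# Output directory for the build (assumed location inside the kernel tree):
export TEGRA_KERNEL_OUT=/usr/src/kernel/kernel-4.4/out
# The TX2 is a 64-bit ARM target:
export ARCH=arm64
```

With these exported, every make call can pass O=$TEGRA_KERNEL_OUT so that build products stay out of the source tree.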

Ok, let’s try making again.

We get a bunch of garbage, so let’s start from a clean slate using make mrproper like it suggests.

Ok, then let’s manually call the commands:

mkdir $TEGRA_KERNEL_OUT
make O=$TEGRA_KERNEL_OUT tegra18_defconfig
make O=$TEGRA_KERNEL_OUT prepare
make O=$TEGRA_KERNEL_OUT zImage

It looks like we are seeing an issue with compiler warning flags. From a little more GoogleFu, it appears that if we are using gcc 5.x or higher we can keep the incompatible pointer type warnings from failing the build. Let’s check which version of gcc we are using:
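A quick sketch of that check. It assumes gcc -dumpversion prints something like 5.4.0, and the helper names are made up for illustration:

```shell
# Extract the major version from a gcc version string such as "5.4.0".
gcc_major() {
    echo "${1%%.*}"
}

# Succeeds when the given gcc version is 5 or newer, which is when the
# incompatible-pointer-types warning option is available.
supports_incompatible_pointer_flag() {
    [ "$(gcc_major "$1")" -ge 5 ]
}

# Example:
#   supports_incompatible_pointer_flag "$(gcc -dumpversion)" && echo "gcc is new enough"
```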

Since we’re using gcc 5.4.0, we can go into the main kernel Makefile and stop incompatible pointer type warnings from being treated as errors.

nano /usr/src/kernel/kernel-4.4/Makefile
Then search for the KBUILD_CFLAGS line corresponding to incompatible pointer types.

Let’s change this line, which turns incompatible pointer type warnings into errors, to:
KBUILD_CFLAGS += $(call cc-option,-Wno-error=incompatible-pointer-types)
Note: I’m not very concerned about doing this since the only places where this occurs are in the cryptography library.
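If you would rather script the edit than do it in nano, something like this should work. It assumes the original Makefile line uses -Werror=incompatible-pointer-types and that GNU sed is installed; relax_pointer_type_check is a hypothetical helper name.

```shell
# Rewrite -Werror=incompatible-pointer-types to the -Wno-error= form, in place.
# Assumes GNU sed (for -i) and that the flag appears exactly as shown.
relax_pointer_type_check() {
    sed -i 's/-Werror=incompatible-pointer-types/-Wno-error=incompatible-pointer-types/' "$1"
}

# Example (back up the Makefile first):
#   sudo cp /usr/src/kernel/kernel-4.4/Makefile /usr/src/kernel/kernel-4.4/Makefile.bak
#   relax_pointer_type_check /usr/src/kernel/kernel-4.4/Makefile   # run as root
```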
Let’s save this change and manually try building the kernel image.

nvidia@tegra-ubuntu:/usr/src/kernel/kernel-4.4$ sudo make -j4 O=$TEGRA_KERNEL_OUT zImage

HO. LEE. SHIT. It built.

Selecting The Preemption Model

Phew, ok, let’s go back and make the config file and select the fully preemptive preemption model.

sudo rm -rf out/
mkdir $TEGRA_KERNEL_OUT
make O=$TEGRA_KERNEL_OUT tegra18_defconfig
make O=$TEGRA_KERNEL_OUT xconfig

In the configuration menu let’s go to Kernel Features -> Preemption Model -> Fully Preemptible Kernel (RT)
Note: if you are using a different system this will probably appear under a different tab.

If you want, you can also select to append a string to the local version. I was uncertain if it would automatically append -rt49, so I manually added this to the local version name. It does indeed add -rt49, so this was redundant.

Save and quit xconfig.
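To double check that the choice actually landed in the saved configuration, you can grep the .config in the output directory. This sketch assumes the option is named CONFIG_PREEMPT_RT_FULL, which is what the 4.4 rt patch series calls the fully preemptible model; rt_full_enabled is a made-up helper name.

```shell
# Succeeds when the given kernel .config selects the fully preemptible RT model.
# Assumes the option name CONFIG_PREEMPT_RT_FULL.
rt_full_enabled() {
    grep -q '^CONFIG_PREEMPT_RT_FULL=y' "$1"
}

# Example:
#   rt_full_enabled "$TEGRA_KERNEL_OUT/.config" && echo "RT model selected"
```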

Continuing on, prepare the kernel and make it.
make -j4 O=$TEGRA_KERNEL_OUT prepare
sudo make -j4 O=$TEGRA_KERNEL_OUT zImage

This builds fine, so let’s make and install the kernel modules and device tree blobs.

sudo make O=$TEGRA_KERNEL_OUT dtbs
sudo make O=$TEGRA_KERNEL_OUT modules
sudo make O=$TEGRA_KERNEL_OUT modules_install

This by default installs modules in /lib/modules/
I’m unsure if we need the compressed zImage or the regular binary Image file, so let’s just copy both over from the output directory:

sudo cp arch/arm64/boot/zImage /boot/zImage

sudo cp arch/arm64/boot/Image /boot/Image

Let’s verify that the compressed and binary files have been copied to the boot directory.
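One way to do that is to compare each copied file byte for byte against the one that was just built. A small sketch (same_file is a hypothetical helper):

```shell
# Succeeds when two files have identical contents.
same_file() {
    cmp -s "$1" "$2"
}

# Example, run from the build output directory:
#   same_file arch/arm64/boot/Image  /boot/Image  && echo "Image copied"
#   same_file arch/arm64/boot/zImage /boot/zImage && echo "zImage copied"
```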

Looks good! Now, for the new kernel to take effect, we need to reboot the machine. Before rebooting, let’s note the original system information using uname -r and uname -a, so that we can verify afterwards that the new kernel is being used.

Verify The Kernel Is Loaded

After rebooting, let’s verify that the kernel name has changed to reflect our patched image:
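A quick way to check this from a script is to look for the -rt suffix in the release string. The helper name below is my own, and the pattern simply assumes the rt patch level shows up as -rtNN somewhere in uname -r:

```shell
# Succeeds when a kernel release string carries an -rtNN suffix.
is_rt_release() {
    case "$1" in
        *-rt[0-9]*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Example:
#   is_rt_release "$(uname -r)" && echo "running an rt-patched kernel"
```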

Cool! It looks like the new kernel has taken effect.
Now, the last check to be performed is to start a thread with priority level 99 and verify that it in fact shows this priority level in htop.
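Here is one way that check could look from the shell, using chrt from util-linux to start a process under SCHED_FIFO at priority 99 and ps to read the result back. This is a sketch under the assumption that chrt and procps ps are installed; the function names are made up, and the launch needs root.

```shell
# Run a command under SCHED_FIFO at real-time priority 99 (requires root).
run_fifo99() {
    chrt -f 99 "$@"
}

# Print PID, scheduling class, real-time priority and command name for a PID.
sched_info() {
    ps -o pid= -o cls= -o rtprio= -o comm= -p "$1"
}

# Example (as root):
#   run_fifo99 sleep 60 &
#   sched_info $!
```

While the sleep is running, the same process should also be visible in htop for comparison.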