Tuesday, 28 August 2012

Nokia N9 Bluetooth PAN, USB & Dummy Networks

Please note: All of these instructions assume you have developer mode enabled and are familiar with using the Linux console. One of the variants of dummy networking I present here also requires a package to be installed with Inception or use of an open-mode kernel to disable aegis. I present an alternative method to use a pseudo-dummy network for people who do not wish to do that.

Background

Earlier this year I bought a Nokia N9 (then took it in for service TWICE due to a defective GPS, then returned it for a refund since Nokia had returned it un-repaired both times, then bought a new one for $200 less than I originally paid, then bought a second for my fiancé).

The SIM card I use in the N9 is a pretty basic TPG $1/month deal, which is fine for the small number of voice calls I make, but its 50MB of data per month is not really enough, so I'd like it to use alternative networks wherever possible.

When working on another computer with an Internet connection, I could simply hook up the N9 via USB networking and have the computer give it a route to the Internet. That works well, but has the problem that any applications using the N9's Internet Connectivity framework (anything designed for the platform is supposed to do this via libconic) would not know that there was an Internet connection and would refuse to work - so I had to find a way to convince them that there was an active Internet connection using a dummy network. Also, this obviously wouldn't work when I was away from a computer.

I also happen to carry a pure data SIM card in my Optus MyTab with me all the time (being my primary Internet connection), so when I'm on the go I'd like to be able to connect to the Internet on the N9 via the tablet rather than use the small amount of data from the TPG SIM.

The MyTab is running CyanogenMod 7 (I'm not a fan of Android, but at $130 to try it out the price was right), so I am able to switch on the WiFi tethering on the tablet and connect that way, but it has a couple of problems:

  • It needs to be manually activated before use
  • It needs to be manually deactivated to allow the bluetooth tethering to work
  • It isn't very stable (holding a wakelock helps a lot - the terminal application can be used for this purpose)
  • It's a bit of a battery drain (at least the tablet has a huge battery)

The MyTab also supports tethering over bluetooth PAN (which I regularly use at home), so it made a lot of sense to me to connect the N9 to the tablet using that as well when I am out and about. Unfortunately, the N9 does not come with any software to connect to a bluetooth network, and I couldn't manage to find anyone else who had successfully done this (There are a couple of threads discussing it).

Fortunately, the N9 has a normal Linux userspace under the hood (one reason I'd take this over Android any day), which includes bluez 4.x and as such I was able to use that to make it do bluetooth PAN.

USB Network

Let's start with USB Networking since it is already supported on the N9 and works out of the box once developer mode is enabled (select SDK mode when plugging in).

Here's a few tricks you can do to streamline the process of using the USB network to gain an Internet connection. You will also want to follow the steps under one of the Dummy Networking sections below to allow applications (such as the web browser) to use it.

On the host, add this section to your /etc/network/interfaces (this is for Debian based distributions; if you use something else you will have to work out the equivalent):

allow-hotplug usb0
iface usb0 inet static
    address 192.168.2.14
    netmask 255.255.255.0
    up iptables -t nat -I POSTROUTING -j MASQUERADE
    up iptables -A FORWARD -i usb0 -j ACCEPT
    up iptables -A FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT
    up echo 1 > /proc/sys/net/ipv4/ip_forward
    down echo 0 > /proc/sys/net/ipv4/ip_forward
    down iptables -F FORWARD
    down iptables -t nat -F POSTROUTING

Next, modify the same file on the N9 so that the usb0 section looks like this (this section already exists - I've just extended it a little):

auto usb0
iface usb0 inet static
    address 192.168.2.15
    netmask 255.255.255.0
    gateway 192.168.2.14
    up /usr/lib/sdk-connectivity-tool/usbdhcpd.sh 192.168.2.14
    down /usr/lib/sdk-connectivity-tool/usbdhcpd.sh stop
    up echo nameserver 208.67.222.222 >> /var/run/resolv.conf
    up echo nameserver 208.67.220.220 >> /var/run/resolv.conf
    down rm /var/run/resolv.conf

Now whenever you plug in the N9 and choose SDK mode it should automatically get an Internet connection with no further interaction required and you should be able to ping hosts on the Internet :)

But, you will probably notice that most applications (like the web browser) will still bring up the "Connect to internet" dialog whenever you use them and will refuse to work. To make these applications work we need to create a dummy network that they can "connect" to, while in reality they actually use the USB network.

    USB Networking Notes:
  • The iptables commands on the host will alter the firewall and routing rules to allow the N9 to connect to the Internet through the host. If you use your own firewall with other forwarding rules you may want to remove those lines and add the appropriate rules to your firewall instead.
  • The above commands will turn off all forwarding on the host and flush the FORWARD and POSTROUTING chains when the N9 is unplugged - if your host is a router for other things you will definitely want to remove those lines.
  • The two IP addresses used for the DNS lookups on the N9 are those of OpenDNS.org - you might want to replace them with some other appropriate servers. OpenDNS should be accessible from any Internet connection, which is why I chose them.
  • The N9 will use the most recently modified file under /var/run/resolv.conf* (specifically those listed in /etc/dnsmasq.conf) for DNS lookups. Which means that connecting to a WiFi/3G network AFTER bringing up the USB network would override the DNS settings. I suggest setting the DNS settings for your dummy network to match to avoid that problem.
  • The N9 doesn't run the down rules when it should, rather they seem to be delayed until the USB cable is plugged in again, when they are run immediately before the up rules. Because of the previous note, this isn't really an issue for the dnsmasq update, but it may be an issue if you wanted to do something more advanced.
  • Alternatively, there is an icd2 plugin for USB networking for the N900 available on gitorious. I haven't had a look at this yet to see if it works on the N9 or how it compares to the above technique. This would require installation with Inception.
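
If the host does route for other things, one hedged alternative to flushing whole chains is to delete only the rules that the up lines added. The sketch below is not part of my setup: the usb_nat function and the IPTABLES variable are names I made up for illustration, and by default it only prints the commands (a dry run) so they can be checked before running anything as root.

```shell
#!/bin/sh
# Hypothetical helper: add or remove only the rules used for the usb0 network.
# IPTABLES defaults to a dry run that just prints each command.
IPTABLES=${IPTABLES:-echo iptables}

usb_nat() {
    case "$1" in
        up)   nat_op=-I; fwd_op=-A ;;
        down) nat_op=-D; fwd_op=-D ;;
        *)    echo "usage: usb_nat up|down" >&2; return 2 ;;
    esac
    # Same three rules as the interfaces stanza above; with -D iptables
    # removes one rule matching the given specification.
    $IPTABLES -t nat "$nat_op" POSTROUTING -j MASQUERADE
    $IPTABLES "$fwd_op" FORWARD -i usb0 -j ACCEPT
    $IPTABLES "$fwd_op" FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT
}
```

Setting IPTABLES=iptables before loading the function would make it apply the rules for real. It deliberately leaves ip_forward alone, on the assumption that a host acting as a router wants that left on.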

Dummy Network

This approach to setting up a dummy network isn't for everyone. You are going to need to compile a package in the Harmattan platform SDK (or bug me to upload the one I built somewhere) and install it on the device with Inception, or use an open mode kernel. If you don't feel comfortable with this, you might prefer to use the technique discussed in the Alternative Dummy Network section instead.

First grab the dummy icd plugin from https://maemo.gitorious.org/icd2-network-modules

[host]$ cd /scratchbox/users/$USER/home/$USER
[host]$ git clone git://gitorious.org/icd2-network-modules/libicd-network-dummy.git
[host]$ scratchbox
[sbox]$ sb-menu
 Select -> HARMATTAN_ARMEL
[sbox]$ cd libicd-network-dummy
[sbox]$ dpkg-buildpackage -rfakeroot

Now copy /scratchbox/users/$USER/home/$USER/libicd-network-dummy_0.14_armel.deb to the N9, then install and configure it on the N9 with:

[N9]$ /usr/sbin/incept libicd-network-dummy_0.14_armel.deb

[N9]$ gconftool-2 -s -t string /system/osso/connectivity/IAP/DUMMY/type DUMMY
[N9]$ gconftool-2 -s -t string /system/osso/connectivity/IAP/DUMMY/name 'Dummy network'

[N9]$ devel-su
[N9]# /sbin/initctl restart xsession/icd2

Next time the connect to Internet dialog appears you should see a new entry called 'Dummy network' that you can "connect" to so that everything thinks there is an Internet connection, while they really use your USB or bluetooth connection.

Alternative Dummy Network

This isn't ideal in that it enables the WiFi & creates a network that nearby people can see, but it does have the advantage that it works out of the box and does not require Inception or Open Mode.

Open up settings -> internet connection -> create new connection

Fill out the settings like this:

Connection name: dummy
Network Name (SSID): dummy
Use Automatically: No
network mode: ad hoc
Security method: None

Under Advanced settings, fill out these:

Auto-retrieve IP address: No
IP address: 0.0.0.0
Subnet mask: 0.0.0.0
Default gateway: 0.0.0.0

Auto-retrieve DNS address: No
Primary DNS address: 208.67.222.222
Secondary DNS address: 208.67.220.220

These are the OpenDNS.org DNS servers - feel free to substitute your own.

Then if the 'Connect to internet' dialog comes up you can connect to 'dummy', which will satisfy that while leaving your real USB/bluetooth network alone.

Bluetooth Personal Area Networking (PAN)

This is very much a work in progress that I hope to polish up and eventually package up and turn into an icd2 plugin so that it will nicely integrate into the N9's internet connectivity framework.

First things first - you will need to enable the bluetooth PAN plugin on the N9, by finding the DisablePlugins line in /etc/bluetooth/main.conf and removing 'network' from the list so that it looks something like:

[General]

# List of plugins that should not be loaded on bluetoothd startup
# DisablePlugins = network,hal
DisablePlugins = hal

# Default adaper name
...
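
If you would rather not edit the file by hand, the change above could be scripted along these lines. This is only a sketch: enable_pan_plugin is a name I made up, it assumes awk is available on the device, and it prints the modified file to standard output rather than touching the original.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of a bluez main.conf with "network"
# removed from the active DisablePlugins line. Commented lines are untouched.
enable_pan_plugin() {
    awk '
    /^DisablePlugins[ \t]*=/ {
        sub(/^DisablePlugins[ \t]*=[ \t]*/, "")   # keep only the plugin list
        sub(/[ \t]+$/, "")                        # drop trailing whitespace
        n = split($0, items, /[ \t]*,[ \t]*/)
        out = ""
        for (i = 1; i <= n; i++)
            if (items[i] != "network" && items[i] != "")
                out = out (out == "" ? "" : ",") items[i]
        print "DisablePlugins = " out
        next
    }
    { print }
    ' "$1"
}
```

For example, enable_pan_plugin /etc/bluetooth/main.conf > /tmp/main.conf would write a modified copy that can be reviewed and then copied back over the original as root.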

Then restart bluetooth by running:

[N9]$ devel-su
[N9]# /sbin/initctl restart xsession/bluetoothd

Until I package this up more nicely you will need to download my bluetooth tethering script from:

https://raw.github.com/DarkStarSword/junk/master/blue-tether.py

You will need to edit the dev_dbaddr in the script to match the bluetooth device you are connecting to. Note that I will almost certainly change this to read from a config file in the very near future, so you should double check the instructions in the script first.

Put the modified script on the N9 under /home/user/blue-tether.py

You first will need to pair with the device you are connecting to in the N9's bluetooth GUI like usual.

Once paired, you may run the script from the terminal with develsh -c ./blue-tether.py

The bluetooth connection will remain up until you press enter in the terminal window. Currently it does not detect if the connection goes away, so you would need to restart it in that case.
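
Until the script can detect a dropped connection itself, something like the following could be run alongside it. This is a rough sketch and not part of blue-tether.py: it assumes the PAN link shows up as a network interface named bnep0 that disappears when the link drops (check the name with ifconfig first), and the function names are made up.

```shell
#!/bin/sh
# Hypothetical watchdog sketch. bnep0 is an assumed interface name.
iface_present() {
    # $1 = interface name, $2 = sysfs net directory (defaults to the real one)
    [ -e "${2:-/sys/class/net}/$1" ]
}

wait_for_drop() {
    # Poll every $3 seconds (default 5) until the interface disappears.
    while iface_present "$1" "$2"; do
        sleep "${3:-5}"
    done
    echo "$1 has gone away"
}
```

Running wait_for_drop bnep0 in a second terminal would print a message once the interface is gone, at which point the tethering script can be restarted.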

For convenience you may create a desktop entry for it by creating a file under /usr/share/applications/blue-tether.desktop with these contents:

[Desktop Entry]
Type=Application
Name=Blue Net
Categories=System;
Exec=invoker --type=e /usr/bin/meego-terminal -n -e develsh -c /home/user/blue-tether.py
Icon=icon-m-bluetooth-lan

Again, this is very much an active work in progress - expect to see a packaged version soon, and hopefully an icd2 plugin before not too long.

One Outstanding Graphical Niggle

You may have noticed that the dummy plugin doesn't have its own icon - in the connect to Internet dialog it seems to pick a random icon, and once connected the status bar displays it as though it was a cellular data connection. As far as I can tell, the icons (and other connectivity related GUI elements) are selected by /usr/lib/conniaptype/lib*iaptype.so which is loaded by /usr/lib/libconinetdui.so which is in turn used by /usr/bin/sysuid. I haven't managed to find any API references or documentation for these and I suspect that, being part of Nokia's GUI, they fall firmly into the closed source side of Harmattan. This would be nice to do properly if I want to create my own icd2 plugins, so if anyone has some pointers for this, please leave a note in the comments.

Why is Inception required for real dummy networking?

Well, it's because the Internet Connectivity Daemon requests CAP::sys_module (i.e. the capability to load kernel modules):

~ $ ariadne sh
Password for 'root':

/home/user # accli -I -b /usr/sbin/icd2
Credentials:
        UID::root
        GID::root
        CAP::kill
        CAP::net_bind_service
        CAP::net_admin
        CAP::net_raw
        CAP::ipc_lock
        CAP::sys_module
        SRC::com.nokia.maemo
        AID::com.nokia.maemo.icd2.
        icd2::icd2
        icd2::icd2-plugin
        Cellular

Because of this, aegis will only allow it to load libraries that originated from a source that has the ability to grant CAP::sys_module, which unfortunately (but understandably given what the capability allows) is only the system firmware by default, so attempting to load it would result in this (in dmesg):

credp: icd2: credential 0::16 not present in source SRC::9990007
Aegis: credp_kcheck failed 9990007 libicd_network_dummy.so
Aegis: libicd_network_dummy.so verification failed (source origin check)

Ideally the developers would have thought of this and separated the kernel module loading out into a separate daemon so that icd2 would not require this credential and therefore would allow third-party plugins to be loaded, but since that is not the case we have to use Inception to install the dummy plugin from a source that has the ability to grant the same permissions that the system firmware enjoys (Note that the library does not actually request any permissions because libraries always inherit the permissions of the binary that loaded them - it just needs to have come from a source that could have granted it that permission).

Also, if anyone could clarify what the icd2::icd2-plugin credential is for I would appreciate it - I feel like I've missed something because its purpose as documented (to load icd2 plugins) seems rather pointless to me (icd2 loads libraries based on gconf settings, which it can do just as well without this permission... so what is the point of this?).

Thursday, 23 February 2012

Tiling tmux Keybindings

When most people use a computer, they are using either a compositing or stacking window manager - which basically means that windows can overlap. The major alternative to this model is known as a tiling window manager, where the window manager lays out and sizes windows such that they do not overlap each other.

I started using a tiling window manager called wmii some years ago after buying a 7" EeePC netbook and trying to find alternative software more suited to the characteristics of that machine. Most of the software I ended up using on that machine I now use on all of my Linux boxes, because I found that it suits my workflow so much better.

Wmii as a window manager primarily focuses on organising windows into tags (like multiple desktops) and columns. Within a column windows can either be sized evenly, or a single window can take up the whole height of the column, optionally with the title bars of the other windows visible (think minimised windows on steroids).

Wmii is very heavily keyboard driven (which is one of its strengths from my point of view), though a mouse can be used for basic navigation as well. It is also heavily extensible with scripting languages and in fact almost all interactions with the window manager are actually driven by the script. It defaults to using a shell script, but also ships with equivalent python and ruby scripts (the base functionality is the same in each), and is easy to extend.

By default keyboard shortcuts provide ways to navigate left and right between columns, up and down between windows within a column, and to switch between 10 numbered tags (more tags are possible, but rarely needed). Moving a window is as simple as holding down shift while performing the same key combos used to navigate, and columns and tags are automatically created as needed (moving a window to the right of the rightmost column would create a new column for example), and automatically destroyed when no longer used.

Recent versions of wmii also work really well with multiple monitors (though there is still some room for improvement in this area) allowing windows to really easily be moved between monitors with the same shortcuts used to move windows between columns (and the way it differentiates between creating a new column on the right of the left monitor versus moving the window to the right monitor is pure genius).

Naturally with such a powerful window manager, I want to use it to manage all my windows and all my shells. The problem with this of course is SSH - specifically, when I have many remote shells open at the same time and what happens when the network goes away. You see, I've been opening a new terminal and SSH connection for each remote shell so I can use wmii to manage them, which works really great until I need to suspend my laptop or unplug it to go to a meeting, then have to spend some time re-establishing each session, getting it back to the right working directory, etc. And, I've lost the shell history specific to each terminal.

Normally people would start screen on the remote server if they expect their session to go away, and screen can also manage a number of shells simultaneously, which would be great... except that it is nowhere near as good at managing those shells as wmii is at managing windows, and if I'm going to switch it would need to be pretty darn close.

I've been aware for some time of an alternative to screen called tmux which seemed to be much more sane and feature-rich than screen, so the other day I decided to see if I could configure tmux to be a realistic option for managing many shells on a remote machine that I could detach and re-attach from when suspending my laptop.

Tmux supports multiple sessions, "windows" (like tags in wmii), and "panes" (like windows in wmii). I managed to come up with the below configuration file which sets up a bunch of keybindings similar to the ones I use in wmii (but using the Alt modifier instead of the Windows key) to move windows... err... "panes" and to navigate between them.

Unlike wmii, tmux is not focussed around columns, which technically gives it more flexibility in how the panes are arranged, but sacrifices some of the precision that the column focus gives wmii (in this regard tmux is more similar to some of the other tiling window managers available).

None of these shortcut keys need to have the tmux prefix key pressed first, as that would have defeated the whole point of this exercise:

Alt + ' - Split window vertically *
Alt + Shift + ' - Split window horizontally

Alt + h/j/k/l - Navigate left/down/up/right between panes within a window
Alt + Shift + h/j/k/l - Swap window with the one before or after it **

Alt + Ctrl + h/j/k/l - Resize pane *** - NOTE: Since many environments use Ctrl+Alt+L to lock the screen, you may want to change these to use the arrow keys instead.

Alt + number - Switch to this tag... err... "window" number, creating it if it doesn't already exist.
Alt + Shift + number - Send the currently selected pane to this window number, creating it if it doesn't already exist.

Alt + d - Tile all panes **
Alt + s - Make selected pane take up the maximum height and tile other panes off to the side **
Alt + m - Make selected pane take up the maximum width and tile other panes below **

Alt + f - Make the current pane take up the full window (actually, break it out into a new window). Reverse with Alt + Shift + number **

Alt + PageUp - Scroll pane back one page and enter copy mode. Release the alt and keep pressing page up/down to scroll and press enter when done.

* Win+Enter opens a new terminal in wmii, but Alt+Enter is already used by xterm, so I picked the key next to it

** These don't mirror the corresponding wmii bindings because I could find no exact equivalent, so I tried to make them do something similar and sensible instead.

*** By default there is no shortcut key to resize windows in wmii (though the python version of the wmiirc script provides a resize mode which is similar), so I added some to my scripts.


~/.tmux.conf (Download Latest Version Here)

# Split + spawn new shell:
# I would have used enter like wmii, but xterm already uses that, so I use the
# key next to it.
bind-key -n M-"'" split-window -v
bind-key -n M-'"' split-window -h

# Select panes:
bind-key -n M-h select-pane -L
bind-key -n M-j select-pane -D
bind-key -n M-k select-pane -U
bind-key -n M-l select-pane -R

# Move panes:
# These aren't quite what I want, as they *swap* panes *numerically* instead of
# *moving* the pane in a specified *direction*, but they will do for now.
bind-key -n M-H swap-pane -U
bind-key -n M-J swap-pane -D
bind-key -n M-K swap-pane -U
bind-key -n M-L swap-pane -D

# Resize panes (Note: Ctrl+Alt+L conflicts with the lock screen shortcut in
# many environments - you may want to consider the below alternative shortcuts
# for resizing instead):
bind-key -n M-C-h resize-pane -L
bind-key -n M-C-j resize-pane -D
bind-key -n M-C-k resize-pane -U
bind-key -n M-C-l resize-pane -R

# Alternative resize panes keys without ctrl+alt+l conflict:
# bind-key -n M-C-Left resize-pane -L
# bind-key -n M-C-Down resize-pane -D
# bind-key -n M-C-Up resize-pane -U
# bind-key -n M-C-Right resize-pane -R

# Window navigation (Oh, how I would like a for loop right now...):
bind-key -n M-0 if-shell "tmux list-windows|grep ^0" "select-window -t 0" "new-window -t 0"
bind-key -n M-1 if-shell "tmux list-windows|grep ^1" "select-window -t 1" "new-window -t 1"
bind-key -n M-2 if-shell "tmux list-windows|grep ^2" "select-window -t 2" "new-window -t 2"
bind-key -n M-3 if-shell "tmux list-windows|grep ^3" "select-window -t 3" "new-window -t 3"
bind-key -n M-4 if-shell "tmux list-windows|grep ^4" "select-window -t 4" "new-window -t 4"
bind-key -n M-5 if-shell "tmux list-windows|grep ^5" "select-window -t 5" "new-window -t 5"
bind-key -n M-6 if-shell "tmux list-windows|grep ^6" "select-window -t 6" "new-window -t 6"
bind-key -n M-7 if-shell "tmux list-windows|grep ^7" "select-window -t 7" "new-window -t 7"
bind-key -n M-8 if-shell "tmux list-windows|grep ^8" "select-window -t 8" "new-window -t 8"
bind-key -n M-9 if-shell "tmux list-windows|grep ^9" "select-window -t 9" "new-window -t 9"

# Window moving (the sleep 0.1 here is a hack, anyone know a better way?):
bind-key -n M-')' if-shell "tmux list-windows|grep ^0" "join-pane -d -t :0" "new-window -d -t 0 'sleep 0.1' \; join-pane -d -t :0"
bind-key -n M-'!' if-shell "tmux list-windows|grep ^1" "join-pane -d -t :1" "new-window -d -t 1 'sleep 0.1' \; join-pane -d -t :1"
bind-key -n M-'@' if-shell "tmux list-windows|grep ^2" "join-pane -d -t :2" "new-window -d -t 2 'sleep 0.1' \; join-pane -d -t :2"
bind-key -n M-'#' if-shell "tmux list-windows|grep ^3" "join-pane -d -t :3" "new-window -d -t 3 'sleep 0.1' \; join-pane -d -t :3"
bind-key -n M-'$' if-shell "tmux list-windows|grep ^4" "join-pane -d -t :4" "new-window -d -t 4 'sleep 0.1' \; join-pane -d -t :4"
bind-key -n M-'%' if-shell "tmux list-windows|grep ^5" "join-pane -d -t :5" "new-window -d -t 5 'sleep 0.1' \; join-pane -d -t :5"
bind-key -n M-'^' if-shell "tmux list-windows|grep ^6" "join-pane -d -t :6" "new-window -d -t 6 'sleep 0.1' \; join-pane -d -t :6"
bind-key -n M-'&' if-shell "tmux list-windows|grep ^7" "join-pane -d -t :7" "new-window -d -t 7 'sleep 0.1' \; join-pane -d -t :7"
bind-key -n M-'*' if-shell "tmux list-windows|grep ^8" "join-pane -d -t :8" "new-window -d -t 8 'sleep 0.1' \; join-pane -d -t :8"
bind-key -n M-'(' if-shell "tmux list-windows|grep ^9" "join-pane -d -t :9" "new-window -d -t 9 'sleep 0.1' \; join-pane -d -t :9"

# Set default window number to 1 instead of 0 for easier key combos:
set-option -g base-index 1

# Pane layouts (these use the same shortcut keys as wmii for similar actions,
# but don't really mirror its behaviour):
bind-key -n M-d select-layout tiled
bind-key -n M-s select-layout main-vertical \; swap-pane -s 0
bind-key -n M-m select-layout main-horizontal \; swap-pane -s 0

# Make pane full-screen:
bind-key -n M-f break-pane
# This isn't right, it should go back where it came from:
# bind-key -n M-F join-pane -t :0

# We can't use shift+PageUp, so use Alt+PageUp then release Alt to keep
# scrolling:
bind-key -n M-PageUp copy-mode -u

# Don't interfere with vi keybindings:
set-option -s escape-time 0

# Enable mouse. Mostly to make selecting text within a pane not also grab pane
# borders or text from other panes. Unfortunately, tmux' mouse handling leaves
# something to be desired - no double/triple click support to select a
# word/line, all mouse buttons are intercepted (middle click = I want to paste
# damnit!), no automatic X selection integration(*)...
set-window-option -g mode-mouse on
set-window-option -g mouse-select-pane on
set-window-option -g mouse-resize-pane on
set-window-option -g mouse-select-window on

# (*) This enables integration with the clipboard via termcap extensions. This
# relies on the terminal emulator passing this on to X, so to make this work
# you will need to edit your X resources to allow it - details below.
set-option -s set-clipboard on
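
On the wish for a for loop in the window navigation section: tmux's config format doesn't give me one, but the ten lines could be generated with a small shell script instead. This is just a sketch (gen_window_bindings and the output file name below are made up for illustration); it prints exactly the same bind-key lines as the hand-written ones above.

```shell
#!/bin/sh
# Print the ten "window navigation" bind-key lines from the config above.
gen_window_bindings() {
    for i in 0 1 2 3 4 5 6 7 8 9; do
        printf 'bind-key -n M-%s if-shell "tmux list-windows|grep ^%s" "select-window -t %s" "new-window -t %s"\n' \
            "$i" "$i" "$i" "$i"
    done
}
```

The output could be written to a separate file, say with gen_window_bindings > ~/.tmux-windows.conf, and pulled in from ~/.tmux.conf with a source-file line. The same approach would work for the window moving section, though the shifted number keys would need to be listed by hand.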


You may also need to alter your ~/.Xresources file to make some things work (this is for xterm):

~/.Xresources (My Personal Version)

/* Make Alt+x shortcuts work in xterm */
XTerm*.metaSendsEscape: true
UXTerm*.metaSendsEscape: true

/* Allow tmux to set X selections (ie, the clipboard) */
XTerm*.disallowedWindowOps: 20,21,SetXprop
UXTerm*.disallowedWindowOps: 20,21,SetXprop

/* For some reason, this gets cleared when reloading this file: */
*customization: -color

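To reload this file without logging out and back in, run the following (note that plain xrdb replaces the whole resource database rather than merging into it, which is probably why *customization gets cleared on reload):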
xrdb ~/.Xresources

There's a pretty good chance that I'll continue to tweak this, so I'll try to update this post anytime I add something cool.

Edit 27/02/2012: Added mouse & clipboard integration & covered changes to .Xresources file.

Friday, 17 February 2012

SSH passwordless login WITHOUT public keys

I was recently in a situation where I needed SSH & rsync over SSH to be able to log into a remote site without prompting for a password (as it was being called from within a script and it would have been non-trivial to make the script pass in a password, especially as OpenBSD-SSH does not provide a trivial mechanism for scripts to pass in passwords - see below).

Normally in this situation one would generate a public / private keypair and use that to log in without a prompt, either by leaving the private key unencrypted (ie, not protected by a passphrase), or by loading the private key into an SSH agent prior to attempting to log in (e.g. with ssh-add).

Unfortunately the server in question did not respect my ~/.ssh/authorized_keys file, so public key authentication was not an option (boo).


Well, it turns out that you can pre-authenticate SSH sessions such that an already open session is used to authenticate new sessions (actually, new sessions are basically tunnelled over the existing connection).

The option in question needs a couple of things set up to work, and it isn't obviously documented as a way to allow passwordless authentication - I had read the man page multiple times and hadn't realised what it could do until Mikey at work pointed it out to me.

To get this to work you first need to create (or modify) your ~/.ssh/config as follows:

Host *
  ControlPath ~/.ssh/master_%h_%p_%r


Now, manually connect to the host with the -M flag to ssh and enter your password as normal:

ssh -M user@host

Now, as long as you leave that connection open, further normal connections (without the -M flag) will use that connection instead of creating their own one, and will not require authentication.
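
Since the whole point was to call ssh from a script, it may be worth having the script check that the master connection is actually up before relying on it. The following is a hedged sketch rather than what I actually used: it relies on ssh's -O check control command and on the ControlPath setting above, and the function names and the SSH variable are made up for illustration.

```shell
#!/bin/sh
# SSH can be overridden (e.g. for testing); it defaults to the real ssh.
SSH=${SSH:-ssh}

master_alive() {
    # "ssh -O check" asks the master process for this host whether it is running.
    "$SSH" -O check "$1" 2>/dev/null
}

run_remote() {
    # Run a command over the shared connection, or fail with a hint.
    target=$1
    shift
    if master_alive "$target"; then
        "$SSH" "$target" "$@"
    else
        echo "No master connection to $target; start one with: ssh -M $target" >&2
        return 1
    fi
}
```

A script would then call something like run_remote user@host uptime, and fail early with a useful message instead of sitting at a password prompt.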


Edit:
Note that you may instead edit your ~/.ssh/config as follows to have SSH always create and use Master connections automatically without having to specify -M. However, some people like to manually specify when to use shared connections so that the bandwidth between the low latency interactive sessions and high throughput upload/download sessions doesn't mix as that can have a huge impact on the interactive session.

Host *
  ControlPath ~/.ssh/master_%h_%p_%r

  ControlMaster auto



Alternate method, possibly useful for scripting


Another method I was looking at using was specifying a program to return the password in the SSH_ASKPASS environment variable. Unfortunately, this environment variable is only used in some rare circumstances (namely, when no tty is present, such as when a GUI program calls SSH or rsync), and would not normally be used when running SSH from a terminal (or in the script as I was doing).

Once I found out about the -M option I stopped pursuing this line of thinking, but it may be useful in a script if the above pre-authentication method is not practical (perhaps for unattended machines).

To make SSH respect the SSH_ASKPASS environment variable when running from a terminal, I wrote a small LD_PRELOAD library libnotty.so that intercepts calls to open("/dev/tty") and causes them to fail.

If anyone is interested, the code for this is in my junk repository (libnotty.so & notty.sh). You will also need a small script that echoes the password (I hope it goes without saying that you should check the permissions on it) and point the SSH_ASKPASS environment variable to it.
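
As an illustration of the kind of password script I mean, here is a minimal sketch that keeps the password in a separate file and refuses to print it if that file is readable by anyone other than its owner. This is an assumption about how you might want to do it, not the script I used: askpass_from_file and the file path in the usage note are made up, and it relies on a stat that supports -c (GNU coreutils or busybox).

```shell
#!/bin/sh
# Hypothetical askpass helper: print a password stored in a file, but only if
# the file is accessible by its owner alone (mode 600 or 400).
askpass_from_file() {
    mode=$(stat -c '%a' "$1") || return 1
    case "$mode" in
        600|400) cat "$1" ;;
        *) echo "refusing to read $1: mode is $mode" >&2; return 1 ;;
    esac
}
```

The script that SSH_ASKPASS points to would then just call something like askpass_from_file "$HOME/.ssh/askpass-secret".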

https://github.com/DarkStarSword/junk

Git trick: Deleting non-ancestor tags

Today I cloned the git tree for the pandaboard kernel, only to find that it didn't include the various kernel version tags from upstream, so running things like git describe or git log v3.0.. didn't work.

My first thought was to fetch just the tags from an upstream copy of the Linux kernel I had on my local machine:

git fetch -t ~/linus

Unfortunately I hadn't thought that through very well, as that local tree also contained all the tags from the linux-next tree, the tip tree as well as a whole bunch more from various distro trees and several other random ones, which I didn't want cluttering up my copy of the pandaboard kernel tree.

This led me to try to find a way to delete all the non-ancestor tags (compared to the current branch) to simplify the tree. This may be useful to others to remove unused objects and make the tree smaller after a git gc -- that didn't factor into my needs as I had specified ~/linus to git clone with --reference so the objects were being shared.

Anyway, this is the script I came up with. Note that this only compares the tags with the ancestors of the *current HEAD*, so you should be careful that you are on a branch with all the tags you want to keep first. Alternatively you could modify this script to collate the ancestor tags of every local/remote branch first, though this is left as an exercise for the reader.


#!/bin/sh

ancestor_tags=$(mktemp)
echo -n Looking up ancestor tags...\ 
git log --simplify-by-decoration --pretty='%H' > $ancestor_tags
echo done.

for tag in $(git tag --list); do
 echo -n "$tag"
 commit=$(git show "$tag" | awk '/^commit [0-9a-f]+$/ {print $2}' | head -n 1)
 echo -n ...\ 
 if [ -z "$commit" ]; then
  echo has no commit, deleting...
  git tag -d "$tag"
  continue
 fi
 if grep $commit $ancestor_tags > /dev/null; then
  echo is an ancestor
 else
  echo is not an ancestor, deleting...
  git tag -d "$tag"
 fi
done

rm -fv $ancestor_tags


Also note that this may still leave unwanted tags in if they are a direct ancestor of the current HEAD - for instance, I found a bunch of tags from the tip tree had remained afterwards, but they were much more manageable to delete with a simple for loop and a pattern.
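
For comparison, newer versions of git can do the ancestor check themselves: git tag accepts --no-merged to list only the tags that are not reachable from a given commit. The following is a sketch built on that option rather than a replacement I have tested against the kernel tree. The function name is made up, it needs a git recent enough to support --merged/--no-merged on git tag, and I have not checked how it treats tags that don't point at a commit, which the script above deletes.

```shell
#!/bin/sh
# Sketch: delete every tag that is not reachable from HEAD.
# Pass -n as the first argument to only list what would be deleted.
delete_non_ancestor_tags() {
    # Collect the list first so nothing is deleted if the listing fails.
    tags=$(git tag --no-merged HEAD) || return 1
    for tag in $tags; do
        if [ "$1" = "-n" ]; then
            echo "would delete $tag"
        else
            git tag -d "$tag"
        fi
    done
}
```

As with the original script, this only considers the current HEAD, so run it with -n first and make sure you are on the branch whose tags you want to keep.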

Sunday, 14 November 2010

Bluetooth 3G Modems on Debian Linux: Chatscripts and rfcomm bluez

I've been using 3G mobile broadband as my primary Internet connection for a couple of years now, and ever since I moved out of college it has become my only Internet connection at home - it's saved me the cost, delays and headache of dealing with Telstra to sort out some kind of wired link.

In my particular setup I removed the 3G data SIM card from the USB modem that came with my plan and placed it in my Nokia N900, which I use as a bluetooth modem for my various computers (my N95 used to fill this role) as well as having the convenience of having the N900 itself connected wherever and whenever I want.

Every now and again I get asked about my setup - a lot of people seem to have had trouble setting up bluetooth modems in Linux. This is understandable - last time I checked out Network Manager I found that it could set up a USB 3G modem pretty easily but had zero provisions to set up a bluetooth modem, and the Linux bluetooth stack (bluez) also leaves something to be desired (try using bluez 4 to pair to something without X... fail). I've previously been directing these people to some posts I made on the CLUG mailing list that had my configuration files, but it's clear that it will be easier to direct people to a blog post.

The quickstart guide for those people would be to scan this post, grab the file excerpts and place them where they belong, restart bluetooth and run the pon <profile> command to try to bring up the 3G connection. Then, when that inevitably doesn't work, read the rest of the article to figure out what you need to change to make it work. I should note that I'm using Debian, so some of this article may not apply to distributions that are not derived from Debian (the pon and poff commands came from Debian, for example).

Firstly, a little background on the technical details we care about: 3G modems provide PPP (Point-to-Point Protocol) links to your ISP, just like the dial-up modems of old did. We even use the same protocol and method to talk to them that we used to use to talk to dial-up modems - the AT command set over some kind of serial-like interface (itself encapsulated in a USB or bluetooth link).

A few things have changed though - for one they are much faster than dial-up modems. Authentication is also handled differently - we no longer (typically) use a username and password, instead handling the authentication in the SIM card. And instead of calling a phone number for your local ISP, we instead call a special number (such as *99#) to establish the link. Added to this, we now also have something called an APN (Access Point Name) to identify the IP packet data network that we want to communicate with.

There are a few important consequences of all of this. Firstly, we are using the same infrastructure (ppp, chatscripts, wvdial, ...) in Linux to connect to 3G that we used to use to connect old dial-up connections. Secondly, despite not requiring a username and password any more we still have to provide something in their stead to make everything happy even though they are ignored. We also still have the same nonsense of every ISP having a subtle difference in their authentication that affects how we connect to them. There can also be subtle differences in the AT commands we need to communicate with different modems to get them to do what we want.

Some people like using wvdial to establish their ppp links. If that works for you that's great, but my experience has been that wvdial fails in many circumstances, and getting it to work in those cases is quite often impossible, so I'm going to cover a much more tunable back to basics method: ppp + chatscripts + rfcomm.

Firstly, make sure ppp is installed (apt-get install ppp)... I hope you have some other connection than your 3G link to get that... Perhaps whatever you are reading this blog on?

We'll start with a USB connection - no sense adding the extra complexities of a bluetooth link to the mix until we have that working. I'll show the profiles I use for both the Huawei E220 USB modem that came with the plan and the USB link to my N900 (or N95).

We need two configuration files for each profile - the configuration for the ppp side of the link goes under /etc/ppp/peers/<profile> and the chatscript which tells the modem how to establish the ppp link under /etc/chatscripts/<profile>. The chatscript is referenced from the ppp configuration file, so it is possible to use one chatscript for multiple profiles, assuming the profiles are talking to the same modem (or at least that one modem doesn't require special treatment) and using the same APN.

The chatscript is responsible for initialising the modem and getting the connection to the point where pppd can take over, so I'll start with that. Here's the chatscript that I use for my Nokia N900 (USB and bluetooth), Nokia N95 and Huawei E220 USB modem:

/etc/chatscripts/optus-n900
ABORT BUSY
ABORT ERROR
ABORT 'NO CARRIER'
REPORT CONNECT
TIMEOUT 10
"" "ATZ"
OK "ATE1V1&D2&C1S0=0+IFC=2,2"
OK AT+CGDCONT=1,"IP","<APN>"
OK "ATE1"

OK "ATDT*99#"

CONNECT \c

IMPORTANT: Replace <APN> with the APN for your connection (for me on Optus post-paid mobile broadband that is "connect", for Lucy on Three pre-paid mobile broadband that is "3services" - refer to the documentation that came with your plan to find out what it is for you). If you don't you will run into inexplicable problems later.
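Since forgetting to substitute the APN is such an easy mistake to make, the chatscript above could also be generated rather than edited by hand. This is just a sketch: make_chatscript is a made-up helper name, and the script body is simply the one shown above with the APN filled in.

```shell
#!/bin/sh
# Print the chatscript from this post with the given APN filled in.
# Refuses to run if the APN is empty or still the literal placeholder.
make_chatscript() {
    apn=$1
    if [ -z "$apn" ] || [ "$apn" = "<APN>" ]; then
        echo "usage: make_chatscript APN" >&2
        return 1
    fi
    # Unquoted heredoc: $apn is expanded, \c and the quotes are kept literally
    cat <<EOF
ABORT BUSY
ABORT ERROR
ABORT 'NO CARRIER'
REPORT CONNECT
TIMEOUT 10
"" "ATZ"
OK "ATE1V1&D2&C1S0=0+IFC=2,2"
OK AT+CGDCONT=1,"IP","$apn"
OK "ATE1"
OK "ATDT*99#"
CONNECT \c
EOF
}
```

For my Optus connection that would be used as make_chatscript connect > /etc/chatscripts/optus-n900.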

I said above that some modems need to be treated specially in the chatscript. I used to need a separate script for my Huawei E220 because I could not find one script that would satisfy both it and my N95 (the AT+IPR line below was necessary for the E220, but caused the N95 to fail). The differences no longer seem to be necessary (firmware upgrade? Some other change I made and forgot about? Phase of the moon? I can't recall), but it might help someone so here it is:

/etc/chatscripts/optus-huawei
ABORT BUSY
ABORT ERROR
ABORT 'NO CARRIER'
REPORT CONNECT
TIMEOUT 10
"" "ATZ"
OK AT+CGDCONT=1,"ip","connect"
OK "ATE1V1&D2&C1S0=0+IFC=2,2"
OK "AT+IPR=115200"

OK "ATE1"

TIMEOUT 60
"" "ATD*99#"

CONNECT \c

Now we need a profile for ppp that references that chatscript and contains all the settings necessary to establish a successful ppp link. I have a number of these for different profiles, depending on which modem I'm using and whether I'm using my Optus link or Lucy's Three link, but they are all pretty similar and include some common elements, so I'll just show one combined file with comments for differences between them. All these options and more are described in man pppd:

/etc/ppp/peers/<profile>
# This can help track down problems:
#debug

# The modem device to talk to:
/dev/ttyACM0 # N900/N95 USB
#/dev/ttyUSB0 # Huawei USB
#/dev/rfcomm0 # N900 Bluetooth

# In some cases it may be necessary to specify a baud rate,
# but generally it's best to let ppp detect this:
#115200
#230400
#460800
#... etc

# Optus requires both of these options, Three requires neither.
# Other ISPs may have different authentication requirements:
refuse-chap
require-pap

# When to detach from the console:
updetach
#nodetach

# These are generally necessary:
crtscts
noauth
noipdefault

# If the connection drops out try to reopen it:
persist

# We want this to be the default internet connection:
defaultroute
replacedefaultroute

# Get DNS settings from the ISP:
usepeerdns

# not used, but we must provide something:
user "na"
password "na"

# Playing with these compression options *may* improve
# performance, but get it working first:
noccp
nobsdcomp
novj
#nodeflate

#What chatscript we are using in this profile:
connect "/usr/sbin/chat -s -S -V -f /etc/chatscripts/optus-n900"
#connect "/usr/sbin/chat -s -S -V -f /etc/chatscripts/optus-huawei"


Got that? Great, let's give it a go! Connect your modem by USB, do whatever magic incantations you need to get your modem to reveal its modem aspects to Linux (for Nokia phones this is usually selecting the PC suite mode when you plug it in; some people report having to do strange things with kernel modules and udev to poke their Huawei E220 modems, though I have never found that necessary myself), shut down your network manager and run this in a terminal:

pon <profile>

All going well hopefully you will see some output like this:

ATZ
OK
ATE1V1&D2&C1S0=0+IFC=2,2
OK
AT+CGDCONT=1,"IP","connect"
OK
ATE1
OK
ATDT*99#
CONNECTchat: Nov 14 13:12:13 CONNECT
Serial connection established.
Using interface ppp0
Connect: ppp0 <--> /dev/rfcomm0
PAP authentication succeeded
Cannot determine ethernet address for proxy ARP
local IP address www.xxx.yyy.zzz
remote IP address 10.6.6.6
primary DNS address 211.29.132.12
secondary DNS address 61.88.88.88

Obviously the exact output will vary, but usually if you see some IP and DNS addresses you have successfully connected. Otherwise you really should try to get this working before continuing to the bluetooth part. If you got as far as the CONNECT... "Serial connection established." your modem and chatscripts are probably working (assuming your APN in the chatscript is correct) and you may need to look at the ppp configuration, though you might just try a few times first - sometimes my connections take a few attempts to come up successfully.

If you haven't got as far as the CONNECT you'll need to check your modem, coverage and chatscripts to try to locate the problem. Also double check that you have specified the correct device in the ppp configuration. If you are using a phone as your modem you might try rebooting it. If you get a NO CARRIER you are likely out of coverage or your modem couldn't connect to a nearby base station for some other reason (such as it being full), though the symptoms for that are unfortunately not always consistent - failing to connect to the modem at all can also be a symptom of that (and a host of other possible causes) for instance.
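Those rules of thumb can be written down as a rough triage of a saved copy of the pon output. This is only a sketch of the reasoning above, not a real diagnostic tool: the function name triage_pon_log is made up, and it just looks for the same strings that appear in the example output.

```shell
#!/bin/sh
# Read captured pon/chat output on stdin and print one hint about where to look.
triage_pon_log() {
    log=$(cat)
    case $log in
        *"IP address"*)
            echo "connected: got IP addresses" ;;
        *"NO CARRIER"*)
            echo "no carrier: check coverage, or the base station may be full" ;;
        *"Serial connection established"*)
            echo "chatscript ok: check APN and ppp options, or just retry" ;;
        *)
            echo "no CONNECT: check modem, device name, coverage and chatscript" ;;
    esac
}
```

It would be used as triage_pon_log < saved-output.txt, where the file holds whatever the failed attempt printed.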

There's just too many things that can go wrong by this point for me to cover here. Google is your friend. You may be able to find other people's chatscripts and ppp configuration for your modem and/or ISP that you could try.

Now you've successfully got a connection with ppp + chatscripts it's time to add bluetooth into the mix. Serial connections over bluetooth are handled with the rfcomm protocol. They are controlled with the rfcomm program and once bound show up as /dev/rfcomm0 and similar. A device can have different serial services listening on different rfcomm "channels" (like IP ports), and there is no guarantee for which services appear on which rfcomm channel. My Nokia N95 reveals its modem on rfcomm channel 2 and its GPS on rfcomm channel 5 (via ExtGPS), while my N900 reveals its modem on rfcomm channel 1 (in fact it is actually running rfcomm -S -- listen -1 1 /usr/bin/pnatd {}). You can use an rfcomm scanner like rfcomm_scan from Collin Mulliner's BT Audit suite or do some trial and error to find the channel you need (there's only 30 channels and it's usually a low number).

Add a section like the following to your /etc/bluetooth/rfcomm.conf:

/etc/bluetooth/rfcomm.conf:
rfcomm0 {
 bind yes;
 device AA:BB:CC:DD:EE:FF;
 channel 1;
 comment "N900 Data";
}

Replace the bluetooth address and channel number as appropriate, then tell rfcomm to bind rfcomm0 to this device with rfcomm bind 0 (this will also happen automatically at boot).
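If you have several devices to set up, the stanza can be generated with a little sanity checking on the address and channel. This is a sketch only: rfcomm_stanza is a hypothetical helper and not part of bluez, and the 1 to 30 channel range is simply the one mentioned above.

```shell
#!/bin/sh
# Print an rfcomm.conf stanza: rfcomm_stanza INDEX ADDRESS CHANNEL [COMMENT]
rfcomm_stanza() {
    index=$1 addr=$2 channel=$3 comment=${4:-modem}
    # A bluetooth address is six colon-separated hex bytes
    if ! printf '%s\n' "$addr" | grep -Eqx '([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}'; then
        echo "bad bluetooth address: $addr" >&2
        return 1
    fi
    # Reject anything that is not a plain number between 1 and 30
    case $channel in
        ''|*[!0-9]*) echo "bad channel: $channel" >&2; return 1 ;;
    esac
    if [ "$channel" -lt 1 ] || [ "$channel" -gt 30 ]; then
        echo "channel must be between 1 and 30" >&2
        return 1
    fi
    cat <<EOF
rfcomm$index {
 bind yes;
 device $addr;
 channel $channel;
 comment "$comment";
}
EOF
}
```

For example, rfcomm_stanza 0 AA:BB:CC:DD:EE:FF 1 "N900 Data" >> /etc/bluetooth/rfcomm.conf reproduces the section above.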

You should now see a new file /dev/rfcomm0, which we use to communicate with the modem over bluetooth. Make a copy of the /etc/ppp/peers/<profile> you were using over USB and change the new profile to use /dev/rfcomm0 so that it connects over bluetooth.
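Copying the profile and swapping the device can be done in one step with sed. A minimal sketch, assuming the profile is laid out like the one shown earlier (the device on a line of its own starting with /dev/tty); make_bt_profile is a made-up name.

```shell
#!/bin/sh
# Copy a working USB ppp profile to a new one that uses /dev/rfcomm0.
# Only an uncommented /dev/tty... device at the start of a line is changed;
# commented-out alternatives and every other option are copied untouched.
make_bt_profile() {
    src=$1 dst=$2
    sed 's|^/dev/tty[A-Za-z0-9]*|/dev/rfcomm0|' "$src" > "$dst"
}
```

For example, make_bt_profile /etc/ppp/peers/optus-usb /etc/ppp/peers/optus-blue (the first name being whatever you called your USB profile). Any trailing comment on the device line is left as it was, so you may want to tidy that up by hand.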

Now, we need to pair the devices together and tell the phone to trust the computer to connect whenever it wants. Pairing in bluez is still a bit hairy, particularly if you aren't using KDE or GNOME (like me) which provide their own bluez agents. In that case you don't have many options available to you. Bluez 3 used to have a hack in which you could specify a PIN to pair with under /var/lib/bluetooth/<device>/pincodes to allow pairing without an agent, however that does not work in bluez 4. Bluez provides an example console agent in the examples directory, but I have never managed to get it to work reliably with bluez 3 or bluez 4, so we now need a bluez agent, which lacking any decent console/curses agents means we need X (FAIL). This nonsense is now true even of HID devices which could previously be paired and activated with a simple hidd --search, which now doesn't trust them to re-pair to the computer so they stop working as soon as they start power saving (FAIL). Sigh, one day I'll get around to writing a decent ncurses bluez agent if no one beats me to it, but I digress.

If you aren't using GNOME or KDE you might try using the GTK bluez agent blueman instead. You'll need to have its system tray applet (blueman-applet) running for blueman-manager to work properly (FAIL - I don't have a system tray. At least it doesn't actually need to show the tray icon to work, though if you want that "trayer" or "stalonetray" can be used to provide a temporary system tray).

Anyway, once you have some kind of bluez agent running, be it KDE's kbluetooth, gnome-bluetooth or blueman you can try to pair your phone. I say "try" because even with an agent, pairing with bluez is still hairy. In theory running the pon <profile> command will attempt to open the bluetooth link and initiate pairing, causing both phone and computer to ask for a PIN to authenticate each other - enter the same on each. If you're really lucky they might even remember that they have been paired so you don't have to do it again the next time. If you're unlucky and that didn't work you can try deleting any existing pairing from the computer and phone then using your bluetooth agent's interface to initiate a pairing. Rebooting and walking around your computer in circles while chanting "all hail bluez" over and over may also help - I wish you luck.

The good news is that you only need the bluez agent while pairing - once you successfully pair and manage to get the 3G link up (and down and up a second time to make sure it remembered what to do) you usually don't have to touch bluez again and things get a lot easier. Unless one of the devices' pairings gets lost or confused... Or your bluetooth address changes, or ...

Hopefully by this stage you have successfully managed to pair your computer and phone, and you should be able to use the pon and poff commands to bring the connection up and down as above. Congratulations, you're done! You can stop reading now. If you are getting a "host is down" error you have not successfully paired or the bluetooth link has otherwise failed. Another symptom of (non-pairing) bluetooth related problems that I've seen was getting no OK response after the initial ATZ. If you are pairing OK but only getting partway through the connection sequence you may have to go back to debugging your chatscripts and ppp options like I talked about above.


The (broadcom) bluetooth dongle I use on my EeePC introduces another complexity to the process - every time it is plugged in a couple of bits in its bluetooth address change at random for no good reason (check with hciconfig), which as you can imagine makes it rather hard to maintain a pairing between it and anything else. I've also come across some (broadcom) bluetooth dongles with a bluetooth address of 00:00:00:00:00:00. Oddly enough, very few devices like pairing with them, and fewer still will re-pair with them automatically. If you have this problem tell broadcom they suck, buy a CSR dongle, or you might try the bdaddr utility in the bluez source to force them to use a particular bluetooth address (if they support changing it through software, which of course is no guarantee). The script I use on my EeePC to connect shuts down my network manager and any running DHCP client, changes the bluetooth address on the dongle and opens the 3G connection:

/etc/init.d/wicd stop
killall dhclient
killall dhclient3

/usr/local/sbin/bdaddr AA:BB:CC:DD:EE:FF
hciconfig hci0 reset

pon optus-blue

Tuesday, 9 November 2010

Remind+wyrd events in other timezones & other tricks

When I bought my EeePC I challenged myself to wherever possible find lightweight (console/curses if possible) and keyboard friendly alternatives to the software I had been using. What I discovered was that I quickly began to prefer that way of interacting with the computer to my previous KDE centric setup, so now almost all of my desktops and laptops have the same setup.

One application which I sought to replace was a calendar. I discovered a lightweight console calendar program called "remind" with an ncurses frontend known as "wyrd".


A basic event in a file processed by remind might look something like this:

REM Nov 09 2010 AT 18:00 MSG Write a blog entry

That should be reasonably self explanatory. You can also specify some quite advanced recurring events in fairly natural ways:

REM Mon Tue Wed Thu Fri AT 9:00 MSG Go to work

REM Dec 25 MSG Christmas!

Or to specify the fourth Thursday of every month (Technically the next Thursday on or after the 22nd of any month):

REM Thursday 22 AT 19:00 DURATION 3:00 MSG Canberra Linux Users Group Meeting

There are also syntaxes for advance reminders (+) and repetition (*) - but this isn't a full remind tutorial, read the man pages or search google (tip: add wyrd in your search to narrow the results down).

You may have noticed that I never specified a timezone in those examples. Unfortunately remind was written a long time ago on a hermit like platform that knew nothing of how time worked elsewhere in the world (DOS) and as a result doesn't have any support for events in other timezones built in. Just defining the event in local time may not be suitable depending on what both timezones do with daylight savings.

But there is another thing you should know about remind - it's not just a calendar domain specific language (though as you can see from those examples it certainly includes plenty of DSL constructs), it is in fact a calendar oriented programming language and we can use that to work around this limitation.

Seriously, let me say that one more time. My calendar is specified in a programming language. That is awesome. I can specify events to only occur once every blue moon---for real. I could shell out and have reminders only occur if my IP address indicates I'm at the office. Seriously, it could remind me to catch the bus only if I haven't already done so (note to self: make it do that, that would be cool).

Specifying a one off event in another timezone isn't in itself terribly difficult:

REM [trigger(tzconvert('2010-09-11@18:20', "US/Pacific"))] +30 DURATION 1:00 MSG Look up

The problem with this method is that there is no way to specify advanced recurrence. tzconvert takes a datetime and returns a datetime. There's no way to say "every monday in that timezone" or "every fortnight commencing on x in that timezone" or "on the last Sunday of October every year in that timezone", which remind has no trouble doing for local events.

Remind's programming language capability is unfortunately somewhat limited - mixing the DSL grammar and functions together is a bit kludgey. It's easy to cast the output of a function to a string and use it in the grammar (as above), but going the other way is a little more difficult. For instance, variables are set using the SET command, but if there is any way to set a variable from a function it has escaped me. Functional programming techniques may be usable to work around this, but I get the impression that remind's author didn't exactly design it with that in mind - for one thing recursive calls are explicitly disallowed.

But, we can INCLUDE another file, which will then be executed by remind (even if it's included multiple times) and will be able to use the DSL commands and have access to any variables already defined, so we can use that mechanism to create a function that will do what we want. After a bit of playing around today I finally settled on this:

# USAGE:
# SET these variables then INCLUDE this script:
#
# tz_src - the timezone the event is in
# tz_src_date - the date component of the event as would be passed to REM,
# including any repetition and reminders
# tz_src_time - the time component of the event in hh:mm form
# tz_src_trem - any time repetition, reminders, DURATION, etc. as passed into
# REM (if not desired, set to "")
# tz_msg - The message to print.
#
# Afterwards tz_dst_time will be set for *today's* occurrence of the event in
# localtime, or unset if no event occurs.


# Find next date in src timezone that occurs today() in localtime:
REM [tz_src_date] SCANFROM [trigger(today()-2)] UNTIL [trigger(today()+2)] SATISFY \
 coerce("DATE", tzconvert(datetime(trigdate(), tz_src_time), tz_src)) == today()
IF trigvalid()
 # We know local date is today from SATISFY, convert time to local:
 SET __dst_dt tzconvert(datetime(trigdate(), tz_src_time), tz_src)
 SET tz_dst_time coerce("TIME", __dst_dt)

 REM [trigger(today())] AT [tz_dst_time] [tz_src_trem] MSG [tz_msg]
ELSE
 UNSET tz_dst_time
ENDIF

That searches for a date the event occurs on in the other timezone that satisfies the condition that the event occurs today() in the local timezone (today() is not necessarily the actual system date, it could be a specific date being looked up or the date of a calendar entry being computed). The source date can be specified with any of the usual remind recurrence constructs, just like an ordinary event. I've noticed some parse errors using this with a one off event on days the event does not occur - I think it might be a bug in remind for non-recurring events with a SATISFY clause that returns 0, but if someone can see something I've done wrong there I'd welcome the feedback. Anyway, for one off events you can just use the more concise syntax above. I've tried a few different forms of recurring events and haven't yet seen the error on any of them.
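To make the calling convention concrete, here is how the variables might be set before the INCLUDE, written out from the shell as a small reminder file. Treat it as an untested sketch: the path /path/to/tz-event.rem is a placeholder for wherever the script above was saved, write_tz_example is a made-up name, and the event itself (a Monday 18:00 US/Pacific call with a 30 minute advance warning) is invented for illustration.

```shell
#!/bin/sh
# Write an example reminder file that drives the timezone include script.
# The quoted heredoc delimiter keeps everything inside it literal.
write_tz_example() {
    cat > "$1" <<'EOF'
SET tz_src "US/Pacific"
SET tz_src_date "Mon"
SET tz_src_time 18:00
SET tz_src_trem "+30 DURATION 1:00"
SET tz_msg "Weekly call with the US office"
INCLUDE /path/to/tz-event.rem
EOF
}
```

Note that tz_src_time is given as a bare time rather than a string, since the script passes it straight to datetime().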


The title of this post says "and other tricks", so I should probably show you some. I have a weekly meeting whose time varies depending on daylight savings (to better accommodate people elsewhere in the world who call in), so I've come up with this trick of checking whether each Friday is in (local) daylight savings time to accommodate this (try doing this in iCal!):

REM Fri SATISFY 1
IF isdst(trigdate())
 REM [trigger(trigdate())] +2 SKIP AT 09:30 DURATION 0:30 MSG Some meeting
ELSE
 REM [trigger(trigdate())] +2 SKIP AT 08:30 DURATION 0:30 MSG Some meeting
ENDIF



Finally, for anyone in Canberra, here is a list of public holidays you can import into your remind file. These should take care of any of the floating public holidays as well, and you can use the SKIP keyword to have an event automatically be cancelled if it falls on a public holiday, or the BEFORE or AFTER keywords to move it to another day. The only thing these can't predict is any meddling from the Government:

# Public Holidays
FSET next_monday(x) x + (7-wkdaynum(x-1))
FSET next_monday_inc(x) x + (7-wkdaynum(x-1))%7
FSET weekend(x) wkdaynum(x) == 0 || wkdaynum(x) == 6

OMIT Jan 1 SPECIAL COLOR 255 255 255 New Year's Day
REM Jan 1 SCANFROM [trigger(today()-7)] SATISFY weekend(trigdate())
OMIT [trigger(next_monday_inc(trigdate()))] SPECIAL COLOR 255 255 255 New Year's Day Holiday
OMIT Jan 26 SPECIAL COLOR 255 255 255 Australia Day
REM Jan 26 SCANFROM [trigger(today()-7)] SATISFY weekend(trigdate())
OMIT [trigger(next_monday_inc(trigdate()))] SPECIAL COLOR 255 255 255 Australia Day Holiday
REM Mon Mar 8 SCANFROM [trigger(today()-7)] SATISFY 1
OMIT [trigger(trigdate())] SPECIAL COLOR 255 255 255 Canberra Day
SET easter EASTERDATE(YEAR(TODAY()))
OMIT [TRIGGER(easter-2)] SPECIAL COLOR 255 255 255 Good Friday
REM [TRIGGER(easter-1)] SPECIAL COLOR 255 255 255 Easter Saturday
REM [TRIGGER(easter)] SPECIAL COLOR 255 255 255 Easter Sunday
OMIT [TRIGGER(easter+1)] SPECIAL COLOR 255 255 255 Easter Monday
OMIT Apr 25 SPECIAL COLOR 255 255 255 Anzac Day
REM Apr 25 SCANFROM [trigger(today()-7)] SATISFY weekend(trigdate())
OMIT [trigger(next_monday_inc(trigdate()))] SPECIAL COLOR 255 255 255 Anzac Day Holiday
REM Mon Jun 8 SCANFROM [trigger(today()-7)] SATISFY 1
OMIT [trigger(trigdate())] SPECIAL COLOR 255 255 255 Queen's Birthday
REM Mon Oct 1 SCANFROM [trigger(today()-7)] SATISFY 1
OMIT [trigger(trigdate())] SPECIAL COLOR 255 255 255 Labour Day
OMIT 25 Dec SPECIAL COLOR 255 255 255 Christmas
OMIT 26 Dec SPECIAL COLOR 255 255 255 Boxing Day
REM 25 Dec SCANFROM [trigger(today()-7)] SATISFY weekend(trigdate())
IF trigvalid()
 OMIT [trigger(next_monday_inc(trigdate()) )] SPECIAL COLOR 255 255 255 Christmas Holiday
 OMIT [trigger(next_monday_inc(trigdate())+1)] SPECIAL COLOR 255 255 255 Boxing Day Holiday
ENDIF
REM 26 Dec SCANFROM [trigger(today()-7)] SATISFY wkdaynum(trigdate()) == 6
OMIT [trigger(next_monday_inc(trigdate()))] SPECIAL COLOR 255 255 255 Boxing Day Holiday
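The arithmetic in the two next_monday helpers at the top of that list is easier to trust after trying it on each weekday. The sketch below redoes the same calculation in the shell, taking remind's wkdaynum() numbering (0 = Sunday through 6 = Saturday) as input and returning the number of days added; the function names are mine.

```shell
#!/bin/sh
# Offset added by next_monday_inc(x): 0 if x is already a Monday,
# otherwise the number of days until the following Monday.
next_monday_inc_offset() {
    prev=$(( ($1 + 6) % 7 ))     # wkdaynum(x-1)
    echo $(( (7 - prev) % 7 ))
}

# Offset added by next_monday(x): as above, but a Monday moves on a full week.
next_monday_offset() {
    prev=$(( ($1 + 6) % 7 ))     # wkdaynum(x-1)
    echo $(( 7 - prev ))
}

# Print the offsets for every weekday number
for wd in 0 1 2 3 4 5 6; do
    echo "$wd: inc=$(next_monday_inc_offset "$wd") next=$(next_monday_offset "$wd")"
done
```

So a holiday falling on a Saturday (6) moves forward two days and one falling on a Sunday (0) moves forward one day, which is what the weekend rules above rely on.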

Wednesday, 14 July 2010

Fun with Foreign Debian Bootstrapping

Yesterday I found myself booting Linux on a device with no attached permanent storage - all I had was several gigabytes of RAM and the ability to netboot it through TFTP. I had been using a very minimal root filesystem inside the kernel image, but I began to wonder if it would be possible to have an entire Debian installation in the ramdisk instead - the box certainly had enough RAM to fit a minimal installation.

Ordinarily one could just use debootstrap to set up a minimal Debian installation inside a directory and make a ramdisk from that, but this was further complicated by the fact that this was a PowerPC device. Debootstrap does have a --foreign option to perform the first part of the installation on a different architecture, but the --second-stage still needs to be run as root on native hardware and assumes that it is being run from within an existing Linux installation with a bunch of standard tools available to it.

The only machines I had root on were all x86 (other than the device in question, but the ramdisk I had been using had some limitations that would have complicated matters, and some other test boxes, which I would have had to wait to requisition). So instead I decided to do a partial debootstrap on my local x86 box and complete the installation using only my local x86 box and that partial image booted on the PowerPC box.

If you are following this article as a guide I should note that it assumes you are able to compile and boot your own kernel and have a decent familiarity with Linux in general.

So first, begin the debootstrap process, but use --foreign to only perform the first part of the bootstrapping process (NOTE: almost everything here needs to be run as root, signified by the # at the start of each line):

# mkdir deb-ppc
# debootstrap --arch=powerpc --foreign squeeze deb-ppc http://<mirror>/debian

After this command completes you have an incomplete Debian installation in deb-ppc - some basic tools are installed (but not configured) and some packages have been downloaded but not installed. I did not select any additional packages into the initial root disk at this stage, though had I been thinking ahead it would have been useful to also include openssh-server and rsync, but that was not a major setback for me. You might want to include them, and if you don't like vi or nano you might also want to install your console editor of choice. At the moment the root disk is not bootable, so let's fix that:

# ln -s /bin/bash deb-ppc/init

This still won't boot into a full Debian installation - after the kernel finishes its initialisation and tries to spawn the init userspace process to take over booting, it will instead spawn an interactive shell which can be used to complete the bootstrapping process. Since I'm bundling this inside the kernel image as an initramfs as opposed to an initrd loaded separately, I link an interactive shell into /init. If you were doing this with an initrd you would instead link it to /initrd.

Before we can make a ramdisk image from that directory we need to save this script from Documentation/filesystems/ramfs-rootfs-initramfs.txt in the kernel sources as mkinitramfs.sh:
#!/bin/sh

# Copyright 2006 Rob Landley <rob@landley.net> and TimeSys Corporation.
# Licensed under GPL version 2

if [ $# -ne 2 ]
then
  echo "usage: mkinitramfs directory imagename.cpio.gz"
  exit 1
fi

if [ -d "$1" ]
then
  echo "creating $2 from $1"
  (cd "$1"; find . | cpio -o -H newc | gzip) > "$2"
else
  echo "First argument must be a directory"
  exit 1
fi

NOTE: when using this script be sure you are calling this script and not a separate program also named mkinitramfs from your distribution.

Let's bundle the root disk into a cpio image:

# ./mkinitramfs.sh deb-ppc ramdisk.cpio.gz
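Before spending time on a kernel build it may be worth checking that the image really does contain an init entry at its top level, since that is what the kernel will try to run. A small sketch, assuming GNU cpio is installed on the build box; image_has_init is a made-up name, and both spellings of the entry are accepted because I am not certain whether the leading ./ is kept in the listing.

```shell
#!/bin/sh
# Succeed if the gzipped cpio image has an init entry at its top level.
image_has_init() {
    # cpio -t lists the archive members; its block count goes to stderr
    gzip -dc "$1" | cpio -t 2>/dev/null | grep -qx -e 'init' -e './init'
}
```

For example: image_has_init ramdisk.cpio.gz && echo looks bootable.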

Now you need to compile the kernel and netboot it - I'll leave the details of how to actually do that out of this article, as there are plenty of good resources for that around already and the netboot procedure may vary depending on your setup (if you are netbooting at all). If you are doing this with an initramfs like I am you will need to point CONFIG_INITRAMFS_SOURCE to that image - once you have configured the kernel, edit the .config file and remove the 'CONFIG_INITRAMFS_SOURCE=""' line. Then run make oldconfig, which will ask you to set that option as well as some UID and GID mapping (which you can leave as 0 since the image should already have the correct ownership). After that you can run make and wait for the kernel to build. I'll also assume you know which zImage is the correct one to boot on your hardware.
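If you rebuild often, the .config edit can be scripted. The sketch below takes a slightly different route from the manual steps: rather than removing the line and answering the prompt, it writes the option directly, after which I would expect make oldconfig to only ask about the UID and GID mapping. The function name set_initramfs_source is made up.

```shell
#!/bin/sh
# Point CONFIG_INITRAMFS_SOURCE in a kernel .config at a cpio image.
# Any existing setting is dropped and the new one appended.
set_initramfs_source() {
    config=$1 image=$2
    tmp=$(mktemp) || return 1
    grep -v '^CONFIG_INITRAMFS_SOURCE=' "$config" > "$tmp"
    printf 'CONFIG_INITRAMFS_SOURCE="%s"\n' "$image" >> "$tmp"
    # Write back through the original file to keep its permissions
    cat "$tmp" > "$config"
    rm -f "$tmp"
}
```

For example: set_initramfs_source .config /full/path/to/ramdisk.cpio.gz followed by make oldconfig.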

Once you have successfully booted the kernel you should find yourself at a bash prompt. You should be aware that the environment is extremely limited at this point - for one thing there is no job control so don't try to spawn a process that you need to ctrl+c out of (I made the mistake of pinging a host to check that the network was up).

The debootstrap --second-stage did not work for me, so instead I completed the installation manually:

# export PATH=/usr/sbin:/usr/bin:/sbin:/bin
# dpkg --force-depends --install /var/cache/apt/archives/*.deb

A few things may complain during that and you may need to tell apt to fix up any problems:

# apt-get -f install

Now you will have a much more complete userspace - including vi. There's a few more things we need to do to get the system usable. Firstly, let's edit /etc/fstab and add an entry for /proc since so much userspace depends on it:

# vi /etc/fstab

proc /proc proc defaults 0 0

# mount /proc

Now we should probably get networking set up (I'm assuming you are using DHCP and your interface is eth0):

# vi /etc/network/interfaces

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet dhcp

# vi /etc/hostname
# ifup lo
# ifup eth0

Do not make the mistake I made of checking if the interface is up by pinging something. You can run ifconfig to make sure your IP address looks right.
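If you really do want to ping something from a shell without job control, the trick is to make sure ping stops by itself. A minimal sketch using ping's -c option to limit the number of packets sent; bounded_ping is just a made-up wrapper name, and the default of 3 packets is arbitrary.

```shell
#!/bin/sh
# Ping HOST a fixed number of times (default 3) so no ctrl+c is needed.
bounded_ping() {
    ping -c "${2:-3}" "$1"
}
```

For example, bounded_ping 192.168.1.1 sends three packets to a (hypothetical) gateway address and returns to the prompt on its own.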

And set up apt (note the /debian suffix in sources.list which isn't in the template provided by debootstrap - I spent around 10 minutes contemplating the 403 I was getting before I noticed that):

# vi /etc/apt/sources.list

deb http://<mirror>/debian squeeze main

# vi /etc/apt/apt.conf.d/10local
APT::Install-Recommends "0";
APT::Install-Suggests "0";

# apt-get update

Now you can install any additional packages you may need (if you didn't do this in the initial debootstrap), so let's install what we need to be able to copy our changes out of the machine (interactive SSH won't work just yet, but file copying will):

# apt-get install openssh-server rsync
# passwd
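If you would rather have these packages in the image from the start, debootstrap has an --include option that takes a comma-separated list of extra packages to pull in. I haven't tried it in combination with --foreign myself, so treat this as a sketch: it only prints the command rather than running it, and foreign_bootstrap_cmd is a made-up helper name.

```shell
#!/bin/sh
# Print a first-stage debootstrap command with extra packages included.
# Arguments: target directory, mirror URL, comma-separated package list.
foreign_bootstrap_cmd() {
    target=$1 mirror=$2 extras=$3
    echo "debootstrap --arch=powerpc --foreign --include=$extras squeeze $target $mirror"
}
```

Calling foreign_bootstrap_cmd deb-ppc "http://<mirror>/debian" openssh-server,rsync prints the command from the start of this article with the two packages added.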

Note that if you are interacting with the machine via serial it may be a bit awkward to interact with the configuration for some packages (such as localepurge) so just install the bare essentials for the moment. After installing some packages it's probably a good idea to clean the apt cache since we are likely pretty tight on RAM:

# apt-get clean

Speaking of serial, if you are logging into the machine via serial (as I was) you may want to spawn a console on the serial line:

# vi /etc/inittab

T0:2345:respawn:/sbin/getty -L ttyS0 57600 vt100

Back on the x86 box we can now copy all those changes back into the ramdisk and make it actually boot Debian:

# rsync -avx <host>:/ deb-ppc
(NOTE: the x is important, otherwise /proc will be copied as well)
# rm deb-ppc/init
# ln -s /sbin/init deb-ppc/init
# ./mkinitramfs.sh deb-ppc ramdisk.cpio.gz
(again, the mkinitramfs from the kernel doc, not a distro)

Again, compile the kernel and boot it. You will need to do this last part every time you make a change in the ramdisk that you want to make persistent.

Once booted you will be able to interactively SSH into it and will find you now have a complete Debian installation you can do whatever you like with within the constraints of the available RAM. With full SSH, job control and proper TTY management you can now perform some changes that would have been a little tricky earlier, such as reconfiguring any packages you couldn't configure properly earlier (tzdata for me) and stripping out unneeded locales (this messed up a little for me since locales wasn't installed before localepurge. I haven't tested this and it's probably longer than it needs to be, but I think it will work):

# apt-get install locales
# locale-gen en_AU.UTF-8
# dpkg-reconfigure locales
# apt-get install localepurge
# localepurge
# apt-get clean

You might also want to strip out some unneeded packages, for example with:

# apt-get purge logrotate mac-fdisk rsyslog yaboot info install-info man-db manpages nano

Remember to follow the above instructions to make those changes persistent if you are happy with them. Later I'll probably play around with docpurge (from maemo) and look at other ways of reducing the size of the image (disabling logging is probably a good place to start).

If you're after some further reading on booting the kernel with initial ramdisks, check out Documentation/early-userspace/README and Documentation/filesystems/ramfs-rootfs-initramfs.txt in the kernel source.