Namespaces (part 1): Plan 9

namespaces
Author

Emmanuel Jeandel

Published

March 10, 2024

This is the first post in a series about namespaces, and how operating systems other than Linux implement some form of them.

The series starts with Plan 9.

If you want to learn more about Plan 9, I’d suggest the series of YouTube videos by adventuresin9 which, contrary to most of the official docs, does a good job of explaining how everything works.

The commands here have been tested with 9front, as sink network interfaces do not seem to exist in the original Plan 9. I won’t pretend I understand everything that happens behind the scenes.

Plan 9

Plan 9 adheres to the Unix philosophy that “everything is a file” and takes it to the extreme (note that Plan 9 is not a Unix, even though it might look very similar at first).

Not only is everything a file (including most devices), but all communication with these files is done using open/read/write, and there is no need for special-purpose ioctl calls.

For instance, to connect to a TCP server, you just need to open the file /net/tcp/clone to create a TCP connection, and read a number N from it. Next you can write connect 10.0.0.1!80 to the file /net/tcp/N/ctl to connect to the given address, and then communicate with the remote server using the file /net/tcp/N/data. This can be scripted, of course.
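To make the order of operations concrete, here is a minimal Python sketch of those three steps. It is only an illustration of the sequence of file operations, not something I ran on 9front: the root of the network tree is a parameter (net_root, my own name) so that the steps can be tried against a mock directory, and make_mock_net is a hypothetical helper that builds such a mock. The sketch keeps the clone file open while the connection is in use, which is my understanding of what a real /net expects.

```python
import os
import tempfile


def dial_tcp(net_root, address):
    """Open a TCP connection through the file interface rooted at net_root.

    Returns (clone_fd, data_path). clone_fd is kept open on purpose; the
    caller closes it when done with the connection.
    """
    # Step 1: open clone to allocate a connection, then read its number N.
    clone_fd = os.open(os.path.join(net_root, "tcp", "clone"), os.O_RDWR)
    n = os.read(clone_fd, 32).decode().strip()

    # Step 2: write the connect message to <net_root>/tcp/N/ctl in one write.
    conn_dir = os.path.join(net_root, "tcp", n)
    ctl_fd = os.open(os.path.join(conn_dir, "ctl"), os.O_WRONLY)
    try:
        os.write(ctl_fd, ("connect " + address).encode())
    finally:
        os.close(ctl_fd)

    # Step 3: the data file is where bytes to and from the server go.
    return clone_fd, os.path.join(conn_dir, "data")


def make_mock_net(n="4"):
    """Build a throwaway tree that mimics /net/tcp (hypothetical helper)."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "tcp", n))
    with open(os.path.join(root, "tcp", "clone"), "w") as f:
        f.write(n)
    for name in ("ctl", "data"):
        open(os.path.join(root, "tcp", n, name), "w").close()
    return root


if __name__ == "__main__":
    root = make_mock_net()
    clone_fd, data_path = dial_tcp(root, "10.0.0.1!80")
    print(data_path)
    os.close(clone_fd)
```

On a real system net_root would be /net, and reading and writing the returned data path would talk to the server.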

Similarly, to kill a process, one can just do echo kill > /proc/PID/ctl.

Another peculiarity is that the filesystem is per process. Every process can mount/bind a volume and it will only be seen from that process. For instance, mounting a USB key in one terminal does not mean the USB key will be seen in another.

Mount namespaces

This means it is particularly simple to have a process with a completely different filesystem from the other processes (technically that is always the case in Plan 9!).

For instance, suppose that your HOME directory is /usr/glenda. By typing the following command in a terminal:

bind -c /some_directory /usr/glenda

/usr/glenda now contains the content of /some_directory and your process cannot see the original files in /usr/glenda anymore.

There is a small problem, however. Using the command ns, the user can see all mounted volumes, and will see:

term% ns
bind '#c' /dev
bind '#d' /fd
...
bind -c /some_directory /usr/glenda

so the user can do

unmount /usr/glenda

to see the original files!
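The behaviour described so far (a bind hides the original directory, other terminals are unaffected, and unmount brings the original back) can be pictured with a toy model. The Python sketch below is only a mental model under simplifying assumptions: the class Namespace and its methods are names I made up, each mount point holds a single source, and nothing here reflects how the kernel actually stores its mount table.

```python
class Namespace:
    """Toy model of one process's view: mount point -> source directory."""

    def __init__(self, table=None):
        self.table = dict(table or {})

    def copy(self):
        # A new terminal starts from its own private copy of the table.
        return Namespace(self.table)

    def bind(self, source, mountpoint):
        # Models `bind source mountpoint`: the source replaces the old view.
        self.table[mountpoint] = source

    def unmount(self, mountpoint):
        # Models `unmount mountpoint`: the original content shows again.
        self.table.pop(mountpoint, None)

    def resolve(self, path):
        # Rewrite `path` through the longest matching mount point, if any.
        for mountpoint in sorted(self.table, key=len, reverse=True):
            if path == mountpoint or path.startswith(mountpoint + "/"):
                return self.table[mountpoint] + path[len(mountpoint):]
        return path


if __name__ == "__main__":
    term_a = Namespace()
    term_b = term_a.copy()

    term_a.bind("/some_directory", "/usr/glenda")
    print(term_a.resolve("/usr/glenda/notes"))  # prints /some_directory/notes
    print(term_b.resolve("/usr/glenda/notes"))  # prints /usr/glenda/notes

    term_a.unmount("/usr/glenda")
    print(term_a.resolve("/usr/glenda/notes"))  # prints /usr/glenda/notes
```

The last line is the problem in a nutshell: as long as the process can remove the entry, the hidden files are one command away.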

If you want to protect the files correctly, the best approach is to create another root entirely, without the executable bin/unmount. Or, at the very least, you should also remount /bin to exclude the file bin/unmount.

Network namespaces

By mounting different directories as /net, it is easy to have different processes see different network stacks.

In the following, I will illustrate this with a recipe for two environments that think they have the IP addresses 10.0.0.1 and 10.0.0.2 respectively, on two different network cards.

Here are the commands. First we create two empty directories /tmp/net.1 and /tmp/net.2 and associate each of them with a virtual network card.

mkdir /tmp/net.1
mkdir /tmp/net.2

bind -a '#l1:sink ea=001122334402' /tmp/net.1
bind -a '#I1' /tmp/net.1
bind -a '#l2:sink ea=deadbeef0011' /tmp/net.2
bind -a '#I2' /tmp/net.2

Intuitively, the first bind adds the network interface /ether1 (resp. /ether2) to the /tmp/net.X directory. The second bind adds the layer 3/4 services (IP/ICMP/UDP/TCP) for that network interface. The sink argument says the network card is virtual. The ea argument is for the ethernet address.
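The -a flag is what lets two devices share one directory: each bind -a appends its content after what is already there, so /tmp/net.X ends up as a union of the ethernet device and the IP stack. Here is a small Python sketch of that idea. UnionDir is a name I invented, and the entries listed for each device are only illustrative, not the full content of the real devices.

```python
class UnionDir:
    """Toy union directory: an ordered list of layers, searched front to back."""

    def __init__(self):
        self.layers = []  # each layer maps an entry name to the device providing it

    def bind_after(self, layer):
        # Models `bind -a`: the new content goes at the end of the union.
        self.layers.append(dict(layer))

    def bind_before(self, layer):
        # Models `bind -b`: the new content goes at the front of the union.
        self.layers.insert(0, dict(layer))

    def lookup(self, name):
        # The first layer that has the name wins.
        for layer in self.layers:
            if name in layer:
                return layer[name]
        return None

    def listing(self):
        # Every name visible in the union, without duplicates, in layer order.
        seen = []
        for layer in self.layers:
            for name in layer:
                if name not in seen:
                    seen.append(name)
        return seen


if __name__ == "__main__":
    net1 = UnionDir()
    net1.bind_after({"ether1": "#l1"})                      # the virtual card
    net1.bind_after({"ipifc": "#I1", "tcp": "#I1",
                     "udp": "#I1", "icmp": "#I1"})          # the IP stack
    print(net1.listing())       # prints ['ether1', 'ipifc', 'tcp', 'udp', 'icmp']
    print(net1.lookup("tcp"))   # prints #I1
```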

Now we build a bridge, and connect both interfaces to the bridge:

bind -a '#B1' /net

echo 'bind ether port1 0 /tmp/net.1/ether1' > /net/bridge1/ctl
echo 'bind ether port2 0 /tmp/net.2/ether2' > /net/bridge1/ctl

and we add IP addresses to these interfaces:

ip/ipconfig -x /tmp/net.1 ether /tmp/net.1/ether1 10.0.0.1 255.255.255.0
ip/ipconfig -x /tmp/net.2 ether /tmp/net.2/ether2 10.0.0.2 255.255.255.0
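With the mask 255.255.255.0, both addresses fall in the same subnet, which is why the two environments should be able to reach each other through the bridge without any routing. A quick way to check this kind of thing is Python's standard ipaddress module (same_subnet is just a helper name I chose):

```python
import ipaddress


def same_subnet(addr_a, addr_b, mask):
    """Return True when both addresses belong to the same network under mask."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{mask}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{mask}").network
    return net_a == net_b


if __name__ == "__main__":
    print(same_subnet("10.0.0.1", "10.0.0.2", "255.255.255.0"))  # prints True
```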

Now we want to open two terminals: one where the network is given by what is currently in /tmp/net.1 and the other by what is currently in /tmp/net.2. But remember, all these commands were typed in one terminal (say, terminal A) and filesystems are per process, so terminals other than terminal A won’t see what is in these directories.

So we can’t do bind /tmp/net.1 /net in another process, as its /tmp/net.1 directory is empty.

To do that correctly, we will use srvfs as a relay file server. Everything in /srv is seen everywhere, so we will put our two configuration directories inside /srv with the commands

srvfs pc1 /tmp/net.1
srvfs pc2 /tmp/net.2

This creates the files /srv/pc1 and /srv/pc2, which give access to the two directories once mounted.

Now, in two different terminals, we can do mount /srv/pc1 /net and mount /srv/pc2 /net respectively, and each terminal will use its own virtual interface as its default interface. You can then try ip/ping 10.0.0.2 from the first terminal.

The same remark as in the previous section applies: each terminal can still do unmount /net to recover the original network interface. Some more work has to be done to prevent the use of unmount.

Remarks

There were a few namespace things that I wasn’t able to do. Some of them, after some thinking, might not make sense given how the OS is structured. In particular, I would like to have terminals with different PID namespaces (and therefore different /proc namespaces). I don’t see anything that would break with that in place, but I certainly didn’t look closely enough.

Next Time

Next time, we examine Solaris zones.