Research Project

Process Isolation

  • Set up a Fedora testing VM for easier and more consistent testing.

    • Fedora 35 uses pure cgroups v2 (the unified hierarchy) by default - very useful!
    • Sorted out backups for the repo and the above machine.
  • Chose a C testing framework, so that the assumptions the project builds on can be checked as assertions.

    • Used Unity. It is simple, and the examples show that it doesn’t do much beyond the minimum - important for this low-level code with syscalls.
    • Forking and cloning don’t cause any problems as long as each child process reliably exits (e.g. via _exit) rather than returning from the test function.
  • Began writing assertion tests for the flags of the Linux syscall clone3. This is important, as many of the flags behave in ways I find surprising going by their names alone. clone3 and the resulting processes/namespaces are going to provide the majority of the process separation in this project.

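    • CLONE_FS: Links specific bits of filesystem metadata between the processes: the root directory, the working directory (PWD) and the umask.

      • Importantly, the child starts with a copy of the parent’s values regardless of the flag. What CLONE_FS adds is that the two processes stay linked, so a later change in one (e.g. a chdir) is seen by the other.
    • CLONE_FILES: Links the file descriptor tables of the processes.

      • Again, the child gets a copy either way - without this flag all fds are inherited, referring to the same underlying open files, but the two tables are independent from then on. With the flag the table itself is shared, so fds opened (or closed) afterwards are shared too.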
      • Tricky to test, as IPC is required to pass the file descriptors around.
    • CLONE_NEWNS: Place a cloned process into a new mount namespace.

      • The new namespace starts as a copy of the parent’s, so it has all existing mounts of the parent namespace. Perhaps the solution is to clone twice, but give the second clone the original parent (CLONE_PARENT)? That way the first cloned process can unmount all filesystems in its new namespace, and the second clone’s new namespace is then created with no mounts. Otherwise, unmount in the privileged section before handing off control of the clone.
      • Mounts marked as shared basically ignore the new namespace - a filesystem newly mounted under them in the new namespace is still propagated back to the original namespace. Both my /tmp tmpfs, which I’ve been using, and my / root are mounted as shared. This is certainly a tricky one to get my head around.
      • Requires CAP_SYS_ADMIN.
    • CLONE_NEWCGROUP: Place a cloned process into a new cgroup namespace.

      • Basically a chroot for the cgroup directory structure.
      • Requires CAP_SYS_ADMIN.
    • CLONE_NEWNET: Place a cloned process into a new network namespace.

      • The new net namespace has only a loopback adapter in it, which is down by default.
      • Though a process only has one network namespace, these can be linked with some work, allowing for pretty complex separation.
      • Requires CAP_SYS_ADMIN.
    • CLONE_NEWPID: Creates a process in a new PID namespace.

      • Process appears with PID 1 in its new namespace (it believes it is the init process).
      • Requires CAP_SYS_ADMIN.
    • CLONE_IO: Have the two processes share an I/O context.

      • This one is going to be particularly tricky to examine in a unit-test-like format, so I’m saving a closer look for later.
      • It appears to be solely a performance optimisation, and one that is quite tricky to evaluate. This could perhaps be a stretch task if I can figure out why it’s useful.
    • CLONE_INTO_CGROUP: Place a cloned process into a specific cgroup.

      • This flag is only supported by the clone3 call, as it requires passing a file descriptor to the cgroup to place the cloned process in.
      • Nothing particularly interesting about this, but it is very useful - the final result of this project will likely use this extensively.
    • CLONE_PARENT: Sets the cloned process’s parent to the cloning process’s parent.

      • Useful for e.g. cloning twice to clone the second clone into an already empty mount namespace. One can give the new process back to the parent to manage and then kill the intermediary clone.
    • CLONE_PTRACE: Sets the cloned process to be traced if the cloning process is being traced.

      • I think this is going to end up being necessary when writing the debugger, as it will somehow need to attach to cloned processes.
    • CLONE_NEWUSER: Place a cloned process into a new user namespace.

      • One of the most useful clone flags, but also one of the most complex.
      • Lets you do things like have the process appear as a variety of users with full capabilities internally, but be limited to an overall user in the external namespace.
      • There is a lot to work through, so my research on this is incomplete so far.
      • No longer requires any special privileges (unprivileged creation has been possible since Linux 3.8).
  • Read up on the remaining clone options, which I think are less useful for this project. I’ve briefly summarised each one along with the reason why. This could change later on.

    • CLONE_NEWIPC: Place a cloned process into a new IPC namespace.

      • Each process can only be a member of a single IPC namespace, so I think this would be quite challenging to make use of in a non-trivial process layout where each process needs to communicate with more than one other.
      • Requires CAP_SYS_ADMIN.
    • CLONE_NEWUTS: Creates a process in a new UTS (UNIX Time-Sharing System) namespace.

      • Lets you change the hostname and domainname without affecting other processes. I can’t see this being very useful for the project.
      • Requires CAP_SYS_ADMIN.
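
To make the CLONE_FS note above concrete, here is a minimal sketch of the kind of check an assertion test could make. It is not the project's actual Unity test code: the names clone3_simple and child_chdir_visible_in_parent are hypothetical, it assumes /tmp exists, and it assumes kernel headers recent enough to provide struct clone_args and SYS_clone3. glibc has no clone3 wrapper, so the call goes through syscall(2). Note that the child calls _exit rather than returning.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <linux/sched.h>   /* struct clone_args, CLONE_FS */
#include <signal.h>        /* SIGCHLD */
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>   /* SYS_clone3 */
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork-like clone3 with extra flags.
 * No stack is supplied, so this is only safe without CLONE_VM. */
pid_t clone3_simple(uint64_t flags)
{
    struct clone_args args;
    memset(&args, 0, sizeof args);
    args.flags = flags;
    args.exit_signal = SIGCHLD;   /* lets the parent waitpid() as after fork() */
    return (pid_t)syscall(SYS_clone3, &args, sizeof args);
}

/* Hypothetical check: returns 1 if a chdir("/") made by the child is
 * visible in the parent, 0 if it is not, and -1 on any error
 * (including clone3 being unavailable or blocked). */
int child_chdir_visible_in_parent(uint64_t flags)
{
    char orig[PATH_MAX], after[PATH_MAX];
    int result = -1;

    /* Assumption: /tmp exists and is a different directory from "/". */
    if (getcwd(orig, sizeof orig) == NULL || chdir("/tmp") != 0)
        return -1;

    pid_t pid = clone3_simple(flags);
    if (pid == 0)
        _exit(chdir("/") == 0 ? 0 : 1);   /* child: exit, never return */

    if (pid > 0) {
        int status;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0 && getcwd(after, sizeof after) != NULL)
            result = strcmp(after, "/") == 0;
    }

    /* Put the parent back where it started, whatever happened. */
    if (chdir(orig) != 0)
        return -1;
    return result;
}
```

Calling child_chdir_visible_in_parent(CLONE_FS) and child_chdir_visible_in_parent(0) side by side gives the pair of results an assertion test would compare. The same skeleton should carry over to the other flags that need no privileges.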

Up Next

  • Keep writing assertions for the clone flags that are still incomplete.

  • Write a tool to get a view of what the process isolation looks like under a parent.

    • A debugger that can attach (gdb /bin/bash -p XXX) or directly launch (gdb /bin/bash) would be very useful.
    • Write this in Rust? C++? Or perhaps OCaml?
  • Alongside this debugger, start writing bigger C programs that employ the namespace/cgroup techniques.

    • These will eventually be mapped to OCaml to test for equivalence.
    • Clear structure is going to be far more important in these than in the assertion tests.
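
As a possible starting point for the isolation-viewing tool above, the sketch below reads the namespace links that Linux exposes under /proc/<pid>/ns. Two processes are in the same namespace of a given kind exactly when the links have the same target. The function names ns_link and same_namespace are hypothetical, and the sketch assumes /proc is mounted and that the caller is permitted to inspect the target process.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads the target of /proc/<pid>/ns/<kind>
 * (a string of the form "mnt:[<inode>]") into buf.
 * Returns the length of the string, or -1 on failure. */
int ns_link(pid_t pid, const char *kind, char *buf, size_t len)
{
    char path[64];
    int w = snprintf(path, sizeof path, "/proc/%ld/ns/%s", (long)pid, kind);
    if (w < 0 || (size_t)w >= sizeof path || len == 0)
        return -1;

    /* readlink() does not NUL-terminate, so leave room and do it here. */
    ssize_t n = readlink(path, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return (int)n;
}

/* Hypothetical helper: 1 if both processes are in the same namespace of
 * the given kind ("mnt", "pid", "net", "user", ...), 0 if they are not,
 * -1 if either link could not be read. */
int same_namespace(pid_t a, pid_t b, const char *kind)
{
    char la[64], lb[64];
    if (ns_link(a, kind, la, sizeof la) < 0 ||
        ns_link(b, kind, lb, sizeof lb) < 0)
        return -1;
    return strcmp(la, lb) == 0;
}
```

Running same_namespace over a parent and each of its children, for each namespace kind, would give a rough table of which processes are separated from which.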

Modules

Digital Signal Processing

  • Extracted an image from a digital recording taken from an aerial near an HDMI cable.