Research Project

Dissertation draft

A block of progress on writing up what I've learnt so far and how each of the namespaces can be modified to create a void. The work is available here on Overleaf (permissions required) and here on Gitea (no permissions required), and a current draft is available here.

  • Finished Table 1 (history of namespaces).

    • I haven’t mentioned it in the write-up, but this was horrible: firstly to find the answers, and secondly to cite them.
    • Some serious git commands were needed to find the commits where things were added. Once I'd found a commit, citing it wasn’t too bad.
    • The version changelogs are horrible to cite. They used to be posted by Linus on the mailing list, but now they’re posted on a site that’s basically a Wiki (and referred to by Linus in the release email). Leaves a load of noauthor entries in my bibtex…
  • Wrote a large section on mount namespaces (§3.1).

    • I finally understand the horror show that is mount namespaces. The dissertation contains snippets of C and shell code that try to clearly back up each thing I say.
    • Made a commitment to myself to add these commits and versions to Wikipedia when I get the chance, making this much easier in future.
    • Cited some of the reasons these choices were made, in an attempt to provide a balanced view.
    • I have some thoughts on how this could be done from the ground up in a far better way, but didn’t include them as they aren’t particularly useful - hindsight is a wonderful thing.
    • There also might be a bug in the MNT_DETACH logic, but again I’m not sure I can fully back this up. Discussed more in the programming section of this post.
  • Wrote sections on how each of the namespaces can be used to make a void.

    • Haven’t yet completed cgroup and user namespaces, as they may need some extra work that isn’t currently present in the code. The rest provide an accurate view of how to provide the utmost separation.
    • There’s an idea floating around of separation of the host from the container versus separation of the container from the host. I haven’t got good wording for this yet. E.g. cgroup namespaces can provide decent separation of the host from the container, as you can go into an isolated subtree, but the host can still modify that subtree. The reason I haven’t said too much on this is that root on the host can modify anything in this system if it really wants.
  • I wrote briefly about some possible future work on dynamic linking. It suits the threat model, but I don’t think I’ll have time to complete it.

  • Minor unlogged changes elsewhere.
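
Part of what makes mount namespaces awkward is propagation: a new mount namespace inherits the propagation type of every mount, so if the host's mounts are shared, unmounts made inside the new namespace can propagate back out. The sketch below is a minimal illustration, not the dissertation's code. It enters a new mount namespace, marks every mount private, and only then does a lazy (MNT_DETACH) unmount. The function name detach_privately is made up for this example, the constants are the Linux values from <sched.h> and <sys/mount.h>, and the calls need CAP_SYS_ADMIN. Whether marking mounts private sidesteps the MNT_DETACH behaviour mentioned above is an assumption rather than something verified.

```python
import ctypes
import os

# Linux constants from <sched.h> and <sys/mount.h>.
CLONE_NEWNS = 0x00020000
MS_REC = 0x4000
MS_PRIVATE = 1 << 18
MNT_DETACH = 2


def _check(ret: int, what: str) -> None:
    """Raise OSError with errno details if a libc call failed."""
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, f"{what}: {os.strerror(err)}")


def detach_privately(target: str) -> None:
    """Lazily unmount `target` inside a fresh mount namespace only.

    Hypothetical helper for illustration; requires CAP_SYS_ADMIN on Linux.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.unshare.argtypes = [ctypes.c_int]
    libc.mount.argtypes = [
        ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
        ctypes.c_ulong, ctypes.c_void_p,
    ]
    libc.umount2.argtypes = [ctypes.c_char_p, ctypes.c_int]

    # New mount namespace: a copy of the host's mount tree, propagation included.
    _check(libc.unshare(CLONE_NEWNS), "unshare")
    # Recursively mark every mount private so later events stay in this namespace.
    _check(libc.mount(None, b"/", None, MS_REC | MS_PRIVATE, None), "make-rprivate")
    # Lazy unmount: detach now, clean up once the mount is no longer busy.
    _check(libc.umount2(os.fsencode(target), MNT_DETACH), "umount2")
```

The ordering is the point of the sketch: the propagation change happens before anything is unmounted, so the detach is applied to private mounts rather than to members of a shared peer group.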


Programming

  • Tested and expanded the mount namespace voiding.

    • As written in the dissertation, mount namespaces are tricky. My initial code for voiding them completely borked the host system.
    • Had a good offline exchange with our sysadmin trying to figure out which of us was breaking the machine.
      • Possibly exposed a kernel bug with MNT_DETACH crossing shared subtree boundaries and unmounting recursively. I can’t see why this would ever be desired behaviour, as MNT_DETACH already relies on the kernel’s GC-type unmounting, so dropping a reference to the shared subtree should work fine. Not completely sure about this though.
      • I’d go a bit further and say I have no idea why MNT_DETACH unmounts recursively in the presence of shared subtrees. I think decreasing the refcount of what you’re trying to unmount should have the same effect. I haven’t devoted time to testing a patch for this though, as it’s a bit extraneous to the project’s main goals.
    • Wrote the code to re-add the required elements to the mount namespace void (an empty tmpfs).
    • This works surprisingly well with dynamic linking!
      • I was expecting to have some serious difficulties here. It turns out that running ldd on the binary and bind mounting each of the paths in works perfectly (so far).
      • Fits well with the threat model too (trusted binaries that are co-operatively requesting privilege separation).
  • Added all the small support parts necessary for the TLS server.

    • Proper socket support for sending FDs.
    • Better error handling for the now larger shim.
    • Allowing file descriptors into the void explicitly.
    • CI linting.
  • Built a TLS server.

    • The spec and the app are written; it just needs slightly more verification and a little debugging before merging.
    • All of the feature-adding commits have been cherry-picked already, so only the example itself remains to be merged.
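
The ldd approach to dynamic linking mentioned above can be sketched as two small steps: pull the resolved library paths out of ldd's output, then pair each one with a destination under the void's root. This is a rough illustration rather than the shim's actual code. The names library_paths and bind_mount_plan and the /tmp/void root are invented for the example, and the plan is only a list of (source, target) pairs; performing the bind mounts themselves is left out.

```python
import re

# Matches "libc.so.6 => /lib/.../libc.so.6 (0x...)" and
# "/lib64/ld-linux-x86-64.so.2 (0x...)"; skips the vdso and "not found" lines.
_LDD_LINE = re.compile(r"^\s*(?:\S+\s+=>\s+)?(/\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def library_paths(ldd_output: str) -> list[str]:
    """Return the absolute library paths listed in `ldd` output, in order."""
    paths: list[str] = []
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths


def bind_mount_plan(binary: str, ldd_output: str, void_root: str) -> list[tuple[str, str]]:
    """Pair the binary and each library with its path under `void_root`."""
    root = void_root.rstrip("/")
    sources = [binary] + library_paths(ldd_output)
    return [(src, root + src) for src in sources]


if __name__ == "__main__":
    sample = (
        "\tlinux-vdso.so.1 (0x00007ffd1a5f2000)\n"
        "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c2a000000)\n"
        "\t/lib64/ld-linux-x86-64.so.2 (0x00007f1c2a400000)\n"
    )
    for src, dst in bind_mount_plan("/usr/bin/gzip", sample, "/tmp/void"):
        print(f"{src} -> {dst}")
```

In practice the ldd output would come from running ldd on the binary. Since ldd may execute the binary it inspects, this only makes sense for trusted binaries, which is the case the threat model assumes.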
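
For the socket support for sending FDs, the underlying mechanism on Linux is SCM_RIGHTS ancillary data over a Unix domain socket. The sketch below shows that mechanism using Python's socket.send_fds and socket.recv_fds (Python 3.9+). It is not the shim's implementation: the helper names, the b"fd" label, and the single-process demo are assumptions made to keep the example self-contained.

```python
import os
import socket


def send_fd(sock: socket.socket, fd: int, label: bytes = b"fd") -> None:
    """Send one file descriptor plus a short label over a Unix socket."""
    socket.send_fds(sock, [label], [fd])


def recv_fd(sock: socket.socket) -> tuple[bytes, int]:
    """Receive the label and one file descriptor from a Unix socket."""
    data, fds, _flags, _addr = socket.recv_fds(sock, 1024, 1)
    return data, fds[0]


def demo() -> tuple[bytes, bytes]:
    """Pass the write end of a pipe across a socketpair and write through it."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        send_fd(left, write_end)
        label, received = recv_fd(right)
        # `received` is a new descriptor referring to the same pipe.
        os.write(received, b"hello")
        os.close(received)
        return label, os.read(read_end, 5)
    finally:
        left.close()
        right.close()
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    print(demo())
```

The received descriptor is a fresh number referring to the same open file, which is why it still works when the receiving process has no path to the file inside its own mount namespace.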

Up Next

The code is in a much better spot now than it was last time. The goal for the next week is to get the example applications completed under this near-final shim.

  • Write about my fib example.

    • The code is already written; the section in the dissertation still needs to be completed.
    • Include some detail, as the simplicity of the application makes it a good place to discuss the complexity of the separation.
  • Add the needed handling to gzip to run under the shim.

    • Edit the source.
    • Write a specification.
    • Write about it in the write up.
    • Create a decent explanatory figure.
  • Write about the TLS server in Rust under the shim.

    • Write about it in the write up.
    • Create a decent explanatory figure.
  • Plan the evaluation.

    • I’m still not super clear on the best structure for the evaluation, so I think a plan is in order.
    • Hopefully meet next week to follow up?


  • Reading so many kernel mailing list entries while hunting for references got me interested in the list itself. I’ve subscribed to [email protected] with notifications off. Still trying to work out if this was a mistake…