More than you ever wanted to know about the DoC web server setup

Servers involved:

Swift - Demultiplexing

WWWDEMUX is the first port of call for all HTTP requests. The rewrite rules on this Apache 2 server proxy each request to the server that will actually do the work. It also does a limited amount of caching, using mod_cache.

The rewrite rules look a little like this:

1  RewriteRule     ^/~([^/*]+)$            /~$1/                           [R,L]
2  RewriteMap      username_map            txt:/usr/apache/conf/namemap
3  RewriteCond     %{REQUEST_URI}          ^/~([^/]+)/
4  RewriteCond     ${username_map:%1|FALSE}        !=FALSE
5  RewriteRule     ^/~[^/]+/(.*)           /~${username_map:%1|OPPS}/$1                            [R,L]
6  RewriteRule     ^/~(.+)$                http://tweetypie/~$1            [P,L]
7  RewriteRule     ^/(.*)$                 http://www/$1                   [P,L]

Line 1 just redirects requests that are missing the trailing slash, and it's a terminal rule. The target server is going to do it anyway, so we might as well save it the trouble. Line 2 configures a map, generated by a maint script, that maps names like adam.langley to usernames like agl02. Line 5 is only tried if 3 and 4 match. Three just checks that it's a home request and 4 checks that the username is in the username map. If so, 5 redirects to the usual username and finishes.

Six catches all remaining home requests and proxies them to tweetypie. Seven proxies everything else to www (linnet).
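
As a rough illustration of the order in which those rules fire, here is a small Python sketch that mimics the decision for a single request path. It is not how mod_rewrite works internally; the function name route and the dictionary standing in for the namemap file are made up for the example.

```python
import re


def route(path, username_map):
    """Return (action, target) for a request path, mimicking rules 1-7."""
    # Rule 1: a bare /~name is redirected to /~name/ (adds the slash).
    m = re.match(r"^/~([^/*]+)$", path)
    if m:
        return ("redirect", f"/~{m.group(1)}/")

    # Rules 3-5: a home request whose name is in the map is redirected
    # to the usual username.
    m = re.match(r"^/~([^/]+)/(.*)", path)
    if m and m.group(1) in username_map:
        return ("redirect", f"/~{username_map[m.group(1)]}/{m.group(2)}")

    # Rule 6: any other home request is proxied to tweetypie.
    m = re.match(r"^/~(.+)$", path)
    if m:
        return ("proxy", f"http://tweetypie/~{m.group(1)}")

    # Rule 7: everything else is proxied to www.
    m = re.match(r"^/(.*)$", path)
    if m:
        return ("proxy", f"http://www/{m.group(1)}")

    return ("none", path)


if __name__ == "__main__":
    names = {"adam.langley": "agl02"}  # stand-in for the namemap file
    for p in ["/~agl02", "/~adam.langley/x.html", "/~agl02/x.html", "/index.html"]:
        print(p, "->", route(p, names))
```

Note that a request for /~adam.langley (no slash) takes two redirects: rule 1 adds the slash first, and the follow-up request is then mapped by rules 3 to 5.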

The SSL setup is the same, but uses SSLProxyEngine to do an SSL request out the back so that scripts on the other servers think that the connection is SSLed.

The Home Servers

The home servers (tweetypie and magpie) have a number of Apache patches to set them apart.

Firstly, apache-1.3.28-rlimitsuexec.patch sets up resource limits for all SuEXECed programs. If the server is running a 2.6 kernel then the NPROC rlimit limits the user to a maximum of 5 processes running on the system at once. If they exceed this, the call to setuid will fail and this is noted in the suexec log.
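
The general idea of capping NPROC before handing control to a user's program can be sketched in Python as below. This is only an illustration of the rlimit mechanism, not the patch itself (which is C inside suexec and also does the setuid); the helper names limit_nproc and run_limited are invented here, and the figure 5 is the one quoted above.

```python
import resource
import subprocess
import sys

NPROC_LIMIT = 5  # the per-user process cap mentioned in the text


def limit_nproc():
    """Set both the soft and hard NPROC limits for the current process."""
    resource.setrlimit(resource.RLIMIT_NPROC, (NPROC_LIMIT, NPROC_LIMIT))


def run_limited(argv):
    """Run argv in a child process that has the NPROC cap applied.

    preexec_fn runs in the child after fork() and before exec(), so the
    parent's own limits are left untouched.
    """
    return subprocess.run(argv, preexec_fn=limit_nproc,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Ask the child to report the limit it ended up with.
    check = "import resource; print(resource.getrlimit(resource.RLIMIT_NPROC))"
    print(run_limited([sys.executable, "-c", check]).stdout)
```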

Secondly, apache-1.3.28-suexecimport.patch makes SuEXEC treat the path /import/*/*/*/username/public_html as a valid home directory for a given user.
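
To make the path pattern concrete, here is a minimal Python check for whether a directory has the /import/*/*/*/username/public_html shape. The real test lives in the patched suexec C code and may well differ in detail; the function name is_import_docroot is hypothetical and this version only accepts an exact match on the directory.

```python
def is_import_docroot(path, username):
    """True if path looks like /import/<a>/<b>/<c>/<username>/public_html."""
    parts = path.split("/")
    # A matching path splits into:
    #   ["", "import", a, b, c, username, "public_html"]
    return (
        len(parts) == 7
        and parts[0] == ""
        and parts[1] == "import"
        and all(parts[2:5])  # the three wildcard components are non-empty
        and parts[5] == username
        and parts[6] == "public_html"
    )


if __name__ == "__main__":
    p = "/import/buzzard.doc.ic.ac.uk_export1/users/r/rohita/public_html"
    print(is_import_docroot(p, "rohita"))
    print(is_import_docroot(p, "swilton"))
```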

Lastly, apache-1.3.28-modaccess.patch makes the server trust 146.169.1.0/24 to give the true source IP address in the X-Forwarded-For header. This is then used when checking .htaccess.
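
The trust decision can be sketched like this in Python. It is an approximation of the idea, not the patch: the function name effective_client_ip is made up, and taking the last entry of a multi-valued X-Forwarded-For header is an assumption (that is the entry the trusted proxy itself would have appended).

```python
import ipaddress

# The network the patch trusts, as given in the text.
TRUSTED_PROXIES = ipaddress.ip_network("146.169.1.0/24")


def effective_client_ip(peer_ip, x_forwarded_for=None):
    """Pick the address that access checks should be made against."""
    if not x_forwarded_for:
        return peer_ip
    if ipaddress.ip_address(peer_ip) not in TRUSTED_PROXIES:
        # Untrusted peers can put anything in the header, so ignore it.
        return peer_ip
    # Assumption: use the last entry, the one added by the trusted proxy.
    candidate = x_forwarded_for.split(",")[-1].strip()
    try:
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        # Header did not contain a usable address; fall back to the peer.
        return peer_ip


if __name__ == "__main__":
    print(effective_client_ip("146.169.1.10", "10.0.0.5"))
    print(effective_client_ip("192.0.2.7", "10.0.0.5"))
```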

The home servers also use iptables to deny users access to the network. The config is disted to /etc/sysconfig/iptables. There are a few exceptions; see the config file.

Unfortunately, iptables-save generates an incorrect config. It doesn't leave a space after a negation '!', so you need to manually edit the output.
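
If you would rather not do the edit by hand, something like the following Python filter could do it. This is a sketch under the assumption that the broken output looks like "-d !146.169.1.0/24" (the rule in the example is invented, not taken from the real config); check the result before loading it.

```python
import re
import sys


def fix_negation(line):
    """Insert the missing space after a '!' that starts an argument."""
    if line.lstrip().startswith("#"):
        return line  # leave comment lines alone
    # A '!' at the start of a word that is glued to the next character
    # becomes '! '. A '!' already followed by a space is not touched.
    return re.sub(r"(^|\s)!(?=\S)", r"\1! ", line)


if __name__ == "__main__":
    # Use as a filter: iptables-save | python fix_negation.py > rules
    for raw in sys.stdin:
        sys.stdout.write(fix_negation(raw))
```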

Since the delay in automounting homes on request is noticeable, the home servers run a maint script that mounts all the top-level home mounts under /import. It then writes /etc/httpd/homesmap, which contains a map like:

rohita  /import/buzzard.doc.ic.ac.uk_export1/users/r/rohita
cstest  /import/buzzard.doc.ic.ac.uk_export1/users/a/cstest
swilton /import/buzzard.doc.ic.ac.uk_export1/users/s/swilton
lyt99   /import/buzzard.doc.ic.ac.uk_export1/users/l/lyt99
ack     /import/buzzard.doc.ic.ac.uk_export1/users/k/ack
grc     /import/buzzard.doc.ic.ac.uk_export1/users/g/grc

Home requests are rewritten internally by Apache to the non-automounted location, and SuEXEC has been patched (see above) to accept this. Note that because the primary Apache path is still ~username, Apache still calls SuEXEC in homedir mode. If you don't know what a primary path is, don't worry. Just never use a rewrite rule with the PT option.
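
To show what that internal rewrite amounts to, here is a small Python sketch that reads a homesmap-style file and turns a /~username/... URL into a path under /import. The real work is done by Apache rewrite rules that are not shown here; the function names parse_homesmap and resolve are invented, and the public_html component is taken from the SuEXEC path pattern described above.

```python
import re


def parse_homesmap(text):
    """Parse 'username  /path/to/home' lines into a dict."""
    homes = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:  # skip blank or malformed lines
            homes[fields[0]] = fields[1]
    return homes


def resolve(url_path, homes):
    """Map /~user/rest to <home>/public_html/rest, or None if unknown."""
    m = re.match(r"^/~([^/]+)/(.*)$", url_path)
    if not m or m.group(1) not in homes:
        return None
    return f"{homes[m.group(1)]}/public_html/{m.group(2)}"


if __name__ == "__main__":
    sample = (
        "rohita  /import/buzzard.doc.ic.ac.uk_export1/users/r/rohita\n"
        "grc     /import/buzzard.doc.ic.ac.uk_export1/users/g/grc\n"
    )
    homes = parse_homesmap(sample)
    print(resolve("/~rohita/index.html", homes))
```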
