Tue Apr 15 18:33:08 BST 2003

When putting my email address on webpages I usually have it as aglREMOVETHIS@imperialviolet.org. Spammers have never bothered to try and decruft these addresses because there was always lower hanging fruit.

Well, today I got email addressed to aglTHIS@imperialviolet.org. I guess the fruit isn't so low any more.

Evil Bits

A reply that the author of the `evil bit' RFC (3514) got. Note the company name at the bottom. (It was an April Fool's joke, for those who don't know.)

What or who determines the "evilness" or "goodness" of the packet? If a security admin or OS can determine or flag bits as good, what keeps the hacker from spoofing this process by setting the bit to "good"? Does the bit change based on behavior? Or maybe a database with signatures of "bad" bits?

(name deleted)

Microsoft Corporation

Fri Apr 11 18:32:39 BST 2003

Nothing much to put up here. Been revising lots and still waiting for the CapPython PEP. Wondering about interfaces on landscape, but I doubt I'll do anything about that for a long time.

The dog's sprained something and now has a leg in bandages.

Sun Apr 6 21:09:10 BST 2003
Capability Python

Zooko's blog had been down for a fair time as he changed hosting, so I only noticed today that he had started posting again.

Recently, he's been talking about adding capabilities to Python and, oddly enough, I was thinking about the exact same thing yesterday. If some of the introspection abilities of Python were limited, it would make a very effective capability language by using references as capabilities. Thankfully, from following some of the links on the python-dev archives (below) it seems that a PEP is being worked on.

Once the language is limited, the standard library needs to be looked at. The Python people aren't going to accept the gutting of the library for the needs of capability freaks, so many of the modules will have to be proxied as they have dangerous functions (for example, taking a filename not a file handle).
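
As a minimal sketch of the proxying idea (assuming nothing about what the eventual PEP will look like): instead of handing code a function that takes a filename, hand it a reference that can only read one file chosen in advance. The names make_reader and untrusted_word_count are hypothetical. Note that ordinary Python does not enforce any of this, since introspection can still dig the path out of the closure, which is exactly the sort of ability that would have to be limited.

```python
import os
import tempfile

def make_reader(path):
    """Return a zero-argument function that reads one fixed file.

    Whoever holds the returned reference can read that file and nothing
    else; they never get to name a path themselves.
    """
    def read():
        with open(path, "r") as f:
            return f.read()
    return read

def untrusted_word_count(read):
    # This code receives a capability (a reference), not a filename.
    return len(read().split())

# Usage: the trusted side picks the file, the other side only gets `read`.
with tempfile.TemporaryDirectory() as d:
    p = os.path.join(d, "notes.txt")
    with open(p, "w") as f:
        f.write("references as capabilities")
    count = untrusted_word_count(make_reader(p))
```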

Also, some of the standard functions (thinking of __repr__ here) leak a little too much information and could be trimmed without loss of useful functionality. Leakage increases the bandwidth of side-channels. You can never be rid of side-channels, but you can stomp them as much as possible.

Defense Against Middleperson Attacks

Zooko has also written a nice crypto paper. However, I had to scribble notes when reading it and hope that this version is a little easier to understand:

Defence Against Middleperson Attacks

Zooko, <zooko AT zooko DOT com>

The Problem

Alice thinks she's pretty hot stuff, chess wise, and bets that no one can beat her in a game. Bob takes her up on this challenge and the game commences. Unknown to Alice, Bob is also playing a game against a chess grand-master. Whenever Bob gets a move from Alice, he plays that move against the grand-master and relays his response to Alice. The grand-master trounces Bob, and so, Bob trounces Alice. Bob wins the bet.

Somehow, Alice wishes to know that the identity of the person playing her matches a given public key. That way, if she encrypts the prize to that key, she knows that Bob cannot cheat her - the surprised grand-master would get the goodies.

The Solution

Dramatis Personae

Alice is ready to make the first move in the chess game. She calculates:

Alice transmits Commitment1 and sleeps K seconds. Alice then transmits a signed copy of Message1.

After Bob receives Commitment1 he sleeps for K seconds. He then waits to receive the signed copy of Message1. Bob verifies that Message1 was signed by the included copy of PKA, and that Commitment1 is correct.

Bob quickly ponders his move and calculates:

Bob signs Message2 and encrypts the signed copy with the PKA from Message1. He transmits this.

Alice receives the signed and encrypted Message2. If more than K seconds have passed since she sent Message1, she aborts the game.

Otherwise, she decrypts Message2 and checks the signature against the included PKB and that her public key is correct.
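
The formulas for the commitment and the messages are not reproduced here, so the following is only a sketch of the two checks that do not depend on them: verifying a commitment against the message it commits to, and Alice's K second abort rule. I am ASSUMING a commitment of the form SHA-256(nonce || message); that choice, and the names make_commitment, commitment_matches and reply_accepted, are mine and not taken from the paper. Signing and encryption are left out entirely.

```python
import hashlib
import hmac
import os

def make_commitment(message: bytes):
    """Return (commitment, nonce) for a message.

    ASSUMPTION: commitment = SHA-256(nonce || message). Illustrative only;
    the paper's actual construction is not shown in this post.
    """
    nonce = os.urandom(16)
    return hashlib.sha256(nonce + message).digest(), nonce

def commitment_matches(commitment: bytes, nonce: bytes, message: bytes) -> bool:
    """Check that a previously received commitment opens to this message."""
    expected = hashlib.sha256(nonce + message).digest()
    return hmac.compare_digest(commitment, expected)

def reply_accepted(sent_at: float, received_at: float, k: float) -> bool:
    """Alice's rule: abort if more than k seconds passed since Message1 was sent."""
    return received_at - sent_at <= k
```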

Results

Assumptions

Sun Apr 6 13:04:57 BST 2003
LuFS

Ok, LuFS is pretty fantastic. Unlike all the other userfs implementations that I've come across it actually works, and it's fast. Using its localfs module (which just tests the LuFS interface by mirroring the existing filesystem) the speed difference is epsilon. Certainly for networked filesystems the LuFS latency is swamped by the network.

Its FTP and SFTP modules really work quite well; I certainly expect to be using the SFTP one when I'm back at Imperial. It also supports autofs mounting so that you can just cd into (say) /mnt/sftp/agl@somehost and it will sftp (or ftp) mount it for you on the fly.

And the localfs module could be a wonderful way to chroot some difficult programs by mapping on a configured set of directories read-only. Though it would need some of the grsecurity kernel patches for it to actually be secure.

Sat Apr 5 12:48:56 BST 2003
UserFS

LuFS seems to be a userfs that is actually being worked on. I haven't tried it yet, but it could be promising.

And it even has coderman's P2P fs as an experimental module.

Wed Apr 2 09:44:55 PST 2003

Fantastic quote from Bram

In other good news, the IETF announced a new policy in RFCs against using MUST to describe behavior which is widely violated in practice, especially when that violation won't change for the foreseeable future.

Wed Apr 2 15:43:15 BST 2003

Some notes on the stuff I was talking about yesterday. People are welcome to jump in with comments if they like, but this is mostly for me to recognise when I've gone in a circle.

A terminal log from my knockup in Python. This uses setxattr and friends. It's a new toy in the kernel (go read the manpage) but only XFS supports it correctly. I think the terminal log pretty much speaks for itself.

List: 
unsorted                 {%e0cb6e07b63cd592cad592bbb2c4f37a}          
title                    Root                                         
keywords                 {%904f3b6ed1f0bdc87cf11232eea4292b}          
comp                     {%f3d00322003197219ed873a87135f09c}          
people                   {%f2124f6ef95e9c0cb6899b32741ea969}          
types                    {%89b870c400af24726b0095896587a10f}          

> ...unsorted 
> List:       
syncthreading.pdf        {Sync Threading}                             
TR-94-06.pdf             {Control Transfer in Operating System Ker... 
CIRCA_whitepaper.pdf     {CIRCA Technology Overview}                  
core_vulnerabilities.pdf {Advanced Buffer Overflows}                  
RC22534Rev1full.pdf      {Thirty Years Later: Lessons from the Mul... 

> ...syncthreading.pdf 
> List:                
title                    Sync Threading                               
type                     {PDF}                                        
filename                 syncthreading.pdf                            
author                   {Adam Langley}                               

> ...author 
> List:     
email                    agl@imperialviolet.org                       
title                    Adam Langley                                 

> Pop   
> List: 
title                    Sync Threading                               
type                     {PDF}                                        
filename                 syncthreading.pdf                            
author                   {Adam Langley}                               

> :view 
xpdf /home/agl/lscape/1931bdf5e54e08edc866b00ff0f2a6a0&

In this model there are strings, objects, bags and lists (collectively elements). Objects are unordered (string, elements) pairs and most of the things in that log are objects. Bags are unordered sets of elements and lists are ordered vectors of elements.
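
A rough Python rendering of that model, rebuilt from the terminal log above, might look like this. It is a sketch under my own assumptions: objects become dicts, lists become Python lists, Bag is a hypothetical wrapper class, and I am reading the {braces} in the log as links to other elements shown by their title. None of this is the actual code behind the log.

```python
class Bag:
    """Unordered collection of elements (the order carries no meaning)."""
    def __init__(self, *elements):
        self.elements = list(elements)

    def __contains__(self, element):
        # Membership by identity: the same element, not merely an equal one.
        return any(e is element for e in self.elements)

# Strings are plain str; an object maps string names to elements.
pdf_type = {"title": "PDF"}
author = {"title": "Adam Langley", "email": "agl@imperialviolet.org"}
paper = {
    "title": "Sync Threading",
    "type": pdf_type,
    "filename": "syncthreading.pdf",
    "author": author,
}
root = {"title": "Root", "unsorted": Bag(paper)}

def follow(element, *names):
    """Walk named links from an object, like typing `...name` in the log."""
    for name in names:
        element = element[name]
    return element
```

So follow (paper, "author", "email") walks the same path as the `...author` step in the log.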

That works to a point and I was just about to add backlinks to every object as a bag of backlinks and a link called .backlinks. But, while links from objects are named, the backlinks would never be. This is ok in some cases (such as structural links), but most of the time it matters that you were linked to with the name author because that has information value in the other direction as well.

So links are:

(If you are an RDF type, think of Properties as Triples. I may end up with an RDF model, but I'll make my own way there.)

Now, should I allow multiple Properties with the same name from the same object or force them via a bag? Objects are going to have multiple incoming Properties with the same name, so I don't see why not.
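
One way to picture named links that keep their name in both directions is a store of (subject, name, target) triples indexed from each end. This is only a sketch of that idea: PropertyStore and the identifiers paper-a, paper-b and person-x are hypothetical, and it simply allows several Properties with the same name rather than forcing them through a bag.

```python
from collections import defaultdict

class PropertyStore:
    """Named links stored once and indexed from both ends."""
    def __init__(self):
        self.outgoing = defaultdict(list)  # subject -> [(name, target)]
        self.incoming = defaultdict(list)  # target  -> [(name, subject)]

    def add(self, subject, name, target):
        self.outgoing[subject].append((name, target))
        self.incoming[target].append((name, subject))

    def targets(self, subject, name):
        """Everything `subject` links to under `name`."""
        return [t for n, t in self.outgoing[subject] if n == name]

    def sources(self, target, name):
        """Everything that links to `target` under `name` (named backlinks)."""
        return [s for n, s in self.incoming[target] if n == name]

store = PropertyStore()
store.add("paper-a", "author", "person-x")
store.add("paper-b", "author", "person-x")
written_by_x = store.sources("person-x", "author")
```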

Also need to think about indexes.

Python Snippets That I Know I'll Be Hunting For In The Future

Disabling terminal line buffering

from termios import *

IFLAG = 0
OFLAG = 1
CFLAG = 2
LFLAG = 3
ISPEED = 4
OSPEED = 5
CC = 6

def save (fd):
        return tcgetattr (fd)

def restore (fd, data):
        tcsetattr (fd, TCSAFLUSH, data)

def nobuffer (fd, when=TCSAFLUSH):
        """Disable terminal line buffering."""
        mode = tcgetattr (fd)

        mode[IFLAG] = mode[IFLAG] & ~(INPCK | ISTRIP | IXON)
        mode[CFLAG] = mode[CFLAG] & ~(CSIZE | PARENB)
        mode[CFLAG] = mode[CFLAG] | CS8
        mode[LFLAG] = mode[LFLAG] & ~(ECHO | ICANON)
        mode[CC][VMIN] = 1
        mode[CC][VTIME] = 0
        tcsetattr(fd, when, mode)
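
If you forget to call restore the terminal stays in a funny state, so one option is to wrap the same calls in a context manager. This is a sketch, not a replacement for the functions above: the name unbuffered is made up, it only touches the ECHO and ICANON flags plus VMIN and VTIME, and it assumes fd refers to a terminal.

```python
import contextlib
import termios

LFLAG = 3  # index of the local-mode flags in the tcgetattr() list
CC = 6     # index of the control-character list

@contextlib.contextmanager
def unbuffered(fd):
    """Turn off echo and line buffering on fd, restoring the old mode on exit."""
    saved = termios.tcgetattr(fd)
    try:
        mode = termios.tcgetattr(fd)
        mode[LFLAG] = mode[LFLAG] & ~(termios.ECHO | termios.ICANON)
        mode[CC][termios.VMIN] = 1   # read() returns after one byte
        mode[CC][termios.VTIME] = 0  # with no timeout
        termios.tcsetattr(fd, termios.TCSAFLUSH, mode)
        yield
    finally:
        # Runs even if the body raises, so the terminal is always put back.
        termios.tcsetattr(fd, termios.TCSAFLUSH, saved)
```

Typical use would be `with unbuffered (sys.stdin.fileno ()):` around a loop that reads one key at a time.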

Why on earth isn't fold an inbuilt function? (this is a left fold)

def fold (f, lst, init):
        cur = init

        for x in lst:
                cur = f (cur, x)
        return cur
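
For what it's worth, the standard library does carry a left fold under another name: reduce (a builtin in Python 2, functools.reduce in Python 3), with the arguments in the order function, sequence, initial value. A quick comparison, with fold repeated so the snippet stands alone:

```python
from functools import reduce
import operator

def fold(f, lst, init):
    cur = init
    for x in lst:
        cur = f(cur, x)
    return cur

# Subtraction is not associative, so this shows both fold from the left:
# ((10 - 1) - 2) - 3
by_fold = fold(operator.sub, [1, 2, 3], 10)
by_reduce = reduce(operator.sub, [1, 2, 3], 10)
```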

Longest common prefix of two strings

def common_root (s1, s2):
        "Longest common prefix of two strings"
        n = min (len (s1), len (s2))

        for x in range (n):
                if (s1[x] != s2[x]):
                        return s1[:x]
        return s1[:n]

And tab completion

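# comps is the list of candidate completions, b is the prefix to match.
# A list (not a lazy filter object) is needed because we take len and index it.
p = [x for x in comps if x.find (b) == 0]
if (len (p) > 0):
        root = fold (common_root, p, p[0])
        if (len (root) > len (your_current_string)):
                your_current_string = root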
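
The same thing as a self-contained function, for when I come hunting. This is a sketch that swaps in os.path.commonprefix (which takes the longest common prefix of a whole list, character by character) for the fold over common_root; the name complete and its arguments are made up for the example.

```python
import os.path

def complete(typed, comps):
    """Extend `typed` to the longest prefix shared by every candidate
    in `comps` that starts with it. Returns `typed` unchanged if nothing
    matches or nothing more can be added."""
    matches = [c for c in comps if c.startswith(typed)]
    if not matches:
        return typed
    root = os.path.commonprefix(matches)
    if len(root) > len(typed):
        return root
    return typed

extended = complete("syn", ["syncthreading.pdf", "syncfoo", "TR-94-06.pdf"])
```
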
Tue Apr 1 09:31:15 PST 2003

Sigh. Another April 1st and, once again, we have an April Fool's overload. Come on people, only post them if they're any good!

I sent the following to coderman in reply to this blog entry. It was a little rushed since I was typing over a dial-up ssh (as am I with this actually) and it has inspired me to actually code up something, even if it's far short of what it could be. Hopefully it will give some insights (agl's nth rule: you don't understand it until after you code it).

What you are talking about is very close to `the Unix philosophy'. One of the
fantastic things about Unix is: cat /dev/sda1 | gzip > /backup/`date +%s`.gz

Utility goes up super-linearly with the number of pluggable components.

Now this stuff gives me nightmares, mainly because I'm generally always
thinking about this stuff off and on, and have been for years. Your design of
interfacing P2P with the filesystem is a good example of increasing the utility
(and usability, from my point of view) of an application by exposing it using
common interfaces. The design of those interfaces is just fantastically
difficult.

The `everything's a file' idea of Unix is good. But what is really missing is a
userfs module in the kernel. Such things have existed at points in time, but
never has there been a polished one (or even one included in the main kernel
src). This limits the filesystem abstraction to devices and a few other little
things and leaves bodges like PRELOAD libraries and GnomeVFS around. But we
really need to expose application data and not have to end up writing fragile
regexps which break on every minor release.

I'm always wondering about designing a `better' system for this but generally
get stuck in a loop:

* Requires a fantastic number of components
* and a lot of abstraction points that we don't have at the moment
  (most programs' output falls into a few simple blocks like `typed table'
   (think `ls`) or `dictionary' (something like ifconfig) or nestings of the
   same. ls shouldn't know anything about terminals, it should just output
   a table and let the UI handle it (if the output is going to a UI). But then,
   if we are doing this properly, all code should use `ls` to get directory
   listings and that's a lot of forking and stuffing data over pipes). Thus...
* it would be fantastic if everything was in a single address space
* so a `safe' language is needed. Quite possibly a new language completely
* but that's a hell of a lot of work and makes the barrier to adoption
  pretty high.
* So we cut down the number of components and dream about making it better..

(that was a bit unplanned, but I'm on a metered dialup at the moment I'm
afraid.)
-- Adam Langley agl@imperialviolet.org http://www.imperialviolet.org (+44) (0)7986 296753 PGP: 9113 256A CC0F 71A6 4C84 5087 CDA5 52DF 2CB6 3D60
Wed Mar 26 19:01:39 GMT 2003
Mon Mar 24 18:59:34 GMT 2003

Term's over, so I'm back home and back on a dialup and ferrying floppies between computers whenever I need to get anything onto my computer. Ah, the joys of non-IC Internet connections.

Just before I left I pretty much had an automatic installation of Gentoo working, we'll just have to see if I can remember what on Earth I was doing when I go back in 5 weeks.

This holiday will be mostly revising, so there's not going to be much stuff worthy of posting here going on, but there are some new photos of my halls Xmas party up.

Thu Mar 20 11:29:22 GMT 2003
Epoll

From LWN

One aspect of the epoll interface is that it is edge-triggered; it will only return a file descriptor as being available for I/O after a change has happened on that file descriptor. In other words, if you tell epoll to watch a particular socket for readability, and a certain amount of data is already available for that socket, epoll will block anyway. It will only flag that socket as being readable when new data shows up.

Edge-triggered interfaces have their own advantages and disadvantages. One of their disadvantages, as epoll author Davide Libenzi has discovered, would appear to be that many programmers do not understand edge-triggered interfaces. Additionally, most existing applications are written for level-triggered interfaces (such as poll() and select()) instead. Rather than fight this tide, he has sent out a patch which switches epoll over to level-triggered behaviour. A subsequent patch makes the behaviour configurable on a per-file-descriptor basis.

Fantastic. Level-triggered interfaces are nicer because they need fewer system calls. With edge-triggered you always need to call read/write until it EAGAINs, otherwise you can miss data. That means at least 2 calls per edge, while level-triggered generally means only 1 call per edge.

Also, edge-triggering causes locking headaches when dealing with multithreaded apps and with these patches it should be possible to quite simply alter existing code to use epoll.
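
To make the difference concrete, here is a small sketch using Python's select.epoll (Linux only). The helper names events_seen and drain are mine. The first registers a socket, sends it one byte and then polls twice without reading anything; the second is the read-until-EAGAIN loop that edge-triggered code has to run after every event.

```python
import select
import socket

def events_seen(flags):
    """Register a socket with `flags`, send it one byte, then poll twice
    without reading. Returns how many events each poll reported."""
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        ep.register(b.fileno(), flags)
        a.send(b"x")
        first = ep.poll(1.0)
        second = ep.poll(0)
        return len(first), len(second)
    finally:
        ep.close()
        a.close()
        b.close()

def drain(sock, bufsize=4096):
    """Read until the kernel says EAGAIN, as edge-triggered code must."""
    sock.setblocking(False)
    chunks = []
    while True:
        try:
            data = sock.recv(bufsize)
        except BlockingIOError:  # EAGAIN / EWOULDBLOCK: nothing left
            break
        if not data:  # peer closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)

level = events_seen(select.EPOLLIN)
edge = events_seen(select.EPOLLIN | select.EPOLLET)
```

On a Linux box I'd expect level to be (1, 1) and edge to be (1, 0): the level-triggered socket keeps being reported while the byte sits unread, and the edge-triggered one is reported once and then stays quiet until more data turns up. The drain loop is where the extra system call per edge comes from, since the last recv exists only to find out that there is nothing left.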

Wed Mar 19 19:57:26 GMT 2003
Happy (Belated) Birthday IV!

I totally missed it, but IV (in its current form) was 1 year old on March 11th. Woo!

I can now read what I was doing a year ago, which is nice. Oddly enough, I was doing pretty much the same thing... (read on)

Mass Installing Gentoo

Dept/Computing at Imperial has rather a lot of computers, as I'm sure you can guess. Nearly all of them install from a common base in order to keep the sysadmin tasks manageable. At the moment that base system is SuSE 7.2, with a few key packages upgraded (X, kernel, JDK and so on). Of course, SuSE 7.2 is getting a bit old now and we are looking for a new install for the coming year.

We are testing a lot of distros, but at the moment I'm trying Gentoo. Good points:

Bad points:

To autoinstall it I have a GRUB boot disk that TFTP/NFSroot boots a 2.4.20 kernel with init=/bin/bash. That will run a Python script (at the moment you have to type a command line) that finds the IP of the box (the kernel does DHCP), does a DNS lookup to get the hostname and uses a config file with regexps on the hostname to find a series of scripts to run.
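
The hostname matching step could be sketched like this. Everything specific here is an assumption made for illustration: the patterns, the script names, the first-match-wins rule and the function names are all hypothetical, and the real config file format is not shown.

```python
import re
import socket

# Hypothetical config: (regexp on the hostname, scripts to run in order).
CONFIG = [
    (r"^lab\d+$", ["partition", "format", "install-base"]),
    (r"^server-.*$", ["partition", "format", "install-server"]),
]

def scripts_for(hostname, config=CONFIG):
    """Return the script list for the first pattern matching hostname.

    ASSUMPTION: first match wins; an unmatched host gets no scripts.
    """
    for pattern, scripts in config:
        if re.search(pattern, hostname):
            return scripts
    return []

def hostname_for(ip):
    """Reverse DNS lookup; the first label is taken as the hostname."""
    name, _aliases, _addresses = socket.gethostbyaddr(ip)
    return name.split(".")[0]
```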

I have a quick Python module to handle writing partition tables; other than that, the scripts are at the bottom (these aren't final by any means).

Everything is installed from binary packages built on a 2-way Xeon (which looks like a 4-way because of hyperthreading). The grub packages seem broken at the moment, however.

If you look at the scripts, all you need to do is mount the /usr/portage directory from the server and make /usr/portage/distfiles a tmp area.

#!/bin/sh

/sbin/mkfs.xfs -f /dev/ide/host0/bus0/target0/lun0/part2
/sbin/mkswap /dev/ide/host0/bus0/target0/lun0/part1
swapon /dev/ide/host0/bus0/target0/lun0/part1
mount -n -t xfs /dev/ide/host0/bus0/target0/lun0/part2 /mnt

import partitions

def run():
        p = partitions.PartitionTable("/dev/ide/host0/bus0/target0/lun0/disc")
        p.add_size (0x82, 512, 0)
        p.add_size (0x83, -1, 1)
        p.write ()

cd /mnt
tar xjv < /stage1-x86-1.4_rc3.tar.bz2
mount -n --bind /usr/portage /mnt/usr/portage
mount -n --bind /mnt/tmp /mnt/usr/portage/distfiles
cp /etc/make.conf /mnt/etc/make.conf
cp /etc/resolv.conf /mnt/etc/resolv.conf
cp /etc/ld.so.conf /mnt/etc/ld.so.conf
cp /config/gentoo-systems/internal /mnt/internal
chmod a+x /mnt/internal
chroot . /internal

cd /
umount /mnt

#!/bin/sh

source /etc/profile
ldconfig
emerge -K gcc gettext glibc baselayout texinfo zlib binutils
ln -sf /usr/share/zoneinfo/Europe/London /etc/localtime
emerge -K system
emerge -K kde
emerge -K prelink
emerge -K sysklogd
emerge -K grub
emerge -K vim
emerge -K libogg
emerge -K libmng
/usr/bin/fc-cache
mkdir -p /boot/grub
cd /boot/grub
cp -a /usr/share/grub/i386-pc/* .
printf "root (hd0,1)\nsetup (hd0)\n" | grub

umount /usr/portage/distfiles
swapoff /swap
umount /usr/portage
umount /dev
umount /proc
mkdir /lib/modules/2.4.20-xfs

Sat Mar 15 19:52:36 GMT 2003
Stage Craft

Another busy day yesterday. Setting up some lights (I'm in the very light brown t-shirt). And the results of those lights (think what it would have looked like if we had used green gels) and inside the venue (which was generally quite empty because everyone was downstairs for Artful Dodger).

Sat Mar 15 14:16:53 GMT 2003
Quarantine - Greg Egan

Greg Egan is a fantastic author and Quarantine is one of his very early books, and it shows a little. The book is full of the usual wonders of Egan's ideas but I felt that the ending was a little weak. I couldn't say why; there was no good reason why it wasn't a good ending - it had a neat little twist and there weren't any loose threads left - but I felt like it didn't quite get back to the home key (GEB reference, for those who get it).

Thu Mar 13 21:57:01 GMT 2003
You know it's time to upgrade when...

... the load of the box you're building Gentoo packages on can't hit the number of processors because it can't download source code fast enough when it's coming in at 1.6MB/s.

Not too much been happening. Building lots of Gentoo binary packages for possibly installing on department lab machines next year (see above). Point to note: the userpriv option breaks stuff.

Maybe with 64-bit address spaces we can finally get rid of filesystems as a user-visible system and all switch to single-level persistent object stores. Anyway, AMD looks like they have the best 64-bit offering at the moment and EROS's 2002 paper on single-level store design is here.

Mon Mar 10 19:35:58 GMT 2003
GLibc6 2.3.2

Just a quick warning: libc6 2.3.2 causes many programs to cough with an IP address of 0. I know it's not a very valid IP address, but it was a damn useful way of saying localhost. Just s/ 0 / 127.0.0.1 /g

Mon Mar 10 17:53:19 GMT 2003
Valenti Speech

Valenti's speech is quite good. Apart from the fact that he's wrong in almost every important point, he speaks very well. I don't know if he gets things wrong because he really doesn't understand, or that he's just trying to find an acceptable cover for his clients' greed.

He simply (seemingly) doesn't get that there is something fundamentally different about my physically depriving you of something and taking a copy. Of course, it's profitable to ignore that. He also asserts (unquestioned) that there are no alternate business models. Of course, it's profitable to ignore them (for the moment). He also doesn't get the difference between information and the physical expression of that information. Of course, it's profitable to treat information as a product.

He just fundamentally doesn't get it. I wish I had a transcript of his answer to a question about DVD Region Encoding where he just assumes that the legal system is there to uphold whatever he deems best. It's just breathtaking arrogance.

Nagios

Nagios looks like a really good status monitoring tool. Have a look at the demo. I hope to get it going in DoC, but it requires a heck of a lot of installing. Thankfully there are (nearly) wonderful Gentoo ebuilds which do everything needed. (Only nearly wonderful because the ebuild had a bug; patch mailed to the maintainer. On the same note, the prelink ebuild also has a missing dependency on libc6-2.3.2; patch sent to, and acked by, the maintainer.)

Unfortunately, DoC servers don't run Gentoo (or Debian I'm afraid) so it looks like I'm going to be doing it by hand.

Nagios Configuration

Nagios has quite a nasty configuration I'm afraid. So here's a Python script to do some of it for you.

The input is a series of lines. The first character determines the type of line and they go like this:

Example:

Sping,Ping,check_ping!100.0,20%!500.0,60%
Shttp,HTTP,check_http
Sssh,SSH,check_ssh
Ssmtp,SMTP,check_smtp
Snntp,NNTP,check_nntp
Snfs,NFS,check_rpc!nfs
Hbulbul,ssh
Hsax,ssh
Hsparrow
Gservers,Servers,bulbul,sax,sparrow

Sat Mar 8 17:23:39 GMT 2003

Debian used to provide a very useful file called base2.2.tgz for potato. In it was a very basic, but runnable, Debian system from which you could install everything else. You can still get the one for 2.2, but there's no such file for 3.0. Instead you have a tarball containing the debs of all the critical packages. Which is nice, except that you don't have dpkg to install them.

So, converting them all to tgz's and unpacking them gives you something close to a base system, except that dpkg doesn't think that anything is installed. Trying to install anything (including dpkg) pulls in libc6, and the inst script requires that dpkg know about itself. But you can't pull in dpkg because that requires libc6...

In the end you have to install base2.2.tgz and upgrade it. Yay Gentoo, Boo Debian.

Thu Mar 6 23:14:14 GMT 2003

So here's an odd thought in the hours before I go and be Strike Crew until 6 in the morning.

In a world which seems to respect worse is better designs we shouldn't be looking to stamp principles all over our political system. We should, instead, be looking for an incremental approach; a directed genetic algorithm. David Brin thinks that this has been happening for decades with good effect.

So, we would hope that all the political parties would have very similar views. The final result would be that everyone was in exactly the same political position, on top of the highest hill. (or, if you think of a GA as a minimising function, then at the bottom of the lowest valley).

So, we should all rejoice that it's so hard to tell our political parties apart because it's a sign of increasing perfection [1, 2 and 3] (and yes, it is wonderful that number 2 there has a .com URL).

No, I don't really believe it either. Nice thought for a rainy day though.

Thu Mar 6 15:37:08 GMT 2003

Coder's recent entry contains a link to a really good article on pricing. (similar to that Wal-Mart article I linked to).

He then goes on to talk about how wonderful a database of prices would be so that anyone could instantly compare prices on a given product.

I remember that such a database was going to be one of the great things about the Internet. There was a short story in New Scientist years ago, set in the future where baked beans were the only thing that still had brand loyalty. Everything else was bought from whoever sold at the lowest price. Hyperbole, but you get the idea.

But I would bet that this database won't ever exist. For one, as that article covers, prices are becoming increasingly personalised so there would almost have to be one database per person. Also, companies don't want it.

How many companies offer an XMLRPC/SOAP/etc way to find out the price of anything? Companies don't want a market where their prices are driven into the ground. They want to draw you into their advertising wonderland and certainly don't want RSS type applications searching for the lowest prices. We have all seen the unparsable mess they create when they are trying to make a good website. Just think what they could manage if they were trying to obscure the prices. When they are distorted images (designed to be hard to OCR, Turing Test like) it just won't be worth the effort.

My Dilbert books are in Cheltenham, but I think it's in the Dilbert Future where Scott Adams talks about confusopolies. He was spot on.
