Tue Mar 22 11:28:52 GMT 2005
The wonders of GreaseMonkey

GreaseMonkey is a Firefox extension which allows you to install Javascript scripts which can manipulate webpages before they're displayed.

You can see a list of GreaseMonkey scripts, but as a demo have a look at the one I cooked up below. It adds result numbers to Google search results and you can then select that result with a single keypress (press '1' for the first result etc).

Type-ahead-find often goes pretty badly on search results because (you would hope) many of the results have all the same keywords in them.

(If you have GreaseMonkey installed, you can install this script by following this link and selecting "Install User Script" from the Tools menu.)

/*
	Add one-press access keys to Google search results. Search results
	are numbered in red and pressing 1..0 selects that search result.
	A Firefox Greasemonkey script,
	Version 0.1
	Adam Langley <agl@imperialviolet.org>

	Public Domain
*/

// ==UserScript==
// @name 			Google Searchkeys
// @namespace 		http://www.imperialviolet.org
// @description 	Adds one-press access keys to Google search results
// @include 		http://www.google.*/search*
// ==/UserScript==

(function() {
	// Search results are in p elements with a class of 'g'
	// This uses XPath to find all such elements and returns a 
	// snapshot. (A snapshot doesnt become invalid after changing
	// the DOM
	
	var results = document.evaluate("//p[@class='g']", document, null, 
			XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
	var counter = 1;
	
	// We store the links in this array which is used by the keypress
	// handler function
	var links = new Array();
	
	for (var i = 0; i < results.snapshotLength; ++i) {
		var result = results.snapshotItem(i);
		// the first child of the paragraph is a comment element
		// this is a little fragile, maybe should be an XPath lookup
		links.push(result.firstChild.nextSibling.getAttribute("href"));

		// We put the result number in a small-caps red span
		var newspan = document.createElement("span");
		newspan.setAttribute("style", "color:red; font-variant: small-caps;");
		newspan.appendChild(document.createTextNode("Result " + counter++ + " "));
		result.insertBefore(newspan, result.firstChild);
	}

	function keypress_handler(e) {
		// e.which contains the ASCII char code of the
		// key which was pressed
		// see: http://web.archive.org/web/20040214161257/devedge.netscape.com/
		// library/manuals/2000/javascript/1.3/reference/
		// handlers.html#1120313
		
		var keypressed = String.fromCharCode(e.which);
		if (keypressed < '0' || keypressed > '9') {
			return true;
		}

		var resnum = e.which - "0".charCodeAt(0);
		if (resnum == 0) {
			resnum = 10;
		}
		document.location = links[resnum - 1];

		return false;
	}

	document.onkeydown = keypress_handler;
})();


Sat Mar 19 15:13:08 GMT 2005
OpenSSH: Old dog, new tricks

OpenSSH has hit version 4.0 (and 4.0p) and with that comes at least one cool new feature: hostname hashing.

If you (or anyone) cats ~/.ssh/known_hosts it lists all the hostnames of every host you ssh to. Probably not a big problem, but the new version of ssh lets you run ssh-keygen -H to hash all these values so that they look like:

|1|bZ457JK38+Bee4NMHxZMmkMqyKg=|+J6sIIzIAoUirdxXwY04fBsb8QQ= ssh-rsa AAAAB3NzaC1y
c2EAAAABIwAAAIEAljhZCk8u8rVqR7YdQxGGG7YBW0uDJq/s9J9hqZlHFs10dX1PHEYsQQf7GV5SB5qLI
6bZcYTZ2OrBOQjlJdp7xPWqCdh3TGEfPUARf5K0tFYCBpFNXt9Fjb2gZDIxG/PAT+JZHJOh66u147QYMo
J3s1MRBoXXm7tSmlwm+QeBcAE=

This, obviously, is a fairly irreversible step (though ssh-keygen does make a backup, the same file name is used for every backup. So it lasts, at most, until the next time ssh-keygen changes the known hosts file.) It also means that you have to use the ssh-keygen -R to delete entries from now on.

Other things that people should do more often: Use ssh-keygen -l and publish the fingerprint of hosts which you expect people to ssh to. Over the phone you can use the -B option to get a more readable version.

Also, use ssh aliases. This is an old trick, but it save a lot of typing of hostnames and usernames (if your username varies across boxes at all). Just put something like this into ~/.ssh/config:

Host alias-name
HostName long.hostname.of.the.host.com
User optional-username

Any option from man 5 ssh_config can go in there.

Tags:

Fri Mar 18 13:17:22 GMT 2005
Acroread 7 for Linux

Hopefully this one will stay around for a little while longer: Acrobat Reader 7 for Linux. The last release disappeared very quickly.

Why would you want this? Well, acroread is the bet quality PDF renderer around I'm afraid. It's big, statically linked and note very fast. But if you're reading a big PDF on screen it's probably worth it.

(Note: once installed goto the Reader/intellinuix directory and rename plug_ins to plug_ins_disabled. It starts much faster, takes less memory and I've no idea what all those plugins do since it seems to work just as well without.)

Update: It was released to help "people in the Netherlands meet tax deadlines"

Tags:

Fri Mar 18 12:59:03 GMT 2005
Just how important is a monotonic clock?

In discussions with Zooko I did a little test to see how important the addition of CLOCK_MONOTONIC is really.

The alternative to using CLOCK_MONOTONIC is to have an itimer which increments a global, volatile counter at some number of Hz. You can do that with something like the following:

struct itimerval itv;
memset(&itv, 0, sizeof(itv));
itv.it_interval.tv_sec = 1;
itv.it_value.tv_sec = 1;

struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_handler = sigalrm_handler;
sigaction(SIGALRM, &sa, NULL);

setitimer(ITIMER_REAL, &itv, NULL);

So I setup a test which tracks the difference between the true time elapsed (from CLOCK_MONOTONIC) and the count of global timer value. Firstly at 10Hz:

Nsecs elapsedGlobal Tick CountSkew in global count
3029049000300
6061401000600
9090753000900
121200890001201
151497730001501
181788020001801
212111430002102
242404800002402
272698220002702
302991660003002
333285130003303
363578670003603
393892030003903
424185580004204
454478990004504
484772420004804
515065930005105
545379640005405
575672850005705
605966330006005
636259870006306
666553240006606
696866770006906
727160180007207
757453630007507
787747140007807
818047020008108
848546410008408
878647660008708
908940980009008
939234440009309
969527940009609
999851630009909
103014486000102010

So, after 100 seconds the counter is already one second off. That's pretty terrible. Trying it again at 1Hz gives better results. I'm not going to give the whole table here but the skew is about 1 second lost every 20 minutes. Still not great.

Tags:

Tue Mar 15 15:47:32 GMT 2005
SSL Libraries

For anyone using OpenSSL for implementing SSL can I suggest theat you look at GnuTLS first? Actually, scratch that. Look at GnuTLS second, after you've seen what the state of the OpenSSL code and documentation are.

I can't argue that GnuTLS has had the same level of inspection and bug hunting that OpenSSL has, and maybe that's a clincher, but it actually has good docs, with examples and everything. It's based on gcrypt (the core of gnupg I believe) and you're much less likely to screw up when using it than you are with OpenSSL.

(I've no fiscal or otherwise interest in GnuTLS, this is just from trying and giving up with OpenSSL over the last couple of days.)

Tags:

Mon Mar 14 10:08:11 GMT 2005
Monotonic Time

How long has a monotonic clock been sitting in POSIX without me noticing? The lack of one has really bugged me in the past. gettimeofday is useless for many timekeeping tasks because it can jump backwards and forwards with the tides of daylight savings, NTP and switching timezones. One doesn't want every timeout to suddenly trigger because they're all `over an hour overdue`, or (possibly worse) not trigger for an hour. Usually I use settimer to increment a one second resolution counter and hope for the best.

But behold! clock_gettime (go read the manpage) can be passed a CLOCK_MONOTONIC argument and on my system at least (2.6.11, glibc CVS Jan 2005) it's a system call which returns the current uptime of the system with nanoseconds. Fantastic.

(Note: you need to link against librt.)

Wed Mar 9 16:32:32 GMT 2005

Is it really beyond the wit of man sshd-authors to log an error message saying Rejecting login due to shell not being in /etc/shells??

In fact, if such an event occurs the error message is Failed password which, when your password goes via pam, via Kerberos 5 to a Windows Active Directory server can really take quite a while to track down what the hell's wrong.

Tue Mar 8 17:05:53 GMT 2005
Free Municipal WiFi

Lessig writes a satirical reply to a bill on the desk of the governor of Philadelphia which would prevent the state from funding free-to-use WiFi networks. I happen to think that Lessig's condemnation of this is ill considered.

I believe that some products don't work in a free market, while others should only ever be setup in one. National defence is an example of the first and Lessig is correct that street lighting probably is too. Lots of things are at the other end, luxury items most surely.

It may be reasonable to worry that a bill which prevents the state from offering any service which could be provided privately is quite stupid, but this has been mixed up with the idea that telecoms companies are using the legislative process to destroy `competition' from the state in the broadband market. The latter feeling rests upon the assumption that it's a good idea for the state to offer free-to-use WiFi and Lessig's writings are being used to support that.

So where is WiFi on the scale from national defence to chocolate cake? The closest example is mobile phone service. This isn't provided by the state, and I've never heard anyone suggest that it should be. Certainly this leads to some duplication of base stations, I'm sure. But more importantly it has lead to better phone service through competition. Does anyone believe that the quality of service would be better, or the costs lower, if mobile phone service was provided by the state?

So why should WiFi be `free'? (I've used to quotes because, of course, everyone is forced to pay for it, it's just mashed together with the rest of the local tax.)

There may be some argument that people need broadband in the same way that public libraries are a good idea. But even I wouldn't suggest the broadband is that important. Indeed, broadband usage in the US has fallen (in terms of world rankings) quite sharply in recent years suggesting that American people agree. If there was demand for broadband then I suspect companies would be offering it in more areas of Philadelphia already.

New Dr. Who

I'm now utterly convinced that the leak of the first episode of the new series of Dr. Who was not a cunning viral marketing ploy. It's rubbish. It plays like a episode of Neighbors with a phone box in it. The special effects of the old series were charmingly primitive. I'm not sure if the effects in the new series were trying to emulate that or if they were just dire. I couldn't even watch it all the way through.

Mon Mar 7 09:53:28 GMT 2005

(NB: This server heeps will be down for upgrades from 11am GMT tomorrow until it's finished.)

I don't have a whole lot to write about at the moment, as the timestamps on this pages will attest to.

It's my third, and final, ride round the merry-go-round at Imperial as the second term draws to a close and the Easter revision period looms large. At the end of this period lie the exams - painfully spread out - and the guilt of knowing that I should probably be doing more revision tempered with the knowledge that my mark was probably set in the first couple of weeks of the course; when I figured out if I liked it. Never has anything managed to motivate me to do something which I don't enjoy.

And then (after a long gap in which I've nothing to do and no money to do it with) I'm probably going to be leaving it all behind. Possibly going to Zürich, slim chance of Mountain View.

Thu Feb 24 20:33:54 GMT 2005
Brian Sedgemore MP on the Prevention of Terrorism Bill

(Hansard source, theyworkforyou link)

As this will almost certainly be my last speech in Parliament, I shall try hard not to upset anyone. However, our debate here tonight is a grim reminder of how the Prime Minister and the Home Secretary are betraying some of Labour's most cherished beliefs. Not content with tossing aside the ideas and ideals that inspire and inform ideology, they seem to be giving up on values too. Liberty, without which democracy has no meaning, and the rule of law, without which state power cannot be contained, look to Parliament for their protection, but this Parliament, sad to say, is failing the nation badly. It is not just the Government but Back-Bench Members who are to blame. It seems that in situations such as this, politics become incompatible with conscience, principle, decency and self-respect. Regrettably, in such situations, the desire for power and position predominates.

As we move towards a system of justice that found favour with the South African Government at the time of apartheid and which parallels Burmese justice today, if hon. Members will pardon the oxymoron, I am reminded that our fathers fought and died for libertymy own father literallybelieving that these things should not happen here, and we would never allow them to happen here. But now we know better. The unthinkable, the unimaginable, is happening here.

In their defence, the Prime Minister and the Home Secretary say that they are behaving tyrannically and trying to make nonsense of the House of Lords' decision in A and Others as appellants v. the Home Secretary as respondent because they are frightened, and that the rest of us would be frightened too if only we knew what they will not tell us. They preach the politics of fear and ask us to support political incarceration on demand and punishment without trial.

Sad to say, I do not trust the judgment of either our thespian Prime Minister or our Home Secretary, especially given the latter's performance at the Dispatch Box yesterday. It did not take Home Office civil servants or the secret police long to put poison in his water, did it? Paper No. 1, entitled "International Terrorism: the Threat", which the Home Secretary produced yesterday and I have read, is a putrid document if it is intended to justify the measure. Indeed, the Home Secretary dripped out bits of it and it sounded no better as he spoke than it read. Why does he insult the House? Why cannot he produce a better argument than that?

How on earth did a Labour Government get to the point of creating what was described in the House of Lords hearing as a "gulag" at Belmarsh? I remind my hon. Friends that a gulag is a black hole into which people are forcibly directed without hope of ever getting out. Despite savage criticisms by nine Law Lords in 250 paragraphs, all of which I have read and understood, about the creation of the gulag, I have heard not one word of apology from the Prime Minister or the Home Secretary. Worse, I have heard no word of apology from those Back Benchers who voted to establish the gulag.

Have we all, individually and collectively, no shame? I suppose that once one has shown contempt for liberty by voting against it in the Lobby, it becomes easier to do it a second time and after that, a third time. Thus even Members of Parliament who claim to believe in human rights vote to destroy them.

Many Members have gone nap on the matter. They voted: first, to abolish trial by jury in less serious cases; secondly, to abolish trial by jury in more serious cases; thirdly, to approve an unlawful war; fourthly, to create a gulag at Belmarsh; and fifthly, to lock up innocent people in their homes. It is truly terrifying to imagine what those Members of Parliament will vote for next. I can describe all that only as new Labour's descent into hell, which is not a place where I want to be.

I hope that but doubt whether ethical principles and liberal thought will triumph tonight over the lazy minds and disengaged consciences that make Labour's Whips Office look so ridiculous and our Parliament so unprincipled.

It is a foul calumny that we do today. Not since the Act of Settlement 1701 has Parliament usurped the powers of the judiciary and allowed the Executive to lock up people without trial in times of peace. May the Government be damned for it.

Sat Feb 19 18:25:52 GMT 2005
Why we shouldn't have security regulation

Bruce Schneier is calling for regulation of software to punish companies who release programs with security problems. This is stupid (sorry Bruce):

  1. Govt regulation is bad: It creates bureaucracy, its rules are complex, arbitrary and inflexible and it costs ... lots. Unless there is a clear benefit to regulation, which gains us more than it costs, then we shouldn't do it. This means that the burden of proof is on the other side.
  2. Who knows what the hell those crazy fools will come up with?: Let's face it. If we're talking about laws to regulate the tech industry then the people voting on them are mostly the same lot which gave us the DMCA (if you're in the US), the EUCD and (very nearly) software patents (if you're EU). These people are not competent to regulate software.
  3. Who are they to decided on the balance of security?: Security is a trade off. People still run phpBB, despite its security record because they think it's functionally superior and that that makes up for the security. That seems to work well for sites like forums.gentoo.org, but other people (myself included) treat running phpBB as the security equivalent of bending over in the prison showers.
  4. What about open-source (etc) software?: Leading on from the second point .. who's to say that you won't be able to release open-source software without liability insurance? If software makers are going to be fined for security problems how is this going to be avoided? Do you trust them to draw that line properly?
  5. What's a security problem?: While they're at it you can be sure that there will be a push from some quaters to get tools like nmap and nessus banned (or made impossibly expensive for their authors). I'm sure that the MPAA and RIAA would define the end-to-end nature of the Internet as a security problem, would you?

Yes, this is fear-mongering. There's a possibility that a given law will be very sensible and reasonable (a thousand monkeys etc). But I'm saying that we shouldn't even start down that road because it will probably end up somewhere very bad and we won't be able to steer it once it starts.

Wed Feb 16 17:47:43 GMT 2005
Directions in Future Languages - Lock-Free malloc

Normally I'll just bookmark papers, but I've found one that deserves better treatment: Scalable Lock-free Dynamic Memory Allocation. It was presented at PLDI04 and is everything a paper should be: practical, clear and with great results.

The author presents a malloc implementation which is lock-free (so you can call it from signal handlers, or kill threads and not cause deadlock etc) and it's faster (on SMP boxes) than the other mainstream concurrent allocators (Hoard and PTMalloc specifically).

For anyone at Imperial, you should come to a talk by Tim Harris on this subject on March 9th (details to be announced).

Entertain

And everyone at (or near) Imperial should come to the charity ball in the Great Hall on the 26th of this month. Tickets are £10 and every penny goes to charity.

Queens tower with projection

Thu Feb 10 14:17:48 GMT 2005
Why Application Level Filtering in Tor is Bad

Background for those who need it: Tor is an onion routing network for TCP streams that allows users to be fairly anonymous while using HTTP/IRC etc. The TCP connections are bounced round the Tor nodes and come out somewhere unrelated to the real source of them.

Of course, over such networks abuse happens. At the moment the most concerning is the spamming of Usenet via Google Groups. Not all Tor nodes allow themselves to be the final (exit) node in the chain as that node is where the connection appears to be coming from (at the IP level) to whoever is the target of the connection. Those that do only allow certain destination ports - 80 is a very common one. Thus people can use Tor to access Google Groups and post spam to Usenet.

(It's suspected, for a number of reasons, that people are doing this in order to trigger complaints to the ISP of the exit nodes and thus it's an attack on the Tor network as a whole. Some people truly believe that anonymity is evil.)

Tor exit nodes can refuse to connect to Google Groups and this is reported in the Tor network wide directory of nodes. Thus clients can check which nodes will support a connection to the destination that they require and choose those nodes as exit nodes. However, a running game of blocking websites used for abuse is probably an unwinnable game. Also, why shouldn't people be able to read Google Groups over Tor? It's only posting that is concerning.

Thus some people (e.g. myself) have suggested that the exit nodes should be able to parse outgoing connections (HTTP being a very good example) and reject POST requests and the like. Here's why this is a bad idea.

This policy could be described in the directory, as IP based policies currently are but they can't be used because the first Tor node (client) cannot know if the browser is going to need to POST before creating the connection, and the exit node is chosen at that point. Thus the exit nodes are chosen randomly and some will have POST blocked.

Tor users then experience random failure of posting. Sometimes it will work, sometimes is doesn't. So the whole network will be dragged down to the level of the most restrictive exit node - because anything else will randomly fail.

Tags:

Mon Jan 31 20:21:47 GMT 2005
Right To Protest

Our dear government has announced that there will be a crackdown on protesters in the new crime bill which is going though the motions at the moment.

This is clearly (and explicitly, in all but the wording of the bill) aimed at `animal rights' protesters who have, in recent times, stepped up their campaigning to include grave robbing, hate mail, vandalism, etc. The arguments are predictable and the animal rights `protesters' are claiming that this is an attack on their right to protest.

I believe that a right to protest should exist. Protesting gives a voice to those that cannot be heard another way. This may be because of financial limits, or because the media refuses to carry their story. As a firm believer in freedom of speech, I think that protesting is an important form of communication.

However, the right to protest is fairly limited. It's a communication mechanism, not a way to impose ones views on the world. What some animal rights `protesters' are doing has gone far beyond communication, into enforcement.

I'm slightly off-balance finding myself, as I do, in agreement with the government on this one. They are enforcing their monopoly on violence, which is right. We install a monopoly on violence in the government because we can (hopefully) control it via the democratic process. That democratic process then sets the limits on what the people can do. If it gets it right, the limits should be minimal.

Animal rights protesters are seeking to impose their own limits on what is already a tightly regulated sector. The government is right to slap them down.

Tags:

Sun Jan 30 10:57:11 GMT 2005
Directions in Future Languages - Software Transactional Memory

STM is a method of handling concurrency in multithreaded systems. The light at the end of this tunnel is that we wish to able to write the following:

def add-child(x):
  global children-by-name, children-by-id
  atomic:
    children-by-name[x.name] = x
    children-by-id[x.id] = x

At no point would any other thread see the child in one map, but not the other. Traditionally this would be done with locks. The maps themselves would have locks inside them, of course, and would thus be `thread safe', but we would need to implement our own locking in order to achieve atomicity between them. An STM says that, when you enter an atomic block, nothing happens until you exit it and then it all happens at once. If another thread altered children-by-name while you were processing the block above the atomic block would be aborted and attempted again.

Lock-free techniques:

The design I'm outlining is is from the lock-free group at Cambridge and a good explaination can be found in Keir Fraser's PhD dissertation. So why am I reiterating it here? Partly because I'm probably going need to write something like this for a report of my own at some point, but also because digging into a PhD can be tough work and these ideas deserve to be better known.

So here's an STM linked list:

The first thing to notice is that nothing holds a direct pointer to anything - they are all via STM object headers. The object header holds the true pointer to the data.

When starting a transaction, a transaction context must be created. This holds the status of the transaction (undecided at the moment) and two lists; a read list and a write list. In this STM design, objects must be `opened' - either for reading or for writing. When you open an object you get a pointer to the true data and that object is recorded in the context.

Thus you open the objects you need (say, opening for read each list element in turn until you find one to delete, thus opening the previous one for write). When an object is opened for write, a copy is made and a pointer to that is returned. Thus you don't edit `live' objects.

So assume that you're removing the last element from this list. Thus you've read the first element and altered the second element (to put a null pointer in its next field). Your memory now looks like this:

Since you are finished altering the list you commit the transaction. At this point we need to introduce a hardware level primitive that we'll be using. In the litrature it's called CAS (Compare and Swap) and, on Intel systems, it's called Compare and Exchange (cmpxchg). It takes three arguments: (old value, new value, memory location) and replaces the contents of memory location with new value if, and only if, its current value is old value, returning the contents of memory location at the end of the instruction - and it does this atomically.

So the CAS operation allows us to update a value in memory, assuming that someone hasn't already beaten us to it. A transaction commit uses this operation to change the pointer in the object headers of objects in the write list, to point to the transaction context instead. This is called `aquiring' the header and it is done in order of increasing memory address.

So, assuming that no other transaction is comming and has beaten us to it, our memory now looks like:

But what if another thread is commiting a conflicting transaction? In that case we'll find a pointer to its transaction context when we do a CAS and we recursively help. Since we have a pointer to its context, we have all the information we need in order to perform its commit ourselves, so we do so. This is to ensure that some progress is always being made by the program as a whole. Unfortunately, it also means that our transaction is void so we abort and try it again.

Assuming that we have accquired all the object headers that we are writing we have to check that none of the objects that we read have been updated in the mean time. If everything looks good we update the pointers in the object headers to point to our shadow blocks and release them - the transaction is complete:

(Note, this is a simplification - see the paper for the full details. Gray objects are now garbage.)

There are several other tricks which can be added to the above. The same Cambridge group has some ideas in one of their recent papers about how IO can be included in a tranaction. IO is, of course, a side effectful operation so clashes if we ever need to `roll-back' a transaction. The Cambridge group have implemented IO in Java which can rebuffer and be included in a transaction.

More cool ideas are presented in this paper (which is talking about a Haskell based STM). There the author presents a primitive called retry which aborts the current transaction and only retries is when an object in the read-list is updated. Thus a queue can be implemented like this:

class Q:
  def get(self):
    if self.length == 0:
      retry
    return self.q.pop()

Thus no more missed condition variables resulting in a stuck thread.

Tags:

Mon Jan 24 19:35:37 GMT 2005
Including SVG figures in TeX documents

Doing this is way tougher than it should be. I gather that, although SVG may be included in PDF documents (for some versions of PDF), it cannot be included inline, but only by giving the SVG a whole page. I've no idea why though.

Note that I wont accept any path which introduces bitmaps. Thus exporting SVG as a high DPI bitmap and including that will not do. That's easy. My requirement is that it look pretty in Acroread 7. (Yes, I have Acroread 7 for Linux but it won't be public for a while yet.)

Why do I want to include SVG? Because good editors for SVG exist. My favourite is Inkscape and, although xfig will always have a place in my heart, it doesn't really cut it anymore I'm afriad.

So, you'll need:

FOP was the tough thing to find. Google was little help here. You may also have success with Scribus, it has good PDF output but its SVG import was too poor for me. Also, you may wish to try Adobe SVG Viewer on Windows or Mac OS X with a PDF printer driver.

To use FOP you'll need to download this wrapper XML file and edit it to set the name of your SVG file. Then export JAVA_HOME and run FOP:

./fop.sh svgpdf.fo -pdf output.pdf

Done? Now edit your TeX file and include the pdftex graphics package:

\usepackage[pdftex]{graphicx}

And include the PDF file:

\begin{figure}[h]
\scalebox{0.82}[0.82]{\includegraphics[viewport=0 740 200 840]{filter-dia.pdf}}
\caption{The capability filter pattern}
\end{figure}

The arguments to scalebox are the X and Y scale factors. The viewport argument is a clipping box for the source PDF file: the lower-left and upper-right corners in pts from the bottom-left of the page. Those values take a few trials to get right, but inkscape will tell you the general values.

Mon Jan 24 12:04:17 GMT 2005
Directions in Future Languages - Exceptions

(Andy: sorry, your Jabber client isn't going to like this one either :))

Exceptions generally work well (so long as people don't misuse them for general control flow) but I'm still looking for a language with a couple of features, which I think are missing.

Firstly, I want to be able to test which object raised the exception. Consider:

try:
  somefunc(mapping[key])
except KeyError:
  return 'key unknown!'

Unfortunately, the KeyError may have come from the lookup in mapping, or it may have come from deep within somefunc - I've no way of telling. In Python I can test the value which caused the error, but that's a bodge.

There are several workarounds:

if not key in mapping:
  return 'key unknown!'
somefunc(mapping[key])

Which is probably what I would do - but it requires an additional lookup. Or:

try:
  value = mapping[key]
except KeyError:
  return 'key unknown!'
somefunc(value)

Which is a little ugly. What I really want is a way to test which object caused the KeyError in the first place:

try:
  somefunc(mapping[key])
except KeyError from mapping:
  return 'key unknown!'

Also, I would like to second a call from Bram for the ability to assert that a given exception would be caught, somewhere in the call chain, by something other than a default handler. Currently there's no way to do this at all in Python (except glitch testing - even then it's not certain) and I've also not heard of it any place else.

Sat Jan 22 11:29:05 GMT 2005
Directions in Future Languages - Predicate Dispatch

I'm certainly not the fist person to talk about predicate dispatch. Like almost everything else in the world of computing - LISP has already done it. But that doesn't mean that anyone actually uses it. (If you want papers see [it rocks] and [it's efficient].)

You probably know predicate dispatch from such feature movies languages as Haskell and Erlang:

def fact(0):
  return 1
def fact(N):
  return N*fact(N - 1)

That's an example of pattern matching, but we can do more:

def is_even?(N) where N % 2 == 0:
  return 'yes'
def is_even?(N):
  return 'no'

We're allowed to test whatever we wish as a predicate. Many people think that the predicates should be side-effect-free, but I'm not too bothered about that. If some coder has good reason for a side-effectful predicate then go right ahead.

No that, in the last example, the most specific function was called. This is determined by the dispatcher. The dispatcher collects and sorts a list of candidate functions and determines which are to be called. Even if I say most-specific-first I don't mean to say that the following will always work:

def foo(N) where N % 2 == 0:
  return True
def foo(N) where N % 4 == 0:
  return False

Sure, the first function is more specific, but don't except your langauge to work that out. In the above case, both functions would be deemed equally specific so it would probably be a case of most-recently-defined first.

This brings us to a couple of other common tricks in these systems: call-next and :before, :after functions etc.

A call-next allows a more specific function to do stuff and then call the next most specific one etc:

def authenticate-user(Username) where Username == 'root':
  if global-disallow-root:
    return False
  call-next(Username)

In dynamic languages that function could be injected at any point and thus thus 'override' a generic authenticate-user function in a library.

:before, :after and :around are from LISP (note the colon) and allow an equally specific function to bump itself up the priority stack (:before), pre-process the return value before the caller sees it (:after) or both (:around)

Also, if one can define any dispatcher one likes (on a per-function-name basis) then they can do odd things. Like calling every appliciable function and summing their return values. (Probably best if the return function is commutative like addition, I think). I can't think, right now, of an example where it would be useful, but I'm sure someone will come across a suitable problem at some point in the history of the universe.

With predicate dispatch one can also build up common features of programming languages like, say, object orientation:

Let class be a type where SomeClass == SomeDerivedClass is true, but SomeDerivedClass == SomeDerivedClass is a more specific match. Now I can define objects:

def __init__(Self, X, Y) where Self == Rectangle:
  Self.X = X
  Self.Y = Y

def type(Self) where Self == Rectangle:
  return 'Rectangle'

Circle(Rectange)  # make Circle == Rectange and
                  #      Circle <  Rectange true

def __init__(Self, X) where Self == Circle:
  Self.X = X
  Self.Y = Y

def type(Self) where Self == Circle:
  return 'Circle'

Now one can override functions of a class 'in-place'. Just declare a more specific function (or use something like :before) and all calls to existing instances of that object are overridden:

def :before type(Self) where Self == Circle:
  return 'Ha, this will screw things up'

(p.s. I've no idea why I suddenly started Capatalising the first letter of variables in the examples. It's bad - don't do it.)

(you too can play with this stuff today by using PEAK)

Wed Jan 12 23:25:54 GMT 2005
Atomic Increment

I've had this link about java atomic operations in my del.icio.us links for a while now. I'm reading around the subject for a number of far flung ideas which might make it into my final year project. But I hate being too abstract for too long, so I implemented a single atomic counter to check that it works the way I expect etc.

As ever, the gcc info pages on inline asm are pretty bad and AT&T syntax buggers my everytime. Still, I have two (NPTL) threads each incrementing a counter. I can increment them without locking, with an atomic op and with locking.

Without locking: 0.2 secs, with locking: 14.1 secs, with atomic ops 3.2 seconds. Not bad.

/* Very quick demo of atomic operations
 * agl */
#include <pthread.h>
#include <stdio.h>

volatile int count = 0;

#define ATOMIC_INC(x) \
	asm volatile("agl1:" : :); \
	asm volatile("mov %0,%%eax" : : "m" (count) : "eax"); \
	asm volatile("mov %%eax, %%ebx" : : : "ebx"); \
	asm volatile("inc %%ebx" : : : "ebx"); \
	asm volatile("lock cmpxchg %%ebx,%0" : : "m" (count) : "eax"); \
	asm volatile("jnz agl1" : :);

/* A thread function to increment count atomically */ 
void *
incr(void *arg) {
	int a;
	for (a = 0; a < 10000000; ++a) {
		ATOMIC_INC(count);
	}

	return NULL;
}

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

/* A thread function which uses locking */
void *
incr_locked(void *arg) {
	int a;
	for (a = 0; a < 10000000; ++a) {
		pthread_mutex_lock(&mutex);
		count++;
		pthread_mutex_unlock(&mutex);
	}

	return NULL;
}

int
main() {
	pthread_t th1, th2;

	pthread_create(&th1, NULL, incr, NULL);
	pthread_create(&th2, NULL, incr, NULL);
	
	pthread_join(th1, NULL);
	pthread_join(th2, NULL);

	printf("count is %d\n", count);

	return 0;
}
Sun Jan 9 13:45:38 GMT 2005

People who read this via RSS probably don't even notice my del.icio.us feed at the top. (The RSS for that is here).

Recently, I have mosting been reading about concurrency. This isn't new - I've been thinking about it for ages (in fact, some people wish I would shutup about it sometimes). But previously my thinking has been about how to manage complex, stateful servers without lock-hell or inversion-of-control. I've now written a Python module (twistless) which does this very nicely. (Source availible upon prodding, but it needs polishing before a public release)

But that (and my other code, like Neuroses) is based around syncthreading which manages to avoid locks by only really have a single thread. The complexity is vanquished, but it doesn't actually use an SMP machine.

This was in the hope that `it will always be fast enough tomorrow'. The thinking was that doubling the speed of your library wasn't worth hand-crafted inverted control code because the CPUs would catch up soon enough anyway. But anyone who has been watching cpu speeds will have noticed that this is no longer true - we've hit the ceiling and CPUs are now growing outwards (multiple cores) rather than upwards (ever faster clock speeds).

Herb Sutter has a good text in DDJ about this.

Doing `true' concurrency without lock-hell is really tough. I don't know how to do it. (Maybe you're smart enough to write complex, multi-threaded code with fine-grained, deadlock and priority-inversion free locking - but I'm not, and nor are a lot of people.) Erlang does quite well in this area with very real results (working ATM switches etc), but it's based around the idea of message-passing threads, which isn't my cup of tea.

The most exciting thing I've read about this is Composable memory transactions which uses a begin-commit style for shared memory (with retry and orElse - see the paper). Unfortunately, it's built on Haskell. (A language which calls IO, exceptions and external calls the "awkward squad" was built on the wrong foundations I'm afraid. Having said that - here's a great introduction on how to do them in Haskell).

Unfortunately, I can't see that the ideas in that paper fit neatly into any current language. (The authors had to hack GHC quite a lot - and even then it only runs syncthreads in a single kernel-thread!) so maybe a new language is in order to play about with them.

Site Map
/Root
     AlternateThe Weird and Wonderful
          BacklinksWhat are backlinks
          John GilmoreWhat's Wrong with Copy Protection
     ArchivesBlog Archives
          OneArchive 1
          TwoArchive 2
          ThreeArchive 3
          FourArchive 4
          FiveArchive 5
          SixArchive 6
          SevenArchive 7
          EightArchive 8
          NineArchive 9
          TenArchive 10
          ElevenArchive 11
          TwelveArchive 12
          ThirteenArchive 13
          FourteenArchive 14
          FifteenArchive 15
          SixteenArchive 16
          SeventeenArchive 17
          EighteenArchive 18
          NineteenArchive 19
          Twenty Archive 20
          Twenty OneArchive 21
          Twenty TwoArchive 22
          Twenty ThreeArchive 23
          Twenty FourArchive 24
          Twenty FiveArchive 25
          Twenty SixArchive 26
          Twenty SevenArchive 27
          Twenty EightArchive 28
          Twenty NineArchive 29
          Thirty Archive 30
     PhotosPoor People Caught on Film
          Jack and the Beanstalk Jack and the Beanstalk
          RIP ScanResults of a Stage Scan Fire
          YosemiteYosemite National Park
     ProjectsIncomplete things from the lab
          Seagull's BaneLinux Automounter
          bttrackdBitTorrent Tracker
          CAPTCHACAPTCHA CGI script
          ConservConsole Serving
          DeerparkUsing Tor with Firefox/1.1 (Deerpark)
          DNSFixFixing DNS
          XoversXTA Crossover Control
          IAFSArchive Org Storage
          JBIG2JBIG2 Encoder
          VerifyPGP Key Verifier
          MaxFlowMaximal Flow in Python
          PyBloomBloom Filters in Python
          pyGnuTLSPython wrapping of GnuTLS
          SxmapApache SuEXEC Map
          HellardUnion Server Notes
     RecordingsFree recordings
          ICSM ChoirSt Paul's Church
     SchoolAncient School Stuff
     WritingsWho knows
          Cap SystemsCapability Systems
          IntroIntroduction to me
          SupremaJMC2 Group Project
          MP LettersLetters I've written to my MP
          SoundSound With Dramsoc
          SyncThreadingThe wonders of user-land threads