<?xml version='1.0' encoding='UTF-8' ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Things Of Interest</title>
    <link>http://qntm.org/</link>
    <atom:link
      href="http://qntm.org/rss.php?src"
      rel="self"
      type="application/rss+xml"
    />
    <description>the personal website of Sam Hughes</description>
    <item>
      <guid>http://qntm.org/call</guid>
      <title>The difference between call-by-value and call-by-reference</title>
      <link>http://qntm.org/call</link>
      <description><![CDATA[<b><a href="http://qntm.org/call">Code</a> »</b> The difference between call-by-value and call-by-reference is most easily explained using the following pseudocode.


function modify(input) {
	input = "red"
}

x = "blue"
modify(x)
print(x) // "blue" if call-by-value
         // "red" if call-by-reference


Hint: most popular programming languages call by value. This includes C, Java and Python. The most notable exception is Perl.

Look out!

Confusingly, many call-by-value programming languages pass an object reference as their value.
Use this pseudocode to test this behaviour.


function modify(input) {
	input.colour = "red"
}

x = new Object
x.colour = "blue"
modify(x)
print(x.colour) // "blue" if call-by-value and value is whole object
                // "red" if call-by-value and value is reference to object
                // "red" if call-by-reference


Hint: most popular call-by-value programming languages use a reference to the object as their value. The most notable exception is PHP.

That'...]]></description>
      <pubDate>Mon, 17 Jun 2013 19:18:17 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/finder</guid>
      <title>Java class finder</title>
      <link>http://qntm.org/finder</link>
      <description><![CDATA[<b><a href="http://qntm.org/finder">Code</a> »</b> "Agh I know that that class is in one of these two hundred JARs, just tell me which one for crying out loud"


import os, sys, zipfile

top = sys.argv[1]
# e.g. "C:\\Program Files (x86)"

classname = sys.argv[2]
# e.g. "javax.wsdl.factory.WSDLFactory"

member = "/".join(classname.split(".")) + ".class"
# e.g. "javax/wsdl/factory/WSDLFactory.class"

for dirpath, _, filenames in os.walk(top):
	for filename in filenames:
		if not filename.endswith(".jar"):
			continue
		filename = os.path.join(dirpath, filename)

		with zipfile.ZipFile(filename) as jar:
			if member in jar.namelist():
				print(filename)
...]]></description>
      <pubDate>Tue, 11 Jun 2013 22:56:26 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/variadic</guid>
      <title>Variadic fixed point combinators</title>
      <link>http://qntm.org/variadic</link>
      <description><![CDATA[<b><a href="http://qntm.org/variadic">Code</a> »</b> The procedure for creating a function which is recursive (self-calling) despite also being anonymous (i.e. having no name by which to call itself) is well-understood and straightforward. In a nutshell, the solution is to use a fixed point combinator.
Unfortunately, this procedure is only effective for constructing anonymous functions which directly call themselves, such as:

fibonacci = lambda n: \
	0 if n == 0 else \
	1 if n == 1 else \
	fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(17)) # "1597"

Now imagine you want to construct a pair of anonymous functions, such that each one potentially calls itself and/or the other, such as:

even = lambda n: \
	True if n == 0 else odd(n-1)
odd = lambda n: \
	False if n == 0 else even(n-1)

print(even(6)) # "True"

In this situation, a mere Y combinator isn't sufficient. We need a variadic fixed point combinator which can resolve the fixed points of multiple mutually referential functions simultaneously.
Read on.

Conten...]]></description>
      <pubDate>Sun, 24 Mar 2013 20:20:05 +0000</pubDate>
    </item><item>
      <guid>http://qntm.org/y</guid>
      <title>Y</title>
      <link>http://qntm.org/y</link>
      <description><![CDATA[<b><a href="http://qntm.org/y">Code</a> »</b> How can a subroutine call itself without it knowing its own name?
Contents
					Problem
				
					Fix #1: Symbolic references
				
					Fix #2: strictness
				
					Fix #3: Anonymity
				
					Fix #4: Fixed-point combinators
				
					Fix #5: That was all a complete waste of time
				
Problem
This is Perl.
In reality, my recursive subroutine was one designed to find the differences between two XML documents, but I don't have space for that, so instead I'll use, ugh, Fibonacci numbers:

sub fib {
	my $n = shift;
	return $n == 0 ? 0 :
	       $n == 1 ? 1 :
	       fib($n-1) + fib($n-2);
}

print fib(17); # "1597"


One day I renamed the subroutine and forgot to refactor all of the calls to it, resulting in a runtime error:


sub fibonacci {
	my $n = shift;
	return $n == 0 ? 0 :
	       $n == 1 ? 1 :
	       fib($n-1) + fib($n-2);
}

print fibonacci(17); # "Undefined subroutine &amp;main::fib called at asdf.pl line 3"


This is a trivial defect to spot in this...]]></description>
      <pubDate>Sat, 26 Jan 2013 12:35:02 +0000</pubDate>
    </item><item>
      <guid>http://qntm.org/perl</guid>
      <title>Learn Perl in about 2 hours 30 minutes</title>
      <link>http://qntm.org/perl</link>
      <description><![CDATA[<b><a href="http://qntm.org/perl">Code</a> »</b> A bunch of new people started at work and we use a lot of Perl in our department. So I put together some information about Perl and I thought it might be worth sharing so here it is.

 Perl in about 2 hours 30 minutes
 Japanese translation by ktat

This was originally dispensed in the form of a presentation and about a dozen Perl scripts and half a dozen modules, which I've consolidated into one fairly readable file. If you find something that has been omitted from the file, this is probably either because it isn't used very frequently in my department, because it's trivial to look up, or because it just wouldn't fit due to time constraints.
It turns out that two and a half hours is barely enough to rattle through the essentials of Perl. The biggest example of this is use, a critical core feature which requires knowledge of packages, modules (which are different from packages), BEGIN blocks and class methods (i.e. object-oriented Perl) in order to fully explain....]]></description>
      <pubDate>Tue, 11 Sep 2012 11:57:47 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/shuffle</guid>
      <title>Perform bit shuffling operations in Python</title>
      <link>http://qntm.org/shuffle</link>
      <description><![CDATA[<b><a href="http://qntm.org/shuffle">Code</a> »</b> Here's another one from the "uncertain if anybody has any use for a thing like this" department.
Let's say you were trying to decode a four-byte UTF-8 character to its Unicode code point, as a 32-bit integer. You might express that conversion in the form of a neat, short string as follows:

11110aaa 10bbbbbb 10cccccc 10dddddd -&gt; 00000000 000aaabb bbbbcccc ccdddddd

But manual bit-shuffling is tedious at the best of times. shuffle.py is a Python 3 script which takes this exact string as input and returns you some sample bit-shuffling code as output.


C:\>python shuffle.py "11110aaa 10bbbbbb 10cccccc 10dddddd -&gt; 00000000 000aaabb bbbbcccc ccdddddd"

def shuffle(input, index):
        '''
                11110aaa 10bbbbbb 10cccccc 10dddddd
                ->
                00000000 000aaabb bbbbcccc ccdddddd
        '''
        assert index + 4  4,
                (input[index + 1] & 0b00001111) > 2,
                (input[index + 2] & 0b00000011) ...]]></description>
      <pubDate>Tue, 07 Aug 2012 22:27:38 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/axiom</guid>
      <title>Python lacks the Axiom of Choice</title>
      <link>http://qntm.org/axiom</link>
      <description><![CDATA[<b><a href="http://qntm.org/axiom">Code</a> »</b> Python has sets, but lacks the axiom of choice.
If I have a set object or a frozenset object then that object may or may not have members.
If a set has a single member, then I can get that member using set.pop():

chars = {"a", "b", "c"}

if len(chars) == 1:
    print(chars.pop())
else:
    print("[" + "".join(escape(char) for char in sorted(chars)) + "]")

But this has two drawbacks. Firstly, it's destructive: chars is now empty after following the first code path.
Secondly, it doesn't work on frozensets, because you can't modify a frozenset. The only option available to me is:

chars = frozenset({"a", "b", "c"})

if len(chars) == 1:
    print([char for char in chars][0])
else:
    print("[" + "".join(escape(char) for char in sorted(chars)) + "]")

which is just disgusting. Wouldn't it be much prettier if Python simply supported the axiom of choice?

chars = frozenset({"a", "b", "c"})

if len(chars) == 1:
    print(chars.choose())
else:
    print("[" + "".j...]]></description>
      <pubDate>Thu, 02 Aug 2012 14:40:30 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/fsm</guid>
      <title>Finite state machines in Python</title>
      <link>http://qntm.org/fsm</link>
      <description><![CDATA[<b><a href="http://qntm.org/fsm">Code</a> »</b> 
 fsm.py which is on GitHub now

This is a library for the creation and manipulating of deterministic finite state machines in Python 3. I realise that this has been done before in basically every language under the sun, Python included, but doing it for myself was incredibly good fun and highly educational.
fsm class overview
This contains the basic fsm class. This allows you to create a deterministic finite-state machine (I guess "automaton" is a more correct term but never mind, if that bothers you too much you can do a find-and-replace).
You can use the fsm.accepts() method to pass a string to the FSM and retrieve True or False. The fsm class also implements __str__() which means you can print an FSM using print(fsm).
Finally and most importantly, if you also have my companion lego.py module, you can call fsm.lego() to convert the FSM into an object representing a regular expression with exactly equivalent matching power to the original FSM. This process uses this algorithm ...]]></description>
      <pubDate>Wed, 01 Aug 2012 20:04:22 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/fib</guid>
      <title>Memoizing recursive JavaScript functions (without mentioning fixed-point combinators or lambda calculus)</title>
      <link>http://qntm.org/fib</link>
      <description><![CDATA[<b><a href="http://qntm.org/fib">Code</a> »</b> This is me attempting to decode this.

The problem

Let's say you've got a function of one argument. Basic Fibonacci sequence. Recursive, calls itself. Returns an integer.


var fibo = function(n) {
  return n == 0 ? 0 :
         n == 1 ? 1 :
         fibo(n-1) + fibo(n-2);
};

alert("fibo(32) = " + fibo(32)); // Takes a little while



Is there a way to create a new function which is identical but automatically caches answers before it returns them? Let's say, for simplicity of implementation, that the argument can always be used as the index in an object, which means that a simple object cache can be used to cache the results from the function.
Then there sure is. For example, we could manually create a wrapper function fibo_caching:


var fibo = function(n) {
  return n == 0 ? 0 :
         n == 1 ? 1 :
         fibo(n-1) + fibo(n-2);
};

var fibo_caching = function(n, cache) {
  if(!cache[n]) {
    cache[n] = fibo(n);
  }
  return cache[n];
};

var ...]]></description>
      <pubDate>Wed, 07 Sep 2011 19:11:47 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/tetris</guid>
      <title>Solving Tetris in C</title>
      <link>http://qntm.org/tetris</link>
      <description><![CDATA[<b><a href="http://qntm.org/tetris">Code</a> »</b> Exactly one of the following statements is true.

	In Tetris, you can always get at least one line before losing.
	In Tetris, a sufficiently evil AI can always force you to lose without getting a single line, by providing bad pieces.

Which one?
Prove it.
The setup
I created HATETRIS last year because I wanted Tetris players to suffer. It was something I created for my own amusement and as a programming challenge and to investigate the use of JavaScript as a gaming platform.
HATETRIS explores only to a depth of 1 piece. It looks at the immediate future and chooses the single worst piece for the present situation, ignoring the possibility of the player using the current piece to set up a situation where the 2nd piece forces a line. After the first iteration of the game, I started using a bit mask to express each row of the playing field ("well"), which made testing the insertion and removal of possible pieces faster, to the extent that a new piece appeared more or less instantl...]]></description>
      <pubDate>Wed, 20 Jul 2011 23:26:17 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/bulbs</guid>
      <title>The Bulb Dropping Problem</title>
      <link>http://qntm.org/bulbs</link>
      <description><![CDATA[<b><a href="http://qntm.org/bulbs">Code</a> »</b> The problem
You have a 100-storey building and 2 identical, supposedly-unbreakable light bulbs.
You know for a fact that if you drop a bulb at floor 0 (i.e. ground level), the bulb will not break. You also know for a fact that if you drop a bulb at floor 101 (i.e. go to the roof and throw it off), the bulb will definitely break. In fact, there is some floor f with 0 ≤ f ≤ 100 such that if you drop a bulb at floor f or lower it will not break, and if you drop it at floor f+1 or higher, it will definitely break.
You have only 2 light bulbs. Once they are both broken, there are no more. Assume that dropping a bulb multiple times does not weaken it. How many bulb drops must be performed to find f, in the worst-case scenario?
Hint: the answer is below 20.

Suppose we had just 1 bulb
Consider a simpler scenario in which we have only 1 light bulb. In this case the solution is obvious.
We cannot drop bulb 1 at, say, the 50th floor. If it breaks, then we have learned that f is betwe...]]></description>
      <pubDate>Tue, 12 Jul 2011 22:29:54 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/locoparser</guid>
      <title>More Loco examples</title>
      <link>http://qntm.org/locoparser</link>
      <description><![CDATA[<b><a href="http://qntm.org/locoparser">Code</a> »</b> Following up from Loco, which introduced Loco.php, here are some more examples built using this parser combinator library.
Each of these examples enables you to parse a grammar and return a Grammar object capable of recognising that grammar.

	bnf.php - Defines $bnfGrammar, which parses grammars presented in Backus-Naur Form.
	wirth.php - Defines $wirthGrammar, which parses grammars presented in Wirth syntax notation.
	ebnf.php - Defines $ebnfGrammar, which parses grammars presented in Extended Backus-Naur Form.
	locoNotation.php - Defines $locoGrammar, which parses grammars presented in Loco notation (see below).


Examples and demonstrations can be found in the unit testing sections at the bottom of each script. Alternatively, read on to see what each notation gives you:

Backus-Naur Form
BNF is generally pretty low-tech and lacks a lot of features.
Sample grammar in Backus-Naur Form
This appears on Wikipedia. This is a pretty clunky example because it doesn't handle wh...]]></description>
      <pubDate>Sun, 10 Jul 2011 18:36:29 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/loco</guid>
      <title>Loco: parser combinators for PHP</title>
      <link>http://qntm.org/loco</link>
      <description><![CDATA[<b><a href="http://qntm.org/loco">Code</a> »</b> Loco is a library of parser combinators for PHP.
Loco uses single-valued parsers called MonoParsers. A conventional, "enthusiastic" parser returns a set of possible results (which is empty if parsing is not possible). A "lazy" parser returns one possible result on the first call, and then returns further results with each subsequent call until no more are possible. MonoParsers simply return a single result or failure. By explicitly disallowing backtracking, this keeps parsing time linear (unless you use advanced, non-regular "regular expressions", but that's crazy talk).

  Loco.php

Examples

  json.php - parse JSON expressions
  regEx.php - take simple regular expressions and parse them into data structures
  simpleComment.php - recognise HTML comments for here on qntm.org

More examples
For examples of using Loco to recognise grammar syntax notations such as Backus-Naur Form, see More Loco examples.

Great things about Loco

  Parse in linear time
  Simple, logical,...]]></description>
      <pubDate>Thu, 30 Jun 2011 12:49:38 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/tax</guid>
      <title>U.S. federal individual income tax rates history, 1913-2011</title>
      <link>http://qntm.org/tax</link>
      <description><![CDATA[<b><a href="http://qntm.org/tax">Code</a> »</b> So this is what I was doing with all of that graphing. I guess some people might find this useful. I found the information interesting.


A more interesting summary graph is on Wikipedia: Historical tax rates for the highest and lowest income earners.


U.S. federal individual income taxation 2011
U.S. federal individual income taxation 2010
U.S. federal individual income taxation 2009
U.S. federal individual income taxation 2008
U.S. federal individual income taxation 2007
U.S. federal individual income taxation 2006
U.S. federal individual income taxation 2005
U.S. federal individual income taxation 2004
U.S. federal individual income taxation 2003
U.S. federal individual income taxation 2002
U.S. federal individual income taxation 2001
U.S. federal individual income taxation 2000
U.S. federal individual income taxation 1999
U.S. federal individual income taxation 1998
U.S. federal individual income taxation 1997
U.S. federal individual income taxation 1996
U.S....]]></description>
      <pubDate>Mon, 28 Feb 2011 19:59:34 +0000</pubDate>
    </item><item>
      <guid>http://qntm.org/chomsky</guid>
      <title>Convert a context-free grammar to Chomsky Normal Form in PHP</title>
      <link>http://qntm.org/chomsky</link>
      <description><![CDATA[<b><a href="http://qntm.org/chomsky">Code</a> »</b> Backus-Naur Form is the standard means by which a context-free grammar may be represented in text. Backus-Naur Form can be used to represent any context-free rule from any context-free grammar.
However, it is sometimes more useful to render a context-free grammar into Chomsky Normal Form. In Chomsky Normal Form, only the following forms of context-free rule are allowed:


 S &rarr; ε, where S is the start symbol
 A &rarr; BC, where B and C are non-terminal symbols and B ≠ S and C ≠ S
 A &rarr; α, where A is a nonterminal symbol and α is a terminal symbol.


It can be proven that any context-free grammar, whether presented in Backus-Naur Form or otherwise, can be converted into Chomsky Normal Form. The following PHP script describes a Rule class and a ContextFreeGrammar class to which Rules can be added. The ContextFreeGrammar class also contains a function toCnf() which can be called to convert the grammar to Chomsky Normal Form.


 cfg.php


Use

Use of this co...]]></description>
      <pubDate>Tue, 08 Feb 2011 10:11:39 +0000</pubDate>
    </item><item>
      <guid>http://qntm.org/greenery</guid>
      <title>Compute the intersection of two regular expressions in Python 3</title>
      <link>http://qntm.org/greenery</link>
      <description><![CDATA[<b><a href="http://qntm.org/greenery">Code</a> »</b> I undertook a project to make it possible to compute the intersection between two regular expressions in Python 3.
Elementary automata theory tells us that the intersection of any two regular languages is a regular language, but carrying out this operation on actual regular expressions to generate a third regular expression afterwards is much harder than doing so for the other operations under which the regular languages are closed (concatenation, alternation, Kleene star closure).

Code
The whole thing is now on GitHub.
For the purposes of this project I developed two modules.

First I developed fsm.py (master), a Python 3 library for creating and manipulating finite state machines in Python.
Next, lego.py (master), a Python 3 library for creating and manipulating regular expressions as data structures.

The important features of these modules are:

lego.parse(), which takes a regular expression from the form of a string and turns it into a lego object, a representation of...]]></description>
      <pubDate>Sat, 09 Oct 2010 17:16:22 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/lego</guid>
      <title>Regular expression parsing in Python</title>
      <link>http://qntm.org/lego</link>
      <description><![CDATA[<b><a href="http://qntm.org/lego">Code</a> »</b> 
 lego.py which is now on GitHub

This module provides methods for reading a regular expression, as provided in the form of a string, into a manipulable nested data structure, and for manipulating that data structure.
Note that this is an entirely different concept from that of simply creating and using those regexes, functionality which is present in basically every programming language in the world, Python included.
Critically, this module provides two relatively rare and advanced operations:

 reduce(), a method for the simplification of regular expressions, and
 __and__() (written in code as foo = bar &amp; baz), a method for taking two regular expressions and computing their intersection. Elementary automata theory demonstrates that regular languages are indeed closed under this operation, but performing it is much more difficult in the general case than the other three operations under which regular languages are closed: concatenation, alternation and Kleene star closure....]]></description>
      <pubDate>Sun, 01 Aug 2010 18:10:45 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/hatetris</guid>
      <title>HATETRIS</title>
      <link>http://qntm.org/hatetris</link>
      <description><![CDATA[<b><a href="http://qntm.org/hatetris">Code</a> »</b> Play Hate Tetris.
This is bad Tetris. It's hateful Tetris. It's Tetris according to the evil AI from "I Have No Mouth And I Must Scream".
I have to be honest, this is not an entirely original idea. However, RRRR's implementation of the concept of a game of Tetris which always gives you the worst possible pieces leaves much to be desired:

 the keyboard interface frequently doesn't work
 the conditions for failure are ambiguous and inconsistent
 the playing field is only 8 blocks wide as compared to the standard 10
 the AI is either overly generous, or stupid, and frequently does NOT provide you with the worst possible piece

(UPDATE: there is also Bastet, which can be played online here, but is also far too forgiving.)
HATETRIS was an experiment to rectify those flaws. Also, every coder has to build Tetris at least once in their life. In addition, I also have a passionate hatred of JavaScript, but I felt that that hatred was borne out of ignorance, so this was an attempt to r...]]></description>
      <pubDate>Sat, 03 Apr 2010 23:58:47 +0100</pubDate>
    </item><item>
      <guid>http://qntm.org/snakecube</guid>
      <title>Snake Cube</title>
      <link>http://qntm.org/snakecube</link>
      <description><![CDATA[<b><a href="http://qntm.org/snakecube">Code</a> »</b> For Christmas one of my friends presented me with a block puzzle. This puzzle was basically a Rubik's Cube, but the difference was, instead of having colours on each side, all the sides were coloured yellow. Instead, a snake, with two heads, starts at the centre of one of the faces of the cubies and snakes around the other cubies to its endpoint. Once scrambled, the idea is to unscramble the cube so that the snake forms one continuous curve.
Attempt 1: solving legitimately
This proved to be extremely difficult. To begin with, I am very bad at the Rubik's Cube puzzle, so simply rearranging the cubies using ordinary legal turns is not something I can do. I was restricted to trial and error.
Attempt 2: take the cube apart and solve by eye
The next problem was that even having taken the cube apart - and unfortunately I broke one of the pieces in order to do this, but the puzzle itself was still intact enough to be entirely solvable, so no worries - you still don't know what the solutio...]]></description>
      <pubDate>Fri, 19 Feb 2010 21:45:47 +0000</pubDate>
    </item> 
  </channel>
</rss>