Here's another one from the "uncertain if anybody has any use for a thing like this" department.
Let's say you were trying to decode a four-byte UTF-8 character to its Unicode code point, as a 32-bit integer. You might express that conversion in the form of a neat, short string as follows:
11110aaa 10bbbbbb 10cccccc 10dddddd -> 00000000 000aaabb bbbbcccc ccdddddd
But manual bit-shuffling is tedious at the best of times. shuffle.py is a Python 3 script which takes this exact string as input and returns you some sample bit-shuffling code as output.
C:\>python shuffle.py "11110aaa 10bbbbbb 10cccccc 10dddddd -> 00000000 000aaabb bbbbcccc ccdddddd"
def shuffle(input, index):
    '''
    11110aaa 10bbbbbb 10cccccc 10dddddd
    ->
    00000000 000aaabb bbbbcccc ccdddddd
    '''
    assert index + 4 <= len(input)
    assert input[index] & 0b11111000 == 0b11110000
    assert input[index + 1] & 0b11000000 == 0b10000000
    assert input[index + 2] & 0b11000000 == 0b10000000
    assert input[index + 3] & 0b11000000 == 0b10000000
    return [
        0b00000000,
        (input[index] & 0b00000111) << 2 | (input[index + 1] & 0b00110000) >> 4,
        (input[index + 1] & 0b00001111) << 4 | (input[index + 2] & 0b00111100) >> 2,
        (input[index + 2] & 0b00000011) << 6 | input[index + 3] & 0b00111111,
    ], index + 4
The resulting sample code is perfectly valid and readable, and also ideal for refactoring into whatever purpose you like.
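To get the single 32-bit integer mentioned at the start, the four output bytes still need to be combined. Here is a minimal sketch of one way to do that, with the generated function repeated so the example runs on its own. The wrapper name decode_four_byte is made up for illustration and is not something shuffle.py emits.

```python
def shuffle(input, index):
    # Generated code, reproduced unchanged so this example is self-contained.
    assert index + 4 <= len(input)
    assert input[index] & 0b11111000 == 0b11110000
    assert input[index + 1] & 0b11000000 == 0b10000000
    assert input[index + 2] & 0b11000000 == 0b10000000
    assert input[index + 3] & 0b11000000 == 0b10000000
    return [
        0b00000000,
        (input[index] & 0b00000111) << 2 | (input[index + 1] & 0b00110000) >> 4,
        (input[index + 1] & 0b00001111) << 4 | (input[index + 2] & 0b00111100) >> 2,
        (input[index + 2] & 0b00000011) << 6 | input[index + 3] & 0b00111111,
    ], index + 4


def decode_four_byte(data, index=0):
    # Hypothetical wrapper: fold the four output bytes into one big-endian integer.
    out, next_index = shuffle(data, index)
    return int.from_bytes(bytes(out), "big"), next_index


data = "\U0001F600".encode("utf-8")  # b'\xf0\x9f\x98\x80'
code_point, next_index = decode_four_byte(data)
print(hex(code_point), next_index)  # 0x1f600 4
```

The returned index points just past the consumed bytes, so the same call can be repeated to walk along a longer buffer.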
Possible improvements
There is a huge list of possible improvements which I may embark upon if it turns out that anybody has the slightest use for them.
- shuffle.py is written in Python, but can theoretically give output in many other programming languages
- Options to customise whether or not the resulting shuffle() function checks input length, and if so, how
- Ditto for checking fixed bits on the inputs
- Make the input index and output index optional
- Options for supplying a different function name
- Options for returning different types (at the moment, a list of ints between 0 and 255 inclusive is returned; it would be very easy to return a bytearray or bytes object instead)
- Maybe accept multiple "shuffle" strings, and return an entire binary data encoder as output?
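Several of these ideas come down to first reading the shuffle string as a bit-by-bit mapping. As a rough sketch of that step, kept separate from shuffle.py itself, the following interprets a shuffle string directly instead of generating code. The name apply_shuffle, the choice to return bytes, and the rule that each letter's bits are copied across in order are all assumptions made for this example.

```python
def apply_shuffle(spec, data, index=0):
    """Interpret a shuffle string directly (a sketch, not shuffle.py itself)."""
    src, dst = (side.replace(" ", "") for side in spec.split("->"))
    assert len(src) % 8 == 0 and len(dst) % 8 == 0
    n_in = len(src) // 8
    assert index + n_in <= len(data)

    # Read the input bit by bit: check fixed bits, collect lettered bits.
    fields = {}
    for pos, symbol in enumerate(src):
        bit = (data[index + pos // 8] >> (7 - pos % 8)) & 1
        if symbol in "01":
            assert bit == int(symbol), "fixed bit mismatch at input bit %d" % pos
        else:
            fields.setdefault(symbol, []).append(bit)

    # Write the output: each letter consumes its field's bits in order.
    cursors = dict.fromkeys(fields, 0)
    out = bytearray(len(dst) // 8)
    for pos, symbol in enumerate(dst):
        if symbol in "01":
            bit = int(symbol)
        else:
            bit = fields[symbol][cursors[symbol]]
            cursors[symbol] += 1
        out[pos // 8] |= bit << (7 - pos % 8)
    return bytes(out), index + n_in


decode = "11110aaa 10bbbbbb 10cccccc 10dddddd -> 00000000 000aaabb bbbbcccc ccdddddd"
encode = "00000000 000aaabb bbbbcccc ccdddddd -> 11110aaa 10bbbbbb 10cccccc 10dddddd"

decoded, _ = apply_shuffle(decode, "\U0001F600".encode("utf-8"))
encoded, _ = apply_shuffle(encode, decoded)
print(decoded, encoded)
```

Swapping the two sides of the string gives the reverse operation, which is the sort of thing a full encoder/decoder generator could build on.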