Here's another one from the "uncertain if anybody has any use for a thing like this" department.
Let's say you were trying to decode a four-byte UTF-8 character to its Unicode code point, as a 32-bit integer. You might express that conversion in the form of a neat, short string as follows:
11110aaa 10bbbbbb 10cccccc 10dddddd -> 00000000 000aaabb bbbbcccc ccdddddd
But manual bit-shuffling is tedious at the best of times.
shuffle.py is a Python 3 script which takes this exact string as input and gives you some sample bit-shuffling code as output.
    C:\>python shuffle.py "11110aaa 10bbbbbb 10cccccc 10dddddd -> 00000000 000aaabb bbbbcccc ccdddddd"
    def shuffle(input, index):
        '''
            11110aaa 10bbbbbb 10cccccc 10dddddd -> 00000000 000aaabb bbbbcccc ccdddddd
        '''
        assert index + 4 <= len(input)
        assert input[index] & 0b11111000 == 0b11110000
        assert input[index + 1] & 0b11000000 == 0b10000000
        assert input[index + 2] & 0b11000000 == 0b10000000
        assert input[index + 3] & 0b11000000 == 0b10000000
        return [
            0b00000000,
            (input[index] & 0b00000111) << 2 | (input[index + 1] & 0b00110000) >> 4,
            (input[index + 1] & 0b00001111) << 4 | (input[index + 2] & 0b00111100) >> 2,
            (input[index + 2] & 0b00000011) << 6 | input[index + 3] & 0b00111111,
        ], index + 4
The resulting sample code is perfectly valid and readable, and also easy to refactor for whatever purpose you like.
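To check that the generated function does what the shuffle string promises, here is a small sketch that feeds it a real four-byte UTF-8 character. The `shuffle()` body is copied from the sample output above; the names `data`, `output`, `next_index` and `code_point` are my own for illustration, and combining the four output bytes with `int.from_bytes` is just one way of getting the 32-bit integer.

```python
def shuffle(input, index):
    '''
        11110aaa 10bbbbbb 10cccccc 10dddddd -> 00000000 000aaabb bbbbcccc ccdddddd
    '''
    assert index + 4 <= len(input)
    assert input[index] & 0b11111000 == 0b11110000
    assert input[index + 1] & 0b11000000 == 0b10000000
    assert input[index + 2] & 0b11000000 == 0b10000000
    assert input[index + 3] & 0b11000000 == 0b10000000
    return [
        0b00000000,
        (input[index] & 0b00000111) << 2 | (input[index + 1] & 0b00110000) >> 4,
        (input[index + 1] & 0b00001111) << 4 | (input[index + 2] & 0b00111100) >> 2,
        (input[index + 2] & 0b00000011) << 6 | input[index + 3] & 0b00111111,
    ], index + 4


# U+1F600 encodes to the four bytes F0 9F 98 80 in UTF-8.
data = "\U0001F600".encode("utf-8")

# shuffle() returns the four output bytes plus the index just past the input it consumed.
output, next_index = shuffle(data, 0)

# Treat the four output bytes as one big-endian 32-bit integer.
code_point = int.from_bytes(output, "big")

print([f"{byte:08b}" for byte in output])
print(hex(code_point), next_index)
```

Because the function hands back the next index as well, you can call it repeatedly to walk along a longer buffer, provided every character in it really is a four-byte one.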
There is a huge list of possible improvements which I may embark upon if it turns out that anybody has the slightest use for them:
- shuffle.py is written in Python, but can theoretically give output in many other programming languages
- Options to customise whether or not the resulting shuffle() function checks input length, and if so, how
- Ditto for checking fixed bits on the inputs
- Make the input index and output index optional
- Options for supplying different function name
- Options for returning different types (at the moment, a list of ints between 0 and 255 inclusive is returned; it would be very easy to return a
- Maybe accept multiple "shuffle" strings, and return an entire binary data encoder as output?
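For anyone curious how the fixed-bit checks could be made configurable, they fall out of the left-hand side of the shuffle string quite mechanically: every literal 0 or 1 contributes a bit to a mask and to an expected value, and every letter contributes nothing. The following is a minimal sketch of that idea only. `fixed_bit_checks` is a hypothetical helper invented for this example and is not taken from shuffle.py, whose internals may well differ.

```python
def fixed_bit_checks(spec):
    """Return one (mask, value) pair per input byte of a shuffle string.

    Hypothetical helper for illustration; not part of shuffle.py.
    """
    # Only the left-hand side of "->" describes the input bytes.
    lhs = spec.split("->")[0]
    checks = []
    for byte_pattern in lhs.split():
        if len(byte_pattern) != 8:
            raise ValueError(f"expected 8 bits, got {byte_pattern!r}")
        mask = 0
        value = 0
        for ch in byte_pattern:
            mask <<= 1
            value <<= 1
            if ch in "01":
                # A literal bit is fixed: include it in the mask and record its value.
                mask |= 1
                value |= int(ch)
        checks.append((mask, value))
    return checks


spec = "11110aaa 10bbbbbb 10cccccc 10dddddd -> 00000000 000aaabb bbbbcccc ccdddddd"

# Emit assertions in the same style as the sample output.
for offset, (mask, value) in enumerate(fixed_bit_checks(spec)):
    print(f"assert input[index + {offset}] & 0b{mask:08b} == 0b{value:08b}")
```

With the pairs available as data rather than baked into text, an option to skip the checks, or to raise an exception instead of asserting, becomes a matter of choosing what to print for each pair.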