Thoughts on Base32 encoding and decoding
Observations about how others do it
If anyone has ways to optimize this code, please leave a comment. I’m always looking for ways to improve my craft. This has been tested against the supplied test vectors in the RFC.
Update 3: I swapped out the code above with my latest code. Please see the end of the post for the original code and updates 1 and 2. This also fixes a bit-padding issue with the last character.
In the code snippet above, the majority of the operation is handled in one line (
base = bytes...)
and the rest is there simply to allow padding to be calculated properly. Why is it then that other
implementations are multiple methods, each with many more lines than this?
My guesses are as follows:
- I messed up something in my interpretation of the standard, even though all my test data is capable of re-encoding back to the same string as other libraries decode from.
- Other library authors came from other languages and are not familiar with elements of the Ruby language that could aid in optimizing their solution
- The other authors are aware of the method I’m using, however the performance is sub-optimal for large computational volume, or there is some other optimization performed which requires a longer code block.
With my caveat in bullet one, I suspect that is not it, however I am open to points on the contrary. As for bullets two and three, I have not been able to settle on either. I would like to assume the third, however we all know what happens when you make “ASSumptions”.
I invite open dialog on this…
Update 4: In a conversation with my friend Jón, he asked me about the original code. This got me thinking about how small a piece of code I could rewrite the encoder into. The snippet below is the result. I took the original author’s 20 LoC implementation and reduced it to a single line that I spread out for readability. Jón also made the good observation that the original code looks like a port of the C implementation. Nice catch!
Original Code from First Posting:
Update 1: It occurs to me that there is a relationship between the number of bits in the last bit capture that would allow the modulo-math if-block to be replaced with a case statement using length equlity. That could potentially be optimized further if the number of bits can be converted into 0, 1, 3, 6 for the padding multiplier. Thoughts?
Update 2: Added in the
BASE32_CHARS constant declaration to make the example runable.
blog comments powered by Disqus