Wednesday, February 17, 2010

Base64 Encoding and Decoding

     Base64 is an encoding scheme that represents arbitrary binary data as printable text, so that the data can pass safely through channels designed for text. It was initially used mainly for transmitting data through email, but it is now used in a wide variety of applications.

      The encoding converts the input data into printable ASCII characters. The input stream is divided into 6-bit segments, and each segment is mapped to one character from a set of 64 encoding characters according to its value (6 bits can represent 64 values, 0 to 63). In practice, this is done by taking 3 consecutive bytes (3 x 8 bits = 4 x 6 bits) of the input stream and converting them to 4 encoding characters. If the total number of bytes in the input stream is not a multiple of three, the last group is padded with zero-valued bytes to make it three bytes long, and the output characters that correspond only to padding are written as "=", so the encoded length is always a multiple of four.
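
As a rough sketch of the steps above, here is how the 3-byte to 4-character conversion could be written by hand in Python. The helper name encode_base64 and the ALPHABET constant are made up for illustration, and the code assumes the standard alphabet (A-Z, a-z, 0-9, "+", "/"); in real code you would simply use the standard base64 module.

```python
import base64

# Standard base64 alphabet: index 0..63 maps to one printable character
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_base64(data: bytes) -> str:
    """Illustrative helper: encode bytes 3 at a time into 4 characters."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)            # 0, 1 or 2 bytes short of a full group
        chunk += b"\x00" * missing          # pad the last group with zero bytes
        n = int.from_bytes(chunk, "big")    # the 3 bytes as one 24-bit number
        # Split the 24 bits into four 6-bit values, most significant first
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        if missing:
            # Characters that came only from padding are written as "="
            chars[-missing:] = "=" * missing
        out.append("".join(chars))
    return "".join(out)

print(encode_base64(b"safeer"))                                        # c2FmZWVy
print(encode_base64(b"safeer") == base64.b64encode(b"safeer").decode())  # True
```

The last line only checks the hand-written version against the library, which is the one to rely on in practice.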

     The decoding of the input stream is done on the receiving end by reversing the above method. 
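
Reversing the method could look something like the sketch below. The function name decode_base64 is hypothetical, and the code assumes well-formed input that uses the standard alphabet and has a length that is a multiple of four once whitespace is removed; it does no error checking.

```python
import base64

# Standard base64 alphabet: the position of a character is its 6-bit value
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def decode_base64(text: str) -> bytes:
    """Illustrative helper: decode 4 characters at a time into 3 bytes."""
    text = "".join(text.split())            # drop newlines and other whitespace
    out = bytearray()
    for i in range(0, len(text), 4):
        group = text[i:i + 4]
        padding = group.count("=")          # 0, 1 or 2 padding characters
        n = 0
        for ch in group:
            value = 0 if ch == "=" else ALPHABET.index(ch)
            n = (n << 6) | value            # rebuild the 24-bit number
        # Keep only the bytes that carried real data
        out += n.to_bytes(3, "big")[:3 - padding]
    return bytes(out)

print(decode_base64("c2FmZWVy"))            # b'safeer'
print(decode_base64("c2FmZWVy") == base64.b64decode("c2FmZWVy"))  # True
```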

     On Linux, the "base64" command is used to encode and decode base64.  Let us see an example.  We will convert the "ls" command binary to base64 and back, and check whether the resulting binary still works as "ls".

safeer@penguinpower:~/TMP$ cp /bin/ls .
safeer@penguinpower:~/TMP$ /home/safeer/TMP/ls

ls

Convert the binary to base64

safeer@penguinpower:~/TMP$ cat /home/safeer/TMP/ls | base64 > ls.encoded

Compare both

safeer@penguinpower:~/TMP$ wc ls ls.encoded

   503   3271 104508 ls
  1834   1834 141178 ls.encoded
  2337   5105 245686 total
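
The size difference can be worked out by hand: every 3 input bytes become 4 characters, and the "base64" command wraps its output at 76 characters per line by default, adding one newline per line. A small sketch of that arithmetic follows; the function name encoded_size is made up for illustration, and the 76-character width is assumed to be the default in use here.

```python
def encoded_size(n_bytes: int, line_width: int = 76) -> int:
    """Illustrative helper: size in bytes of wrapped base64 output."""
    chars = 4 * ((n_bytes + 2) // 3)                      # 4 characters per 3-byte group, rounded up
    newlines = (chars + line_width - 1) // line_width     # one newline per output line
    return chars + newlines

print(encoded_size(104508))   # 141178
```

For the 104508-byte binary this gives 139344 characters on 1834 lines, which adds up to the 141178 bytes reported by wc.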



safeer@penguinpower:~/TMP$ head -1 ls.encoded
f0VMRgEBAQAAAAAAAAAAAAIAAwABAAAANL4ECDQAAADckwEAAAAAADQAIAAJACgAHAAbAAYAAAA0
safeer@penguinpower:~/TMP$ tail -1 ls.encoded
AAAA6JIBAPIAAAAAAAAAAAAAAAEAAAAAAAAA
safeer@penguinpower:~/TMP$ file ls.encoded
ls.encoded: ASCII text
safeer@penguinpower:~/TMP$ file ls
ls: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0x83531f308f1fa18221be53eaf399303400c14638, stripped


Now convert the encoded data back

safeer@penguinpower:~/TMP$ cat ls.encoded |base64 -d > ls.decoded

Compare with the original "ls" binary

safeer@penguinpower:~/TMP$ wc ls ls.decoded
   503   3271 104508 ls
   503   3271 104508 ls.decoded
  1006   6542 209016 total

safeer@penguinpower:~/TMP$ file ls.decoded
ls.decoded: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0x83531f308f1fa18221be53eaf399303400c14638, stripped

safeer@penguinpower:~/TMP$ chmod a+x ls.decoded
safeer@penguinpower:~/TMP$ /home/safeer/TMP/ls.decoded

ls  ls.decoded    ls.encoded
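
Note that wc and file only show that the sizes and file types match. A stricter check is to compare the decoded bytes with the original directly. Here is a minimal sketch of the same round trip in Python; the sample bytes are just a stand-in for a real binary file's contents.

```python
import base64

original = bytes(range(256)) * 4          # stand-in for the contents of a binary file
encoded = base64.encodebytes(original)    # base64 text, wrapped into lines of at most 76 characters
decoded = base64.decodebytes(encoded)     # back to the raw bytes

print(decoded == original)                # True
print(len(original), len(encoded))        # the encoded form is roughly a third larger
```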

Now that you know how the conversion works, let us see some other ways to encode and decode base64.  We will use a short string this time to keep things simple.

Bash

Encoding

safeer@penguinpower:~/TMP$ printf "safeer"|base64
c2FmZWVy

Decoding

safeer@penguinpower:~/TMP$ printf "c2FmZWVy"|base64 -d
safeer

Perl

Encoding

safeer@penguinpower:~/TMP$ perl -MMIME::Base64 -e 'print encode_base64("safeer");'
c2FmZWVy

Decoding

safeer@penguinpower:~/TMP$ perl -MMIME::Base64 -e 'print decode_base64("c2FmZWVy");'
safeer

PHP

Encoding

safeer@penguinpower:~/TMP$ php -r "print base64_encode('safeer');"
c2FmZWVy

Decoding

safeer@penguinpower:~/TMP$ php -r "echo base64_decode('c2FmZWVy');"
safeer

Python

Encoding

safeer@penguinpower:~/TMP$ python3 -c "import base64; print(base64.b64encode(b'safeer').decode())"
c2FmZWVy

Decoding

safeer@penguinpower:~/TMP$ python3 -c "import base64; print(base64.b64decode('c2FmZWVy').decode())"
safeer
