Tools to assist comparison of binary data
Sep 23rd, 2008 by dmess0r
As a part of my work with the MSR206 I needed to write some small scripts and command lines to help deal with binary data. Using vi(1) or emacs(1) just seemed too cumbersome for the lightweight data which I was analyzing, so I wrote some simple command lines to assist.
hexlate
The first one I call “hexlate”. hexlate is just hexdump with special formatting arguments.
#!/bin/sh
hexdump -v -e '"\\""x" 1/1 "%02x"'
Essentially, we run hexdump(1) and pass it “-v”, which makes hexdump display all input data instead of collapsing repeated lines into an asterisk. Next is the formatting rule, given with “-e”. Here I tell it to print a literal “\x” and then each byte, one at a time (1/1), as two-digit hexadecimal (%02x).
Here is a test file:
asgar:tmp evan$ cat old
ABCDE
asgar:tmp evan$
Notice the newline at the end.
Here is the same test file run through hexlate:
asgar:tmp evan$ cat old | ./hexlate
\x41\x42\x43\x44\x45\x0aasgar:tmp evan$
Notice the \x0a at the end? That’s the newline (a line feed, not a carriage return, which would be \x0d).
asgar:tmp evan$ echo -ne '\n' | ./hexlate
\x0aasgar:tmp evan$
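Not every box ships hexdump(1). As a rough sketch, assuming only od(1), tr(1) and sed(1) are around, something like the following should give the same \xNN style of output. The name hexlate_od is made up for this example, and this is a stand-in built on a different tool, not the hexlate script itself.

```shell
#!/bin/sh
# hexlate_od (hypothetical name): emulate hexlate with od instead of hexdump.
hexlate_od() {
    # -An: no offset column, -v: show repeated lines too, -tx1: one hex byte per unit
    od -An -v -tx1 "$@" |
        tr -d ' \n' |
        sed 's/../\\x&/g'
    # tr joins everything into one run of hex digits;
    # sed then puts \x in front of every pair of digits.
}

printf 'ABCDE\n' | hexlate_od
```

Run against the same “ABCDE” test data, this should print \x41\x42\x43\x44\x45\x0a, just like hexlate does.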
hexlate is a really useful tool to quickly eyeball binary data, and then use it with perl. In this format, we can use the print function in perl to produce the binary output via stdout.
perl -e 'print "\x41\x42\x43\x44\x45\x0a"'
asgar:tmp evan$ perl -e 'print "\x41\x42\x43\x44\x45\x0a"'
ABCDE
asgar:tmp evan$
If you had a large block of hexlate output, you could easily incorporate it into a script.
prep, compare
The next two tools are primarily used together, “prep” and “compare”.
The first one, prep, is (once again) another hexdump command line:
#!/bin/sh
# "$@" forwards any file names; with no arguments hexdump reads stdin
hexdump -v -e '1/1 "%02x\n"' "$@"
This time the format looks like this:
asgar:tmp evan$ cat old | ./prep
41
42
43
44
45
0a
asgar:tmp evan$
Why would this be useful at all? I’ll tell you why. Let’s say you have two binary strings that you want to compare, literally byte for byte. If you simply ran hexdump against them, or even hexlate, it can be difficult to spot the changes in the output. Let’s take a look at “compare” and see where I am going with this.
Here is what “compare” looks like:
#!/bin/bash
export PATH=./:${PATH}
diff -yW10 <(prep "$1") <(prep "$2")
Here’s how to use it and the output:
asgar:tmp evan$ ./compare ./old ./new
41 41
42 42
43 43
44 | 20
45 45
0a 0a
Whoa wtf? Yeah, rad huh? Okay, so here is what’s happening. Let’s start with the two <(prep ...) pieces. In bash, process substitution takes the output of a command and exposes it as a file, /dev/fd/<number>. So we “prep” file “old” and we “prep” file “new”, and each result shows up as a file name. diff(1) only compares files, but we’ve just turned each stdout into a file, so there is no problem. Next, diff is passed “-y”, which forces the diff to be done “side-by-side” in two columns. That is how we get the files to line up. The other argument is “-W10”, which sets the output width to 10 print columns.
Ultimately we produce output side-by-side, and it is super easy to spot the difference.
File “old” is “ABCDE\n” and file “new” is “ABC E\n”. The letter “D” (44) was replaced with a literal space (20). We can see the change clear as day.
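The <(...) trick is a bash feature, so compare won’t run under a plain /bin/sh. As a hedged sketch of the same idea for that case, you could write the prepped output to temporary files and diff those. Everything named here is an assumption for the example: prep_od and compare_tmp are made-up names, od(1) stands in for hexdump, and it presumes a diff that supports “-y” and “-W” (GNU diff does).

```shell
#!/bin/sh
# prep_od (hypothetical name): one hex byte per line, using od instead of hexdump.
prep_od() {
    od -An -v -tx1 "$@" | tr -s ' ' '\n' | sed '/^$/d'
}

# compare_tmp (hypothetical name): same idea as "compare", but with explicit
# temporary files instead of bash process substitution.
compare_tmp() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    prep_od "$1" > "$a"
    prep_od "$2" > "$b"
    diff -yW10 "$a" "$b"
    status=$?            # keep diff's exit code: 0 = same, 1 = different
    rm -f "$a" "$b"
    return "$status"
}
```

Calling compare_tmp ./old ./new should give the same kind of two-column listing, with a “|” marking the byte that changed.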
I hope you find these tools as useful as I have. Granted they’re pretty simple command lines, but damn are they useful.



