Binary Inspection [strings, xxd, base64]
Introduction
When you’re handed a compiled binary, an encoded string, or a raw data blob, you need tools that can peer inside without executing it. Three essential utilities for this work are strings, xxd, and base64. They allow you to extract readable text, inspect raw hexadecimal, and encode or decode data streams — all without leaving the safety of your terminal.
This note covers their usage in depth, with practical patterns drawn from forensics work and CTF challenges like OverTheWire Bandit.
strings — ASCII Extraction
strings scans a file byte by byte and prints any sequence of printable characters that meets a minimum length. It doesn’t parse structure — it just finds human-readable text buried in binary noise.
strings suspicious_binary
Filtering Significant Sequences
By default, strings prints sequences of four or more printable characters. Use -n to raise the threshold and filter out short, meaningless fragments:
strings -n 10 binary_file # Only sequences of 10+ characters
This is particularly useful when scanning compiled binaries, where short strings (like ELF, /lib, ld.) are structural noise rather than meaningful content.
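To see the threshold at work, here is a minimal sketch that builds a throwaway file and compares the default run with -n 10. The file name sample.bin and its contents are invented for the illustration.

```shell
# Build a small sample file: binary noise around one short and one long string.
# sample.bin and its contents are made up for this illustration.
tmp=$(mktemp -d)
printf '\001\002abcd\000\003\004hardcoded_secret_value\000\005' > "$tmp/sample.bin"

default_out=$(strings "$tmp/sample.bin")      # both strings (4+ characters)
long_out=$(strings -n 10 "$tmp/sample.bin")   # only the 22-character string

printf 'default:\n%s\n\n-n 10:\n%s\n' "$default_out" "$long_out"
rm -rf "$tmp"
```

The four-character fragment abcd shows up in the default run and disappears once the minimum length is raised.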
Selecting the Character Encoding
Different binaries use different text encodings. By default, strings looks for single-byte 7-bit ASCII characters (-e s), but it can be told to look for wider encodings:
strings -e l binary_file # 16-bit little-endian (UTF-16LE)
strings -e b binary_file # 16-bit big-endian (UTF-16BE)
strings -e L binary_file # 32-bit little-endian (UTF-32LE)
This matters when extracting strings from Windows executables or firmware images, where 16-bit encodings such as UTF-16LE are common.
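The effect is easy to reproduce. The sketch below, assuming GNU strings, writes the word "Windows" as UTF-16LE (each ASCII byte followed by a NUL byte) into a hypothetical file wide.bin and scans it both ways.

```shell
tmp=$(mktemp -d)
# "Windows" stored as UTF-16LE: every ASCII byte is followed by a NUL byte.
printf 'W\000i\000n\000d\000o\000w\000s\000\000\000' > "$tmp/wide.bin"

narrow_out=$(strings "$tmp/wide.bin")      # empty: no run of 4 printable bytes
wide_out=$(strings -e l "$tmp/wide.bin")   # finds the 16-bit string

printf 'default: [%s]\n-e l:    [%s]\n' "$narrow_out" "$wide_out"
rm -rf "$tmp"
```

The default scan finds nothing because the NUL bytes break every run of printable characters; -e l reads two bytes per character and recovers the text.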
Combining with Other Tools
strings is rarely used alone. Pipe its output into grep to search for specific patterns:
strings binary_file | grep -i password
strings binary_file | grep -E "flag\{.*\}" # CTF flag pattern
strings binary_file | grep -E "https?://" # URLs embedded in binary
Practical Applications
| Use Case | Command Pattern |
|---|---|
| Find hardcoded credentials | strings binary | grep -iE "pass|key|token|secret" |
| Extract URLs or IPs | strings binary | grep -E "https?://|([0-9]{1,3}\.){3}[0-9]{1,3}" |
| Find embedded config paths | strings binary | grep -E "^/|\.conf|\.cfg|\.ini" |
| Hunt for flags in CTFs | strings binary | grep -E "flag\{.*\}|FLAG|CTF" |
| Scan for email addresses | strings binary | grep -E "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}" |
Key Flags
| Flag | Description |
|---|---|
| -n <length> | Minimum string length (default: 4) |
| -e <encoding> | Character encoding (s = 7-bit ASCII, the default; l = 16-bit little-endian; b = 16-bit big-endian; L = 32-bit little-endian) |
| -o | Print the offset (in octal) of each string within the file |
| -t x | Print offsets in hexadecimal instead of octal |
| --radix=x | Long form of -t x |
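As a rough illustration of the offset flags, the following sketch places a marker string 16 bytes into a made-up file and uses -t x to report where it sits. The file name and the marker are assumptions for the demo.

```shell
tmp=$(mktemp -d)
# 16 filler bytes, then a marker string. File name and contents are invented for the demo.
printf '\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001\001flag{demo_only}\000' > "$tmp/sample.bin"

hit=$(strings -t x "$tmp/sample.bin" | grep 'flag{')   # offset column, then the string
offset=$(printf '%s\n' "$hit" | awk '{print $1}')      # hex offset without a 0x prefix
text=$(printf '%s\n' "$hit" | awk '{print $2}')

printf 'found "%s" at offset 0x%s\n' "$text" "$offset"
rm -rf "$tmp"
```

Knowing the hex offset makes it straightforward to jump to the same position in a hex dump later.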
Forensics discipline: Run strings before attempting to execute an unknown binary. It’s passive: it reads without modifying the file or triggering any embedded behaviour.
xxd — Hexadecimal Manipulation
xxd creates a hex dump of a file, giving you a byte-by-byte view of its raw content. It can also reverse a hex dump back into its original binary form.
Generating a Hex Dump
xxd data.bin
The default output shows three columns: the byte offset (address), the hex values in two-byte groups, and the ASCII representation:
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
00000010: 0300 3e00 0100 0000 4010 0000 0000 0000 ..>.....@.......
The rightmost column is invaluable: it highlights readable strings embedded in the binary data, letting you spot text fragments without running strings.
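The three columns are easiest to read on an input you already know. This small sketch dumps a short literal string (chosen only for the illustration) and pulls out the offset column.

```shell
# Dump a short, known input so each column is easy to check by hand.
dump=$(printf 'Hello, xxd!\n' | xxd)
printf '%s\n' "$dump"

# Everything before the first colon is the offset column.
offset=${dump%%:*}
printf 'offset column: %s\n' "$offset"
```

The hex column reads 4865 6c6c 6f2c 2078 7864 210a, and the trailing newline (0a) appears as a dot in the ASCII column because it is not printable.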
Limiting Output
For large files, you rarely need the entire dump:
xxd -l 256 data.bin # First 256 bytes only
xxd -s 0x100 -l 128 data.bin # Start at offset 0x100, read 128 bytes
Plain Hex Output
When you need raw hex without address offsets or ASCII columns:
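xxd -p data.bin
This outputs only the hex characters, wrapped at 60 characters (30 bytes) per line. It is useful when piping into other tools or when the address offsets are unnecessary noise. Strip the newlines (for example with tr -d '\n') if you need one continuous stream.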
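A short sketch of plain output on generated input, showing both a small value and the line wrapping that appears once the input exceeds 30 bytes. The inputs are arbitrary test data.

```shell
hex=$(printf 'ABC\n' | xxd -p)   # 41 42 43 0a, printed without spaces

# 40 input bytes produce two output lines, because plain mode wraps after 30 bytes.
wrapped_lines=$(head -c 40 /dev/zero | xxd -p | wc -l)
# Removing the newlines gives one continuous 80-character stream.
one_line=$(head -c 40 /dev/zero | xxd -p | tr -d '\n')

printf 'hex: %s\nlines: %s\nlength after joining: %s\n' "$hex" "$wrapped_lines" "${#one_line}"
```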
Reversing a Hex Dump
xxd can reconstruct binary data from a hex dump:
xxd -r dump.txt restored.bin # Standard hex dump back to binary
xxd -r -p plain_hex.txt restored.bin # Plain hex (no offsets) back to binary
Practical Patterns
Inspect a file’s header to identify its type:
xxd -l 16 unknown_file
Common magic bytes you’ll recognise:
| Hex Sequence | File Type |
|---|---|
| 7f 45 4c 46 | ELF executable (Linux) |
| 4d 5a | PE executable (Windows) |
| 89 50 4e 47 | PNG image |
| 50 4b 03 04 | ZIP archive |
| 1f 8b | gzip compressed |
| 25 50 44 46 | PDF document |
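The table above can be turned into a small lookup. The function below is a sketch only: identify_magic is a hypothetical helper name, it covers just the six signatures listed, and the two test files are generated on the spot.

```shell
# identify_magic is a hypothetical helper, not part of xxd.
identify_magic() {
    # Read the first four bytes as plain hex, e.g. "7f454c46".
    magic=$(xxd -p -l 4 "$1")
    case "$magic" in
        7f454c46) echo "ELF executable" ;;
        4d5a*)    echo "PE executable" ;;
        89504e47) echo "PNG image" ;;
        504b0304) echo "ZIP archive" ;;
        1f8b*)    echo "gzip compressed" ;;
        25504446) echo "PDF document" ;;
        *)        echo "unknown ($magic)" ;;
    esac
}

tmp=$(mktemp -d)
printf '%%PDF-1.7\n' > "$tmp/a"                  # starts with 25 50 44 46
printf '\177ELF\002\001\001\000' > "$tmp/b"      # starts with 7f 45 4c 46
pdf_type=$(identify_magic "$tmp/a")
elf_type=$(identify_magic "$tmp/b")
printf '%s\n%s\n' "$pdf_type" "$elf_type"
rm -rf "$tmp"
```

For real triage the file utility knows far more signatures; the point here is only to show how little of the header is needed.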
Search for a specific byte pattern in a binary:
xxd -g 1 data.bin | grep "ca fe ba be" # Java class file magic bytes (-g 1 puts a space between every byte)
Create a hex dump, edit it, and restore:
xxd data.bin > dump.txt # Dump to text
vim dump.txt # Edit specific bytes
xxd -r dump.txt data_modified.bin # Reconstruct the modified binary
Key Flags
| Flag | Description |
|---|---|
| -l <length> | Limit output to specified number of bytes |
| -s <offset> | Seek to a specific byte offset before reading |
| -p | Plain hex output, with no addresses or ASCII column |
| -r | Reverse: convert hex dump back to binary |
| -c <cols> | Number of hex octets per line (default: 16) |
| -g <bytes> | Group hex octets (default: 2) |
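The dump, edit, restore cycle described earlier can also be scripted. This is a sketch under simple assumptions: the file content is invented, and sed stands in for the manual edit in vim. It relies on xxd -r reading only the offset and hex columns, so the stale ASCII column does not matter.

```shell
tmp=$(mktemp -d)
printf 'status: FAIL\n' > "$tmp/data.bin"

xxd "$tmp/data.bin" > "$tmp/dump.txt"
# "FAIL" is 4641 494c in the hex column; replace it with "PASS" (5041 5353).
# The ASCII column on the right is left untouched and ignored by xxd -r.
sed 's/4641 494c/5041 5353/' "$tmp/dump.txt" > "$tmp/patched.txt"
xxd -r "$tmp/patched.txt" "$tmp/data_modified.bin"

patched=$(cat "$tmp/data_modified.bin")
printf '%s\n' "$patched"
rm -rf "$tmp"
```

The replacement works here because the patched bytes sit on a two-byte group boundary; for arbitrary positions, dumping with -g 1 keeps every byte separate and easier to match.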
When to reach for xxd: Use it when you need to see the raw bytes: file header identification, offset-specific inspection, or manual patching of binary data. If you only need readable strings, strings is faster and cleaner.
base64 — Stream Encoding
base64 encodes binary data into a 64-character ASCII alphabet, making it safe for transport through systems that only handle text (email, JSON, URLs, config files). It also decodes base64 back to its original form.
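A quick sketch of the mapping: every three input bytes become four output characters, and a shorter final group is padded with =. It also shows that echo appends a newline which becomes part of the encoded data, whereas printf does not.

```shell
three=$(printf 'Man' | base64)                        # 3 bytes -> 4 characters
with_newline=$(echo "secret message" | base64)        # encodes the trailing newline too
without_newline=$(printf 'secret message' | base64)   # 14 bytes, so one "=" of padding

printf '%s\n%s\n%s\n' "$three" "$with_newline" "$without_newline"
```

The last two results differ only in their final characters (…Z2UK versus …Z2U=), which is worth remembering when an encoded value has to match byte for byte.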
Encoding
echo "secret message" | base64
# Output: c2VjcmV0IG1lc3NhZ2UK
Or encode a file directly:
base64 data.bin > data.b64
Decoding
echo "c2VjcmV0IG1lc3NhZ2UK" | base64 -d
# Output: secret message
Or decode from a file:
base64 -d encoded.b64 > original.bin
Bandit-Style Patterns
In OverTheWire Bandit, base64 files appear frequently. The standard workflow:
cat encoded.txt | base64 -d # Decode file contents to stdout
base64 -d < encoded.txt # Same result, using redirection
base64 -d encoded.txt # Direct file argument (also works)
Handling Errors
When decoding fails, it’s almost always one of these:
Whitespace or non-base64 characters in the input:
base64 -d -i corrupted.b64 # -i ignores non-alphabet characters
Missing padding: Base64 requires input length to be a multiple of 4. If padding (= characters) is stripped, which is common in URLs, add it back manually:
# "c2VjcmV0cw" is missing padding; "c2VjcmV0cw==" is correct
echo "c2VjcmV0cw==" | base64 -d
Encoding detection: If you’re unsure whether something is base64, look for the telltale signs: the character set is limited to A-Z, a-z, 0-9, +, /, and = for padding, and the length (with padding, ignoring line breaks) is always a multiple of 4.
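Both checks can be scripted. The helpers below are a sketch: pad_b64 and looks_like_b64 are hypothetical names, and the detection is only a heuristic, since ordinary words with a suitable length also pass it.

```shell
# pad_b64 and looks_like_b64 are hypothetical helper names.
pad_b64() {
    s=$1
    # Append "=" until the length is a multiple of 4.
    while [ $(( ${#s} % 4 )) -ne 0 ]; do
        s="${s}="
    done
    printf '%s' "$s"
}

looks_like_b64() {
    # Only alphabet characters, at most two trailing "=", length divisible by 4.
    printf '%s' "$1" | grep -Eq '^[A-Za-z0-9+/]+={0,2}$' && [ $(( ${#1} % 4 )) -eq 0 ]
}

padded=$(pad_b64 "c2VjcmV0cw")                  # adds the two missing "=" characters
decoded=$(printf '%s' "$padded" | base64 -d)
printf '%s -> %s\n' "$padded" "$decoded"
```

A string whose unpadded length leaves a remainder of 1 when divided by 4 is never valid base64, so padding it will not make it decode.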
Key Flags
| Flag | Description |
|---|---|
| -d | Decode mode |
| -i | Ignore non-alphabet characters during decoding |
| -w 0 | Disable line wrapping during encoding (useful for pipes and scripts) |
| -w 76 | Wrap encoded output at 76 characters (default) |
Practical note: When piping base64 output into other tools or scripts, always use -w 0 to prevent line breaks from interfering:
base64 -w 0 data.bin | curl -X POST -d @- https://example.com/upload
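The difference is visible as soon as the encoded output exceeds 76 characters. This sketch assumes GNU coreutils base64 (the -w flag is not available in every implementation) and encodes 100 zero bytes as arbitrary test input.

```shell
wrapped=$(head -c 100 /dev/zero | base64)        # 136 characters split over two lines
single=$(head -c 100 /dev/zero | base64 -w 0)    # the same 136 characters on one line

wrapped_lines=$(printf '%s\n' "$wrapped" | wc -l)
single_lines=$(printf '%s\n' "$single" | wc -l)
printf 'default: %s lines, -w 0: %s line, %s characters\n' \
    "$wrapped_lines" "$single_lines" "${#single}"
```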
Tool Comparison
| Tool | Input | Output | Primary Use |
|---|---|---|---|
| strings | Binary or any file | Human-readable text sequences | Find embedded text in binaries |
| xxd | Any file | Hex dump (or binary from hex) | Inspect raw bytes, identify file types, patch data |
| base64 | Binary or text | ASCII-encoded data (or decoded binary) | Transport-safe encoding, decode CTF challenges |
Chaining Them Together
These tools complement each other. A typical forensic triage workflow:
# Step 1: Quick check for readable content
strings -n 8 unknown.bin | head -20
# Step 2: Inspect the file header
xxd -l 32 unknown.bin
# Step 3: If it's base64-encoded, decode and re-inspect
base64 -d unknown.b64 > decoded.bin
strings decoded.bin | grep -iE "password|key|flag"
# Step 4: Hex dump for deeper analysis
xxd decoded.bin | less
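To try the decode-and-inspect steps without a real sample, the following sketch fabricates one: a few non-printable bytes around a marker string, base64-encoded into a file. All names and contents are made up for the demonstration.

```shell
tmp=$(mktemp -d)
# Fabricated sample: non-printable bytes around a marker string, then base64-encoded.
printf '\001\002\003flag{triage_demo}\000\004' | base64 > "$tmp/unknown.b64"

# Decode, search the readable strings, and look at the first raw bytes.
base64 -d "$tmp/unknown.b64" > "$tmp/decoded.bin"
found=$(strings -n 8 "$tmp/decoded.bin" | grep -iE "password|key|flag")
header=$(xxd -p -l 3 "$tmp/decoded.bin")

printf 'strings hit: %s\nfirst bytes: %s\n' "$found" "$header"
rm -rf "$tmp"
```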