Chapter Overview
This chapter covers the essential command-line tools that turn a basic shell user into someone who can actually get things done fast — file operations, searching, filtering, checksums, parallelism, and more.
Concatenating with cat
cat (concatenate) reads files and writes them to stdout. Simple but used everywhere.
```bash
cat file.txt                            # print a file
cat file1.txt file2.txt                 # concatenate and print both
cat file1.txt file2.txt > combined.txt  # merge into one file
```
Useful flags:
```bash
cat -n file.txt   # number every line
cat -b file.txt   # number only non-blank lines
cat -s file.txt   # squeeze multiple blank lines into one
cat -A file.txt   # show hidden chars — tabs as ^I, line endings as $
```
Create a file from stdin (type content, Ctrl+D to finish):
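A minimal sketch — a here document stands in for typed input here (interactively you would just run `cat > notes.txt`, type, and finish with Ctrl+D); the filename is illustrative:

```bash
cat > notes.txt << 'EOF'
first line
second line
EOF
```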
Append to a file:
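Appending works the same way with `>>` (again a here document stands in for typed input; `notes.txt` is illustrative):

```bash
cat >> notes.txt << 'EOF'
appended line
EOF
```

Unlike `>`, the `>>` redirection keeps any existing content and adds to the end (creating the file if it does not exist).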
Pipe into other commands:
```bash
cat access.log | grep "404" | wc -l
```
Recording and Playing Back Terminal Sessions
script records everything printed to the terminal — commands and output. Useful for documentation, demos, and audit trails.
Record a session:
```bash
script session.log   # start recording to session.log
# do your work
exit                 # stop recording
```
The timing file records exactly when each output occurred:
```bash
script --timing=timing.log session.log
```
Replay the session at the original speed:
```bash
scriptreplay timing.log session.log
```
Replay faster or slower:
```bash
scriptreplay -d 2 timing.log session.log     # 2x speed
scriptreplay -d 0.5 timing.log session.log   # half speed
```
Finding Files and File Listing
ls
```bash
ls -l     # long format — permissions, size, date
ls -a     # show hidden files (starting with .)
ls -h     # human-readable sizes (KB, MB)
ls -t     # sort by modification time (newest first)
ls -S     # sort by size (largest first)
ls -r     # reverse order
ls -lhtr  # combine: long, human, time, reversed — most useful combo
```
find
find is the real power tool for locating files.
```bash
find /path -name "*.log"    # find by name (case-sensitive)
find /path -iname "*.log"   # case-insensitive
find /path -type f          # files only
find /path -type d          # directories only
find /path -size +10M       # files larger than 10MB
find /path -size -1k        # files smaller than 1KB
find /path -mtime -7        # modified in the last 7 days
find /path -mtime +30       # modified more than 30 days ago
find /path -user omar       # owned by user omar
find /path -perm 644        # exact permissions
find /path -perm /u+x       # executable by owner
```
Execute a command on each result:
```bash
find . -name "*.log" -exec rm {} \;        # delete all .log files
find . -name "*.py" -exec chmod +x {} \;   # make all .py executable
find . -type f -exec wc -l {} +            # count lines in all files
```
Combine conditions:
```bash
find . -name "*.sh" -size +1k -mtime -7    # AND (default)
find . -name "*.log" -o -name "*.txt"      # OR
find . -not -name "*.bak"                  # NOT
```
Playing with xargs
xargs takes stdin and passes it as arguments to another command. Essential for combining find with other tools.
```bash
find . -name "*.log" | xargs rm              # delete all found files
find . -name "*.txt" | xargs wc -l           # count lines in each
find . -name "*.py" | xargs grep "import"    # search inside each
```
Why not just use pipes? Some commands don’t read from stdin — they only take arguments. xargs bridges that gap.
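A quick way to see the difference (the path is illustrative): rm ignores its stdin entirely, so piping a filename straight into it achieves nothing, while xargs converts that same line into a command-line argument.

```bash
touch /tmp/gr_doomed.txt
# echo /tmp/gr_doomed.txt | rm       # useless — rm reads arguments, not stdin
echo /tmp/gr_doomed.txt | xargs rm   # xargs runs: rm /tmp/gr_doomed.txt
```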
Handle filenames with spaces — use null delimiter:
```bash
find . -name "*.txt" -print0 | xargs -0 rm
```
-print0 outputs null-separated names, -0 tells xargs to split on null. Always use this combo when filenames might have spaces.
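To see why, a small sketch: a filename containing a space gets split into two bogus arguments by plain xargs, while the null-delimited pair keeps it whole (the directory and filename are illustrative).

```bash
mkdir -p /tmp/gr_space_demo
cd /tmp/gr_space_demo
touch "my notes.txt"
find . -name "*.txt" | xargs echo            # echo receives TWO args: "./my" and "notes.txt"
find . -name "*.txt" -print0 | xargs -0 rm   # the null delimiter keeps "./my notes.txt" intact
```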
Limit arguments per command:
```bash
find . -name "*.log" | xargs -n 2 echo   # pass 2 args at a time
```
Prompt before each execution (interactive mode):
```bash
find . -name "*.log" | xargs -p rm   # -p asks for confirmation before each run
```
Run in parallel:
```bash
find . -name "*.png" | xargs -P 4 -I {} convert {} {}.jpg
```
-P 4 runs 4 processes at once. -I {} defines a placeholder for the filename.
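A self-contained sketch of the same idea using gzip instead of ImageMagick, so it runs anywhere (the directory and filenames are illustrative):

```bash
mkdir -p /tmp/gr_par_demo && cd /tmp/gr_par_demo
rm -f *.log *.log.gz
printf 'aaa\n' > a.log
printf 'bbb\n' > b.log
# compress the files with up to 2 gzip processes running at once
find . -name "*.log" -print0 | xargs -0 -P 2 -I {} gzip {}
# now a.log.gz and b.log.gz exist
```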
Translating with tr
tr translates (replaces) or deletes characters. Works on stdin only — no file arguments.
```bash
echo "hello world" | tr 'a-z' 'A-Z'   # lowercase to uppercase
echo "HELLO" | tr 'A-Z' 'a-z'         # uppercase to lowercase
echo "hello 123" | tr -d '0-9'        # -d delete digits
echo "hello world" | tr -s ' '        # -s squeeze repeated spaces to one
echo "hello:world" | tr ':' '\n'      # replace : with newline
```
Common character classes:
```bash
tr '[:lower:]' '[:upper:]'   # portable way to uppercase
tr '[:digit:]' '*'           # replace all digits with *
tr -d '[:space:]'            # remove all whitespace
tr -d '\r'                   # remove Windows carriage returns (fix CRLF files)
```
Remove non-printable characters from a file:
```bash
tr -cd '[:print:]\n' < file.txt > clean.txt
```
Checksum and Verification
Checksums verify file integrity — confirm a file wasn’t corrupted or tampered with.
Generate checksums:
```bash
md5sum file.txt      # MD5 (fast, not secure for crypto)
sha1sum file.txt     # SHA-1
sha256sum file.txt   # SHA-256 (standard for verification)
sha512sum file.txt   # SHA-512 (stronger)
```
Verify a file against a known checksum:
```bash
echo "expectedhash  file.txt" | sha256sum --check   # note: TWO spaces between hash and filename
sha256sum --check checksums.txt                     # verify a whole list
```
Generate checksums for multiple files:
```bash
sha256sum *.iso > checksums.txt    # save all checksums
sha256sum --check checksums.txt    # verify later
```
Compare two files quickly:
```bash
sha256sum file1.txt file2.txt
# if the hashes match, the files are identical
```
OpenSSL
```bash
# Generate a random password
openssl rand -base64 32

# Hash a string
echo -n "password" | openssl dgst -sha256

# Encrypt a file (AES-256); -pbkdf2 selects a stronger key derivation
openssl enc -aes-256-cbc -salt -pbkdf2 -in plain.txt -out encrypted.enc

# Decrypt
openssl enc -aes-256-cbc -d -pbkdf2 -in encrypted.enc -out plain.txt

# Generate RSA key pair
openssl genrsa -out private.pem 2048
openssl rsa -in private.pem -pubout -out public.pem
```
gpg
```bash
# Generate a key pair
gpg --gen-key

# Encrypt a file for a recipient
gpg --encrypt --recipient "user@example.com" file.txt

# Decrypt
gpg --decrypt file.txt.gpg

# Sign a file
gpg --sign file.txt

# Verify a signature
gpg --verify file.txt.gpg
```
Password hashing
```bash
# Quick password hash with openssl (for scripts)
openssl passwd -6 "mypassword"   # SHA-512 crypt format (note: not bcrypt)
```
Sorting, Unique, and Duplicates
sort
```bash
sort file.txt                  # alphabetical sort
sort -r file.txt               # reverse
sort -n numbers.txt            # numeric sort
sort -k 2 file.txt             # sort by 2nd field
sort -t ',' -k 3 -n file.csv   # sort CSV by 3rd column numerically
sort -u file.txt               # sort and remove duplicates
```
Sort by multiple keys:
```bash
sort -k 1,1 -k 2,2n file.txt   # sort by field 1 alpha, then field 2 numeric
```
uniq
uniq only removes adjacent duplicates — always sort first.
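A two-line demonstration of the pitfall:

```bash
printf 'a\nb\na\n' | uniq          # prints a, b, a — the two a's aren't adjacent
printf 'a\nb\na\n' | sort | uniq   # prints a, b — sorting makes duplicates adjacent
```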
```bash
sort file.txt | uniq      # remove duplicates
sort file.txt | uniq -c   # count occurrences of each line
sort file.txt | uniq -d   # show only duplicated lines
sort file.txt | uniq -u   # show only unique lines (not duplicated)
```
Find the most common lines in a log:
```bash
sort access.log | uniq -c | sort -rn | head -10
```
comm
Compare two sorted files:
```bash
comm file1.txt file2.txt
# column 1: lines only in file1
# column 2: lines only in file2
# column 3: lines in both
```
```bash
comm -12 file1.txt file2.txt   # only lines in BOTH files
comm -23 file1.txt file2.txt   # only lines in file1 (not file2)
```
Temporary File Naming and Random Numbers
mktemp
Never manually name temp files — use mktemp to avoid race conditions and collisions.
```bash
tmpfile=$(mktemp)                     # /tmp/tmp.XxXxXx
tmpfile=$(mktemp /tmp/myapp.XXXXXX)   # custom prefix + random suffix
tmpdir=$(mktemp -d)                   # create a temp directory
```
Always clean up:
```bash
tmpfile=$(mktemp)
trap 'rm -f "$tmpfile"' EXIT   # auto-delete on exit (single quotes delay expansion until the trap fires)
```
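Putting mktemp and trap together, a minimal script skeleton (the scratch data is illustrative):

```bash
#!/bin/bash
tmpfile=$(mktemp)
trap 'rm -f "$tmpfile"' EXIT   # runs on any exit, normal or error

printf 'scratch data\n' > "$tmpfile"   # ...do the real work with the temp file...
wc -l < "$tmpfile"
# no manual cleanup needed — the EXIT trap removes the file
```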
$RANDOM
Bash built-in — generates a random integer between 0 and 32767.
```bash
echo $RANDOM                 # random number
echo $(( RANDOM % 100 ))     # random 0-99
echo $(( RANDOM % 50 + 1 ))  # random 1-50
```
Random filename:
```bash
name="backup_$(date +%Y%m%d)_$RANDOM.tar.gz"
```
Shuffle lines in a file:
```bash
sort -R file.txt   # random sort (note: identical lines end up grouped together)
shuf file.txt      # true shuffle (GNU coreutils)
```
Splitting Files and Data
split
Split a large file into smaller pieces:
```bash
split -l 1000 bigfile.txt part_   # 1000 lines per piece, prefix "part_"
split -b 10M bigfile.tar part_    # 10MB per piece
split -n l/5 bigfile.txt part_    # exactly 5 pieces without breaking lines (-n 5 splits by bytes)
```
Reassemble:
```bash
cat part_* > reassembled.txt
```
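The split-and-reassemble round trip can be verified end to end (the directory and file contents are illustrative):

```bash
mkdir -p /tmp/gr_split_demo && cd /tmp/gr_split_demo
rm -f part_* bigfile.txt reassembled.txt
seq 1 5000 > bigfile.txt
split -l 1000 bigfile.txt part_   # produces part_aa through part_ae
cat part_* > reassembled.txt      # glob order matches split's naming order
cmp bigfile.txt reassembled.txt && echo "files match"   # prints "files match"
```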
csplit
Split on a pattern (context-aware split):
```bash
csplit log.txt '/ERROR/' '{*}'   # split every time ERROR appears
csplit file.txt 10 25            # split at lines 10 and 25
```
head and tail
```bash
head -n 100 file.txt   # first 100 lines
tail -n 100 file.txt   # last 100 lines
tail -f log.txt        # follow (live updates — great for logs)
tail -n +50 file.txt   # from line 50 to end
```
Slicing Filenames Based on Extension
basename and dirname
```bash
basename /home/omar/docs/report.pdf        # report.pdf
basename /home/omar/docs/report.pdf .pdf   # report (strip extension)
dirname /home/omar/docs/report.pdf         # /home/omar/docs
```
Parameter expansion
The most efficient way — no subshell needed:
```bash
filepath="/home/omar/docs/report.pdf"
echo "${filepath##*/}"   # report.pdf (filename only)
echo "${filepath%/*}"    # /home/omar/docs (directory only)
echo "${filepath##*.}"   # pdf (extension only)
echo "${filepath%.*}"    # /home/omar/docs/report (strip extension)
```
Change extension in a loop:
```bash
for f in *.txt; do
    mv "$f" "${f%.txt}.md"   # rename .txt to .md
done
```
Renaming and Moving Files in Bulk
mv in a loop
```bash
# Add prefix to all .txt files
for f in *.txt; do
    mv "$f" "backup_$f"
done

# Change extension
for f in *.jpeg; do
    mv "$f" "${f%.jpeg}.jpg"
done
```
rename (Perl rename)
More powerful — uses regex:
```bash
rename 's/\.txt$/.md/' *.txt   # change extension
rename 's/ /_/g' *             # replace spaces with underscores
rename 's/^/prefix_/' *        # add prefix to all files
rename 'y/A-Z/a-z/' *          # lowercase all filenames
```
Dry run first (check before applying):
```bash
rename -n 's/\.txt$/.md/' *.txt   # -n = dry run, shows what would happen
```
find + mv for subdirectories
```bash
find . -name "*.log" -exec mv {} /archive/ \;
```
Spell Checking and Dictionary Manipulation
look
Search for words starting with a prefix — uses /usr/share/dict/words:
```bash
look uni                           # all dictionary words starting with "uni"
look "uni" /usr/share/dict/words   # same, with an explicit word list
```
aspell
Interactive spell checker:
```bash
aspell check report.md         # interactive correction
aspell list < file.txt         # output only misspelled words
aspell -l en list < file.txt   # specify language
```
Check a file and get only the bad words:
```bash
aspell list < essay.txt | sort -u
```
/usr/share/dict/words
The system word list — useful in scripts:
```bash
# Check if a word is valid (-x matches the whole line, -q stays quiet)
grep -qx "hello" /usr/share/dict/words && echo "valid" || echo "invalid"

# Count total words in the dictionary
wc -l /usr/share/dict/words
```
Automating Interactive Prompts
echo / printf piping
The simplest approach — pipe answers directly:
```bash
echo "y" | apt-get install package      # auto-answer yes
printf 'user\npassword\n' | ftp host    # automate login prompts
```
yes
Continuously outputs “y” (or any string) — useful for confirming prompts:
```bash
yes | apt-get install package   # answer yes to everything
yes "no" | command              # answer "no" to everything
yes "" | command                # send empty lines
```
Here documents
Feed a fixed sequence of answers from a here document:
```bash
command << EOF
answer1
answer2
answer3
EOF
```
Real example — automate an FTP session:
```bash
ftp -n host << EOF
user username password
cd /upload
put file.txt
bye
EOF
```
expect
When timing matters — expect waits for specific output before sending input:
```bash
expect << 'EOF'
spawn ssh user@host
expect "password:"
send "mypassword\r"
expect "$ "
send "ls -la\r"
expect "$ "
EOF
```
Quoting the delimiter ('EOF') stops the shell from expanding $ inside the script before expect sees it.
Install: apt install expect
Parallel Processes
Background with &
Run a command in the background and continue:
```bash
command &
pid=$!   # $! holds the PID of the last background process
```
Wait for all background jobs:
```bash
process1 &
process2 &
process3 &
wait   # blocks until all background jobs finish
```
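wait can also target one specific job via its PID and report that job's exit status, which is how scripts detect a failed background task:

```bash
(sleep 0.1; exit 3) &       # a background job that fails with status 3
pid=$!                      # $! is the PID of the most recent background job
wait "$pid"                 # block until that specific job finishes
echo "job exited with $?"   # prints: job exited with 3
```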
xargs -P
Easiest way to parallelize a loop:
```bash
# Process 4 files at a time
find . -name "*.png" | xargs -P 4 -I {} convert {} {}.jpg

# Run 8 parallel jobs
cat urls.txt | xargs -P 8 -I {} curl -O {}
```
GNU parallel
More powerful — handles progress, logging, job control:
```bash
parallel gzip ::: *.log                     # compress all .log files in parallel
parallel -j 4 wget {} ::: url1 url2 url3    # 4 parallel downloads
parallel "echo processing {}; sleep 1" ::: a b c d
```
From a file:
```bash
parallel -a urls.txt wget {}
```
With progress bar:
```bash
parallel --progress gzip ::: *.log
```
Install: apt install parallel
Timing comparison
```bash
# Sequential
time for f in *.log; do gzip "$f"; done

# Parallel (4 jobs)
time find . -name "*.log" | xargs -P 4 gzip
```
For CPU-bound work the parallel version can approach N times faster, where N is your CPU core count; I/O-bound jobs see smaller gains.