sort

Basic Usage of sort - Sorting Text Data

The sort command is used to sort lines of text files or command output. It is commonly used in shell scripting, log analysis, report generation, and data processing.

sort file.txt

Example file:

banana
apple
orange
grape

Command:

sort file.txt

Output:

apple
banana
grape
orange
  • sort arranges lines alphabetically by default
  • Commonly used for organizing text data
  • Frequently combined with pipes and other commands
  • Important tool for shell scripting and automation

Sorting in Reverse Order

To sort in descending order:

sort -r file.txt

Output:

orange
grape
banana
apple
  • -r means reverse order
  • Reverses normal sorting behavior
  • Useful for reports and ranking outputs

Numeric Sorting

By default, sort performs alphabetical sorting.

Example file:

100
20
3

Default sorting:

sort numbers.txt

Output:

100
20
3

This happens because the default sort is lexicographic: lines are compared character by character, so "100" (starting with 1) sorts before "20".

To sort numerically:

sort -n numbers.txt

Output:

3
20
100
  • -n means numeric sort
  • Interprets values as numbers
  • Essential for processing statistics and metrics
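The -n flag also handles negative numbers and decimal fractions, which plain alphabetical sorting scrambles. A quick sketch using printf as input:

```shell
# Numeric sort copes with negatives and decimals, not just integers
printf '%s\n' 10 -5 3.5 0 | sort -n
# -5
# 0
# 3.5
# 10
```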

Human Readable Numeric Sorting

Example file:

2K
1G
500M

To sort human-readable sizes correctly:

sort -h sizes.txt

Output:

2K
500M
1G
  • -h understands size suffixes such as K, M, G, and T (a GNU sort extension)
  • Useful when processing df, du, or ls -lh output
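Because -h ranks values by suffix magnitude (K < M < G) before the leading number, the ordering can be verified without a file (again assuming GNU sort, where -h is available):

```shell
# Suffix magnitude dominates: 2K < 500M < 1G
printf '%s\n' 1G 500M 2K | sort -h
# 2K
# 500M
# 1G
```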

Sorting by Specific Column

Example file:

john 50
anna 20
mike 35

Sort by second column:

sort -k2 scores.txt

Output:

anna 20
mike 35
john 50
  • -k2 starts the sort key at column 2 (and runs to the end of the line)
  • Useful for structured data
  • Frequently used in reports and logs
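One subtlety worth knowing: -k2 keys on everything from field 2 to the end of the line, not on field 2 alone. To limit the key to a single field, give a start and an end, as in -k2,2. With two-field data the results match, but with more fields they can differ:

```shell
# -k2,2 restricts the sort key to field 2 only
printf '%s\n' 'john 50 x' 'anna 20 z' 'mike 35 y' | sort -k2,2
# anna 20 z
# mike 35 y
# john 50 x
```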

Numeric Column Sorting

To sort the second column numerically:

sort -k2 -n scores.txt
  • Combines column sorting with numeric sorting
  • Very common in shell scripts

Ignoring Case Sensitivity

Default sorting is case-sensitive: in the C locale, uppercase letters sort before lowercase ones.

Example file:

Banana
apple
Orange
grape

Default output:

Banana
Orange
apple
grape

To ignore letter case:

sort -f file.txt

Output:

apple
Banana
grape
Orange
  • -f folds lowercase into uppercase, effectively ignoring case
  • Useful for user-generated text data

Removing Duplicate Lines

To sort and remove duplicates:

sort -u file.txt

Example input:

apple
banana
apple
orange

Output:

apple
banana
orange
  • -u means unique
  • Removes duplicate lines automatically
  • Useful for generating clean lists
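sort -u gives the same result as the classic sort | uniq pipeline, in a single process:

```shell
# sort -u collapses duplicates in one step, like sort | uniq
printf 'apple\nbanana\napple\norange\n' | sort -u
# apple
# banana
# orange
```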

Checking Whether a File Is Sorted

To verify sorting without modifying output:

sort -c file.txt

If file is sorted correctly:

  • no output is displayed

If file is not sorted:

sort: file.txt:2: disorder: apple
  • -c means check sorted order
  • Useful in scripts and automation
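Because -c reports through its exit status (0 when sorted, nonzero otherwise), it slots directly into shell conditionals. A minimal sketch using stdin instead of a file:

```shell
# sort -c exits 0 for sorted input, so it works as an if condition
if printf 'apple\nbanana\n' | sort -c 2>/dev/null; then
    echo "sorted"
else
    echo "not sorted"
fi
# prints "sorted"
```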

Sorting Delimited Data

Example CSV file:

john,50
anna,20
mike,35

Sort by second column:

sort -t "," -k2 -n users.csv

Output:

anna,20
mike,35
john,50

Breakdown:

  • -t "," defines comma delimiter
  • -k2 sorts using second field
  • -n performs numeric sorting
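Multiple keys can also be combined, each with its own modifiers, so one column decides the order and another breaks ties. A sketch in the same CSV layout (the names here are invented):

```shell
# Primary key: field 2, numeric, descending; tie-break: field 1, ascending
printf '%s\n' 'john,50' 'anna,20' 'dave,20' | sort -t, -k2,2nr -k1,1
# john,50
# anna,20
# dave,20
```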

Combining sort with Pipes

The sort command is frequently used in pipelines.

Example:

cat names.txt | sort

(Equivalent to sort names.txt; shown here to illustrate piping.)

Sort processes by memory usage (%MEM is column 4 of ps aux):

ps aux | sort -k4 -n

Sort disk usage:

du -sh * | sort -h
  • sort is heavily used with pipes
  • Important in shell automation workflows

Random Sorting

To randomize lines:

sort -R file.txt
  • -R shuffles lines by sorting on a random hash of each line
  • Identical lines still end up grouped together (shuf gives a true per-line shuffle)
  • Useful for random selection or testing
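Since the order changes on every run, a typical use is grabbing one random line; note that -R itself is a GNU extension:

```shell
# Pick a single random line from the input
printf '%s\n' red green blue | sort -R | head -n 1
```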

Combining Multiple Options

Example:

sort -nr numbers.txt

Breakdown:

  • -n numeric sorting
  • -r reverse order

Another example:

sort -t ":" -k3 -n /etc/passwd

This:

  • uses : delimiter
  • sorts by third field
  • performs numeric sorting

Common Administrative Examples

Sort disk usage by size:

du -sh * | sort -h

Sort users by UID:

sort -t ":" -k3 -n /etc/passwd

Deduplicate a list of IP addresses (one address per line):

sort -u ips.txt

Sort process list by memory usage:

ps aux | sort -k4 -nr

Practical Script Example (Step-by-Step Explanation)

Script

#!/bin/bash

FILE="/var/log/access.log"

echo "Top unique IP addresses:"

cut -d " " -f1 $FILE | sort | uniq -c | sort -nr

Step 1: Shebang

#!/bin/bash
  • Defines Bash interpreter
  • Ensures script runs using Bash shell

Step 2: Defining log file

FILE="/var/log/access.log"
  • Stores log file path in variable
  • Makes script easier to modify later

Example value:

/var/log/access.log

Step 3: Printing informational message

echo "Top unique IP addresses:"
  • Displays readable heading
  • Helps structure output

Step 4: Extracting IP addresses

cut -d " " -f1 $FILE

Breakdown:

  • -d " " defines space delimiter
  • -f1 extracts first column
  • First column usually contains IP address in access logs

Example log line:

192.168.1.10 GET /index.html

Extracted result:

192.168.1.10

Step 5: Sorting IP addresses

sort
  • Groups identical IP addresses together
  • Required before using uniq

Example:

192.168.1.10
192.168.1.10
192.168.1.20

Step 6: Counting duplicates

uniq -c
  • Counts repeated lines
  • Requires sorted input

Example output:

2 192.168.1.10
1 192.168.1.20

Step 7: Sorting by highest count

sort -nr

Breakdown:

  • -n numeric sorting
  • -r reverse order

This displays highest counts first.

Example output:

120 192.168.1.10
85 192.168.1.20

What this script does

Step-by-step flow:

  1. Reads access log
  2. Extracts IP addresses
  3. Sorts addresses
  4. Counts duplicates
  5. Displays most frequent IPs first
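The whole flow can be exercised end to end against a few sample log lines (the addresses below are invented for illustration):

```shell
# Simulate three access-log lines, then run the full pipeline
printf '%s\n' \
    '192.168.1.10 GET /index.html' \
    '192.168.1.10 GET /about.html' \
    '192.168.1.20 GET /index.html' \
    | cut -d ' ' -f1 | sort | uniq -c | sort -nr
# count-column padding varies by implementation, but the counts and order are:
#   2 192.168.1.10
#   1 192.168.1.20
```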

Why this matters in production

This workflow is extremely common in:

  • web server analysis
  • security investigations
  • DDoS detection
  • traffic analysis
  • log processing

The combination of:

  • cut
  • sort
  • uniq

is one of the most common Linux text-processing patterns.


Common Beginner Mistakes

Using numeric sorting incorrectly:

sort file.txt

instead of:

sort -n file.txt

Another mistake:

Using uniq without sorting first:

uniq file.txt

This only removes consecutive duplicates.

Correct workflow:

sort file.txt | uniq
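The difference is easy to see on a small input: uniq alone only collapses adjacent duplicates, so a repeat separated by another line survives:

```shell
# uniq alone misses non-adjacent duplicates; sorting first fixes that
printf 'apple\nbanana\napple\n' | uniq          # 3 lines remain
printf 'apple\nbanana\napple\n' | sort | uniq   # 2 lines remain
```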

Another mistake:

Sorting human-readable sizes without -h:

sort sizes.txt

Correct version:

sort -h sizes.txt

Summary

In this guide, you learned:

  • basic text sorting with sort
  • reverse sorting
  • numeric sorting
  • sorting by columns
  • case-insensitive sorting
  • removing duplicates
  • sorting CSV data
  • random sorting
  • combining sort with pipes
  • practical shell scripting with sort

These skills are essential for:

  • Linux administration
  • shell scripting
  • log analysis
  • automation
  • data processing

Additional sort parameters not covered in this guide include:

-M: Sort by month names
-V: Natural version sorting
-b: Ignore leading blanks
-o: Write output to file
--parallel=N: Use multiple CPU cores (up to N sorting threads)
--debug: Display sorting diagnostics
--help: Display help information
--version: Display version information
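Of these, -V deserves a quick illustration, because alphabetical sorting mishandles multi-digit version components (this assumes GNU sort, where -V is available):

```shell
# Version sort: 1.9 comes before 1.10, unlike alphabetical order
printf '%s\n' 1.10 1.2 1.9 | sort -V
# 1.2
# 1.9
# 1.10
```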