sort

Basic Usage of sort - Sorting Text Data

The sort command is used to sort lines of text files or command output. It is commonly used in shell scripting, log analysis, report generation, and data processing.

sort file.txt

Example file:

banana
apple
orange
grape

Command:

sort file.txt

Output:

apple
banana
grape
orange
  • sort arranges lines alphabetically by default
  • Commonly used for organizing text data
  • Frequently combined with pipes and other commands
  • Important tool for shell scripting and automation

Sorting in Reverse Order

To sort in descending order:

sort -r file.txt

Output:

orange
grape
banana
apple
  • -r means reverse order
  • Reverses normal sorting behavior
  • Useful for reports and ranking outputs

Numeric Sorting

By default, sort performs alphabetical sorting.

Example file:

100
20
3

Default sorting:

sort numbers.txt

Output:

100
20
3

This happens because the default sort is lexicographic: lines are compared character by character, so "100" (starting with 1) sorts before "20".

To sort numerically:

sort -n numbers.txt

Output:

3
20
100
  • -n means numeric sort
  • Interprets values as numbers
  • Essential for processing statistics and metrics
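The -n flag also handles negative numbers and decimal fractions, which plain alphabetical sorting scrambles. A quick sketch using printf as input:

```shell
# Numeric sort copes with negatives and decimals, not just integers
printf '%s\n' 10 -5 3.5 0 | sort -n
# -5
# 0
# 3.5
# 10
```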

Human Readable Numeric Sorting

Example file:

2K
1G
500M

To sort human-readable sizes correctly:

sort -h sizes.txt

Output:

2K
500M
1G
  • -h understands size suffixes such as K, M, G, and T (a GNU sort extension)
  • Useful when processing df, du, or ls -lh output
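Because -h ranks values by suffix magnitude (K < M < G) before the leading number, the ordering can be verified without a file (again assuming GNU sort, where -h is available):

```shell
# Suffix magnitude dominates: 2K < 500M < 1G
printf '%s\n' 1G 500M 2K | sort -h
# 2K
# 500M
# 1G
```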

Sorting by Specific Column

Example file:

john 50
anna 20
mike 35

Sort by second column:

sort -k2 scores.txt

Output:

anna 20
mike 35
john 50
  • -k2 starts the sort key at column 2 (and runs to the end of the line)
  • Useful for structured data
  • Frequently used in reports and logs
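One subtlety worth knowing: -k2 keys on everything from field 2 to the end of the line, not on field 2 alone. To limit the key to a single field, give a start and an end, as in -k2,2. With two-field data the results match, but with more fields they can differ:

```shell
# -k2,2 restricts the sort key to field 2 only
printf '%s\n' 'john 50 x' 'anna 20 z' 'mike 35 y' | sort -k2,2
# anna 20 z
# mike 35 y
# john 50 x
```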

Numeric Column Sorting

To sort the second column numerically:

sort -k2 -n scores.txt
  • Combines column sorting with numeric sorting
  • Very common in shell scripts

Ignoring Case Sensitivity

Default sorting is case-sensitive: in the C locale, uppercase letters sort before lowercase ones.

Example file:

Banana
apple
Orange
grape

Default output:

Banana
Orange
apple
grape

To ignore letter case:

sort -f file.txt

Output:

apple
Banana
grape
Orange
  • -f folds lowercase into uppercase, effectively ignoring case
  • Useful for user-generated text data

Removing Duplicate Lines

To sort and remove duplicates:

sort -u file.txt

Example input:

apple
banana
apple
orange

Output:

apple
banana
orange
  • -u means unique
  • Removes duplicate lines automatically
  • Useful for generating clean lists
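sort -u gives the same result as the classic sort | uniq pipeline, in a single process:

```shell
# sort -u collapses duplicates in one step, like sort | uniq
printf 'apple\nbanana\napple\norange\n' | sort -u
# apple
# banana
# orange
```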

Checking Whether a File Is Sorted

To verify sorting without modifying output:

sort -c file.txt

If file is sorted correctly:

  • no output is displayed

If file is not sorted:

sort: file.txt:2: disorder: apple
  • -c means check sorted order
  • Useful in scripts and automation
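Because -c reports through its exit status (0 when sorted, nonzero otherwise), it slots directly into shell conditionals. A minimal sketch using stdin instead of a file:

```shell
# sort -c exits 0 for sorted input, so it works as an if condition
if printf 'apple\nbanana\n' | sort -c 2>/dev/null; then
    echo "sorted"
else
    echo "not sorted"
fi
# prints "sorted"
```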

Sorting Delimited Data

Example CSV file:

john,50
anna,20
mike,35

Sort by second column:

sort -t "," -k2 -n users.csv

Output:

anna,20
mike,35
john,50

Breakdown:

  • -t "," defines comma delimiter
  • -k2 sorts using second field
  • -n performs numeric sorting
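Multiple keys can also be combined, each with its own modifiers, so one column decides the order and another breaks ties. A sketch in the same CSV layout (the names here are invented):

```shell
# Primary key: field 2, numeric, descending; tie-break: field 1, ascending
printf '%s\n' 'john,50' 'anna,20' 'dave,20' | sort -t, -k2,2nr -k1,1
# john,50
# anna,20
# dave,20
```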

Combining sort with Pipes

The sort command is frequently used in pipelines.

Example:

cat names.txt | sort

(Equivalent to sort names.txt; shown here to illustrate piping.)

Sort processes by memory usage (%MEM is column 4 of ps aux):

ps aux | sort -k4 -n

Sort disk usage:

du -sh * | sort -h
  • sort is heavily used with pipes
  • Important in shell automation workflows

Random Sorting

To randomize lines:

sort -R file.txt
  • -R shuffles lines by sorting on a random hash of each line
  • Identical lines still end up grouped together (shuf gives a true per-line shuffle)
  • Useful for random selection or testing
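Since the order changes on every run, a typical use is grabbing one random line; note that -R itself is a GNU extension:

```shell
# Pick a single random line from the input
printf '%s\n' red green blue | sort -R | head -n 1
```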

Combining Multiple Options

Example:

sort -nr numbers.txt

Breakdown:

  • -n numeric sorting
  • -r reverse order

Another example:

sort -t ":" -k3 -n /etc/passwd

This:

  • uses : delimiter
  • sorts by third field
  • performs numeric sorting

Common Administrative Examples

Sort disk usage by size:

du -sh * | sort -h

Sort users by UID:

sort -t ":" -k3 -n /etc/passwd

Deduplicate a list of IP addresses (one address per line):

sort -u ips.txt

Sort process list by memory usage:

ps aux | sort -k4 -nr

Practical Script Example (Step-by-Step Explanation)

Script

#!/bin/bash

FILE="/var/log/access.log"

echo "Top unique IP addresses:"

cut -d " " -f1 $FILE | sort | uniq -c | sort -nr

Step 1: Shebang

#!/bin/bash
  • Defines Bash interpreter
  • Ensures script runs using Bash shell

Step 2: Defining log file

FILE="/var/log/access.log"
  • Stores log file path in variable
  • Makes script easier to modify later

Example value:

/var/log/access.log

Step 3: Printing informational message

echo "Top unique IP addresses:"
  • Displays readable heading
  • Helps structure output

Step 4: Extracting IP addresses

cut -d " " -f1 $FILE

Breakdown:

  • -d " " defines space delimiter
  • -f1 extracts first column
  • First column usually contains IP address in access logs

Example log line:

192.168.1.10 GET /index.html

Extracted result:

192.168.1.10

Step 5: Sorting IP addresses

sort
  • Groups identical IP addresses together
  • Required before using uniq

Example:

192.168.1.10
192.168.1.10
192.168.1.20

Step 6: Counting duplicates

uniq -c
  • Counts repeated lines
  • Requires sorted input

Example output:

2 192.168.1.10
1 192.168.1.20

Step 7: Sorting by highest count

sort -nr

Breakdown:

  • -n numeric sorting
  • -r reverse order

This displays highest counts first.

Example output:

120 192.168.1.10
85 192.168.1.20

What this script does

Step-by-step flow:

  1. Reads access log
  2. Extracts IP addresses
  3. Sorts addresses
  4. Counts duplicates
  5. Displays most frequent IPs first
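The whole flow can be exercised end to end against a few sample log lines (the addresses below are invented for illustration):

```shell
# Simulate three access-log lines, then run the full pipeline
printf '%s\n' \
    '192.168.1.10 GET /index.html' \
    '192.168.1.10 GET /about.html' \
    '192.168.1.20 GET /index.html' \
    | cut -d ' ' -f1 | sort | uniq -c | sort -nr
# count-column padding varies by implementation, but the counts and order are:
#   2 192.168.1.10
#   1 192.168.1.20
```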

Why this matters in production

This workflow is extremely common in:

  • web server analysis
  • security investigations
  • DDoS detection
  • traffic analysis
  • log processing

The combination of:

  • cut
  • sort
  • uniq

is one of the most common Linux text-processing patterns.


Common Beginner Mistakes

Using numeric sorting incorrectly:

sort file.txt

instead of:

sort -n file.txt

Another mistake:

Using uniq without sorting first:

uniq file.txt

This only removes consecutive duplicates.

Correct workflow:

sort file.txt | uniq
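The difference is easy to see on a small input: uniq alone only collapses adjacent duplicates, so a repeat separated by another line survives:

```shell
# uniq alone misses non-adjacent duplicates; sorting first fixes that
printf 'apple\nbanana\napple\n' | uniq          # 3 lines remain
printf 'apple\nbanana\napple\n' | sort | uniq   # 2 lines remain
```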

Another mistake:

Sorting human-readable sizes without -h:

sort sizes.txt

Correct version:

sort -h sizes.txt

Summary

In this guide, you learned:

  • basic text sorting with sort
  • reverse sorting
  • numeric sorting
  • sorting by columns
  • case-insensitive sorting
  • removing duplicates
  • sorting CSV data
  • random sorting
  • combining sort with pipes
  • practical shell scripting with sort

These skills are essential for:

  • Linux administration
  • shell scripting
  • log analysis
  • automation
  • data processing

Additional sort parameters not covered in this guide include:

-M: Sort by month names
-V: Natural version sorting
-b: Ignore leading blanks
-o: Write output to file
--parallel=N: Use multiple CPU cores (up to N sorting threads)
--debug: Display sorting diagnostics
--help: Display help information
--version: Display version information
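Of these, -V deserves a quick illustration, because alphabetical sorting mishandles multi-digit version components (this assumes GNU sort, where -V is available):

```shell
# Version sort: 1.9 comes before 1.10, unlike alphabetical order
printf '%s\n' 1.10 1.2 1.9 | sort -V
# 1.2
# 1.9
# 1.10
```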