uniq
Basic Usage of uniq - Removing Duplicate Lines
The uniq command is used to filter or remove repeated lines from text input. It is commonly used together with the sort command in Linux shell scripting, log analysis, and data processing.
uniq file.txt
Example file:
apple
apple
banana
banana
orange
Command:
uniq file.txt
Output:
apple
banana
orange
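The example above can be reproduced end to end; the file path used here is an arbitrary temporary name chosen for the demo:

```shell
# Recreate the example file from this section
printf 'apple\napple\nbanana\nbanana\norange\n' > /tmp/uniq_demo.txt

# Adjacent duplicate lines collapse to a single line
uniq /tmp/uniq_demo.txt
```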
- uniq removes consecutive duplicate lines
- Only adjacent repeated lines are detected
- Frequently combined with sort
- Common tool in shell scripting and automation
Important Concept: Consecutive Duplicates Only
The uniq command only works on consecutive duplicate lines.
Example file:
apple
banana
apple
orange
Command:
uniq file.txt
Output:
apple
banana
apple
orange
Nothing changes because duplicate lines are not next to each other.
Correct workflow:
sort file.txt | uniq
Output:
apple
banana
orange
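Both behaviors can be compared side by side; the file name below is arbitrary:

```shell
# Non-adjacent duplicates: "apple" appears on lines 1 and 3
printf 'apple\nbanana\napple\norange\n' > /tmp/uniq_sort_demo.txt

# Without sort, uniq leaves all four lines intact
uniq /tmp/uniq_sort_demo.txt

# Sorting first groups the duplicates so uniq can remove them
sort /tmp/uniq_sort_demo.txt | uniq
```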
- sort groups identical lines together
- uniq then removes the duplicates
- This combination is extremely common in Linux
Displaying Duplicate Counts
To count duplicate occurrences:
uniq -c file.txt
Example input:
apple
apple
banana
banana
banana
orange
Output:
2 apple
3 banana
1 orange
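A runnable version of this example (temporary file name chosen for the demo; real uniq -c output left-pads the counts with spaces):

```shell
printf 'apple\napple\nbanana\nbanana\nbanana\norange\n' > /tmp/uniq_c_demo.txt

# Prefix each line with the number of consecutive occurrences
uniq -c /tmp/uniq_c_demo.txt
```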
- -c means count occurrences
- Displays the number of repeated lines
- Useful for statistics and log analysis
Displaying Only Duplicate Lines
To display only duplicated entries:
uniq -d file.txt
Example input:
apple
apple
banana
banana
orange
Output:
apple
banana
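The same example as a self-contained demo (arbitrary temporary file name):

```shell
printf 'apple\napple\nbanana\nbanana\norange\n' > /tmp/uniq_d_demo.txt

# Print one copy of each line that is repeated; "orange" is suppressed
uniq -d /tmp/uniq_d_demo.txt
```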
- -d means duplicates only
- Unique lines are hidden
- Useful for finding repeated data
Displaying Only Unique Lines
To display only lines that appear once:
uniq -u file.txt
Example input:
apple
apple
banana
orange
Output:
banana
orange
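A reproducible version of this example (temporary file name chosen for the demo):

```shell
printf 'apple\napple\nbanana\norange\n' > /tmp/uniq_u_demo.txt

# Print only lines that are NOT repeated; "apple" is dropped entirely
uniq -u /tmp/uniq_u_demo.txt
```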
- -u means unique lines only
- Duplicate lines are removed completely
- Useful for identifying uncommon entries
Ignoring Case Differences
By default, uniq is case-sensitive.
Example input:
Apple
apple
APPLE
Default behavior:
uniq file.txt
Output:
Apple
apple
APPLE
To ignore case differences:
uniq -i file.txt
- -i means case-insensitive comparison
- Useful for user-generated text data
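Both behaviors side by side (arbitrary temporary file name; with -i, GNU uniq keeps the first spelling of each group):

```shell
printf 'Apple\napple\nAPPLE\n' > /tmp/uniq_i_demo.txt

# Case-sensitive comparison: all three lines survive
uniq /tmp/uniq_i_demo.txt

# Case-insensitive comparison: the group collapses to its first line, "Apple"
uniq -i /tmp/uniq_i_demo.txt
```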
Skipping Fields
Sometimes only part of each line should be compared.
Example file:
2026 apple
2027 apple
2028 banana
Command:
uniq -f1 file.txt
- -f1 skips the first field
- Comparison starts from the second field
Result:
2026 apple
2028 banana
- Useful for log processing and structured data
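The field-skipping example can be verified directly; the file name is arbitrary:

```shell
printf '2026 apple\n2027 apple\n2028 banana\n' > /tmp/uniq_f_demo.txt

# Ignore the year column; the two "apple" lines become duplicates
uniq -f1 /tmp/uniq_f_demo.txt
```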
Skipping Characters
To ignore first characters during comparison:
uniq -s3 file.txt
- -s3 skips the first 3 characters
- Useful for fixed-width text formats
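A minimal sketch, assuming a fixed-width 3-character prefix (such as a record ID) in front of the value being compared:

```shell
printf '001apple\n002apple\n003banana\n' > /tmp/uniq_s_demo.txt

# Ignore the 3-character prefix; "001apple" and "002apple" become duplicates
uniq -s3 /tmp/uniq_s_demo.txt
```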
Combining uniq with sort
This is the most common real-world usage pattern.
Example:
sort names.txt | uniq
Count unique values:
sort access.log | uniq -c
Find duplicate IP addresses:
sort ips.txt | uniq -d
- sort prepares the data for uniq
- Essential workflow in Linux text processing
Using uniq with Pipes
The uniq command is commonly used in pipelines.
Example:
cat users.txt | sort | uniq
Count active shells:
cut -d ":" -f7 /etc/passwd | sort | uniq -c
Example output:
15 /bin/bash
3 /usr/sbin/nologin
- Useful for reports and audits
- Common in system administration
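The /etc/passwd pipeline depends on the local system's accounts, so a reproducible sketch uses a small sample file in the same format (the entries below are hypothetical):

```shell
# Sample lines in /etc/passwd format; field 7 is the login shell
printf 'root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\nalice:x:1000:1000::/home/alice:/bin/bash\n' > /tmp/passwd_sample

# Extract the shell column, group identical shells, and count them
cut -d ":" -f7 /tmp/passwd_sample | sort | uniq -c
```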
Combining Multiple Options
Example:
uniq -ci file.txt
Breakdown:
- -c counts duplicates
- -i ignores case
Another example:
uniq -du file.txt
This combination produces no output because:
- -d shows duplicated lines only
- -u shows unique lines only
These options conflict logically: no line can be both duplicated and unique, so GNU uniq accepts the flags but prints nothing.
Common Administrative Examples
Count failed login attempts:
cut -d " " -f1 auth.log | sort | uniq -c
Find duplicate usernames:
sort users.txt | uniq -d
List unique shells:
cut -d ":" -f7 /etc/passwd | sort | uniq
Count most common IP addresses:
sort access.log | uniq -c | sort -nr
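The "most common IP addresses" pattern can be tested with a small sample file (the addresses below are hypothetical):

```shell
printf '192.168.1.10\n192.168.1.20\n192.168.1.10\n192.168.1.10\n' > /tmp/ips_demo.txt

# Group, count, then sort numerically in descending order of frequency
sort /tmp/ips_demo.txt | uniq -c | sort -nr
```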
Practical Script Example (Step-by-Step Explanation)
Script
#!/bin/bash
LOG="/var/log/access.log"
echo "Top IP addresses:"
cut -d " " -f1 "$LOG" | sort | uniq -c | sort -nr | head -10
Step 1: Shebang
#!/bin/bash
- Defines Bash interpreter
- Ensures script executes correctly
Step 2: Defining log file
LOG="/var/log/access.log"
- Stores log file location
- Makes script easier to maintain
Example value:
/var/log/access.log
Step 3: Printing informational message
echo "Top IP addresses:"
- Displays readable heading
- Helps organize script output
Step 4: Extracting IP addresses
cut -d " " -f1 "$LOG"
Breakdown:
-d " "defines space delimiter-f1extracts first field- First field usually contains client IP address
Example log line:
192.168.1.15 GET /index.html
Extracted result:
192.168.1.15
Step 5: Sorting the data
sort
- Groups identical IP addresses together
- Required before using
uniq
Example:
192.168.1.10
192.168.1.10
192.168.1.20
Step 6: Counting duplicates
uniq -c
- Counts repeated lines
- Displays frequency of each IP address
Example output:
15 192.168.1.10
8 192.168.1.20
Step 7: Sorting by highest count
sort -nr
Breakdown:
- -n means numeric sorting
- -r means reverse order
Displays most frequent IP addresses first.
Step 8: Limiting output
head -10
- Displays first 10 lines only
- Prevents overwhelming output
- Shows top 10 IP addresses
What this script does
Step-by-step flow:
- Reads access log
- Extracts IP addresses
- Sorts addresses
- Counts occurrences
- Sorts by highest frequency
- Displays top 10 results
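The whole flow can be exercised against a small hypothetical access log (the paths and log lines below are made up for the demo):

```shell
# Hypothetical access-log lines; the first space-separated field is the client IP
printf '192.168.1.15 GET /index.html\n192.168.1.10 GET /about\n192.168.1.10 GET /index.html\n' > /tmp/access_sample.log

# Extract IPs, group them, count occurrences, sort by frequency, keep the top 10
cut -d " " -f1 /tmp/access_sample.log | sort | uniq -c | sort -nr | head -10
```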
Why this matters in production
This workflow is heavily used for:
- traffic analysis
- security monitoring
- detecting brute-force attacks
- identifying abusive clients
- web server analytics
The combination of cut, sort, and uniq forms one of the most important Linux text-processing pipelines.
Common Beginner Mistakes
Using uniq without sorting first:
uniq file.txt
This only removes consecutive duplicates.
Correct workflow:
sort file.txt | uniq
Another mistake:
Assuming uniq removes all duplicates automatically.
Incorrect expectation:
apple
banana
apple
Result remains unchanged without sorting.
Another mistake:
Using conflicting options:
uniq -du file.txt
This combination does not make logical sense.
Summary
In this guide, you learned:
- how uniq removes duplicate lines
- why sorting is important before using uniq
- counting duplicates
- displaying duplicate-only lines
- displaying unique-only lines
- case-insensitive comparison
- skipping fields and characters
- combining uniq with pipes
- practical shell scripting with uniq
These skills are essential for:
- Linux administration
- shell scripting
- log analysis
- automation
- text processing
Additional uniq parameters not covered in this guide include:
-w: Compare only specified number of characters
-z: Use null-terminated lines
--group: Display grouped duplicates
--all-repeated: Show all duplicate groups
--help: Display help information
--version: Display version information