uniq

Basic Usage of uniq - Removing Duplicate Lines

The uniq command filters or removes repeated adjacent lines from text input. It is commonly used together with the sort command in Linux shell scripting, log analysis, and data processing.

uniq file.txt

Example file:

apple
apple
banana
banana
orange

Command:

uniq file.txt

Output:

apple
banana
orange
  • uniq removes duplicate consecutive lines
  • Only repeated adjacent lines are detected
  • Frequently combined with sort
  • Common tool in shell scripting and automation

Important Concept: Consecutive Duplicates Only

The uniq command only works on consecutive duplicate lines.

Example file:

apple
banana
apple
orange

Command:

uniq file.txt

Output:

apple
banana
apple
orange

Nothing changes because duplicate lines are not next to each other.

Correct workflow:

sort file.txt | uniq

Output:

apple
banana
orange
  • sort groups identical lines together
  • uniq then removes duplicates
  • This combination is extremely common in Linux
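The sort | uniq pair has a common shorthand: sort -u sorts and de-duplicates in one step (though it cannot replace uniq's -c, -d, or -u filters). A quick sketch, using a hypothetical fruits.txt:

```shell
# Build a sample file with out-of-order duplicates (hypothetical name)
printf 'apple\nbanana\napple\norange\n' > fruits.txt

# sort -u sorts the lines and drops duplicates in one step
sort -u fruits.txt
# apple
# banana
# orange
```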

Displaying Duplicate Counts

To count duplicate occurrences:

uniq -c file.txt

Example input:

apple
apple
banana
banana
banana
orange

Output:

      2 apple
      3 banana
      1 orange
  • -c means count occurrences
  • Displays number of repeated lines
  • Useful for statistics and log analysis

Displaying Only Duplicate Lines

To display only duplicated entries:

uniq -d file.txt

Example input:

apple
apple
banana
banana
orange

Output:

apple
banana
  • -d means duplicates only
  • Unique lines are hidden
  • Useful for finding repeated data
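The -d flag also combines with -c to count only the duplicated entries. A sketch with a hypothetical fruits.txt:

```shell
# Sample file: two duplicated entries and one unique line
printf 'apple\napple\nbanana\nbanana\norange\n' > fruits.txt

# -c adds counts, -d restricts output to duplicated lines only
uniq -cd fruits.txt
# prints a count of 2 for apple and 2 for banana; orange is omitted
```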

Displaying Only Unique Lines

To display only lines that appear once:

uniq -u file.txt

Example input:

apple
apple
banana
orange

Output:

banana
orange
  • -u means unique lines only
  • Duplicate lines are removed completely
  • Useful for identifying uncommon entries

Ignoring Case Differences

By default, uniq is case-sensitive.

Example input:

Apple
apple
APPLE

Default behavior:

uniq file.txt

Output:

Apple
apple
APPLE

To ignore case differences:

uniq -i file.txt

Output:

Apple
  • -i means case-insensitive comparison
  • The first variant in each group is kept
  • Useful for user-generated text data
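As with plain uniq, -i only merges case variants that are adjacent. GNU sort's -f flag folds case while sorting, so the two pair naturally. A sketch with a hypothetical words.txt:

```shell
# Case variants scattered through the file
printf 'Apple\nbanana\napple\nAPPLE\n' > words.txt

# sort -f groups the case variants together; uniq -i then merges them,
# leaving one "apple" variant plus "banana"
sort -f words.txt | uniq -i
```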

Skipping Fields

Sometimes only part of each line should be compared.

Example file:

2026 apple
2027 apple
2028 banana

Command:

uniq -f1 file.txt
  • -f1 skips first field
  • Comparison starts from second field

Result:

2026 apple
2028 banana
  • Useful for log processing and structured data

Skipping Characters

To ignore first characters during comparison:

uniq -s3 file.txt
  • -s3 skips first 3 characters
  • Useful for fixed-width text formats
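A short sketch of -s with hypothetical fixed-width data, where each line starts with a 3-character code followed by the value being compared:

```shell
# Each line: 3-character prefix, then the value uniq should compare
printf 'AAA101\nBBB101\nCCC202\n' > codes.txt

# Skip the first 3 characters, so only "101"/"101"/"202" are compared
uniq -s3 codes.txt
# AAA101
# CCC202
```

The first line of each group is kept, so AAA101 survives while BBB101 is dropped.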

Combining uniq with sort

This is the most common real-world usage pattern.

Example:

sort names.txt | uniq

Count unique values:

sort access.log | uniq -c

Find duplicate IP addresses:

sort ips.txt | uniq -d
  • sort prepares data for uniq
  • Essential workflow in Linux text processing

Using uniq with Pipes

The uniq command is commonly used in pipelines.

Example:

cat users.txt | sort | uniq

Count active shells:

cut -d ":" -f7 /etc/passwd | sort | uniq -c

Example output:

     15 /bin/bash
      3 /usr/sbin/nologin
  • Useful for reports and audits
  • Common in system administration

Combining Multiple Options

Example:

uniq -ci file.txt

Breakdown:

  • -c counts duplicates
  • -i ignores case

Another example:

uniq -du file.txt

This combination produces no output because:

  • -d shows duplicates only
  • -u shows unique lines only

No line can be both duplicated and unique, so the two conditions exclude each other.
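If the goal is to see every occurrence of a duplicated line (rather than one line per group), GNU uniq provides -D (--all-repeated). A sketch with a hypothetical fruits.txt:

```shell
printf 'apple\napple\nbanana\norange\n' > fruits.txt

# -D prints every line that belongs to a duplicated group
uniq -D fruits.txt
# apple
# apple
```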


Common Administrative Examples

Count failed login attempts:

cut -d " " -f1 auth.log | sort | uniq -c

Find duplicate usernames:

sort users.txt | uniq -d

List unique shells:

cut -d ":" -f7 /etc/passwd | sort | uniq

Count most common IP addresses:

sort access.log | uniq -c | sort -nr

Practical Script Example (Step-by-Step Explanation)

Script

#!/bin/bash

LOG="/var/log/access.log"

echo "Top IP addresses:"

cut -d " " -f1 "$LOG" | sort | uniq -c | sort -nr | head -10

Step 1: Shebang

#!/bin/bash
  • Defines Bash interpreter
  • Ensures script executes correctly

Step 2: Defining log file

LOG="/var/log/access.log"
  • Stores log file location
  • Makes script easier to maintain

Example value:

/var/log/access.log

Step 3: Printing informational message

echo "Top IP addresses:"
  • Displays readable heading
  • Helps organize script output

Step 4: Extracting IP addresses

cut -d " " -f1 "$LOG"

Breakdown:

  • -d " " defines space delimiter
  • -f1 extracts first field
  • First field usually contains client IP address

Example log line:

192.168.1.15 GET /index.html

Extracted result:

192.168.1.15

Step 5: Sorting the data

sort
  • Groups identical IP addresses together
  • Required before using uniq

Example:

192.168.1.10
192.168.1.10
192.168.1.20

Step 6: Counting duplicates

uniq -c
  • Counts repeated lines
  • Displays frequency of each IP address

Example output:

     15 192.168.1.10
      8 192.168.1.20

Step 7: Sorting by highest count

sort -nr

Breakdown:

  • -n numeric sorting
  • -r reverse order

Displays most frequent IP addresses first.


Step 8: Limiting output

head -10
  • Displays first 10 lines only
  • Prevents overwhelming output
  • Shows top 10 IP addresses

What this script does

Step-by-step flow:

  1. Reads access log
  2. Extracts IP addresses
  3. Sorts addresses
  4. Counts occurrences
  5. Sorts by highest frequency
  6. Displays top 10 results
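The pipeline can be dry-run against a hypothetical miniature log to see the stages working together:

```shell
# Fake three-request log in the expected "IP METHOD PATH" format
printf '10.0.0.1 GET /a\n10.0.0.2 GET /b\n10.0.0.1 GET /c\n' > access.log

# Same pipeline as the script, on the miniature log:
# 10.0.0.1 appears twice and is listed first, then 10.0.0.2
cut -d " " -f1 access.log | sort | uniq -c | sort -nr | head -10
```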

Why this matters in production

This workflow is heavily used for:

  • traffic analysis
  • security monitoring
  • detecting brute-force attacks
  • identifying abusive clients
  • web server analytics

The combination of:

  • cut
  • sort
  • uniq

forms one of the most important Linux text-processing pipelines.


Common Beginner Mistakes

Using uniq without sorting first:

uniq file.txt

This only removes consecutive duplicates.

Correct workflow:

sort file.txt | uniq

Another mistake:

Assuming uniq removes all duplicates automatically.

Incorrect expectation:

apple
banana
apple

Result remains unchanged without sorting.

Another mistake:

Using conflicting options:

uniq -du file.txt

This combination prints nothing, because no line can be both duplicated and unique.


Summary

In this guide, you learned:

  • how uniq removes duplicate lines
  • why sorting is important before using uniq
  • counting duplicates
  • displaying duplicate-only lines
  • displaying unique-only lines
  • case-insensitive comparison
  • skipping fields and characters
  • combining uniq with pipes
  • practical shell scripting with uniq

These skills are essential for:

  • Linux administration
  • shell scripting
  • log analysis
  • automation
  • text processing

Additional uniq parameters not covered in this guide include:

-w: Compare only specified number of characters
-z: Use null-terminated lines
--group: Display grouped duplicates
--all-repeated: Show all duplicate groups
--help: Display help information
--version: Display version information