Nginx log analytics on the command line


Understanding and analyzing web server logs is crucial for maintaining optimal website performance and security. In this article, we’ll explore how to visualize Nginx access logs using Bash scripting. We’ll create a script that presents unique requests by IP and hour, total requests per hour, and statistics on the top 10 IP addresses, including their counts and percentages of total requests. You might ask: why do this instead of using a solution like the ELK stack (Elasticsearch, Logstash, Kibana)? You can, but do you need to? I recently created a website that generates WiFi QR codes, https://quackqr.com, which will most likely never get much traffic, and there is no need to spend time and resources setting up ELK for it. This script is handy and does the job efficiently!

Why Analyze Nginx Logs?

Nginx access logs provide valuable insights into how users interact with your website. By analyzing these logs, you can:

  • Identify traffic patterns
  • Detect potential security threats
  • Optimize server performance
  • Understand user behavior

Prerequisites

Before you proceed, ensure you have:

  • Access to an Nginx server with logs available at /var/log/nginx/access.log
  • Basic familiarity with Bash scripting
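
Before running anything, it’s worth a quick sanity check that the log is actually readable from your shell. A minimal sketch (the path below is the common Debian/Ubuntu default; check the access_log directive in your nginx.conf if your distro differs):

```shell
# Default access log path on many distros; adjust to match your nginx.conf
log=/var/log/nginx/access.log
if [ -r "$log" ]; then
    head -n 2 "$log"    # peek at the format of a couple of entries
else
    echo "cannot read $log (try sudo, or check nginx.conf)" >&2
fi
```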

The Bash Script

Here’s a complete Bash script that accomplishes our goal:

#!/bin/bash 
 
# Function to print a horizontal bar of # characters 
print_bar() { 
    local count=$1 
    local max_length=50 
    local bar_length=$((count * max_length / max_count)) 
    # printf '#%.0s' with no arguments still prints one '#', so print an 
    # empty bar explicitly when the scaled length rounds down to zero 
    if ((bar_length > 0)); then 
        printf "%-${max_length}s" "$(printf '#%.0s' $(seq 1 "$bar_length"))" # Print bar 
    else 
        printf "%-${max_length}s" "" 
    fi 
} 
 
# Get total requests count for percentage calculations 
total_requests=$(wc -l < /var/log/nginx/access.log) 
 
echo "Unique Requests by IP and Hour:" 
echo "---------------------------------" 
# Find maximum request count for scaling 
max_count=$(tr -s " " < /var/log/nginx/access.log | cut -d " " -f1,4 | tr -d "[" | cut -d ":" -f1,2 | sort -n | uniq -c | awk '{print $1}' | sort -n | tail -1) 
 
# Print unique requests by IP and hour 
tr -s " " < /var/log/nginx/access.log | cut -d " " -f1,4 | tr -d "[" | cut -d ":" -f1,2 | sort -n | uniq -c | while read count hour; do 
    printf "%s | " "$hour" 
    print_bar "$count" 
    echo " ($count)" 
done 
 
echo "" 
echo "Total Requests Per Hour:" 
echo "-------------------------" 
# Reset max_count for total requests 
max_count=$(tr -s " " < /var/log/nginx/access.log | cut -d " " -f4 | tr -d "[" | cut -d ":" -f1,2 | sort | uniq -c | awk '{print $1}' | sort -n | tail -1) 
 
# Print total requests per hour 
tr -s " " < /var/log/nginx/access.log | cut -d " " -f4 | tr -d "[" | cut -d ":" -f1,2 | sort | uniq -c | while read count hour; do 
    printf "%s | " "$hour" 
    print_bar "$count" 
    echo " ($count)" 
done 
 
echo "" 
echo "Top 10 IP Addresses:" 
echo "----------------------" 
 
# Get the top 10 IP addresses 
top_ips=$(awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head -10) 
 
# Print header 
printf "%-15s | %-10s | %-8s\n" "IP Address" "Count" "Percentage" 
echo "--------------------------------------" 
 
# Iterate over top IPs and calculate the percentage 
while read count ip; do 
    # Multiply before dividing: with scale=2, bc truncates the intermediate 
    # quotient, so "count / total * 100" would flatten small percentages 
    percentage=$(echo "scale=2; $count * 100 / $total_requests" | bc) 
    printf "%-15s | %-10s | %-8s%%\n" "$ip" "$count" "$percentage" 
done <<< "$top_ips"

Script Breakdown

Function to Print Horizontal Bars:

  • The print_bar function generates a visual representation of the request counts as horizontal bars made of # characters. It scales the length of the bars based on the maximum count.
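
The scaling step is worth seeing in isolation. Reusing the names from the script: with a busiest hour of 40 requests and a 50-column bar, a count of 10 maps to 12 columns (Bash integer arithmetic truncates 12.5 down to 12):

```shell
count=10; max_count=40; max_length=50
bar_length=$((count * max_length / max_count))
echo "$bar_length"   # 10 * 50 / 40 = 12 (integer division truncates 12.5)
```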

Total Requests Count:

  • The variable total_requests captures the total number of requests from the log file using wc -l.

Unique Requests by IP and Hour:

  • The script processes the log to count requests for each unique IP-and-hour combination, displaying the counts as horizontal bars.
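
To see what the extraction pipeline actually does, here it is applied to a single hypothetical line in Nginx’s default “combined” format (the IP and request are placeholders): squeeze repeated spaces, keep the IP (field 1) and timestamp (field 4), drop the `[`, and keep everything up to the hour:

```shell
line='203.0.113.7 - - [10/Oct/2024:10:02:31 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"'
echo "$line" | tr -s " " | cut -d " " -f1,4 | tr -d "[" | cut -d ":" -f1,2
# -> 203.0.113.7 10/Oct/2024:10
```

`sort | uniq -c` then counts how many times each of these “IP date:hour” keys appears.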

Total Requests Per Hour:

  • Similarly, the script counts total requests for each hour, again visualizing the data using horizontal bars.

Top 10 IP Addresses:

  • The script calculates and displays the top 10 IP addresses based on request counts, along with their respective counts and percentage of total requests.
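
For comparison, the same report can be sketched as a single awk pass that computes counts and percentages together, avoiding the separate bc call. This is an alternative, not part of the script above, and the sample log lines and IPs are placeholders:

```shell
awk '{ count[$1]++; total++ }
     END { for (ip in count)
               printf "%-15s | %-5d | %6.2f%%\n", ip, count[ip], 100 * count[ip] / total }' <<'EOF' | sort -t '|' -k2,2nr | head -10
192.0.2.1 - - [10/Oct/2024:10:00:01 +0000] "GET / HTTP/1.1" 200 612
192.0.2.1 - - [10/Oct/2024:10:00:02 +0000] "GET /a HTTP/1.1" 200 100
192.0.2.9 - - [10/Oct/2024:10:00:03 +0000] "GET /b HTTP/1.1" 404 50
192.0.2.1 - - [10/Oct/2024:10:01:00 +0000] "GET /c HTTP/1.1" 200 99
EOF
```

In real use you would feed awk the log file instead of the here-document.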

Running the Script

  1. Save the script as visualize_requests_with_top_ips.sh.
  2. Make it executable:

chmod +x visualize_requests_with_top_ips.sh

  3. Run the script:

./visualize_requests_with_top_ips.sh
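
One optional tweak (an assumption on my part, not part of the script above): read the log path from the first argument, defaulting to the hard-coded one, so the same script can be pointed at a copy of the log or a not-yet-compressed rotated file:

```shell
# First argument overrides the default log path, e.g.:
#   ./visualize_requests_with_top_ips.sh /var/log/nginx/access.log.1
log="${1:-/var/log/nginx/access.log}"
echo "analyzing: $log"
```

Each occurrence of /var/log/nginx/access.log in the script would then become "$log".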

Example Output

When you run the script, you can expect output similar to this:

Unique Requests by IP and Hour: 
--------------------------------- 
10:00 | ##########                              (10) 
10:01 | ##########                              (10) 
10:02 | #######                                 (7) 
10:03 | ######                                  (6) 
 
Total Requests Per Hour: 
------------------------- 
10:00 | ##########                              (10) 
10:01 | ##########                              (10) 
10:02 | #######                                 (7) 
10:03 | ######                                  (6) 
 
Top 10 IP Addresses: 
---------------------- 
IP Address       | Count      | Percentage 
-------------------------------------- 
192.168.1.1     | 100        | 25.00% 
192.168.1.2     | 80         | 20.00% 
192.168.1.3     | 50         | 12.50% 
192.168.1.4     | 40         | 10.00% 
192.168.1.5     | 30         | 7.50% 
192.168.1.6     | 20         | 5.00% 
192.168.1.7     | 15         | 3.75% 
192.168.1.8     | 10         | 2.50% 
192.168.1.9     | 8          | 2.00% 
192.168.1.10    | 7          | 1.75%

Conclusion

Visualizing Nginx access logs with Bash provides a straightforward way to analyze web traffic. This script empowers you to understand user interactions better, allowing you to optimize your web server and enhance the overall user experience. By incorporating statistics about unique requests, total requests, and the top IP addresses, you can make informed decisions to improve your web application.