Query AWS Load Balancer Logs: Over 80 Queries from My Book for FREE!
November 29, 2024 | by mybuddyhoward.com
In my previous blog post, I shared the story behind writing and publishing my first book. Today, I want to extend some of that hard work to the world, free of charge. I’ve decided to publish all of the example queries from my book in this blog post. Obviously, incorporating the entire content of the book into a blog would lead to an insane amount of scrolling. Therefore, I will focus on sharing the core elements. In my book, each SQL query is accompanied by helpful explanations and tips—details that I can’t fully replicate in this blog post. Nonetheless, I believe these SQL queries will provide a solid starting point for anyone embarking on their journey to analyze AWS load balancer logs using Amazon Athena. For those who find themselves deep in the trenches and in need of more comprehensive details for each example, my book may be worth exploring. Check out my book here: https://a.co/d/9x8YPLF.
So, without further ado, let’s dive into the examples.
1. Analyzing High-Latency Targets
Category: Performance Optimization
Objective: Identify targets with the highest average response time to help optimize backend performance.
SELECT target_ip, AVG(response_processing_time) AS avg_latency
FROM alb_logs
GROUP BY target_ip
ORDER BY avg_latency DESC
LIMIT 10;
Application:
Focus on these targets for performance optimization to reduce latency and improve user experience.
2. Detecting Frequent 5xx Errors
Category: Error Analysis
Objective: Identify the top 5 targets generating 5xx errors.
SELECT target_ip, COUNT(*) AS error_count
FROM alb_logs
WHERE elb_status_code LIKE '5%'
GROUP BY target_ip
ORDER BY error_count DESC
LIMIT 5;
Application:
Fix issues with these targets to enhance reliability and reduce downtime.
3. Identifying Top Referrers
Category: Business Insights
Objective: List top 10 referrers generating traffic.
SELECT referrer, COUNT(*) AS traffic_count
FROM alb_logs
GROUP BY referrer
ORDER BY traffic_count DESC
LIMIT 10;
Application:
Understand referral sources and focus on strengthening partnerships or campaigns.
4. Classifying Request Types
Category: Business Insights
Objective: Count requests based on HTTP methods (GET, POST, etc.).
SELECT request_verb, COUNT(*) AS request_count
FROM alb_logs
GROUP BY request_verb
ORDER BY request_count DESC;
Application:
Evaluate API usage patterns for optimization and scaling.
5. Tracking Large Downloads
Category: Security & Compliance
Objective: Identify responses with high sent_bytes
.
SELECT request_url, sent_bytes
FROM alb_logs
WHERE sent_bytes > (SELECT approx_percentile(sent_bytes, 0.99) FROM alb_logs)
ORDER BY sent_bytes DESC;
Application:
Spot potential unauthorized data downloads or performance bottlenecks.
6. Peak Hour Analysis
Category: Performance Optimization
Objective: Determine hours with the most traffic.
SELECT HOUR(time) AS hour, COUNT(*) AS request_count
FROM alb_logs
GROUP BY HOUR(time)
ORDER BY request_count DESC;
Application:
Prepare for peak loads by scaling resources during high-traffic periods.
7. Monitoring Target Utilization
Category: Business Insights
Objective: Find the top 5 most accessed targets.
SELECT target_ip, COUNT(*) AS hit_count
FROM alb_logs
GROUP BY target_ip
ORDER BY hit_count DESC
LIMIT 5;
Application:
Balance traffic to prevent overloading critical resources.
8. Identifying Long-Lived Sessions
Category: Performance Optimization
Objective: Find sessions with extended durations.
SELECT session_id, MAX(session_duration) AS max_duration
FROM alb_logs
GROUP BY session_id
ORDER BY max_duration DESC
LIMIT 10;
Application:
Optimize resource usage for long sessions.
9. Average Payload Size
Category: Business Insights
Objective: Calculate the average size of requests.
SELECT AVG(received_bytes) AS avg_request_size
FROM alb_logs;
Application:
Adjust payload size limits to align with typical usage.
10. Request Patterns by Day
Category: Business Insights
Objective: Analyze request volume trends over days.
SELECT DATE(time) AS request_date, COUNT(*) AS request_count
FROM alb_logs
GROUP BY request_date
ORDER BY request_date;
Application:
Plan maintenance and campaigns around high-traffic days.
11. TLS Protocol Distribution
Category: Security & Compliance
Objective: Count requests based on SSL/TLS protocols.
SELECT ssl_protocol, COUNT(*) AS protocol_count
FROM alb_logs
GROUP BY ssl_protocol
ORDER BY protocol_count DESC;
Application:
Identify and deprecate outdated protocols for improved security.
12. Client Browser Distribution
Category: Business Insights
Objective: Determine which browsers your users are using.
SELECT user_agent, COUNT(*) AS usage_count
FROM alb_logs
GROUP BY user_agent
ORDER BY usage_count DESC
LIMIT 10;
Application:
Prioritize compatibility testing for popular browsers.
13. Analyzing Redirects
Category: Error Analysis
Objective: List the most common redirect URLs.
SELECT redirect_url, COUNT(*) AS redirect_count
FROM alb_logs
GROUP BY redirect_url
ORDER BY redirect_count DESC
LIMIT 10;
Application:
Optimize redirect rules to improve user navigation.
14. Unusual Response Times
Category: Performance Optimization
Objective: Detect requests with extreme response times.
SELECT *
FROM alb_logs
WHERE response_processing_time > (SELECT approx_percentile(response_processing_time, 0.99) FROM alb_logs);
Application:
Investigate and resolve outliers causing delays.
15. Finding Top File Uploads
Category: Business Insights
Objective: Identify the largest uploaded files.
SELECT request_url, received_bytes
FROM alb_logs
ORDER BY received_bytes DESC
LIMIT 10;
Application:
Spot trends or anomalies in file upload activity.
16. Regional Traffic Distribution
Category: Business Insights
Objective: Count traffic by geographical regions.
SELECT region, COUNT(*) AS request_count
FROM alb_logs
GROUP BY region
ORDER BY request_count DESC;
Application:
Optimize regional content delivery for better user experience.
17. Backend Health Check
Category: Performance Optimization
Objective: Monitor unhealthy target status codes.
SELECT target_status_code, COUNT(*) AS count
FROM alb_logs
GROUP BY target_status_code
ORDER BY count DESC;
Application:
Fix failing backend services promptly.
18. Correlating User Sessions
Category: Business Insights
Objective: Track the number of requests per session.
SELECT session_id, COUNT(*) AS request_count
FROM alb_logs
GROUP BY session_id
ORDER BY request_count DESC
LIMIT 10;
Application:
Understand user activity and session behavior.
19. HTTP Version Usage
Category: Business Insights
Objective: Analyze traffic by HTTP version.
SELECT http_version, COUNT(*) AS request_count
FROM alb_logs
GROUP BY http_version
ORDER BY request_count DESC;
Application:
Evaluate HTTP/2 or HTTP/3 adoption rates for performance improvements.
20. Frequent Error Reasons
Category: Error Analysis
Objective: List top error reasons from requests.
SELECT error_reason, COUNT(*) AS error_count
FROM alb_logs
GROUP BY error_reason
ORDER BY error_count DESC
LIMIT 10;
Application:
Address common causes of request failures to enhance reliability.
I hope these examples give you a solid starting point for exploring the power of Amazon Athena in analyzing AWS ALB logs. Whether you’re optimizing performance, troubleshooting errors, or gaining business insights, these queries can help you uncover valuable patterns and actionable data from your logs.
If you find these examples helpful and want to dive deeper, my book, Analyzing AWS Load Balancer Logs: A Quick Guide with Over 80 Essential Queries Using Amazon Athena, offers a comprehensive collection of strategies and tips to take your log analysis skills to the next level.
As always, I’d love to hear your feedback, questions, or any creative ways you’ve used these queries in your projects. Feel free to leave a comment below or connect with me.
RELATED POSTS
View all
Leave a Reply