ELB Health Checks Overview
Elastic Load Balancing (ELB) health checks ensure that traffic is routed only to healthy targets. The load balancer periodically probes each registered target (such as an EC2 instance) and tracks its state. If a target is deemed unhealthy, the load balancer stops routing traffic to it until it becomes healthy again.
This overview covers three types of load balancers in AWS:
- Application Load Balancer (ALB)
- Network Load Balancer (NLB)
- Classic Load Balancer (CLB)
Each type of load balancer supports health checks, but the configurations differ slightly between them. Let’s focus on the general health check setup and common attributes.
Health Check Configuration
Health checks can be configured for the target group in the case of an ALB or NLB. In the case of CLB, the health checks are applied at the load balancer level, and there are some minor differences.
Here are the key components of health checks for ELBs:
- Protocol:
- The protocol defines how the health check is performed. The supported protocols are:
- HTTP and HTTPS (ALB, NLB, and CLB)
- TCP (NLB and CLB)
- SSL (CLB)
- Port:
- The port specifies which port the load balancer should use for health checks. This can either be the default port (e.g., port 80 for HTTP or port 443 for HTTPS) or a custom port defined for the service on the target instance.
- Path (HTTP/HTTPS checks):
- For HTTP/HTTPS health checks, you can specify a URL path (e.g., /health or /status) to check if the application is healthy. By default, an ALB expects an HTTP 200 response from this path, and other status codes count as healthy only if they are listed in the success codes. A CLB accepts only 200. Any other response is treated as a failed check.
- Success Codes:
- The HTTP status codes that indicate a healthy target, given as a single value (200), a comma-separated list (200,202), or a range (200-299). The ALB default is 200.
- Interval:
- The frequency (in seconds) with which the load balancer checks the health of a target. The default is 30 seconds, but it can be configured to be more frequent or less frequent.
- Timeout:
- The amount of time (in seconds) to wait for a response from a target before considering the health check as failed. The default is 5 seconds for ALB and CLB; NLB defaults depend on the health check protocol.
- Unhealthy Threshold:
- The number of consecutive failed health checks before the target is considered unhealthy. For example, if this is set to 2, the target must fail two consecutive health checks before it is deemed unhealthy.
- Healthy Threshold:
- The number of consecutive successful health checks before the target is considered healthy again. This ensures that a target isn’t prematurely marked healthy after a single success.
- Response Time:
- There is no separate response-time threshold. Response time matters only through the timeout: a target that does not respond within the timeout fails that health check.
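To see how the interval, timeout, and thresholds interact, it can help to estimate how long a state change takes to register. The following is a rough sketch, assuming evenly spaced checks from a single health checker; the function names are illustrative and not part of any AWS API.

```python
def seconds_to_mark_unhealthy(interval: int, unhealthy_threshold: int, timeout: int) -> int:
    """Approximate worst-case delay before a broken target is marked unhealthy.

    Assumes the target breaks right after a passing check, so a full interval
    elapses before each failing check, and the last failing check only counts
    once its timeout has expired.
    """
    return interval * unhealthy_threshold + timeout


def seconds_to_mark_healthy(interval: int, healthy_threshold: int) -> int:
    """Approximate delay before a recovered target is marked healthy again.

    Assumes successful checks return almost immediately, so only the
    spacing between checks matters.
    """
    return interval * healthy_threshold


# Example with a 30 s interval, 5 s timeout, thresholds of 2 and 5.
print(seconds_to_mark_unhealthy(30, 2, 5))  # 65
print(seconds_to_mark_healthy(30, 5))       # 150
```

Under these assumptions, lowering the interval shortens both delays, at the cost of more health check traffic to each target.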
Health Check Behavior Across Different ELB Types
1. Application Load Balancer (ALB) Health Checks
- Protocol: HTTP/HTTPS
- Path: You can specify a custom URL path (e.g., /healthcheck).
- Success Codes: Default is 200. You can also define specific success codes or a range such as 200-299.
- Default Configuration:
- Interval: 30 seconds
- Timeout: 5 seconds
- Unhealthy threshold: 2 failures
- Healthy threshold: 5 successes
ALBs use HTTP/HTTPS health checks to monitor the health of applications running behind them, and the health check URL can be customized to test specific application endpoints.
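ALB success codes can be written as a single value, a comma-separated list, or a range. The sketch below shows one way such a matcher string could be interpreted; `parse_success_codes` and `is_success` are hypothetical helpers for illustration, not how the load balancer itself is implemented.

```python
def parse_success_codes(matcher: str) -> set[int]:
    """Expand a matcher such as "200", "200,202" or "200-299" into a set of codes."""
    codes: set[int] = set()
    for part in matcher.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            codes.update(range(int(low), int(high) + 1))  # range is inclusive
        else:
            codes.add(int(part))
    return codes


def is_success(status: int, matcher: str = "200") -> bool:
    """Return True if the HTTP status counts as a passing health check."""
    return status in parse_success_codes(matcher)


print(is_success(200))             # True
print(is_success(301))             # False
print(is_success(204, "200-299"))  # True
```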
2. Network Load Balancer (NLB) Health Checks
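- Protocol: TCP, HTTP, or HTTPS
- Port: By default, health checks use the port on which the target receives traffic; a different port can be specified.
- Default Configuration (per current AWS documentation; verify for your region and target group type):
- Interval: 30 seconds
- Timeout: 10 seconds for TCP checks
- Unhealthy threshold: 2 failures
- Healthy threshold: 5 successes
By default, NLBs perform TCP health checks, which determine whether a connection to the target can be opened. HTTP and HTTPS checks are also available when an application-level signal is needed.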
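A TCP health check passes when a connection can be opened within the timeout. The sketch below reproduces that idea from a client machine, which can be useful for testing reachability by hand; `tcp_check` is a hypothetical helper, and the timeout value is only illustrative.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection performs the TCP handshake; we close right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Note that this only confirms that something is listening on the port. It says nothing about whether the application behind it is working.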
3. Classic Load Balancer (CLB) Health Checks
- Protocol: HTTP/HTTPS/TCP/SSL
- Path: Similar to ALBs, CLBs can use a URL path for HTTP/HTTPS health checks.
- Default Configuration:
- Interval: 30 seconds
- Timeout: 5 seconds
- Unhealthy threshold: 2 failures
- Healthy threshold: 10 successes
CLBs are typically used for legacy applications and allow both TCP and HTTP/HTTPS health checks.
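On a CLB, the protocol, port, and path are combined into a single target string such as HTTP:80/health. The sketch below builds a health check dictionary in the shape accepted by the `HealthCheck` argument of boto3's `elb` client method `configure_health_check`; the helper name `clb_health_check` is an assumption for illustration, and no AWS call is made here.

```python
def clb_health_check(protocol: str, port: int, path: str = "",
                     interval: int = 30, timeout: int = 5,
                     unhealthy_threshold: int = 2,
                     healthy_threshold: int = 10) -> dict:
    """Build a Classic Load Balancer health check definition."""
    protocol = protocol.upper()
    if protocol in ("HTTP", "HTTPS"):
        if not path.startswith("/"):
            raise ValueError("HTTP/HTTPS checks need a path starting with '/'")
        target = f"{protocol}:{port}{path}"
    elif protocol in ("TCP", "SSL"):
        target = f"{protocol}:{port}"  # connection-only checks take no path
    else:
        raise ValueError(f"unsupported protocol: {protocol}")
    return {
        "Target": target,
        "Interval": interval,
        "Timeout": timeout,
        "UnhealthyThreshold": unhealthy_threshold,
        "HealthyThreshold": healthy_threshold,
    }


print(clb_health_check("http", 80, "/health")["Target"])  # HTTP:80/health
```

With credentials configured, the result could be passed as `HealthCheck=` together with a `LoadBalancerName`, which reflects that CLB health checks belong to the load balancer rather than to a target group.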
Health Check Example for ALB:
Here is an example of how you might configure health checks for an Application Load Balancer (ALB):
- Protocol: HTTP
- Path: /health (or another application-specific health check path)
- Port: 80 (or another application port)
- Success Codes: 200
- Interval: 30 seconds
- Timeout: 5 seconds
- Unhealthy Threshold: 2
- Healthy Threshold: 5
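The same settings can be expressed as keyword arguments for boto3's `elbv2` client. This is a sketch only: the parameter names below are those used by `create_target_group` and `modify_target_group`, but the dictionary is not sent anywhere, and values such as the target group name or VPC would still have to be supplied.

```python
# Health check settings from the example above, keyed by the elbv2 parameter names.
alb_health_check = {
    "HealthCheckProtocol": "HTTP",
    "HealthCheckPort": "80",          # a string; "traffic-port" is also accepted
    "HealthCheckPath": "/health",
    "HealthCheckIntervalSeconds": 30,
    "HealthCheckTimeoutSeconds": 5,
    "UnhealthyThresholdCount": 2,
    "HealthyThresholdCount": 5,
    "Matcher": {"HttpCode": "200"},   # success codes
}

# Sanity check: a check should time out before the next one is due.
timeout = alb_health_check["HealthCheckTimeoutSeconds"]
interval = alb_health_check["HealthCheckIntervalSeconds"]
print(timeout < interval)  # True
```

With credentials configured, these could be unpacked into a call such as `client.modify_target_group(TargetGroupArn=..., **alb_health_check)`, where the ARN is a placeholder for your own target group.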
How ELB Health Checks Work:
- Healthy Target: The load balancer routes traffic to targets (EC2 instances, containers, etc.) that pass the health check criteria, meaning the targets respond with the expected HTTP status or meet the specified network criteria.
- Unhealthy Target: If a target fails the health check multiple times (based on the unhealthy threshold), the load balancer stops routing traffic to that target. The target remains unhealthy until it passes a number of consecutive successful health checks, as defined by the healthy threshold.
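The threshold logic described above can be sketched as a small state machine that counts consecutive results contradicting the current state. This is a simplified model, assuming a target that starts out healthy; real targets also pass through other states (for example while first registering), and `TargetHealth` is a hypothetical class, not an AWS type.

```python
from dataclasses import dataclass


@dataclass
class TargetHealth:
    unhealthy_threshold: int = 2
    healthy_threshold: int = 5
    healthy: bool = True
    streak: int = 0  # consecutive results that contradict the current state

    def record(self, check_passed: bool) -> bool:
        """Feed in one health check result and return the resulting state."""
        if check_passed == self.healthy:
            self.streak = 0  # result agrees with current state: reset the count
        else:
            self.streak += 1
            limit = self.healthy_threshold if check_passed else self.unhealthy_threshold
            if self.streak >= limit:
                self.healthy = check_passed  # enough consecutive results: flip state
                self.streak = 0
        return self.healthy


target = TargetHealth()
checks = [False, True, False, False, True, True, True, True, True]
print([target.record(c) for c in checks])
# [True, True, True, False, False, False, False, False, True]
```

In this run the single early failure is ignored because a success follows it, two failures in a row mark the target unhealthy, and it takes five consecutive successes to bring it back.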
Troubleshooting Health Check Failures:
If an instance is marked as unhealthy, you should consider the following steps to troubleshoot:
- Check the application logs: Verify that the application running on the instance is functioning correctly and responding to requests as expected.
- Review the health check path and response: Ensure the path specified for the health check is correct (for ALBs and CLBs), and that it returns the expected status codes.
- Ensure firewall settings allow health check traffic: The instance’s security group and network ACLs must allow inbound traffic from the load balancer on the health check port. If they block it, the health check will fail even when the application is running.
- Monitor instance resource utilization: If the instance is under heavy load, it may be slow to respond to health checks, leading to failure.
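When reviewing the health check path and response, it can help to issue the same request by hand from a machine that can reach the instance. The sketch below uses only the Python standard library; `probe` is a hypothetical helper, and the URL in the usage note is a placeholder. One caveat: `urllib` follows redirects, whereas a load balancer evaluates the first response it receives, so a 3xx from the target would look different here.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0, success_codes=(200,)) -> tuple:
    """Request the health check URL and report (passed, detail)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code  # 4xx/5xx responses arrive as exceptions
    except (urllib.error.URLError, OSError) as err:
        # Connection refused, DNS failure, timeout, and similar problems.
        return False, f"no response: {err}"
    return status in success_codes, f"HTTP {status}"
```

For example, `probe("http://10.0.0.12/health")` (with your instance's address substituted) returns a pair such as `(True, "HTTP 200")`. A "no response" result points toward network or firewall issues, while an unexpected status code points toward the application or the configured path.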
Summary
ELB health checks ensure that only healthy targets receive traffic. They can be tuned through the protocol, port, path, interval, timeout, and thresholds. The setup varies slightly between Application Load Balancer (ALB), Network Load Balancer (NLB), and Classic Load Balancer (CLB), but the core concept remains the same.
By properly configuring health checks, you ensure high availability and fault tolerance for your applications, as the load balancer will automatically reroute traffic away from unhealthy instances to healthy ones.