root69

Asked: 2024-10-04 13:46:55 +0800 CST

AWS Application Load Balancer sends traffic to unhealthy target group

I have 3 instances (node-0, node-1, node-2), each running 2 services: a websocket service and an API.

Target Group Setup:

Target Group	Instance	Health Check Path
api-node-0	node-0	/some-path/api/v1/ping
api-node-1	node-1	/some-path/api/v1/ping
api-node-2	node-2	/some-path/api/v1/ping
websocket-node-0	node-0	/some-path/websocket/v1/ping
websocket-node-1	node-1	/some-path/websocket/v1/ping
websocket-node-2	node-2	/some-path/websocket/v1/ping

Listener and Rules:

HTTPS:443 Listener

Rules:

api

Condition: Path /some-path/api/*
Action: Forward to target group:
- api-node-0 (33.33%)
- api-node-1 (33.33%)
- api-node-2 (33.33%)
- Stickiness: Off

websocket

Condition: Path /some-path/websocket/*
Action: Forward to target group:
- websocket-node-0 (33.33%)
- websocket-node-1 (33.33%)
- websocket-node-2 (33.33%)
- Stickiness: Off

default

Condition: No other rule applies
Action: Forward to target group:
- api-node-0 (100%)
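
To make the weighted forward action concrete, here is a toy model of the api rule. It assumes a target group is picked purely by weight and does not model health at all; that is my reading of the rule configuration, not a confirmed description of how the ALB chooses. The names `API_RULE` and `pick_target_group` are made up for this sketch.

```python
import random

# Weights copied from the "api" rule above.
API_RULE = {"api-node-0": 33.33, "api-node-1": 33.33, "api-node-2": 33.33}


def pick_target_group(weights, rng):
    """Pick one target group name at random, proportionally to its weight.

    Assumption: health status plays no part in this choice.
    """
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


rng = random.Random(0)  # seeded so repeated runs give the same split
counts = {name: 0 for name in API_RULE}
for _ in range(3000):
    counts[pick_target_group(API_RULE, rng)] += 1

print(counts)  # roughly one third of the requests per target group
```

Under this model, each target group keeps receiving about a third of the requests no matter what its single target reports.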

Health Check attributes:

Interval: 30 seconds
Timeout: 5 seconds
Healthy threshold: 2 consecutive health check successes
Unhealthy threshold: 2 consecutive health check failures
Success codes: 200
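
As a rough guide to how long a test has to wait before a target can change state with these settings, here is a small sketch. The formula is my own estimate, not something taken from AWS documentation: the lower bound assumes the first failing check happens immediately, and the upper bound assumes a full interval passes first and the last check runs into the timeout. `state_change_window` is a hypothetical helper name.

```python
def state_change_window(interval_s, timeout_s, threshold):
    """Estimate (min, max) seconds until `threshold` consecutive checks complete.

    Assumption: checks are evenly spaced `interval_s` apart and each check
    takes at most `timeout_s` to report a result.
    """
    earliest = (threshold - 1) * interval_s
    latest = threshold * interval_s + timeout_s
    return earliest, latest


# Values from the health check attributes above.
print(state_change_window(interval_s=30, timeout_s=5, threshold=2))  # (30, 65)
```

With the values above, waiting a bit more than a minute after breaking the health check should be enough before sending test traffic.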

Load Balancer attributes:

HTTP client keepalive duration: 3600 seconds
Connection idle timeout: 60 seconds
X-Forwarded-For header: Append
Cross-zone load balancing: On

P.S. If you need any more information regarding the setup please let me know.

During normal testing, when all target groups are healthy, the ALB operates as expected. The issue arises when I simulate one of the services on a node becoming unhealthy. I changed the health check path of one target group, e.g. api-node-1; it shows up as unhealthy (Error 404), but traffic is still being sent to it. I confirmed this via both the access logs and CloudWatch metrics (RequestCountPerTarget). As another way of simulating an unhealthy group, I also tried blocking the ALB's access by removing the relevant security group from the instance (Error 400).

Testing methods (with an unhealthy target group): I used curl (10-20 times) or a Grafana k6 load test and monitored traffic in both the access logs and CloudWatch. Traffic was still being routed to all the instances, even though one of them was shown as unhealthy.
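
For anyone who wants to repeat the access log check, this is a minimal sketch of how requests per target can be tallied. It assumes the standard space-separated ALB access log layout, where the fifth field is `target:port`. The sample lines are invented and truncated for illustration (real entries carry more fields), and `requests_per_target` is a hypothetical helper, not part of any AWS tooling.

```python
import shlex
from collections import Counter


def requests_per_target(lines):
    """Count log entries per target:port (fifth field of each entry)."""
    counts = Counter()
    for line in lines:
        try:
            fields = shlex.split(line)  # keeps quoted fields such as the request together
        except ValueError:
            continue  # skip entries with unbalanced quotes
        if len(fields) < 5:
            continue  # not a complete entry
        counts[fields[4]] += 1
    return counts


# Invented, shortened sample entries; addresses and names are placeholders.
_TEMPLATE = (
    'https 2024-10-04T05:46:55.000000Z app/example-alb/0123456789abcdef '
    '203.0.113.10:50000 {target} 0.001 0.002 0.000 200 200 120 340 '
    '"GET https://example.com:443/some-path/api/v1/ping HTTP/1.1" "curl/8.0.0" '
    'TLS_AES_128_GCM_SHA256 TLSv1.3'
)
SAMPLE = [
    _TEMPLATE.format(target=t)
    for t in ("10.0.1.11:8080", "10.0.2.12:8080", "10.0.1.11:8080")
]

print(dict(requests_per_target(SAMPLE)))
# {'10.0.1.11:8080': 2, '10.0.2.12:8080': 1}
```

If the private IP of the unhealthy instance shows up in this tally for requests made after the target was marked unhealthy, traffic is still reaching it.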

You can find another question that discussed this issue linked here.
