BUG OKD "tcp server port" check is not enough when a node is half broken #163
Reference: NationTech/harmony#163
Seen in production:
cp0 node dies in a cluster
k get nodes shows node is dead
haproxy still sends traffic on ports 80 and 443 (probably 22623 too), because only the API server readyz health check failed, so only port 6443 was correctly marked as down.
We need to figure out a better production configuration for this health check. A "tcp server port" check is enough only when the server shuts down or the networking somehow becomes completely unavailable; it does not catch a half-broken node that still accepts connections.
The correct solution would probably be to perform a full HTTP request on all ports.
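To illustrate the difference between the two kinds of check, here is a minimal Python sketch. It is not harmony or haproxy code: the function names `tcp_check` and `http_check`, the default path `/`, and the timeouts are assumptions made for the example, and it speaks plain HTTP only (port 443 would need TLS on top). A node that still accepts connections but never answers passes the first check and fails the second.

```python
import http.client
import socket


def tcp_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection can be opened.

    This is all a plain port check verifies: something is listening.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(host: str, port: int, path: str = "/", timeout: float = 2.0) -> bool:
    """Return True only if the backend answers a GET with a 2xx or 3xx status.

    The path is a hypothetical placeholder; a real setup would use whatever
    health endpoint the service behind the port actually exposes.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
        return 200 <= status < 400
    except (OSError, http.client.HTTPException):
        # Covers connection refused, timeouts, and malformed responses.
        return False
    finally:
        conn.close()
```

Calling `tcp_check(host, 80)` and `http_check(host, 80)` against the same backend shows whether the port is merely open or actually serving requests.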