doc for removing worker flag from cp on UPI #165
56
docs/doc-remove-worker-flag.md
Normal file
@ -0,0 +1,56 @@
## **Remove Worker flag from OKD Control Planes**

### **Context**

On OKD user-provisioned infrastructure (UPI), the control plane nodes can carry the label `node-role.kubernetes.io/worker`, which allows non-critical workloads to be scheduled on the control planes.

### **Observed Symptoms**

- After adding the HAProxy servers to the backend, each backend appears down
- Traffic is routed to the control planes instead of the workers
- The `router-default` pods are scheduled on the control planes rather than on the workers
- Pods are being scheduled on the control planes, causing cluster instability

```
ss -tlnp | grep 80
```

- The command above shows an `haproxy` process listening on `0.0.0.0:80` on the control planes
- The same problem occurs on port 443
- In the `rook-ceph` namespace, certain pods are deployed on the control planes rather than on the worker nodes

### **Cause**

- When installing with UPI, the roles (master, worker) are not managed by the Machine Config Operator, and the control planes are made schedulable by default.

### **Diagnostic**

Check the node labels:

```
oc get nodes --show-labels | grep control-plane
```
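
Grepping the full label dump can be noisy. As a minimal sketch, a label selector can list only the nodes that carry both the control-plane and the worker role label; an empty result would mean that no control plane is labeled as a worker. The function name `cp_nodes_with_worker_label` is a hypothetical helper, not part of `oc`. It assumes `oc` is logged in to the cluster and that the control planes carry `node-role.kubernetes.io/control-plane` (some releases may only set `node-role.kubernetes.io/master`, in which case the selector would need adjusting).

```shell
# Hypothetical helper: print the control plane nodes that also carry the worker label.
# A comma in a label selector means AND; a bare key matches on label existence.
cp_nodes_with_worker_label() {
  oc get nodes \
    -l 'node-role.kubernetes.io/control-plane,node-role.kubernetes.io/worker' \
    -o name
}
```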

Inspect the kubelet configuration on a control plane node:

```
cat /etc/systemd/system/kubelet.service
```

Find the line:

```
--node-labels=node-role.kubernetes.io/control-plane,node-role.kubernetes.io/master,node-role.kubernetes.io/worker
```

→ The presence of the `worker` label confirms the problem.
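
The kubelet check can also be scripted instead of read by eye. The sketch below pulls the `--node-labels` argument out of a unit file and tests whether the worker role is among the labels. `kubelet_has_worker_label` is a hypothetical helper name, and the sketch assumes the argument sits on one line without embedded spaces, as in the line shown above.

```shell
# Hypothetical helper: succeed if the kubelet unit file passed as $1
# lists the worker role in its --node-labels argument.
kubelet_has_worker_label() {
  grep -o -- '--node-labels=[^ ]*' "$1" \
    | tr ',=' '\n\n' \
    | grep -qxF 'node-role.kubernetes.io/worker'
}

# Intended to run on a control plane node; the path is the one inspected above.
unit=/etc/systemd/system/kubelet.service
if [ -r "$unit" ] && kubelet_has_worker_label "$unit"; then
  echo "worker label is set by the kubelet unit"
fi
```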

Verify that the label does not come from the Machine Config Operator (MCO):

```
oc get machineconfig | grep rendered-master
```
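
To confirm the scheduling symptoms described earlier, it can help to see which node each pod landed on. The following is a sketch only: `pods_with_nodes` is a hypothetical helper, and `openshift-ingress` is assumed to be the namespace that hosts the `router-default` pods.

```shell
# Hypothetical helper: print "<pod> <node>" for every pod in namespace $1.
pods_with_nodes() {
  oc get pods -n "$1" --no-headers \
    -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName
}

# Example calls (openshift-ingress is an assumption, see the note above):
#   pods_with_nodes openshift-ingress
#   pods_with_nodes rook-ceph
```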

### **Solution**

To make the control planes non-schedulable, patch the cluster scheduler resource:

```
oc patch scheduler cluster --type merge -p '{"spec":{"mastersSchedulable":false}}'
```
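
As a quick sanity check, the patched field can be read back. This is a sketch with a hypothetical helper name, `masters_schedulable`; after a successful patch it is expected to print `false`.

```shell
# Hypothetical helper: print the current value of spec.mastersSchedulable.
masters_schedulable() {
  oc get scheduler cluster -o jsonpath='{.spec.mastersSchedulable}'
}
```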

After the patch is applied, the workloads can be moved off the control planes by draining the nodes:

```
oc adm cordon <cp-node>
oc adm drain <cp-node> --ignore-daemonsets --delete-emptydir-data
```
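
With several control planes, the drain can be looped. The sketch below handles one node at a time and stops at the first failed drain. `drain_control_planes` is a hypothetical helper, and the final `oc adm uncordon` is an assumption that is not part of the steps above: a drained node stays cordoned, so the sketch assumes each control plane should become schedulable again for control plane components once the patch has taken effect.

```shell
# Hypothetical helper: drain every control plane node, one at a time.
drain_control_planes() {
  for node in $(oc get nodes -l node-role.kubernetes.io/control-plane -o name); do
    name="${node#node/}"   # "-o name" prints "node/<name>"; keep only <name>
    oc adm drain "$name" --ignore-daemonsets --delete-emptydir-data || return 1
    # Assumption: lift the cordon left behind by the drain.
    oc adm uncordon "$name"
  done
}
```

Draining control planes one by one, and checking cluster health between nodes, is likely safer than running the loop unattended.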