doc for removing worker flag from cp on UPI #165
							
								
								
									
								docs/doc-remove-worker-flag.md
									
									
									
									
									
							| @ -0,0 +1,56 @@ | |||||||
|  | ## **Remove Worker flag from OKD Control Planes**  | ||||||
|  | 
 | ||||||
|  | ### **Context** | ||||||
|  | On OKD user-provisioned infrastructure (UPI), control plane nodes can carry the label `node-role.kubernetes.io/worker`, which allows non-critical workloads to be scheduled on them. | ||||||
|  | 
 | ||||||
|  | ### **Observed Symptoms** | ||||||
|  | - After servers are added to the HAProxy backend, every backend server appears down | ||||||
|  | - Traffic is directed to the control plane nodes instead of the workers | ||||||
|  | - The `router-default` pods are incorrectly scheduled on the control plane nodes rather than on the workers | ||||||
|  | - Other pods are scheduled on the control plane nodes, causing cluster instability | ||||||
|  | 
 | ||||||
|  | ``` | ||||||
|  |   ss -tlnp | grep 80 | ||||||
|  | ``` | ||||||
|  | - The output shows the `haproxy` process listening on 0.0.0.0:80 on the control plane nodes | ||||||
|  | - The same problem occurs on port 443 | ||||||
|  | - In the `rook-ceph` namespace, certain pods are deployed on control plane nodes rather than on worker nodes | ||||||
|  | 
 | ||||||
|  | ### **Cause** | ||||||
|  | - When installing with UPI, the roles (master, worker) are not managed by the Machine Config Operator, and the control plane nodes are made schedulable by default. | ||||||
|  | 
 | ||||||
|  | ### **Diagnostic** | ||||||
|  | Check the node labels: | ||||||
|  | ``` | ||||||
|  |    oc get nodes --show-labels | grep control-plane | ||||||
|  | ``` | ||||||
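The label check above can be narrowed to just the affected nodes. The following is a minimal sketch, assuming the default `oc get nodes` column layout (NAME, STATUS, ROLES, ...); the helper name `dual_role_nodes` is hypothetical and not part of `oc`.

```shell
# Hypothetical helper: reads `oc get nodes` output on stdin and prints the
# names of nodes whose ROLES column lists both a control plane role
# (master or control-plane) and the worker role.
dual_role_nodes() {
  awk 'NR > 1 && $3 ~ /worker/ && $3 ~ /(master|control-plane)/ { print $1 }'
}
```

Usage: `oc get nodes | dual_role_nodes`. An empty result means no control plane node carries the worker role.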
|  | Inspect the kubelet configuration on a control plane node: | ||||||
|  | 
 | ||||||
|  | ``` | ||||||
|  | cat /etc/systemd/system/kubelet.service | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | Find the line: | ||||||
|  | ``` | ||||||
|  |    --node-labels=node-role.kubernetes.io/control-plane,node-role.kubernetes.io/master,node-role.kubernetes.io/worker | ||||||
|  | ``` | ||||||
|  |    → The presence of the `worker` label confirms the problem. | ||||||
|  | 
 | ||||||
|  | Verify that the label does not come from the Machine Config Operator (MCO): | ||||||
|  | ``` | ||||||
|  |    oc get machineconfig | grep rendered-master | ||||||
|  | ``` | ||||||
|  | 
 | ||||||
|  | ### **Solution** | ||||||
|  | To make the control plane nodes non-schedulable, patch the cluster scheduler resource: | ||||||
|  | 
 | ||||||
|  | ```	 | ||||||
|  | oc patch scheduler cluster --type merge -p '{"spec":{"mastersSchedulable":false}}' | ||||||
|  | ``` | ||||||
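To confirm the patch took effect, the field can be read back from the scheduler resource. This is a small sketch rather than an official procedure; the function name `verify_masters_unschedulable` is hypothetical, and it treats an unset field as `false`, which is the documented default for `mastersSchedulable`.

```shell
# Hypothetical check: reads spec.mastersSchedulable from the cluster
# scheduler resource and fails if it is still set to true.
verify_masters_unschedulable() {
  state="$(oc get scheduler cluster -o jsonpath='{.spec.mastersSchedulable}')"
  if [ "$state" = "true" ]; then
    echo "control plane nodes are still schedulable"
    return 1
  fi
  echo "control plane nodes are not schedulable"
}
```

Run `verify_masters_unschedulable` after the patch and before draining.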
|  | After the patch is applied, the existing workloads can be moved off the control plane nodes by draining them: | ||||||
|  | 
 | ||||||
|  | ``` | ||||||
|  | oc adm cordon <cp-node> | ||||||
|  | oc adm drain <cp-node> --ignore-daemonsets --delete-emptydir-data | ||||||
|  | ``` | ||||||
|  | 
 | ||||||