Starting the Node Controller
if ctx.IsControllerEnabled(nodeControllerName) {
	// Error checks after each call are omitted in this excerpt.
	// Parse the cluster CIDR: the CIDR range for Pods in the cluster.
	_, clusterCIDR, err := net.ParseCIDR(s.ClusterCIDR)
	// Parse the service CIDR: the CIDR range for Services in the cluster.
	_, serviceCIDR, err := net.ParseCIDR(s.ServiceCIDR)
	// Create the NodeController instance.
	nodeController, err := nodecontroller.NewNodeController(
		sharedInformers.Core().V1().Pods(),
		sharedInformers.Core().V1().Nodes(),
		sharedInformers.Extensions().V1beta1().DaemonSets(),
		cloud,
		clientBuilder.ClientOrDie("node-controller"),
		s.PodEvictionTimeout.Duration,
		s.NodeEvictionRate,
		s.SecondaryNodeEvictionRate,
		s.LargeClusterSizeThreshold,
		s.UnhealthyZoneThreshold,
		s.NodeMonitorGracePeriod.Duration,
		s.NodeStartupGracePeriod.Duration,
		s.NodeMonitorPeriod.Duration,
		clusterCIDR,
		serviceCIDR,
		int(s.NodeCIDRMaskSize),
		s.AllocateNodeCIDRs,
		s.EnableTaintManager,
		utilfeature.DefaultFeatureGate.Enabled(features.TaintBasedEvictions),
	)
	// Call Run to start the controller.
	nodeController.Run()
	// Sleep for a random duration equal to
	// ControllerStartInterval + rand.Float64()*1.0*float64(ControllerStartInterval).
	// ControllerStartInterval is set with the kube-controller-manager
	// flag --controller-start-interval.
	time.Sleep(wait.Jitter(s.ControllerStartInterval.Duration, ControllerStartJitter))
}
The startup therefore comes down to two steps:
nodeController, err := nodecontroller.NewNodeController(...)
- Creates the NodeController instance.
nodeController.Run()
- Calls Run to start the controller.
Definition of NodeController
Before analyzing how NodeController works, it is worth looking at how it is defined. The complete definition is:
type NodeController struct {
	allocateNodeCIDRs bool
	cloud             cloudprovider.Interface
	clusterCIDR       *net.IPNet
	serviceCIDR       *net.IPNet
	knownNodeSet      map[string]*v1.Node
	kubeClient        clientset.Interface
	// Method for easy mocking in unittest.
	lookupIP func(host string) ([]net.IP, error)
	// Value used if sync_nodes_status=False. NodeController will not proactively
	// sync node status in this case, but will monitor node status updated from kubelet. If
	// it doesn't receive update for this amount of time, it will start posting "NodeReady==
	// ConditionUnknown". The amount of time before which NodeController start evicting pods
	// is controlled via flag 'pod-eviction-timeout'.
	// Note: be cautious when changing the constant, it must work with nodeStatusUpdateFrequency
	// in kubelet. There are several constraints:
	// 1. nodeMonitorGracePeriod must be N times more than nodeStatusUpdateFrequency, where
	//    N means number of retries allowed for kubelet to post node status. It is pointless
	//    to make nodeMonitorGracePeriod be less than nodeStatusUpdateFrequency, since there
	//    will only be fresh values from Kubelet at an interval of nodeStatusUpdateFrequency.
	//    The constant must be less than podEvictionTimeout.
	// 2. nodeMonitorGracePeriod can't be too large for user experience - larger value takes
	//    longer for user to see up-to-date node status.
	nodeMonitorGracePeriod time.Duration
	// Value controlling NodeController monitoring period, i.e. how often does NodeController
	// check node status posted from kubelet. This value should be lower than nodeMonitorGracePeriod.
	// TODO: Change node status monitor to watch based.
	nodeMonitorPeriod time.Duration
	// Value used if sync_nodes_status=False, only for node startup. When node
	// is just created, e.g. cluster bootstrap or node creation, we give a longer grace period.
	nodeStartupGracePeriod time.Duration
	// per Node map storing last observed Status together with a local time when it was observed.
	// This timestamp is to be used instead of LastProbeTime stored in Condition. We do this
	// to avoid the problem with time skew across the cluster.
	nodeStatusMap map[string]nodeStatusData
	now           func() metav1.Time
	// Lock to access evictor workers
	evictorLock sync.Mutex
	// workers that evicts pods from unresponsive nodes.
	zonePodEvictor map[string]*RateLimitedTimedQueue
	// workers that are responsible for tainting nodes.
	zoneNotReadyOrUnreachableTainer map[string]*RateLimitedTimedQueue
	podEvictionTimeout              time.Duration
	// The maximum duration before a pod evicted from a node can be forcefully terminated.
	maximumGracePeriod      time.Duration
	recorder                record.EventRecorder
	nodeLister              corelisters.NodeLister
	nodeInformerSynced      cache.InformerSynced
	daemonSetStore          extensionslisters.DaemonSetLister
	daemonSetInformerSynced cache.InformerSynced
	podInformerSynced       cache.InformerSynced
	// allocate/recycle CIDRs for node if allocateNodeCIDRs == true
	cidrAllocator CIDRAllocator
	// manages taints
	taintManager                *NoExecuteTaintManager
	forcefullyDeletePod         func(*v1.Pod) error
	nodeExistsInCloudProvider   func(types.NodeName) (bool, error)
	computeZoneStateFunc        func(nodeConditions []*v1.NodeCondition) (int, zoneState)
	enterPartialDisruptionFunc  func(nodeNum int) float32
	enterFullDisruptionFunc     func(nodeNum int) float32
	zoneStates                  map[string]zoneState
	evictionLimiterQPS          float32
	secondaryEvictionLimiterQPS float32
	largeClusterThreshold       int32
	unhealthyZoneThreshold      float32
	// if set to true NodeController will start TaintManager that will evict Pods from
	// tainted nodes, if they're not tolerated.
	runTaintManager bool
	// if set to true NodeController will taint Nodes with 'TaintNodeNotReady' and 'TaintNodeUnreachable'
	// taints instead of evicting Pods itself.
	useTaintBasedEvictions bool
}
NodeController behavior configuration
The NodeController struct is complex, with more than 30 fields. We focus on the following:
clusterCIDR
- Set via --cluster-cidr; the CIDR range for Pods in the cluster.
serviceCIDR
- Set via --service-cluster-ip-range; the CIDR range for Services in the cluster.
knownNodeSet
- The set of nodes that NodeController has observed.
nodeMonitorGracePeriod
- Set via --node-monitor-grace-period, default 40s: how long a Node may be unresponsive before it is marked unhealthy.
nodeMonitorPeriod
- Set via --node-monitor-period, default 5s: the period at which NodeController syncs NodeStatus.
nodeStatusMap
- Records the most recently observed Status of each Node.
zonePodEvictor
- Workers that evict pods from unresponsive nodes.
zoneNotReadyOrUnreachableTainer
- Workers that are responsible for tainting nodes.
podEvictionTimeout
- Set via --pod-eviction-timeout, default 5min: the grace period before Pods on a failed Node are deleted.
maximumGracePeriod
- The maximum duration before a pod evicted from a node can be forcefully terminated. Not configurable; hard-coded to 5min.
nodeLister
- The interface used to fetch Node data.
daemonSetStore
- The interface used to fetch DaemonSet data. When Pods are removed by eviction, every Pod on the Node that belongs to a DaemonSet is skipped.
taintManager
- A NoExecuteTaintManager object. When runTaintManager (default true) is true:
  - When the PodInformer and NodeInformer observe PodAdd, PodDelete, PodUpdate and NodeAdd, NodeDelete, NodeUpdate events, they call NoExecuteTaintManager.PodUpdated and NoExecuteTaintManager.NodeUpdated respectively.
  - These methods put the event on the matching queue (podUpdateQueue or nodeUpdateQueue), from which the TaintManager consumes it.
  - The TaintManager processes the events with handlePodUpdate and handleNodeUpdate.
  - The TaintManager's processing logic will be analyzed separately.
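forcefullyDeletePod
- NodeController uses this function to call the apiserver and force-delete a Pod. It removes Pods that were scheduled onto Nodes whose kubelet version is below v1.1.0, because kubelets before v1.1.0 do not support graceful termination.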
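computeZoneStateFunc
- Returns the number of NotReady Nodes in a Zone and the state of that Zone:
  - if the Zone has no Ready Node at all, the Zone state is FullDisruption;
  - if the share of unhealthy Nodes is greater than or equal to unhealthyZoneThreshold, the Zone state is PartialDisruption;
  - otherwise the Zone state is Normal.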
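enterPartialDisruptionFunc
- Compares the current node count with largeClusterThreshold. As described under largeClusterThreshold, the secondary eviction rate applies only when the count exceeds the threshold; at or below it the rate is 0.
enterFullDisruptionFunc
- Returns evictionLimiterQPS (default 0.1); see the evictionLimiterQPS entry below.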
zoneStates
- The state of each zone. Possible values: Initial, Normal, FullDisruption, PartialDisruption.
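evictionLimiterQPS
- Set via --node-eviction-rate, default 0.1: the number of Nodes per second from which Pods are evicted while the Zone is healthy, i.e. one Node every 10s.
secondaryEvictionLimiterQPS
- Set via --secondary-node-eviction-rate, default 0.01: the number of Nodes per second from which Pods are evicted while the Zone is unhealthy, i.e. one Node every 100s.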
largeClusterThreshold
- Set via --large-cluster-size-threshold, default 50: for a cluster of 50 nodes or fewer, secondary-node-eviction-rate is set to 0.
unhealthyZoneThreshold
- Set via --unhealthy-zone-threshold, default 0.55: a Zone is considered unhealthy once the share of unhealthy Nodes in it (at least 3 Nodes) reaches 0.55.
runTaintManager
- Set via --enable-taint-manager, default true. If true, NodeController starts the TaintManager, which evicts Pods from tainted Nodes when the Pods do not tolerate the taint.
useTaintBasedEvictions
- Set via --feature-gates, default TaintBasedEvictions=false; still an Alpha feature. If true, Pods are evicted by tainting Nodes.
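The entries for computeZoneStateFunc, the two disruption functions and the thresholds fit together as follows. This is a simplified sketch under stated assumptions: node readiness is reduced to a bool per node, and the names limits, computeZoneState, evictionRate and zone are invented for this example rather than taken from the Kubernetes source. The numeric values are the defaults quoted above.

```go
package main

import "fmt"

type zoneState string

const (
	stateInitial           zoneState = "Initial"
	stateNormal            zoneState = "Normal"
	stateFullDisruption    zoneState = "FullDisruption"
	statePartialDisruption zoneState = "PartialDisruption"
)

// limits holds the tunables from the list above (hypothetical type).
type limits struct {
	evictionQPS            float32 // --node-eviction-rate
	secondaryEvictionQPS   float32 // --secondary-node-eviction-rate
	largeClusterThreshold  int     // --large-cluster-size-threshold
	unhealthyZoneThreshold float32 // --unhealthy-zone-threshold
}

// computeZoneState takes one readiness flag per node in a zone and returns
// the number of not-ready nodes together with the zone state.
func computeZoneState(ready []bool, l limits) (int, zoneState) {
	notReady := 0
	for _, r := range ready {
		if !r {
			notReady++
		}
	}
	readyCount := len(ready) - notReady
	switch {
	case readyCount == 0 && notReady > 0:
		return notReady, stateFullDisruption
	// At least 3 unhealthy nodes are required, as noted for unhealthyZoneThreshold.
	case notReady >= 3 && float32(notReady)/float32(len(ready)) >= l.unhealthyZoneThreshold:
		return notReady, statePartialDisruption
	default:
		return notReady, stateNormal
	}
}

// evictionRate picks the rate for a zone: the secondary rate (or 0 for
// small clusters) under partial disruption, the primary rate otherwise.
func evictionRate(state zoneState, nodeNum int, l limits) float32 {
	if state == statePartialDisruption {
		if nodeNum > l.largeClusterThreshold {
			return l.secondaryEvictionQPS
		}
		return 0
	}
	return l.evictionQPS
}

// zone builds a test zone with the given number of not-ready nodes.
func zone(total, notReady int) []bool {
	z := make([]bool, total)
	for i := notReady; i < total; i++ {
		z[i] = true
	}
	return z
}

func main() {
	l := limits{evictionQPS: 0.1, secondaryEvictionQPS: 0.01, largeClusterThreshold: 50, unhealthyZoneThreshold: 0.55}
	for _, z := range [][]bool{zone(60, 40), zone(5, 3), zone(4, 4), zone(10, 1)} {
		n, s := computeZoneState(z, l)
		fmt.Println(len(z), n, s, evictionRate(s, len(z), l))
	}
}
```

The four sample zones cover a large partially disrupted zone, a small partially disrupted zone where eviction stops, a fully disrupted zone, and a healthy zone.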