Running ali lxcfs as a DaemonSet
Updated by: HHH   Date: 2023-1-7


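  • At first I followed the relevant documentation and enabled privileged mode on the apiserver and the kubelet nodes with --allow-privileged=true, then followed the ali document step by step. It would not run at all. The issues on GitHub are mostly people asking why it will not start, and the replies lack detail, although one of them does mention that the host itself must support fuse.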
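
The fuse requirement mentioned in those issues can be checked on each host before deploying anything. The following is only a minimal sketch: the function names has_fuse_fs and has_fuse_dev are invented for illustration, and it assumes a Linux host where /proc/filesystems lists the registered filesystem types and /dev/fuse is the FUSE character device.

```shell
#!/bin/sh
# Hypothetical helpers, not part of lxcfs or lxcfs-initializer.

# Return 0 if the filesystem list (default: /proc/filesystems) contains
# the exact type "fuse". Lines look like "nodev<TAB>fuse" or "<TAB>fuseblk",
# so only the last field is compared.
has_fuse_fs() {
    fs_list="${1:-/proc/filesystems}"
    awk '$NF == "fuse" { found = 1 } END { exit !found }' "$fs_list"
}

# Return 0 if the given path (default: /dev/fuse) is a character device.
has_fuse_dev() {
    [ -c "${1:-/dev/fuse}" ]
}
```

One way to call it on a host is `has_fuse_fs && has_fuse_dev && echo "fuse looks usable"`. If fuse is built as a kernel module that has not been loaded yet, the first check may fail until `modprobe fuse` has been run.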

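  • Started troubleshooting on my own: searching turned up no relevant material, so I downloaded the source to build it and investigate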
    git clone https://github.com/denverdino/lxcfs-initializer.git

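  • The Dockerfile also shows that the libraries and other files it installs do not necessarily match the versions in my own environment. For now the goal is just to get the container running, then execute the contents of start.sh inside it by hand and see exactly where it fails.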

  • Changed lxcfs-image/Dockerfile to suit my environment, as follows
    FROM daocloud.io/centos:7.3.1611
    # Build dependencies for compiling lxcfs from source
    RUN yum -y install fuse fuse-devel pam-devel wget gcc automake autoconf libtool make
    ENV LXCFS_VERSION 2.0.8
    RUN wget https://linuxcontainers.org/downloads/lxcfs/lxcfs-$LXCFS_VERSION.tar.gz && \
        mkdir /lxcfs && tar xzvf lxcfs-$LXCFS_VERSION.tar.gz -C /lxcfs --strip-components=1 && \
        cd /lxcfs && ./configure && make && make install
    STOPSIGNAL SIGINT
    ADD start.sh /
    # Debugging only: keep the container alive instead of running start.sh
    CMD ["/bin/sleep","10000"]
  • Build the lxcfs:sleep image

    [root@ns-yun-020037 ~]# cd lxcfs-initializer/
    docker build -t lxcfs:sleep lxcfs-image
  • Take the original DaemonSet yaml file and simply change the image name to lxcfs:sleep
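
The image swap can also be scripted instead of edited by hand. This is a rough sketch only: set_image is a made-up helper, the manifest file name in the usage note is an assumption, and it rewrites every `image:` line it finds, so it only suits a manifest with a single container.

```shell
#!/bin/sh
# Hypothetical helper: print the manifest with every "image:" value
# replaced by the given image reference. The file itself is not modified.
set_image() {
    manifest="$1"
    image="$2"
    # Keep the indentation and optional "- " list marker, replace the value.
    sed -E "s|^([[:space:]]*(- )?image:[[:space:]]*).*|\\1${image}|" "$manifest"
}
```

For example, `set_image lxcfs-daemonset.yaml lxcfs:sleep > lxcfs-sleep.yaml` (both file names assumed). The image also has to be available on every node; the xxx:80/test/lxcfs name in the docker ps output further down suggests it was pushed to a private registry under that name.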

  • Enter the container on the node to locate the problem: running the commands from /start.sh one by one shows that the last step cannot find the lxcfs file
    [root@yun-020040 ~]# docker ps
    CONTAINER ID  IMAGE              COMMAND             CREATED         STATUS         PORTS  NAMES
    4e1cb10dd73e  xxx:80/test/lxcfs  "/bin/sleep 10000"  52 seconds ago  Up 51 seconds         k8s_lxcfs_lxcfs-4m5g7_default_b1306fd2-3bd4-11e9-bb5d-ec388f7928b2_0

    [root@yun-020040 ~]# docker exec -it 4e1cb10dd73e /bin/bash
    [root@lxcfs-4m5g7 /]#
    [root@lxcfs-4m5g7 /]# nsenter -m/proc/1/ns/mnt fusermount -u /var/lib/lxcfs 2> /dev/null || true
    [root@lxcfs-4m5g7 /]# nsenter -m/proc/1/ns/mnt [ -L /etc/mtab ] ||sed -i "/^lxcfs \/var\/lib\/lxcfs fuse.lxcfs/d" /etc/mtab
    [root@lxcfs-4m5g7 /]# mkdir -p /usr/local/lib/lxcfs /var/lib/lxcfs
    [root@lxcfs-4m5g7 /]# exec nsenter -m/proc/1/ns/mnt lxcfs /var/lib/lxcfs/
    nsenter: failed to execute lxcfs: No such file or directory
  • According to the Dockerfile, the container should in fact contain this file
    https://github.com/denverdino/lxcfs-initializer/blob/master/lxcfs-image/Dockerfile

  • Start the image directly with docker to see whether the problem persists: the start.sh commands run normally there

    [root@yun-020040 ~]# docker  run --privileged=true -it lxcfs:sleep  /bin/bash
    [root@10ca4ad41ce4 /]# nsenter -m/proc/1/ns/mnt fusermount -u /var/lib/lxcfs 2> /dev/null || true
    [root@10ca4ad41ce4 /]# nsenter -m/proc/1/ns/mnt [ -L /etc/mtab ] ||sed -i "/^lxcfs \/var\/lib\/lxcfs fuse.lxcfs/d" /etc/mtab
    [root@10ca4ad41ce4 /]# mkdir -p /usr/local/lib/lxcfs /var/lib/lxcfs
    [root@10ca4ad41ce4 /]# exec nsenter -m/proc/1/ns/mnt lxcfs /var/lib/lxcfs/
    hierarchies:
    0: fd: 5: perf_event
    1: fd: 6: hugetlb
    2: fd: 7: pids
    3: fd: 8: cpuacct,cpu
    4: fd: 9: blkio
    5: fd: 10: devices
    6: fd: 11: cpuset
    7: fd: 12: memory
    8: fd: 13: freezer
    9: fd: 14: net_prio,net_cls
    10: fd: 15: name=systemd
  • Looking back at the k8s yaml file: it mounts the host's /usr/local directory into the container, so the files seen there are the host's, as marked below
      volumeMounts:
      - name: cgroup
        mountPath: /sys/fs/cgroup
      - name: lxcfs
        mountPath: /var/lib/lxcfs
        mountPropagation: Bidirectional
      - name: usr-local
        mountPath: /usr/local          # <-- host /usr/local mounted here
    volumes:
    - name: cgroup
      hostPath:
        path: /sys/fs/cgroup
    - name: usr-local                  # <-- backed by the host directory
      hostPath:
        path: /usr/local
    - name: lxcfs
      hostPath:
        path: /var/lib/lxcfs
        type: DirectoryOrCreate
  • Removed the /usr/local mount from the yaml file to see whether the lxcfs binary inside the container would then be used. Verification failed.

  • Following this hint, installed lxcfs on the host as well and then ran the DaemonSet again. The program runs normally.
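
A likely explanation for the difference between the two runs is the mount namespace that `nsenter -m/proc/1/ns/mnt` enters. Assuming the DaemonSet pod runs with hostPID: true (an assumption about the manifest, which is not shown in full here), PID 1 inside the pod is the host's init process, so lxcfs is looked up in the host's filesystem. With a plain `docker run` and no `--pid=host`, PID 1 is the container's own process and the container's lxcfs is found. The sketch below compares the two namespaces. The function names and the PROC_ROOT override are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting mount namespaces.

# Print the mount-namespace identifier of a process, e.g. "mnt:[4026531840]".
# PROC_ROOT can point at an alternative proc tree; it defaults to /proc.
mnt_ns_of() {
    readlink "${PROC_ROOT:-/proc}/$1/ns/mnt"
}

# Return 0 if PID 1 and the current process share a mount namespace,
# 1 if they differ, 2 if either namespace link cannot be read.
pid1_shares_mnt_ns() {
    ns_init=$(mnt_ns_of 1) || return 2
    ns_self=$(mnt_ns_of self) || return 2
    [ "$ns_init" = "$ns_self" ]
}
```

Run as root inside the container: if `pid1_shares_mnt_ns` returns 1, nsenter will switch to a different filesystem view than the one the container image provides, which would match the "No such file or directory" error above.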

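Test result
This project only hands the startup of the host's lxcfs process over to a DaemonSet so that it can be managed uniformly. The host still has to provide the lxcfs binary, its lib files, and so on.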
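
Given that conclusion, a small preflight on each host can confirm that the binary is present and, once the DaemonSet is up, that lxcfs is actually mounted. This is a sketch under assumptions: the helper names are invented, the default mount point /var/lib/lxcfs and the fuse.lxcfs type are taken from the mtab pattern in start.sh above, and the directories in the usage note are guesses about where `make install` or a package would place the binary.

```shell
#!/bin/sh
# Hypothetical preflight helpers for a host that should run lxcfs.

# Return 0 if an executable named lxcfs exists in any of the given directories.
has_lxcfs_binary() {
    for dir in "$@"; do
        [ -x "$dir/lxcfs" ] && return 0
    done
    return 1
}

# Return 0 if the mounts table (default: /proc/mounts) lists a fuse.lxcfs
# mount at the given mount point (default: /var/lib/lxcfs).
lxcfs_mounted() {
    mount_point="${1:-/var/lib/lxcfs}"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mount_point" \
        '$2 == mp && $3 == "fuse.lxcfs" { found = 1 } END { exit !found }' \
        "$mounts_file"
}
```

A possible call is `has_lxcfs_binary /usr/local/bin /usr/bin || echo "install lxcfs on this host first"`, followed by `lxcfs_mounted` after the DaemonSet pod has started.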

  • Looking further at its initializer code, main.go: it makes mounting the lxcfs directories into containers more convenient.

    flag.StringVar(&annotation, "annotation", defaultAnnotation, "The annotation to trigger initialization")
    flag.StringVar(&initializerName, "initializer-name", defaultInitializerName, "The initializer name")
    flag.StringVar(&namespace, "namespace", "default", "The configuration namespace")
    flag.BoolVar(&requireAnnotation, "require-annotation", true, "Require annotation for initialization")
    flag.Parse()
    
    log.Println("Starting the Kubernetes initializer...")
    log.Printf("Initializer name set to: %s", initializerName)
    
    clusterConfig, err := rest.InClusterConfig()
    if err != nil {
        log.Fatal(err.Error())
    }
    
    clientset, err := kubernetes.NewForConfig(clusterConfig)
    if err != nil {
        log.Fatal(err)
    }
    
    // -v /var/lib/lxcfs/proc/cpuinfo:/proc/cpuinfo:rw
    // -v /var/lib/lxcfs/proc/diskstats:/proc/diskstats:rw
    // -v /var/lib/lxcfs/proc/meminfo:/proc/meminfo:rw
    // -v /var/lib/lxcfs/proc/stat:/proc/stat:rw
    // -v /var/lib/lxcfs/proc/swaps:/proc/swaps:rw
    // -v /var/lib/lxcfs/proc/uptime:/proc/uptime:rw
    c := &config{
        volumeMounts: []corev1.VolumeMount{
            corev1.VolumeMount{
                Name:      "lxcfs-proc-cpuinfo",
                MountPath: "/proc/cpuinfo",
            },
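
For comparison, the docker -v mounts listed in the comments of the excerpt above can be generated for a manual test without Kubernetes. This is a minimal sketch: lxcfs_volume_flags is a made-up helper name, and it assumes lxcfs is already mounted at /var/lib/lxcfs on the host.

```shell
#!/bin/sh
# Hypothetical helper: print the docker "-v" flags that overlay the
# lxcfs-backed files onto /proc inside a container, mirroring the
# comment block in main.go.
lxcfs_volume_flags() {
    lxcfs_root="${1:-/var/lib/lxcfs}"
    for f in cpuinfo diskstats meminfo stat swaps uptime; do
        printf '%s %s/proc/%s:/proc/%s:rw ' '-v' "$lxcfs_root" "$f" "$f"
    done
    printf '\n'
}
```

It could be used as `docker run -it -m 256m $(lxcfs_volume_flags) <image> /bin/bash`, where the memory limit and the image are placeholders. Judging from the excerpt, the initializer appears to add equivalent VolumeMount entries (such as lxcfs-proc-cpuinfo) to pods, so that none of this has to be written by hand.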

References:
https://www.alibabacloud.com/blog/kubernetes-demystified%3A-using-lxcfs-to-improve-container-resource-visibility_594109?spm=a2c41.12195345.0.0
https://github.com/denverdino/lxcfs-initializer
