Rancher (2): Setting Up and Configuring Ceph RBD as Persistent Storage for K8s
Updated by: HHH   Date: 2023-1-7


1. Configure hosts and install ntp (optional)
2. Configure passwordless SSH
3. Configure the Ceph yum repository

vim /etc/yum.repos.d/ceph.repo

[ceph]
name=ceph
baseurl=http://mirrors.cloud.tencent.com/ceph/rpm-luminous/el7/x86_64/
gpgcheck=0
priority=1

[ceph-noarch]
name=cephnoarch
baseurl=http://mirrors.cloud.tencent.com/ceph/rpm-luminous/el7/noarch/
gpgcheck=0
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.cloud.tencent.com/ceph/rpm-luminous/el7/SRPMS
enabled=0   
gpgcheck=1
type=rpm-md
gpgkey=http://mirrors.cloud.tencent.com/ceph/keys/release.asc
priority=1
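
For scripted setups, the repo file can also be written non-interactively instead of through vim. The following is only a minimal sketch: the function name write_ceph_repo and its optional path argument are made up for illustration, and it writes just the two binary repos from the listing above, not the source repo.

```shell
#!/bin/sh
# Hypothetical helper: write the Ceph repo definition to a file.
# The target path defaults to the standard yum repo directory.
write_ceph_repo() {
    target="${1:-/etc/yum.repos.d/ceph.repo}"
    base="http://mirrors.cloud.tencent.com/ceph/rpm-luminous/el7"
    # Unquoted EOF so that $base is expanded inside the heredoc.
    cat > "$target" <<EOF
[ceph]
name=ceph
baseurl=$base/x86_64/
gpgcheck=0
priority=1

[ceph-noarch]
name=cephnoarch
baseurl=$base/noarch/
gpgcheck=0
priority=1
EOF
}
```

Calling write_ceph_repo /tmp/ceph.repo first makes it possible to inspect the result before writing to the real location as root.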

4. Install ceph-deploy

yum update
yum install ceph-deploy

5. Install the cluster

If the installation fails, the following commands clear the configuration so that you can start over:


ceph-deploy purgedata {ceph-node} [{ceph-node}]
ceph-deploy forgetkeys

The following command also removes the Ceph packages:


ceph-deploy purge {ceph-node} [{ceph-node}]

mkdir -p /root/cluster
cd /root/cluster/
ceph-deploy new yj-ceph2

If it fails with:
Traceback (most recent call last):
  File "/usr/bin/ceph-deploy", line 18, in <module>
    from ceph_deploy.cli import main
  File "/usr/lib/python2.7/site-packages/ceph_deploy/cli.py", line 1, in <module>
    import pkg_resources
ImportError: No module named pkg_resources

install python-setuptools:

yum install python-setuptools

Change the default replica count in the Ceph configuration file from 3 to 2, so that the cluster can reach the active + clean state with only two OSDs.

vim ceph.conf 

[global]
fsid = 8764fad7-a8f0-4812-b4db-f1a65af66e4a
mon_initial_members = ceph2,ceph3
mon_host = 192.168.10.211,192.168.10.212
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 2
mon clock drift allowed = 5
mon clock drift warn backoff = 30

ceph-deploy install yj-ceph2 yj-ceph3
ceph-deploy mon create-initial

ceph-deploy osd create --data /dev/vdb yj-ceph2
ceph-deploy osd create --data /dev/vdb yj-ceph3

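Use ceph-deploy to copy the configuration file and the admin keyring to the admin node and the Ceph nodes, so that you no longer have to specify the monitor address and ceph.client.admin.keyring every time you run a Ceph command.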

ceph-deploy admin yj-ceph2 yj-ceph3

ceph osd tree
ceph-deploy mgr create yj-ceph2

ceph health
ceph -s

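A Ceph cluster can have multiple pools. Each pool is a logical isolation unit, and different pools can handle data in completely different ways, for example in their replica size, placement groups, CRUSH rules, snapshots, and owner.
Before creating a pool you usually need to override the default pg_num. The official recommendation is:

Fewer than 5 OSDs: set pg_num to 128.
5 to 10 OSDs: set pg_num to 512.
10 to 50 OSDs: set pg_num to 4096.
More than 50 OSDs: use pgcalc to work out the value.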

osd pool default pg num = 128
osd pool default pgp num = 128

ceph osd pool create k8s-pool 128 128
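
The pg_num rule of thumb listed above can be written as a small lookup. This is a sketch under assumptions: the function name suggest_pg_num is hypothetical, and because the list leaves the boundary values ambiguous, this version puts exactly 10 OSDs in the 512 bucket and exactly 50 in the 4096 bucket.

```shell
#!/bin/sh
# Hypothetical helper: map an OSD count to the pg_num suggested in this article.
# Prints the number, or "pgcalc" when the count is above 50 and the value
# should be worked out with the pgcalc tool instead.
suggest_pg_num() {
    osds="$1"
    if [ "$osds" -lt 5 ]; then
        echo 128
    elif [ "$osds" -le 10 ]; then
        # Assumption: exactly 10 OSDs falls in the lower bucket.
        echo 512
    elif [ "$osds" -le 50 ]; then
        echo 4096
    else
        echo pgcalc
    fi
}
```

For the two-OSD cluster built here, suggest_pg_num 2 prints 128, which matches the value passed to ceph osd pool create.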

The admin key needs to be stored in K8s as a Secret, preferably in the default namespace:
ceph auth get-key client.admin | base64
Replace the key value below with the output of that command:

vim ceph-secret-admin.yaml

apiVersion: v1
kind: Secret
metadata:
   name: ceph-secret-admin
type: "kubernetes.io/rbd"
data:
   key: QVFBTHhxxxxxxxxxxFpRQmltbnBDelRkVmc9PQ==

kubectl apply -f ceph-secret-admin.yaml
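
To avoid pasting the key by hand, the manifest can be generated from the key directly. The sketch below assumes a helper named render_ceph_secret, which is not part of Ceph or Kubernetes; it base64-encodes whatever key it is given and prints a manifest with the same fields as the file above.

```shell
#!/bin/sh
# Hypothetical helper: print a Secret manifest for a given Ceph key.
# $1 = secret name, $2 = key as printed by `ceph auth get-key`.
render_ceph_secret() {
    name="$1"
    # Values under `data:` must be base64-encoded; tr removes any
    # line wrapping that base64 may add to long output.
    encoded=$(printf '%s' "$2" | base64 | tr -d '\n')
    cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: $name
type: "kubernetes.io/rbd"
data:
  key: $encoded
EOF
}
```

One possible use is render_ceph_secret ceph-secret-admin "$(ceph auth get-key client.admin)" | kubectl apply -f -, which keeps the key out of any file on disk.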

Rancher reports the following error:

MountVolume.SetUp failed for volume "pvc-a2754739-cf6f-11e7-a7a5-02e985942c89" :
rbd: map failed exit status 2 2017-11-22 12:35:53.503224 7f0753c66100 -1 did not load config file,
using default settings. libkmod: ERROR ../libkmod/libkmod.c:586 kmod_search_moddep:
could not open moddep file '/lib/modules/4.9.45-rancher/modules.dep.bin' modinfo: ERROR:
Module alias rbd not found. modprobe:
ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file
'/lib/modules/4.9.45-rancher/modules.dep.bin' modprobe:
FATAL: Module rbd not found in directory /lib/modules/4.9.45-rancher rbd: failed to load rbd kernel module (1)
rbd: sysfs write failed In some cases useful info is found in syslog - try "dmesg | tail" or so. rbd: map failed:
(2) No such file or directory

New nodes need the Ceph client installed and the Ceph configuration files in place:

yum install ceph-common

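Configure the users:
copy ceph.client.admin.keyring, ceph.client.kube.keyring, ceph.client.test.keyring, and ceph.conf to /etc/ceph/.

The rbd map error above occurs because the container cannot access /lib/modules. Add the following to the RKE configuration: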

services:
  etcd:
    backup_config:
        enabled: true
        interval_hours: 6
        retention: 60
  kubelet:
    extra_binds:
      - "/lib/modules:/lib/modules"

Then run:

rke up --config rancher-cluster.yml

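Another problem with using Ceph from Rancher:

The StorageClass (sc) was set up,
but a deployment that created its volume through a PVC reported an error.

The gist was that the Ceph map had failed.

Hmm.

In the end it turned out that an image had to be mapped manually once on every node; after that the error did not come back. Annoying.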

rbd create foo --size 1024 --image-feature=layering -p test
rbd map foo -p test

Expanding a Ceph RBD image:

Find the ID of the image that needs to be expanded, then resize it on the Ceph side:

rbd resize --size 2048 kubernetes-dynamic-pvc-572a74e9-db6a-11e9-9b3a-525400e65297 -p test

Update the PV configuration to the matching size and restart the corresponding container.
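
Since rbd treats a bare --size value as megabytes while PV sizes are usually written in Gi, it is easy to get the two out of step. The following sketch uses a hypothetical helper, rbd_resize_cmd, that only prints the resize command for a size given in GiB so that it can be checked before being run.

```shell
#!/bin/sh
# Hypothetical helper: print the rbd resize command for a size given in GiB.
# rbd interprets a --size value without a suffix as megabytes,
# so the GiB figure is multiplied by 1024.
# $1 = pool, $2 = image name, $3 = new size in GiB.
rbd_resize_cmd() {
    size_mb=$(( $3 * 1024 ))
    echo "rbd resize --size $size_mb $2 -p $1"
}
```

For example, rbd_resize_cmd test foo 2 prints a command with --size 2048, and the PV would then be set to 2Gi to match.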
