Sébastien Han

Stacker! Cepher! What's next?

Check the cluster status

Normal output:

$ ceph -s
health HEALTH_OK
monmap e1: 3 mons at {0=10.19.0.173:6789/0,1=10.19.0.176:6789/0,2=10.19.0.178:6789/0}, election epoch 14, quorum 0,1,2 0,1,2
osdmap e70: 3 osds: 3 up, 3 in
pgmap v8616: 592 pgs: 592 active+clean; 53934 MB data, 105 GB used, 451 GB / 556 GB avail
mdsmap e1: 0/0/1 up
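
The monmap line above lists the monitors and the current election epoch; if you want to look at the quorum on its own, the monitors can be queried directly (output omitted here):

$ ceph mon stat          # one-line summary of the monitors and the current quorum
$ ceph quorum_status     # detailed quorum information as JSON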

Short version:

$ ceph health
HEALTH_OK
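
Because the short form prints a single line, it lends itself to quick scripting. A minimal sketch for a cron-style check (the echo is only a placeholder for your own alerting command):

$ ceph health | grep -q HEALTH_OK || echo "cluster needs attention"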

Watch mode:

$ ceph -w
...
...
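
If you only want the status summary refreshed rather than the live cluster log, a plain watch works too:

$ watch -n 1 'ceph -s'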

Way more:

$ ceph health detail
HEALTH_WARN 217 pgs peering; 221 pgs stuck inactive; 221 pgs stuck unclean
pg 1.d is stuck peering, last acting [4,2]
pg 0.c is stuck peering, last acting [1,0]
pg 1.a is stuck peering, last acting [4,2]
pg 0.b is stuck peering, last acting [4,2]
pg 1.b is stuck peering, last acting [1,0]
...
...
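
From there, any of the placement groups listed can be queried individually to see why it is stuck (the pgid below is taken from the output above):

$ ceph pg 1.d query     # dumps the full PG state, peering and recovery details included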

Global view of your cluster:

$ ceph osd tree
dumped osdmap tree epoch 115
# id    weight  type name   up/down reweight
-1  3   pool default
-3  3       rack unknownrack
-2  1           host ceph01
0   1               osd.0   up  1
-4  1           host ceph02
1   1               osd.1   up  1
-5  1           host ceph03
2   1               osd.2   up  1
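
The tree reflects the CRUSH hierarchy; to inspect the full CRUSH map you can export and decompile it (the file paths are only examples):

$ ceph osd getcrushmap -o /tmp/crushmap
$ crushtool -d /tmp/crushmap -o /tmp/crushmap.txt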

Check your replication level:

$ ceph osd dump | grep size
pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 192 pgp_num 192 last_change 1 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 crush_ruleset 1 object_hash rjenkins pg_num 192 pgp_num 192 last_change 1 owner 0
pool 2 'rbd' rep size 2 crush_ruleset 2 object_hash rjenkins pg_num 192 pgp_num 192 last_change 1 owner 0
pool 13 'bench' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 44 owner 18446744073709551615
pool 14 'aaa' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 69 owner 18446744073709551615
pool 16 'seb' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 74 owner 18446744073709551615
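
Every pool above runs with 2 replicas; to raise the replication level of a given pool, set its size attribute (pool name taken from the dump above, a size of 3 is just an example):

$ ceph osd pool set data size 3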

Useful links: