Operating a Ceph cluster
Latest revision as of 14:52, 8 March 2018
Where to operate?
All operations should be done on the ceph-admin machine. In our experimental testbed, it is cephq1.wn.iihe.ac.be.
Check Ceph cluster status
The command:
ceph -s
shows the current status of the Ceph cluster:
[root@cephq1 ~]# ceph -s
    cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
     health HEALTH_OK
     monmap e1: 3 mons at {cephq2=192.168.41.2:6789/0,cephq3=192.168.41.3:6789/0,cephq4=192.168.41.4:6789/0}
            election epoch 8, quorum 0,1,2 cephq2,cephq3,cephq4
     osdmap e78: 6 osds: 6 up, 6 in
      pgmap v1293: 192 pgs, 2 pools, 0 bytes data, 0 objects
            27920 kB used, 4021 GB / 4106 GB avail
                 192 active+clean
The following command displays a real-time summary of the cluster status, including major events:
ceph -w
Remove OSDs
When you want to remove a machine that contains OSDs (for example, when decommissioning old equipment that is out of warranty), there is a manual procedure to follow in order to do things cleanly and avoid problems:
- Identify the OSDs hosted by the machine with the command:
ceph osd tree
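The ids of the OSDs belonging to one host can also be picked out of the `ceph osd tree` output with a small helper. A minimal sketch, assuming the column layout (ID WEIGHT TYPE NAME ...) shown by our cluster's version of Ceph; `osds_on_host` is a hypothetical name of our own:

```shell
# Sketch: print the OSD ids listed under one host in "ceph osd tree"
# output (assumed columns: ID WEIGHT TYPE NAME ...).
# Usage: ceph osd tree | osds_on_host cephq2
osds_on_host() {
    awk -v host="$1" '
        $3 == "host" && $4 == host { in_host = 1; next }  # start of our host block
        in_host && $3 == "host"    { in_host = 0 }        # another host ends it
        in_host && $3 ~ /^osd\./   { sub(/^osd\./, "", $3); print $3 }
    '
}
```

On a real cluster, pipe the live `ceph osd tree` output into the function instead of a saved copy.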
- Take the OSDs out of the cluster:
Before you remove an OSD, it is usually up and in. You need to take it out of the cluster so that Ceph can begin rebalancing and copying its data to other OSDs.
First set its weight to 0 in the crushmap:
ceph osd crush reweight osd.{osd-num} 0.0
Repeat this operation for all the OSDs on the machine.
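The per-OSD reweighting can be wrapped in a small loop instead of repeating the command by hand. A minimal sketch, assuming bash; the `reweight_to_zero` name and the `CEPH` variable are our own additions, so that `CEPH=echo` gives a dry run:

```shell
# Hypothetical helper: set the crush weight of each listed OSD to 0.
# Setting CEPH=echo turns it into a dry run that only prints the commands.
reweight_to_zero() {
    local ceph_cmd="${CEPH:-ceph}" id
    for id in "$@"; do
        $ceph_cmd osd crush reweight "osd.${id}" 0.0
    done
}
# e.g. reweight_to_zero 3 4 5   for the ids found with "ceph osd tree"
```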
- Monitor the data migration:
Once you have taken the OSDs out of the cluster, Ceph will begin rebalancing the cluster by migrating placement groups out of the OSDs you've removed. You can follow this process with the following command:
ceph -w
You should see the placement group states change from active+clean to active, then some degraded objects, and finally active+clean again when the migration completes. Once the rebalancing is finished, take the OSD out of the cluster:
ceph osd out {osd-num}
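Rather than watching `ceph -w` by hand, the wait can be scripted. A hedged sketch, assuming that `ceph health` reports HEALTH_OK again once the rebalance has finished (on a cluster carrying other warnings you would inspect the pgmap counts instead); the function name and the `CEPH` override hook are our own:

```shell
# Hypothetical helper: block until the cluster reports HEALTH_OK.
wait_until_healthy() {
    local ceph_cmd="${CEPH:-ceph}"
    until $ceph_cmd health | grep -q '^HEALTH_OK'; do
        sleep 10    # poll interval, adjust to taste
    done
}
```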
- Stop the OSD daemons:
After you take an OSD out of the cluster, it may still be running. That is, the OSD may be up and out. You must stop your OSD before you remove it from the configuration:
ssh {osd-host} /etc/init.d/ceph stop osd.{osd-num}
(Repeat the last command for all the OSDs on the machine.) As a result, a "ceph -s" should show the OSDs as down.
- Remove the OSDs:
- Remove the OSDs from the crush map:
ceph osd crush remove osd.{osd-num}
- Remove the OSD authentication key:
ceph auth del osd.{osd-num}
- Remove the OSDs from the OSD map:
ceph osd rm {osd-num}
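The three removal commands above can be grouped per OSD. A minimal sketch; `purge_osd` is a name of our own, and `CEPH=echo` previews the commands without touching the cluster:

```shell
# Hypothetical helper: run the three removal steps for one OSD id,
# in the order given above, stopping at the first failure.
purge_osd() {
    local ceph_cmd="${CEPH:-ceph}" id="$1"
    $ceph_cmd osd crush remove "osd.${id}" &&
    $ceph_cmd auth del "osd.${id}" &&
    $ceph_cmd osd rm "${id}"
}
```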