I had a service specification that assigned all free SSDs to OSDs:
service_type: osd
service_id: dashboard-tintin-7634852880
service_name: osd.dashboard-tintin-7634852880
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: false
  filter_logic: AND
  objectstore: bluestore
I wanted more control over which drives get assigned on each server, so I created new per-host specifications like the following:
service_type: osd
service_id: dashboard-tintin-1715222958508
service_name: osd.dashboard-tintin-1715222958508
placement:
  host_pattern: 'host1'
spec:
  data_devices:
    rotational: false
  filter_logic: AND
  objectstore: bluestore
In Ceph Dashboard -> Services I could see that my old OSD daemons continued to run under the control of the old service definition, so I deleted the old service definition and got a warning:

If osd.dashboard-tintin-7634852880 is removed then the following OSDs will remain, --force to proceed anyway ...

Since keeping those daemons running is exactly what I wanted, I continued with --force. Now Ceph Dashboard -> Services lists the OSDs as "unmanaged" and the new service definitions still have not picked them up. How can I move these OSD daemons under the new service specifications?
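(For reference, the CLI equivalent of the removal I did in the dashboard would be something like the command below; the service name is the one from the warning above.)

# Remove the old catch-all OSD spec; --force acknowledges the warning
# that its running OSD daemons will be left behind.
ceph orch rm osd.dashboard-tintin-7634852880 --force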
If I stop the daemons, new ones do not get started by the new service definition. If I redeploy the daemons, they still show as "unmanaged". The only way I can get them to move under the new service definition is to stop each daemon and zap its drive, but that is not a practical solution given the size of the cluster.
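For reference, these are roughly the commands behind those attempts; the OSD id and device path are placeholders, not my real values:

# Stopping a leftover daemon: nothing re-creates it under the new spec.
ceph orch daemon stop osd.12

# Redeploying it: it comes back, but still shows as "unmanaged".
ceph orch daemon redeploy osd.12

# The only thing that works: remove the daemon and wipe its disk so the
# new spec re-creates the OSD from scratch (which means rebuilding the data).
ceph orch daemon rm osd.12 --force
ceph orch device zap ceph-pn-osd1 /dev/sdX --force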
Given that the data is present and correct, I am surprised there is no way to bring stray daemons to heel. (I have looked at the docs about stray daemons, but they only cover the context of converting an existing cluster to cephadm.)
This is part of my ceph orch ls osd --export output:
service_type: osd
service_id: dashboard-tintin-1706434852880
service_name: osd.dashboard-tintin-1706434852880
unmanaged: true
spec:
  filter_logic: AND
  objectstore: bluestore
---
service_type: osd
service_id: dashboard-tintin-1715222958508
service_name: osd.dashboard-tintin-1715222958508
placement:
  host_pattern: ceph-pn-osd1
spec:
  data_devices:
    rotational: false
  filter_logic: AND
  objectstore: bluestore
---
service_type: osd
service_id: dashboard-tintin-1712545397532
service_name: osd.dashboard-tintin-1712545397532
placement:
  host_pattern: ceph-pn-osd2
spec:
  data_devices:
    rotational: false
  filter_logic: AND
  objectstore: bluestore
---
service_type: osd
service_id: dashboard-tintin-1706421419210
service_name: osd.dashboard-tintin-1706421419210
placement:
  host_pattern: ceph-pn-osd3
spec:
  data_devices:
    rotational: false
  filter_logic: AND
  objectstore: bluestore
---
service_type: osd
service_id: dashboard-tintin-1706421419211
service_name: osd.dashboard-tintin-1706421419211
placement:
  host_pattern: ceph-pn-osd4
spec:
  data_devices:
    rotational: false
  filter_logic: AND
  objectstore: bluestore
---
service_type: osd
service_id: dashboard-tintin-1706425693555
service_name: osd.dashboard-tintin-1706425693555
placement:
  host_pattern: ceph-pn-osd5
spec:
  data_devices:
    rotational: false
  filter_logic: AND
  objectstore: bluestore
One way would be to modify the unit.run file for an OSD and point it to the service_name you want. Depending on the orchestrator's refresh interval, the modified service_name will show up after a couple of minutes. This may seem a little hacky, but since it's unclear why your current OSD service specs don't seem to work, it would be a reasonable workaround.
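A minimal sketch of that workaround, assuming the old service_id string actually appears in the daemon's unit files (the file layout and contents vary between cephadm versions, so check with grep first; the fsid and OSD id below are placeholders, and the two service ids are taken from the specs above):

# cephadm keeps per-daemon files under /var/lib/ceph/<fsid>/<daemon>/
cd /var/lib/ceph/<fsid>/osd.12

# See which unit file(s) reference the old service before editing anything.
grep -l 'dashboard-tintin-1706434852880' unit.*

# Swap the old service id for the id of the per-host spec that should own
# this OSD (here the ceph-pn-osd1 spec), in unit.run and any other file
# the grep above matched.
sed -i 's/dashboard-tintin-1706434852880/dashboard-tintin-1715222958508/g' unit.run

Once the orchestrator refreshes its inventory, the daemon should be reported under the new service (e.g. in Ceph Dashboard -> Services or ceph orch ls).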