Note

This document is for a development version of Ceph.

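ceph-mgr orchestrator modules

Warning

This is developer documentation. It describes Ceph internals that are relevant only to people writing ceph-mgr orchestrator modules.

In this context, an orchestrator is an external service that provides the ability to discover devices and create Ceph services. This includes external projects such as Rook.

An orchestrator module is a ceph-mgr module (see the ceph-mgr module developer's guide) that implements common management operations using a particular orchestrator.

Orchestrator modules subclass the Orchestrator class. That class is an interface: it only provides definitions of methods that subclasses implement. The purpose of defining this common interface for different orchestrators is to let common UI code, such as the dashboard, work with a variety of backends.

Behind all the abstraction, the purpose of orchestrator modules is simple: to enable Ceph to do things like discover available hardware, create and destroy OSDs, and run MDS and RGW services.

A tutorial is not included here. For full and concrete examples, see the existing orchestrator modules implemented in the Ceph source tree.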

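Glossary

Stateful service: a daemon that uses local storage, such as an OSD or a mon.
Stateless service: a daemon that does not use any local storage, such as an MDS, RGW, nfs-ganesha, or iSCSI gateway.
Label: an arbitrary string tag that administrators can apply to hosts. Typically, administrators use labels to indicate which hosts should run which kinds of service. Labels are advisory (they come from human input) and do not guarantee that hosts have particular physical capabilities.
Drive group: a collection of block devices with a common/shared OSD formatting (typically one or more SSDs acting as journals/DBs for a group of HDDs).
Placement: the choice of which hosts are used to run a service.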

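Key concepts

The underlying orchestrator remains the source of truth for information about whether a service is running, where it is running, which hosts are available, and so on. Orchestrator modules should avoid duplicating this information internally and should read it directly from the orchestrator backend whenever possible.

Bootstrapping hosts and adding them to the underlying orchestration system is outside the scope of Ceph's orchestrator interface. Ceph can only work on hosts that the orchestrator already knows about.

Where possible, the placement of stateless services should be left up to the orchestrator.

Completions and batching

All methods that read or modify the state of the system can potentially be long running. Therefore, the module needs to schedule those operations.

Each orchestrator module implements its own underlying completion mechanism. This might involve running the underlying operations in threads, or batching the operations up before later executing them in one go in the background. If implementing such a batching pattern, the module does no work on an operation until it appears in a list of completions passed into process.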
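The batching pattern described above can be illustrated with a small, self-contained sketch. It does not use the real Ceph classes: `Completion`, `BatchingBackend`, and the body of `process` are hypothetical stand-ins, chosen only to show that no work happens until a completion is handed to `process`.

```python
from typing import Any, Callable, List, Optional


class Completion:
    """Hypothetical stand-in for an orchestrator completion object."""

    def __init__(self, operation: Callable[[], Any]) -> None:
        self._operation = operation
        self.result: Optional[Any] = None
        self.exception: Optional[Exception] = None
        self.finished = False

    def execute(self) -> None:
        # Run the deferred operation and record its result or its error.
        try:
            self.result = self._operation()
        except Exception as e:
            self.exception = e
        finally:
            self.finished = True


class BatchingBackend:
    """Hypothetical backend that defers all work until process() is called."""

    def add_host(self, hostname: str) -> Completion:
        # Only describe the work; nothing is sent to the backend yet.
        return Completion(lambda: f"Added host '{hostname}'")

    def process(self, completions: List[Completion]) -> None:
        # Execute every pending completion in one go.
        for completion in completions:
            if not completion.finished:
                completion.execute()


backend = BatchingBackend()
pending = [backend.add_host("host1"), backend.add_host("host2")]
nothing_ran_yet = not any(c.finished for c in pending)
backend.process(pending)
print(nothing_ran_yet, [c.result for c in pending])
```

A thread-based module would instead start the operation when the completion is created; the caller-facing contract (inspect the completion later) stays the same.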

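Error handling

The main goal of error handling within orchestrator modules is to provide debug information that helps users deal with deployment errors.

class orchestrator.OrchestratorError(msg, errno=-22, event_kind_subject=None)

General orchestrator-specific error.

Used for deployment, configuration, or user errors.

It is not intended for programming errors or orchestrator-internal errors.

class orchestrator.NoOrchestrator(msg='No orchestrator configured (try `ceph orch set backend`)'): No orchestrator is configured.

class orchestrator.OrchestratorValidationError(msg, errno=-22, event_kind_subject=None): Raised when an orchestrator doesn't support a specific feature.

In detail, orchestrators need to explicitly deal with different kinds of errors:

1. No orchestrator configured

   See NoOrchestrator.

2. An orchestrator doesn't implement a specific method

   For example, an orchestrator doesn't support add_host.

   In this case, a NotImplementedError is raised.

3. Missing features within implemented methods

   For example, optional parameters of a command that are not supported by the backend (such as the hosts field in the Orchestrator.apply_mons() command with the rook backend).

   See OrchestratorValidationError.

4. Input validation errors

   The orchestrator module and other calling modules should provide meaningful error messages.

   See OrchestratorValidationError.

5. Errors when actually executing commands

   The resulting Completion should contain an error string that assists in understanding the problem. In addition, Completion.is_errored() is set to True.

6. Invalid configuration in the orchestrator modules

   This can be handled in the same way as item 5.

All other errors are unexpected orchestrator issues and should therefore raise an exception, which is then logged to the mgr log file. If a completion object exists at that point, Completion.result() may contain an error message.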
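As a rough illustration of how a caller might tell these error categories apart, the following sketch uses stand-in exception classes that only mirror the documented constructor arguments. The `run_command` helper, the `NOT_SUPPORTED` return code, and the two example commands are hypothetical and are not part of the orchestrator API.

```python
from typing import Callable, Tuple

NOT_SUPPORTED = -95  # arbitrary code chosen for this sketch


class OrchestratorError(Exception):
    """Stand-in mirroring the documented signature (msg, errno=-22)."""

    def __init__(self, msg: str, errno: int = -22) -> None:
        super().__init__(msg)
        self.errno = errno


class OrchestratorValidationError(OrchestratorError):
    """Stand-in: an unsupported feature or invalid input."""


def run_command(func: Callable[[], str]) -> Tuple[int, str]:
    """Map the expected error categories onto a (return code, message) pair."""
    try:
        return 0, func()
    except NotImplementedError:
        # The backend does not implement the method at all.
        return NOT_SUPPORTED, "This orchestrator does not support this operation"
    except OrchestratorError as e:
        # Deployment, configuration, validation, or user errors.
        return e.errno, str(e)
    # Any other exception is unexpected and is left to propagate to the log.


def apply_mons_with_hosts() -> str:
    raise OrchestratorValidationError("'hosts' is not supported by this backend")


def add_host() -> str:
    raise NotImplementedError()


print(run_command(apply_mons_with_hosts))
print(run_command(add_host))
print(run_command(lambda: "ok"))
```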

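Excluded functionality

Ceph's orchestrator interface is not a general-purpose framework for managing Linux servers. It is deliberately constrained to managing only the services of a Ceph cluster.
Multipath storage is not handled (multipathing is unnecessary for Ceph clusters). Each drive is assumed to be visible only on a single host.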

Host management

Orchestrator.add_host(host_spec)

Add a host to the orchestrator inventory.

Parameters: host_spec -- specification of the host to add, including its hostname
Return type: OrchResult[str]

Orchestrator.remove_host(host, force, offline, rm_crush_entry)

Remove a host from the orchestrator inventory.

Parameters: host (str) -- hostname
Return type: OrchResult[str]

Orchestrator.get_hosts()

Report the hosts in the cluster.

Return type: OrchResult[List[HostSpec]]
Returns: list of HostSpec

Orchestrator.update_host_addr(host, addr)

Update a host's address.

Parameters:

host (str) -- hostname
addr (str) -- address (DNS name or IP)

Return type:

OrchResult[str]

Orchestrator.add_host_label(host, label)

Add a host label.

Return type: OrchResult[str]

Orchestrator.remove_host_label(host, label, force=False)

Remove a host label.

Return type: OrchResult[str]

class orchestrator.HostSpec(hostname, addr=None, labels=None, status=None, location=None, oob=None): Information about hosts, similar to the output of kubectl get nodes.

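Devices

Orchestrator.get_inventory(host_filter=None, refresh=False)

Returns something that was created by ceph-volume inventory.

Return type: OrchResult[List[InventoryHost]]
Returns: list of InventoryHost

class orchestrator.InventoryFilter(labels=None, hosts=None)

When fetching inventory, use this filter to avoid unnecessarily scanning the whole estate.

Typical uses:

Filter by host when presenting a UI workflow for configuring a particular server.
Filter by label when not all of the estate is Ceph servers and we only want to learn about the Ceph servers.
Filter by label when we are particularly interested in OSD servers.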
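To make the filtering idea concrete, here is a minimal sketch of how a backend might narrow down the hosts it scans. `InventoryFilter` below is a stand-in that only mirrors the documented constructor arguments; the `Host` record, the `select_hosts` helper, and the rule that a host matches when it carries any of the requested labels are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InventoryFilter:
    """Stand-in with the documented constructor arguments."""
    labels: Optional[List[str]] = None
    hosts: Optional[List[str]] = None


@dataclass
class Host:
    """Hypothetical host record used only for this example."""
    hostname: str
    labels: List[str] = field(default_factory=list)


def select_hosts(all_hosts: List[Host], flt: Optional[InventoryFilter]) -> List[str]:
    """Return the hostnames whose inventory should be scanned."""
    selected = []
    for host in all_hosts:
        # Skip hosts that are not explicitly requested by name.
        if flt and flt.hosts is not None and host.hostname not in flt.hosts:
            continue
        # Skip hosts that carry none of the requested labels.
        if flt and flt.labels is not None and not set(flt.labels) & set(host.labels):
            continue
        selected.append(host.hostname)
    return selected


estate = [Host("node1", ["osd"]), Host("node2", ["mon"]), Host("web1")]
print(select_hosts(estate, InventoryFilter(labels=["osd"])))
print(select_hosts(estate, InventoryFilter(hosts=["node2", "web1"])))
print(select_hosts(estate, None))
```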

class ceph.deployment.inventory.Devices(devices): A container for Device instances with reporting.

class ceph.deployment.inventory.Device(path, sys_api=None, available=None, rejected_reasons=None, lvs=None, device_id=None, lsm_data=None, created=None, ceph_device=None, crush_device_class=None, being_replaced=None)

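Placement

A Daemon Placement defines the placement of daemons for a specific service.

In general, stateless services do not require any specific placement rules, because they can run anywhere that sufficient system resources are available. However, some orchestrators may not include the functionality to choose a location in this way. Optionally, you can specify a location when creating a stateless service.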

class ceph.deployment.service_spec.PlacementSpec(label=None, hosts=None, count=None, count_per_host=None, host_pattern=None)

For APIs that need to specify a host subset

classmethod from_string(arg)

A single integer is parsed as a count

>>> PlacementSpec.from_string('3')
PlacementSpec(count=3)

A list of names is parsed as host specifications

>>> PlacementSpec.from_string('host1 host2')
PlacementSpec(hosts=[HostPlacementSpec(hostname='host1', network='', name=''), HostPlacementSpec(hostname='host2', network='', name='')])

You can also prefix the hosts with a count as follows

>>> PlacementSpec.from_string('2 host1 host2')
PlacementSpec(count=2, hosts=[HostPlacementSpec(hostname='host1', network='', name=''), HostPlacementSpec(hostname='host2', network='', name='')])

You can specify labels using label:<label>

>>> PlacementSpec.from_string('label:mon')
PlacementSpec(label='mon')

Labels also support a count

>>> PlacementSpec.from_string('3 label:mon')
PlacementSpec(count=3, label='mon')

You can specify a regex to match with regex:<regex>

>>> PlacementSpec.from_string('regex:Foo[0-9]|Bar[0-9]')
PlacementSpec(host_pattern=HostPattern(pattern='Foo[0-9]|Bar[0-9]', pattern_type=PatternType.regex))

If "regex:" is not provided, fnmatch is the default for a single string

>>> PlacementSpec.from_string('data[1-3]')
PlacementSpec(host_pattern=HostPattern(pattern='data[1-3]', pattern_type=PatternType.fnmatch))

>>> PlacementSpec.from_string(None)
PlacementSpec()

Return type: PlacementSpec

host_pattern: HostPattern: An fnmatch pattern used to select hosts. Can also be a single host.

pretty_str()

>>> 
... ps = PlacementSpec(...)  # For all placement specs:
... PlacementSpec.from_string(ps.pretty_str()) == ps

Return type: str
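The count, label, and host rules shown for from_string can be illustrated with a deliberately simplified parser. This is not the Ceph implementation: `parse_placement` is a hypothetical helper that returns a plain dict, and it ignores host patterns (regex and fnmatch), networks, and daemon names entirely.

```python
from typing import Any, Dict, Optional


def parse_placement(arg: Optional[str]) -> Dict[str, Any]:
    """Parse a small subset of the placement string syntax into a dict."""
    result: Dict[str, Any] = {}
    if not arg:
        return result
    tokens = arg.split()
    # An optional leading integer is the daemon count.
    if tokens[0].isdigit():
        result["count"] = int(tokens.pop(0))
    labels = [t[len("label:"):] for t in tokens if t.startswith("label:")]
    hosts = [t for t in tokens if not t.startswith("label:")]
    if len(labels) > 1:
        raise ValueError("more than one label given")
    if labels:
        result["label"] = labels[0]
    if hosts:
        # Every remaining token is treated as a plain hostname in this sketch.
        result["hosts"] = hosts
    return result


print(parse_placement("3"))
print(parse_placement("2 host1 host2"))
print(parse_placement("3 label:mon"))
print(parse_placement(None))
```

Unlike the real class, this sketch treats a single bare string as a hostname rather than as an fnmatch pattern.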

Services

class orchestrator.ServiceDescription(spec, container_image_id=None, container_image_name=None, service_url=None, last_refresh=None, created=None, deleted=None, size=0, running=0, events=None, virtual_ip=None, ports=[])

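For responding to queries about the status of a particular service, stateful or stateless.

This is not about health or performance monitoring of services. It is about letting the orchestrator tell Ceph whether and where a service is scheduled in the cluster. When an orchestrator tells Ceph "it's running on host123", that is not a promise that the process is literally up this second; it is a description of where the orchestrator has decided the service should run.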

class ceph.deployment.service_spec.ServiceSpec(service_type, service_id=None, placement=None, count=None, config=None, ssl=False, certificate_source=None, custom_sans=None, ssl_cert=None, ssl_key=None, unmanaged=False, preview_only=False, networks=None, targets=None, extra_container_args=None, extra_entrypoint_args=None, custom_configs=None, ip_addrs=None, ssl_ca_cert=None, termination_grace_period_seconds=None)

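Details of service creation.

Request to the orchestrator for a cluster of daemons such as MDS, RGW, iscsi gateway, nvmeof gateway, MON, MGR, Prometheus.

This structure is supposed to contain enough information to start the services.

classmethod from_json(cls, json_spec)

Initialize a 'ServiceSpec' object from a JSON structure.

There are two valid styles of service spec.

The "old" style: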

service_type: nfs
service_id: foo
pool: mypool
namespace: myns

and the "new" style:

service_type: nfs
service_id: foo
config:
  some_option: the_value
networks: [10.10.0.0/16]
spec:
  pool: mypool
  namespace: myns

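In https://tracker.ceph.com/issues/45321 we decided to favor the new style, as it is more readable and gives a better understanding of which fields are special to a given service type.

Note that we need to remain compatible with both styles for the next two major releases (octopus, pacific).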
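One way to picture the compatibility requirement is a normalization step that accepts either style and always produces the new one. The sketch below is only an illustration: `to_new_style` is a hypothetical helper, and the `COMMON_KEYS` set is an assumption made for this example rather than the list of top-level keys Ceph actually uses.

```python
from typing import Any, Dict

# Assumption: keys treated as common to all service types in this sketch.
COMMON_KEYS = {"service_type", "service_id", "placement", "unmanaged",
               "config", "networks", "spec"}


def to_new_style(json_spec: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of json_spec with service-specific keys nested under 'spec'."""
    normalized = {k: v for k, v in json_spec.items()
                  if k in COMMON_KEYS and k != "spec"}
    # Start from an existing 'spec' section, if the input is already new style.
    nested = dict(json_spec.get("spec") or {})
    for key, value in json_spec.items():
        if key not in COMMON_KEYS:
            # Old style: a service-specific field found at the top level.
            nested[key] = value
    if nested:
        normalized["spec"] = nested
    return normalized


old = {"service_type": "nfs", "service_id": "foo",
       "pool": "mypool", "namespace": "myns"}
new = {"service_type": "nfs", "service_id": "foo",
       "spec": {"pool": "mypool", "namespace": "myns"}}
print(to_new_style(old) == to_new_style(new))
```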

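Parameters: json_spec (Dict) -- a valid dict with a ServiceSpec
Return type: TypeVar(ServiceSpecT, bound= ServiceSpec)

networks: List[str]: A list of network identities instructing the daemons to bind only on the particular networks in that list. If the cluster is distributed across multiple networks, you can add multiple networks. See Networks and Ports and Specifying Networks.

placement: PlacementSpec: See Daemon Placement.

service_id: The name of the service. Required for iscsi, nvmeof, mds, nfs, osd, rgw, container, and ingress.

service_type: The type of the service. Must be a Ceph service (mon, crash, mds, mgr, osd, or rbd-mirror), a gateway (nfs or rgw), part of the monitoring stack (alertmanager, grafana, node-exporter, or prometheus), or container for custom containers.

unmanaged: If set to true, the orchestrator will not deploy or remove any daemon associated with this service. Placement and all other properties are ignored. This is useful if you temporarily do not want this service to be managed. For cephadm, see Disabling automatic deployment of daemons.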

Orchestrator.describe_service(service_type=None, service_name=None, refresh=False)

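Describe a service (of any kind) that is already configured in the orchestrator. For example, when viewing an OSD in the dashboard, we might also like to display details about the orchestrator's view of the service (such as the Kubernetes pod ID).

When viewing a CephFS filesystem in the dashboard, we would use this to display the pods currently running for the MDS daemons.

Return type: OrchResult[List[ServiceDescription]]
Returns: list of ServiceDescription objects.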

Orchestrator.service_action(action, service_name)

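Perform an action (for example start, stop, or restart) on a service, that is, on all daemons providing the logical service.

Parameters:

action (str) -- one of "start", "stop", "restart", "redeploy", "reconfig"
service_name (str) -- service_type + '.' + service_id (for example "mon", "mgr", "mds.mycephfs", "rgw.realm.zone")

Return type: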

OrchResult
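The service_name convention above (service_type, then a dot, then service_id, where the ID may itself contain dots) can be sketched with two small helpers. `split_service_name`, `join_service_name`, and `check_action` are hypothetical names introduced for this example; they are not part of the orchestrator interface.

```python
from typing import Optional, Tuple

VALID_ACTIONS = ("start", "stop", "restart", "redeploy", "reconfig")


def split_service_name(service_name: str) -> Tuple[str, Optional[str]]:
    """Split at the first dot only; the service_id may contain further dots."""
    service_type, sep, service_id = service_name.partition(".")
    return service_type, (service_id if sep else None)


def join_service_name(service_type: str, service_id: Optional[str] = None) -> str:
    """Build a service_name; services without an ID use the bare type."""
    return f"{service_type}.{service_id}" if service_id else service_type


def check_action(action: str) -> str:
    """Reject anything that is not one of the documented actions."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action {action!r}")
    return action


print(split_service_name("rgw.realm.zone"))
print(split_service_name("mon"))
print(join_service_name("mds", "mycephfs"))
```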

Orchestrator.remove_service(service_name, force=False)

Remove a service (a collection of daemons).

Return type: OrchResult[str]
Returns: None

Daemons

Orchestrator.list_daemons(service_name=None, daemon_type=None, daemon_id=None, host=None, refresh=False)

Describe a daemon (of any kind) that is already configured in the orchestrator.

Return type: OrchResult[List[DaemonDescription]]
Returns: list of DaemonDescription objects.

Orchestrator.remove_daemons(names)

Remove specific daemon(s).

Return type: OrchResult[List[str]]
Returns: None

Orchestrator.daemon_action(action, daemon_name, image=None, force=False)

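Perform an action (for example start, stop, or restart) on a daemon.

Parameters:

action (str) -- one of "start", "stop", "restart", "redeploy", "reconfig"
daemon_name (str) -- name of the daemon
image (Optional[str]) -- container image to use when redeploying that daemon

Return type: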

OrchResult

class orchestrator.DaemonDescription(daemon_type=None, daemon_id=None, hostname=None, container_id=None, container_image_id=None, container_image_name=None, container_image_digests=None, version=None, status=None, status_desc=None, last_refresh=None, created=None, started=None, last_configured=None, osdspec_affinity=None, last_deployed=None, events=None, is_active=False, memory_usage=None, memory_request=None, memory_limit=None, cpu_percentage=None, service_name=None, ports=None, ip=None, deployed_by=None, systemd_unit=None, rank=None, rank_generation=None, extra_container_args=None, extra_entrypoint_args=None, pending_daemon_config=False)

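For responding to queries about the status of a particular daemon, stateful or stateless.

This is not about health or performance monitoring of daemons. It is about letting the orchestrator tell Ceph whether and where a daemon is scheduled in the cluster. When an orchestrator tells Ceph "it's running on host123", that is not a promise that the process is literally up this second; it is a description of where the orchestrator has decided the daemon should run.

class orchestrator.DaemonDescriptionStatus(value): An enumeration.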

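OSD management

Orchestrator.create_osds(drive_group, skip_validation=False)

Create one or more OSDs within a single drive group.

The principal argument here is the drive_group member of OsdSpec. Other fields are advisory/extensible, for any finer-grained OSD feature enablement (choice of backing store, compression/encryption, and so on).

Return type: OrchResult[str]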

Orchestrator.blink_device_light(ident_fault, on, locations)

Instruct the orchestrator to enable or disable either the ident or the fault LED.

Parameters:

ident_fault (str) -- either "ident" or "fault"
on (bool) -- True = on.
locations (List[DeviceLightLoc]) -- see orchestrator.DeviceLightLoc

Return type:

OrchResult[List[str]]

class orchestrator.DeviceLightLoc(host, dev, path)

Describes a specific device on a specific host. Used for enabling or disabling LEDs on devices.

host: the hostname, as in orchestrator.Orchestrator.get_hosts()

dev: the device_id, for example ABC1234DEF567-1R1234_ABC8DE0Q. See ceph osd metadata | jq '.[].device_ids'

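OSD replacement

See Replacing an OSD for the underlying process.

Replacing OSDs is fundamentally a two-stage process, because users need to physically replace drives. The orchestrator therefore exposes this two-stage process.

Phase one is a call to Orchestrator.remove_daemons() with destroy=True in order to mark the OSD as destroyed.

Phase two is a call to Orchestrator.create_osds() with a drive group in which DriveGroupSpec.osd_id_claims is set to the destroyed OSD IDs.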

Services

Orchestrator.add_daemon(spec)

Create daemons for unmanaged services

Return type: OrchResult[List[str]]

Orchestrator.apply_mon(spec)

Update the mon cluster

Return type: OrchResult[str]

Orchestrator.apply_mgr(spec)

Update the mgr cluster

Return type: OrchResult[str]

Orchestrator.apply_mds(spec)

Update the MDS cluster

Return type: OrchResult[str]

Orchestrator.apply_rbd_mirror(spec)

Update the rbd-mirror cluster

Return type: OrchResult[str]

class ceph.deployment.service_spec.RGWSpec(service_type='rgw', service_id=None, placement=None, rgw_realm=None, rgw_zonegroup=None, rgw_zone=None, rgw_frontend_port=None, rgw_frontend_ssl_certificate=None, rgw_frontend_type=None, rgw_frontend_extra_args=None, unmanaged=False, ssl=False, certificate_source=None, ssl_cert=None, ssl_key=None, custom_sans=None, preview_only=False, config=None, networks=None, subcluster=None, extra_container_args=None, extra_entrypoint_args=None, custom_configs=None, only_bind_port_on_networks=False, rgw_realm_token=None, update_endpoints=False, zone_endpoints=None, zonegroup_hostnames=None, data_pool_attributes=None, rgw_user_counters_cache=False, rgw_user_counters_cache_size=None, rgw_bucket_counters_cache=False, rgw_bucket_counters_cache_size=None, generate_cert=False, disable_multisite_sync_traffic=None, wildcard_enabled=False, rgw_exit_timeout_secs=120, qat=None)

Settings to configure a (multisite) Ceph RGW

service_type: rgw
service_id: myrealm.myzone
spec:
    rgw_realm: myrealm
    rgw_zonegroup: myzonegroup
    rgw_zone: myzone
    ssl: true
    rgw_frontend_port: 1234
    rgw_frontend_type: beast
    rgw_frontend_ssl_certificate: ...

See also: Service Specification

Orchestrator.apply_rgw(spec)

Update the RGW cluster

Return type: OrchResult[str]

class ceph.deployment.service_spec.NFSServiceSpec(service_type='nfs', service_id=None, placement=None, unmanaged=False, preview_only=False, config=None, networks=None, ip_addrs=None, port=None, monitoring_networks=None, monitoring_ip_addrs=None, monitoring_port=None, virtual_ip=None, enable_nlm=False, enable_haproxy_protocol=False, extra_container_args=None, extra_entrypoint_args=None, idmap_conf=None, custom_configs=None, ssl=False, ssl_cert=None, ssl_key=None, ssl_ca_cert=None, certificate_source=None, custom_sans=None, tls_ktls=False, tls_debug=False, tls_min_version=None, tls_ciphers=None)

Orchestrator.apply_nfs(spec)

Update the NFS cluster

Return type: OrchResult[str]

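Upgrades

Orchestrator.upgrade_available()

Report on what versions are available to upgrade to

Return type: OrchResult
Returns: list of strings

Orchestrator.upgrade_start(image, version, daemon_types, hosts, services, limit)

Return type: OrchResult[str]

Orchestrator.upgrade_status()

If an upgrade is currently underway, report on where we are in the process, or whether some error has occurred.

Return type: OrchResult[UpgradeStatusSpec]
Returns: UpgradeStatusSpec instance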

class orchestrator.UpgradeStatusSpec

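Utility

Orchestrator.available()

Report whether we can talk to the orchestrator. This is the place to give the user a meaningful message if the orchestrator isn't running or can't be contacted.

This method may be called frequently (for example, on every page load to conditionally display a warning banner), so make sure it is not too expensive. It is okay to give a slightly stale status (for example, based on a periodic background ping of the orchestrator) if that is necessary to make this method fast.

Note

True doesn't mean that the desired functionality is actually available in the orchestrator. That is, the following won't work as expected: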

>>> 
... if OrchestratorClientMixin().available()[0]:  # wrong.
...     OrchestratorClientMixin().get_hosts()

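Return type: Tuple[bool, str, Dict[str, Any]]
Returns: a boolean describing whether the module is available/usable, a string describing any error, and a dict containing any module-specific information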
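The advice to keep available() cheap by serving a slightly stale status can be sketched as follows. `AvailabilityCache`, its `refresh` method, and the `probe` callable are hypothetical names for this example; a real module would call something like `refresh` from its own periodic background loop.

```python
from typing import Any, Callable, Dict, Tuple

Status = Tuple[bool, str, Dict[str, Any]]


class AvailabilityCache:
    """Serve the last known status; only refresh() talks to the backend."""

    def __init__(self, probe: Callable[[], Status]) -> None:
        self._probe = probe
        self._status: Status = (False, "availability not checked yet", {})

    def refresh(self) -> None:
        # Potentially slow: intended to run periodically in the background.
        try:
            self._status = self._probe()
        except Exception as e:
            self._status = (False, f"orchestrator unreachable: {e}", {})

    def available(self) -> Status:
        # Cheap: returns the cached tuple without contacting the backend.
        return self._status


probe_calls = {"count": 0}


def probe() -> Status:
    probe_calls["count"] += 1
    return True, "", {}


cache = AvailabilityCache(probe)
before = cache.available()
cache.refresh()
after = [cache.available() for _ in range(3)]
print(before, after[0], probe_calls["count"])
```

The trade-off is the one the text describes: callers may briefly see an outdated answer, but page loads never wait on the orchestrator.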

Orchestrator.get_feature_set()

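Describes which methods this orchestrator implements

Note

True doesn't mean that the desired functionality is actually possible in the orchestrator. That is, the following won't work as expected: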

>>> 
... api = OrchestratorClientMixin()
... if api.get_feature_set()['get_hosts']['available']:  # wrong.
...     api.get_hosts()

It is better to ask for forgiveness than for permission:

>>> 
... try:
...     OrchestratorClientMixin().get_hosts()
... except (OrchestratorError, NotImplementedError):
...     ...

Return type: Dict[str, dict]
Returns: dict of API method names mapped to {'available': True or False}
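One plausible way to build such a feature set is to check which interface methods a backend class overrides. The sketch below uses a stand-in `Orchestrator` with only two methods and a hypothetical `ReadOnlyBackend`; it is an assumption about the general technique, not a copy of the Ceph implementation.

```python
from typing import Dict, List


class Orchestrator:
    """Stand-in interface: every method raises NotImplementedError."""

    def get_hosts(self) -> List[str]:
        raise NotImplementedError()

    def add_host(self, host_spec: str) -> str:
        raise NotImplementedError()

    def get_feature_set(self) -> Dict[str, dict]:
        features = {}
        for name in ("get_hosts", "add_host"):
            # A method counts as available if the subclass overrides it.
            overridden = getattr(type(self), name) is not getattr(Orchestrator, name)
            features[name] = {"available": overridden}
        return features


class ReadOnlyBackend(Orchestrator):
    """Hypothetical backend that can list hosts but not add them."""

    def get_hosts(self) -> List[str]:
        return ["host1"]


backend = ReadOnlyBackend()
print(backend.get_feature_set())

# Callers should still just try the call and handle the failure.
try:
    outcome = backend.add_host("host2")
except NotImplementedError:
    outcome = "add_host is not supported by this backend"
print(outcome)
```

An overridden method can still fail at run time, which is why the try/except form remains the safer calling pattern.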

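Client modules

class orchestrator.OrchestratorClientMixin

A module that inherits from OrchestratorClientMixin can directly call all Orchestrator methods without manually calling remote.

Every interface method of Orchestrator is converted into a stub method that internally calls OrchestratorClientMixin._oremote()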

>>> class MyModule(OrchestratorClientMixin):
...    def func(self):
...        completion = self.add_host('somehost')  # calls `_oremote()`
...        self.log.debug(completion.result)

Note

Orchestrator implementations must not inherit from OrchestratorClientMixin, because OrchestratorClientMixin magically redirects all methods to the "real" implementation of the orchestrator.

>>> import mgr_module
>>> 
... class MyImplementation(mgr_module.MgrModule, Orchestrator):
...     def __init__(self, *args, **kwargs):
...         super().__init__(*args, **kwargs)
...         self.orch_client = OrchestratorClientMixin()
...         self.orch_client.set_mgr(self.mgr)
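The stub-method idea can be sketched without any Ceph imports. In the example below, `ClientMixin`, `_make_stub`, and `FakeBackend` are hypothetical, and the "remote" call is simply a method lookup on a local object; the real mixin forwards the call to the configured orchestrator module instead.

```python
from typing import Any, Callable


class Orchestrator:
    """Stand-in interface with two methods."""

    def add_host(self, host_spec: str) -> str:
        raise NotImplementedError()

    def get_hosts(self) -> list:
        raise NotImplementedError()


class ClientMixin:
    """Hypothetical, simplified counterpart of OrchestratorClientMixin."""

    def _oremote(self, meth: str, args: tuple, kwargs: dict) -> Any:
        # In this sketch the "remote" end is a local backend object.
        return getattr(self._backend, meth)(*args, **kwargs)


def _make_stub(name: str) -> Callable:
    def stub(self, *args: Any, **kwargs: Any) -> Any:
        # Forward the call, by method name, through _oremote().
        return self._oremote(name, args, kwargs)
    stub.__name__ = name
    return stub


# Turn every public interface method into a forwarding stub on the mixin.
for _name, _attr in vars(Orchestrator).items():
    if callable(_attr) and not _name.startswith("_"):
        setattr(ClientMixin, _name, _make_stub(_name))


class FakeBackend(Orchestrator):
    def add_host(self, host_spec: str) -> str:
        return f"Added host '{host_spec}'"


class MyModule(ClientMixin):
    def __init__(self, backend: Orchestrator) -> None:
        self._backend = backend


module = MyModule(FakeBackend())
print(module.add_host("somehost"))
```

Because every interface method on the mixin is a forwarding stub, a class that both implements the interface and inherits the mixin would have its own methods shadowed or redirected, which is the reason given above for keeping implementations separate from the mixin.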

add_daemon(spec)

Create daemons for unmanaged services

Return type: OrchResult[List[str]]

add_host(host_spec)

Add a host to the orchestrator inventory.

Parameters: host_spec -- specification of the host to add, including its hostname
Return type: OrchResult[str]

add_host_label(host, label)

Add a host label

Return type: OrchResult[str]

apply(specs, no_overwrite=False, continue_on_error=False)

Apply any spec

Return type: List[str]

apply_alertmanager(spec)

Update existing AlertManager daemon(s)

Return type: OrchResult[str]

apply_alloy(spec)

Update existing alloy daemon(s)

Return type: OrchResult[str]

apply_ceph_exporter(spec)

Update existing ceph exporter daemon(s)

Return type: OrchResult[str]

apply_crash(spec)

Update existing crash daemon(s)

Return type: OrchResult[str]

apply_drivegroups(specs)

Update the OSD cluster

Return type: OrchResult[List[str]]

apply_grafana(spec)

Update the existing grafana service

Return type: OrchResult[str]

apply_ingress(spec)

Update ingress daemons

Return type: OrchResult[str]

apply_iscsi(spec)

Update the iscsi cluster

Return type: OrchResult[str]

apply_loki(spec)

Update existing Loki daemon(s)

Return type: OrchResult[str]

apply_mds(spec)

Update the MDS cluster

Return type: OrchResult[str]

apply_mgmt_gateway(spec)

Update an existing cluster gateway service

Return type: OrchResult[str]

apply_mgr(spec)

Update the mgr cluster

Return type: OrchResult[str]

apply_mon(spec)

Update the mon cluster

Return type: OrchResult[str]

apply_nfs(spec)

Update the NFS cluster

Return type: OrchResult[str]

apply_node_exporter(spec)

Update existing Node-Exporter daemon(s)

Return type: OrchResult[str]

apply_nvmeof(spec)

Update the nvmeof cluster

Return type: OrchResult[str]

apply_oauth2_proxy(spec)

Update an existing oauth2-proxy

Return type: OrchResult[str]

apply_prometheus(spec)

Update the prometheus cluster

Return type: OrchResult[str]

apply_promtail(spec)

Update existing Promtail daemon(s)

Return type: OrchResult[str]

apply_rbd_mirror(spec)

Update the rbd-mirror cluster

Return type: OrchResult[str]

apply_rgw(spec)

Update the RGW cluster

Return type: OrchResult[str]

apply_smb(spec)

Update an smb gateway service

Return type: OrchResult[str]

apply_snmp_gateway(spec)

Update an existing snmp gateway service

Return type: OrchResult[str]

apply_tuned_profiles(specs, no_overwrite)

Add or update an existing tuned profile

Return type: OrchResult[str]

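available()

Report whether we can talk to the orchestrator. This is the place to give the user a meaningful message if the orchestrator isn't running or can't be contacted.

This method may be called frequently (for example, on every page load to conditionally display a warning banner), so make sure it is not too expensive. It is okay to give a slightly stale status (for example, based on a periodic background ping of the orchestrator) if that is necessary to make this method fast.

Note

True doesn't mean that the desired functionality is actually available in the orchestrator. That is, the following won't work as expected: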

>>> 
... if OrchestratorClientMixin().available()[0]:  # wrong.
...     OrchestratorClientMixin().get_hosts()

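Return type: Tuple[bool, str, Dict[str, Any]]
Returns: a boolean describing whether the module is available/usable, a string describing any error, and a dict containing any module-specific information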

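blink_device_light(ident_fault, on, locations)

Instruct the orchestrator to enable or disable either the ident or the fault LED.

Parameters:

ident_fault (str) -- either "ident" or "fault"
on (bool) -- True = on.
locations (List[DeviceLightLoc]) -- see orchestrator.DeviceLightLoc

Return type:

OrchResult[List[str]]

cancel_completions()

Cancel ongoing completions. Unblocks the mgr.

Return type: None

create_osds(drive_group, skip_validation=False)

Create one or more OSDs within a single drive group.

The principal argument here is the drive_group member of OsdSpec. Other fields are advisory/extensible, for any finer-grained OSD feature enablement (choice of backing store, compression/encryption, and so on).

Return type: OrchResult[str]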

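daemon_action(action, daemon_name, image=None, force=False)

Perform an action (for example start, stop, or restart) on a daemon.

Parameters:

action (str) -- one of "start", "stop", "restart", "redeploy", "reconfig"
daemon_name (str) -- name of the daemon
image (Optional[str]) -- container image to use when redeploying that daemon

Return type:

OrchResult

describe_service(service_type=None, service_name=None, refresh=False)

Describe a service (of any kind) that is already configured in the orchestrator. For example, when viewing an OSD in the dashboard, we might also like to display details about the orchestrator's view of the service (such as the Kubernetes pod ID).

When viewing a CephFS filesystem in the dashboard, we would use this to display the pods currently running for the MDS daemons.

Return type: OrchResult[List[ServiceDescription]]
Returns: list of ServiceDescription objects.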

drain_host(hostname, force=False, keep_conf_keyring=False, zap_osd_devices=False)

Drain all daemons from a host

Parameters: hostname (str) -- hostname
Return type: OrchResult[str]

enter_host_maintenance(hostname, force=False, yes_i_really_mean_it=False)

Place a host in maintenance, stopping daemons and disabling its systemd target

Return type: OrchResult[str]

exit_host_maintenance(hostname, force=False, offline=False)

Return a host from maintenance, restarting the cluster's systemd target

Return type: OrchResult[str]

generate_certificates(module_name)

Generate a cert/key pair for the module named module_name

Return type: OrchResult[Optional[Dict[str, str]]]

get_alertmanager_access_info()

Get alertmanager access information

Return type: OrchResult[Dict[str, str]]

get_facts(hostname=None)

Return host metadata (gather_facts).

Return type: OrchResult[List[Dict[str, Any]]]

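get_feature_set()

Describes which methods this orchestrator implements

Note

True doesn't mean that the desired functionality is actually possible in the orchestrator. That is, the following won't work as expected: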

>>> 
... api = OrchestratorClientMixin()
... if api.get_feature_set()['get_hosts']['available']:  # wrong.
...     api.get_hosts()

It is better to ask for forgiveness than for permission:

>>> 
... try:
...     OrchestratorClientMixin().get_hosts()
... except (OrchestratorError, NotImplementedError):
...     ...

Return type: Dict[str, dict]
Returns: dict of API method names mapped to {'available': True or False}

get_hosts()

Report the hosts in the cluster.

Return type: OrchResult[List[HostSpec]]
Returns: list of HostSpec

get_inventory(host_filter=None, refresh=False)

Returns something that was created by ceph-volume inventory.

Return type: OrchResult[List[InventoryHost]]
Returns: list of InventoryHost

get_prometheus_access_info()

Get prometheus access information

Return type: OrchResult[Dict[str, str]]

get_security_config()

Get the security configuration

Return type: OrchResult[Dict[str, bool]]

hardware_light(light_type, action, hostname, device=None)

Light a chassis or device ident LED.

Parameters:

light_type (str) -- LED type (chassis or device).
action (str) -- set or get the LED status.
hostname (str) -- the name of the host.
device (Optional[str]) -- the device ID (when light_type = 'device')

Return type:

OrchResult[Dict[str, Any]]

hardware_powercycle(hostname, yes_i_really_mean_it=False)

Reboot a host.

Parameters: hostname (str) -- the name of the host being rebooted.
Return type: OrchResult[str]

hardware_shutdown(hostname, force=False, yes_i_really_mean_it=False)

Shut down a host.

Parameters: hostname (str) -- the name of the host to shut down.
Return type: OrchResult[str]

hardware_status(hostname=None, category='summary')

Display hardware status.

Parameters:

category (Optional[str]) -- category
hostname (Optional[str]) -- hostname

Return type:

OrchResult[str]

host_ok_to_stop(hostname)

Check whether the specified host can be safely stopped without reducing availability

Parameters: hostname -- hostname
Return type: OrchResult[str]

list_daemons(service_name=None, daemon_type=None, daemon_id=None, host=None, refresh=False)

Describe a daemon (of any kind) that is already configured in the orchestrator.

Return type: OrchResult[List[DaemonDescription]]
Returns: list of DaemonDescription objects.

node_proxy_common(category, hostname=None)

Return the node-proxy common report

Parameters: hostname (Optional[str]) -- hostname
Return type: OrchResult[Dict[str, Any]]

node_proxy_criticals(hostname=None)

Return the node-proxy criticals report

Parameters: hostname (Optional[str]) -- hostname
Return type: OrchResult[Dict[str, Any]]

node_proxy_firmwares(hostname=None)

Return the node-proxy firmwares report

Parameters: hostname (Optional[str]) -- hostname
Return type: OrchResult[Dict[str, Any]]

node_proxy_fullreport(hostname=None)

Return the node-proxy full report

Parameters: hostname (Optional[str]) -- hostname
Return type: OrchResult[Dict[str, Any]]

node_proxy_summary(hostname=None)

Return the node-proxy summary

Parameters: hostname (Optional[str]) -- hostname
Return type: OrchResult[Dict[str, Any]]

plan(spec)

Plan (dry-run, preview) a list of specs.

Return type: OrchResult[List]

preview_osdspecs(osdspec_name='osd', osdspecs=None)

Get a preview for OSD deployments

Return type: OrchResult[str]

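remove_daemons(names)

Remove specific daemon(s).

Return type: OrchResult[List[str]]
Returns: None

remove_host(host, force, offline, rm_crush_entry)

Remove a host from the orchestrator inventory.

Parameters: host (str) -- hostname
Return type: OrchResult[str]

remove_host_label(host, label, force=False)

Remove a host label

Return type: OrchResult[str]

remove_osds(osd_ids, replace=False, replace_block=False, replace_db=False, replace_wal=False, force=False, zap=False, no_destroy=False)

Parameters:

osd_ids (List[str]) -- list of OSD IDs
replace (bool) -- marks the OSD as being destroyed. See OSD replacement
replace_block (bool) -- marks the corresponding block device as being replaced.
replace_db (bool) -- marks the corresponding db device as being replaced.
replace_wal (bool) -- marks the corresponding wal device as being replaced.
force (bool) -- forces the OSD removal process without waiting for the data to be drained first.
zap (bool) -- zap/erase all devices associated with the OSDs (destroys data)
no_destroy (bool) -- do not destroy the VGs/LVs associated with the OSDs.

Note

This can only remove OSDs that were successfully created (that is, that got an OSD ID).

Return type: OrchResult[str]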

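remove_osds_status()

Returns the status of the ongoing OSD removal operations.

Return type: OrchResult

remove_prometheus_target(url)

Remove a prometheus target for multi-cluster

Return type: OrchResult[str]

remove_service(service_name, force=False)

Remove a service (a collection of daemons).

Return type: OrchResult[str]
Returns: None

replace_device(hostname, device, clear=False, yes_i_really_mean_it=False)

Perform all required operations in order to replace a device.

Return type: OrchResult

rescan_host(hostname)

Use cephadm to issue a disk rescan on each HBA

Some HBAs and external enclosures don't automatically register device insertion with the kernel, so in these scenarios we need to rescan manually

Parameters: hostname (str) -- hostname
Return type: OrchResult

rm_tuned_profile(profile_name)

Remove a tuned profile

Return type: OrchResult[str]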

service_action(action, service_name)

Perform an action (for example start, stop, or restart) on a service, that is, on all daemons providing the logical service.

Parameters:

action (str) -- one of "start", "stop", "restart", "redeploy", "reconfig"
service_name (str) -- service_type + '.' + service_id (for example "mon", "mgr", "mds.mycephfs", "rgw.realm.zone")

Return type:

OrchResult

set_alertmanager_access_info(user, password)

Set alertmanager access information

Return type: OrchResult[str]

set_custom_prometheus_alerts(alerts_file)

Set the prometheus custom alerts file and schedule a prometheus reconfiguration

Return type: OrchResult[str]

set_mgr(mgr)

Usable in the Dashboard, which uses a global mgr

Return type: None

set_osd_spec(service_name, osd_ids)

Set the service of an OSD

Return type: OrchResult

set_prometheus_access_info(user, password)

Set prometheus access information

Return type: OrchResult[str]

set_prometheus_target(url)

Set a prometheus target for multi-cluster

Return type: OrchResult[str]

set_unmanaged(service_name, value)

Set the unmanaged parameter to True/False for a given service

Return type: OrchResult[str]
Returns: None

stop_drain_host(hostname)

Stop draining the daemons of a host

Parameters: hostname (str) -- hostname
Return type: OrchResult[str]

stop_remove_osds(osd_ids)

TODO

Return type: OrchResult

tuned_profile_add_setting(profile_name, setting, value)

Change/add a specific setting for a tuned profile

Return type: OrchResult[str]

tuned_profile_add_settings(profile_name, setting)

Change/add multiple settings for a tuned profile

Return type: OrchResult[str]

tuned_profile_ls()

See the current tuned profiles

Return type: OrchResult[List[TunedProfileSpec]]

tuned_profile_rm_setting(profile_name, setting)

Remove a specific setting from a tuned profile

Return type: OrchResult[str]

tuned_profile_rm_settings(profile_name, settings)

Remove multiple settings from a tuned profile

Return type: OrchResult[str]

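update_host_addr(host, addr)

Update a host's address

Parameters:

host (str) -- hostname
addr (str) -- address (DNS name or IP)

Return type:

OrchResult[str]

upgrade_available()

Report on what versions are available to upgrade to

Return type: OrchResult
Returns: list of strings

upgrade_status()

If an upgrade is currently underway, report on where we are in the process, or whether some error has occurred.

Return type: OrchResult[UpgradeStatusSpec]
Returns: UpgradeStatusSpec instance

zap_device(host, path)

Zap/erase a device (destroys data)

Return type: OrchResult[str]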
