The HA Hub clusters solution -- MultipleHubs

MultipleHubs is a new feature in Open Cluster Management (OCM) that allows you to configure a list of bootstrap kubeconfigs for multiple hubs. This feature is designed to provide a high-availability (HA) solution for hub clusters. In this blog, we will introduce the MultipleHubs feature and how to use it.

The high availability of hub clusters means that if one hub cluster is down, the managed clusters can still communicate with other hub clusters. Users can also specify the hub cluster that the managed cluster should connect to by configuring the ManagedCluster resource.

The MultipleHubs feature is currently experimental and is disabled by default. To enable it, you need to set the MultipleHubs feature gate in the Klusterlet's registration configuration. The following is an example of the Klusterlet's registration configuration:

apiVersion: operator.open-cluster-management.io/v1
kind: Klusterlet
...
spec:
  ...
  registrationConfiguration:
    ...
    featureGates:
      - feature: MultipleHubs
        mode: Enable

If MultipleHubs is enabled, you don't need to prepare the default bootstrap kubeconfig for the managed cluster. The managed cluster will use the bootstrapKubeConfigs in the Klusterlet's registration configuration to connect to the hub clusters. An example of bootstrapKubeConfigs looks like the following:

apiVersion: operator.open-cluster-management.io/v1
kind: Klusterlet
...
spec:
  ...
  registrationConfiguration:
    ...
    featureGates:
      - feature: MultipleHubs
        mode: Enable
    bootstrapKubeConfigs:
      type: "LocalSecrets"
      localSecretsConfig:
        kubeConfigSecrets:
            - name: "hub1-bootstrap"
            - name: "hub2-bootstrap"
        hubConnectionTimeoutSeconds: 600

In the above configuration, hub1-bootstrap and hub2-bootstrap are secrets that contain the kubeconfigs of the hub clusters. You should create these secrets before setting bootstrapKubeConfigs in the Klusterlet's registration configuration.
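For illustration, here is a hedged sketch of creating those secrets on the managed cluster with kubectl, assuming the hub kubeconfig files are available locally, the agent runs in the default open-cluster-management-agent namespace, and the data key follows the kubeconfig convention used by the default bootstrap secret:

# Create the bootstrap kubeconfig secrets referenced by kubeConfigSecrets (names and file paths are assumptions)
kubectl create secret generic hub1-bootstrap \
  -n open-cluster-management-agent \
  --from-file=kubeconfig=./hub1-kubeconfig.yaml
kubectl create secret generic hub2-bootstrap \
  -n open-cluster-management-agent \
  --from-file=kubeconfig=./hub2-kubeconfig.yaml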

The order of the secrets in the kubeConfigSecrets is the order of the hub clusters that the managed cluster will try to connect to. The managed cluster will try to connect to the first hub cluster in the list first. If the managed cluster cannot connect to the first hub cluster, it will try to connect to the second hub cluster, and so on.

Note that the expiration time of the credentials in kubeconfigs should be long enough to ensure the managed cluster can connect to another hub cluster when one hub cluster is down.

hubConnectionTimeoutSeconds is the timeout for the managed cluster's connection to a hub cluster. If the managed cluster cannot reach the hub within this timeout, it tries another hub cluster. It also helps absorb transient network disturbances. The default value is 600 seconds and the minimum value is 180 seconds.

Currently, the MultipleHubs feature only supports the LocalSecrets type of bootstrapKubeConfigs.

As mentioned before, you can also control hub connectivity from the hub side through the ManagedCluster resource. We use the hubAcceptsClient field in the ManagedCluster resource to specify whether the hub cluster accepts the managed cluster. The following is an example of the ManagedCluster resource:

apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
...
spec:
  ...
  hubAcceptsClient: false

If hubAcceptsClient is set to false, a managed cluster currently connected to that hub will immediately disconnect from it and try to connect to another hub cluster.

Managed clusters that are looking for another hub will also skip any hub whose ManagedCluster resource has hubAcceptsClient set to false.
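As an illustration only (cluster1 is an assumed cluster name, and this is just one way to edit the field), the hub administrator could flip the field with kubectl:

# Tell this hub to stop accepting cluster1; its klusterlet will fail over to another hub in its list
kubectl patch managedcluster cluster1 --type=merge -p '{"spec":{"hubAcceptsClient":false}}'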

That's a brief introduction to the MultipleHubs feature in Open Cluster Management. We hope this feature helps you start building a high-availability solution for hub clusters, and we look forward to your feedback. If you have any questions or suggestions, please feel free to contact us.

Using the GitOps way to deal with the upgrade challenges of multi-cluster tool chains

Upgrade challenges of tool chains in multi-cluster environments

Open Cluster Management (OCM) is a community-driven project focused on multicluster and multicloud scenarios for Kubernetes applications. It provides functions such as cluster registration, application and workload distribution, and scheduling. Add-on is an extension mechanism based on the foundation components provided by OCM, which allows applications in the Kubernetes ecosystem to be easily migrated to the OCM platform and gain the ability to orchestrate and schedule across multiple clusters and clouds. For example, Istio, Prometheus, and Submariner can be extended to multiple clusters through add-ons. In a multi-cluster environment, how to upgrade the entire tool chain (such as Istio, Prometheus, and other tools) gracefully and smoothly is a challenge we encounter in multi-cluster management. A failed upgrade of the tool chain can potentially render thousands of user workloads inaccessible. Therefore, finding an easy and safe upgrade solution across clusters becomes important.

In this article, we will introduce how Open Cluster Management (OCM) treats tool chain upgrades as configuration file changes, allowing users to leverage Kustomize or GitOps to achieve seamless rolling/canary upgrades across clusters.

Before we begin, let us first introduce several concepts in OCM.

Add-on

On the OCM platform, add-ons can apply different configurations on different managed clusters, and can also implement functions such as transferring data between the control plane (hub) and the managed clusters. For example, you can use the managed-serviceaccount add-on to return information about a specified ServiceAccount on the managed cluster to the hub cluster, and you can use the cluster-proxy add-on to establish a reverse proxy channel from the spoke to the hub.

At this stage, there are some add-ons in the OCM community:

  • Multicluster Mesh Addon can be used to manage (discovery, deploy and federate) service meshes across multiple clusters in OCM.
  • Submariner Addon deploys the Submariner Broker on the hub cluster and the required Submariner components on the managed clusters.
  • Open-telemetry add-on automates the installation of otelCollector on both the hub cluster and the managed clusters, and of jaeger-all-in-one on the hub cluster for processing and storing the traces.
  • Application lifecycle management enables application lifecycle management in multi-cluster or multi-cloud environments.
  • Policy framework and Policy controllers allows Hub cluster administrators to easily deploy security-related policies for managed clusters.
  • Managed service account enables a hub cluster admin to manage service accounts across multiple clusters with ease.
  • Cluster proxy provides L4 network connectivity from hub cluster to the managed clusters.

For more information about add-on, please refer to Add-on concept and Add-on Developer Guide.

OCM provides two ways to help developers develop their own add-ons:

  • Hard mode: Using the built-in mechanism of addon-framework, you can follow the Add-on Development Guide to develop the addon manager and addon agent.
  • Easy mode: OCM provides a new development model, which can use AddOnTemplate to build add-on. In this model, developers do not need to develop the addon manager, but only need to prepare the addon agent’s image and AddOnTemplate. AddOnTemplate describes how to deploy the addon agent and how to register the add-on.

Below is the ClusterManagementAddOn and AddOnTemplate of a sample add-on. AddOnTemplate is treated as an add-on configuration file, defined in supportedConfigs. The AddOnTemplate resource contains the manifest required to deploy the add-on and the add-on registration method.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
   name: hello-template
   annotations:
     addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
   addOnMeta:
     description: hello-template is an addon built with addon template
     displayName: hello-template
   supportedConfigs: # declare it is a template type addon
   - group: addon.open-cluster-management.io
     resource: addontemplates
     defaultConfig:
       name: hello-template
---
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: AddOnTemplate
metadata:
   name: hello-template
spec:
   addonName: hello-template
   agentSpec: #required
       workload:
         manifests:
           - kind: Deployment
             metadata:
               name: hello-template-agent
               namespace: open-cluster-management-agent-addon
...
           - kind: ServiceAccount
             metadata:
               name: hello-template-agent-sa
               namespace: open-cluster-management-agent-addon
            - kind: ClusterRoleBinding
              metadata:
                name: hello-template-agent
...
   registration: #optional
     ...

Placement Decision Strategy

The Placement API is used to select a set of ManagedClusters in one or more ManagedClusterSets to deploy workloads to these clusters.

For more introduction to the Placement API, please refer to Placement concept.

The “input” and “output” of the Placement scheduling process are decoupled into two independent Kubernetes APIs: Placement and PlacementDecision.

  • Placement provides filtering of clusters through the labelSelector or the claimSelector, and also provides some built-in prioritizers, which can score, sort and prioritize the filtered clusters.
  • The scheduling results of Placement are placed in PlacementDecision; status.decisions lists the top N clusters with the highest scores, sorted by name, and the scheduling results change dynamically as clusters change. The decisionStrategy section in Placement can be used to divide the created PlacementDecisions into multiple groups and define the number of clusters in each decision group. PlacementDecision supports paging, and each PlacementDecision resource can contain at most 100 cluster names.

Below is an example of Placement and decisionStrategy. Assume that there are 300 ManagedClusters in the global ManagedClusterSet, and 10 of them have the label canary. The following example groups the canary-labeled clusters into one group and divides the remaining clusters into groups of at most 150 clusters each.

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
   name: aws-placement
   namespace: default
spec:
   clusterSets:
     - global
   decisionStrategy:
     groupStrategy:
       clustersPerDecisionGroup: 150
       decisionGroups:
       - groupName: canary
         groupClusterSelector:
           labelSelector:
             matchExpressions:
               - key: canary
                 operator: Exists

The grouping results are displayed in the status of the Placement. The canary group has 10 clusters, and its results are placed in aws-placement-decision-1. The other groups are default groups identified only by a group index, containing 150 and 140 clusters respectively. Since a PlacementDecision supports at most 100 clusters, the results of each of these groups are split into two PlacementDecisions.

status:
...
   decisionGroups:
   - clusterCount: 10
     decisionGroupIndex: 0
     decisionGroupName: canary
     decisions:
     - aws-placement-decision-1
   - clusterCount: 150
     decisionGroupIndex: 1
     decisionGroupName: ""
     decisions:
     - aws-placement-decision-2
     - aws-placement-decision-3
   - clusterCount: 140
     decisionGroupIndex: 2
     decisionGroupName: ""
     decisions:
      - aws-placement-decision-4
      - aws-placement-decision-5
   numberOfSelectedClusters: 300

Taking the canary group as an example, its PlacementDecision is as follows. The label cluster.open-cluster-management.io/decision-group-index represents the index of the group it belongs to, cluster.open-cluster-management.io/decision-group-name represents the name of that group, and cluster.open-cluster-management.io/placement represents the Placement it belongs to. Users can flexibly obtain scheduling results through label selectors.

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: PlacementDecision
metadata:
   labels:
     cluster.open-cluster-management.io/decision-group-index: "0"
     cluster.open-cluster-management.io/decision-group-name: canary
     cluster.open-cluster-management.io/placement: aws-placement
   name: aws-placement-decision-1
   namespace: default
status:
   decisions:
   - clusterName: cluster1
     reason: ""
...
   - clusterName: cluster10
     reason: ""

Simplify upgrades the GitOps way

The above briefly introduces the concepts of add-on template and placement decision strategy.

In OCM, we treat the upgrade of an add-on as an upgrade of its configuration files. The configuration here can be AddOnTemplate or another customized configuration file such as AddOnDeploymentConfig. An add-on upgrade is treated as a configuration file update, which enables users to leverage Kustomize or GitOps for seamless cross-cluster rolling/canary upgrades. RolloutStrategy defines the upgrade strategy: it supports upgrading all clusters at once (All), progressive upgrades per cluster (Progressive), and progressive upgrades per cluster group (ProgressivePerGroup), and it can define a set of MandatoryDecisionGroups in which new configurations are tried first.

According to the four principles of GitOps, let’s take a look at how OCM supports the GitOps approach to address upgrade challenges in multi-cluster environments.

  • Declarative

The configuration files used by an add-on can be declared in ClusterManagementAddOn. A configuration file declared in the global supportedConfigs is applied to all ManagedClusterAddOn instances. Configuration can also be declared in different placements under installStrategy; the ManagedClusterAddOns of the clusters selected by each Placement share the same configuration files. Configuration declared under placements overrides the global configuration.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: managed-serviceaccount
spec:
  supportedConfigs:
  - defaultConfig:
      name: managed-serviceaccount-0.4.0
    group: addon.open-cluster-management.io
    resource: addontemplates
  installStrategy:
    placements:
    - name: aws-placement
      namespace: default
      configs:
      - group: addon.open-cluster-management.io
        resource: addondeploymentconfigs
        name: managed-serviceaccount-addon-deploy-config
      rolloutStrategy:
        type: Progressive
        progressive:
          mandatoryDecisionGroups:
          - groupName: "canary"
          maxConcurrency: 1
    type: Placements
  • Version control

Changes in the add-on configuration file name or spec content will be considered a configuration change and will trigger an upgrade of the add-on. Users can leverage Kustomize or GitOps to control configuration file upgrades.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: AddOnTemplate
metadata:
  name: managed-serviceaccount-0.4.0
spec:
  agentSpec: # required
      workload:
        manifests:
          - kind: Deployment
            metadata:
              name: managed-serviceaccount-addon-agent
              namespace: open-cluster-management-agent-addon
...
          - kind: ServiceAccount
            metadata:
              name: managed-serviceaccount
              namespace: open-cluster-management-agent-addon

  registration: # optional
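Building on the example above, here is a minimal sketch of how such a version bump could be expressed with Kustomize; the file layout and the 0.4.1 template are assumptions for illustration, not part of the add-on's actual repository:

# kustomization.yaml (hypothetical layout kept in Git)
resources:
  - clustermanagementaddon.yaml   # the managed-serviceaccount ClusterManagementAddOn
  - addontemplate-0.4.1.yaml      # the new AddOnTemplate revision
patches:
  - target:
      group: addon.open-cluster-management.io
      version: v1alpha1
      kind: ClusterManagementAddOn
      name: managed-serviceaccount
    patch: |-
      - op: replace
        path: /spec/supportedConfigs/0/defaultConfig/name
        value: managed-serviceaccount-0.4.1

Committing this change to Git is the upgrade trigger: the new defaultConfig name produces a new desired config, and the rollout then proceeds according to rolloutStrategy.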
  • Automation

The OCM component addon-manager-controller in the open-cluster-management-hub namespace is a more general addon manager. It watches the following two types of add-ons and is responsible for maintaining their lifecycle, including installation and upgrades. When the name or spec content of a configuration file changes, this component upgrades the add-on according to the upgrade strategy defined by rolloutStrategy.

  • Hard mode: For add-ons developed with the latest addon-framework, you need to remove the WithInstallStrategy() call from the code and add the annotation addon.open-cluster-management.io/lifecycle: "addon-manager" to the ClusterManagementAddOn. For details, refer to the Add-on Development Guide.
  • Easy mode: add-on developed using AddOnTemplate mode.
$ kubectl get deploy -n open-cluster-management-hub
NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
cluster-manager-addon-manager-controller   1/1     1            1           10h
cluster-manager-placement-controller       1/1     1            1           10h
cluster-manager-registration-controller    1/1     1            1           10h
cluster-manager-registration-webhook       1/1     1            1           10h
cluster-manager-work-webhook               1/1     1            1           10h
  • Coordination

The spec hash of the add-on configuration file is recorded in the status of ClusterManagementAddOn and ManagedClusterAddOn. When the spec hash changes, addon-manager-controller keeps updating the add-on according to the upgrade strategy defined by rolloutStrategy until lastAppliedConfig and lastKnownGoodConfig are consistent with desiredConfig. In the following example, because lastAppliedConfig does not match desiredConfig, the add-on status is displayed as "Upgrading".

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: managed-serviceaccount

status:
  installProgressions:
  - conditions:
    - lastTransitionTime: "2023-09-21T06:53:59Z"
      message: 1/3 upgrading, 0 timeout.
      reason: Upgrading
      status: "False"
      type: Progressing
    configReferences:
    - desiredConfig:
        name: managed-serviceaccount-0.4.1
        specHash: dcf88f5b11bd191ed2f886675f967684da8b5bcbe6902458f672277d469e2044
      group: addon.open-cluster-management.io
      lastAppliedConfig:
        name: managed-serviceaccount-0.4.0
        specHash: 1f7874ac272f3e4266f89a250d8a76f0ac1c6a4d63d18e7dcbad9068523cf187
      lastKnownGoodConfig:
        name: managed-serviceaccount-0.4.0
        specHash: 1f7874ac272f3e4266f89a250d8a76f0ac1c6a4d63d18e7dcbad9068523cf187
      resource: addontemplates
    name: aws-placement
    namespace: default
---
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: managed-serviceaccount
  namespace: cluster1

status:
  conditions:
  - lastTransitionTime: "2023-09-21T06:53:42Z"
    message: upgrading.
    reason: Upgrading
    status: "False"
    type: Progressing
  configReferences:
  - desiredConfig:
      name: managed-serviceaccount-0.4.1
      specHash: dcf88f5b11bd191ed2f886675f967684da8b5bcbe6902458f672277d469e2044
    group: addon.open-cluster-management.io
    lastAppliedConfig:
      name: managed-serviceaccount-0.4.0
      specHash: 1f7874ac272f3e4266f89a250d8a76f0ac1c6a4d63d18e7dcbad9068523cf187
    lastObservedGeneration: 1
    name: managed-serviceaccount-0.4.1
    resource: addontemplates

Three upgrade strategies

The rolloutStrategy field of ClusterManagementAddOn defines the upgrade strategy. Currently, OCM supports three types of upgrade strategies.

  • All

The default upgrade type is All, which means the new configuration file will be applied to all the clusters immediately.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
   name: managed-serviceaccount
   annotations:
     addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
   supportedConfigs:
...
   installStrategy:
     placements:
     - name: aws-placement
       namespace: default
       rolloutStrategy:
         type: All
     type: Placements
  • Progressive

Progressive means that the new configuration file will be deployed to the selected clusters progressively, cluster by cluster. The new configuration file will not be applied to the next cluster unless one of the currently applied clusters reaches the successful state and MaxFailures has not been breached. We introduced the concept of "placement decision group" earlier. One or more decision groups can be specified in MandatoryDecisionGroups. If MandatoryDecisionGroups are defined, new configuration files are deployed to these cluster groups first. MaxConcurrency defines the maximum number of clusters deployed simultaneously.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
   name: managed-serviceaccount
   annotations:
     addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
   supportedConfigs:
...
   installStrategy:
     placements:
     - name: aws-placement
       namespace: default
       rolloutStrategy:
         type: Progressive
         progressive:
           mandatoryDecisionGroups:
           - groupName: "canary"
           maxConcurrency: 1
     type: Placements
  • ProgressivePerGroup

ProgressivePerGroup means that the new configuration file will be deployed to the decision-group clusters progressively, group by group. The new configuration file will not be applied to the next cluster group unless all the clusters in the current group reach the successful state and MaxFailures has not been breached. If MandatoryDecisionGroups are defined, new configuration files are deployed to these cluster groups first. If there are no MandatoryDecisionGroups, the cluster groups are upgraded in order of their index.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
   name: managed-serviceaccount
   annotations:
     addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
   supportedConfigs:
...
   installStrategy:
     placements:
     - name: aws-placement
       namespace: default
       rolloutStrategy:
         type: ProgressivePerGroup
         progressivePerGroup:
           mandatoryDecisionGroups:
           - groupName: "canary"
     type: Placements

According to the four principles of GitOps and the three upgrade strategies of OCM, users can use Kustomize or GitOps to achieve seamless rolling/canary upgrades across clusters. It is worth noting that installStrategy supports multiple placement definitions, and users can implement more advanced upgrade strategies based on this.

As in the example below, you can define two placements at the same time to select clusters on aws and gcp respectively, so that the same add-on can use different configuration files and upgrade strategies in different clusters.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
   name: managed-serviceaccount
   annotations:
     addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
   supportedConfigs:
...
   installStrategy:
     placements:
     - name: aws-placement
       namespace: default
       configs:
       - group: addon.open-cluster-management.io
         resource: addondeploymentconfigs
         name: managed-serviceaccount-addon-deploy-config-aws
       rolloutStrategy:
         type: ProgressivePerGroup
         progressivePerGroup:
           mandatoryDecisionGroups:
           - groupName: "canary"
     - name: gcp-placement
       namespace: default
       configs:
       - group: addon.open-cluster-management.io
         resource: addondeploymentconfigs
         name: managed-serviceaccount-addon-deploy-config-gcp
       rolloutStrategy:
         type: ProgressivePerGroup
         progressivePerGroup:
           mandatoryDecisionGroups:
           - groupName: "canary"
     type: Placements

Three upgrade configurations

The rolloutStrategy upgrade strategy can also define MinSuccessTime, ProgressDeadline and MaxFailures to achieve more fine-grained upgrade configuration.

  • MinSuccessTime

MinSuccessTime defines how long the controller needs to wait before upgrading the next cluster after an add-on upgrade succeeds and MaxFailures has not been reached. The default value is 0, meaning the controller proceeds immediately after a successful state is reached.

In the following example, the add-on is upgraded at a rate of one cluster every 5 minutes.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
   name: managed-serviceaccount
   annotations:
     addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
   supportedConfigs:
...
   installStrategy:
     placements:
     - name: aws-placement
       namespace: default
       rolloutStrategy:
         type: Progressive
         progressive:
           mandatoryDecisionGroups:
           - groupName: "canary"
           maxConcurrency: 1
           minSuccessTime: "5m"
     type: Placements
  • ProgressDeadline

ProgressDeadline defines the maximum time the controller waits for an add-on upgrade to succeed. If the add-on does not reach a successful state within ProgressDeadline, the controller stops waiting, the cluster is treated as "timeout", and it is counted into MaxFailures. Once MaxFailures is breached, the rollout stops. The default value is "None", which means the controller will wait for a successful state indefinitely.

In the following example, the controller waits up to 10 minutes on each cluster for the add-on upgrade to succeed. If the upgrade has not succeeded after 10 minutes, the upgrade status of that cluster is marked as "timeout".

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
   name: managed-serviceaccount
   annotations:
     addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
   supportedConfigs:
...
   installStrategy:
     placements:
     - name: aws-placement
       namespace: default
       rolloutStrategy:
         type: Progressive
         progressive:
           mandatoryDecisionGroups:
           - groupName: "canary"
           maxConcurrency: 1
           progressDeadline: "10m"
     type: Placements
  • MaxFailures

MaxFailures defines the number of clusters that can tolerate upgrade failures; it can be a numerical value or a percentage. If a cluster's status is failed or timeout, it is regarded as an upgrade failure. If the number of failed clusters exceeds MaxFailures, the upgrade stops.

In the following example, the upgrade stops when 3 add-ons fail to upgrade or do not reach a successful state within 10 minutes.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
   name: managed-serviceaccount
   annotations:
     addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
   supportedConfigs:
...
   installStrategy:
     placements:
     - name: aws-placement
       namespace: default
       rolloutStrategy:
         type: Progressive
         progressive:
           mandatoryDecisionGroups:
           - groupName: "canary"
           maxConcurrency: 1
           maxFailures: 2
           progressDeadline: "10m"
     type: Placements

Summary

This article details how to use Open Cluster Management to address tool chain upgrade challenges in a multi-cluster environment the GitOps way. OCM provides a Kubernetes-based management platform across multiple clusters and clouds. Through the Add-on and Placement APIs, users can upgrade the entire tool chain gracefully and smoothly. At the same time, OCM treats add-on upgrades as configuration file changes, enabling users to leverage Kustomize or GitOps for seamless rolling/canary upgrades across clusters. In addition, OCM provides a variety of upgrade strategies, including upgrading all clusters at once (All), progressive upgrades per cluster (Progressive), and progressive upgrades per cluster group (ProgressivePerGroup), to meet different upgrade needs.

Open Cluster Management - Configuring Your Kubernetes Fleet With the Policy Addon

View the video at YouTube.

A deep dive into OCM add-ons

Overview of OCM add-ons

OCM (open-cluster-management) is a management platform focused on Kubernetes applications across multiple clusters and clouds. It provides basic capabilities such as cluster registration, application and workload distribution, and scheduling. Add-ons are an extension mechanism built on top of the foundation components provided by OCM; they allow applications in the Kubernetes ecosystem to be easily migrated to the OCM platform and gain the ability to orchestrate and schedule across multiple clusters and clouds.

On the OCM platform, add-ons can apply different configurations on different managed (spoke) clusters, and can also implement functions such as fetching data from the control plane (hub) to the spoke clusters. For example, you can use the managed-serviceaccount add-on to return information about a specified ServiceAccount on a spoke cluster to the hub cluster, and you can use the cluster-proxy add-on to establish a reverse proxy channel from the spoke to the hub.

Some add-ons that already exist in the OCM community:

  • Application lifecycle management provides a mechanism, based on Subscriptions to channels, to distribute applications from GitHub repositories, Helm releases, or object storage buckets to specified spoke clusters.
  • Cluster proxy provides L4 network connectivity between hub and spoke clusters through reverse proxy channels.
  • Managed service account makes it easy for hub cluster administrators to manage ServiceAccounts on spoke clusters.
  • Policy framework and Policy controllers make it easy for hub cluster administrators to deploy security-related policies to spoke clusters.
  • Submariner Addon integrates Submariner with OCM, giving managed clusters cross-cluster Pod and Service network connectivity.
  • Multicluster Mesh Addon provides cross-cluster service mesh services for OCM managed clusters.

This article describes the add-on implementation mechanism in detail.

How OCM add-ons are implemented

An add-on usually consists of two parts:

  1. The add-on agent is whatever Kubernetes resources run on the spoke cluster, for example a Pod with permission to access the hub, an Operator, and so on.
  2. The add-on manager is a Kubernetes controller running on the hub cluster. This controller uses ManifestWork to deploy and distribute the Kubernetes resources the add-on agent needs to the different spoke clusters, and it can also manage the permissions the add-on agent requires.

On the OCM hub cluster, there are two main APIs related to add-ons:

  1. ClusterManagementAddOn: a cluster-scoped API. Each add-on must create an instance with the same name to describe the add-on's name and description, as well as its configuration and install/deploy strategy.
  2. ManagedClusterAddOn: a namespace-scoped API. An instance with the same name as the add-on, created in a spoke cluster's namespace, triggers the installation of the add-on agent on that spoke cluster. This API also exposes the health status of the add-on's agent.

The add-on architecture looks like this:

Addon Architecture

Creation:

The add-on manager watches managedClusterAddOn and creates manifestWork resources to deploy the add-on agent to the spoke cluster. Depending on the configured install strategy, it can also deploy the agent only to the clusters selected by that strategy.
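For illustration, a minimal sketch of the ManagedClusterAddOn that triggers this flow (the names mirror the helloworldhelm example later in this post):

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: helloworldhelm
  namespace: cluster1   # the managed cluster's namespace on the hub
spec:
  installNamespace: open-cluster-management-agent-addon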

Registration:

If the add-on agent needs to access the hub cluster, the registration-agent sends a CSR to the hub cluster, based on the registration information in managedClusterAddOn, to request access to the hub. The add-on manager checks the CSR according to its custom approval policy; after approval, it creates the corresponding RBAC permissions for the agent, and the registration-agent generates a kubeconfig secret with the specified permissions, which the agent can use to access the hub cluster.
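For example, once registration completes for the helloworldhelm add-on shown later, the generated kubeconfig secret can be found on the spoke cluster (namespace and naming follow the defaults used in that example):

# On the spoke cluster: the hub kubeconfig generated for the add-on agent
kubectl get secret -n open-cluster-management-agent-addon helloworldhelm-hub-kubeconfig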

Native Kubernetes CSRs only support the signers kubernetes.io/kube-apiserver-client, kubernetes.io/kube-apiserver-client-kubelet, and kubernetes.io/kubelet-serving. OCM lets users provide custom certificates and signers to access services other than kube-apiserver, and the add-on manager can run custom checks to verify that the signer and certificate are correct in order to complete the add-on registration.

Health check:

The add-on agent can use the lease mechanism provided by the addon-framework to maintain a lease on the spoke cluster. The registration-agent watches this lease, uses its state to decide whether the agent is healthy, and updates the Available condition of the managedClusterAddOn on the hub cluster. Users can also run agent health checks in other custom ways, for example by inspecting a field of some resource in the add-on's ManifestWork.
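A hedged way to observe this on the spoke cluster, assuming the agent uses the addon-framework lease mechanism with its default settings:

# On the spoke cluster: list the leases maintained in the add-on agent namespace
kubectl get lease -n open-cluster-management-agent-addon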

Development:

The OCM community provides the addon-framework library, which lets developers quickly build their own add-on managers, and also makes it easy to migrate their own Kubernetes applications onto OCM multi-cluster environments in the form of add-ons.

Developers copy the resources to be deployed on the agent side into the project directory as a Helm chart or Go templates, and by calling the addonfactory they get the whole add-on registration, configuration, and health-check functionality. For details, refer to the add-on development guide.

Example

Let's use the helloworldhelm add-on from the addon-framework as an example. This example add-on syncs configmaps from the cluster namespaces on the hub cluster to the spoke clusters.

First, we create two clusters with kind: one acts as the hub cluster with OCM installed, and the other is registered to the hub as a spoke cluster under the name cluster1. See the OCM installation guide.

$ kubectl get mcl
NAME       HUB ACCEPTED   MANAGED CLUSTER URLS   JOINED   AVAILABLE   AGE
cluster1   true           https://localhost      True     True        17s

Then install the add-on manager controller for the helloworldhelm add-on on the hub cluster. For detailed steps, refer to deploying the helloworldhelm add-on.

$ kubectl get deployments.apps -n open-cluster-management helloworld-controller
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
helloworldhelm-controller   1/1     1            1           50s

On the hub cluster we can see the ClusterManagementAddOn of the helloworldhelm add-on:

$ kubectl get clustermanagementaddons.addon.open-cluster-management.io helloworldhelm -o yaml
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  creationTimestamp: "2023-05-28T14:12:32Z"
  generation: 1
  name: helloworldhelm
  resourceVersion: "457615"
  uid: 29ac6292-7346-4bc9-8013-fd90f40589d6
spec:
  addOnMeta:
    description: helloworldhelm is an example addon created by helm chart
    displayName: helloworldhelm
  installStrategy:
    type: Manual
  supportedConfigs:
  - group: addon.open-cluster-management.io
    resource: addondeploymentconfigs
  - group: ""
    resource: configmaps

Deploy the helloworldhelm add-on to the cluster1 cluster, with the agent installed into the open-cluster-management-agent-addon namespace of the spoke cluster.

$ clusteradm addon enable --names helloworldhelm --namespace open-cluster-management-agent-addon --clusters cluster1

We can see a managedClusterAddon created under the cluster1 namespace on the hub cluster:

$ kubectl get managedclusteraddons.addon.open-cluster-management.io -n cluster1 helloworldhelm -o yaml
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  creationTimestamp: "2023-05-28T14:13:56Z"
  finalizers:
  - addon.open-cluster-management.io/addon-pre-delete
  generation: 1
  name: helloworldhelm
  namespace: cluster1
  ownerReferences:
  - apiVersion: addon.open-cluster-management.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: ClusterManagementAddOn
    name: helloworldhelm
    uid: 29ac6292-7346-4bc9-8013-fd90f40589d6
  resourceVersion: "458003"
  uid: 84ceac57-3a7d-442f-bc28-d9828023d880
spec:
  installNamespace: open-cluster-management-agent-addon
status:
  conditions:
  - lastTransitionTime: "2023-05-28T14:13:57Z"
    message: Registration of the addon agent is configured
    reason: SetPermissionApplied
    status: "True"
    type: RegistrationApplied
  - lastTransitionTime: "2023-05-28T14:13:57Z"
    message: manifests of addon are applied successfully
    reason: AddonManifestApplied
    status: "True"
    type: ManifestApplied
  - lastTransitionTime: "2023-05-28T14:13:57Z"
    message: client certificate rotated starting from 2023-05-28 14:08:57 +0000 UTC
      to 2024-05-27 14:08:57 +0000 UTC
    reason: ClientCertificateUpdated
    status: "True"
    type: ClusterCertificateRotated
  - lastTransitionTime: "2023-05-28T14:15:04Z"
    message: helloworldhelm add-on is available.
    reason: ManagedClusterAddOnLeaseUpdated
    status: "True"
    type: Available
  namespace: open-cluster-management-agent-addon
  registrations:
  - signerName: kubernetes.io/kube-apiserver-client
    subject:
      groups:
      - system:open-cluster-management:cluster:cluster1:addon:helloworldhelm
      - system:open-cluster-management:addon:helloworldhelm
      - system:authenticated
      user: system:open-cluster-management:cluster:cluster1:addon:helloworldhelm:agent:8xz2x
  supportedConfigs:
  - group: ""
    resource: configmaps
  - group: addon.open-cluster-management.io
    resource: addondeploymentconfigs

In the cluster1 namespace on the hub cluster, we can also see the manifestWork that deploys the add-on agent:

$ kubectl get manifestwork -n cluster1
NAME                            AGE
addon-helloworldhelm-deploy-0   7m18s

On the spoke cluster cluster1, we can see that the agent is deployed in the open-cluster-management-agent-addon namespace; the agent mounts a hub kubeconfig secret to access the hub and sync configmaps.

$ kubectl get deployments.apps -n open-cluster-management-agent-addon
NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
helloworldhelm-agent   1/1     1            1           8m17s

$ kubectl get secret -n open-cluster-management-agent-addon
NAME                                  TYPE                                  DATA   AGE
helloworldhelm-hub-kubeconfig         Opaque                                3      8m17s

Recent improvements and plans for OCM add-ons

In the recently released OCM v0.11.0, we enhanced add-ons in many ways:

  1. There is a dedicated addon-manager component on the hub cluster to manage add-on configuration and lifecycle.
  2. Add-on lifecycle management has been significantly enhanced: the ClusterManagementAddOn and ManagedClusterAddOn APIs were upgraded, and users can combine them with Placement to perform rolling and canary upgrades of add-ons on selected clusters.
  3. We are also designing a new add-on API, AddOnTemplate, to let users deploy and install their own add-ons easily without writing any code.

How to distribute workloads using Open Cluster Management

Read more at Red Hat Developers.

KubeCon NA 2022 - OCM Multicluster App & Config Management

Read more at KubeCon NA 2022 - OCM Multicluster App & Config Management.

KubeCon NA 2022 - OCM Workload distribution with Placement API

Read more at KubeCon NA 2022 - OCM Workload distribution with Placement API.

Karmada and Open Cluster Management: two new approaches to the multicluster fleet management challenge

Read more at CNCF Blog.

Extending the Multicluster Scheduling Capabilities with Open Cluster Management Placement

Read more at Red Hat Cloud Blog.

A deep dive into the OCM klusterlet credential management mechanism

Overview

In open-cluster-management, to make the control plane more scalable, we use a hub-spoke architecture: a centralized control plane (hub) only handles control-plane resources and data and does not need to access the managed clusters; each managed cluster (spoke) runs an agent called the klusterlet that accesses the control plane to fetch the tasks it needs to execute. In this process, the klusterlet needs credentials to access the hub cluster so that it can communicate with the hub securely. Keeping these credentials safe is very important, because a leaked credential could lead to malicious access to the hub cluster or theft of sensitive information, especially when OCM's managed clusters are spread across different public clouds. To keep the credentials safe, we need to satisfy some specific requirements:

  1. Avoid transmitting credentials over public networks as much as possible
  2. Support credential rotation and revocation
  3. Provide fine-grained permission control

This article describes in detail how OCM implements credential management to secure access between the control plane and the managed clusters.

Architecture and mechanisms

In OCM we use the following mechanisms to secure access between the control plane and the managed clusters:

  1. Mutual TLS based on CertificateSigningRequest
  2. A two-way handshake protocol and a dynamic klusterlet ID
  3. Separation of authentication and authorization

Mutual TLS based on CertificateSigningRequest

The Kubernetes CertificateSigningRequest (CSR) API makes it easy to generate client authentication certificates. With this mechanism, when the klusterlet accesses the hub cluster for the first time, it uses a credential with very limited permissions to create a CSR. Once the CSR returns the issued certificate, the klusterlet can use that certificate, which carries broader access permissions, for subsequent access to the hub cluster. During the CSR flow, the klusterlet's private key is never transmitted over the network; it always stays on the managed cluster. Only the CSR's public key and the low-privilege credential needed in the initial phase (the bootstrap secret) are transferred between clusters. This minimizes the risk of credentials being leaked in transit.

Two-way handshake protocol and dynamic klusterlet ID

So what happens if the bootstrap secret used in the initial phase is leaked? This is where OCM's two-way handshake protocol comes in. When the klusterlet on a managed cluster makes its first request using the bootstrap secret, the hub cluster does not immediately create a client certificate and the corresponding access permissions for it. The request stays in a Pending state until an administrator with the required management permissions on the hub approves the klusterlet's join request; only then are the client certificate and the specific permissions created. The request contains a dynamic ID generated when the klusterlet starts, and the administrator needs to make sure this ID matches the ID of the klusterlet on the managed cluster before approving the join. This ensures that even if the bootstrap secret is accidentally leaked, the CSR will not be accepted lightly by the administrator.
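For illustration (cluster1 is an assumed cluster name), the approval on the hub is typically done with clusteradm, which approves the pending CSR and accepts the cluster; a manual equivalent is sketched below:

# Approve cluster1's join request on the hub
clusteradm accept --clusters cluster1

# Or manually: inspect the pending CSR, approve it, then accept the ManagedCluster
kubectl get csr -l open-cluster-management.io/cluster-name=cluster1
kubectl certificate approve <csr-name>
kubectl patch managedcluster cluster1 --type=merge -p '{"spec":{"hubAcceptsClient":true}}'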

The client certificate used by the klusterlet has an expiration time. Before it expires, the klusterlet must use its existing client certificate to issue a new CSR and obtain a new certificate. The hub cluster verifies that the certificate-renewal CSR is valid and automatically signs the new client certificate. Note that because the klusterlet uses a dynamic ID, only CSRs issued by the klusterlet itself are signed automatically. If the klusterlet is uninstalled from the cluster and then redeployed, it must go through the bootstrap secret flow again to obtain a client certificate.

Separation of authentication and authorization

After the klusterlet's CSR is accepted, it holds a client certificate that the hub cluster can authenticate, but at this point it still has no permission to access specific resources on the hub cluster. OCM has a separate authorization flow. Whether the klusterlet of a managed cluster is allowed to access specific resources on the hub cluster is controlled by the hubAcceptsClient field of the corresponding ManagedCluster API. Only when this field is set to true will the controllers on the hub cluster grant permissions to that klusterlet. Setting this field requires the user to have update permission on managedclusters/accept in the hub cluster. For example, the following ClusterRole means that the user can only grant permissions to the klusterlet of the ManagedCluster cluster1.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: open-cluster-management:hub
rules:
- apiGroups: ["register.open-cluster-management.io"]
  resources: ["managedclusters/accept"]
  verbs: ["update"]
  resourceNames: ["cluster1"]

The reason for separating the authentication and authorization flows is that the users who can approve CSRs on the hub cluster are usually not exactly the same users who are allowed to accept klusterlets into the hub. The mechanism above guarantees that even a user with permission to approve CSRs cannot grant an arbitrary klusterlet access to the hub cluster.

Implementation details

All the code for authentication, authorization, and credential management lives in the registration component. The rough flow is shown in the figure below.

After the registration-agent starts on the managed cluster, it first looks for a hub-kubeconfig secret in its own namespace and verifies whether it is valid. If the secret does not exist or is invalid, the registration-agent enters the bootstrap flow: it first generates a dynamic agent ID, then uses the lower-privilege bootstrap-kubeconfig to create a client and informer, and then starts a ClientCertForHubController goroutine. This controller creates a CSR on the hub cluster, waits for the certificate signed for the CSR, and finally persists the certificate and private key on the managed cluster as a secret named hub-kubeconfig. The agent then keeps watching whether the hub-kubeconfig secret has been persisted. Once the agent sees hub-kubeconfig, it means the agent has obtained a client certificate that can access the hub cluster, so it stops the previous controller and exits the bootstrap flow. Next, the agent recreates the client and informer using hub-kubeconfig and starts a new ClientCertForHubController goroutine to refresh the client certificate periodically.
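A hedged way to verify the outcome of the bootstrap flow on the managed cluster, assuming the default klusterlet agent namespace:

# On the managed cluster: the persisted client certificate and key live in the hub-kubeconfig secret
kubectl get secret hub-kubeconfig -n open-cluster-management-agent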

On the hub cluster, the registration-controller starts a CSRApprovingController, which is responsible for checking whether a CSR issued by a klusterlet can be signed automatically, and a managedClusterController, which checks whether the hubAcceptsClient field of the corresponding ManagedCluster is set and creates the corresponding permissions on the hub cluster.

Accessing clusters in different VPCs through OCM

Background

When we have multiple clusters, a very common requirement is that different users want to access clusters located in different VPCs. For example, developers want to deploy applications on a test cluster, or operators want to troubleshoot issues on a production cluster.

As the administrator of multiple clusters, to meet this requirement you would need to do the following for users on each cluster:

  1. Bind Roles.
  2. Provide access configuration (certificates or tokens).
  3. Provide an access entry point.

However, this approach has several problems:

  • Network isolation: if a cluster sits in a private data center, the administrator has to set up special network access for cluster users, such as a VPN or a bastion host.
  • Network security: exposing cluster ports to users increases the cluster's security risk.
  • Configuration expiry: the keys in certificates and the tokens all expire, so the administrator has to refresh users' configuration regularly.

By installing OCM together with the cluster-proxy and managed-serviceaccount add-ons, the administrator can instead provide a unified access entry point for different users without exposing any cluster ports, and conveniently manage each user's access permissions.

Basic concepts

Below, we use a simple example to explain the basic concepts of OCM, cluster-proxy, and managed-serviceaccount.

Suppose we have three clusters located in two different VPCs: the cluster in VPC-1 can be accessed by all users, while the two clusters in VPC-2 can only be accessed by the administrator.

The administrator wants to use the cluster in VPC-1 (referred to below as the "hub cluster") to provide users with a unified access entry point, so that users can access the clusters in VPC-2 (referred to below as the "managed clusters").

What is OCM?

OCM stands for Open Cluster Management. It aims to solve cluster registration and management, workload distribution, and dynamic resource placement in multi-cluster scenarios.

After installing OCM, we can register managed clusters to the hub cluster. Once registration completes, a namespace with the same name as the managed cluster's registered name is created on the hub cluster. For example, if a managed cluster registers to the hub as cluster1, a namespace named cluster1 is created accordingly. On the hub cluster, we can use these different namespaces to distinguish the resources of the various managed clusters.
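For illustration, after a cluster registers as cluster1, both the ManagedCluster resource and its namespace are visible on the hub (the names come from this example):

# On the hub cluster
kubectl get managedcluster cluster1
kubectl get namespace cluster1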

The registration process does not require the managed cluster to expose any access endpoint to the hub cluster.

For more details about the OCM architecture, please refer to the official documentation.

What is cluster-proxy?

cluster-proxy is an add-on based on apiserver-network-proxy (ANP for short below) implemented with OCM's addon-framework. After the add-on is installed, the ANP component proxy-server is installed on the hub cluster and the ANP component proxy-agent is installed on the managed clusters.

The proxy-agent then sends a registration request to the proxy-server through a port exposed on the hub cluster, establishing a full-duplex gRPC channel.

Note that the gRPC channel established by cluster-proxy only provides network connectivity from the hub cluster to the managed clusters. If a user wants to access the managed cluster's API server or other services, they still need the corresponding authentication credentials and permissions from the managed cluster.

For more information about cluster-proxy, please refer to the official documentation.

What is managed-serviceaccount?

Managed-serviceaccount (MSA for short below) is another add-on implemented with OCM's addon-framework.

After installing this add-on, you can create a ManagedServiceAccount CR on the hub cluster. Based on the CR's spec, the add-on creates a ServiceAccount with the same name as the CR in the open-cluster-management-managed-serviceaccount namespace of the target managed cluster.

The add-on then syncs the token generated for this ServiceAccount back to the hub cluster and stores it in a Secret with the same name in the managed cluster's namespace. The whole token sync happens over the mTLS connection provided by OCM, which ensures the token cannot be observed by third parties.

In this way, the cluster administrator can use MSA on the hub to obtain a token for accessing the managed cluster's API server. Of course, this token has not been granted any permissions yet; as soon as the administrator binds the appropriate Role to the token, fine-grained access control to the managed cluster is in place.

For more information about managed-serviceaccount, please refer to the official documentation.

Example

Next, a simple example demonstrates how to use OCM, cluster-proxy, and managed-serviceaccount to access clusters across VPCs.

First, from the administrator's perspective, we use a script to quickly create a kind-based multi-cluster environment with one hub cluster (hub) and two managed clusters (cluster1, cluster2). cluster1 and cluster2 are registered to the hub through OCM.

The script also installs clusteradm, the OCM CLI tool, for us:

curl -L https://raw.githubusercontent.com/open-cluster-management-io/OCM/main/solutions/setup-dev-environment/local-up.sh | bash

Then the administrator needs to install the two add-ons:

# Install cluster-proxy
helm install \
    -n open-cluster-management-addon --create-namespace \
    cluster-proxy ocm/cluster-proxy

# Install managed-serviceaccount
helm install \
    -n open-cluster-management-addon --create-namespace \
    managed-serviceaccount ocm/managed-serviceaccount

# Verify that cluster-proxy is installed
clusteradm get addon cluster-proxy

# Verify that managed-serviceaccount is installed
clusteradm get addon managed-serviceaccount

After the installation is complete, the administrator wants to give users access to cluster1. To do that, they create an MSA CR in the cluster1 namespace on the hub with the following command:

kubectl apply -f - <<EOF
apiVersion: authentication.open-cluster-management.io/v1alpha1
kind: ManagedServiceAccount
metadata:
  name: dep
  namespace: cluster1
spec:
  rotation: {}
EOF

# Check that the token has been synced back to the hub cluster and stored in a Secret named dep
kubectl get secret -n cluster1
NAME                  TYPE                                  DATA   AGE
default-token-r89gs   kubernetes.io/service-account-token   3      6d22h
dep                   Opaque                                2      6d21h

Next, the administrator uses OCM's ManifestWork (the workload distribution feature) to create a ClusterRole on cluster1 and bind the corresponding permissions on cluster1 to dep:

# Create a ClusterRole that only has permission to operate on Deployments
clusteradm create work dep-role --cluster cluster1 -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: dep-role
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "watch", "list", "create", "update", "patch", "delete"]
EOF

# Bind the ClusterRole
clusteradm create work dep-rolebinding --cluster cluster1 -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dep-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: dep-role
subjects:
  - kind: ServiceAccount
    name: dep
    namespace: open-cluster-management-managed-serviceaccount
EOF

Once this is done, users can operate on Deployments on cluster1 through clusteradm:

clusteradm proxy kubectl --cluster=cluster1 --sa=dep -i
Please enter the kubectl command and use "exit" to quit the interactive mode
kubectl> get deployments -A
NAMESPACE                                        NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
kube-system                                      coredns                              2/2     2            2           20d
local-path-storage                               local-path-provisioner               1/1     1            1           20d
open-cluster-management-agent                    klusterlet-registration-agent        1/1     1            1           20d
open-cluster-management-agent                    klusterlet-work-agent                1/1     1            1           20d
open-cluster-management-cluster-proxy            cluster-proxy-proxy-agent            3/3     3            3           20d
open-cluster-management-managed-serviceaccount   managed-serviceaccount-addon-agent   1/1     1            1           20d
open-cluster-management                          klusterlet                           3/3     3            3           20d
# The user has no permission to access pods on cluster1, so the request is denied
kubectl> get pods -A
Error from server (Forbidden): pods is forbidden: User "system:serviceaccount:open-cluster-management-managed-serviceaccount:dep" cannot list resource "pods" in API group "" at the cluster scope

Note that to access cluster1 with clusteradm, the user also needs to be granted the following permissions on the hub:

# Read the MSA token
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: user
  namespace: cluster1
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
- apiGroups: ["authentication.open-cluster-management.io"]
  resources: ["managedserviceaccounts"]
  verbs: ["get"]
---
# Map the cluster-proxy Service locally via port-forward
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: user-cluster-proxy
  namespace: open-cluster-management-cluster-proxy
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods/portforward"]
  verbs: ["create"]
---
# Check the related resources before running the command
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: user
rules:
- apiGroups: ["cluster.open-cluster-management.io"]
  resources: ["managedclusters"]
  verbs: ["get, "list"]
- apiGroups: ["addon.open-cluster-management.io"]
  resources: ["clustermanagementaddons"]
  verbs: ["get"]
- apiGroups: ["proxy.open-cluster-management.io"]
  resources: ["managedproxyconfigurations"]
  verbs: ["get"]

Summary

This article described how to use OCM to give users access to clusters in different VPCs. With this approach, administrators no longer need special network configuration for the clusters, nor do they need to provide and maintain access credentials for multiple clusters for every user. All users access each cluster through a unified entry point, which improves both the security and the usability of the system.

Currently, OCM's cluster-proxy and managed-serviceaccount features are still at an early stage. We will keep improving them, and we welcome everyone to try them out and share feedback and suggestions.

Using the Open Cluster Management Placement for Multicluster Scheduling

Read more at Red Hat Cloud Blog.

Using the Open Cluster Management Add-on Framework to Develop a Managed Cluster Add-on

Read more at Red Hat Cloud Blog.

The Next Kubernetes Frontier: Multicluster Management

Read more at Container Journal.

Put together a user walk through for the basic Open Cluster Management API using `kind`, `olm`, and other open source technologies

Read more at GitHub.

Setting up Open Cluster Management the hard way

A guide to setting up Open Cluster Management manually for a deeper understanding of its components.

Read more at Setting up Open Cluster Management the hard way.