Getting Started

1: Quick Start

2: Installation

2.1: Start the control plane
2.2: Register a cluster
2.3: Add-on management
2.4: Running on EKS

3: Add-ons and Integrations

3.1: Policy

3.1.1: Policy framework
3.1.2: Policy API concepts
3.1.3: Configuration Policy
3.1.4: Open Policy Agent Gatekeeper

3.2: Application lifecycle management
3.3: Cluster proxy
3.4: Managed service account
3.5: Multicluster Control Plane
3.6: FleetConfig Controller

4: Administration

4.1: Monitoring OCM using Prometheus-Operator
4.2: Upgrading your OCM environment
4.3: Feature Gates

1 - Quick Start

Follow these steps to set up an OCM hub with two managed clusters using clusteradm and kind.

Prerequisites

Ensure kubectl and kustomize are installed.
Ensure kind is installed (v0.9.0 or later; the latest version is preferred).
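
One way to check the kind version prerequisite is a small version comparison helper. This is a minimal sketch, not part of OCM: the function name version_ge is hypothetical, it compares only numeric major.minor.patch fields, and it does not handle pre-release suffixes.

```shell
# version_ge A B: succeed when dotted version A is greater than or equal to B.
# A leading "v" is ignored; missing fields are treated as 0.
version_ge() (
    a=${1#v}
    b=${2#v}
    IFS=.
    set -- $a
    a1=${1:-0}; a2=${2:-0}; a3=${3:-0}
    set -- $b
    b1=${1:-0}; b2=${2:-0}; b3=${3:-0}
    [ "$a1" -gt "$b1" ] && exit 0
    [ "$a1" -lt "$b1" ] && exit 1
    [ "$a2" -gt "$b2" ] && exit 0
    [ "$a2" -lt "$b2" ] && exit 1
    [ "$a3" -ge "$b3" ]
)

# Example, assuming `kind version` prints a line like "kind v0.20.0 go1.20.4 linux/amd64":
#   version_ge "$(kind version | awk '{print $2}')" v0.9.0 || echo "kind is older than v0.9.0"
```

The function body runs in a subshell, so the temporary IFS change and the helper variables do not leak into the calling shell.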

Install clusteradm CLI tool

Run the following command to download and install the latest clusteradm command-line tool:

curl -L https://raw.githubusercontent.com/open-cluster-management-io/clusteradm/main/install.sh | bash

Setup hub and managed cluster

Run the following command to quickly set up a hub cluster and 2 managed clusters using kind.

curl -L https://raw.githubusercontent.com/open-cluster-management-io/OCM/main/solutions/setup-dev-environment/local-up.sh | bash

If you want to set up OCM in a production environment or on a different Kubernetes distribution, please refer to the Start the control plane and Register a cluster guides.

Alternatively, you can deploy OCM declaratively using the FleetConfig Controller.

What is next

Now you have the OCM control plane with 2 managed clusters connected! Let’s start your OCM journey.

Deploy Kubernetes resources onto a managed cluster
Access the Kubernetes API server of a managed cluster through cluster-proxy
Visit integration to check whether any OCM add-on meets your use cases.
- Deploy Policies onto a managed cluster
Use the OCM VScode Extension to easily generate OCM-related Kubernetes resources and track your cluster

2 - Installation

Install the core control plane that includes cluster registration and manifests distribution on the hub cluster.

Install the klusterlet agent on the managed cluster so that it can be registered and managed by the hub cluster.

2.1 - Start the control plane

Prerequisites

The hub cluster should be v1.19+. (To run on a hub cluster whose version is between v1.16 and v1.18, manually enable the feature gate “V1beta1CSRAPICompatibility”.)
Currently the bootstrap process relies on client authentication via CSR. Therefore, if your Kubernetes distribution (such as EKS) doesn’t support it, you can:
- follow this article to run OCM natively on EKS
- or choose the multicluster-controlplane as the hub control plane
Ensure kubectl and kustomize are installed.

Network requirements

Configure your network settings for the hub cluster to allow the following connections.

Direction	Endpoint	Protocol	Purpose	Used by
Inbound	https://{hub-api-server-url}:{port}	TCP	Kubernetes API server of the hub cluster	OCM agents, including the add-on agents, running on the managed clusters

Install clusteradm CLI tool

It’s recommended to run the following command to download and install the latest release of the clusteradm command-line tool:

curl -L https://raw.githubusercontent.com/open-cluster-management-io/clusteradm/main/install.sh | bash

You can also install the latest development version (main branch) by running:

# Installing clusteradm to $GOPATH/bin/
GO111MODULE=off go get -u open-cluster-management.io/clusteradm/...

Bootstrap a cluster manager

Before installing the OCM components into your clusters, export the following environment variable in your terminal so that the clusteradm command-line tool can correctly identify the hub cluster.

# The context name of the clusters in your kubeconfig
export CTX_HUB_CLUSTER=<your hub cluster context>

Call clusteradm init:

# By default, it installs the latest release of the OCM components.
# Use e.g. "--bundle-version=latest" to install latest development builds.
# NOTE: For hub cluster versions between v1.16 and v1.19, use the parameter: --use-bootstrap-token
clusteradm init --wait --context ${CTX_HUB_CLUSTER}

Configure CPU and memory resources

You can configure CPU and memory resources for the cluster manager components by adding resource flags to the clusteradm init command. These flags indicate that all components in the hub controller will use the same resource requirement or limit:

# Configure resource requests and limits for cluster manager components
clusteradm init \
    --resource-qos-class ResourceRequirement \
    --resource-limits cpu=1000m,memory=1Gi \
    --resource-requests cpu=500m,memory=512Mi \
    --wait --context ${CTX_HUB_CLUSTER}

Available resource configuration flags:

--resource-qos-class: Sets the resource QoS class (Default, BestEffort, or ResourceRequirement)
--resource-limits: Specifies resource limits as key-value pairs (e.g., cpu=800m,memory=800Mi)
--resource-requests: Specifies resource requests as key-value pairs (e.g., cpu=500m,memory=500Mi)

The clusteradm init command installs the registration-operator on the hub cluster, which is responsible for consistently installing and upgrading a few core components for the OCM environment.

After the init command completes, a generated command is output on the console to register your managed clusters. An example of the generated command is shown below.

clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub kube-apiserver endpoint> \
    --wait \
    --cluster-name <cluster_name>

It’s recommended to save the command somewhere secure for future use. If it’s lost, you can use clusteradm get token to get the generated command again.

Check out the running instances of the control plane

kubectl -n open-cluster-management get pod --context ${CTX_HUB_CLUSTER}
NAME                               READY   STATUS    RESTARTS   AGE
cluster-manager-695d945d4d-5dn8k   1/1     Running   0          19d

Additionally, to check out the instances of OCM’s hub control plane, run the following command:

kubectl -n open-cluster-management-hub get pod --context ${CTX_HUB_CLUSTER}
NAME                                                       READY   STATUS    RESTARTS   AGE
cluster-manager-placement-controller-857f8f7654-x7sfz      1/1     Running   0          19d
cluster-manager-registration-controller-85b6bd784f-jbg8s   1/1     Running   0          19d
cluster-manager-registration-webhook-59c9b89499-n7m2x      1/1     Running   0          19d
cluster-manager-work-webhook-59cf7dc855-shq5p              1/1     Running   0          19d
...

The overall installation information is visible on the clustermanager custom resource:

kubectl get clustermanager cluster-manager -o yaml --context ${CTX_HUB_CLUSTER}

Uninstall the OCM from the control plane

Before uninstalling the OCM components from your clusters, please detach the managed cluster from the control plane.

clusteradm clean --context ${CTX_HUB_CLUSTER}

Check that the instances of OCM’s hub control plane are removed.

kubectl -n open-cluster-management-hub get pod --context ${CTX_HUB_CLUSTER}
No resources found in open-cluster-management-hub namespace.

kubectl -n open-cluster-management get pod --context ${CTX_HUB_CLUSTER}
No resources found in open-cluster-management namespace.

Check that the clustermanager resource is removed from the control plane.

kubectl get clustermanager --context ${CTX_HUB_CLUSTER}
error: the server doesn't have a resource type "clustermanager"

2.2 - Register a cluster

After the cluster manager is installed on the hub cluster, you need to install the klusterlet agent on another cluster so that it can be registered and managed by the hub cluster.

Prerequisites

The managed clusters should be v1.11+.
Ensure kubectl and kustomize are installed.

Network requirements

Configure your network settings for the managed clusters to allow the following connections.

Direction	Endpoint	Protocol	Purpose	Used by
Outbound	https://{hub-api-server-url}:{port}	TCP	Kubernetes API server of the hub cluster	OCM agents, including the add-on agents, running on the managed clusters

To use a proxy, make sure the proxy server is configured to allow the above connections and is reachable from the managed clusters. See Register a cluster to hub through proxy server for more details.

Install clusteradm CLI tool

It’s recommended to run the following command to download and install the latest release of the clusteradm command-line tool:

curl -L https://raw.githubusercontent.com/open-cluster-management-io/clusteradm/main/install.sh | bash

You can also install the latest development version (main branch) by running:

# Installing clusteradm to $GOPATH/bin/
GO111MODULE=off go get -u open-cluster-management.io/clusteradm/...

Bootstrap a klusterlet

# The context name of the clusters in your kubeconfig
export CTX_HUB_CLUSTER=<your hub cluster context>
export CTX_MANAGED_CLUSTER=<your managed cluster context>

Copy the previously generated clusteradm join command and add the arguments that match your Kubernetes distribution.

NOTE: If there is no kube-root-ca.crt configmap in the kube-public namespace of the hub cluster, set the --ca-file flag to provide a valid hub CA file to help set up the external client.
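
To find out whether the --ca-file flag is needed, you can look for the configmap on the hub before joining. The helper below is a minimal sketch: the function names hub_has_root_ca and ca_file_hint are hypothetical, and it only wraps a plain kubectl get.

```shell
# hub_has_root_ca CONTEXT: succeed when the kube-root-ca.crt configmap exists
# in the kube-public namespace of the cluster behind the given kubeconfig context.
hub_has_root_ca() {
    kubectl --context "$1" -n kube-public get configmap kube-root-ca.crt >/dev/null 2>&1
}

# ca_file_hint CONTEXT: print whether clusteradm join needs --ca-file.
ca_file_hint() {
    if hub_has_root_ca "$1"; then
        echo "kube-root-ca.crt found: --ca-file is not required"
    else
        echo "kube-root-ca.crt missing: pass --ca-file <path to hub CA file> to clusteradm join"
    fi
}
```

For example, ca_file_hint "${CTX_HUB_CLUSTER}" prints which of the two cases applies to your hub.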

# NOTE: For kind clusters use the parameter: --force-internal-endpoint-lookup
# "cluster1" can be replaced with any other unique name.
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --force-internal-endpoint-lookup \
    --context ${CTX_MANAGED_CLUSTER}

# For other distributions, omit --force-internal-endpoint-lookup.
# "cluster1" can be replaced with any other unique name.
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --context ${CTX_MANAGED_CLUSTER}

Configure CPU and memory resources

You can configure CPU and memory resources for the klusterlet agent components by adding resource flags to the clusteradm join command. These flags indicate that all components in the klusterlet agent will use the same resource requirement or limit:

# Configure resource requests and limits for klusterlet components (kind clusters)
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --resource-qos-class ResourceRequirement \
    --resource-limits cpu=800m,memory=800Mi \
    --resource-requests cpu=400m,memory=400Mi \
    --force-internal-endpoint-lookup \
    --context ${CTX_MANAGED_CLUSTER}

# Configure resource requests and limits for klusterlet components (other distributions)
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --resource-qos-class ResourceRequirement \
    --resource-limits cpu=800m,memory=800Mi \
    --resource-requests cpu=400m,memory=400Mi \
    --context ${CTX_MANAGED_CLUSTER}

Available resource configuration flags:

--resource-qos-class: Sets the resource QoS class (Default, BestEffort, or ResourceRequirement)
--resource-limits: Specifies resource limits as key-value pairs (e.g., cpu=800m,memory=800Mi)
--resource-requests: Specifies resource requests as key-value pairs (e.g., cpu=500m,memory=500Mi)

Bootstrap a klusterlet in hosted mode (Optional)

With the above command, the klusterlet components (registration-agent and work-agent) are deployed on the managed cluster, so the hub cluster must be exposed to the managed cluster. OCM also provides an option for running the klusterlet components outside the managed cluster, for example on the hub cluster (hosted mode).

The hosted mode deployment is still experimental. Consider using it only when:

you want to reduce the footprint of the managed cluster.
you do not want to expose the hub cluster to the managed cluster directly.

In hosted mode, the cluster where the klusterlet is running is called the hosting cluster. Run the following command against the hosting cluster to register the managed cluster to the hub.

# NOTE for kind clusters:
#  1. If the hub is kind, use the parameter: --force-internal-endpoint-lookup
#  2. If the managed cluster is kind, --managed-cluster-kubeconfig should be an internal kubeconfig: `kind get kubeconfig --name managed --internal`
# "cluster1" can be replaced with any other unique name.
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --mode hosted \
    --managed-cluster-kubeconfig <your managed cluster kubeconfig> \
    --force-internal-endpoint-lookup \
    --context <your hosting cluster context>

# For other distributions, omit --force-internal-endpoint-lookup.
# "cluster1" can be replaced with any other unique name.
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --mode hosted \
    --managed-cluster-kubeconfig <your managed cluster kubeconfig> \
    --context <your hosting cluster context>
Resource configuration in hosted mode:

You can also configure CPU and memory resources when using hosted mode by adding the same resource flags:

# Configure resource requests and limits for klusterlet components in hosted mode (kind hub)
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --mode hosted \
    --managed-cluster-kubeconfig <your managed cluster kubeconfig> \
    --resource-qos-class ResourceRequirement \
    --resource-limits cpu=800m,memory=800Mi \
    --resource-requests cpu=400m,memory=400Mi \
    --force-internal-endpoint-lookup \
    --context <your hosting cluster context>

# Configure resource requests and limits for klusterlet components in hosted mode (other distributions)
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --mode hosted \
    --managed-cluster-kubeconfig <your managed cluster kubeconfig> \
    --resource-qos-class ResourceRequirement \
    --resource-limits cpu=800m,memory=800Mi \
    --resource-requests cpu=400m,memory=400Mi \
    --context <your hosting cluster context>

Bootstrap a klusterlet in singleton mode

To reduce the footprint of the agent in the managed cluster, singleton mode was introduced in v0.12.0. In singleton mode, the work and registration agents run as a single pod in the managed cluster.

Note: to run the klusterlet in singleton mode, your clusteradm version must be v0.12.0 or higher.

# NOTE: For kind clusters use the parameter: --force-internal-endpoint-lookup
# "cluster1" can be replaced with any other unique name.
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --singleton \
    --force-internal-endpoint-lookup \
    --context ${CTX_MANAGED_CLUSTER}

# For other distributions, omit --force-internal-endpoint-lookup.
# "cluster1" can be replaced with any other unique name.
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --singleton \
    --context ${CTX_MANAGED_CLUSTER}

Resource configuration in singleton mode:

You can also configure CPU and memory resources when using singleton mode:

# Configure resource requests and limits for klusterlet components in singleton mode (kind clusters)
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --singleton \
    --resource-qos-class ResourceRequirement \
    --resource-limits cpu=600m,memory=600Mi \
    --resource-requests cpu=300m,memory=300Mi \
    --force-internal-endpoint-lookup \
    --context ${CTX_MANAGED_CLUSTER}

# Configure resource requests and limits for klusterlet components in singleton mode (other distributions)
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --singleton \
    --resource-qos-class ResourceRequirement \
    --resource-limits cpu=600m,memory=600Mi \
    --resource-requests cpu=300m,memory=300Mi \
    --context ${CTX_MANAGED_CLUSTER}

Accept the join request and verify

After the OCM agent is running on your managed cluster, it sends a “handshake” to your hub cluster and waits for approval from the hub cluster admin. In this section, we walk through accepting the registration requests from the perspective of an OCM hub admin.

Wait for the CSR object to be created on the hub cluster by your managed cluster’s OCM agent:

kubectl get csr -w --context ${CTX_HUB_CLUSTER} | grep cluster1  # or the previously chosen cluster name

An example of a pending CSR request is shown below:

cluster1-tqcjj   33s   kubernetes.io/kube-apiserver-client   system:serviceaccount:open-cluster-management:cluster-bootstrap   Pending
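
If you prefer a scripted wait over watching the CSR list by hand, the polling loop below is one possible sketch. The function name wait_for_pending_csr and the CSR_POLL_SECONDS variable are hypothetical, and the loop assumes the CSR name starts with the cluster name followed by a dash and that the last column of kubectl get csr is the condition, as in the example output above.

```shell
# wait_for_pending_csr CLUSTER CONTEXT [TRIES]
# Poll the hub until a CSR named "CLUSTER-..." is listed with condition Pending.
# Returns 0 when one is found, 1 when TRIES (default 30) polls are exhausted.
wait_for_pending_csr() (
    cluster=$1
    ctx=$2
    tries=${3:-30}
    while [ "$tries" -gt 0 ]; do
        if kubectl get csr --context "$ctx" --no-headers 2>/dev/null |
            awk -v p="$cluster-" 'index($1, p) == 1 && $NF == "Pending" { found = 1 } END { exit !found }'
        then
            exit 0
        fi
        tries=$((tries - 1))
        # Pause between polls; the interval can be overridden for testing.
        sleep "${CSR_POLL_SECONDS:-2}"
    done
    exit 1
)
```

A typical call would be wait_for_pending_csr cluster1 "${CTX_HUB_CLUSTER}", followed by the accept step once it returns successfully.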

Accept the join request using the clusteradm tool:
```
clusteradm accept --clusters cluster1 --context ${CTX_HUB_CLUSTER}
```
After running the accept command, the CSR from your managed cluster named “cluster1” will be approved. Additionally, it will instruct the OCM hub control plane to set up related objects (such as a namespace named “cluster1” in the hub cluster) and RBAC permissions automatically.

Verify the installation of the OCM agents on your managed cluster by running:

kubectl -n open-cluster-management-agent get pod --context ${CTX_MANAGED_CLUSTER}
NAME                                             READY   STATUS    RESTARTS   AGE
klusterlet-registration-agent-598fd79988-jxx7n   1/1     Running   0          19d
klusterlet-work-agent-7d47f4b5c5-dnkqw           1/1     Running   0          19d

Verify that the cluster1 ManagedCluster object was created successfully by running:

kubectl get managedcluster --context ${CTX_HUB_CLUSTER}

Then you should get a result that resembles the following:

NAME       HUB ACCEPTED   MANAGED CLUSTER URLS      JOINED   AVAILABLE   AGE
cluster1   true           <your endpoint>           True     True        5m23s

If the managed cluster status is not true, refer to Troubleshooting to debug on your cluster.

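Apply a ManifestWork

After the managed cluster is registered, test that you can deploy a pod to the managed cluster from the hub cluster. Create a manifest-work.yaml as shown in this example, replacing ${MANAGED_CLUSTER_NAME} with the name of your managed cluster (for example, cluster1), because kubectl does not expand shell variables inside a manifest file: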

apiVersion: work.open-cluster-management.io/v1
kind: ManifestWork
metadata:
  name: mw-01
  namespace: ${MANAGED_CLUSTER_NAME}
spec:
  workload:
    manifests:
      - apiVersion: v1
        kind: Pod
        metadata:
          name: hello
          namespace: default
        spec:
          containers:
            - name: hello
              image: busybox
              command: ["sh", "-c", 'echo "Hello, Kubernetes!" && sleep 3600']
          restartPolicy: OnFailure

Apply the yaml file to the hub cluster.

kubectl apply -f manifest-work.yaml --context ${CTX_HUB_CLUSTER}
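
As an alternative to editing the file by hand, the manifest can be generated with the cluster name filled in by the shell. This is a minimal sketch; the function name render_manifestwork is hypothetical, and the manifest body mirrors the example above.

```shell
# render_manifestwork CLUSTER: print the example ManifestWork with its namespace
# set to CLUSTER (the cluster namespace on the hub).
render_manifestwork() {
    if [ -z "$1" ]; then
        echo "usage: render_manifestwork <managed cluster name>" >&2
        return 1
    fi
    cat <<EOF
apiVersion: work.open-cluster-management.io/v1
kind: ManifestWork
metadata:
  name: mw-01
  namespace: $1
spec:
  workload:
    manifests:
      - apiVersion: v1
        kind: Pod
        metadata:
          name: hello
          namespace: default
        spec:
          containers:
            - name: hello
              image: busybox
              command: ["sh", "-c", 'echo "Hello, Kubernetes!" && sleep 3600']
          restartPolicy: OnFailure
EOF
}
```

The output can be piped straight to the hub, for example: render_manifestwork cluster1 | kubectl apply --context "${CTX_HUB_CLUSTER}" -f -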

Verify that the ManifestWork resource was applied to the hub. The command assumes MANAGED_CLUSTER_NAME is set in your shell to the name of the managed cluster.

kubectl -n ${MANAGED_CLUSTER_NAME} get manifestwork/mw-01 --context ${CTX_HUB_CLUSTER} -o yaml

Check the managed cluster to see that the hello Pod has been deployed from the hub cluster.

kubectl -n default get pod --context ${CTX_MANAGED_CLUSTER}
NAME    READY   STATUS    RESTARTS   AGE
hello   1/1     Running   0          108s

Troubleshooting

The managed cluster status may not be True. For example, checking the managedcluster may return the result below.

kubectl get managedcluster --context ${CTX_HUB_CLUSTER}
NAME                      HUB ACCEPTED   MANAGED CLUSTER URLS   JOINED   AVAILABLE   AGE
${MANAGED_CLUSTER_NAME}   true           https://localhost               Unknown     46m

There are many possible reasons for this problem. Use the commands below to get more debugging information. If that information doesn’t help, please open an issue.

On the hub cluster, check the managedcluster status.

kubectl get managedcluster ${MANAGED_CLUSTER_NAME} --context ${CTX_HUB_CLUSTER} -o yaml

On the hub cluster, check the lease status.

kubectl get lease -n ${MANAGED_CLUSTER_NAME} --context ${CTX_HUB_CLUSTER}

On the managed cluster, check the klusterlet status.

kubectl get klusterlet -o yaml --context ${CTX_MANAGED_CLUSTER}
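
The three checks above can be bundled into one helper so that the output is easy to save and attach to an issue. This is a minimal sketch; the function name collect_registration_debug is hypothetical, and it runs only the commands already listed in this section.

```shell
# collect_registration_debug CLUSTER HUB_CONTEXT MANAGED_CONTEXT
# Print the managedcluster status, the leases in the cluster namespace, and the
# klusterlet status, each under a short heading.
collect_registration_debug() (
    cluster=$1
    hub_ctx=$2
    managed_ctx=$3
    echo "== managedcluster $cluster (hub) =="
    kubectl get managedcluster "$cluster" --context "$hub_ctx" -o yaml
    echo "== leases in namespace $cluster (hub) =="
    kubectl get lease -n "$cluster" --context "$hub_ctx"
    echo "== klusterlet (managed cluster) =="
    kubectl get klusterlet -o yaml --context "$managed_ctx"
)
```

For example: collect_registration_debug cluster1 "${CTX_HUB_CLUSTER}" "${CTX_MANAGED_CLUSTER}" > ocm-debug.txt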

Detach the cluster from hub

Remove the resources generated when registering with the hub cluster.

clusteradm unjoin --cluster-name "cluster1" --context ${CTX_MANAGED_CLUSTER}

Check that the OCM agent is removed from the managed cluster.

kubectl -n open-cluster-management-agent get pod --context ${CTX_MANAGED_CLUSTER}
No resources found in open-cluster-management-agent namespace.

Check that the klusterlet is removed from the managed cluster.

kubectl get klusterlet --context ${CTX_MANAGED_CLUSTER}
error: the server doesn't have a resource type "klusterlet"

Resource cleanup when the managed cluster is deleted

When a user deletes the managedCluster resource, all associated resources within the cluster namespace must also be removed. This includes managedClusterAddons, manifestWorks, and the roleBindings for the klusterlet agent. Resource cleanup follows a specific sequence to prevent resources from being stuck in a terminating state:

managedClusterAddons are deleted first.
manifestWorks are removed after all managedClusterAddons are deleted.
Among resources of the same kind (managedClusterAddons or manifestWorks), custom deletion ordering can be defined using the open-cluster-management.io/cleanup-priority annotation:
- Priority values range from 0 to 100 (lower values execute first).

The open-cluster-management.io/cleanup-priority annotation controls deletion order when resource instances have dependencies. For example:

A manifestWork that applies a CRD and operator should be deleted after a manifestWork that creates a CR instance, allowing the operator to perform cleanup after the CR is removed.
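
The annotation can be set with kubectl annotate. The helper below is a minimal sketch: the function name set_cleanup_priority is hypothetical, and the resource names in the usage note are placeholders. It rejects values outside the 0 to 100 range before calling kubectl.

```shell
# set_cleanup_priority KIND NAME NAMESPACE PRIORITY CONTEXT
# Annotate a manifestwork or managedclusteraddon with a cleanup priority.
set_cleanup_priority() (
    kind=$1
    name=$2
    ns=$3
    prio=$4
    ctx=$5
    # Accept only non-negative integers no larger than 100.
    case $prio in
        '' | *[!0-9]*)
            echo "priority must be an integer from 0 to 100" >&2
            exit 1
            ;;
    esac
    if [ "$prio" -gt 100 ]; then
        echo "priority must be an integer from 0 to 100" >&2
        exit 1
    fi
    kubectl annotate "$kind" "$name" -n "$ns" --context "$ctx" --overwrite \
        "open-cluster-management.io/cleanup-priority=$prio"
)
```

For the CRD and operator scenario above, the manifestWork holding the CR instance would get the lower value and the manifestWork holding the CRD and operator the higher one, for example set_cleanup_priority manifestwork <cr work name> cluster1 10 "${CTX_HUB_CLUSTER}" and set_cleanup_priority manifestwork <operator work name> cluster1 20 "${CTX_HUB_CLUSTER}".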

The ResourceCleanup featureGate for cluster registration on the Hub cluster enables automatic cleanup of managedClusterAddons and manifestWorks within the cluster namespace after cluster unjoining.

Version Compatibility:

The ResourceCleanup featureGate was introduced in OCM v0.13.0 and is disabled by default in OCM v0.16.0 and earlier versions. To activate it, modify the clusterManager CR configuration:

registrationConfiguration:
  featureGates:
  - feature: ResourceCleanup
    mode: Enable

Starting with OCM v0.17.0, the ResourceCleanup featureGate has been upgraded from Alpha to Beta status and is enabled by default.

Disabling the Feature: To deactivate this functionality, update the clusterManager CR on the hub cluster:

registrationConfiguration:
  featureGates:
  - feature: ResourceCleanup
    mode: Disable
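
Either setting can also be applied with a patch instead of editing the CR by hand. The sketch below rests on two assumptions: that the clusterManager CR is named cluster-manager, as in the kubectl get clustermanager example earlier in this guide, and that registrationConfiguration sits under spec. The function name set_resource_cleanup_gate is hypothetical. Note that a merge patch replaces the whole featureGates list, so any other registration feature gates you have configured must be included in the patch as well.

```shell
# set_resource_cleanup_gate MODE CONTEXT
# MODE is Enable or Disable. Patches the clusterManager CR on the hub.
set_resource_cleanup_gate() (
    mode=$1
    ctx=$2
    case $mode in
        Enable | Disable) ;;
        *)
            echo "mode must be Enable or Disable" >&2
            exit 1
            ;;
    esac
    # Assumption: the feature gate list lives at spec.registrationConfiguration.featureGates.
    kubectl patch clustermanager cluster-manager --context "$ctx" --type merge -p \
        "{\"spec\":{\"registrationConfiguration\":{\"featureGates\":[{\"feature\":\"ResourceCleanup\",\"mode\":\"$mode\"}]}}}"
)
```

For example: set_resource_cleanup_gate Disable "${CTX_HUB_CLUSTER}"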

2.3 - Add-on management

Add-on enablement

From a user’s perspective, to install an add-on on the hub cluster, the hub admin registers a globally unique ClusterManagementAddOn resource as a singleton placeholder in the hub cluster. For instance, the helloworld add-on can be registered to the hub cluster by creating:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: helloworld
spec:
  addOnMeta:
    displayName: helloworld

Enable the add-on manually

The add-on manager running on the hub is responsible for configuring the installation of add-on agents for each managed cluster. To enable the add-on for a certain managed cluster, create a ManagedClusterAddOn resource in the cluster namespace. The name of the ManagedClusterAddOn must be the same as the name of the corresponding ClusterManagementAddOn. For instance, the following example enables the helloworld add-on in “cluster1”:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: helloworld
  namespace: cluster1
spec:
  installNamespace: helloworld
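
When the same add-on has to be enabled on several clusters manually, the manifest can be generated per cluster. This is a minimal sketch; the function name render_managed_cluster_addon is hypothetical, and defaulting installNamespace to the add-on name is a convenience that simply mirrors the example above.

```shell
# render_managed_cluster_addon ADDON CLUSTER [INSTALL_NAMESPACE]
# Print one ManagedClusterAddOn document for the given cluster namespace.
# Each document starts with "---" so that several can be concatenated.
render_managed_cluster_addon() {
    cat <<EOF
---
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: $1
  namespace: $2
spec:
  installNamespace: ${3:-$1}
EOF
}
```

For example: for c in cluster1 cluster2; do render_managed_cluster_addon helloworld "$c"; done | kubectl apply --context "${CTX_HUB_CLUSTER}" -f -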

Enable the add-on automatically

If the add-on is developed with automatic installation, which supports auto-install by cluster discovery, then the ManagedClusterAddOn is created automatically, either for all managed cluster namespaces or for the selected managed cluster namespaces.

Enable the add-on by install strategy

If the add-on is developed following the guidelines mentioned in managing the add-on agent lifecycle by addon-manager, the user can define an installStrategy in the ClusterManagementAddOn to specify on which clusters the ManagedClusterAddOn should be enabled. For details, see install strategy.

Add-on healthiness

The health of the add-on instances is visible when listing the add-ons via kubectl:

$ kubectl get managedclusteraddon -A
NAMESPACE   NAME                     AVAILABLE   DEGRADED   PROGRESSING
<cluster>   <addon>                  True

The add-on agent is expected to report its health periodically as long as it’s running. Optionally, the version of the add-on agent can also be reflected in the resources so that agent upgrades can be rolled out progressively.

Clean the add-ons

Last but not least, a clean uninstallation of the add-on is supported by simply deleting the corresponding ClusterManagementAddOn resource from the hub cluster, which is the “root” of the whole add-on. The OCM platform then automatically cleans up after the uninstallation by removing all the components, both in the hub cluster and in the managed clusters.

Add-on lifecycle management

Install strategy

InstallStrategy specifies the clusters on which the related ManagedClusterAddOns should be installed. For example, the following configuration enables the helloworld add-on on clusters whose platform.open-cluster-management.io cluster claim is aws.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: helloworld
  annotations:
    addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
  addOnMeta:
    displayName: helloworld
  installStrategy:
    type: Placements
    placements:
    - name: placement-aws
      namespace: default

apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: placement-aws
  namespace: default
spec:
  predicates:
    - requiredClusterSelector:
        claimSelector:
          matchExpressions:
            - key: platform.open-cluster-management.io
              operator: In
              values:
                - aws

Rollout strategy

With the rollout strategy defined in the ClusterManagementAddOn API, users can control the upgrade behavior of the addon when there are changes in the configurations.

For example, suppose the add-on user updates the “deploy-config” and wants to apply the change to the add-ons in the “canary” decision groups first and then, if all of those add-ons upgrade successfully, upgrade the rest of the clusters progressively at a rate of 25%. The rollout strategy can be defined as follows:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: helloworld
  annotations:
    addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
  addOnMeta:
    displayName: helloworld
  installStrategy:
    type: Placements
    placements:
    - name: placement-aws
      namespace: default
      configs:
      - group: addon.open-cluster-management.io
        resource: addondeploymentconfigs
        name: deploy-config
        namespace: open-cluster-management
      rolloutStrategy:
        type: Progressive
        progressive:
          mandatoryDecisionGroups:
          - groupName: "prod-canary-west"
          - groupName: "prod-canary-east"
          maxConcurrency: 25%
          minSuccessTime: 5m
          progressDeadline: 10m
          maxFailures: 2

In the above example with type Progressive, once the user updates the “deploy-config”, the controller rolls out to the clusters in mandatoryDecisionGroups first, then rolls out to the other clusters at the rate defined in maxConcurrency.

minSuccessTime is a “soak” time: the controller waits for 5 minutes after a cluster reaches a successful state, provided maxFailures isn’t breached. If the workload status remains successful after this 5-minute interval, the rollout progresses to the next.
progressDeadline means the controller waits a maximum of 10 minutes for the workload to reach a successful state. If the workload fails to succeed within 10 minutes, the controller stops waiting, marks the workload as “timeout”, and includes it in the count of maxFailures.
maxFailures means the controller can tolerate up to 2 clusters with failed status; once maxFailures is breached, the rollout stops.

Currently, add-ons support 3 types of rolloutStrategy: All, Progressive, and ProgressivePerGroup. For more information about the rollout strategies, see the Rollout Strategy document.

Add-on configurations

Default configurations

In ClusterManagementAddOn, spec.supportedConfigs is a list of configuration types supported by the add-on. defaultConfig represents the namespace and name of the default add-on configuration, for scenarios where all add-ons share the same configuration. Only one configuration of the same group and resource can be specified in the defaultConfig.

In the example below, add-ons on all the clusters will use “default-deploy-config” and “default-example-config”.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: helloworld
  annotations:
    addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
  addOnMeta:
    displayName: helloworld
  supportedConfigs:
  - defaultConfig:
      name: default-deploy-config
      namespace: open-cluster-management
    group: addon.open-cluster-management.io
    resource: addondeploymentconfigs
  - defaultConfig:
      name: default-example-config
      namespace: open-cluster-management
    group: example.open-cluster-management.io
    resource: exampleconfigs

Configurations per install strategy

In ClusterManagementAddOn, spec.installStrategy.placements[].configs lists the configurations of the ManagedClusterAddOn during installation for a group of clusters. Since OCM v0.15.0, multiple configurations with the same group and resource can be defined in this field. It overrides the Default configurations on certain clusters by group and resource.

In the example below, add-ons on clusters selected by Placement placement-aws will use “deploy-config”, “example-config-1” and “example-config-2”, while all the other add-ons will still use “default-deploy-config” and “default-example-config”.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: helloworld
  annotations:
    addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
  addOnMeta:
    displayName: helloworld
  supportedConfigs:
  - defaultConfig:
      name: default-deploy-config
      namespace: open-cluster-management
    group: addon.open-cluster-management.io
    resource: addondeploymentconfigs
  installStrategy:
    type: Placements
    placements:
    - name: placement-aws
      namespace: default
      configs:
      - group: addon.open-cluster-management.io
        resource: addondeploymentconfigs
        name: deploy-config
        namespace: open-cluster-management
      - group: example.open-cluster-management.io
        resource: exampleconfigs
        name: example-config-1
        namespace: open-cluster-management
      - group: example.open-cluster-management.io
        resource: exampleconfigs
        name: example-config-2
        namespace: open-cluster-management

Configurations per cluster

In ManagedClusterAddOn, spec.configs is a list of add-on configurations, for scenarios where the current add-on has its own configurations. Since OCM v0.15.0, it also supports defining multiple configurations with the same group and resource. It overrides the Default configurations and the Configurations per install strategy defined in ClusterManagementAddOn by group and resource.

In the example below, the add-on on cluster1 will use “cluster1-deploy-config” and “cluster1-example-config”.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: helloworld
  namespace: cluster1
spec:
  configs:
  - group: addon.open-cluster-management.io
    resource: addondeploymentconfigs
    name: cluster1-deploy-config
    namespace: open-cluster-management
  - group: example.open-cluster-management.io
    resource: exampleconfigs
    name: cluster1-example-config
    namespace: open-cluster-management
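
The referenced configuration objects must exist on the hub. As a rough sketch, the cluster1-deploy-config object could be an AddOnDeploymentConfig like the one below. The node selector, toleration, and the LOG_LEVEL variable are illustrative assumptions; whether a given add-on honors them depends on that add-on's implementation.

```yaml
# Hypothetical content for cluster1-deploy-config; all values are examples only.
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: AddOnDeploymentConfig
metadata:
  name: cluster1-deploy-config
  namespace: open-cluster-management
spec:
  # Where the add-on agent pods may be scheduled on the managed cluster
  nodePlacement:
    nodeSelector:
      kubernetes.io/os: linux
    tolerations:
    - key: node-role.kubernetes.io/infra
      operator: Exists
      effect: NoSchedule
  # Free-form variables passed to the add-on (LOG_LEVEL is an assumed name)
  customizedVariables:
  - name: LOG_LEVEL
    value: "debug"
```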

Supported configurations

Supported configurations are the configuration types that are allowed to override the add-on configurations defined in the ClusterManagementAddOn spec. They are listed in the ManagedClusterAddOn status.supportedConfigs, for example:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: helloworld
  namespace: cluster1
spec:
...
status:
...
  supportedConfigs:
  - group: addon.open-cluster-management.io
    resource: addondeploymentconfigs
  - group: example.open-cluster-management.io
    resource: exampleconfigs

Effective configurations

As described above, add-on configurations can be defined in three places. These follow an override order (per-cluster configurations override per-install-strategy configurations, which override the defaults), so for each group and resource only one of them takes effect. The final effective configurations are listed in the ManagedClusterAddOn status.configReferences.

desiredConfig records the desired config and its spec hash.
lastAppliedConfig records the config when the corresponding ManifestWork is applied successfully.

For example:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: helloworld
  namespace: cluster1
...
status:
...
  configReferences:
  - desiredConfig:
      name: cluster1-deploy-config
      namespace: open-cluster-management
      specHash: dcf88f5b11bd191ed2f886675f967684da8b5bcbe6902458f672277d469e2044
    group: addon.open-cluster-management.io
    lastAppliedConfig:
      name: cluster1-deploy-config
      namespace: open-cluster-management
      specHash: dcf88f5b11bd191ed2f886675f967684da8b5bcbe6902458f672277d469e2044
    lastObservedGeneration: 1
    name: cluster1-deploy-config
    resource: addondeploymentconfigs

2.4 - Running on EKS

Use this solution to run an AWS EKS cluster as the hub. It relies on AWS IAM roles for authentication, so only managed clusters running on EKS can use it.

Refer to this article for detailed registration instructions.

3 - Add-ons and Integrations

Enhance the open-cluster-management core control plane with optional add-ons and integrations.

3.1 - Policy

The Policy Add-on enables auditing and enforcement of configuration across clusters managed by OCM, enhancing security, easing maintenance burdens, and increasing consistency across the clusters for your compliance and reliability requirements.

View the following sections to learn more about the Policy Add-on:

Policy framework

Learn about the architecture of the Policy Add-on that delivers policies defined on the hub cluster to the managed clusters and how to install and enable the add-on for your OCM clusters.
Policy API concepts

Learn about the APIs that the Policy Add-on uses and how the APIs are related to one another to deliver policies to the clusters managed by OCM.
Supported managed cluster policy engines
- Configuration policy
  
  The ConfigurationPolicy is provided by OCM and defines Kubernetes manifests to compare with objects that currently exist on the cluster. The action that the ConfigurationPolicy will take is determined by its complianceType. Compliance types include musthave, mustnothave, and mustonlyhave. musthave means the object should have the listed keys and values as a subset of the larger object. mustnothave means an object matching the listed keys and values should not exist. mustonlyhave ensures objects only exist with the keys and values exactly as defined.
- Open Policy Agent Gatekeeper
  
  Gatekeeper is a validating webhook with auditing capabilities that can enforce custom resource definition-based policies that are run with the Open Policy Agent (OPA). Gatekeeper ConstraintTemplates and constraints can be provided in an OCM Policy to sync to managed clusters that have Gatekeeper installed on them.

3.1.1 - Policy framework

The policy framework provides governance capabilities to OCM managed Kubernetes clusters. Policies provide visibility and drive remediation for various security and configuration aspects to help IT administrators meet their requirements.

API Concepts

View the Policy API page for additional details about the Policy API managed by the Policy Framework components.

Architecture

The governance policy framework distributes policies to managed clusters and collects results to send back to the hub cluster.

Prerequisite

You must meet the following prerequisites to install the policy framework:

Ensure kubectl and kustomize are installed.
Ensure the open-cluster-management cluster manager is installed. See Start the control plane for more information.
Ensure the open-cluster-management klusterlet is installed. See Register a cluster for more information.
If you are using PlacementRules with your policies, ensure the open-cluster-management application is installed. See Application management for more information. If you are using the default Placement API, you can skip the Application management installation, but you do need to install the PlacementRule CRD with this command:
```
kubectl apply -f https://raw.githubusercontent.com/open-cluster-management-io/multicloud-operators-subscription/main/deploy/hub-common/apps.open-cluster-management.io_placementrules_crd.yaml
```

Install the governance-policy-framework hub components

Install via Clusteradm CLI

Ensure clusteradm CLI is installed and is at least v0.3.0. Download and extract the clusteradm binary. For more details see the clusteradm GitHub page.

Deploy the policy framework controllers to the hub cluster:

# The context name of the clusters in your kubeconfig
# If the clusters are created by KinD, then the context name will follow the pattern "kind-<cluster name>".
export CTX_HUB_CLUSTER=<your hub cluster context>           # export CTX_HUB_CLUSTER=kind-hub
export CTX_MANAGED_CLUSTER=<your managed cluster context>   # export CTX_MANAGED_CLUSTER=kind-cluster1

# Set the deployment namespace
export HUB_NAMESPACE="open-cluster-management"

# Deploy the policy framework hub controllers
clusteradm install hub-addon --names governance-policy-framework --context ${CTX_HUB_CLUSTER}

Ensure the pods are running on the hub with the following command:

$ kubectl get pods -n ${HUB_NAMESPACE}
NAME                                                       READY   STATUS    RESTARTS   AGE
governance-policy-addon-controller-bc78cbcb4-529c2         1/1     Running   0          94s
governance-policy-propagator-8c77f7f5f-kthvh               1/1     Running   0          94s

See more about the governance-policy-framework components:
- policy-propagator
- policy-addon-controller

Deploy the synchronization components to the managed cluster(s)

Deploy via Clusteradm CLI

To deploy the synchronization components to a self-managed hub cluster:

clusteradm addon enable --names governance-policy-framework --clusters <managed_hub_cluster_name> --annotate addon.open-cluster-management.io/on-multicluster-hub=true --context ${CTX_HUB_CLUSTER}

To deploy the synchronization components to a managed cluster:

clusteradm addon enable --names governance-policy-framework --clusters <cluster_name> --context ${CTX_HUB_CLUSTER}

Verify that the governance-policy-framework-addon controller pod is running on the managed cluster with the following command:

$ kubectl get pods -n open-cluster-management-agent-addon
NAME                                                     READY   STATUS    RESTARTS   AGE
governance-policy-framework-addon-57579b7c-652zj         1/1     Running   0          87s

What is next

Install the policy controllers to the managed clusters.

3.1.2 - Policy API concepts

Overview

The policy framework has the following API concepts:

Policy Templates are the policies that perform a desired check or action on a managed cluster. For example, ConfigurationPolicy objects are embedded in Policy objects under the policy-templates array.
A Policy is a grouping mechanism for Policy Templates and is the smallest deployable unit on the hub cluster. Embedded Policy Templates are distributed to applicable managed clusters and acted upon by the appropriate policy controller.
A PolicySet is a grouping mechanism of Policy objects. Compliance of all grouped Policy objects is summarized in the PolicySet. A PolicySet is a deployable unit and its distribution is controlled by a Placement.
A PlacementBinding binds a Placement to a Policy or PolicySet.

Additional resources:

View the following resources to learn more about the Policy Addon:
- Video: Open Cluster Management - Configuring Your Kubernetes Fleet With the Policy Addon
- Slides: KubeCon NA 2022 - OCM Multicluster App & Config Management

Policy

A Policy is a grouping mechanism for Policy Templates and is the smallest deployable unit on the hub cluster. Embedded Policy Templates are distributed to applicable managed clusters and acted upon by the appropriate policy controller. The compliance state and status of a Policy represent all embedded Policy Templates in the Policy. The distribution of Policy objects is controlled by a Placement.

View a simple example of a Policy that embeds a ConfigurationPolicy policy template to manage a namespace called “prod”.

apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: policy-namespace
  namespace: policies
  annotations:
    policy.open-cluster-management.io/standards: NIST SP 800-53
    policy.open-cluster-management.io/categories: CM Configuration Management
    policy.open-cluster-management.io/controls: CM-2 Baseline Configuration
spec:
  remediationAction: enforce
  disabled: false
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: policy-namespace-example
        spec:
          remediationAction: inform
          severity: low
          object-templates:
            - complianceType: musthave
              objectDefinition:
                kind: Namespace # must have namespace 'prod'
                apiVersion: v1
                metadata:
                  name: prod

The annotations are standard annotations for informational purposes and can be used by user interfaces, custom report scripts, or components that integrate with OCM.

The optional spec.remediationAction field dictates whether the policy controller should inform or enforce when violations are found and overrides the remediationAction field on each policy template. When set to inform, the Policy will become noncompliant if the underlying policy templates detect that the desired state is not met. When set to enforce, the policy controller applies the desired state when necessary and feasible.

The policy-templates array contains an array of Policy Templates. Here a single ConfigurationPolicy called policy-namespace-example defines a Namespace manifest to compare with objects on the cluster. It has the remediationAction set to inform but it is overridden by the optional global spec.remediationAction. The severity is for informational purposes similar to the annotations.

Inside of the embedded ConfigurationPolicy, the object-templates section describes the prod Namespace object that the ConfigurationPolicy applies to. The action that the ConfigurationPolicy will take is determined by the complianceType. In this case, it is set to musthave which means the prod Namespace object will be created if it doesn’t exist. Other compliance types include mustnothave and mustonlyhave. mustnothave would delete the prod Namespace object. mustonlyhave would ensure the prod Namespace object only exists with the fields defined in the ConfigurationPolicy. See the ConfigurationPolicy page for more information or see the templating in configuration policies topic for advanced templating use cases with ConfigurationPolicy.

When the Policy is bound to a Placement using a PlacementBinding, the Policy status will report on each cluster that matches the bound Placement:

status:
  compliant: Compliant
  placement:
    - placement: placement-hub-cluster
      placementBinding: binding-policy-namespace
  status:
    - clustername: local-cluster
      clusternamespace: local-cluster
      compliant: Compliant

To fully explore the Policy API, run the following command:

kubectl get crd policies.policy.open-cluster-management.io -o yaml

To fully explore the ConfigurationPolicy API, run the following command:

kubectl get crd configurationpolicies.policy.open-cluster-management.io -o yaml

PlacementBinding

A PlacementBinding binds a Placement to a Policy or PolicySet.

Below is an example of a PlacementBinding that binds the policy-namespace Policy to the placement-hub-cluster Placement.

apiVersion: policy.open-cluster-management.io/v1
kind: PlacementBinding
metadata:
  name: binding-policy-namespace
  namespace: policies
placementRef:
  apiGroup: cluster.open-cluster-management.io
  kind: Placement
  name: placement-hub-cluster
subjects:
  - apiGroup: policy.open-cluster-management.io
    kind: Policy
    name: policy-namespace

Once the Policy is bound, it will be distributed to and acted upon by the managed clusters that match the Placement.

PolicySet

A PolicySet is a grouping mechanism of Policy objects. Compliance of all grouped Policy objects is summarized in the PolicySet. A PolicySet is a deployable unit and its distribution is controlled by a Placement when bound through a PlacementBinding.

This enables a workflow where subject matter experts write Policy objects and then an IT administrator creates a PolicySet that groups the previously written Policy objects and binds the PolicySet to a Placement that deploys the PolicySet.

An example of a PolicySet is shown below.

apiVersion: policy.open-cluster-management.io/v1beta1
kind: PolicySet
metadata:
  name: ocm-hardening
  namespace: policies
spec:
  description: Apply standard best practices for hardening your Open Cluster Management installation.
  policies:
    - policy-check-backups
    - policy-managedclusteraddon-available
    - policy-subscriptions
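
A PolicySet is bound to a Placement in the same way as a Policy, with the subject kind set to PolicySet. The sketch below assumes the placement-hub-cluster Placement from the earlier PlacementBinding example exists in the policies namespace; the binding name is hypothetical.

```yaml
apiVersion: policy.open-cluster-management.io/v1
kind: PlacementBinding
metadata:
  name: binding-ocm-hardening   # hypothetical name
  namespace: policies
placementRef:
  apiGroup: cluster.open-cluster-management.io
  kind: Placement
  name: placement-hub-cluster
subjects:
  # Bind the whole set instead of listing each Policy individually
  - apiGroup: policy.open-cluster-management.io
    kind: PolicySet
    name: ocm-hardening
```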

Managed cluster policy controllers

The Policy on the hub delivers the policies defined in spec.policy-templates to the managed clusters via the policy framework controllers. Once on the managed cluster, these Policy Templates are acted upon by the associated controller on the managed cluster. The policy framework supports delivering the Policy Template kinds listed here:

Configuration policy

The ConfigurationPolicy is provided by OCM and defines Kubernetes manifests to compare with objects that currently exist on the cluster. The action that the ConfigurationPolicy will take is determined by its complianceType. Compliance types include musthave, mustnothave, and mustonlyhave. musthave means the object should have the listed keys and values as a subset of the larger object. mustnothave means an object matching the listed keys and values should not exist. mustonlyhave ensures objects only exist with the keys and values exactly as defined. See the page on Configuration Policy for more information.
Open Policy Agent Gatekeeper

Gatekeeper is a validating webhook with auditing capabilities that can enforce custom resource definition-based policies that are run with the Open Policy Agent (OPA). Gatekeeper ConstraintTemplates and constraints can be provided in an OCM Policy to sync to managed clusters that have Gatekeeper installed on them. See the page on Gatekeeper integration for more information.

Templating in configuration policies

Configuration policies support the inclusion of Golang text templates in the object definitions. These templates are resolved at runtime either on the hub cluster or the target managed cluster using configurations related to that cluster. This gives you the ability to define configuration policies with dynamic content and to inform or enforce Kubernetes resources that are customized to the target cluster.

The template syntax must follow the Golang template language specification, and the resource definition generated from the resolved template must be valid YAML. (See the Golang documentation for the text/template package for more information.) Any errors in template validation appear as policy violations. When you use a custom template function, the values are replaced at runtime.

Template functions, such as resource-specific and generic lookup template functions, are available for referencing Kubernetes resources on the hub cluster (using the {{hub ... hub}} delimiters), or managed cluster (using the {{ ... }} delimiters). See the Hub cluster templates section for more details. The resource-specific functions are provided for convenience and make the content of the resources more accessible. The generic lookup function is more advanced, so it is best to be familiar with the YAML structure of the resource being looked up before using it. In addition to these functions, utility functions like base64enc, base64dec, indent, autoindent, toInt, and toBool are also available.

To conform templates with YAML syntax, templates must be set in the policy resource as strings using quotes or a block character (| or >). This causes the resolved template value to also be a string. To override this, consider using toInt or toBool as the final function in the template to initiate further processing that forces the value to be interpreted as an integer or boolean respectively.
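
As an illustration of the quoting rule, the object-templates fragment below sets a Deployment's replica count from a ConfigMap on the managed cluster. The Deployment sample-app and the ConfigMap app-config with a replicas key are hypothetical names used only for this sketch.

```yaml
object-templates:
  - complianceType: musthave
    objectDefinition:
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: sample-app          # hypothetical Deployment
        namespace: default
      spec:
        # Quoted so the policy itself is valid YAML. Ending the pipeline with
        # toInt makes the resolved value an integer instead of a string.
        replicas: '{{ fromConfigMap "default" "app-config" "replicas" | toInt }}'
```

Without toInt, the resolved value would stay a quoted string, which does not match the integer type that the replicas field expects.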

To bypass template processing you can either:

Override a single template by wrapping the template in additional braces. For example, the template {{ template content }} would become {{ '{{ template content }}' }}.
Override all templates in a ConfigurationPolicy by adding the policy.open-cluster-management.io/disable-templates: "true" annotation in the ConfigurationPolicy section of your Policy. Template processing will be bypassed for that ConfigurationPolicy.
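
The second option can be sketched as follows. The policy and ConfigMap names are hypothetical; the point is that the annotation sits on the embedded ConfigurationPolicy metadata, so the braces in the data value are delivered literally.

```yaml
policy-templates:
  - objectDefinition:
      apiVersion: policy.open-cluster-management.io/v1
      kind: ConfigurationPolicy
      metadata:
        name: policy-raw-templates   # hypothetical name
        annotations:
          # Disables template processing for this ConfigurationPolicy only
          policy.open-cluster-management.io/disable-templates: "true"
      spec:
        remediationAction: inform
        severity: low
        object-templates:
          - complianceType: musthave
            objectDefinition:
              apiVersion: v1
              kind: ConfigMap
              metadata:
                name: raw-template-holder   # hypothetical name
                namespace: default
              data:
                # Kept as-is because template processing is bypassed
                greeting: 'Hello {{ .Name }}'
```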

Hub cluster templating in configuration policies

Hub cluster templates are used to define configuration policies that are dynamically customized to the target cluster. This reduces the need to create separate policies for each target cluster or hardcode configuration values in the policy definitions.

Hub cluster templates are based on Golang text template specifications, and the {{hub … hub}} delimiter indicates a hub cluster template in a configuration policy.

A configuration policy definition can contain both hub cluster and managed cluster templates. Hub cluster templates are processed first on the hub cluster, then the policy definition with resolved hub cluster templates is propagated to the target clusters. On the managed cluster, the Configuration Policy controller processes any managed cluster templates in the policy definition and then enforces or verifies the fully resolved object definition.
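
The following object-templates fragment sketches how the two kinds of templates can appear side by side. It assumes a hypothetical ConfigMap named site-config with a vlan key in the policy's own namespace (policies) on the hub; the target ConfigMap name is also illustrative.

```yaml
object-templates:
  - complianceType: musthave
    objectDefinition:
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: site-settings       # hypothetical target object
        namespace: default
      data:
        # Hub cluster template: resolved on the hub before propagation
        vlan: '{{hub fromConfigMap "policies" "site-config" "vlan" hub}}'
        # Managed cluster template: resolved by the Configuration Policy
        # controller on the managed cluster
        platform: '{{ fromClusterClaim "platform.open-cluster-management.io" }}'
```

By the time the policy reaches the managed cluster, the vlan value is already a plain string and only the platform template remains to be resolved.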

In OCM versions 0.9.x and older, policies are processed on the hub cluster only upon creation or after an update. Therefore, hub cluster templates are only resolved to the data in the referenced resources upon policy creation or update. Any changes to the referenced resources are not automatically synced to the policies.

A special annotation, policy.open-cluster-management.io/trigger-update, can be used to indicate changes to the data referenced by the templates. Any change to the annotation value initiates template processing: the latest contents of the referenced resource are read and updated in the policy definition that is propagated for processing on managed clusters. A typical way to use this annotation is to increment the value by one each time.
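
For instance, on the policy-namespace Policy shown earlier, the annotation could be set as in the fragment below (the spec is unchanged and omitted here). The value "2" is arbitrary; what matters is that it differs from the previous value.

```yaml
apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: policy-namespace
  namespace: policies
  annotations:
    # Bump this value (for example from "1" to "2") whenever the resources
    # referenced by hub cluster templates change
    policy.open-cluster-management.io/trigger-update: "2"
```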

Templating value encryption

Values encrypted with the protect template function use AES-CBC with 256-bit keys. Each encryption key is unique per managed cluster and is automatically rotated every 30 days. Because the value stays encrypted until the policy is evaluated, the decrypted value is never stored in the policy on the managed cluster.

To force an immediate encryption key rotation, delete the policy.open-cluster-management.io/last-rotated annotation on the policy-encryption-key Secret in the managed cluster namespace on the hub cluster. Policies are then reprocessed to use the new encryption key.

Templating functions

| Function | Description | Sample |
| --- | --- | --- |
| `fromSecret` | Returns the value of the given data key in the secret. | `PASSWORD: '{{ fromSecret "default" "localsecret" "PASSWORD" }}'` |
| `fromConfigMap` | Returns the value of the given data key in the ConfigMap. | `log-file: '{{ fromConfigMap "default" "logs-config" "log-file" }}'` |
| `fromClusterClaim` | Returns the value of `spec.value` in the `ClusterClaim` resource. | `platform: '{{ fromClusterClaim "platform.open-cluster-management.io" }}'` |
| `lookup` | Returns the Kubernetes resource as a JSON compatible map. Note that if the requested resource does not exist, an empty map is returned. | `metrics-url: \|` `http://{{ (lookup "v1" "Service" "default" "metrics").spec.clusterIP }}:8080` |
| `base64enc` | Returns a `base64` encoded value of the input string. | `USER_NAME: '{{ fromConfigMap "default" "myconfigmap" "admin-user" \| base64enc }}'` |
| `base64dec` | Returns a `base64` decoded value of the input string. | `app-name: \|` `"{{ ( ( lookup "v1" "Secret" "testns" "mytestsecret" ).data.appname ) \| base64dec }}"` |
| `indent` | Returns the input string indented by the given number of spaces. | `Ca-cert: \|` `{{ ( index ( lookup "v1" "Secret" "default" "mycert-tls" ).data "ca.pem" ) \| base64dec \| indent 4 }}` |
| `autoindent` | Acts like the `indent` function but automatically determines the number of leading spaces needed based on the number of spaces before the template. | `Ca-cert: \|` `{{ ( index ( lookup "v1" "Secret" "default" "mycert-tls" ).data "ca.pem" ) \| base64dec \| autoindent }}` |
| `toInt` | Returns the integer value of the string and ensures that the value is interpreted as an integer in the YAML. | `vlanid: \|` `{{ (fromConfigMap "site-config" "site1" "vlan") \| toInt }}` |
| `toBool` | Returns the boolean value of the input string and ensures that the value is interpreted as a boolean in the YAML. | `enabled: \|` `{{ (fromConfigMap "site-config" "site1" "enabled") \| toBool }}` |
| `protect` | Encrypts the input string. It is decrypted when the policy is evaluated. On the replicated policy in the managed cluster namespace, the resulting value resembles the following: `$ocm_encrypted:<encrypted-value>` | `enabled: \|` `{{hub (lookup "route.openshift.io/v1" "Route" "openshift-authentication" "oauth-openshift").spec.host \| protect hub}}` |

Additionally, OCM supports the following template functions that are included from the sprig open source project:

cat
contains
default
empty
fromJson
hasPrefix
hasSuffix
join
list
lower
mustFromJson
quote
replace
semver
semverCompare
split
splitn
ternary
trim
until
untilStep
upper

See the Sprig documentation for more details.

3.1.3 - Configuration Policy

The ConfigurationPolicy defines Kubernetes manifests to compare with objects that currently exist on the cluster. The Configuration policy controller is provided by Open Cluster Management and runs on managed clusters.

View the Policy API concepts page to learn more about the ConfigurationPolicy API.

Prerequisites

You must meet the following prerequisites to install the configuration policy controller:

Ensure kubectl and kustomize are installed.
Ensure Golang is installed, if you are planning to install from the source.
Ensure the open-cluster-management policy framework is installed. See Policy Framework for more information.

Installing the configuration policy controller

Deploy via Clusteradm CLI

Ensure clusteradm CLI is installed and is newer than v0.3.0. Download and extract the clusteradm binary. For more details see the clusteradm GitHub page.

Deploy the configuration policy controller to the managed clusters (this command is the same for a self-managed hub):

# Deploy the configuration policy controller
clusteradm addon enable --names config-policy-controller --clusters <cluster_name> --context ${CTX_HUB_CLUSTER}

Ensure the pod is running on the managed cluster with the following command:

$ kubectl get pods -n open-cluster-management-agent-addon
NAME                                               READY   STATUS    RESTARTS   AGE
config-policy-controller-7f8fb64d8c-pmfx4          1/1     Running   0          44s

Sample configuration policy

After a successful deployment, test the policy framework and configuration policy controller with a sample policy.

For more information on how to use a ConfigurationPolicy, read the Policy API concept section.

Run the following command to create a policy on the hub that uses Placement:

# Configure kubectl to point to the hub cluster
kubectl config use-context ${CTX_HUB_CLUSTER}

# Apply the example policy and placement
kubectl apply -n default -f https://raw.githubusercontent.com/open-cluster-management-io/policy-collection/main/community/CM-Configuration-Management/policy-pod-placement.yaml

Update the Placement to distribute the policy to the managed cluster with the following command (this clusterSelector will deploy the policy to all managed clusters):

kubectl patch -n default placement.cluster.open-cluster-management.io/placement-policy-pod --type=merge -p "{\"spec\":{\"predicates\":[{\"requiredClusterSelector\":{\"labelSelector\":{\"matchExpressions\":[]}}}]}}"

Make sure the default namespace has a ManagedClusterSetBinding for a ManagedClusterSet with at least one managed cluster resource in the ManagedClusterSet. See Bind ManagedClusterSet to a namespace for more information on this.
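
A minimal sketch of such a binding is shown below. It assumes a ManagedClusterSet named default exists and contains your managed cluster; substitute your own cluster set name. The binding's metadata.name must match spec.clusterSet, and older OCM releases may serve this resource under an earlier API version.

```yaml
# Binds the (assumed) "default" ManagedClusterSet to the "default" namespace
# so that Placements in that namespace can select its clusters.
apiVersion: cluster.open-cluster-management.io/v1beta2
kind: ManagedClusterSetBinding
metadata:
  name: default        # must be the same as spec.clusterSet
  namespace: default
spec:
  clusterSet: default
```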

To confirm that the managed cluster is selected by the Placement, run the following command:

$ kubectl get -n default placementdecision.cluster.open-cluster-management.io/placement-policy-pod-decision-1 -o yaml
...
status:
  decisions:
  - clusterName: <managed cluster name>
    reason: ""
...

Enforce the policy to make the configuration policy automatically correct any misconfigurations on the managed cluster:

$ kubectl patch -n default policy.policy.open-cluster-management.io/policy-pod --type=merge -p "{\"spec\":{\"remediationAction\": \"enforce\"}}"
policy.policy.open-cluster-management.io/policy-pod patched

After a few seconds, your policy is propagated to the managed cluster. To confirm, run the following command:

$ kubectl config use-context ${CTX_MANAGED_CLUSTER}
$ kubectl get policy -A
NAMESPACE   NAME                 REMEDIATION ACTION   COMPLIANCE STATE   AGE
cluster1    default.policy-pod   enforce              Compliant          4m32s

The missing pod is created by the policy on the managed cluster. To confirm, run the following command on the managed cluster:

$ kubectl get pod -n default
NAME               READY   STATUS    RESTARTS   AGE
sample-nginx-pod   1/1     Running   0          23s

3.1.4 - Open Policy Agent Gatekeeper

Gatekeeper is a validating webhook with auditing capabilities that can enforce custom resource definition-based policies that are run with the Open Policy Agent (OPA). Gatekeeper constraints can be used to evaluate Kubernetes resource compliance. You can leverage OPA as the policy engine, and use Rego as the policy language.

Installing Gatekeeper

See the Gatekeeper documentation to install the desired version of Gatekeeper to the managed cluster.

Sample Gatekeeper policy

Gatekeeper policies are written using constraint templates and constraints. View the following YAML examples that use Gatekeeper constraints in an OCM Policy:

ConstraintTemplates and constraints: Use the Gatekeeper integration feature by using OCM policies for multicluster distribution of Gatekeeper constraints and Gatekeeper audit results aggregation on the hub cluster. The following example defines a Gatekeeper ConstraintTemplate and constraint (K8sRequiredLabels) to ensure the “gatekeeper” label is set on all namespaces:

apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: require-gatekeeper-labels-on-ns
spec:
  remediationAction: inform # (1)
  disabled: false
  policy-templates:
    - objectDefinition:
        apiVersion: templates.gatekeeper.sh/v1beta1
        kind: ConstraintTemplate
        metadata:
          name: k8srequiredlabels
        spec:
          crd:
            spec:
              names:
                kind: K8sRequiredLabels
              validation:
                openAPIV3Schema:
                  properties:
                    labels:
                      type: array
                      items:
                        type: string
          targets:
            - target: admission.k8s.gatekeeper.sh
              rego: |
                package k8srequiredlabels
                violation[{"msg": msg, "details": {"missing_labels": missing}}] {
                  provided := {label | input.review.object.metadata.labels[label]}
                  required := {label | label := input.parameters.labels[_]}
                  missing := required - provided
                  count(missing) > 0
                  msg := sprintf("you must provide labels: %v", [missing])
                }
    - objectDefinition:
        apiVersion: constraints.gatekeeper.sh/v1beta1
        kind: K8sRequiredLabels
        metadata:
          name: ns-must-have-gk
        spec:
          enforcementAction: dryrun
          match:
            kinds:
              - apiGroups: [""]
                kinds: ["Namespace"]
          parameters:
            labels: ["gatekeeper"]

Since the remediationAction is set to “inform”, the enforcementAction field of the Gatekeeper constraint is overridden to “warn”. This means that Gatekeeper detects and warns you about creating or updating a namespace that is missing the “gatekeeper” label. If the policy remediationAction is set to “enforce”, the Gatekeeper constraint enforcementAction field is overridden to “deny”. In this context, this configuration prevents any user from creating or updating a namespace that is missing the gatekeeper label.

With the previous policy, you might receive the following policy status message:

warn - you must provide labels: {"gatekeeper"} (on Namespace default); warn - you must provide labels: {"gatekeeper"} (on Namespace gatekeeper-system).

Once a policy containing Gatekeeper constraints or ConstraintTemplates is deleted, the constraints and ConstraintTemplates are also deleted from the managed cluster.

Notes:

The Gatekeeper audit functionality runs every minute by default. Audit results are sent back to the hub cluster to be viewed in the OCM policy status of the managed cluster.

Auditing Gatekeeper events: The following example uses an OCM configuration policy within an OCM policy to check for Kubernetes API requests denied by the Gatekeeper admission webhook:

apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: policy-gatekeeper-admission
spec:
  remediationAction: inform
  disabled: false
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: policy-gatekeeper-admission
        spec:
          remediationAction: inform # will be overridden by remediationAction in parent policy
          severity: low
          object-templates:
            - complianceType: mustnothave
              objectDefinition:
                apiVersion: v1
                kind: Event
                metadata:
                  namespace: gatekeeper-system # set it to the actual namespace where gatekeeper is running if different
                  annotations:
                    constraint_action: deny
                    constraint_kind: K8sRequiredLabels
                    constraint_name: ns-must-have-gk
                    event_type: violation

3.2 - Application lifecycle management

After setting up the Open Cluster Management (OCM) hub and managed clusters, you can install the OCM built-in application management add-on. The add-on leverages Argo CD to provide declarative, GitOps-based application lifecycle management across multiple Kubernetes clusters.

Architecture

Traditional Argo CD resource delivery primarily uses a push model, where resources are deployed from a centralized Argo CD instance to remote or managed clusters.

With the OCM Argo CD add-on, users can leverage a pull-based resource delivery model, where managed clusters pull and apply application configurations.

For more details, visit the Argo CD Pull Integration GitHub page.

Prerequisites

You must meet the following prerequisites to install the application lifecycle management add-on:

Ensure kubectl is installed.
Ensure the OCM cluster manager is installed. See Start the control plane for more information.
Ensure the OCM klusterlet is installed. See Register a cluster for more information.
Ensure clusteradm CLI tool is installed. Download and extract the clusteradm binary. For more details see the clusteradm GitHub page.

Installation

Install Argo CD on the Hub cluster:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

See Argo CD website for more details.

Install the OCM Argo CD add-on on the Hub cluster:

clusteradm install hub-addon --names argocd

If your hub controller starts successfully, you should see:

$ kubectl -n argocd get deploy argocd-pull-integration
NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
argocd-pull-integration   1/1     1            1           55s

Enable the add-on for your choice of Managed clusters:

clusteradm addon enable --names argocd --clusters cluster1,cluster2

Replace cluster1 and cluster2 with your Managed cluster names.

If your add-on starts successfully, you should see:

$ kubectl -n cluster1 get managedclusteraddon argocd
NAME     AVAILABLE   DEGRADED   PROGRESSING
argocd   True                   False
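Instead of polling the output by hand, you can block until the add-on reports the Available condition. This is a minimal sketch that wraps kubectl wait; the function name wait_addon_available and the default timeout of 120s are assumptions chosen for illustration.

```shell
# Hypothetical helper: block until a ManagedClusterAddOn reports Available=True.
# Arguments: <managed cluster name> <add-on name> [timeout, default 120s]
# The managed cluster name doubles as the namespace on the hub.
wait_addon_available() {
  local cluster="$1" addon="$2" timeout="${3:-120s}"
  kubectl -n "$cluster" wait "managedclusteraddon/$addon" \
    --for=condition=Available --timeout="$timeout"
}

# Example (run against the hub cluster):
#   wait_addon_available cluster1 argocd
```

The command exits with a non-zero status if the condition is not met within the timeout, which makes it convenient in scripts.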

On the Hub cluster, apply the example guestbook-app-set manifest:

kubectl apply -f https://raw.githubusercontent.com/open-cluster-management-io/ocm/refs/heads/main/solutions/deploy-argocd-apps-pull/example/guestbook-app-set.yaml

Note: The Application template inside the ApplicationSet must contain the following content:

labels:
  apps.open-cluster-management.io/pull-to-ocm-managed-cluster: 'true'
annotations:
  argocd.argoproj.io/skip-reconcile: 'true'
  apps.open-cluster-management.io/ocm-managed-cluster: '{{name}}'

The label allows the pull model controller to select the Application for processing.

The skip-reconcile annotation prevents the Application from being reconciled on the Hub cluster.

The ocm-managed-cluster annotation, filled in from the cluster generator's {{name}} parameter, lets the ApplicationSet generate one Application for each cluster that the generator targets.
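To see which Applications the pull model controller will pick up, you can filter by the label from the note above. This is a small sketch; the function name list_pull_model_apps is hypothetical, and it assumes Argo CD is installed in the argocd namespace as in the installation steps.

```shell
# Hypothetical helper: list the Argo CD Applications on the hub that carry the
# pull-model label, i.e. the ones the pull model controller selects.
# The fully qualified resource name avoids clashes with other "application" CRDs.
list_pull_model_apps() {
  kubectl -n argocd get applications.argoproj.io \
    -l 'apps.open-cluster-management.io/pull-to-ocm-managed-cluster=true'
}

# Example (run against the hub cluster):
#   list_pull_model_apps
```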

When this guestbook ApplicationSet reconciles, it will generate an Application for each registered Managed cluster. For example:

$ kubectl -n argocd get appset
NAME            AGE
guestbook-app   84s
$ kubectl -n argocd get app
NAME                     SYNC STATUS   HEALTH STATUS
cluster1-guestbook-app
cluster2-guestbook-app

On the Hub cluster, the pull controller will wrap the Application with a ManifestWork. For example:

$ kubectl -n cluster1 get manifestwork
NAME                          AGE
cluster1-guestbook-app-d0e5   2m41s

On a Managed cluster, you should see that the Application is pulled down successfully. For example:

$ kubectl -n argocd get app
NAME                     SYNC STATUS   HEALTH STATUS
cluster1-guestbook-app   Synced        Healthy
$ kubectl -n guestbook get deploy
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
guestbook-ui   1/1     1            1           7m36s

On the Hub cluster, the status controller will sync the dormant Application with the ManifestWork status feedback. For example:

$ kubectl -n argocd get app
NAME                     SYNC STATUS   HEALTH STATUS
cluster1-guestbook-app   Synced        Healthy
cluster2-guestbook-app   Synced        Healthy

3.3 - Cluster proxy

Cluster proxy is an OCM addon that provides L4 network connectivity from the hub cluster to the managed clusters without any additional requirements on the managed cluster’s network infrastructure. It does so by leveraging apiserver-network-proxy, an official Kubernetes SIG sub-project.

Background

The original architecture of OCM allows a cluster from anywhere to be registered and managed by OCM’s control plane (i.e. the hub cluster) as long as a klusterlet agent can reach the hub cluster’s endpoint. So the minimal requirement for the managed cluster’s network infrastructure in OCM is “klusterlet -> hub” connectivity. However, there are still cases where components in the hub cluster need to proactively dial or request services in the managed clusters, which requires “hub -> klusterlet” connectivity. These cases become even more complex when the managed clusters are not all in the same network.

Cluster proxy aims to seamlessly deliver outbound L4 requests to services in the managed cluster’s network without any assumptions about the infrastructure, as long as the clusters are successfully registered. The connectivity provided by cluster proxy works over the secured reverse proxy tunnels established by apiserver-network-proxy.

About apiserver-network-proxy

Apiserver-network-proxy is the underlying technology of the Kubernetes feature called konnectivity egress-selector, which is mainly used to set up a TCP-level proxy so that kube-apiserver can access the node/cluster network. Here are a few terms to clarify before we elaborate on how cluster proxy resolves multi-cluster control plane network connectivity:

Proxy Tunnel: A long-lived gRPC connection that multiplexes and transmits TCP-level traffic from the proxy servers to the proxy agents. Note that there is only one tunnel instance between each pair of server and agent.
Proxy Server: An mTLS gRPC server that accepts tunnel connections and is the traffic ingress of the proxy tunnel.
Proxy Agent: An mTLS gRPC agent that maintains the tunnel to the server and is the egress of the proxy tunnel.
Konnectivity Client: The SDK library for talking through the tunnel. It is applicable to any Golang client whose Dialer can be overridden. Note that for non-Golang clients, the proxy server also supports HTTP-Connect based proxying as an alternative.

Architecture

Cluster proxy runs inside OCM’s hub cluster as an addon manager which is developed based on the Addon-Framework. The addon manager of cluster proxy will be responsible for:

Managing the installation of proxy servers in the hub cluster.
Managing the installation of proxy agents in the managed cluster.
Collecting health and other stats consistently in the hub cluster.

The following picture shows the overall architecture of cluster proxy:

Note that the green lines in the picture above are the active proxy tunnels between proxy servers and agents, and HA setup is natively supported by apiserver-network-proxy for both the servers and the agents. The orange dashed line started by the konnectivity client shows how traffic flows from the hub cluster to arbitrary managed clusters. Meanwhile, the core components, including registration and work, manage the lifecycle of all the components distributed across the managed clusters, so the hub admin no longer needs to operate the managed clusters directly to install or configure the proxy agents.

Prerequisite

You must meet the following prerequisites to install the cluster-proxy:

Ensure your open-cluster-management release is greater than v0.5.0.
Ensure kubectl is installed.
Ensure helm is installed.

Installation

To install the cluster proxy addon to the OCM control plane, run:

$ helm repo add ocm https://open-cluster-management.io/helm-charts
$ helm repo update
$ helm search repo ocm
NAME                              	CHART VERSION	APP VERSION	DESCRIPTION
ocm/cluster-proxy                 	v0.1.1       	1.0.0      	A Helm chart for Cluster-Proxy
...

Then run the following helm command to install the cluster-proxy addon:

$ helm install -n open-cluster-management-addon --create-namespace \
    cluster-proxy ocm/cluster-proxy

Note: If you’re using a non-Kind cluster, for example, an OpenShift cluster, you need to configure the ManagedProxyConfiguration by setting proxyServer.entrypointAddress in the values.yaml to the address of the proxy server.

To do this at install time, you can run the following command:

$ helm install -n open-cluster-management-addon --create-namespace \
    cluster-proxy ocm/cluster-proxy \
    --set "proxyServer.entrypointAddress=<address of the proxy server>"

After the installation, you can check the deployment status of the cluster-proxy addon by running the following command:

$ kubectl -n open-cluster-management-addon get deploy
NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
cluster-proxy                          3/3     3            3           24h
cluster-proxy-addon-manager            1/1     1            1           24h
...

The addon manager of cluster-proxy is created in the hub cluster as a deployment named cluster-proxy-addon-manager. As shown above, the proxy servers are also created as a deployment named cluster-proxy.

By default, the addon manager automatically discovers the addition or removal of managed clusters and installs the proxy agents into them on the fly. To check the health status of the proxy agents, run:

$ kubectl get managedclusteraddon -A
NAMESPACE     NAME                     AVAILABLE   DEGRADED   PROGRESSING
<cluster#1>   cluster-proxy            True
<cluster#2>   cluster-proxy            True

The proxy agent distributed in the managed cluster periodically renews the lease lock of the addon instance.

Usage

Command-line tools

Use clusteradm to check the status of the cluster-proxy addon:

$ clusteradm proxy health
CLUSTER NAME    INSTALLED    AVAILABLE    PROBED HEALTH    LATENCY
<cluster#1>     True         True         True             67.595144ms
<cluster#2>     True         True         True             85.418368ms

Example code

An example client in the cluster proxy repo shows us how to dynamically talk to the kube-apiserver of a managed cluster from the hub cluster by simply prescribing the name of the target cluster. Here’s also a TL;DR code snippet:

// 1. instantiate a dialing tunnel instance.
// NOTE: recommended to be a singleton in your golang program.
tunnel, err := konnectivity.CreateSingleUseGrpcTunnel(
    context.TODO(),
    <your proxy server endpoint>,
    grpc.WithTransportCredentials(grpccredentials.NewTLS(<your proxy server TLS config>)),
)
if err != nil {
    panic(err)
}
...
// 2. Overriding the Dialer to tunnel. Dialer is a common abstraction
// in Golang SDK.
cfg.Dial = tunnel.DialContext

Another example is cluster-gateway, an aggregated apiserver that can optionally work over cluster-proxy to route traffic to the managed clusters dynamically over HTTPS.

Note that by default the client credentials for the konnectivity client are persisted as Secret resources in the namespace where the addon-manager is running. To mount a secret into systems running in other namespaces, users are expected to copy the secret manually.
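One way to copy a Secret into another namespace is to export it, strip the namespace-specific metadata, and re-apply it. The following is a minimal sketch under stated assumptions: the function name copy_secret is hypothetical, the Secret name and namespaces are placeholders you must supply, and the sed filter only removes the top-level metadata fields namespace, uid, resourceVersion, and creationTimestamp. If the source Secret carries ownerReferences, remove them as well before applying.

```shell
# Hypothetical helper: copy a Secret from one namespace to another.
# Arguments: <secret name> <source namespace> <destination namespace>
copy_secret() {
  local name="$1" src_ns="$2" dst_ns="$3"
  # Export the Secret, drop fields that are specific to the source object,
  # then apply the result in the destination namespace.
  kubectl -n "$src_ns" get secret "$name" -o yaml \
    | sed -E '/^  (namespace|uid|resourceVersion|creationTimestamp):/d' \
    | kubectl -n "$dst_ns" apply -f -
}

# Example (names are placeholders):
#   copy_secret <secret name> open-cluster-management-addon <your namespace>
```

The copy is a one-time snapshot; if the credentials are rotated in the source namespace, the copy has to be refreshed.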

More insights

Troubleshooting

The installation of proxy servers and agents is prescribed by the custom resource called “managedproxyconfiguration”. We can check it with the following command:

$ kubectl get managedproxyconfiguration cluster-proxy -o yaml
apiVersion: proxy.open-cluster-management.io/v1alpha1
kind: ManagedProxyConfiguration
metadata: ...
spec:
  proxyAgent:
    image: <expected image of the proxy agents>
    replicas: <expected replicas of proxy agents>
  proxyServer:
    entrypoint:
      loadBalancerService:
        name: proxy-agent-entrypoint
      type: LoadBalancerService # Or "Hostname" to set a fixed address
                                # for establishing proxy tunnels.
    image: <expected image of the proxy servers>
    inClusterServiceName: proxy-entrypoint
    namespace: <target namespace to install proxy server>
    replicas: <expected replicas of proxy servers>
  authentication: # Customize authentication between proxy server/agent
status:
  conditions: ...

See the original design proposal for reference.

3.4 - Managed service account

Managed Service Account is an OCM addon that enables a hub cluster admin to manage service accounts across multiple clusters with ease. By controlling the creation and removal of the service account, the addon agent projects and rotates the corresponding token back to the hub cluster, which is very useful for Kube API clients on the hub cluster that send requests to the managed clusters.

Background

Normally there are two major approaches for a Kube API client to authenticate and access a Kubernetes cluster:

Valid X.509 certificate-key pair
Service account bearer token

The service account token will be automatically persisted as a secret resource inside the hosting Kubernetes clusters upon creation, which is commonly used for the “in-cluster” client. However, in terms of OCM, the hub cluster is completely an external system to the managed clusters, so we will need a local agent in each managed cluster to reflect the tokens consistently to the hub cluster so that the Kube API client from hub cluster can “push” the requests directly against the managed cluster. By delegating the multi-cluster service account management to this addon, we can:

Project the service account token from the managed clusters to the hub cluster with custom API audience.
Rotate the service account tokens dynamically.
Homogenize the client identities so that we can easily write a static RBAC policy that applies to multiple managed clusters.

Prerequisite

You must meet the following prerequisites to install the managed service account:

Ensure your open-cluster-management release is greater than v0.5.0.
Ensure kubectl is installed.
Ensure helm is installed.

Installation

To install the managed service account addon to the OCM control plane, run:

$ helm repo add ocm https://open-cluster-management.io/helm-charts
$ helm repo update
$ helm search repo ocm
NAME                              	CHART VERSION	APP VERSION	DESCRIPTION
ocm/managed-serviceaccount          <...>           1.0.0       A Helm chart for Managed ServiceAccount Addon
...

Then run the following helm command to continue the installation:

$ helm install -n open-cluster-management-addon --create-namespace \
    managed-serviceaccount  ocm/managed-serviceaccount
$ kubectl -n open-cluster-management-addon get pod
NAME                                                    READY   STATUS    RESTARTS   AGE
managed-serviceaccount-addon-manager-5m9c95b7d8-xsb94   1/1     Running   1          4d4h
...

By default, the addon manager automatically discovers the addition or removal of managed clusters and installs the managed serviceaccount agents into them on the fly. To check the health status of the managed serviceaccount agents, run:

$ kubectl get managedclusteraddon -A
NAMESPACE         NAME                     AVAILABLE   DEGRADED   PROGRESSING
<cluster name>    managed-serviceaccount   True

Usage

To exercise the new ManagedServiceAccount API introduced by this addon, we can start by applying the following sample resource:

$ export CLUSTER_NAME=<cluster name>
$ kubectl create -f - <<EOF
apiVersion: authentication.open-cluster-management.io/v1alpha1
kind: ManagedServiceAccount
metadata:
  name: my-sample
  namespace: ${CLUSTER_NAME}
spec:
  rotation: {}
EOF

Then the addon agent in each of the managed clusters is responsible for executing and refreshing the status of the ManagedServiceAccount, e.g.:

$ kubectl describe ManagedServiceAccount -n cluster1
...
  status:
    conditions:
    - lastTransitionTime: "2021-12-09T09:08:15Z"
      message: ""
      reason: TokenReported
      status: "True"
      type: TokenReported
    - lastTransitionTime: "2021-12-09T09:08:15Z"
      message: ""
      reason: SecretCreated
      status: "True"
      type: SecretCreated
    expirationTimestamp: "2022-12-04T09:08:15Z"
    tokenSecretRef:
      lastRefreshTimestamp: "2021-12-09T09:08:15Z"
      name: my-sample

The service account will be created in the managed cluster (assume the name is cluster1):

$ kubectl get sa my-sample -n open-cluster-management-managed-serviceaccount --context kind-cluster1
NAME        SECRETS   AGE
my-sample   1         9m57s

The corresponding secret will also be created in the hub cluster, which is visible via:

$ kubectl -n <your cluster> get secret my-sample
NAME        TYPE     DATA   AGE
my-sample   Opaque   2      2m23s
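To use the projected token from a script, you can read it from the Secret on the hub and decode it. This is a sketch that assumes the token is stored under a key named token in the Secret (check with kubectl describe secret if in doubt); the function name msa_token is hypothetical.

```shell
# Hypothetical helper: print the decoded service account token projected to the hub.
# Arguments: <managed cluster name (hub namespace)> <ManagedServiceAccount name>
# Assumption: the Secret stores the token under the "token" data key.
msa_token() {
  local cluster_ns="$1" name="$2"
  kubectl -n "$cluster_ns" get secret "$name" -o jsonpath='{.data.token}' | base64 -d
}

# Example (run against the hub cluster):
#   TOKEN=$(msa_token cluster1 my-sample)
```

Because the addon rotates the token, read it from the Secret each time rather than caching it for long periods.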

Repo: https://github.com/open-cluster-management-io/managed-serviceaccount

See the design proposal at: https://github.com/open-cluster-management-io/enhancements/tree/main/enhancements/sig-architecture/19-projected-serviceaccount-token

3.5 - Multicluster Control Plane

What is `Multicluster Control Plane`

The multicluster control plane is a lightweight Open Cluster Management (OCM) control plane that is easy to install and has a small footprint. It can run anywhere, with or without a Kubernetes environment, to serve the OCM control plane capabilities.

Why use `Multicluster Control Plane`

Some Kubernetes environments do not have CSR support (e.g., EKS), so the standard OCM control plane cannot be installed. The multicluster control plane can be installed in these environments and expose the OCM control plane API via a load balancer.
Some users may want to run multiple OCM control planes to isolate the data. The typical case is that the user wants to run one OCM control plane for production and another OCM control plane for development. The multicluster control plane can be installed in different namespaces in a single cluster. Each multicluster control plane runs independently and serves the OCM control plane capabilities.
Some users may want to run the OCM control plane without a Kubernetes environment. The multicluster control plane can run in standalone mode, for example in a VM, and expose the control plane API externally so that managed clusters can register to it.

How to use `Multicluster Control Plane`

Start the standalone multicluster control plane

You need to build multicluster-controlplane on your local host. Follow the steps below to build the binary and start the multicluster control plane.

git clone https://github.com/open-cluster-management-io/multicluster-controlplane.git
cd multicluster-controlplane
make run

Once the control plane is running, you can access the control plane by using kubectl --kubeconfig=./_output/controlplane/.ocm/cert/kube-aggregator.kubeconfig.

You can customize the control plane configurations by creating a config file and using the environment variable CONFIG_DIR to specify your config file directory. Please check the repository documentation for details.

Install via clusteradm

Install clusteradm CLI tool

It’s recommended to run the following command to download and install the latest release of the clusteradm command-line tool:

curl -L https://raw.githubusercontent.com/open-cluster-management-io/clusteradm/main/install.sh | bash

Install multicluster control plane

You can use clusteradm init to deploy the multicluster control plane in your Kubernetes environment.

Set the environment variable KUBECONFIG to your cluster kubeconfig path. For instance, create a new KinD cluster and deploy multicluster control plane in it.

export KUBECONFIG=/tmp/kind-controlplane.kubeconfig
kind create cluster --name multicluster-controlplane
export mc_cp_node_ip=$(kubectl get nodes -o=jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')

Run the following command to deploy a control plane:

clusteradm init --singleton=true --set route.enabled=false --set nodeport.enabled=true --set nodeport.port=30443 --set apiserver.externalHostname=$mc_cp_node_ip --set apiserver.externalPort=30443 --singleton-name multicluster-controlplane

Refer to the repository documentation for how to customize the control plane configurations.

Get the control plane kubeconfig by running the following command:

kubectl -n multicluster-controlplane get secrets multicluster-controlplane-kubeconfig -ojsonpath='{.data.kubeconfig}' | base64 -d > /tmp/multicluster-controlplane.kubeconfig

Join a cluster to the multicluster control plane

You can use clusteradm to join a cluster. For instance, using a KinD cluster as an example, run the following commands to join the cluster to the control plane:

kind create cluster --name cluster1 --kubeconfig /tmp/kind-cluster1.kubeconfig
clusteradm --kubeconfig=/tmp/multicluster-controlplane.kubeconfig get token --use-bootstrap-token
clusteradm --singleton=true --kubeconfig /tmp/kind-cluster1.kubeconfig join --hub-token <controlplane token> --hub-apiserver https://$mc_cp_node_ip:30443/ --cluster-name cluster1
clusteradm --kubeconfig=/tmp/multicluster-controlplane.kubeconfig accept --clusters cluster1

Verify the cluster join

Run this command to verify the cluster join:

kubectl --kubeconfig=/tmp/multicluster-controlplane.kubeconfig get managedcluster
NAME       HUB ACCEPTED   MANAGED CLUSTER URLS                  JOINED   AVAILABLE   AGE
cluster1   true           https://cluster1-control-plane:6443   True     True        5m25s

You should see that the managed cluster has joined the multicluster control plane. Congratulations!
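For scripted checks, the JOINED and AVAILABLE columns can be read from the ManagedCluster status conditions. This is a sketch with stated assumptions: the function name managedcluster_condition is hypothetical, and the condition type names (ManagedClusterJoined, ManagedClusterConditionAvailable) are taken from the OCM API as the author understands it, so verify them with kubectl get managedcluster cluster1 -o yaml.

```shell
# Hypothetical helper: print the status ("True", "False" or "Unknown") of one
# condition on a ManagedCluster.
# Arguments: <control plane kubeconfig> <cluster name> <condition type>
managedcluster_condition() {
  local kubeconfig="$1" cluster="$2" type="$3"
  kubectl --kubeconfig="$kubeconfig" get managedcluster "$cluster" \
    -o jsonpath="{.status.conditions[?(@.type==\"$type\")].status}"
}

# Example:
#   managedcluster_condition /tmp/multicluster-controlplane.kubeconfig cluster1 ManagedClusterJoined
```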

3.6 - FleetConfig Controller

What is the `FleetConfig Controller`

The fleetconfig-controller introduces the FleetConfig custom resource to the OCM ecosystem. It reconciles FleetConfig resources to declaratively manage the lifecycle of Open Cluster Management (OCM) multi-clusters.

The fleetconfig-controller will initialize an OCM hub and one or more managed clusters; add, remove, and upgrade clustermanagers and klusterlets when their bundle versions change; manage their feature gates; and uninstall all OCM components properly whenever a FleetConfig is deleted.

The controller is a lightweight wrapper around clusteradm. Anything you can accomplish imperatively via a series of clusteradm commands can now be accomplished declaratively using the fleetconfig-controller.

Quick Start

Prerequisites

Helm v3.17+

Installation

The controller is installed via Helm.

helm repo add ocm https://open-cluster-management.io/helm-charts
helm repo update ocm
helm install fleetconfig-controller ocm/fleetconfig-controller -n fleetconfig-system --create-namespace

By default, the Helm chart also produces a FleetConfig to orchestrate; however, that behaviour can be disabled. Refer to the chart README for full documentation.

Support Matrix

Support for orchestration of OCM multi-clusters varies based on the Kubernetes distribution and/or cloud provider.

Kubernetes Distribution	Support Level
Vanilla Kubernetes	✅ Fully Supported
Amazon EKS	✅ Fully Supported
Google GKE	✅ Fully Supported
Azure AKS	🚧 On Roadmap

4 - Administration

A few general guides on operating the open-cluster-management control plane and the managed clusters.

4.1 - Monitoring OCM using Prometheus-Operator

In this page, we provide a way to monitor your OCM environment using Prometheus-Operator.

Before you get started

You must have an OCM environment set up. You can also follow our recommended quick start guide to set up a playground OCM environment.

Then install the Prometheus-Operator in your hub cluster. You can run the following commands, copied from the official doc:

git clone https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus

# Create the namespace and CRDs, and then wait for them to be available before creating the remaining resources
kubectl create -f manifests/setup

# Wait until the "servicemonitors" CRD is created. The message "No resources found" means success in this context.
until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done

kubectl create -f manifests/

Monitoring the control-plane resource usage.

You can use kubectl port-forward to open the Prometheus UI in your browser on localhost:9090:

kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090

The following queries monitor the control-plane pods’ CPU usage, memory usage, and API request count for critical CRs:

rate(container_cpu_usage_seconds_total{namespace=~"open-cluster-management.*"}[3m])

container_memory_working_set_bytes{namespace=~"open-cluster-management.*"}

rate(apiserver_request_total{resource=~"managedclusters|managedclusteraddons|managedclustersetbindings|managedclustersets|addonplacementscores|placementdecisions|placements|manifestworks|manifestworkreplicasets"}[1m])
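While the port-forward above is running, the same queries can be issued from the command line through the Prometheus HTTP API instead of the browser UI. This is a minimal sketch; the function name prom_query and the PROM_URL environment variable are hypothetical conveniences, not something OCM or Prometheus defines.

```shell
# Hypothetical helper: run an instant PromQL query against Prometheus.
# Argument: <PromQL expression>
# PROM_URL (hypothetical variable) overrides the default port-forwarded address.
prom_query() {
  local query="$1" base_url="${PROM_URL:-http://localhost:9090}"
  # -G sends the URL-encoded data as query parameters of a GET request.
  curl -sS -G "$base_url/api/v1/query" --data-urlencode "query=$query"
}

# Example:
#   prom_query 'container_memory_working_set_bytes{namespace=~"open-cluster-management.*"}'
```

The response is JSON; pipe it through a JSON tool of your choice to extract the result vector.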

Visualized with Grafana

We provide an initial grafana dashboard for you to visualize the metrics. But you can also customize your own dashboard.

First, use the following command to proxy grafana service:

kubectl --namespace monitoring port-forward svc/grafana 3000

Next, open the grafana UI in your browser on localhost:3000.

Click the “Import Dashboard” and run the following command to copy a sample dashboard and paste it to the grafana:

curl https://raw.githubusercontent.com/open-cluster-management-io/open-cluster-management-io.github.io/main/content/en/getting-started/administration/assets/grafana-sample.json | pbcopy

Then, you will get a sample grafana dashboard that you can fine-tune further:

[Image: sample Grafana dashboard]

4.2 - Upgrading your OCM environment

This page provides the suggested steps to upgrade your OCM environment including both the hub cluster and the managed clusters. Overall the major steps you should follow are:

Read the release notes to confirm the latest OCM release version. (Note that some add-ons’ version might be different from OCM’s overall release version.)
Upgrade your command line tools clusteradm

Before you begin

You must have an existing OCM environment with the registration-operator running in your clusters. The registration-operator is already installed if you followed our recommended quick start guide to set up OCM. The operator is responsible for helping you upgrade the other components with ease.

Upgrade command-line tool

In order to retrieve the latest version of OCM’s command-line tool clusteradm, run the following one-liner command:

$ curl -L https://raw.githubusercontent.com/open-cluster-management-io/clusteradm/main/install.sh | bash

You should then see output similar to the following:

Getting the latest clusteradm CLI...
Your system is darwin_amd64

clusteradm CLI is detected:
Reinstalling clusteradm CLI - /usr/local/bin/clusteradm...

Installing v0.1.0 OCM clusteradm CLI...
Downloading https://github.com/open-cluster-management-io/clusteradm/releases/download/v0.1.0/clusteradm_darwin_amd64.tar.gz ...
clusteradm installed into /usr/local/bin successfully.

To get started with clusteradm, please visit https://open-cluster-management.io/getting-started/

Also, you can confirm the installed CLI version by running:

$ clusteradm version
client		    version	:v0.1.0
server release	version	: ...

Upgrade OCM Components via Command-line tool

Hub Cluster

For example, to upgrade OCM components in the hub cluster, run the following command:

$ clusteradm upgrade clustermanager --bundle-version=0.7.0

Then clusteradm will make sure everything in the hub cluster is upgraded to the expected version. To check the latest status after the upgrade, continue to run the following command:

$ clusteradm get hub-info

Managed Clusters

To upgrade the OCM components in the managed clusters, switch the client context (e.g., by overriding the KUBECONFIG environment variable), then run the following command:

$ clusteradm upgrade klusterlet --bundle-version=0.7.0

To check the status after the upgrade, continue running this command against the managed cluster:

$ clusteradm get klusterlet-info
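When there are several managed clusters, the two commands above can be repeated per cluster in a loop. This is a sketch that assumes you have one kubeconfig file per managed cluster; the function name upgrade_klusterlets and the file paths in the example are hypothetical.

```shell
# Hypothetical helper: upgrade the klusterlet on several managed clusters in turn.
# Arguments: <bundle version> <kubeconfig path>...
upgrade_klusterlets() {
  local version="$1"; shift
  local kc
  for kc in "$@"; do
    echo "Upgrading klusterlet using $kc"
    # Stop at the first failure so a problematic release is not rolled out further.
    clusteradm --kubeconfig="$kc" upgrade klusterlet --bundle-version="$version" || return 1
    # Show the klusterlet status after the upgrade.
    clusteradm --kubeconfig="$kc" get klusterlet-info
  done
}

# Example (paths are placeholders):
#   upgrade_klusterlets 0.7.0 /tmp/kind-cluster1.kubeconfig /tmp/kind-cluster2.kubeconfig
```

Upgrading one cluster at a time keeps the rollout progressive: a failure on one cluster stops the loop before the remaining clusters are touched.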

Upgrade OCM Components via Manual Edit

Hub Cluster

Upgrading the registration-operator

Navigate into the namespace where you installed registration-operator (named “open-cluster-management” by default) and edit the image version of its deployment resource:

$ kubectl -n open-cluster-management edit deployment cluster-manager

Then update the image tag version to your target release version, which is exactly the OCM’s overall release version.

--- image: quay.io/open-cluster-management/registration-operator:<old release>
+++ image: quay.io/open-cluster-management/registration-operator:<new release>

Upgrading the core components

After the registration-operator is upgraded, it’s time to upgrade the working modules of OCM. Edit the clustermanager custom resource to instruct the registration-operator to perform the automated upgrade:

$ kubectl edit clustermanager cluster-manager

In the clustermanager resource, you should see a few images listed in its spec:

apiVersion: operator.open-cluster-management.io/v1
kind: ClusterManager
metadata: ...
spec:
  registrationImagePullSpec: quay.io/open-cluster-management/registration:<target release>
  workImagePullSpec: quay.io/open-cluster-management/work:<target release>
  # NOTE: Placement release versioning differs from the OCM root version, please refer to the release note.
  placementImagePullSpec: quay.io/open-cluster-management/placement:<target release>

Replacing the old release version with the latest one and committing the changes triggers the upgrade in the background. Note that the upgrade can be tracked via the status of the clustermanager, so if anything goes wrong during the upgrade, it should be reflected in that status.
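As a non-interactive alternative to kubectl edit, the same spec fields can be set with a merge patch. This is a sketch that uses only the field names and image repositories shown in the spec above; the function name patch_clustermanager_images is hypothetical, and the release tags are arguments you must supply from the release notes.

```shell
# Hypothetical helper: set the image pull specs on the ClusterManager in one step.
# Arguments: <OCM release tag> [placement release tag, defaults to the OCM tag]
# Placement versioning may differ from the OCM version, hence the second argument.
patch_clustermanager_images() {
  local release="$1" placement_release="${2:-$1}"
  local repo="quay.io/open-cluster-management"
  kubectl patch clustermanager cluster-manager --type merge -p "{
    \"spec\": {
      \"registrationImagePullSpec\": \"$repo/registration:$release\",
      \"workImagePullSpec\": \"$repo/work:$release\",
      \"placementImagePullSpec\": \"$repo/placement:$placement_release\"
    }
  }"
}

# Example:
#   patch_clustermanager_images <target release> <placement release>
```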

Managed Clusters

Upgrading the registration-operator

The process is similar to upgrading the hub’s registration-operator; the only difference when upgrading a managed cluster is the name of the deployment. Note that before running the following command, you are expected to switch the context to access the managed cluster, not the hub.

$ kubectl -n open-cluster-management edit deployment klusterlet

Then, as before, update the image tag version to your target release version and commit the changes to upgrade the registration-operator.

Upgrading the agent components

After the registration-operator is upgraded, move on and edit the corresponding klusterlet custom resource to trigger the upgrading process in your managed cluster:

$ kubectl edit klusterlet klusterlet

In the spec of klusterlet, what is expected to be updated is also its image list:

apiVersion: operator.open-cluster-management.io/v1
kind: Klusterlet
metadata: ...
spec:
  ...
  registrationImagePullSpec: quay.io/open-cluster-management/registration:<target release>
  workImagePullSpec: quay.io/open-cluster-management/work:<target release>

After committing the updates, check the status of the klusterlet to confirm that everything is correctly upgraded. Repeat the above steps for each of the managed clusters to perform a cluster-by-cluster progressive upgrade.

Confirm the upgrade

Getting the overall status of the managed clusters will help you detect whether any of them has become unavailable:

$ kubectl get managedclusters

The upgrade is complete if all the steps above have succeeded.

4.3 - Feature Gates

Feature gates are a way to enable or disable experimental or optional features in Open Cluster Management (OCM). They provide a safe mechanism to gradually roll out new functionality and maintain backward compatibility.

Overview

OCM uses Kubernetes’ feature gate mechanism to control the availability of features across different components:

Hub Components: Features running on the hub cluster
Spoke Components: Features running on managed clusters

Feature gates follow a standard lifecycle:

Alpha (usually disabled by default; the tables below list exceptions such as `DefaultClusterSet`): Experimental features that may change or be removed
Beta (enabled by default): Well-tested features that are expected to be promoted to GA
GA (always enabled): Stable features that are part of the core functionality

Available Feature Gates

Registration Features

Hub Registration Features

Feature Gate	Default	Stage	Description
`DefaultClusterSet`	`true`	Alpha	When enabled, the registration hub controller maintains a default clusterset and a global clusterset. Clusters without a cluster set label are added to the default clusterset, and all clusters are included in the global clusterset.
`V1beta1CSRAPICompatibility`	`false`	Alpha	When enabled, the spoke registration agent issues CSR requests via the V1beta1 API.
`ManagedClusterAutoApproval`	`false`	Alpha	When enabled, managed cluster registration requests are approved automatically.
`ResourceCleanup`	`true`	Beta	When enabled, a GC controller is started to clean up resources in the cluster namespace after the cluster is deleted.
`ClusterProfile`	`false`	Alpha	When enabled, a new controller is started on the hub that can be used to sync ManagedCluster to ClusterProfile.
`ClusterImporter`	`false`	Alpha	When enabled, managed clusters are imported automatically for certain cluster providers, e.g. cluster-api.

Spoke Registration Features

Feature Gate	Default	Stage	Description
`ClusterClaim`	`true`	Beta	When enabled, a new controller is started in the spoke agent to manage the cluster-claim resources in the managed cluster.
`ClusterProperty`	`false`	Alpha	When enabled on the spoke agent, the claim controller is used to manage the managed cluster property.
`AddonManagement`	`true`	Beta	When enabled on the spoke agent, new controllers are started to manage managed cluster addon registration and to maintain the status of managed cluster addons by watching their leases.
`V1beta1CSRAPICompatibility`	`false`	Alpha	When enabled, the spoke registration agent issues CSR requests via the V1beta1 API.
`MultipleHubs`	`false`	Alpha	Enables configuration of multiple hub clusters for high availability. Users can configure multiple bootstrapkubeconfigs connecting to different hubs via the Klusterlet, and the agent decides which one to use.

Work Management Features

Hub Work Features

Feature Gate	Default	Stage	Description
`NilExecutorValidating`	`false`	Alpha	When enabled, the work-webhook validates a ManifestWork even when its executor is nil, checking execute-as permissions against the default executor.
`ManifestWorkReplicaSet`	`false`	Alpha	When enabled, a new controller is started on the hub that can be used to deploy manifestWorks to a group of clusters selected by a placement.
`CloudEventsDrivers`	`false`	Alpha	When enabled, the cloud events drivers (mqtt or grpc) are enabled for the hub controller, so that it can deliver manifestworks to the managed clusters via cloud events.

Spoke Work Features

Feature Gate	Default	Stage	Description
`ExecutorValidatingCaches`	`false`	Alpha	When enabled, a new controller is started in the work agent to cache subject access review validation results for executors.
`RawFeedbackJsonString`	`false`	Alpha	When enabled, the work agent returns the feedback result as a JSON string if the result is not a scalar value.

Addon Management Features

Feature Gate	Default	Stage	Description
`AddonManagement`	`true`	Beta	When enabled on the hub controller, a new controller is started to handle automatic addon installation and rollout.

Configuration Methods

1. Command Line Flags

Feature gates can be configured using command line flags when starting OCM components:

# Enable a single feature gate
clusteradm init --feature-gates=DefaultClusterSet=true

# Disable a feature gate
clusteradm init --feature-gates=ClusterClaim=false

# Configure multiple feature gates
clusteradm init --feature-gates=ClusterClaim=false,AddonManagement=true,DefaultClusterSet=false
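
When the list of gates is assembled in a script, a small helper can build the comma-separated value and catch malformed entries early. This is only a sketch: `feature_gates_flag` is a hypothetical name, and it checks the `Name=true|false` shape only, not whether the gate name actually exists in OCM.

```shell
# Join "Name=true" / "Name=false" arguments into one --feature-gates flag.
# Returns non-zero on a malformed entry so a typo does not reach clusteradm.
feature_gates_flag() {
  gates=""
  for gate in "$@"; do
    case "$gate" in
      ?*=true|?*=false) ;;
      *) echo "invalid feature gate: $gate" >&2; return 1 ;;
    esac
    # Append with a comma separator unless this is the first entry.
    gates="${gates:+$gates,}$gate"
  done
  printf '%s\n' "--feature-gates=$gates"
}
```

It could then be used as `clusteradm init "$(feature_gates_flag DefaultClusterSet=true ManagedClusterAutoApproval=true)"`.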

2. Operator Configuration

Feature gates can be configured through the ClusterManager and Klusterlet custom resources:

ClusterManager Configuration (Hub)

apiVersion: operator.open-cluster-management.io/v1
kind: ClusterManager
metadata:
  name: cluster-manager
spec:
  registrationConfiguration:
    featureGates:
    - feature: DefaultClusterSet
      mode: Enable
    - feature: ManagedClusterAutoApproval
      mode: Enable
  workConfiguration:
    featureGates:
    - feature: ManifestWorkReplicaSet
      mode: Enable
  addOnManagerConfiguration:
    featureGates:
    - feature: AddonManagement
      mode: Enable

Klusterlet Configuration (Spoke)

apiVersion: operator.open-cluster-management.io/v1
kind: Klusterlet
metadata:
  name: klusterlet
spec:
  registrationConfiguration:
    featureGates:
    - feature: ClusterClaim
      mode: Disable
    - feature: AddonManagement
      mode: Enable
  workConfiguration:
    featureGates:
    - feature: ExecutorValidatingCaches
      mode: Enable
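
For an already running klusterlet, the same fields can be changed without opening an editor by using `kubectl patch`. The sketch below assumes the Klusterlet resource is named `klusterlet`, as in the example above; the function name is hypothetical. Because a JSON merge patch replaces list fields as a whole, every registration gate you want to keep must be included in the patch.

```shell
# Set a single registration feature gate on the Klusterlet via a merge patch.
# CAUTION: the merge patch overwrites the entire featureGates list under
# registrationConfiguration, so gates that were set before are dropped.
set_klusterlet_registration_gate() {
  feature="$1"   # feature gate name, e.g. ClusterClaim
  mode="$2"      # Enable or Disable
  kubectl patch klusterlet klusterlet --type=merge -p \
    "{\"spec\":{\"registrationConfiguration\":{\"featureGates\":[{\"feature\":\"$feature\",\"mode\":\"$mode\"}]}}}"
}
```

For example, `set_klusterlet_registration_gate ClusterClaim Disable` is run against the managed cluster context.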
