Installation

Install the core control plane that includes cluster registration and manifests distribution on the hub cluster.

Install the klusterlet agent on the managed cluster so that it can be registered and managed by the hub cluster.

1 - Start the control plane

Prerequisite

  • The hub cluster should be v1.19+. (To run on hub cluster versions between v1.16 and v1.18, please manually enable the feature gate “V1beta1CSRAPICompatibility”.)
  • Currently the bootstrap process relies on client authentication via CSR. Therefore, if your Kubernetes distribution (like EKS) doesn’t support it, you can choose the multicluster-controlplane as the hub control plane instead.
  • Ensure kubectl and kustomize are installed.

Network requirements

Configure your network settings for the hub cluster to allow the following connections.

Direction | Endpoint                            | Protocol | Purpose                                  | Used by
Inbound   | https://{hub-api-server-url}:{port} | TCP      | Kubernetes API server of the hub cluster | OCM agents, including the add-on agents, running on the managed clusters

Install clusteradm CLI tool

It’s recommended to run the following command to download and install the latest release of the clusteradm command-line tool:

curl -L https://raw.githubusercontent.com/open-cluster-management-io/clusteradm/main/install.sh | bash

You can also install the latest development version (main branch) by running:

# Installing clusteradm to $GOPATH/bin/
GO111MODULE=off go get -u open-cluster-management.io/clusteradm/...

Bootstrap a cluster manager

Before installing the OCM components into your clusters, export the following environment variable in your terminal so that the command-line tool clusteradm can correctly identify the hub cluster.

# The context name of the hub cluster in your kubeconfig
export CTX_HUB_CLUSTER=<your hub cluster context>

Call clusteradm init:

# By default, it installs the latest release of the OCM components.
# Use e.g. "--bundle-version=latest" to install the latest development builds.
# NOTE: For hub cluster versions between v1.16 and v1.19, use the parameter: --use-bootstrap-token
clusteradm init --wait --context ${CTX_HUB_CLUSTER}

The clusteradm init command installs the registration-operator on the hub cluster, which is responsible for consistently installing and upgrading a few core components for the OCM environment.

After the init command completes, a generated command is output on the console to register your managed clusters. An example of the generated command is shown below.

clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub kube-apiserver endpoint> \
    --wait \
    --cluster-name <cluster_name>

It’s recommended to save the command somewhere secure for future use. If it’s lost, you can use clusteradm get token to get the generated command again.
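If you need to regenerate the command later, you can run clusteradm get token against the hub context, which prints the bootstrap token together with a ready-to-use clusteradm join command:

clusteradm get token --context ${CTX_HUB_CLUSTER}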

Check out the running instances of the control plane

kubectl -n open-cluster-management get pod --context ${CTX_HUB_CLUSTER}
NAME                               READY   STATUS    RESTARTS   AGE
cluster-manager-695d945d4d-5dn8k   1/1     Running   0          19d

Additionally, to check out the instances of OCM’s hub control plane, run the following command:

kubectl -n open-cluster-management-hub get pod --context ${CTX_HUB_CLUSTER}
NAME                               READY   STATUS    RESTARTS   AGE
cluster-manager-placement-controller-857f8f7654-x7sfz      1/1     Running   0          19d
cluster-manager-registration-controller-85b6bd784f-jbg8s   1/1     Running   0          19d
cluster-manager-registration-webhook-59c9b89499-n7m2x      1/1     Running   0          19d
cluster-manager-work-webhook-59cf7dc855-shq5p              1/1     Running   0          19d
...

The overall installation information is visible on the clustermanager custom resource:

kubectl get clustermanager cluster-manager -o yaml --context ${CTX_HUB_CLUSTER}

Uninstall the OCM from the control plane

Before uninstalling the OCM components from your clusters, please detach the managed cluster from the control plane.

clusteradm clean --context ${CTX_HUB_CLUSTER}

Check that the instances of OCM’s hub control plane are removed.

kubectl -n open-cluster-management-hub get pod --context ${CTX_HUB_CLUSTER}
No resources found in open-cluster-management-hub namespace.
kubectl -n open-cluster-management get pod --context ${CTX_HUB_CLUSTER}
No resources found in open-cluster-management namespace.

Check that the clustermanager resource is removed from the control plane.

kubectl get clustermanager --context ${CTX_HUB_CLUSTER}
error: the server doesn't have a resource type "clustermanager"

2 - Register a cluster

After the cluster manager is installed on the hub cluster, you need to install the klusterlet agent on another cluster so that it can be registered and managed by the hub cluster.

Prerequisite

  • The managed clusters should be v1.11+.
  • Ensure kubectl and kustomize are installed.

Network requirements

Configure your network settings for the managed clusters to allow the following connections.

Direction | Endpoint                            | Protocol | Purpose                                  | Used by
Outbound  | https://{hub-api-server-url}:{port} | TCP      | Kubernetes API server of the hub cluster | OCM agents, including the add-on agents, running on the managed clusters

To use a proxy, please make sure the proxy server is configured to allow the above connections and is reachable from the managed clusters. See Register a cluster to hub through proxy server for more details.

Install clusteradm CLI tool

It’s recommended to run the following command to download and install the latest release of the clusteradm command-line tool:

curl -L https://raw.githubusercontent.com/open-cluster-management-io/clusteradm/main/install.sh | bash

You can also install the latest development version (main branch) by running:

# Installing clusteradm to $GOPATH/bin/
GO111MODULE=off go get -u open-cluster-management.io/clusteradm/...

Bootstrap a klusterlet

Before installing the OCM components into your clusters, export the following environment variables in your terminal so that the command-line tool clusteradm can correctly identify the hub and the managed cluster:

# The context name of the clusters in your kubeconfig
export CTX_HUB_CLUSTER=<your hub cluster context>
export CTX_MANAGED_CLUSTER=<your managed cluster context>

Copy the previously generated clusteradm join command and add the arguments appropriate for your distribution (the first variant below is for KinD clusters; the second is for other distributions).

NOTE: If there is no configmap kube-root-ca.crt in the kube-public namespace of the hub cluster, the flag --ca-file should be set to provide a valid hub CA file so that the external client can be set up.
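If you need to produce such a file, one possible sketch is to extract the CA bundle embedded in your hub kubeconfig, assuming the kubeconfig stores it as certificate-authority-data (adjust for your setup):

# Extract the hub CA from the hub kubeconfig context into hub-ca.crt (sketch; adjust for your setup)
kubectl config view --raw --minify --flatten --context ${CTX_HUB_CLUSTER} \
    -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 -d > hub-ca.crt

# Then append "--ca-file hub-ca.crt" to the clusteradm join command below.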

# NOTE: For KinD clusters use the parameter: --force-internal-endpoint-lookup
# NOTE: "cluster1" can be replaced by any other arbitrary unique cluster name
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --force-internal-endpoint-lookup \
    --context ${CTX_MANAGED_CLUSTER}

# For other distributions:
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --context ${CTX_MANAGED_CLUSTER}

Bootstrap a klusterlet in hosted mode (Optional)

With the command above, the klusterlet components (registration-agent and work-agent) are deployed on the managed cluster, so the hub cluster must be exposed to the managed cluster. OCM also provides an option to run the klusterlet components outside the managed cluster, for example on the hub cluster (hosted mode).

Hosted mode deployment is still in an experimental stage; consider using it only when you:

  • want to reduce the footprint of the managed cluster, or
  • do not want to expose the hub cluster to the managed cluster directly.

In hosted mode, the cluster where the klusterlet runs is called the hosting cluster. Run the following command against the hosting cluster to register the managed cluster with the hub.

# NOTE for KinD clusters:
#  1. If the hub is KinD, use the parameter: --force-internal-endpoint-lookup
#  2. If the managed cluster is KinD, --managed-cluster-kubeconfig should be internal: `kind get kubeconfig --name managed --internal`
# NOTE: "cluster1" can be replaced by any other arbitrary unique cluster name
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --mode hosted \
    --managed-cluster-kubeconfig <your managed cluster kubeconfig> \
    --force-internal-endpoint-lookup \
    --context <your hosting cluster context>

# For other distributions:
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --mode hosted \
    --managed-cluster-kubeconfig <your managed cluster kubeconfig> \
    --context <your hosting cluster context>

Bootstrap a klusterlet in singleton mode

To reduce the footprint of the agent in the managed cluster, singleton mode was introduced in v0.12.0. In singleton mode, the work and registration agents run as a single pod in the managed cluster.

Note: to run the klusterlet in singleton mode, you must use a clusteradm version equal to or higher than v0.12.0.

# NOTE: For KinD clusters use the parameter: --force-internal-endpoint-lookup
# NOTE: "cluster1" can be replaced by any other arbitrary unique cluster name
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --singleton \
    --force-internal-endpoint-lookup \
    --context ${CTX_MANAGED_CLUSTER}

# For other distributions:
clusteradm join \
    --hub-token <your token data> \
    --hub-apiserver <your hub cluster endpoint> \
    --wait \
    --cluster-name "cluster1" \
    --singleton \
    --context ${CTX_MANAGED_CLUSTER}

Accept the join request and verify

After the OCM agent is running on your managed cluster, it sends a “handshake” to your hub cluster and waits for approval from the hub cluster admin. In this section, we will walk through accepting the registration requests from the perspective of an OCM hub admin.

  1. Wait for the CSR object to be created on the hub cluster by your managed cluster’s OCM agent:

    kubectl get csr -w --context ${CTX_HUB_CLUSTER} | grep cluster1  # or the previously chosen cluster name
    

    An example of a pending CSR request is shown below:

    cluster1-tqcjj   33s   kubernetes.io/kube-apiserver-client   system:serviceaccount:open-cluster-management:cluster-bootstrap   Pending
    
  2. Accept the join request using the clusteradm tool:

    clusteradm accept --clusters cluster1 --context ${CTX_HUB_CLUSTER}
    

    After running the accept command, the CSR from your managed cluster named “cluster1” will be approved. Additionally, it instructs the OCM hub control plane to set up related objects (such as a namespace named “cluster1” in the hub cluster) and RBAC permissions automatically; a quick check for this is sketched after this list.

  3. Verify the installation of the OCM agents on your managed cluster by running:

    kubectl -n open-cluster-management-agent get pod --context ${CTX_MANAGED_CLUSTER}
    NAME                                             READY   STATUS    RESTARTS   AGE
    klusterlet-registration-agent-598fd79988-jxx7n   1/1     Running   0          19d
    klusterlet-work-agent-7d47f4b5c5-dnkqw           1/1     Running   0          19d
    
  4. Verify that the cluster1 ManagedCluster object was created successfully by running:

    kubectl get managedcluster --context ${CTX_HUB_CLUSTER}
    

    Then you should get a result that resembles the following:

    NAME       HUB ACCEPTED   MANAGED CLUSTER URLS      JOINED   AVAILABLE   AGE
    cluster1   true           <your endpoint>           True     True        5m23s
    

If the AVAILABLE status of the managed cluster is not True, refer to Troubleshooting to debug your cluster.
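As an additional sanity check, you can confirm that the accept step created the per-cluster namespace on the hub (named after the managed cluster, “cluster1” in this walkthrough):

kubectl get namespace cluster1 --context ${CTX_HUB_CLUSTER}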

Apply a Manifestwork

After the managed cluster is registered, test that you can deploy a pod to the managed cluster from the hub cluster. Create a manifest-work.yaml as shown in this example:

apiVersion: work.open-cluster-management.io/v1
kind: ManifestWork
metadata:
  name: mw-01
  namespace: ${MANAGED_CLUSTER_NAME}
spec:
  workload:
    manifests:
      - apiVersion: v1
        kind: Pod
        metadata:
          name: hello
          namespace: default
        spec:
          containers:
            - name: hello
              image: busybox
              command: ["sh", "-c", 'echo "Hello, Kubernetes!" && sleep 3600']
          restartPolicy: OnFailure

Apply the yaml file to the hub cluster.

kubectl apply -f manifest-work.yaml --context ${CTX_HUB_CLUSTER}

Verify that the manifestwork resource was applied to the hub.

kubectl -n ${MANAGED_CLUSTER_NAME} get manifestwork/mw-01 --context ${CTX_HUB_CLUSTER} -o yaml

Check on the managed cluster and see that the hello Pod has been deployed from the hub cluster.

$ kubectl -n default get pod --context ${CTX_MANAGED_CLUSTER}
NAME    READY   STATUS    RESTARTS   AGE
hello   1/1     Running   0          108s
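You can also confirm from the hub side that the work agent reported status back for the ManifestWork. A quick sketch that prints the status condition types and their statuses (you would typically expect conditions such as Applied and Available to be True):

kubectl -n ${MANAGED_CLUSTER_NAME} get manifestwork mw-01 --context ${CTX_HUB_CLUSTER} \
    -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'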

Troubleshooting

  • If the managed cluster status is not true.

    For example, the result below is shown when checking managedcluster.

    $ kubectl get managedcluster --context ${CTX_HUB_CLUSTER}
    NAME                   HUB ACCEPTED   MANAGED CLUSTER URLS   JOINED   AVAILABLE   AGE
    ${MANAGED_CLUSTER_NAME} true           https://localhost               Unknown     46m
    

    There are many reasons for this problem. You can use the commands below to get more debug information. If that doesn’t help, please file an issue with us.

    On the hub cluster, check the managedcluster status.

    kubectl get managedcluster ${MANAGED_CLUSTER_NAME} --context ${CTX_HUB_CLUSTER} -o yaml
    

    On the hub cluster, check the lease status.

    kubectl get lease -n ${MANAGED_CLUSTER_NAME} --context ${CTX_HUB_CLUSTER}
    

    On the managed cluster, check the klusterlet status.

    kubectl get klusterlet -o yaml --context ${CTX_MANAGED_CLUSTER}
    

Detach the cluster from hub

Remove the resources generated when registering with the hub cluster.

clusteradm unjoin --cluster-name "cluster1" --context ${CTX_MANAGED_CLUSTER}

Check that the OCM agent has been removed from the managed cluster.

kubectl -n open-cluster-management-agent get pod --context ${CTX_MANAGED_CLUSTER}
No resources found in open-cluster-management-agent namespace.

Check that the klusterlet has been removed from the managed cluster.

kubectl get klusterlet --context ${CTX_MANAGED_CLUSTER}
error: the server doesn't have a resource type "klusterlet"

3 - Add-on management

Add-on enablement

From a user’s perspective, to install an add-on to the hub cluster, the hub admin should register a globally unique ClusterManagementAddOn resource as a singleton placeholder in the hub cluster. For instance, the helloworld add-on can be registered to the hub cluster by creating:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: helloworld
spec:
  addOnMeta:
    displayName: helloworld

Enable the add-on manually

The addon manager running on the hub is responsible for configuring the installation of the addon agents for each managed cluster. When a user wants to enable the add-on for a certain managed cluster, the user should create a ManagedClusterAddOn resource in the cluster namespace. The name of the ManagedClusterAddOn must be the same as the name of the corresponding ClusterManagementAddOn. For instance, the following example enables the helloworld add-on in “cluster1”:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: helloworld
  namespace: cluster1
spec:
  installNamespace: helloworld
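Once the ManagedClusterAddOn is applied, you can check the enablement status of the add-on for that cluster from the hub (assuming your kubectl context points at the hub cluster):

kubectl -n cluster1 get managedclusteraddon helloworld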

Enable the add-on automatically

If the add-on is developed with automatic installation, which supports auto-install by cluster discovery, then the ManagedClusterAddOn will be created automatically for all managed cluster namespaces, or only for the selected managed cluster namespaces.

Enable the add-on by install strategy

If the add-on is developed following the guidelines mentioned in managing the add-on agent lifecycle by addon-manager, the user can define an installStrategy in the ClusterManagementAddOn to specify on which clusters the ManagedClusterAddOn should be enabled. See install strategy for details.

Add-on healthiness

The healthiness of the addon instances is visible when we list the addons via kubectl:

$ kubectl get managedclusteraddon -A
NAMESPACE   NAME                     AVAILABLE   DEGRADED   PROGRESSING
<cluster>   <addon>                  True

The addon agent is expected to report its healthiness periodically as long as it is running. Optionally, the version of the addon agent can also be reflected in the resources so that we can control upgrading the agents progressively.

Clean the add-ons

Last but not least, a clean uninstallation of the addon is also supported by simply deleting the corresponding ClusterManagementAddOn resource from the hub cluster, which is the “root” of the whole addon. After the uninstallation, the OCM platform automatically sanitizes the clusters for you by removing all the addon components from both the hub cluster and the managed clusters.
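For example, removing the helloworld add-on from the whole fleet comes down to a single deletion on the hub cluster (assuming your kubectl context points at the hub cluster):

kubectl delete clustermanagementaddon helloworld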

Add-on lifecycle management

Install strategy

installStrategy specifies on which clusters the related ManagedClusterAddOns should be installed. For example, the following configuration enables the helloworld add-on on clusters selected by the placement-aws Placement (clusters whose platform.open-cluster-management.io cluster claim is aws).

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: helloworld
  annotations:
    addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
  addOnMeta:
    displayName: helloworld
  installStrategy:
    type: Placements
    placements:
    - name: placement-aws
      namespace: default
---
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: placement-aws
  namespace: default
spec:
  predicates:
    - requiredClusterSelector:
        claimSelector:
          matchExpressions:
            - key: platform.open-cluster-management.io
              operator: In
              values:
                - aws
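To see which clusters the install strategy resolved to, you can inspect the PlacementDecision objects generated for the placement. A quick check, assuming the standard cluster.open-cluster-management.io/placement label set by the placement controller:

kubectl -n default get placementdecisions \
    -l cluster.open-cluster-management.io/placement=placement-aws -o yaml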

Rollout strategy

With the rollout strategy defined in the ClusterManagementAddOn API, users can control the upgrade behavior of the addon when there are changes in the configurations.

For example, suppose the add-on user updates the “deploy-config” and wants to apply the change to a “canary” decision group first. If all of those add-ons upgrade successfully, the rest of the clusters are then upgraded progressively, at a rate of 25% of the clusters at a time. The rollout strategy can be defined as follows:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: helloworld
  annotations:
    addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
  addOnMeta:
    displayName: helloworld
  installStrategy:
    type: Placements
    placements:
    - name: placement-aws
      namespace: default
      configs:
      - group: addon.open-cluster-management.io
        resource: addondeploymentconfigs
        name: deploy-config
        namespace: open-cluster-management
      rolloutStrategy:
        type: Progressive
        progressive:
          mandatoryDecisionGroups:
          - groupName: "prod-canary-west"
          - groupName: "prod-canary-east"
          maxConcurrency: 25%
          minSuccessTime: 5m
          progressDeadline: 10m
          maxFailures: 2

In the above example with type Progressive, once the user updates the “deploy-config”, the controller rolls out to the clusters in mandatoryDecisionGroups first, then rolls out to the other clusters at the rate defined in maxConcurrency.

  • minSuccessTime is a “soak” time: the controller waits 5 minutes after a cluster reaches a successful state, as long as maxFailures isn’t breached. If the workload status remains successful after this 5-minute interval, the rollout progresses to the next cluster.
  • progressDeadline means the controller waits a maximum of 10 minutes for the workload to reach a successful state. If the workload fails to achieve success within 10 minutes, the controller stops waiting, marks the workload as “timeout”, and includes it in the count of maxFailures.
  • maxFailures means the controller can tolerate up to 2 clusters with a failed status; once maxFailures is breached, the rollout stops.

Currently the add-on supports 3 types of rolloutStrategy: All, Progressive and ProgressivePerGroup. For more information about the rollout strategies, check the Rollout Strategy document.
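For comparison, a minimal fragment (under the same placements entry as above) using the All type, which applies the configuration change to all selected clusters at once, could look like:

  installStrategy:
    type: Placements
    placements:
    - name: placement-aws
      namespace: default
      rolloutStrategy:
        type: All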

Add-on configurations

Default configurations

In ClusterManagementAddOn, spec.supportedConfigs is a list of configuration types supported by the add-on. defaultConfig represents the namespace and name of the default add-on configuration, for scenarios where all add-ons share the same configuration. Only one configuration of the same group and resource can be specified in defaultConfig.

In the example below, add-ons on all the clusters will use “default-deploy-config” and “default-example-config”.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: helloworld
  annotations:
    addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
  addOnMeta:
    displayName: helloworld
  supportedConfigs:
  - defaultConfig:
      name: default-deploy-config
      namespace: open-cluster-management
    group: addon.open-cluster-management.io
    resource: addondeploymentconfigs
  - defaultConfig:
      name: default-example-config
      namespace: open-cluster-management
    group: example.open-cluster-management.io
    resource: exampleconfigs

Configurations per install strategy

In ClusterManagementAddOn, spec.installStrategy.placements[].configs lists the configurations of ManagedClusterAddOn during installation for a group of clusters. Since OCM v0.15.0, multiple configurations with the same group and resource can be defined in this field. It overrides the Default configurations on the selected clusters by group and resource.

In the example below, add-ons on clusters selected by Placement placement-aws will use “deploy-config”, “example-config-1” and “example-config-2”, while all the other add-ons will still use “default-deploy-config” and “default-example-config”.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: helloworld
  annotations:
    addon.open-cluster-management.io/lifecycle: "addon-manager"
spec:
  addOnMeta:
    displayName: helloworld
  supportedConfigs:
  - defaultConfig:
      name: default-deploy-config
      namespace: open-cluster-management
    group: addon.open-cluster-management.io
    resource: addondeploymentconfigs
  installStrategy:
    type: Placements
    placements:
    - name: placement-aws
      namespace: default
      configs:
      - group: addon.open-cluster-management.io
        resource: addondeploymentconfigs
        name: deploy-config
        namespace: open-cluster-management
      - group: example.open-cluster-management.io
        resource: exampleconfigs
        name: example-config-1
        namespace: open-cluster-management
      - group: example.open-cluster-management.io
        resource: exampleconfigs
        name: example-config-2
        namespace: open-cluster-management

Configurations per cluster

In ManagedClusterAddOn, spec.configs is a list of add-on configurations, for scenarios where the current add-on has its own configurations. Since OCM v0.15.0, it also supports defining multiple configurations with the same group and resource. It overrides the Default configurations and the Configurations per install strategy defined in ClusterManagementAddOn by group and resource.

In the example below, the add-on on cluster1 will use “cluster1-deploy-config” and “cluster1-example-config”.

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: helloworld
  namespace: cluster1
spec:
  configs:
  - group: addon.open-cluster-management.io
    resource: addondeploymentconfigs
    name: cluster1-deploy-config
    namespace: open-cluster-management
  - group: example.open-cluster-management.io
    resource: exampleconfigs
    name: cluster1-example-config
    namespace: open-cluster-management

Supported configurations

Supported configurations are the configuration types that are allowed to override the add-on configurations defined in the ClusterManagementAddOn spec. They are listed in the ManagedClusterAddOn status.supportedConfigs, for example:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: helloworld
  namespace: cluster1
spec:
...
status:
...
  supportedConfigs:
  - group: addon.open-cluster-management.io
    resource: addondeploymentconfigs
  - group: example.open-cluster-management.io
    resource: exampleconfigs

Effective configurations

As described above, there are 3 places to define the add-on configurations; they have an override order and eventually only one set takes effect. The final effective configurations are listed in the ManagedClusterAddOn status.configReferences.

  • desiredConfig records the desired config and its spec hash.
  • lastAppliedConfig records the config that was in effect when the corresponding ManifestWork was applied successfully.

For example:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: helloworld
  namespace: cluster1
...
status:
...
  configReferences:
  - desiredConfig:
      name: cluster1-deploy-config
      namespace: open-cluster-management
      specHash: dcf88f5b11bd191ed2f886675f967684da8b5bcbe6902458f672277d469e2044
    group: addon.open-cluster-management.io
    lastAppliedConfig:
      name: cluster1-deploy-config
      namespace: open-cluster-management
      specHash: dcf88f5b11bd191ed2f886675f967684da8b5bcbe6902458f672277d469e2044
    lastObservedGeneration: 1
    name: cluster1-deploy-config
    resource: addondeploymentconfigs
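To tell at a glance whether a configuration change has been fully applied for an add-on, you can compare the desired and last-applied spec hashes from the status. A sketch, assuming your kubectl context points at the hub cluster:

kubectl -n cluster1 get managedclusteraddon helloworld \
    -o jsonpath='{range .status.configReferences[*]}{.resource}/{.desiredConfig.name}: desired={.desiredConfig.specHash} applied={.lastAppliedConfig.specHash}{"\n"}{end}'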