Diving Deeper Into Operator Framework, Part 2

Overview


This is part 2 of my blog series, a deep dive into the Operator Framework.

In part 1, here, I walked you through a typical Kubernetes operator building process. Now it's time to answer the question of "who will monitor the monitors".

The answer is Operator Lifecycle Manager (OLM). OLM extends Kubernetes to provide a declarative way to install, manage, and upgrade Operators on a cluster.

The OLM is a component of the Operator Framework, an open source toolkit to manage Kubernetes native applications, called Operators, in a streamlined and scalable way.

Install OLM

$ operator-sdk olm install

Within 2 minutes or so, the installation will be done.

Let's take a look at what this command has installed, through its logs:

INFO[0002] Fetching CRDs for version "latest"
INFO[0002] Fetching resources for resolved version "latest"
INFO[0012] Creating CRDs and resources
INFO[0012]   Creating CustomResourceDefinition "catalogsources.operators.coreos.com"
INFO[0012]   Creating CustomResourceDefinition "clusterserviceversions.operators.coreos.com"
INFO[0012]   Creating CustomResourceDefinition "installplans.operators.coreos.com"
INFO[0012]   Creating CustomResourceDefinition "operatorconditions.operators.coreos.com"
INFO[0012]   Creating CustomResourceDefinition "operatorgroups.operators.coreos.com"
INFO[0012]   Creating CustomResourceDefinition "operators.operators.coreos.com"
INFO[0012]   Creating CustomResourceDefinition "subscriptions.operators.coreos.com"
INFO[0012]   Creating Namespace "olm"
INFO[0012]   Creating Namespace "operators"
INFO[0013]   Creating ServiceAccount "olm/olm-operator-serviceaccount"
INFO[0013]   Creating ClusterRole "system:controller:operator-lifecycle-manager"
INFO[0013]   Creating ClusterRoleBinding "olm-operator-binding-olm"
INFO[0013]   Creating Deployment "olm/olm-operator"
INFO[0013]   Creating Deployment "olm/catalog-operator"
INFO[0013]   Creating ClusterRole "aggregate-olm-edit"
INFO[0013]   Creating ClusterRole "aggregate-olm-view"
INFO[0013]   Creating OperatorGroup "operators/global-operators"
INFO[0016]   Creating OperatorGroup "olm/olm-operators"
INFO[0016]   Creating ClusterServiceVersion "olm/packageserver"
INFO[0017]   Creating CatalogSource "olm/operatorhubio-catalog"
INFO[0017] Waiting for deployment/olm-operator rollout to complete
INFO[0017]   Waiting for Deployment "olm/olm-operator" to rollout: 0 of 1 updated replicas are available
INFO[0035]   Deployment "olm/olm-operator" successfully rolled out
INFO[0035] Waiting for deployment/catalog-operator rollout to complete
INFO[0035]   Deployment "olm/catalog-operator" successfully rolled out
INFO[0035] Waiting for deployment/packageserver rollout to complete
INFO[0035]   Waiting for Deployment "olm/packageserver" to appear
INFO[0037]   Waiting for Deployment "olm/packageserver" to rollout: 0 of 2 updated replicas are available
INFO[0052]   Deployment "olm/packageserver" successfully rolled out
INFO[0052] Successfully installed OLM version "latest"

NAME                                            NAMESPACE    KIND                        STATUS
catalogsources.operators.coreos.com                          CustomResourceDefinition    Installed
clusterserviceversions.operators.coreos.com                  CustomResourceDefinition    Installed
installplans.operators.coreos.com                            CustomResourceDefinition    Installed
operatorconditions.operators.coreos.com                      CustomResourceDefinition    Installed
operatorgroups.operators.coreos.com                          CustomResourceDefinition    Installed
operators.operators.coreos.com                               CustomResourceDefinition    Installed
subscriptions.operators.coreos.com                           CustomResourceDefinition    Installed
olm                                                          Namespace                   Installed
operators                                                    Namespace                   Installed
olm-operator-serviceaccount                     olm          ServiceAccount              Installed
system:controller:operator-lifecycle-manager                 ClusterRole                 Installed
olm-operator-binding-olm                                     ClusterRoleBinding          Installed
olm-operator                                    olm          Deployment                  Installed
catalog-operator                                olm          Deployment                  Installed
aggregate-olm-edit                                           ClusterRole                 Installed
aggregate-olm-view                                           ClusterRole                 Installed
global-operators                                operators    OperatorGroup               Installed
olm-operators                                   olm          OperatorGroup               Installed
packageserver                                   olm          ClusterServiceVersion       Installed
operatorhubio-catalog                           olm          CatalogSource               Installed

Well, that's a lot! So, firstly, let's understand the OLM resources.

OLM resources

The following custom resource definitions (CRDs) are defined and managed by Operator Lifecycle Manager (OLM).

I try to use plain English, instead of copy-pasting from official docs, to explain what they are and how they work together.

Resource | Short name | Description
CatalogSource | catsrc | A repository of CSVs, CRDs, and packages that define an operator, or maybe a set of operators.
ClusterServiceVersion | csv | Operator metadata. For example: name, version, icon, required resources.
Subscription | sub | Keeps a CSV up to date by tracking a channel of a package in a CatalogSource.
InstallPlan | ip | Calculated list of resources to be created to automatically install or upgrade a CSV.
OperatorGroup | og | Defines where to watch for CRs: a namespace, multiple namespaces, or cluster-wide. It binds to a CSV through the olm.operatorGroup annotation.
OperatorCondition | N/A | Creates a communication channel between OLM and an Operator it manages. Operators can write to the Status.Conditions array to communicate complex states to OLM.

In the real world, the CatalogSource, Subscription and OperatorGroup are the most important objects that we must understand.

OLM status

The operator-sdk CLI provides a handy command to check the status of OLM:

$ operator-sdk olm status
INFO[0001] Fetching CRDs for version "0.18.1"
INFO[0001] Fetching resources for resolved version "v0.18.1"
INFO[0002] Successfully got OLM status for version "0.18.1"

NAME                                            NAMESPACE    KIND                        STATUS
operators.operators.coreos.com                               CustomResourceDefinition    Installed
operatorgroups.operators.coreos.com                          CustomResourceDefinition    Installed
operatorconditions.operators.coreos.com                      CustomResourceDefinition    Installed
installplans.operators.coreos.com                            CustomResourceDefinition    Installed
clusterserviceversions.operators.coreos.com                  CustomResourceDefinition    Installed
olm-operator                                    olm          Deployment                  Installed
olm-operator-binding-olm                                     ClusterRoleBinding          Installed
operatorhubio-catalog                           olm          CatalogSource               Installed
olm-operators                                   olm          OperatorGroup               Installed
aggregate-olm-view                                           ClusterRole                 Installed
catalog-operator                                olm          Deployment                  Installed
subscriptions.operators.coreos.com                           CustomResourceDefinition    Installed
aggregate-olm-edit                                           ClusterRole                 Installed
olm                                                          Namespace                   Installed
global-operators                                operators    OperatorGroup               Installed
operators                                                    Namespace                   Installed
packageserver                                   olm          ClusterServiceVersion       Installed
olm-operator-serviceaccount                     olm          ServiceAccount              Installed
catalogsources.operators.coreos.com                          CustomResourceDefinition    Installed
system:controller:operator-lifecycle-manager                 ClusterRole                 Installed

Build and publish the bundle image

Generate the bundle files, build and publish the bundle image:

# In part 1 we've already updated the Makefile to avoid specifying IMG
# make bundle IMG="brightzheng100/memcached-operator:v0.0.1"
# make bundle-build bundle-push IMAGE_TAG_BASE="brightzheng100/memcached-operator" VERSION="0.0.1"
make bundle
make bundle-build bundle-push

Note: The first time you run make bundle, you will be prompted for some metadata that is used to build the CSV:

Display name for the operator (required):
> Memcached Operator

Description for the operator (required):
> Memcached Operator built by Bright for fun

Provider's name for the operator (required):
> Bright Corp

Any relevant URL for the provider name (optional):
> https://brightzheng100.github.io

Comma-separated list of keywords for your operator (required):
> memcached,operator,kubernetes,operator-sdk

Comma-separated list of maintainers and their emails (e.g. 'name1:email1, name2:email2') (required):
> someemail AT someemail.com
...

And a series of folders and files will be generated, which include:

  • a new bundle folder
  • a new bundle.Dockerfile file
  • a new CSV file under config/manifests/bases/ folder

Once make bundle-build bundle-push completes, the bundle image has been built and published.

Run the bundle

Run the bundle to install the operator, optionally into a namespace of your choice:

$ operator-sdk run bundle docker.io/brightzheng100/memcached-operator-bundle:v0.0.1 -n operators

And we should be able to see some logs like:

INFO[0017] Successfully created registry pod: docker-io-brightzheng100-memcached-operator-bundle-v0-0-1
INFO[0018] Created CatalogSource: memcached-operator-catalog
INFO[0018] OperatorGroup "operator-sdk-og" created
INFO[0018] Created Subscription: memcached-operator-v0-0-1-sub
INFO[0025] Approved InstallPlan install-j4476 for the Subscription: memcached-operator-v0-0-1-sub
INFO[0025] Waiting for ClusterServiceVersion "default/memcached-operator.v0.0.1" to reach 'Succeeded' phase
INFO[0025]   Waiting for ClusterServiceVersion "default/memcached-operator.v0.0.1" to appear
INFO[0050]   Found ClusterServiceVersion "default/memcached-operator.v0.0.1" phase: Pending
INFO[0052]   Found ClusterServiceVersion "default/memcached-operator.v0.0.1" phase: Installing
INFO[0063]   Found ClusterServiceVersion "default/memcached-operator.v0.0.1" phase: Succeeded
INFO[0063] OLM has successfully installed "memcached-operator.v0.0.1"

The objects are installed into the specified namespace, which here is operators:

# Check out the pods
$ kubectl get pods -n operators
NAME                                                              READY   STATUS      RESTARTS   AGE
c551e776de29960763c9167350ea816c5da7be5de6ff4def66c1e48acc6w8vn   0/1     Completed   0          88s
docker-io-brightzheng100-memcached-operator-bundle-v0-0-1         1/1     Running     0          98s
memcached-operator-controller-manager-77d65c8c67-rjvt7            2/2     Running     0          76s

# Check out all CRs created
$ kubectl get CatalogSource,ClusterServiceVersion,Subscription,OperatorGroup,OperatorConditions -n operators
NAME                                                            DISPLAY              TYPE   PUBLISHER      AGE
catalogsource.operators.coreos.com/memcached-operator-catalog   memcached-operator   grpc   operator-sdk   2m37s

NAME                                                                   DISPLAY              VERSION   REPLACES   PHASE
clusterserviceversion.operators.coreos.com/memcached-operator.v0.0.1   Memcached Operator   0.0.1                Succeeded

NAME                                                              PACKAGE              SOURCE                       CHANNEL
subscription.operators.coreos.com/memcached-operator-v0-0-1-sub   memcached-operator   memcached-operator-catalog   alpha

NAME                                                  AGE
operatorgroup.operators.coreos.com/global-operators   6d22h

NAME                                                               AGE
operatorcondition.operators.coreos.com/memcached-operator.v0.0.1   2m10s

Note:

  1. If your bundle image is hosted in a registry that is private and/or has a custom CA, these configuration steps must be completed.
  2. This must be the bundle image instead of the operator image, otherwise you would get errors like FATA[0012] Failed to run bundle: load bundle metadata: metadata not found in bundle-073235499.

Create the CR

Let's track the logs of the created operator pod:

kubectl logs memcached-operator-controller-manager-77d65c8c67-rjvt7 -n operators -f --all-containers

And then create and delete the CR:

# Create a CR, which can refer to the example under /config/samples/cache_v1alpha1_memcached.yaml
$ kubectl apply -f - <<EOF
apiVersion: cache.example.com/v1alpha1
kind: Memcached
metadata:
  name: memcached-sample
spec:
  # Add fields here
  foo: bar
EOF

# Check it out
$ kubectl get Memcached
NAME               AGE
memcached-sample   29s

# Then delete it
$ kubectl delete Memcached/memcached-sample

We should see exactly the same message twice, once for the create and once for the delete, just as in the experiment we did in part 1:

2021-06-05T13:25:28.760Z	INFO	controllers.Memcached	great, the Reconcile is really triggered!
2021-06-05T13:26:18.726Z	INFO	controllers.Memcached	great, the Reconcile is really triggered!

So you've seen that the operator works the same as the one we manually deployed.

Let's dive a bit deeper with what we have

The operator has been installed into the purposely specified namespace operators, which can be any namespace actually.

Now let's check out the yaml, which is a deployment:

$ kubectl get deploy -n operators memcached-operator-controller-manager -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2021-06-05T13:21:57Z"
  generation: 2
  labels:
    olm.deployment-spec-hash: b95697767
    olm.owner: memcached-operator.v0.0.1
    olm.owner.kind: ClusterServiceVersion
    olm.owner.namespace: operators
    operators.coreos.com/memcached-operator.operators: ""
    manager: kube-controller-manager
    operation: Update
    time: "2021-06-05T13:28:35Z"
  name: memcached-operator-controller-manager
  namespace: operators
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: ClusterServiceVersion
    name: memcached-operator.v0.0.1
    uid: 05d41d88-8886-4ede-b2cc-8fd43f72259d
  resourceVersion: "469189"
  uid: 9bfef7a8-ac72-4a6d-833b-0d4a436b5a74
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      control-plane: controller-manager
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        alm-examples: |-
          [
            {
              "apiVersion": "cache.example.com/v1alpha1",
              "kind": "Memcached",
              "metadata": {
                "name": "memcached-sample"
              },
              "spec": {
                "foo": "bar"
              }
            }
          ]
        capabilities: Basic Install
        olm.operatorGroup: global-operators
        olm.operatorNamespace: operators
        olm.targetNamespaces: ""
        operatorframework.io/properties: '{"properties":[{"type":"olm.gvk","value":{"group":"cache.example.com","kind":"Memcached","version":"v1alpha1"}},{"type":"olm.package","value":{"packageName":"memcached-operator","version":"0.0.1"}}]}'
        operators.operatorframework.io/builder: operator-sdk-v1.7.1+git
        operators.operatorframework.io/project_layout: go.kubebuilder.io/v3
      creationTimestamp: null
      labels:
        control-plane: controller-manager
    spec:
      containers:
      - args:
        - --secure-listen-address=0.0.0.0:8443
        - --upstream=http://127.0.0.1:8080/
        - --logtostderr=true
        - --v=10
        env:
        - name: OPERATOR_CONDITION_NAME
          value: memcached-operator.v0.0.1
        image: gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0
        imagePullPolicy: IfNotPresent
        name: kube-rbac-proxy
        ports:
        - containerPort: 8443
          name: https
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - args:
        - --health-probe-bind-address=:8081
        - --metrics-bind-address=127.0.0.1:8080
        - --leader-elect
        command:
        - /manager
        env:
        - name: OPERATOR_CONDITION_NAME
          value: memcached-operator.v0.0.1
        image: brightzheng100/memcached-operator:v0.0.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8081
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 20
          successThreshold: 1
          timeoutSeconds: 1
        name: manager
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: 8081
            scheme: HTTP
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 100m
            memory: 30Mi
          requests:
            cpu: 100m
            memory: 20Mi
        securityContext:
          allowPrivilegeEscalation: false
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        runAsNonRoot: true
      serviceAccount: memcached-operator-controller-manager
      serviceAccountName: memcached-operator-controller-manager
      terminationGracePeriodSeconds: 1

Two interesting findings:

  1. There is a sidecar container named kube-rbac-proxy with image of gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0;

  2. From the ownerReferences section we know that the operator's parent is ClusterServiceVersion/memcached-operator.v0.0.1

The official docs don't say much about kube-rbac-proxy.

Actually it derives from this repo and is hosted by the Kubebuilder team in GCR. The main goal of having this container injected as a sidecar is to be a small HTTP proxy that can perform RBAC authorization against the Kubernetes API using SubjectAccessReview, protecting the operator behind it.
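To make the sidecar idea concrete, here is a minimal Go sketch of an authorizing reverse proxy built only on the standard library. It is not kube-rbac-proxy's code: the Authorizer function stands in for the SubjectAccessReview call, and the X-Demo-User header, the backend handler and the probe helper are all hypothetical names made up for this illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// Authorizer decides whether a user may access a path. The real sidecar asks
// the Kubernetes API (SubjectAccessReview); this sketch uses a plain function.
type Authorizer func(user, path string) bool

// guard wraps next so that only authorized requests reach it.
func guard(allow Authorizer, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Hypothetical header: a real proxy authenticates tokens or client certs.
		user := r.Header.Get("X-Demo-User")
		if !allow(user, r.URL.Path) {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// backend is a stand-in for the operator's metrics endpoint.
func backend() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "metrics")
	})
}

// probe sends one GET request through h and returns the HTTP status code.
func probe(h http.Handler, user, path string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	req.Header.Set("X-Demo-User", user)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	// Plays the role of --upstream=http://127.0.0.1:8080/ in the deployment.
	upstream := httptest.NewServer(backend())
	defer upstream.Close()

	target, err := url.Parse(upstream.URL)
	if err != nil {
		panic(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(target)

	// Toy policy: only the "prometheus" user may read /metrics.
	allow := func(user, path string) bool {
		return user == "prometheus" && path == "/metrics"
	}
	h := guard(allow, proxy)

	fmt.Println("prometheus:", probe(h, "prometheus", "/metrics"))
	fmt.Println("anonymous: ", probe(h, "anonymous", "/metrics"))
}
```

The point of the design is that the manager container only listens on 127.0.0.1, so every request from outside the pod has to pass the guard first.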

The definition of ClusterServiceVersion/memcached-operator.v0.0.1 can be found in the file bundle/manifests/memcached-operator.clusterserviceversion.yaml, where you will find this section:

  installModes:
  - supported: false
    type: OwnNamespace
  - supported: false
    type: SingleNamespace
  - supported: false
    type: MultiNamespace
  - supported: true
    type: AllNamespaces

So by default, the CSV generated by Operator SDK supports only the AllNamespaces install mode, which means the operator watches every namespace for the CR (Memcached in our case). That's why the CR we created in the default namespace still worked fine.
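The relationship between install modes and an OperatorGroup's target namespaces can be sketched in a few lines of Go. This is a simplified model for illustration, not OLM's implementation: the InstallMode type and the requiredMode and supports functions are names invented here, and the real rules have more cases (for example, a multi-namespace group that also includes the operator's own namespace).

```go
package main

import "fmt"

// InstallMode mirrors one entry of the CSV's installModes list.
type InstallMode struct {
	Type      string
	Supported bool
}

// requiredMode returns the install mode implied by an OperatorGroup's target
// namespaces, relative to the namespace the operator itself is installed in.
// Simplified on purpose: an empty target list means "all namespaces".
func requiredMode(operatorNamespace string, targetNamespaces []string) string {
	switch {
	case len(targetNamespaces) == 0:
		return "AllNamespaces"
	case len(targetNamespaces) > 1:
		return "MultiNamespace"
	case targetNamespaces[0] == operatorNamespace:
		return "OwnNamespace"
	default:
		return "SingleNamespace"
	}
}

// supports reports whether a CSV with the given install modes can join an
// OperatorGroup with the given target namespaces.
func supports(modes []InstallMode, operatorNamespace string, targetNamespaces []string) bool {
	want := requiredMode(operatorNamespace, targetNamespaces)
	for _, m := range modes {
		if m.Type == want {
			return m.Supported
		}
	}
	return false
}

func main() {
	// The install modes from the generated memcached-operator CSV.
	modes := []InstallMode{
		{Type: "OwnNamespace", Supported: false},
		{Type: "SingleNamespace", Supported: false},
		{Type: "MultiNamespace", Supported: false},
		{Type: "AllNamespaces", Supported: true},
	}

	// global-operators has olm.targetNamespaces: "", so it selects everything.
	fmt.Println(supports(modes, "operators", nil))

	// A group that targets only its own namespace would be rejected.
	fmt.Println(supports(modes, "operators", []string{"operators"}))
}
```

Under this model, switching the memcached operator to a namespaced OperatorGroup would first require flipping the matching installModes entry to supported: true in the CSV.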

How about upgrade?

Since it's named Operator Lifecycle Manager, let's see how it handles operator upgrades.

Make more changes in our operator code

Remember the "huge" change we made to our operator in part 1? This time we will make an even bigger change :)

In controllers/memcached_controller.go, from:

r.Log.Info("great, the Reconcile is really triggered!")

To:

r.Log.Info("great, the Reconcile is really triggered in v2!")

Build and publish the operator image

As usual, let's build and publish the image. But this time we're going to specify the version:

# Docker build and push with a specified VERSION to replace the default 0.0.1
$ make docker-build docker-push VERSION=0.0.2

Build and publish the operator bundle image

Now we can refresh the bundle with the new version, and then build and push the bundle image to publish it:

# Bundle it with specified VERSION too, this will update the CSV file
$ make bundle VERSION=0.0.2

# Then build and push the bundle image
$ make bundle-build bundle-push VERSION=0.0.2

We can even check the locally cached Docker images:

$ docker images | grep memcached
brightzheng100/memcached-operator-bundle              v0.0.2        dc0a255ecff9   19 seconds ago   10.1kB
brightzheng100/memcached-operator                     v0.0.2        b3e76e7f5f78   6 minutes ago    47.4MB
...

Upgrade the operator by upgrading the bundle

Finally, let's upgrade the bundle:

$ operator-sdk run bundle-upgrade docker.io/brightzheng100/memcached-operator-bundle:v0.0.2 -n operators
INFO[0005] Found existing subscription with name memcached-operator-v0-0-1-sub and namespace operators
INFO[0005] Found existing catalog source with name memcached-operator-catalog and namespace operators
INFO[0013] Successfully created registry pod: docker-io-brightzheng100-memcached-operator-bundle-v0-0-2
INFO[0013] Updated catalog source memcached-operator-catalog with address and annotations
INFO[0013] Deleted previous registry pod with name "docker-io-brightzheng100-memcached-operator-bundle-v0-0-1"
INFO[0047] Approved InstallPlan install-jgbx7 for the Subscription: memcached-operator-v0-0-1-sub
INFO[0047] Waiting for ClusterServiceVersion "operators/memcached-operator.v0.0.2" to reach 'Succeeded' phase
INFO[0047]   Waiting for ClusterServiceVersion "operators/memcached-operator.v0.0.2" to appear
INFO[0048]   Found ClusterServiceVersion "operators/memcached-operator.v0.0.2" phase: Installing
INFO[0059]   Found ClusterServiceVersion "operators/memcached-operator.v0.0.2" phase: Succeeded
INFO[0059] Successfully upgraded to "memcached-operator.v0.0.2"

If you trace the upgrade process carefully, you will find that it's quite similar to a blue-green deployment process:

  1. It downloads the desired version of bundle image and terminates the old bundle;

  2. A job starts to prepare and eventually starts a pod serving as operator registry

$ kubectl logs -n operators docker-io-brightzheng100-memcached-operator-bundle-v0-0-2
...
time="2021-06-06T08:20:04Z" level=info msg="Keeping server open for infinite seconds" database=/database/index.db port=50051
time="2021-06-06T08:20:04Z" level=info msg="serving registry" database=/database/index.db port=50051

  3. A new operator with the new version is provisioned and the old operator is terminated.
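How OLM knows which version comes next can be pictured as a chain of "replaces" links between CSVs (the REPLACES column in kubectl get csv). The Go sketch below walks such a chain. It only models the idea: the CSV struct and the upgradePath function are made up for this example, and real catalogs can also express skips and version ranges that are ignored here.

```go
package main

import "fmt"

// CSV holds the two ClusterServiceVersion fields that matter for this sketch.
type CSV struct {
	Name     string
	Replaces string // name of the CSV this one supersedes; empty for the first release
}

// upgradePath returns, in order, the CSVs that lead from the installed CSV to
// the newest one reachable by following "replaces" links forward.
func upgradePath(catalog []CSV, installed string) []string {
	// Index each CSV by the name it replaces, so the chain can be walked forward.
	next := make(map[string]string, len(catalog))
	for _, c := range catalog {
		if c.Replaces != "" {
			next[c.Replaces] = c.Name
		}
	}

	var path []string
	seen := map[string]bool{installed: true} // guards against accidental cycles
	for cur := installed; ; {
		n, ok := next[cur]
		if !ok || seen[n] {
			return path
		}
		path = append(path, n)
		seen[n] = true
		cur = n
	}
}

func main() {
	catalog := []CSV{
		{Name: "memcached-operator.v0.0.1"},
		{Name: "memcached-operator.v0.0.2", Replaces: "memcached-operator.v0.0.1"},
	}
	fmt.Println(upgradePath(catalog, "memcached-operator.v0.0.1"))
}
```

An empty result means the installed CSV is already the newest one in the chain.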

But wait, what is the operator bundle and why bundle image?

Even if you have followed the steps this far, you might still be curious or even confused: what is the bundle and why?

In the Operator Framework world, a "bundle" represents a specific version of an operator as a structured directory of files containing exactly one ClusterServiceVersion.

According to the docs here: A bundle typically includes a ClusterServiceVersion and the CRDs that define the owned APIs of the CSV in its manifest directory, though additional objects may be included. It also includes an annotation file in its metadata folder which defines some higher level aggregate data that helps to describe the format and package information about how the bundle should be added into an index of bundles.

In our case the folder structure is like:

$ tree bundle
bundle
├── manifests
│   ├── cache.example.com_memcacheds.yaml
│   ├── memcached-operator-controller-manager-metrics-service_v1_service.yaml
│   ├── memcached-operator-controller-manager_v1_serviceaccount.yaml
│   ├── memcached-operator-manager-config_v1_configmap.yaml
│   ├── memcached-operator-metrics-reader_rbac.authorization.k8s.io_v1_clusterrole.yaml
│   └── memcached-operator.clusterserviceversion.yaml
├── metadata
│   └── annotations.yaml
└── tests
    └── scorecard
        └── config.yaml

4 directories, 8 files

A bundle image, then, is simply an OCI-spec container image used to store the manifests and metadata of an individual operator bundle.
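The layout rules described above (one CSV under manifests, an annotations file under metadata) are easy to check mechanically. The Go sketch below does a rough version of that check. It is only an illustration and not how the Operator Framework tooling validates bundles: checkBundle and makeSampleBundle are hypothetical helpers, and a CSV is recognized purely by its file name suffix, which is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// checkBundle returns a list of layout problems found under dir.
// An empty result means the directory has the expected shape.
func checkBundle(dir string) []string {
	var problems []string

	// The metadata folder must carry the annotations file.
	if _, err := os.Stat(filepath.Join(dir, "metadata", "annotations.yaml")); err != nil {
		problems = append(problems, "missing metadata/annotations.yaml")
	}

	// The manifests folder must hold exactly one CSV.
	entries, err := os.ReadDir(filepath.Join(dir, "manifests"))
	if err != nil {
		return append(problems, "missing manifests directory")
	}
	csvs := 0
	for _, e := range entries {
		// Assumption: a CSV is identified by its file name suffix only.
		if !e.IsDir() && strings.HasSuffix(e.Name(), ".clusterserviceversion.yaml") {
			csvs++
		}
	}
	if csvs != 1 {
		problems = append(problems, fmt.Sprintf("expected exactly 1 CSV in manifests, found %d", csvs))
	}
	return problems
}

// makeSampleBundle creates a throwaway bundle directory containing empty files.
func makeSampleBundle() (string, error) {
	dir, err := os.MkdirTemp("", "bundle-")
	if err != nil {
		return "", err
	}
	files := []string{
		filepath.Join("manifests", "memcached-operator.clusterserviceversion.yaml"),
		filepath.Join("manifests", "cache.example.com_memcacheds.yaml"),
		filepath.Join("metadata", "annotations.yaml"),
	}
	for _, f := range files {
		full := filepath.Join(dir, f)
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			return "", err
		}
		if err := os.WriteFile(full, nil, 0o644); err != nil {
			return "", err
		}
	}
	return dir, nil
}

func main() {
	dir, err := makeSampleBundle()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	fmt.Println("problems:", checkBundle(dir))
}
```

To try it against a real project, call checkBundle("bundle") from the operator's root folder instead of using the sample directory.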

So if you recall the pods we have within our operators namespace:

$ kubectl get pod -n operators
NAME                                                              READY   STATUS      RESTARTS   AGE
a861f25fcb3303cc6d11282fb2ecc2e886f24ff6142bb7511df62ea0caw7v7t   0/1     Completed   0          71m
c551e776de29960763c9167350ea816c5da7be5de6ff4def66c1e48acc427qv   0/1     Completed   0          74m
docker-io-brightzheng100-memcached-operator-bundle-v0-0-2         1/1     Running     0          71m
memcached-operator-controller-manager-5f7468f5cc-rnf2b            2/2     Running     0          70m

You will realize that the pod docker-io-brightzheng100-memcached-operator-bundle-v0-0-2 serves as a gRPC source from which OLM discovers operators, and that it is registered through an object of type CatalogSource:

# Let's see what CatalogSource objects we have
$ kubectl get CatalogSource -n operators
NAME                         DISPLAY              TYPE   PUBLISHER      AGE
memcached-operator-catalog   memcached-operator   grpc   operator-sdk   77m

# How this CatalogSource looks like
$ kubectl get CatalogSource/memcached-operator-catalog -n operators -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  <OMITTED>
  name: memcached-operator-catalog
  namespace: operators
spec:
  address: 10.244.1.18:50051
  displayName: memcached-operator
  icon:
    base64data: ""
    mediatype: ""
  publisher: operator-sdk
  secrets:
  - ""
  sourceType: grpc
status:
  connectionState:
    address: 10.244.1.18:50051
    lastConnect: "2021-06-06T08:20:08Z"
    lastObservedState: READY
  registryService:
    createdAt: "2021-06-06T08:16:47Z"
    protocol: grpc

See this address: 10.244.1.18:50051, which is being served exactly by the pod of docker-io-brightzheng100-memcached-operator-bundle-v0-0-2, as the operator registry.

There must be a subscription too, to subscribe for the desired operator, and here it is:

# Yes, there is a subscription
$ kubectl get subscription -n operators
NAME                            PACKAGE              SOURCE                       CHANNEL
memcached-operator-v0-0-1-sub   memcached-operator   memcached-operator-catalog   alpha

# And the content is like this
$ kubectl get subscription/memcached-operator-v0-0-1-sub -n operators -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: memcached-operator-v0-0-1-sub
  namespace: operators
spec:
  channel: alpha
  installPlanApproval: Manual
  name: memcached-operator
  source: memcached-operator-catalog
  sourceNamespace: operators
  startingCSV: memcached-operator.v0.0.1

Now things are very clear:

  1. Using the bundle here is to set up an operator registry source for OLM to discover our operators -- this is more from a development lifecycle perspective;

  2. In the real world, it's more likely we're referring to a well-known registry, like OperatorHub.io. In that case, all we usually need to do is define the CatalogSource and Subscription objects.
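The way a Subscription points at a CatalogSource can be summed up as a three-step lookup: catalog (by namespace and name), then package, then channel. The Go sketch below models just that lookup with plain maps. The Subscription and Catalog types and the resolve function are simplifications invented for this example, and OLM's real resolver does considerably more (dependencies, approval, startingCSV and so on).

```go
package main

import "fmt"

// Subscription keeps only the spec fields used for the lookup.
type Subscription struct {
	Package         string // spec.name
	Channel         string // spec.channel
	Source          string // spec.source
	SourceNamespace string // spec.sourceNamespace
}

// Catalog maps package -> channel -> head CSV name, a rough picture of what
// a CatalogSource serves.
type Catalog map[string]map[string]string

// resolve returns the CSV name a Subscription currently points to.
func resolve(catalogs map[string]Catalog, sub Subscription) (string, error) {
	key := sub.SourceNamespace + "/" + sub.Source
	cat, ok := catalogs[key]
	if !ok {
		return "", fmt.Errorf("catalog source %s not found", key)
	}
	channels, ok := cat[sub.Package]
	if !ok {
		return "", fmt.Errorf("package %q not found in %s", sub.Package, key)
	}
	csv, ok := channels[sub.Channel]
	if !ok {
		return "", fmt.Errorf("channel %q not found for package %q", sub.Channel, sub.Package)
	}
	return csv, nil
}

func main() {
	// Catalog contents taken from the outputs shown in this post.
	catalogs := map[string]Catalog{
		"operators/memcached-operator-catalog": {
			"memcached-operator": {"alpha": "memcached-operator.v0.0.2"},
		},
		"olm/operatorhubio-catalog": {
			"etcd": {"singlenamespace-alpha": "etcdoperator.v0.9.4"},
		},
	}

	sub := Subscription{
		Package:         "memcached-operator",
		Channel:         "alpha",
		Source:          "memcached-operator-catalog",
		SourceNamespace: "operators",
	}
	csv, err := resolve(catalogs, sub)
	fmt.Println(csv, err)
}
```

Seen this way, the local bundle catalog and OperatorHub.io differ only in which catalog key the Subscription names.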

Let's further prove it by using the etcd operator published in OperatorHub.io

If you check carefully, you will find that there is already a pre-installed CatalogSource with curated community operators:

# Check out the catalogsource in olm
$ kubectl get catalogsource -n olm
NAME                    DISPLAY               TYPE   PUBLISHER        AGE
operatorhubio-catalog   Community Operators   grpc   OperatorHub.io   7d19h

# See the content
$ kubectl get catalogsource/operatorhubio-catalog -n olm -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: operatorhubio-catalog
  namespace: olm
spec:
  displayName: Community Operators
  image: quay.io/operatorhubio/catalog:latest
  publisher: OperatorHub.io
  sourceType: grpc

Now let's pick the etcd operator, which is published in OperatorHub.io, as an example to walk through the operator UX from a service provider standpoint.

Note: You may refer to here for the details of this operator.

We will use a "raw" way to walk it through to see how it works with OLM.

  1. Install the etcd operator by subscribing to the operatorhubio-catalog CatalogSource:
# Install the etcd operator
kubectl apply -f - <<EOF
---
apiVersion: v1
kind: Namespace
metadata:
  name: my-etcd
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: operatorgroup
  namespace: my-etcd
spec:
  targetNamespaces:
  - my-etcd
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: my-etcd
  namespace: my-etcd
spec:
  channel: singlenamespace-alpha
  name: etcd
  source: operatorhubio-catalog   # refer to the catalogsource/operatorhubio-catalog
  sourceNamespace: olm            # in olm namespace
EOF

# In seconds, you will see the etcd operator is up and running
$ kubectl get csv,pod -n my-etcd
NAME                                                                   DISPLAY              VERSION   REPLACES                    PHASE
clusterserviceversion.operators.coreos.com/etcdoperator.v0.9.4         etcd                 0.9.4     etcdoperator.v0.9.2         Succeeded
clusterserviceversion.operators.coreos.com/memcached-operator.v0.0.2   Memcached Operator   0.0.2     memcached-operator.v0.0.1   Succeeded

NAME                                 READY   STATUS    RESTARTS   AGE
pod/etcd-operator-59b94dd6df-v2xw2   3/3     Running   0          55s

  2. Declare and create the desired etcd cluster:

Once the operator is ready, it's time to create an etcd CR to provision an etcd cluster in an easy way:

# Create the EtcdCluster CR
$ kubectl -n my-etcd apply -f - <<EOF
apiVersion: etcd.database.coreos.com/v1beta2
kind: EtcdCluster
metadata:
  name: example
spec:
  size: 3
  version: 3.2.13
EOF

# Soon we will see the cluster is provisioned with desired 3 nodes
$ kubectl get pod -n my-etcd
NAME                             READY   STATUS    RESTARTS   AGE
etcd-operator-59b94dd6df-v2xw2   3/3     Running   0          17m
example-6sklt68tzj               1/1     Running   0          2m54s
example-fkz5dswdd8               1/1     Running   0          2m6s
example-fxlhbr9mpf               1/1     Running   0          86s

# Let's log into it to have a try
$ kubectl exec -it -n my-etcd example-6sklt68tzj -- sh
/ # etcdctl --endpoints http://127.0.0.1:2379 ls /
/ # etcdctl --endpoints http://127.0.0.1:2379 set /my-key my-value
my-value
/ # etcdctl --endpoints http://127.0.0.1:2379 ls /
/my-key

Yep, it works perfectly fine.

Conclusion

Operator Framework offers two major components: Operator SDK and Operator Lifecycle Manager (OLM).

In part 1, we've learned that Operator SDK has been a great tool to boost productivity for Kubernetes operator developers with a series of cool features:

  • Project scaffolding and code generation to bootstrap a new project fast
  • High-level APIs and abstractions so developers can write the operational logic more intuitively
  • Extensions to cover more common operator use cases

In this part 2, we've dived deeper into Operator Lifecycle Manager (OLM) which focuses more on the operator itself, around its lifecycle management. OLM defines a precise operational model to cover bundling, distribution, discovery, and provisioning so that we have a better way to collaborate, manage and operate Kubernetes-native applications, aka operators.

OLM Model

As a result, Operator Framework has become one of the best tools for building Kubernetes-native applications and platforms on top of the Kubernetes foundation.

Now if you look back at this quote, you will have a deeper understanding of the rationale: