
IKO Plus: Operator Works From Home - IrisCluster Provisioning Across Kubernetes Clusters

IKO Helm Status: WFH

Here is an option for your headspace if you are designing a multi-cluster architecture and the Operator is an FTE to the design.  You can run the Operator from a central Kubernetes cluster (A) and point it at another Kubernetes cluster (B), so that when you apply an IrisCluster to B, the Operator works remotely on A and plans the cluster accordingly on B.  This design keeps some resource heat off the actual workload cluster, spares us some serviceaccounts/rbac, and gives us only one operator deployment to worry about, so we can concentrate on the IRIS workloads.

IKO woke up and decided against the commute for work, despite needing to operate a development workload of many IrisClusters at the office that day.  Using the saved windshield time, IKO upgraded its helm values on a Kubernetes cluster at home, bounced itself, and went for a run.  Once settled back in, inspecting its logs, it could see it had planned many IrisClusters on the Office Kubernetes cluster, all at the cost of its own internet and power.

Here is how IKO managed this...

Clusters

Let's provision two Kind clusters, ikohome and ikowork.

 

ikokind.sh

cat <<EOF | kind create cluster --name ikohome --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
networking:
  disableDefaultCNI: true
EOF

cat <<EOF | kind create cluster --name ikowork --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
networking:
  disableDefaultCNI: true
EOF

kind get kubeconfig --name ikohome > ikohome.kubeconfig
kind get kubeconfig --name ikowork > ikowork.kubeconfig

KUBECONFIGS=("ikohome.kubeconfig" "ikowork.kubeconfig")

for cfg in "${KUBECONFIGS[@]}"; do
  echo ">>> Running against kubeconfig: $cfg"
  cilium install --version v1.18.0 --kubeconfig "$cfg"
  cilium status --wait --kubeconfig "$cfg"

  echo ">>> Finished $cfg"
  echo
done

 

After running the above, you should have two clusters running, loaded with the Cilium CNI and ready for business.
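If you want a quick sanity check before moving on (a minimal sketch, nothing IKO-specific yet), both clusters should report their nodes as Ready once Cilium settles:

for cfg in ikohome.kubeconfig ikowork.kubeconfig; do
  echo ">>> Nodes in $cfg"
  kubectl get nodes --kubeconfig "$cfg"
done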

Install IKO at HOME 🏠

First, we need to make the home cluster aware of the work cluster by loading the work cluster's kubeconfig as a Secret. You can get this done with the following:

kubectl create secret generic work-kubeconfig --from-file=config=ikowork.kubeconfig --kubeconfig ikohome.kubeconfig
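To confirm the Secret landed where the operator will expect it (this assumes the default namespace, same as the command above), something like this should print the top of the work kubeconfig:

kubectl get secret work-kubeconfig --kubeconfig ikohome.kubeconfig -o jsonpath='{.data.config}' | base64 -d | head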

Now, we need to make some changes to the IKO chart to WFH.

  • Mount the kubeconfig Secret as a volume in the operator
  • Point the operator at the kubeconfig in its arguments ( --kubeconfig )

The deployment.yaml in its entirety is below, edited right out of the factory, but here are the important points called out in the YAML.

Mount

        volumeMounts:
        ...
        - mountPath: /airgap/.kube
          name: kubeconfig
          readOnly: true
      volumes:
      ...
      - name: kubeconfig
        secret:
          secretName: work-kubeconfig
          items:
          - key: config
            path: config

Args

The args to the container need the same treatment... I tried this with the KUBECONFIG environment variable, but after taking a look at the controller code, I found out there is a precedence to such things, so the --kubeconfig argument is the reliable route.

      containers:
      - name: operator
        image: {{ .Values.operator.registry }}/{{ .Values.operator.repository }}:{{ .Values.operator.tag }}
        imagePullPolicy: {{ .Values.imagePullPolicy  }}
        args:
        - run
...
        - --kubeconfig=/airgap/.kube/config
...

 

 

deployment.yaml

# GKE returns Major:"1", Minor:"10+"{{-$major:=default"0".Capabilities.KubeVersion.Major| trimSuffix "+" | int64 }}
{{- $minor := default "0" .Capabilities.KubeVersion.Minor | trimSuffix "+" | int64 }}apiVersion:apps/v1kind:Deploymentmetadata:  name:{{template"iris-operator.fullname".}}  namespace:{{.Release.Namespace}}  labels:{{-include"iris-operator.labels".| nindent 4 }}
{{- if .Values.annotations }}  annotations:{{toYaml.Values.annotations| indent 4 }}
{{- end }}spec:  replicas:{{.Values.replicaCount}}  selector:    matchLabels:      app:"{{ template "iris-operator.name" . }}"      release:"{{ .Release.Name }}"  template:    metadata:      labels:{{-include"iris-operator.labels".| nindent 8 }}
{{- if or .Values.annotations (and .Values.criticalAddon (eq .Release.Namespace "kube-system")) }}      annotations:{{-ifand.Values.criticalAddon(eq.Release.Namespace"kube-system")}}scheduler.alpha.kubernetes.io/critical-pod:''{{-end}}{{-if.Values.annotations}}{{toYaml.Values.annotations| indent 8 }}
{{- end }}{{- end }}    spec:      serviceAccountName:{{template"iris-operator.serviceAccountName".}}{{-if.Values.imagePullSecrets}}      imagePullSecrets:{{toYaml.Values.imagePullSecrets| indent 6 }}
      {{- end }}      securityContext:# ensure that s/a token is readable xref: https://issues.k8s.io/70679        fsGroup:65535      containers:      - name:operator        image:{{.Values.operator.registry}}/{{.Values.operator.repository}}:{{.Values.operator.tag}}        imagePullPolicy:{{.Values.imagePullPolicy}}        args:        -run        ---v={{.Values.logLevel}}        ---secure-port=8443        ---kubeconfig=/airgap/.kube/config        ---audit-log-path=-        ---tls-cert-file=/var/serving-cert/tls.crt        ---tls-private-key-file=/var/serving-cert/tls.key        ---enable-mutating-webhook={{.Values.apiserver.enableMutatingWebhook}}        ---enable-validating-webhook={{.Values.apiserver.enableValidatingWebhook}}        ---bypass-validating-webhook-xray={{.Values.apiserver.bypassValidatingWebhookXray}}{{-ifand(not.Values.apiserver.disableStatusSubresource)(ge$major1)(ge$minor11)}}        ---enable-status-subresource=true{{-end}}        ---use-kubeapiserver-fqdn-for-aks={{.Values.apiserver.useKubeapiserverFqdnForAks}}        ports:        - containerPort:8443        env:        - name:MY_POD_NAME          valueFrom:            fieldRef:              fieldPath:metadata.name        - name:MY_POD_NAMESPACE          valueFrom:            fieldRef:              fieldPath:metadata.namespace        - name:ISC_USE_FQDN          value:{{default"true"(quote.Values.operator.useFQDN)}}        - name:ISC_WEBSERVER_PORT          value:{{default"52773"(quote.Values.operator.webserverPort)}}        - name:ISC_USE_IRIS_FSGROUP          value:{{default"false"(quote.Values.operator.useIrisFsGroup)}}        - name:ISC_NUM_THREADS          value:{{default"2"(quote.Values.operator.numThreads)}}        - name:ISC_RESYNC_PERIOD          value:{{default"10m"(quote.Values.operator.resyncPeriod)}}        - name:ISC_WEBGATEWAY_STARTUP_TIMEOUT          value:{{default"0"(quote.Values.operator.webGatewayStartupTimeout)}}        - name:KUBECONFIG          value:/airgap/.kube/config{{-if.Values.apiserver.healthcheck.enabled}}        readinessProbe:          httpGet:            path:/healthz            port:8443            scheme:HTTPS          initialDelaySeconds:5        livenessProbe:          httpGet:            path:/healthz            port:8443            scheme:HTTPS          initialDelaySeconds:5{{-end}}        resources:{{toYaml.Values.resources| indent 10 }}
        volumeMounts:        - mountPath:/var/serving-cert          name:serving-cert        - mountPath:/airgap/.kube          name:kubeconfig          readOnly:true      volumes:      - name:serving-cert        secret:          defaultMode:420          secretName:{{template"iris-operator.fullname".}}-apiserver-cert      - name:kubeconfig        secret:          secretName:work-kubeconfig          items:          - key:config            path:config{{-ifor.Values.tolerations(and.Values.criticalAddon(eq.Release.Namespace"kube-system"))}}      tolerations:{{-if.Values.tolerations}}{{toYaml.Values.tolerations| indent 8 }}
{{- end -}}{{- if and .Values.criticalAddon (eq .Release.Namespace "kube-system") }}      - key:CriticalAddonsOnly        operator:Exists{{-end-}}{{-end-}}{{-if.Values.affinity}}      affinity:{{toYaml.Values.affinity| indent 8 }}
{{- end -}}{{- if .Values.nodeSelector }}      nodeSelector:{{toYaml.Values.nodeSelector| indent 8 }}
{{- end -}}{{- if and .Values.criticalAddon (eq .Release.Namespace "kube-system") }}      priorityClassName:system-cluster-critical{{-end-}}

 

Chart

Same with the values.yaml; here I disabled the mutating and validating webhooks.

 

values.yaml

# Default values for iris-operator.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
replicaCount: 1

operator:
  registry: containers.intersystems.com
  repository: intersystems/iris-operator-amd
  tag: 3.8.42.100

  # Operator Environment Variables
  useFQDN: true
  webserverPort: 52773
  useIrisFsGroup: false
  numThreads: 2
  resyncPeriod: "10m"
  webGatewayStartupTimeout: 0

# https://github.com/appscodelabs/Dockerfiles/tree/master/kubectl
cleaner:
  registry: appscode
  repository: kubectl
  tag: v1.14

## Optionally specify an array of imagePullSecrets.
## Secrets must be manually created in the namespace.
## ref: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
##
imagePullSecrets:
  - name: dockerhub-secret

## Specify a imagePullPolicy
## ref: http://kubernetes.io/docs/user-guide/images/#pre-pulling-images
##
imagePullPolicy: Always

## Installs voyager operator as critical addon
## https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
criticalAddon: false

## Log level for operator
logLevel: 3

## Annotations passed to operator pod(s).
##
annotations: {}

resources: {}

## Node labels for pod assignment
## Ref: https://kubernetes.io/docs/user-guide/node-selection/
##
nodeSelector:
  kubernetes.io/os: linux
  kubernetes.io/arch: amd64

## Tolerations for pod assignment
## Ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
##
tolerations: {}

## Affinity for pod assignment
## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
##
affinity: {}

## Install Default RBAC roles and bindings
rbac:
  # Specifies whether RBAC resources should be created
  create: true

serviceAccount:
  # Specifies whether a ServiceAccount should be created
  create: true
  # The name of the ServiceAccount to use.
  # If not set and create is true, a name is generated using the fullname template
  name:

apiserver:
  # groupPriorityMinimum is the minimum priority the group should have. Please see
  # https://github.com/kubernetes/kube-aggregator/blob/release-1.9/pkg/apis/apiregistration/v1beta1/types.go#L58-L64
  # for more information on proper values of this field.
  groupPriorityMinimum: 10000
  # versionPriority is the ordering of this API inside of the group. Please see
  # https://github.com/kubernetes/kube-aggregator/blob/release-1.9/pkg/apis/apiregistration/v1beta1/types.go#L66-L70
  # for more information on proper values of this field
  versionPriority: 15
  # enableMutatingWebhook is used to configure mutating webhook for Kubernetes workloads
  enableMutatingWebhook: false
  # enableValidatingWebhook is used to configure validating webhook for Kubernetes workloads
  enableValidatingWebhook: false
  # CA certificate used by main Kubernetes api server
  ca: not-ca-cert
  # If true, disables status sub resource for crds.
  disableStatusSubresource: true
  # If true, bypasses validating webhook xray checks
  bypassValidatingWebhookXray: true
  # If true, uses kube-apiserver FQDN for AKS cluster to workaround https://github.com/Azure/AKS/issues/522 (default true)
  useKubeapiserverFqdnForAks: true
  # healthcheck configures the readiness and liveliness probes for the operator pod.
  healthcheck:
    enabled: true

 

Deploy the chart @ home and make sure it's running.
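A minimal sketch of that deploy, assuming the edited chart sits in a local ./iris-operator directory, the release is named intersystems, and everything goes in the default namespace (adjust to your layout):

helm upgrade --install intersystems ./iris-operator --kubeconfig ikohome.kubeconfig
kubectl get deploy,pods --kubeconfig ikohome.kubeconfig
# label selector assumes the chart's default app label; adjust if yours differs
kubectl logs -l app=iris-operator --kubeconfig ikohome.kubeconfig --tail=20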

Install CRDs (only) at WORK 🏢

This may be news to you or it may not, but understand that the operator itself is what installs the CRDs into a cluster. So in order to work from home, the CRDs need to exist in the work cluster, but without the actual operator.

For this we can pull this maneuver:

kubectl get crd irisclusters.intersystems.com --kubeconfig ikohome.kubeconfig -o yaml > ikocrds.yaml
kubectl create -f ikocrds.yaml --kubeconfig ikowork.kubeconfig
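To double check the WFH setup, the CRD should now exist at work while no operator deployment should (names here are just what we created above):

kubectl get crd irisclusters.intersystems.com --kubeconfig ikowork.kubeconfig
kubectl get deploy -A --kubeconfig ikowork.kubeconfig | grep -i iris-operator || echo "no operator at work, as intended"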

IKO WFH

Now, let's level-set the state of things:

  • IKO is running at home, not at work
  • CRDs are loaded at work only

When we apply IrisClusters at work, the operator at home will plan and schedule them from home.
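As a sketch of that moment, here is a bare-bones IrisCluster applied at work; the spec below (name, image tag, single data node) is a placeholder and yours will differ, the point is only which kubeconfig each command targets:

cat <<EOF | kubectl apply --kubeconfig ikowork.kubeconfig -f -
apiVersion: intersystems.com/v1alpha1
kind: IrisCluster
metadata:
  name: wfh-demo
spec:
  topology:
    data:
      # placeholder image/tag, swap in whatever you actually run
      image: containers.intersystems.com/intersystems/iris-community:latest-em
EOF

# back at home, watch the operator plan it (label assumes the chart's default app label)
kubectl logs -l app=iris-operator --kubeconfig ikohome.kubeconfig -f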

Luckily, the whole "burn all .gifs" thing from the 90's got worked out for the demo.

 

Operator


IrisCluster



💥
