IKO Plus: Operator Works From Home - IrisCluster Provisioning Across Kubernetes Clusters
IKO Helm Status: WFH
Here is an option for your headspace if you are designing a multi-cluster architecture and the Operator is an FTE to the design. You can run the Operator from a central Kubernetes cluster (A) and point it at another Kubernetes cluster (B), so that when you apply an IrisCluster to B, the Operator works remotely from A and plans the cluster accordingly on B. This design keeps some resource heat off the actual workload cluster, spares us some ServiceAccounts and RBAC, and gives us only one operator deployment to worry about, so we can concentrate on the IRIS workloads.
IKO woke up and decided against the commute for work, despite needing to operate a development workload of many IrisClusters at the office that day. Using the saved windshield time, IKO upgraded its helm values on a Kubernetes cluster at home, bounced itself, and went for a run. Once settled back in, inspecting its logs, it could see it had planned many IrisClusters on the Office Kubernetes cluster, all at the cost of its own internet and power.
Here is how IKO managed this...
Clusters
Let's provision two Kind clusters, ikohome and ikowork.
ikokind.sh
cat <<EOF | kind create cluster --name ikohome --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
networking:
  disableDefaultCNI: true
EOF
cat <<EOF | kind create cluster --name ikowork --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
networking:
  disableDefaultCNI: true
EOF
kind get kubeconfig --name ikohome > ikohome.kubeconfig
kind get kubeconfig --name ikowork > ikowork.kubeconfig
KUBECONFIGS=("ikohome.kubeconfig" "ikowork.kubeconfig")
for cfg in "${KUBECONFIGS[@]}"; do
  echo ">>> Running against kubeconfig: $cfg"
  cilium install --version v1.18.0 --kubeconfig "$cfg"
  cilium status --wait --kubeconfig "$cfg"
  echo ">>> Finished $cfg"
  echo
done
After running the above, you should have two clusters running, loaded with the Cilium CNI and ready for business.
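A quick way to confirm both clusters are up, as a minimal sketch: the function name check_clusters is hypothetical, and it assumes the two kubeconfig files written by ikokind.sh are in the current directory.

```shell
# Hypothetical helper, not part of kind or Cilium.
# Waits until every node in each cluster reports Ready.
check_clusters() {
  for cfg in ikohome.kubeconfig ikowork.kubeconfig; do
    echo ">>> Checking nodes with $cfg"
    kubectl wait --for=condition=Ready nodes --all \
      --timeout=120s --kubeconfig "$cfg"
  done
}

# Usage:
#   check_clusters
```

Nodes only turn Ready once a CNI is running, so this also serves as a rough check that the Cilium install took.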
Install IKO at HOME 🏠
First we need to make the home cluster aware of the work cluster by loading the work kubeconfig into the home cluster as a Secret. You can get this done with the following:
kubectl create secret generic work-kubeconfig --from-file=config=ikowork.kubeconfig --kubeconfig ikohome.kubeconfig
Now we need to make some changes to the IKO chart to WFH:
- Mount the kubeconfig Secret as a Volume in the Operator
- Point to the kubeconfig in the operator arguments ( --kubeconfig )
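One thing worth checking before deploying, sketched here under assumptions: the kubeconfig in the work-kubeconfig Secret is read from inside the operator pod, so its server address has to be reachable from there. A kubeconfig that kind writes for use on the host typically points at a loopback address, which a pod would resolve to itself. The helper names below (kubeconfig_servers, warn_if_loopback) are hypothetical. kind's --internal flag is one possible way to get an in-network address; whether that address resolves and routes from a pod in the other cluster depends on your Docker network setup.

```shell
# Hypothetical helpers, not part of IKO or kind.

# Print every API server URL found in a kubeconfig file.
kubeconfig_servers() {
  # Strip the "server:" key and leading whitespace, print only the URL.
  sed -n 's/^[[:space:]]*server:[[:space:]]*//p' "$1"
}

# Return 0 and print a warning if any server URL is a loopback address,
# return 1 otherwise.
warn_if_loopback() {
  if kubeconfig_servers "$1" | grep -Eq '://(127\.|localhost|\[::1\])'; then
    echo "WARNING: $1 points at a loopback address; a pod cannot reach it"
    return 0
  fi
  return 1
}

# Usage (assumes the files from ikokind.sh exist):
#   if warn_if_loopback ikowork.kubeconfig; then
#     kind get kubeconfig --name ikowork --internal > ikowork.kubeconfig
#   fi
```

If the file is regenerated, the Secret has to be recreated from it as well.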
The deployment.yaml in its entirety is below, edited right out of the factory, but here are the important points called out in the YAML.
Mount
        volumeMounts:
        ...
        - mountPath: /airgap/.kube
          name: kubeconfig
          readOnly: true
      volumes:
      ...
      - name: kubeconfig
        secret:
          secretName: work-kubeconfig
          items:
          - key: config
            path: config
Args
The args to the container change too. I first tried this with the env "KUBECONFIG", but after taking a look at the controller code I found there is an order of precedence to such things, so the --kubeconfig flag it is.
      containers:
      - name: operator
        image: {{ .Values.operator.registry }}/{{ .Values.operator.repository }}:{{ .Values.operator.tag }}
        imagePullPolicy: {{ .Values.imagePullPolicy }}
        args:
        - run
        ...
        - --kubeconfig=/airgap/.kube/config
        ...
deployment.yaml
# GKE returns Major:"1", Minor:"10+"
{{- $major := default "0" .Capabilities.KubeVersion.Major | trimSuffix "+" | int64 }}
{{- $minor := default "0" .Capabilities.KubeVersion.Minor | trimSuffix "+" | int64 }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ template "iris-operator.fullname" . }}
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "iris-operator.labels" . | nindent 4 }}
{{- if .Values.annotations }}
  annotations:
{{ toYaml .Values.annotations | indent 4 }}
{{- end }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: "{{ template "iris-operator.name" . }}"
      release: "{{ .Release.Name }}"
  template:
    metadata:
      labels:
        {{- include "iris-operator.labels" . | nindent 8 }}
{{- if or .Values.annotations (and .Values.criticalAddon (eq .Release.Namespace "kube-system")) }}
      annotations:
{{- if and .Values.criticalAddon (eq .Release.Namespace "kube-system") }}
        scheduler.alpha.kubernetes.io/critical-pod: ''
{{- end }}
{{- if .Values.annotations }}
{{ toYaml .Values.annotations | indent 8 }}
{{- end }}
{{- end }}
    spec:
      serviceAccountName: {{ template "iris-operator.serviceAccountName" . }}
      {{- if .Values.imagePullSecrets }}
      imagePullSecrets:
{{ toYaml .Values.imagePullSecrets | indent 6 }}
      {{- end }}
      securityContext:
        # ensure that s/a token is readable xref: https://issues.k8s.io/70679
        fsGroup: 65535
      containers:
      - name: operator
        image: {{ .Values.operator.registry }}/{{ .Values.operator.repository }}:{{ .Values.operator.tag }}
        imagePullPolicy: {{ .Values.imagePullPolicy }}
        args:
        - run
        - --v={{ .Values.logLevel }}
        - --secure-port=8443
        - --kubeconfig=/airgap/.kube/config
        - --audit-log-path=-
        - --tls-cert-file=/var/serving-cert/tls.crt
        - --tls-private-key-file=/var/serving-cert/tls.key
        - --enable-mutating-webhook={{ .Values.apiserver.enableMutatingWebhook }}
        - --enable-validating-webhook={{ .Values.apiserver.enableValidatingWebhook }}
        - --bypass-validating-webhook-xray={{ .Values.apiserver.bypassValidatingWebhookXray }}
{{- if and (not .Values.apiserver.disableStatusSubresource) (ge $major 1) (ge $minor 11) }}
        - --enable-status-subresource=true
{{- end }}
        - --use-kubeapiserver-fqdn-for-aks={{ .Values.apiserver.useKubeapiserverFqdnForAks }}
        ports:
        - containerPort: 8443
        env:
        - name: MY_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: MY_POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: ISC_USE_FQDN
          value: {{ default "true" (quote .Values.operator.useFQDN)}}
        - name: ISC_WEBSERVER_PORT
          value: {{ default "52773" (quote .Values.operator.webserverPort)}}
        - name: ISC_USE_IRIS_FSGROUP
          value: {{ default "false" (quote .Values.operator.useIrisFsGroup)}}
        - name: ISC_NUM_THREADS
          value: {{ default "2" (quote .Values.operator.numThreads)}}
        - name: ISC_RESYNC_PERIOD
          value: {{ default "10m" (quote .Values.operator.resyncPeriod)}}
        - name: ISC_WEBGATEWAY_STARTUP_TIMEOUT
          value: {{ default "0" (quote .Values.operator.webGatewayStartupTimeout)}}
        - name: KUBECONFIG
          value: /airgap/.kube/config
{{- if .Values.apiserver.healthcheck.enabled }}
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8443
            scheme: HTTPS
          initialDelaySeconds: 5
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8443
            scheme: HTTPS
          initialDelaySeconds: 5
{{- end }}
        resources:
{{ toYaml .Values.resources | indent 10 }}
        volumeMounts:
        - mountPath: /var/serving-cert
          name: serving-cert
        - mountPath: /airgap/.kube
          name: kubeconfig
          readOnly: true
      volumes:
      - name: serving-cert
        secret:
          defaultMode: 420
          secretName: {{ template "iris-operator.fullname" . }}-apiserver-cert
      - name: kubeconfig
        secret:
          secretName: work-kubeconfig
          items:
          - key: config
            path: config
{{- if or .Values.tolerations (and .Values.criticalAddon (eq .Release.Namespace "kube-system")) }}
      tolerations:
{{- if .Values.tolerations }}
{{ toYaml .Values.tolerations | indent 8 }}
{{- end -}}
{{- if and .Values.criticalAddon (eq .Release.Namespace "kube-system") }}
      - key: CriticalAddonsOnly
        operator: Exists
{{- end -}}
{{- end -}}
{{- if .Values.affinity }}
      affinity:
{{ toYaml .Values.affinity | indent 8 }}
{{- end -}}
{{- if .Values.nodeSelector }}
      nodeSelector:
{{ toYaml .Values.nodeSelector | indent 8 }}
{{- end -}}
{{- if and .Values.criticalAddon (eq .Release.Namespace "kube-system") }}
      priorityClassName: system-cluster-critical
{{- end -}}
Chart
The same goes for the values.yaml; here I disabled the mutating and validating webhooks.
values.yaml
# Default values for iris-operator.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
replicaCount: 1
operator:
  registry: containers.intersystems.com
  repository: intersystems/iris-operator-amd
  tag: 3.8.42.100
  # Operator Environment Variables
  useFQDN: true
  webserverPort: 52773
  useIrisFsGroup: false
  numThreads: 2
  resyncPeriod: "10m"
  webGatewayStartupTimeout: 0
# https://github.com/appscodelabs/Dockerfiles/tree/master/kubectl
cleaner:
  registry: appscode
  repository: kubectl
  tag: v1.14
## Optionally specify an array of imagePullSecrets.
## Secrets must be manually created in the namespace.
## ref: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
##
imagePullSecrets:
  - name: dockerhub-secret
## Specify a imagePullPolicy
## ref: http://kubernetes.io/docs/user-guide/images/#pre-pulling-images
##
imagePullPolicy: Always
## Installs voyager operator as critical addon
## https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
criticalAddon: false
## Log level for operator
logLevel: 3
## Annotations passed to operator pod(s).
##
annotations: {}
resources: {}
## Node labels for pod assignment
## Ref: https://kubernetes.io/docs/user-guide/node-selection/
##
nodeSelector:
  kubernetes.io/os: linux
  kubernetes.io/arch: amd64
## Tolerations for pod assignment
## Ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
##
tolerations: {}
## Affinity for pod assignment
## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
##
affinity: {}
## Install Default RBAC roles and bindings
rbac:
  # Specifies whether RBAC resources should be created
  create: true
serviceAccount:
  # Specifies whether a ServiceAccount should be created
  create: true
  # The name of the ServiceAccount to use.
  # If not set and create is true, a name is generated using the fullname template
  name:
apiserver:
  # groupPriorityMinimum is the minimum priority the group should have. Please see
  # https://github.com/kubernetes/kube-aggregator/blob/release-1.9/pkg/apis/apiregistration/v1beta1/types.go#L58-L64
  # for more information on proper values of this field.
  groupPriorityMinimum: 10000
  # versionPriority is the ordering of this API inside of the group. Please see
  # https://github.com/kubernetes/kube-aggregator/blob/release-1.9/pkg/apis/apiregistration/v1beta1/types.go#L66-L70
  # for more information on proper values of this field
  versionPriority: 15
  # enableMutatingWebhook is used to configure mutating webhook for Kubernetes workloads
  enableMutatingWebhook: false
  # enableValidatingWebhook is used to configure validating webhook for Kubernetes workloads
  enableValidatingWebhook: false
  # CA certificate used by main Kubernetes api server
  ca: not-ca-cert
  # If true, disables status sub resource for crds.
  disableStatusSubresource: true
  # If true, bypasses validating webhook xray checks
  bypassValidatingWebhookXray: true
  # If true, uses kube-apiserver FQDN for AKS cluster to workaround https://github.com/Azure/AKS/issues/522 (default true)
  useKubeapiserverFqdnForAks: true
  # healthcheck configures the readiness and liveliness probes for the operator pod.
  healthcheck:
    enabled: true
Deploy the chart @ home and make sure it's running.
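As a rough sketch of that step: the chart directory, release name, and function name below are assumptions (the post does not state them), so adjust them to wherever your edited chart lives. The release goes into the same namespace where the work-kubeconfig Secret was created, which is the kubeconfig's default namespace here, because a pod can only mount Secrets from its own namespace.

```shell
# Hypothetical names: adjust CHART_DIR and RELEASE to your environment.
CHART_DIR="${CHART_DIR:-./iris-operator}"   # assumed path to the edited chart
RELEASE="${RELEASE:-iko}"                   # assumed release name

deploy_iko_home() {
  # Install (or upgrade) the edited chart into the home cluster only.
  helm upgrade --install "$RELEASE" "$CHART_DIR" \
    --kubeconfig ikohome.kubeconfig \
    --wait
  # Confirm the operator pod is up at home.
  kubectl get pods --kubeconfig ikohome.kubeconfig
}

# Usage:
#   deploy_iko_home
```

With --wait, Helm blocks until the Deployment is ready, so an operator that cannot come up shows as a failed install instead of a silent crash loop.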

Install CRDS (only) at WORK 🏢
This may be news to you, it may not, but understand that the operator actually installs the CRDs in the cluster. So, in order to work from home, the CRDs need to exist in the work cluster, but without the actual operator.
For this we can pull this maneuver:
kubectl get crd irisclusters.intersystems.com --kubeconfig ikohome.kubeconfig -o yaml > ikocrds.yaml
kubectl create -f ikocrds.yaml --kubeconfig ikowork.kubeconfig
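To confirm the CRD landed at work, a small check along these lines can help. The function name is hypothetical; the CRD name and kubeconfig file are the ones used above.

```shell
# Hypothetical helper: verifies the IrisCluster CRD is usable at work.
crd_ready_at_work() {
  # Block until the work API server reports the CRD as established.
  kubectl wait --for=condition=established \
    crd/irisclusters.intersystems.com \
    --timeout=60s \
    --kubeconfig ikowork.kubeconfig
  # List the resource types now served under the intersystems.com group.
  kubectl api-resources --api-group=intersystems.com \
    --kubeconfig ikowork.kubeconfig
}

# Usage:
#   crd_ready_at_work
```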
IKO WFH
Now, let's level set the state of things:
- IKO is running at home, not at work
- Only the CRDs are loaded at work
When we apply IrisClusters at work, the operator at home will plan and schedule them from home.
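In command form, that split looks roughly like the sketch below. The manifest file name and the operator Deployment name are assumptions, since they depend on your IrisCluster definition and on the release name the chart was installed under; the function names are hypothetical too.

```shell
# Hypothetical names: set these to match your manifest and Helm release.
IRISCLUSTER_FILE="${IRISCLUSTER_FILE:-iriscluster.yaml}"
OPERATOR_DEPLOY="${OPERATOR_DEPLOY:-iko-iris-operator}"

# Apply the IrisCluster to the work cluster.
apply_at_work() {
  kubectl apply -f "$IRISCLUSTER_FILE" --kubeconfig ikowork.kubeconfig
}

# Follow the operator logs in the home cluster while it plans the cluster.
watch_from_home() {
  kubectl logs "deployment/$OPERATOR_DEPLOY" --follow \
    --kubeconfig ikohome.kubeconfig
}

# See what the operator created at work.
pods_at_work() {
  kubectl get irisclusters,pods --kubeconfig ikowork.kubeconfig
}

# Usage:
#   apply_at_work && pods_at_work
#   watch_from_home
```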
Luckily, the whole "burn all .gifs" thing in the '90s got worked out for the demo.
Operator
IrisCluster
💥
Comments
The unstoppable K-Ninja! :-) Love it