Vendor-Neutral, Self-Managed K8s Cluster Part-2
In Part 1 of this blog, we set up an operational three-node cluster. In this part, we will define a series of YAML files (manifests) containing additional configuration, such as Ingress. These manifests not only define the desired state for Kubernetes, but also serve as verbose and accurate documentation for developers and system administrators. For this cluster, we will define manifests for Ingress, certificate management and file storage.
1) Inbound Web Traffic with Ingress Nginx
The majority of Kubernetes Services are assigned a ClusterIP and are only accessible from within the cluster. Kubernetes Ingress allows external HTTP and HTTPS connections to Services within the cluster. A Kubernetes Ingress resource defines a configuration that must be backed by an Ingress controller. Kubernetes does not provide a default Ingress controller, leaving administrators and system architects to pick one that fits the needs of their deployments.
To define the Ingress resources, first we will create the dedicated Namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: ingress-nginx
Then we will define a Role-Based Access Control (RBAC) ServiceAccount for Ingress Nginx.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nginx-ingress-serviceaccount
  namespace: ingress-nginx
We also need a configuration file describing an RBAC ClusterRole for Ingress Nginx. This config defines the cluster-wide access rules for several Kubernetes API endpoints.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: nginx-ingress-clusterrole
rules:
  - apiGroups: [""]
    resources: ["configmaps", "endpoints", "nodes", "pods", "secrets"]
    verbs: ["list", "watch"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["extensions"]
    resources: ["ingresses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "patch"]
  - apiGroups: ["extensions"]
    resources: ["ingresses/status"]
    verbs: ["update"]
Similar to the ClusterRole resource, we also need a Role resource for Ingress Nginx.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: nginx-ingress-role
  namespace: ingress-nginx
rules:
  - apiGroups: [""]
    resources: ["configmaps", "pods", "secrets", "namespaces"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames:
      - "ingress-controller-leader-nginx"
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get"]
Next, we will bind the Role and ClusterRole to the Ingress Nginx ServiceAccount with RoleBinding and ClusterRoleBinding resources.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: nginx-ingress-role-nisa-binding
  namespace: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: nginx-ingress-role
subjects:
  - kind: ServiceAccount
    name: nginx-ingress-serviceaccount
    namespace: ingress-nginx
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: nginx-ingress-clusterrole-nisa-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: nginx-ingress-clusterrole
subjects:
  - kind: ServiceAccount
    name: nginx-ingress-serviceaccount
    namespace: ingress-nginx
After creating the ServiceAccount, Role, ClusterRole and corresponding binding resources, we will define the Service resources to expose Ingress Nginx via HTTP and HTTPS.
apiVersion: v1
kind: Service
metadata:
  name: default-http-backend
  namespace: ingress-nginx
  labels:
    app: default-http-backend
spec:
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: default-http-backend
---
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  type: NodePort
  ports:
    - name: http
      port: 80
      targetPort: 80
      protocol: TCP
    - name: https
      port: 443
      targetPort: 443
      protocol: TCP
  selector:
    app: ingress-nginx
Now we will define the ConfigMap resources for Ingress Nginx.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
  labels:
    app: ingress-nginx
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: udp-services
  namespace: ingress-nginx
A Deployment resource is also needed for a default HTTP back-end server.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: default-http-backend
  namespace: ingress-nginx
  labels:
    app: default-http-backend
spec:
  replicas: 2
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: default-http-backend
  template:
    metadata:
      labels:
        app: default-http-backend
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - default-http-backend
                topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 60
      containers:
        - name: default-http-backend
          image: gcr.io/google_containers/defaultbackend:1.4
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 30
            timeoutSeconds: 5
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: 10m
              memory: 20Mi
            requests:
              cpu: 10m
              memory: 20Mi
Finally, we will create a configuration file describing a Kubernetes DaemonSet for the Ingress Nginx controller. This DaemonSet instructs Kubernetes to ensure that one Ingress Nginx controller is running on each node. The controller listens on TCP ports 80 (HTTP) and 443 (HTTPS).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx
spec:
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: ingress-nginx
  template:
    metadata:
      labels:
        app: ingress-nginx
      annotations:
        prometheus.io/port: '10254'
        prometheus.io/scrape: 'true'
    spec:
      serviceAccountName: nginx-ingress-serviceaccount
      hostNetwork: true
      containers:
        - name: nginx-ingress-controller
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.14.0
          args:
            - /nginx-ingress-controller
            - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
            - --configmap=$(POD_NAMESPACE)/nginx-configuration
            - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
            - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
            - --annotations-prefix=nginx.ingress.kubernetes.io
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          ports:
            - name: http
              containerPort: 80
              hostPort: 80
            - name: https
              containerPort: 443
              hostPort: 443
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          securityContext:
            runAsNonRoot: false
All resources for Ingress Nginx are now defined. You can either create each resource one by one, or create them all at once from within the directory:
$ kubectl apply -f ./
After all resources are created, the cluster is ready to accept web traffic through ports 80 and 443 on each node. This set of Ingress Nginx configuration manifests accurately represents the current or desired state of the cluster, while also providing documentation for others and the ability to reproduce this state at a later time, or on another cluster. In the next section, we will generate TLS certificates that Ingress Nginx can use to serve encrypted HTTPS traffic.
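With the controller running, individual applications can expose themselves through Ingress resources. As a minimal sketch (the hostname example.com, the Service name example-service and its port are placeholders, not part of this cluster's configuration), an Ingress routing HTTP traffic to a Service might look like:

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example-ingress
  namespace: default
  annotations:
    # Tell the Ingress Nginx controller to handle this resource
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: example-service
              servicePort: 80
```

The extensions/v1beta1 API group matches the Ingress resources this controller version watches, per the ClusterRole defined above.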
2) TLS/HTTPS with Cert Manager
Cert Manager automates the management and issuance of TLS certificates from various issuing sources. This tutorial uses Let's Encrypt for secure, free TLS certificate issuance, configured later through a Cert Manager custom resource called a ClusterIssuer.
To set up certificates, we will first create a dedicated Namespace resource.
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager
  labels:
    certmanager.k8s.io/disable-validation: "true"
Next, we will download Cert Manager's custom resource definitions (CRDs) and save them as a 2-crd.yml file. We use release version v0.11.0, but feel free to try newer releases.
$ curl -L https://github.com/jetstack/cert-manager/releases/download/v0.11.0/cert-manager.yaml >2-crd.yml
Now let's create these two resources. After the creation, we can check the generated Pods.
$ kubectl apply -f 1-namespace.yml -f 2-crd.yml
$ kubectl get pods -n cert-manager
Cert Manager defines new custom resources, among them Issuer and Certificate. A Certificate describes the desired TLS certificate and references an Issuer to retrieve it from an authority such as Let's Encrypt. This tutorial uses a single ClusterIssuer for all Certificates, so we create the following ClusterIssuer resource and apply it to Kubernetes.
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: firstname.lastname@example.org # Change email address accordingly
    privateKeySecretRef:
      name: letsencrypt-production
    # Enable the HTTP-01 challenge provider
    solvers:
      - http01:
          ingress:
            class: nginx # This cluster uses the Ingress Nginx controller
Any Namespace in the cluster can use the new ClusterIssuer. The cluster is now able to accept inbound HTTPS connections and automatically generate TLS certificates for them.
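To request a certificate from this ClusterIssuer, a workload defines a Certificate resource. The following is a sketch (the name example-tls, the default Namespace and the hostname example.com are placeholders); Cert Manager completes the ACME challenge and stores the signed certificate in the referenced Secret:

```yaml
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: example-tls
  namespace: default
spec:
  # Secret where Cert Manager stores the signed certificate and key
  secretName: example-tls
  issuerRef:
    name: letsencrypt-production
    kind: ClusterIssuer
  dnsNames:
    - example.com
```

An Ingress resource can then reference the Secret in its tls section to serve HTTPS for that hostname.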
3) Persistent Volumes, Block Storage and Shared Filesystem with Rook Ceph
This cluster will mostly be used for various projects grouped as K8s-MLOps, K8s-DevOps and K8s-Data-Platform. Persistent storage is therefore an essential, and often tricky, requirement for some Kubernetes deployments. Kubernetes Pods are considered transient, and their filesystems along with them. External databases are a great way to persist data produced by an application container in a Pod. However, some Pods may represent databases or filesystems themselves, and therefore any attached data volumes must survive beyond the lifespan of the Pod itself.
This section of the blog enables Kubernetes Persistent Volumes backed by Ceph and orchestrated by Rook. Ceph is a distributed storage cluster, providing Kubernetes Persistent Volumes for object-, block- and filesystem-based storage. The official Rook documentation for Ceph suggests starting with their example configuration manifests and customizing them where desired. We will use Rook's common (Namespace and CRD), operator, cluster and toolbox manifests.
$ curl -L https://github.com/rook/rook/raw/release-1.0/cluster/examples/kubernetes/ceph/common.yaml >1-namespace-crd.yml $ kubectl apply -f 1-namespace-crd.yml
Next, we will get the Rook Ceph operator Deployment manifest and apply it to Kubernetes.
$ curl -L https://github.com/rook/rook/raw/release-1.0/cluster/examples/kubernetes/ceph/operator.yaml >2-deployment-oper.yml $ kubectl apply -f 2-deployment-oper.yml
Next, we will get the Rook Ceph Cluster configuration and apply it to Kubernetes.
$ curl -L https://github.com/rook/rook/raw/release-1.0/cluster/examples/kubernetes/ceph/cluster-test.yaml >3-cluster-rook-ceph.yml $ kubectl apply -f 3-cluster-rook-ceph.yml
Finally, we will get the Rook Ceph toolbox Deployment manifest and apply it to Kubernetes.
$ curl -L https://github.com/rook/rook/raw/release-1.0/cluster/examples/kubernetes/ceph/toolbox.yaml >4-deployment-toolbox.yml $ kubectl apply -f 4-deployment-toolbox.yml
The rook-ceph Namespace now contains the Pods managing the underlying Ceph cluster, along with the ability to provision Persistent Volumes. Pods requiring Persistent Volumes request them through Persistent Volume Claims (PVCs). PVCs require a defined StorageClass, which Rook uses to provision a Persistent Volume. For this, we will set up a new StorageClass named rook-ceph-block, backed by a CephBlockPool, able to provision Persistent Volumes from PVC requests.
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 1
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  clusterNamespace: rook-ceph
  fstype: xfs
reclaimPolicy: Delete
$ kubectl apply -f 5-rook-ceph-block.yml
The cluster should now support PVCs, commonly used by Kubernetes StatefulSets. In the upcoming tutorials, PVCs for stateful applications such as databases, data indexes and event queues will be used.
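A Pod obtains block storage from this StorageClass through an ordinary PVC. As a sketch (the claim name example-data, the default Namespace and the 10Gi size are placeholders chosen for illustration):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data
  namespace: default
spec:
  # The StorageClass defined above; Rook provisions the volume from the CephBlockPool
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce # Block volumes are mounted read-write by a single node
  resources:
    requests:
      storage: 10Gi
```

Once bound, the claim can be referenced from a Pod spec under volumes as a persistentVolumeClaim, exactly as with any other dynamically provisioned storage.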
Next, we will define the resources for a cluster-wide shared filesystem backed by Ceph. Shared filesystems provide opportunities to separate responsibility around the management of files. They allow scenarios where one set of Pods enables users to upload files such as images, while another set of Pods retrieves and processes them. Although there are many other ways to share files across deployments, a shared filesystem backed by Ceph offers flexible options for architecting a data-centric platform in the cluster. We will define the CephFilesystem resource and apply it to Kubernetes.
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: rook-ceph-clusterfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 1
  dataPools:
    - failureDomain: host
      replicated:
        size: 2
  metadataServer:
    activeCount: 1
    activeStandby: true
$ kubectl apply -f 6-rook-ceph-clusterfs.yml
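In the Rook release used here, Pods can mount the shared filesystem through Rook's flexVolume driver. The following is a sketch under that assumption (the Pod name, image and mount path are placeholders); multiple Pods on different nodes can mount the same filesystem concurrently:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-fs-consumer
  namespace: default
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: clusterfs
          mountPath: /data # Files here are shared across all mounting Pods
  volumes:
    - name: clusterfs
      flexVolume:
        driver: ceph.rook.io/rook
        fsType: ceph
        options:
          # The CephFilesystem defined above
          fsName: rook-ceph-clusterfs
          clusterNamespace: rook-ceph
```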
After the resource is created, the cluster is now able to
- accept and route inbound web traffic with Ingress Nginx
- create and use TLS certificates with Cert Manager and Let's Encrypt, and
- provision PVCs, and offer a shared filesystem with Rook and Ceph.
It may seem like a lot of effort to bring this cluster up and running with these essential capabilities, when the major cloud providers offer much of this stack at the click of a button. However, the cluster configured in this tutorial can run on nearly any provider, making it truly portable, cloud native and vendor neutral. In further planned blog posts, this cluster will serve as the backbone of various data-centric services. You can access the GitLab repository for this tutorial series from here