Vendor-Neutral, Self-Managed K8s Cluster Part-2

In Part-1 of this blog, we set up an operational three-node cluster. In this part, we will define a series of YAML files (manifests) containing additional configuration such as Namespaces, Volumes and Ingress. These manifests not only define the desired state for Kubernetes, but also serve as verbose and accurate documentation for developers and system administrators. For this cluster, we will define the manifests for the Ingress, TLS certificate and file storage systems.

1) Ingress


The majority of Kubernetes Services are assigned a ClusterIP and are only accessible from within the cluster. A Kubernetes Ingress allows external HTTP and HTTPS connections to reach Services inside the cluster. The Kubernetes Ingress resource defines a configuration that must be backed by an Ingress controller. Kubernetes does not provide a default Ingress controller, leaving administrators and system architects to pick one that fits the needs of their environment.

To define the Ingress resources, first we will create the dedicated Namespace.

apiVersion: v1
kind: Namespace
metadata:
  name: ingress-nginx

Then we will define a ServiceAccount for Ingress Nginx, to be used with Role-Based Access Control (RBAC).

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nginx-ingress-serviceaccount
  namespace: ingress-nginx

We also need a configuration file describing an RBAC ClusterRole for Ingress Nginx. This ClusterRole defines the cluster-wide access rules for several Kubernetes API resources.

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: nginx-ingress-clusterrole
rules:
  - apiGroups: [""]
    resources: ["configmaps", "endpoints", "nodes", "pods", "secrets"]
    verbs: ["list", "watch"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["extensions"]
    resources: ["ingresses"]
    verbs: ["get","list","watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "patch"]
  - apiGroups: ["extensions"]
    resources: ["ingresses/status"]
    verbs: ["update"]

Similar to the ClusterRole resource, we also need a namespaced Role resource for Ingress Nginx.

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: nginx-ingress-role
  namespace: ingress-nginx
rules:
  - apiGroups: [""]
    resources: ["configmaps", "pods", "secrets", "namespaces"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames:
      - "ingress-controller-leader-nginx"
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get"]

Now we will define the RoleBinding and ClusterRoleBinding resources that bind the Role and ClusterRole to the ServiceAccount.

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: nginx-ingress-role-nisa-binding
  namespace: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: nginx-ingress-role
subjects:
  - kind: ServiceAccount
    name: nginx-ingress-serviceaccount
    namespace: ingress-nginx
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: nginx-ingress-clusterrole-nisa-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: nginx-ingress-clusterrole
subjects:
  - kind: ServiceAccount
    name: nginx-ingress-serviceaccount
    namespace: ingress-nginx

After the Role, ClusterRole and corresponding binding resources, we will define the Service resources that expose Ingress Nginx via HTTP and HTTPS, along with a Service for the default HTTP back end.

apiVersion: v1
kind: Service
metadata:
  name: default-http-backend
  namespace: ingress-nginx
  labels:
    app: default-http-backend
spec:
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: default-http-backend
---
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  type: NodePort
  ports:
    - name: http
      port: 80
      targetPort: 80
      protocol: TCP
    - name: https
      port: 443
      targetPort: 443
      protocol: TCP
  selector:
    app: ingress-nginx

Now we will define the ConfigMap resources used by Ingress Nginx.

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
  labels:
    app: ingress-nginx
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: udp-services
  namespace: ingress-nginx

A Deployment resource is also needed for the default HTTP back-end server.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: default-http-backend
  labels:
    app: default-http-backend
  namespace: ingress-nginx
spec:
  replicas: 2
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: default-http-backend
  template:
    metadata:
      labels:
        app: default-http-backend
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - default-http-backend
                topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 60
      containers:
        - name: default-http-backend
          image: gcr.io/google_containers/defaultbackend:1.4
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 30
            timeoutSeconds: 5
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: 10m
              memory: 20Mi
            requests:
              cpu: 10m
              memory: 20Mi

Finally, we will create a configuration file describing a Kubernetes DaemonSet for the Ingress Nginx controller. This DaemonSet instructs Kubernetes to ensure that one Ingress Nginx controller Pod is running on each node. The controller listens on TCP ports 80 (HTTP) and 443 (HTTPS).

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx
spec:
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: ingress-nginx
  template:
    metadata:
      labels:
        app: ingress-nginx
      annotations:
        prometheus.io/port: '10254'
        prometheus.io/scrape: 'true'
    spec:
      serviceAccountName: nginx-ingress-serviceaccount
      hostNetwork: true
      containers:
        - name: nginx-ingress-controller
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.14.0
          args:
            - /nginx-ingress-controller
            - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
            - --configmap=$(POD_NAMESPACE)/nginx-configuration
            - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
            - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
            - --annotations-prefix=nginx.ingress.kubernetes.io
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          ports:
            - name: http
              containerPort: 80
              hostPort: 80
            - name: https
              containerPort: 443
              hostPort: 443
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          securityContext:
            runAsNonRoot: false

All resources for Ingress Nginx are now defined. You can either create each resource one by one, or apply them all at once from within the directory.

$ kubectl apply -f ./
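
Optionally, verify that the controller DaemonSet has scheduled a Pod on every node:

$ kubectl get pods -n ingress-nginx -o wide

As an illustration of how workloads will later consume this controller, a minimal Ingress resource might look like the sketch below. The Service name, namespace and host are placeholders, not part of this cluster's manifests.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: web-ingress
  namespace: default
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: web.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: web-service
              servicePort: 80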

Once all resources are created, the cluster is ready to accept web traffic on ports 80 and 443 of each node. This set of Ingress Nginx configuration manifests accurately represents the desired state of the cluster, while also providing documentation for others and the ability to reproduce this state at a later time, or on another cluster. In the next section, we will generate TLS certificates that Ingress Nginx can use to serve encrypted HTTPS traffic.

2) TLS/HTTPS with Cert Manager


Cert Manager is used to automate the management and issuance of TLS certificates from various issuing sources. This tutorial uses Let's Encrypt for secure, free TLS certificate issuance, later configured with a Cert Manager custom resource called ClusterIssuer.

To set up certificate management, we first create a dedicated Namespace resource.

apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager
  labels:
    certmanager.k8s.io/disable-validation: "true"

Next, we will download the Cert Manager manifest, which includes its custom resource definitions (CRDs), and save it as a YAML file. We use release v0.11.0 here, but feel free to try newer releases.

$ curl -L https://github.com/jetstack/cert-manager/releases/download/v0.11.0/cert-manager.yaml >2-crd.yml

Now let's apply these two manifests. After they are created, we can check the generated Pods.

$ kubectl apply -f 1-namespace.yml -f 2-crd.yml
$ kubectl get pods -n cert-manager

Cert Manager defines new custom resources, among them ClusterIssuer and Certificate. A Certificate describes the desired TLS certificate and references an Issuer used to retrieve it from an authority such as Let's Encrypt. This tutorial uses a single ClusterIssuer for all Certificates, so create the following ClusterIssuer resource.

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: mehmet@ylcnky.com # Change email address accordingly
    privateKeySecretRef:
      name: letsencrypt-production
    # Enable the HTTP-01 challenge provider
    solvers:
    - http01:
        ingress:
          class: nginx

Any namespace in the cluster can use the new ClusterIssuer. The cluster is now able to accept inbound HTTP and HTTPS traffic and automatically generate TLS certificates.
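
As a quick sketch of how a workload would request a certificate from this ClusterIssuer, a Certificate resource might look like the following. The name, namespace, secret name and DNS name are placeholders to be replaced with your own values.

apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: web-tls
  namespace: default
spec:
  secretName: web-tls
  issuerRef:
    name: letsencrypt-production
    kind: ClusterIssuer
  dnsNames:
    - web.example.com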

3) Persistent Volumes, Block Storage and Shared Filesystem with Rook Ceph


This cluster will mostly be used for projects grouped as K8s-MLOps, K8s-DevOps and K8s-Data-Platform, so persistent storage is an essential and often tricky requirement. Kubernetes Pods are considered transient, and their filesystems are transient along with them. External databases are a great way to persist data produced by an application container in a Pod. However, some Pods may represent databases or filesystems themselves, and therefore any attached data volumes must survive beyond the lifespan of the Pod itself.

This section of the blog enables Kubernetes Persistent Volumes backed by Ceph and orchestrated by Rook. Ceph is a distributed storage system that provides Kubernetes Persistent Volumes for object, block and filesystem-based storage. The official Rook documentation for Ceph suggests starting with its example configuration manifests and customizing them where desired. We will use Rook's CRDs, operator, cluster and toolbox manifests. First, we download the common resources (Namespace, CRDs and RBAC) and apply them to Kubernetes.

$ curl -L https://github.com/rook/rook/raw/release-1.0/cluster/examples/kubernetes/ceph/common.yaml >1-namespace-crd.yml
$ kubectl apply -f 1-namespace-crd.yml

Next, we will get the Rook Ceph Operator Deployment manifest and apply it to Kubernetes.

$ curl -L https://github.com/rook/rook/raw/release-1.0/cluster/examples/kubernetes/ceph/operator.yaml >2-deployment-oper.yml
$ kubectl apply -f 2-deployment-oper.yml

Next, we will get the Rook Ceph Cluster configuration and apply it to Kubernetes.

$ curl -L https://github.com/rook/rook/raw/release-1.0/cluster/examples/kubernetes/ceph/cluster-test.yaml >3-cluster-rook-ceph.yml
$ kubectl apply -f 3-cluster-rook-ceph.yml
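
It may take a few minutes for the operator to bring up the Ceph mon, mgr and osd Pods. Their progress can be watched with:

$ kubectl -n rook-ceph get pods -w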

Finally, we will get the Rook Ceph toolbox Deployment manifest and apply it to Kubernetes.

$ curl -L https://github.com/rook/rook/raw/release-1.0/cluster/examples/kubernetes/ceph/toolbox.yaml >4-deployment-toolbox.yml
$ kubectl apply -f 4-deployment-toolbox.yml
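
The toolbox Pod provides a shell with the ceph CLI, which is handy for checking the health of the storage cluster. A quick check, assuming the toolbox Deployment carries the upstream label app=rook-ceph-tools:

$ kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- ceph status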

The rook-ceph Namespace now contains the Pods that manage the underlying Ceph cluster, along with the ability to provision Persistent Volumes from Persistent Volume Claims. Pods requiring Persistent Volumes request them through Persistent Volume Claims (PVCs). PVCs require a defined StorageClass that Rook uses to provision a Persistent Volume. For this, we will set up a new StorageClass called rook-ceph-block, backed by a CephBlockPool, able to provision Persistent Volumes from PVC requests.

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 1
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  clusterNamespace: rook-ceph
  fstype: xfs
reclaimPolicy: Delete

$ kubectl apply -f 5-rook-ceph-block.yml

The cluster should now support the PVCs commonly used by Kubernetes StatefulSets. In upcoming tutorials, PVCs for stateful applications will be used for databases, data indexes and event queues.
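
As a quick sketch (the claim name, namespace and size are arbitrary placeholders), a PVC against the new StorageClass looks like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data
  namespace: default
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi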

Next, we will define the resources for a cluster-wide shared filesystem backed by Ceph. Shared filesystems make it possible to separate responsibility around the management of files: one set of Pods may let users upload files such as images, while another set of Pods retrieves and processes them. Although there are many other ways to share files across deployments, a shared filesystem backed by Ceph provides flexible options for architecting a data-centric platform in the cluster. We will define the CephFilesystem resource and apply it to Kubernetes.

apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: rook-ceph-clusterfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 1
  dataPools:
    - failureDomain: host
      replicated:
        size: 2
  metadataServer:
    activeCount: 1
    activeStandby: true

$ kubectl apply -f 6-rook-ceph-clusterfs.yml
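
Pods can then mount the shared filesystem. Below is a minimal sketch using the Rook flexVolume driver documented for this Rook release; the Pod name, image and mount path are placeholders, and the driver name assumes the default Rook agent configuration.

apiVersion: v1
kind: Pod
metadata:
  name: cephfs-demo
  namespace: default
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: shared-fs
          mountPath: /data
  volumes:
    - name: shared-fs
      flexVolume:
        driver: ceph.rook.io/rook
        fsType: ceph
        options:
          fsName: rook-ceph-clusterfs
          clusterNamespace: rook-ceph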

With these resources in place, the cluster is now able to:

  • accept and route inbound web traffic with Ingress Nginx
  • create and use TLS certificates with Cert Manager and Let's Encrypt, and
  • provision PVCs, and offer a shared filesystem with Rook and Ceph.

Conclusion

It may seem like a lot of effort to bring this cluster up and running with these essential capabilities when the major cloud providers offer much of this stack at the click of a button. However, the cluster configured in this tutorial can run on nearly any provider, making it truly portable, cloud native and vendor neutral. In the planned follow-up blog posts, this cluster will serve as the backbone for data-centric services. You can access the GitLab repository of this tutorial series from here.
