Kubernetes Patterns: The Daemon
A daemon is a process that runs in the background. Typically, a daemon produces no visible output for the user and accepts no interactive input; it exists to perform background jobs. For example, on Linux we have httpd to respond to HTTP requests and sshd to grant remote users secure shell access. There are also several kernel daemons that accept no user input; they perform housekeeping and other essential tasks that the kernel needs to function correctly. Users may also create or install their own background services. For example, logrotate is a popular Linux tool, usually run on a schedule, that archives old log files in configurable paths according to user-defined settings. Another example is log shippers (Filebeat, Fluentd) that periodically send logs to a log aggregation service such as the ELK stack for analysis and correlation.
Do We Need Daemons in Kubernetes?
Kubernetes is often referred to as the data center operating system. As we just discussed, an operating system needs daemons to perform background jobs that users do not interact with. Likewise, in Kubernetes, higher-level controllers like Deployments need to continually monitor the number of running Pods so that they can spawn or kill Pods as required. Such a task has to run as a daemon: a background process that needs no user interaction, is always running, and is chiefly managed by Kubernetes itself. Kubernetes administrators may also need daemons to execute tasks on the running nodes. For that purpose, Kubernetes offers the DaemonSet resource. Like a Deployment, ReplicaSet, or StatefulSet, a DaemonSet creates and manages Pods. However, those Pods are configured so that one of them runs on every node in the cluster.
Why Not Just Install the Daemon on the Node Itself?
Because that is not how the cluster works. A daemon installed directly on a node is not managed by Kubernetes; it is controlled by, and reports to, the node's operating system. Any change you need to make to the daemon's configuration has to be performed on every node. If the daemon stops working or reports errors, Kubernetes does nothing for you; you need to configure the OS or some third-party tool to restart the daemon if it fails. But wasn't Kubernetes designed for exactly that sort of task? That's why you are much better off using a DaemonSet.
The following is a stripped-down version of the official fluentd DaemonSet definition file. It is updated here to the apps/v1 API, which requires an explicit selector, because extensions/v1beta1 has not served DaemonSets since Kubernetes 1.16.

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: fluentd
      namespace: kube-system
      labels:
        k8s-app: fluentd-logging
        version: v1
    spec:
      # Required by apps/v1: must match the Pod template labels below.
      selector:
        matchLabels:
          k8s-app: fluentd-logging
      template:
        metadata:
          labels:
            k8s-app: fluentd-logging
            version: v1
        spec:
          # Run only on nodes labeled env=prod.
          nodeSelector:
            env: prod
          containers:
          - name: fluentd
            image: fluent/fluentd-kubernetes-daemonset:elasticsearch
            volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
          terminationGracePeriodSeconds: 30
          # Host directories that hold the node's logs.
          volumes:
          - name: varlog
            hostPath:
              path: /var/log
          - name: varlibdockercontainers
            hostPath:
              path: /var/lib/docker/containers
How Are DaemonSets Different From ReplicaSets?
A DaemonSet runs one Pod on every node. This can be limited using nodeSelector or taints and tolerations, depending on your scenario. With a ReplicaSet, by contrast, the scheduler picks the most suitable node(s) to run the Pods; there is no guarantee that every node has one running Pod.
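As a sketch of how nodeSelector and tolerations combine, the fragment below restricts a DaemonSet to nodes labeled env=prod (the label used in the fluentd example) while also letting its Pods land on control-plane nodes. The taint key `node-role.kubernetes.io/control-plane` is an assumption about how the cluster is set up; older clusters may use `node-role.kubernetes.io/master` instead.

```yaml
# Fragment of a DaemonSet manifest: goes under the DaemonSet's top-level `spec`.
template:
  spec:
    # Only nodes carrying this label get a Pod.
    nodeSelector:
      env: prod
    # Allow the Pods onto nodes tainted as control plane.
    # Assumed taint key; check the taints that your own nodes carry.
    tolerations:
    - key: node-role.kubernetes.io/control-plane
      operator: Exists
      effect: NoSchedule
```

A node must satisfy both conditions to receive a Pod: it has to carry the label, and any NoSchedule taint on it has to be tolerated.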
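In Kubernetes versions before 1.12, DaemonSet Pods did not need the scheduler to run: the DaemonSet controller set the nodeName field on each Pod directly. Newer versions hand the Pods to the default scheduler, with the controller adding a node affinity rule that pins each Pod to its target node. Either way, every Pod is bound to one specific node, which makes DaemonSets ideal for running the Kubernetes system daemons. Pods created by DaemonSets are also treated differently by Kubernetes. For example, they are commonly given a higher priority than other Pods, the descheduler does not evict them, and so on.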
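One way a DaemonSet's Pods can be given a higher priority is through a priority class in the Pod template. The fragment below is a minimal sketch; `system-node-critical` is a built-in Kubernetes priority class, but whether your cluster allows it outside the kube-system namespace is something to verify.

```yaml
# Fragment of a DaemonSet manifest: goes under the DaemonSet's top-level `spec`.
template:
  spec:
    # Built-in class intended for Pods a node cannot do without;
    # lower-priority Pods can be preempted to make room for these.
    priorityClassName: system-node-critical
```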
DaemonSet Pods Access Patterns
Most of the time, you don't need to communicate with Pods spawned by DaemonSets. Those Pods are mainly used for background tasks, housekeeping, log aggregation, and so on. But what if you do want to send an HTTP request to such a Pod and examine the response? For example, a custom log-collection Pod may have a health or status endpoint that reports how many logs it has already processed, whether there were errors, and so on. Let's explore the different options that you have:
- Create a traditional Service and set its Pod selector to the same labels used by the DaemonSet. The drawback of this approach is that each request is routed to a Pod on an arbitrary node, so you cannot choose which one answers.
- Create a headless Service that uses the same Pod selector as the DaemonSet but has no cluster IP. A DNS lookup of the headless Service returns the IP addresses of the Pods it matches. It is up to you to parse the response and select the appropriate Pod to communicate with.
- Use the hostPort option with the DaemonSet Pods. Using hostPort, you make the Pod accessible through the node's IP address. There is no Service layer in between. This approach has a significant drawback: it depends on the port being free on every node. You can use this method in small or development environments.
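The headless Service and hostPort options can be sketched as follows. Everything named `log-collector`, the image reference, and port 8080 are hypothetical placeholders for the custom log-collection Pod described above, not values from a real project.

```yaml
# Headless Service: `clusterIP: None` means no virtual IP is allocated,
# and a DNS lookup of the Service name returns the matching Pod IPs.
# Dropping the `clusterIP` line gives the traditional Service instead.
apiVersion: v1
kind: Service
metadata:
  name: log-collector            # hypothetical name
  namespace: kube-system
spec:
  clusterIP: None
  selector:
    k8s-app: log-collector       # must match the DaemonSet's Pod labels
  ports:
  - name: status
    port: 8080                   # hypothetical status port
    targetPort: 8080
---
# hostPort: the container port is bound on each node's own IP address.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: log-collector
  template:
    metadata:
      labels:
        k8s-app: log-collector
    spec:
      containers:
      - name: log-collector
        image: registry.example.com/log-collector:1.0   # hypothetical image
        ports:
        - name: status
          containerPort: 8080
          hostPort: 8080         # reachable at <node IP>:8080; must be free on every node
```

With the headless Service in place, and assuming the default cluster domain, resolving `log-collector.kube-system.svc.cluster.local` from inside the cluster should return one address per DaemonSet Pod.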