The Importance of Using Labels in Kubernetes

Even a small Kubernetes cluster may have hundreds of Containers, Pods, Services and many other Kubernetes API objects. It quickly becomes tedious to page through kubectl output to find the object you need. Labels address this problem directly. The primary reasons to use labels are that they:

  • enable you to logically organize all your Kubernetes workloads in all your clusters.
  • enable you to selectively filter kubectl output down to just the objects you need.
  • enable you to understand the layers and hierarchies of all your API objects.

Labels vs. Annotations

Labels and annotations are sometimes confused. A quick look at the documentation shows why: the two are declared in almost the same way.

Labels

"metadata": {
  "labels": {
    "key1" : "value1",
    "key2" : "value2"
  }
}

Annotations

"metadata": {
  "annotations": {
    "key1" : "value1",
    "key2" : "value2"
  }
}

Labels are key/value pairs that are attached to objects, such as Pods. Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but do not directly imply semantics to the core system. Labels can be used to organize and to select subsets of objects.

You can use Kubernetes annotations to attach arbitrary non-identifying metadata to objects. Clients such as tools and libraries can retrieve this metadata. You can use either labels or annotations to attach metadata to Kubernetes objects. Labels can be used to select objects and to find collections of objects that satisfy certain conditions. In contrast, annotations are not used to identify and select objects. The metadata in an annotation can be small or large, structured or unstructured.

Example labels:

"release" : "stable"
"release" : "canary"

"environment" : "dev"
"environment" : "qa"
"environment" : "production"

Example annotations:

standbyphone: 000-000 0000
developer: Neil Armstrong

Let's now look more closely at labels and how to use them.


Well-Known Labels, Annotations and Taints

Before we create our own labels, let's look at some labels that Kubernetes creates automatically on nodes.

kubernetes.io/arch
Example:
kubernetes.io/arch=amd64

kubernetes.io/os
Example:
kubernetes.io/os=linux

node.kubernetes.io/instance-type
Example:
node.kubernetes.io/instance-type=m3.medium


topology.kubernetes.io/region
Example:
topology.kubernetes.io/region=us-east-1

topology.kubernetes.io/zone
Example:
topology.kubernetes.io/zone=us-east-1c

These labels allow us to filter our nodes in useful ways, for example:

List all Linux nodes

$ kubectl get nodes -l 'kubernetes.io/os=linux'

List all nodes with instance type m3.medium

$ kubectl get nodes -l 'node.kubernetes.io/instance-type=m3.medium'

List all nodes in a specific region

$ kubectl get nodes -l 'topology.kubernetes.io/region=us-east-1'

List all nodes in specific regions

$ kubectl get nodes -l 'topology.kubernetes.io/region in (us-east-1, us-west-1)'

If we apply the following example labels to all our Pods, we can filter the kubectl output as follows:

"release" : "stable"
"release" : "canary"

"environment" : "dev"
"environment" : "qa"
"environment" : "production"

$ kubectl get pods -l 'environment in (production), release in (canary)'

$ kubectl get pods -l 'environment in (production, qa)'

$ kubectl get pods -l 'environment notin (qa)'

In a complex environment with multiple Kubernetes clusters, many nodes and even more namespaces, the ability to filter kubectl output is a major timesaver. In addition, Job, Deployment, ReplicaSet and DaemonSet support set-based selectors in their manifests:

selector:
  matchLabels:
    component: redis
  matchExpressions:
    - {key: tier, operator: In, values: [cache]}
    - {key: environment, operator: NotIn, values: [dev]}
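
To make the semantics of such a selector concrete, here is a minimal Python sketch of how matchLabels and matchExpressions could be evaluated against the labels of a Pod. It is an illustration only, not Kubernetes source code: the function name matches and the two sample Pods are assumptions made for this example. All requirements are ANDed together.

```python
# Illustrative sketch of set-based selector evaluation (not Kubernetes source code).

def matches(selector, labels):
    """Return True if the `labels` dict satisfies every requirement in `selector`."""
    # matchLabels: every listed key must be present with exactly this value.
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    # matchExpressions: every expression must hold as well.
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            # An object that does not carry the key also satisfies NotIn.
            ok = labels.get(key) not in values
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True


# The selector from the manifest above, written as a Python dict.
selector = {
    "matchLabels": {"component": "redis"},
    "matchExpressions": [
        {"key": "tier", "operator": "In", "values": ["cache"]},
        {"key": "environment", "operator": "NotIn", "values": ["dev"]},
    ],
}

# Hypothetical Pod labels used only for this example.
pod_a = {"component": "redis", "tier": "cache", "environment": "production"}
pod_b = {"component": "redis", "tier": "cache", "environment": "dev"}

print(matches(selector, pod_a))  # True
print(matches(selector, pod_b))  # False
```

The second Pod is rejected only because its environment label is dev, which the NotIn expression excludes.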


Organize All Your Kubernetes Workloads in All Your Clusters

Using AWS terminology as a base, we can create an example labeling scheme. First, some definitions:

  • Region: A physical location around the world.
  • Availability Zone: A group of data centers inside a region.

This means that your containers have the following hierarchy: Region --> Availability Zone --> K8s Cluster --> Namespace --> Deployment --> Pod --> Containers

You can use labels to record each level of this hierarchy on your objects. This enables you to understand the full global scope of all layers and hierarchies of your API objects. Combined with label selectors, this gives you a very flexible way to filter your Kubernetes workloads.

Example: Find Pods by Labels to Get Their Pod Logs

Given a namespace (qa in this example) and a label query that identifies the Pods you are interested in, you can get the logs for all of those Pods. If the query matches more than one Pod, the logs of each matching Pod are fetched one after another.

$ ns='qa' ; label='release=canary' ; kubectl get pods -n "$ns" -l "$label" -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' | xargs -I {} kubectl -n "$ns" logs {}

$ ns='qa' ; label='release=canary'

These assignments set the namespace and the label query used by the rest of the command.

$ kubectl get pods -n "$ns" -l "$label" -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'

This command gets the list of Pods in the $ns namespace that carry the label release=canary and prints their names, one per line.

| xargs -I {} kubectl -n "$ns" logs {}

This part of the command receives the list of Pod names and runs kubectl logs once for each name. xargs is a staple of Linux shell scripting. The point of this example is that, with a little shell scripting, you can process very selectively chosen lists of API objects. This becomes more useful the more clusters, namespaces and API objects you have.
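
The same two-step pipeline can also be sketched in Python for readers who prefer it to xargs. This is a hedged illustration rather than a recommended tool: the helper names kubectl, pod_names and logs_by_label are hypothetical, and the sketch assumes that kubectl is installed and configured for your cluster.

```python
import subprocess

# The same JSONPath template as in the shell command: one Pod name per line.
NAMES_TEMPLATE = r'jsonpath={range .items[*]}{.metadata.name}{"\n"}{end}'


def kubectl(args):
    """Run kubectl with the given argument list and return its stdout as text."""
    result = subprocess.run(
        ["kubectl", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def pod_names(ns, label, run=kubectl):
    """Step 1: list the names of the Pods in namespace `ns` that match `label`."""
    out = run(["get", "pods", "-n", ns, "-l", label, "-o", NAMES_TEMPLATE])
    return [line for line in out.splitlines() if line]


def logs_by_label(ns, label, run=kubectl):
    """Step 2: fetch the logs of each matching Pod, one after another."""
    return {name: run(["logs", "-n", ns, name]) for name in pod_names(ns, label, run)}
```

Calling logs_by_label("qa", "release=canary") would return a dictionary that maps each Pod name to its log text. The run parameter exists only so that the real kubectl call can be replaced with a stub when testing.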

Kubernetes Recommended Labels

The official Kubernetes documentation recommends that you use the following labels, each prefixed with app.kubernetes.io/:

  • name: the name of the application
  • instance: a unique name identifying the instance
  • version: the current version of the application, for example a semantic version number
  • component: the component within your logical architecture
  • part-of: the name of the higher-level application this object is part of
  • managed-by: the tool used to manage the application, for example helm

An example from the documentation:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app.kubernetes.io/name: mysql
    app.kubernetes.io/instance: mysql-abcxzy
    app.kubernetes.io/version: "5.7.21"
    app.kubernetes.io/component: database
    app.kubernetes.io/part-of: wordpress
    app.kubernetes.io/managed-by: helm
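
If you generate manifests programmatically, a small helper can keep these keys consistent. The Python sketch below is only an illustration: the function names recommended_labels and to_selector are hypothetical and are not part of any Kubernetes client library.

```python
PREFIX = "app.kubernetes.io/"


def recommended_labels(name, instance, version, component, part_of, managed_by):
    """Build the recommended labels as a dict suitable for metadata.labels."""
    return {
        PREFIX + "name": name,
        PREFIX + "instance": instance,
        PREFIX + "version": version,
        PREFIX + "component": component,
        PREFIX + "part-of": part_of,
        PREFIX + "managed-by": managed_by,
    }


def to_selector(labels):
    """Turn a dict of labels into an equality-based selector string for kubectl -l."""
    return ",".join(f"{key}={value}" for key, value in labels.items())


# The values from the StatefulSet example above.
labels = recommended_labels(
    name="mysql",
    instance="mysql-abcxzy",
    version="5.7.21",
    component="database",
    part_of="wordpress",
    managed_by="helm",
)

print(to_selector({PREFIX + "part-of": labels[PREFIX + "part-of"]}))
# app.kubernetes.io/part-of=wordpress
```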

You should also apply such labels to all API objects in your cluster. For example, a WordPress application may use:

  • PersistentVolume
  • PersistentVolumeClaim
  • Deployment
  • Pods
  • Containers
  • Service
  • Ingress

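You can relate all the above API objects via app.kubernetes.io/part-of: wordpress (containers are part of their Pod and carry no labels of their own). When you do this, a single command can list those objects in one go. Note that kubectl get all covers only a subset of resource types and omits, for example, PersistentVolumeClaims and Ingresses, so it is safer to name the resource types explicitly:

$ kubectl get pv,pvc,deployments,pods,services,ingress -l 'app.kubernetes.io/part-of=wordpress'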

You may filter on labels in an equality-based manner:

environment = production
tier != frontend

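You may also filter on labels in a set-based manner:

environment in (production, qa)
tier notin (frontend, backend)
us-west-1
!us-west-1

The last two selectors test only for the key: us-west-1 matches objects that carry a label with that key, whatever its value, and !us-west-1 matches objects that do not carry it.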
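
The two selector styles can be illustrated with a small evaluator. The Python sketch below is a simplified assumption of how such selector strings behave, not the parser that kubectl actually uses, and the names split_requirements, requirement_matches and selector_matches are hypothetical. Requirements separated by commas are ANDed together.

```python
import re

# Simplified patterns for the set-based and equality-based forms.
_SET = re.compile(r"^(\S+)\s+(in|notin)\s*\((.*)\)$")
_EQ = re.compile(r"^([^\s=!]+)\s*(==|!=|=)\s*(\S+)$")


def split_requirements(selector):
    """Split a selector on the commas that are outside parentheses."""
    parts, depth, current = [], 0, ""
    for ch in selector:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        parts.append(current.strip())
    return parts


def requirement_matches(req, labels):
    """Evaluate one requirement against a dict of labels."""
    m = _SET.match(req)
    if m:
        key, op, raw = m.groups()
        values = {v.strip() for v in raw.split(",")}
        if op == "in":
            return labels.get(key) in values
        return labels.get(key) not in values  # notin also matches a missing key
    m = _EQ.match(req)
    if m:
        key, op, value = m.groups()
        if op == "!=":
            return labels.get(key) != value  # != also matches a missing key
        return labels.get(key) == value
    if req.startswith("!"):
        return req[1:].strip() not in labels  # the key must be absent
    return req in labels  # the key must be present


def selector_matches(selector, labels):
    """Return True if every comma-separated requirement holds."""
    return all(requirement_matches(r, labels) for r in split_requirements(selector))


# Hypothetical labels used only for this example.
labels = {"environment": "production", "tier": "backend"}

print(selector_matches("environment = production", labels))         # True
print(selector_matches("tier != frontend", labels))                 # True
print(selector_matches("environment in (production, qa)", labels))  # True
print(selector_matches("tier notin (frontend, backend)", labels))   # False
print(selector_matches("!us-west-1", labels))                       # True
```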

copyright©2021 ylcnky all rights reserved