How Helm Uses ConfigMaps to Store Data
Helm, the package manager for Kubernetes, uses first-class Kubernetes objects to store its data. Here's how we use ConfigMaps to track Helm releases.
Helm follows the formula "Chart + Values = Release". You start with a Helm chart (a software package), you add your own configuration values, and you install it into your cluster. That makes a release.
On the command line, we do this with the following command:
$ helm install -f config.yaml stable/wordpress
NAME:   amber-gopher
LAST DEPLOYED: Thu Mar 23 15:57:40 2017
NAMESPACE: default
STATUS: DEPLOYED
...
What you get back is a release. In the case above, our release is named amber-gopher. Each time I upgrade this release, I get a new release version.
$ helm upgrade --set "foo=bar" amber-gopher stable/wordpress
Release "amber-gopher" has been upgraded. Happy Helming!
LAST DEPLOYED: Thu Mar 23 15:59:14 2017
NAMESPACE: default
STATUS: DEPLOYED
An upgrade takes an existing release and upgrades it with the given values (--set foo=bar) and chart (again, stable/wordpress).
Now I have two revisions of the same release:
$ helm history amber-gopher
REVISION  UPDATED                   STATUS      CHART            DESCRIPTION
1         Thu Mar 23 15:57:40 2017  SUPERSEDED  wordpress-0.4.3  Install complete
2         Thu Mar 23 15:59:14 2017  DEPLOYED    wordpress-0.4.3  Upgrade complete
How does Helm track those releases?
The Release ConfigMaps
Helm's in-cluster component is called Tiller. By default, it installs into the Kubernetes system namespace (kube-system). It has the following jobs:
- Answer requests from Helm clients
- Expand and render charts into a set of Kubernetes resources
- Manage releases
That last part requires that Tiller maintain a list of all of the releases. When we run helm list, Tiller shows us all of the releases. And we can use helm history to see all of the revisions for a given release.
Tiller stores all of this information in Kubernetes ConfigMap objects. And those objects are located in the same namespace as Tiller. We can easily get a list of them:
$ kubectl get configmap -n kube-system -l "OWNER=TILLER"
NAME                     DATA      AGE
amber-gopher.v1          1         7m
amber-gopher.v2          1         6m
foolhardy-alligator.v1   1         2d
voting-otter.v1          1         2d
We can see above that there are three releases, and one of the releases (amber-gopher) has two revisions.
A Peek at a Release Revision
So let's take a quick look at one of these ConfigMaps with kubectl get configmap -n kube-system -o yaml amber-gopher.v2:
apiVersion: v1
data:
  release: H4sIAAAAAAAC/+x9TYwcS5qQ5tmen5x3... # REALLY LONG STRING REMOVED
kind: ConfigMap
metadata:
  creationTimestamp: 2017-03-23T21:59:15Z
  labels:
    CREATED_AT: "1490306355"
    NAME: amber-gopher
    OWNER: TILLER
    STATUS: DEPLOYED
    VERSION: "2"
  name: amber-gopher.v2
  namespace: kube-system
  resourceVersion: "86277"
  selfLink: /api/v1/namespaces/kube-system/configmaps/amber-gopher.v2
  uid: f4dacc7d-1013-11e7-a017-be7592efdc06
As you can see, this is a pretty basic ConfigMap. Tiller uses a robust set of labels to mark up and track revision history. For example, you can grab a particular revision using a label selector:
kubectl get configmap -n kube-system -l "NAME=amber-gopher,VERSION=2"
But right in the middle of the ConfigMap is a giant base-64 encoded blob stored under data.release. In the example above, I have redacted it.
This blob is a base-64 encoded, gzipped archive of the entire release record, which includes the original chart, the values, and some useful state-tracking information. The release record is in a binary protobuf format, which is of little use to most users.
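The encoding scheme described above (gzip, then base-64) is easy to reproduce. Here is a minimal Python sketch of that round trip; the function names are hypothetical, not Helm source, and the inner protobuf payload is treated as opaque bytes:

```python
import base64
import gzip

def encode_release(record: bytes) -> str:
    """Gzip the serialized release record, then base-64 encode it.
    A sketch of the scheme described above, not Helm's actual code."""
    return base64.b64encode(gzip.compress(record)).decode("ascii")

def decode_release(blob: str) -> bytes:
    """Reverse the encoding: base-64 decode, then gunzip.
    The result is the raw protobuf payload, left opaque here."""
    return gzip.decompress(base64.b64decode(blob))

# Round-trip a stand-in payload (a real blob would hold protobuf bytes).
payload = b"stand-in release record"
assert decode_release(encode_release(payload)) == payload
```

The same decode works from a shell with standard tools, roughly: pull data.release out of the ConfigMap, pipe it through base64 -d, then through gunzip.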
You can see a human-friendly version of this data with the helm get command:
$ helm get amber-gopher
REVISION: 2
RELEASED: Thu Mar 23 15:59:14 2017
CHART: wordpress-0.4.3
USER-SUPPLIED VALUES:
foo: bar
COMPUTED VALUES:
foo: bar
image: bitnami/wordpress:4.7.3-r0
imagePullPolicy: IfNotPresent
ingress: # ....
With access to the ConfigMaps, you can perform a number of operations on your Tiller setup, like querying how many release revisions have been created across the cluster. However, we strongly advise that you do not modify a release record, as it may lead to mismatches between what is inside of the gzipped data, and what is in the YAML wrapper.
Finally, it is worth noting that Tiller does not currently cache any state information about releases. The definitive source of information about each release is the set of ConfigMaps associated with the release.
Frequently Asked Questions
There are a few questions about the ConfigMap setup that get asked occasionally.
How does Helm access the ConfigMaps?
ConfigMaps were designed to be a "general purpose" object for storing configuration data. When ConfigMaps were first introduced, that description sounded exactly like what we wanted.
Kubernetes provides several ways of accessing ConfigMaps, including the most common method of mounting a ConfigMap as a volume within a pod. We did not choose to use ConfigMaps as volumes.
Instead, Tiller queries the Kubernetes API on demand. For example, running helm history amber-gopher causes Tiller to contact the Kubernetes API and ask for all of the ConfigMaps whose NAME label is amber-gopher.
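Those lookups come down to building label selectors over the labels shown in the ConfigMap example (OWNER, NAME, VERSION). A small Python sketch of that selector logic; the helper is hypothetical, not part of Helm:

```python
def tiller_selector(release, version=None):
    """Build a Kubernetes label selector matching Tiller's release
    ConfigMaps, based on the labels shown in the ConfigMap example
    above. A hypothetical helper, not actual Helm source."""
    parts = ["OWNER=TILLER", "NAME=" + release]
    if version is not None:
        parts.append("VERSION=%d" % version)
    return ",".join(parts)

# All revisions of a release (what a history query needs):
print(tiller_selector("amber-gopher"))     # OWNER=TILLER,NAME=amber-gopher
# A single revision:
print(tiller_selector("amber-gopher", 2))  # OWNER=TILLER,NAME=amber-gopher,VERSION=2
```

The same selector strings work from the command line, e.g. kubectl get configmap -n kube-system -l "OWNER=TILLER,NAME=amber-gopher".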
Why aren't all of the revisions stored in the same ConfigMap?
There are a few parts to this answer:
- We wanted to make it easy to query revisions via kubectl. We consider kubectl to be the "expert level" tool for diagnosing Helm issues.
- ConfigMaps (like all resource types) have a 1M upper size limit. This is true of Kubernetes 1.5 and earlier, due to a limitation of etcd.
Because of this design, you (as a power user who does not fear voiding warranties and tearing the tag off your mattress) can actually remove unwanted revisions using kubectl delete configmap... commands.
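Because each revision is its own ConfigMap named release.vN, deciding which ones to prune is just a matter of parsing names. A hedged Python sketch of that selection logic (the function is hypothetical, not a Helm feature, and it only computes names; the actual deletion would still be a kubectl delete configmap command):

```python
import re

def prunable_revisions(configmap_names, release, keep=1):
    """Given Tiller release ConfigMap names (<release>.v<N>), return the
    names of revisions older than the newest `keep` revisions of that
    release. A sketch of the manual cleanup described above; assumes
    keep >= 1."""
    pattern = re.compile(r"^" + re.escape(release) + r"\.v(\d+)$")
    versions = sorted(
        int(m.group(1))
        for name in configmap_names
        if (m := pattern.match(name))
    )
    return ["%s.v%d" % (release, v) for v in versions[:-keep]]

names = ["amber-gopher.v1", "amber-gopher.v2", "voting-otter.v1"]
print(prunable_revisions(names, "amber-gopher"))  # ['amber-gopher.v1']
```

Each name it returns could then be fed to kubectl delete configmap -n kube-system, with the warranty-voiding caveats above.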
Why didn't you use Third Party Resources (TPRs)?
Simply because they were an alpha-level feature when we wrote Tiller, and still are not (in our opinion) at the level of stability we would like. We've enjoyed the fact that ConfigMaps are very easy to work with, both within Tiller and with other Kubernetes tools.
The Helm core developers have considered moving to TPRs in Helm 3, but we will probably not do it before then because of backward compatibility issues.
Why didn't you use (insert database name here)?
We could have used any number of storage backends for Tiller. But what we wanted was something that required very little additional overhead. Storing releases as native Kubernetes objects meant that our data storage was always as stable and persistent as the cluster, which seemed like the right metric for a package manager.
But we did write the storage driver interface in such a way that one could use another storage mechanism. In fact, Tiller ships with a second storage driver. It stores the release history in memory, and is useful for debugging or learning the storage system.
Why do you store the data in base-64 encoded gzipped binaries?
It's not for security (as some people have suggested), nor is it so that we can willfully obscure data. It's simply because we need to store a large chunk of data in a compact format. Again, we are limited to records no larger than 1M. Gzipping the protobuf serialized data has been highly effective, and base-64 encoding it matches the data format that Kubernetes expects.
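The trade-off is easy to demonstrate: base-64 alone inflates a payload by about a third, while gzip on the kind of repetitive text found in charts and manifests shrinks it dramatically, more than paying for the base-64 overhead. A small Python demonstration with a stand-in record (the sample text is illustrative, not a real release):

```python
import base64
import gzip

# A stand-in for a release record: repetitive YAML-like text, the kind
# of content (charts, manifests, values) that compresses well.
record = (
    "apiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: wordpress\n" * 2000
).encode()

raw = len(record)
b64_only = len(base64.b64encode(record))                # ~33% larger than raw
gz_b64 = len(base64.b64encode(gzip.compress(record)))   # far smaller than raw

print(raw, b64_only, gz_b64)
assert b64_only > raw   # base-64 alone inflates the payload
assert gz_b64 < raw     # gzip more than pays for the base-64 overhead
```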
Why didn't you use Secrets?
At the time we wrote Tiller, Secrets were more primitive than ConfigMaps. They have since hit feature parity. But there seems to be no compelling reason for us to switch from one to the other, and doing so would break backward compatibility.
Note that in spite of their name, Secrets are no more (inherently) secure than ConfigMaps. In fact, the encoding for a Secret would be identical to the encoding we use now. At some point, we are hoping we will be able to find a way to store the release data encrypted-at-rest.