With this episode I will try to shed some light on what a good container image is and what we can learn (or even generate) from it.
First of all, it should fulfill my needs; it should not be something that I build for a single purpose. To fulfill my needs, I need to be able to configure the container that is started from the image. I don’t want to build a special-purpose container image that only fits the one application I am working on. The ‘good container image’ should be an atomic component that will be reused by a Nulecule.
Let’s have a look at PostgreSQL: there is a wonderful OpenShift PostgreSQL container image based on CentOS7 that installs PostgreSQL itself from the software collections. In addition to the obvious, the OpenShift team has added a few LABELs which in turn are used by OpenShift: “io.openshift.expose-services” could be used to create an OpenShift service straight from that container image, and “io.k8s.display-name” contains the string that will show up in the OpenShift web console if you look at pods based on this PostgreSQL container image. I think you get the pattern: let’s use LABELs to deliver added value via the toolchain. And the OpenShift team has put lots of valuable documentation (in and) around the Dockerfile.
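To make the pattern concrete, here is a minimal sketch of how a tool could read LABELs out of a Dockerfile. It is not how OpenShift or grasshopper actually do it; the function name `parse_labels` and the sample label values are assumptions made up for illustration, and only the common `LABEL key="value"` form is handled.

```python
import shlex


def parse_labels(dockerfile_text):
    """Collect key/value pairs from LABEL instructions in a Dockerfile."""
    # Join lines that end with a backslash continuation into one line.
    joined = dockerfile_text.replace("\\\n", " ")
    labels = {}
    for line in joined.splitlines():
        line = line.strip()
        if not line.upper().startswith("LABEL "):
            continue
        # shlex strips the quotes; each token then looks like key=value.
        for token in shlex.split(line[len("LABEL "):]):
            key, sep, value = token.partition("=")
            if sep:
                labels[key] = value
    return labels


# Illustrative Dockerfile fragment; the values are assumptions, not copied
# from the real image.
SAMPLE = '''FROM centos:centos7
LABEL io.k8s.display-name="PostgreSQL 9.4" \\
      io.openshift.expose-services="5432:postgresql"
'''

labels = parse_labels(SAMPLE)
for key in sorted(labels):
    print(key, "=", labels[key])
```

Once the labels are available as a plain dictionary, every later step (parameters, storage, metadata) is just a lookup on that dictionary.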
Can we introduce LABELs that will help the Nulecule toolchain? I think so.
I have set up a proof of concept to generate a Nulecule file from such a ‘good container image’. The POC will use labels under “io.projectatomic.nulecule” to give information about what needs to be present in the Nulecule file. Let’s have a look at the ‘(even more) good container image’: I put it in my repo on GitHub, which is a fork of the OpenShift PostgreSQL repo.
The first thing you will notice is that I translated the human-readable documentation that the OpenShift team included as comments in the Dockerfile into labels. Around https://github.com/goern/postgresql/blob/feature/enhanced-labels/9.4/Dockerfile.rhel7#L22 you see a list of required and optional environment variables that this container image uses. These two labels can be directly translated to Nulecule parameters (ok, we lack constraints, but hey…).
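The translation from those two labels to Nulecule parameters could look roughly like the sketch below. The exact label names under “io.projectatomic.nulecule” and the comma-separated value format are assumptions for illustration (check the Dockerfile linked above for the real ones), and `params_from_labels` is a hypothetical helper, not part of grasshopper.

```python
def params_from_labels(labels):
    """Turn the two environment labels into a list of Nulecule-style params."""
    # Hypothetical label prefix; the real POC may use different names.
    prefix = "io.projectatomic.nulecule.environment."
    params = []
    for kind in ("required", "optional"):
        raw = labels.get(prefix + kind, "")
        for name in (n.strip() for n in raw.split(",")):
            if not name:
                continue
            param = {
                "name": name,
                "description": f"{kind} environment variable {name}",
            }
            if kind == "optional":
                # Assumption: optional variables get an empty default so a
                # deployment tool would not have to ask for them.
                param["default"] = ""
            params.append(param)
    return params


# Illustrative label values (assumed format: comma-separated variable names).
example_labels = {
    "io.projectatomic.nulecule.environment.required":
        "POSTGRESQL_USER, POSTGRESQL_PASSWORD, POSTGRESQL_DATABASE",
    "io.projectatomic.nulecule.environment.optional":
        "POSTGRESQL_MAX_CONNECTIONS",
}

params = params_from_labels(example_labels)
for param in params:
    print(param)
```

Constraints are deliberately left out, matching the gap mentioned above.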
The second improvement of the good container image is at line 65: VOLUME is used by docker build, and the label “io.projectatomic.nulecule.volume” could be used to generate parts of Nulecule’s storage requirements.
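A sketch of that second step, under stated assumptions: the label value is taken to be a comma-separated list of mount paths, the default size of 1 is arbitrary, and the shape of the `persistentVolume` entry (name, accessMode, size) follows my reading of the Nulecule storage requirement. `storage_requirements` is a hypothetical helper name.

```python
def storage_requirements(labels, default_size=1):
    """Derive Nulecule-style storage requirements from the volume label."""
    raw = labels.get("io.projectatomic.nulecule.volume", "")
    requirements = []
    for path in (p.strip() for p in raw.split(",")):
        if not path:
            continue
        # Build a name from the mount path: /var/lib/x -> var-lib-x
        name = path.strip("/").replace("/", "-")
        requirements.append({
            "persistentVolume": {
                "name": name,
                "accessMode": "ReadWrite",  # assumed default access mode
                "size": default_size,       # assumed unit: gigabytes
            }
        })
    return requirements


# Illustrative label value; the path is an assumption for this example.
requirements = storage_requirements(
    {"io.projectatomic.nulecule.volume": "/var/lib/pgsql/data"}
)
print(requirements)
```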
And by the way we can generate Nulecule metadata from “io.k8s” and “io.openshift” labels (around the head of the Dockerfile).
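As a rough illustration of that metadata step, the sketch below maps two well-known labels onto the metadata block of a Nulecule skeleton. The mapping itself, the helper name `nulecule_skeleton`, the id, the specversion value and the label contents are assumptions for this example, not the behavior of grasshopper.

```python
import json


def nulecule_skeleton(app_id, labels):
    """Build a bare Nulecule-like document with metadata taken from labels."""
    # Assumed mapping from image labels to Nulecule metadata fields.
    mapping = {
        "io.k8s.display-name": "name",
        "io.k8s.description": "description",
    }
    metadata = {
        target: labels[source]
        for source, target in mapping.items()
        if source in labels
    }
    return {
        "specversion": "0.0.2",  # assumed spec version for the example
        "id": app_id,
        "metadata": metadata,
        "graph": [],             # components would be filled in later
    }


# Illustrative label values.
skeleton = nulecule_skeleton("postgresql-atomicapp", {
    "io.k8s.display-name": "PostgreSQL 9.4",
    "io.k8s.description": "PostgreSQL is an advanced object-relational DBMS.",
})
print(json.dumps(skeleton, indent=2))
```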
sudo dnf copr enable goern/grasshopper
sudo dnf install grasshopper
curl -gO https://raw.githubusercontent.com/goern/postgresql/feature/enhanced-labels/9.4/Dockerfile.rhel7
grasshopper-0.0.47 nulecule guess Dockerfile.rhel7
What you will see is a Nulecule file (and yes, the parameters are missing with version 0.0.47; it’s a POC, remember) completely generated from the Dockerfile. QED: we can release the developer from writing all that Nulecule boilerplate code. What we cannot do is release the developer from structuring the application. Back to our example, that means: the developer must still write the WordPress Nulecule file and all its artifacts.
Open Questions: can we even generate the artifacts for such a base container? I will explore this in the following weeks. What I know is that we can generate OpenShift/Kubernetes PersistentVolumeClaims from Nulecule’s storage requirement. That feature is targeted for grasshopper 0.1.0 (aka xmas) release.
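To show what such a generation step might involve, here is a minimal sketch that turns one storage requirement into a Kubernetes PersistentVolumeClaim. It is not the grasshopper implementation; the function name, the access mode mapping and the assumption that the size is given in gigabytes are all mine.

```python
import json

# Assumed mapping from Nulecule access modes to Kubernetes access modes.
ACCESS_MODES = {
    "ReadWrite": "ReadWriteOnce",
    "ReadOnly": "ReadOnlyMany",
}


def pvc_from_requirement(requirement):
    """Translate one Nulecule-style storage requirement into a PVC object."""
    volume = requirement["persistentVolume"]
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": volume["name"]},
        "spec": {
            "accessModes": [ACCESS_MODES[volume["accessMode"]]],
            # Assumption: the Nulecule size is a number of gigabytes.
            "resources": {"requests": {"storage": f"{volume['size']}Gi"}},
        },
    }


claim = pvc_from_requirement({
    "persistentVolume": {
        "name": "var-lib-pgsql-data",
        "accessMode": "ReadWrite",
        "size": 4,
    }
})
print(json.dumps(claim, indent=2))
```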
How do we know about good container images? Where do we find them? I completely unashamedly reused an idea of Vasek’s: nulecule-library index searching. You can use grasshopper to find a set of Nuleculized applications which are based on (more or less) good container images:
grasshopper nulecule index list
Over the past half year a team of Red Hatters has invented and pushed forward the Nulecule Specification. We want to define applications and provide a capability to parameterize and deploy them to a so-called provider. By now Atomic App is the reference implementation for the Nulecule Specification, so Atomic App is responsible for the ‘parameterize and deploy’ part of our mission.
Now that we see some adoption of Atomic App and the Nulecule Specification, we also see some (maybe) limitations; at least we see some areas where we need to improve. You will notice that I am a big fan of the Nulecule Specification itself, not that I am one of the main authors… but I love its extensibility. In the end, we gave a meaning to a few statements in a JSON file (or YAML if you prefer). The statements can be evaluated by Atomic App, but inside a Nulecule file there may be other statements that have a meaning for other software. So you can basically put any information regarding “your Application” (or a component of your application) in a Nulecule file and have a centralized description (including ‘install howto’) of it.
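The extensibility idea can be illustrated with a small sketch: one consumer evaluates the statements it knows and leaves the rest alone for other software. The list of known keys is a simplified assumption, and the “x-backup-policy” statement is a made-up example of something another tool might read.

```python
import json

# A Nulecule-like document; "x-backup-policy" is a hypothetical extra
# statement that only some other tool would understand.
NULECULE = json.loads("""
{
  "specversion": "0.0.2",
  "id": "wordpress-app",
  "metadata": {"name": "WordPress"},
  "graph": [],
  "x-backup-policy": {"schedule": "daily"}
}
""")

# Simplified assumption of the top-level statements a deployer evaluates.
KNOWN_KEYS = {"specversion", "id", "metadata", "graph", "params", "requirements"}


def split_statements(document):
    """Separate statements this consumer understands from all the others."""
    known = {k: v for k, v in document.items() if k in KNOWN_KEYS}
    extra = {k: v for k, v in document.items() if k not in KNOWN_KEYS}
    return known, extra


known, extra = split_statements(NULECULE)
print("evaluated here:", sorted(known))
print("left for other software:", sorted(extra))
```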
So we observed two major topics coming up with the early adopters:
- a guided walkthrough from A to Z
- too much boilerplate, too little added value
Keep in mind: this is my blog, my thoughts and statements
Regarding the guided walkthrough, I had a quite long conversation with @kanarip; it is obvious that we need to revive the idea of an Über-example I explained earlier. This just seems to be a GTD problem.
But what about that boilerplate and too little value? Let me give you a super short walkthrough…
The Mission: Nuleculize WordPress
- separate database from (web) frontend
- reuse container images that are available
- deploy to CentOS7
  - best case: OpenShift
  - still ok: Kubernetes
- deliver everything with a container
We need to find some “good” container images for MariaDB (or PostgreSQL) and WordPress. As we want to run them on CentOS7, we should not choose ones that carry the Ubuntu user space. There have been long posts on this topic. Furthermore, these container images should be parameterizable: we need to be able to set env vars within the container to modify the configuration of the application inside the container. If you have a look at OpenShift’s MySQL container, you will see that they do a pretty good job of documenting what inputs this container image accepts/needs.
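What “parameterizable via env vars” means in practice can be sketched as follows: the configuration is a plain dictionary, and it is turned into `-e` flags for `docker run`. The image name and the concrete values are assumptions for illustration; the sketch only builds and prints the command and does not run anything.

```python
import shlex


def docker_run_args(image, env):
    """Build a `docker run` command line that passes settings as env vars."""
    args = ["docker", "run", "-d"]
    for key in sorted(env):
        args += ["-e", f"{key}={env[key]}"]
    return args + [image]


# Illustrative settings; image name and values are assumptions.
args = docker_run_args(
    "openshift/mysql-55-centos7",
    {
        "MYSQL_USER": "wordpress",
        "MYSQL_PASSWORD": "changeme",
        "MYSQL_DATABASE": "wordpress",
    },
)
print(shlex.join(args))
```

An image that documents its accepted variables this clearly is exactly what makes it reusable as a component later on.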
Nice, we found all the good containers, so we can run a database and a frontend. Now, let’s define what our application is: let’s write a Nulecule. First of all, Nuleculizing your application is a recursive and iterative process: we need to Nuleculize the database, then the frontend, and afterwards define our application that consists of these two components. “Nuleculize” means: in addition to knowing which good container images we would like to use, we need to define their dependencies, which parameters need to be there when deploying the application, and which component needs to know which of these parameters (the database password, for example). We said that we want to deploy our application on top of the OpenShift and Kubernetes platforms, therefore we need to write (or find) configurations to run the backend and frontend containers on both of these platforms: we need service definitions, replication controllers, routes… We call these configurations artifacts, and we need to declare within these configurations which parameters should be replaced by the values declared within the Nulecule file.
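The replacement of parameters inside an artifact can be sketched with Python’s `string.Template` as a stand-in. This is not Atomic App’s actual substitution code; the artifact content, the parameter names and the `$name` placeholder style are assumptions for this example.

```python
import json
from string import Template

# A tiny Kubernetes-style service artifact with two placeholders.
ARTIFACT_TEMPLATE = """
{
  "apiVersion": "v1",
  "kind": "Service",
  "metadata": {"name": "$service_name"},
  "spec": {"ports": [{"port": $db_port}]}
}
"""

# Parameter values as they might be declared in (or asked for by) a Nulecule.
params = {"service_name": "mariadb", "db_port": 3306}

# substitute() raises KeyError if a placeholder has no value, which is a
# useful failure mode for required parameters.
rendered = Template(ARTIFACT_TEMPLATE).substitute(params)
artifact = json.loads(rendered)
print(json.dumps(artifact, indent=2))
```

The same parameter dictionary can be applied to the OpenShift and the Kubernetes variant of an artifact, which is why the parameters live in the Nulecule file and not in the artifacts themselves.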
Containers: found, Application: described, let’s package! The next step is to package everything we have into a container. We don’t want to deliver a huge readme and a set of RPMs. Short recap: MySQL container image, WordPress, both Nuleculized (so we packaged their Nulecule files and artifacts separately) and the application’s Nulecule itself. So how many container images do we have now? It’s #3″§53! But that is another aspect we will fix later ;)
In the next episode I will focus on what a good container is and how we can make the life of a Nuleculizer a little easier.