Jul 12, 2018

Virtual Kubelet - Run pods without nodes

During my recent visit of the ContainerDays 2018 in Hamburg (19.-20.06.2018) I attended an interesting talk held by Ria Bhatia from Microsoft about Virtual Kubelet.

Virtual Kubelet is an open source Kubernetes Kubelet implementation that allows you to run Kubernetes pods without having to manage nodes with enough capacity to run the pods.

In classical Kubernetes setups, the Kubelet is an agent running on each node of the Kubernetes cluster. The Kubelet provides an API that allows to manage the pod lifecycle. After a kubelet has launched, it registers itself as a node at the Kubernetes API Server. The node is then known within the cluster and the Kubernetes scheduler can assign pods to the new node, accordingly.

Especially in environments with volatile workloads, managing a Kubernetes cluster means providing the right number of nodes over time. Adding nodes just in time is often not an option, since spinning up a new node just takes too much time. Thus, operators are forced into running and paying for additional nodes to support payload spikes.

The Virtual Kubelet project addresses such operational hardships by introducing an application that masquerades as a kubelet. Just like a normal kubelet, the Virtual Kubelet registers at the Kubernetes API Server as node and provides the Kubelet API to manage pod lifecycles. Instead of interacting with the container runtime on a host, the Virtual Kubelet utilizes serverless container platforms like Azure Container Instances, Fargate or Hyper.sh to run the pods.

Image Source: https://github.com/virtual-kubelet/virtual-kubelet

Using these services via the Virtual Kubelet allows you to run containers within seconds and paying for them per seconds of use, while still having the Kubernetes capabilities for orchestrating them.

The interaction with external services out of the Virtual Kubelet is abstracted by a provider interface. Implementing it allows to bind other external services for running pods.

The project is still in an early state and currently not ready for use in production. However, it’s a very interesting link between container orchestration platforms and serverless platforms and has numerous use cases.

Jul 11, 2018

ContainerDays 2018: Top talks on conference day 2

Bildergebnis für containerdays logo

I attended the ContainerDays 2018 in the Hamburg Hafenmuseum. It was a very cool location midst containers (the real ones), cranes and big freight ships. There were three stages, the coolest stage was definitely in the belly of an old ship. I‘ll write about the talks I visited and give a short summary. You find a list of all talks here: https://containerdays.io/program/conference-day-2.html. Videos are coming soon – I’ll edit this post when they are available.

Update: Here they are: https://www.youtube.com/playlist?list=PLHhKcdBlprMcVSL5OmlzyUrFtc7ib1V4w


One Cluster to Rule Them All - Multi-Tenancy for 1k Users

Lutz Behnke from the HAW Hamburg (Hamburg University of Applied Sciences) talked about running a private cloud in a university. Their requirements on a cloud are somewhat different from what you see in the private sector: Sometimes they just need a small web server to serve a static web page for a student, sometimes they need loads of GPUs for computing a heavy research project. All that should be easily available to more than 1000 students, which are reluctant to read documentation.
They had to build that from the ground up, as universities mostly can’t use AWS, Google Cloud etc. The first version was based on VMware, but that was scrapped quickly: Students overestimated their requirements on resources and requested quad core CPUs with a load of memory to just serve their small web application they needed for their network course. After the course was done, no one released the resources. Of course no student ever applied security patches to these eternally running virtual machines.
The second version of the private cloud is based on Kubernetes (k8s). K8s should in theory support multi-tenancy, but everyone understands that concept a little bit differently. The HAW needed LDAP authentication in k8s, so they built a small tool called kubelogin, which authenticates against a LDAP server. The authorization is managed via GitLab groups, and they built a tool which syncs these GitLab groups back into k8s. Rook.io is used for the distributed storage.
They already solved many problems and the solutions have been communicated back into the Multitenancy Working Group. But some problems are still unsolved: How to handle the logs from 2000+ nodes? How to share GPU nodes? They also ran into problems with etcd – the default of 2 GB storage space is too little when you have high pod churn.
One lesson learned: Even when every of your nodes is cattle, etcd is your pet. They published all the work done on their website http://hypathia.net/en/.


Lightning Talk: Gitkube: Continuous Deployment to Kubernetes using Git Push

Shahidh K. Muhammed from Hasura talked about GitKube. He isn‘t happy with the current deployment flow when using k8s and wants something similar to Heroku, where you just push code to a git remote and the system does the rest: compiling, packaging, deploying. He showed us GitKube (https://github.com/hasura/gitkube), which works by using git hooks. When you push to the git remote, a special worker in the k8s cluster builds, packages and deploys the code. He demonstrated the whole setup in a live demonstration. Very cool!


Distributed Microservices Metrics and Tracing with Istio and OpenCensus

Sandeep Dinesh from Google talked about the microservice hype, metrics, tracing and Istio. Turns out that microservices, for all their glory, have downsides: They increase the complexity in infrastructure, development and introduce more latency. Also tracing the request through multiple services and metrics collection gets a lot harder.
For distributed tracing, you need in essence: a trace id generator, passing of the trace id to downstream services, span (service and method calls) to collector sending and finally some data processing of these traces and spans in the collector. An example of such a processing tool is Zipkin (https://zipkin.io/). As you don‘t want vendor locking to one tracing tool, Google created a new initiative called OpenCensus (https://opencensus.io/). This decouples the implementation (e.g. Zipkin) from the API to which the service is compiled to.
Istio, which uses sidecars to instrument and trace services on k8s, also supports OpenCensus. Istio takes care, for example, of monitoring the incoming and outgoing traffic. It also creates the trace id, if none is available. As Istio can‘t look in the k8s service, you need to call the OpenCensus API to create the spans. Istio then merges the spans from OpenCensus with its own observed behavior and reports it to the collector.
Sandeep showed all of that in a live demonstration and also emphasized that the whole stack is early development and should not be used in production.


Applying (D)DDD and CQ(R)S to Cloud Architectures

Benjamin Nothdurft talked about domain driven design (DDD) and high level architecture. He explained a technique to find the bounded contexts of your domain and gave an introduction into CQRS.
CQRS is essentially splitting your models into two: one for querying, one for updating. In the software Benjamin presented, they used JPA and a relational database for the update model and an ElasticSearch instance for the querying model. They also split the service into two, one for updating, one for querying. When updating the data, the update event is put in a queue. The querying service processes the events from the queue and applies the updates on the querying model.
This, of course, complicates the whole system and makes sense when you have an asymmetric load – in this case the querying side had to be scaled independently from the updating side.


Containers on billions of devices

Alexander Sack from Pantarcor talked about containers on devices. In his talk, a device can be a router, a drone, a tablet etc. He excluded the smaller things, like embedded sensors or actors.
The way these devices are built today is as follows: Design the hardware and the firmware, then develop this software, assemble the hardware in the factory, put the software on it and then never touch it until end of life. One of the things lost this way is security. A better way would be to pick general stock hardware and peripherals, assemble them in a chassis and put some general purpose software on it. The device specific software can then be developed and updated even after the device has been shipped to the customer.
One way to do that is - you guessed it - with containers! There are already some products which do this, for example resin.io using Docker containers. The problem using Docker containers is that they are really heavy on the disk and not that suitable for smaller devices, with, say, 8 MB of flash space.
Pantarcor developed a solution which is completely open source. They are packaging the whole system in containers, with a small hypervisor to orchestrate the containers and to update the base system. They are using LXC containers under the hood, which lowers the space consumption. PantaHub (https://www.pantahub.com/) is their UI to manage the devices.


Secret Management with Hashicorp's Vault

Daniel Bornkessel from innoQ talked about Vault (https://www.vaultproject.io/). Vault manages application secrets like encryption keys, database credentials and more. It also does credential rolling, revocation of credentials and auditing.
He explained the architecture and concepts of Vault: A client authenticates to the Vault server, Vault uses a connector to read (or generate) the secret (e.g. the database credentials). The authentication is pluggable and supports multiple plugins, like hard-coded secrets, k8s security or AWS credentials.
Vault also supports generating credentials on the fly. It can, for example, log into your PostgreSQL database, create a user name and password on the fly and hand it to the client. When the client authentication expires, it cleans up the user in PostgreSQL. That way there are no hard-coded passwords. This is definitely a big win for security!