Jul 12, 2018

Virtual Kubelet - Run pods without nodes

During my recent visit to ContainerDays 2018 in Hamburg (June 19-20, 2018), I attended an interesting talk by Ria Bhatia from Microsoft about Virtual Kubelet.

Virtual Kubelet is an open source Kubernetes Kubelet implementation that allows you to run Kubernetes pods without having to manage nodes with enough capacity to run the pods.

In classic Kubernetes setups, the kubelet is an agent running on each node of the cluster. The kubelet provides an API for managing the pod lifecycle. After a kubelet has launched, it registers itself as a node with the Kubernetes API server. The node is then known within the cluster, and the Kubernetes scheduler can assign pods to it accordingly.

Especially in environments with volatile workloads, managing a Kubernetes cluster means providing the right number of nodes over time. Adding nodes just in time is often not an option, since spinning up a new node simply takes too long. Operators are thus forced to run, and pay for, additional nodes to absorb load spikes.

The Virtual Kubelet project addresses such operational hardships by introducing an application that masquerades as a kubelet. Just like a normal kubelet, the Virtual Kubelet registers with the Kubernetes API server as a node and provides the Kubelet API to manage pod lifecycles. Instead of interacting with the container runtime on a host, however, the Virtual Kubelet uses serverless container platforms like Azure Container Instances, Fargate or Hyper.sh to run the pods.

Image Source: https://github.com/virtual-kubelet/virtual-kubelet

Using these services via the Virtual Kubelet lets you start containers within seconds and pay for them by the second of use, while still having the Kubernetes capabilities for orchestrating them.

The Virtual Kubelet's interaction with external services is abstracted behind a provider interface; implementing it lets you plug in other external services for running pods.
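As a rough illustration of the idea (the real Virtual Kubelet and its provider interface are written in Go and have more methods than shown here), a provider boils down to a small set of pod-lifecycle callbacks. The Python sketch below is a simplified analogue with made-up method names, plus a toy in-memory provider standing in for a serverless platform:

```python
from abc import ABC, abstractmethod

class PodProvider(ABC):
    """Simplified, hypothetical analogue of the Virtual Kubelet
    provider interface: whoever implements these callbacks decides
    where and how the pods actually run."""

    @abstractmethod
    def create_pod(self, pod):
        """Start running the given pod on the backing platform."""

    @abstractmethod
    def delete_pod(self, name):
        """Stop and remove the named pod."""

    @abstractmethod
    def get_pods(self):
        """Return all pods this provider currently runs."""

class InMemoryProvider(PodProvider):
    """Toy provider that 'runs' pods in a dict instead of a
    serverless platform such as Azure Container Instances."""

    def __init__(self):
        self._pods = {}

    def create_pod(self, pod):
        self._pods[pod["name"]] = pod

    def delete_pod(self, name):
        self._pods.pop(name, None)

    def get_pods(self):
        return list(self._pods.values())
```

The Virtual Kubelet itself would translate Kubelet API calls from the API server into these provider callbacks, so the scheduler never notices it is not talking to a real node.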

The project is still in an early state and currently not ready for use in production. However, it’s a very interesting link between container orchestration platforms and serverless platforms and has numerous use cases.

Jul 11, 2018

ContainerDays 2018: Top talks on conference day 2


I attended the ContainerDays 2018 in the Hamburg Hafenmuseum. It was a very cool location amidst containers (the real ones), cranes and big freight ships. There were three stages; the coolest was definitely in the belly of an old ship. I'll write about the talks I attended and give a short summary. You can find a list of all talks here: https://containerdays.io/program/conference-day-2.html. Videos are coming soon – I'll edit this post when they are available.

Update: Here they are: https://www.youtube.com/playlist?list=PLHhKcdBlprMcVSL5OmlzyUrFtc7ib1V4w


One Cluster to Rule Them All - Multi-Tenancy for 1k Users

Lutz Behnke from the HAW Hamburg (Hamburg University of Applied Sciences) talked about running a private cloud at a university. Their requirements are somewhat different from what you see in the private sector: sometimes they just need a small web server to serve a static web page for a student, sometimes they need loads of GPUs for a heavy research project. All of that should be easily available to more than 1000 students, who are reluctant to read documentation.
They had to build it from the ground up, as universities mostly can't use AWS, Google Cloud etc. The first version was based on VMware, but that was scrapped quickly: students overestimated their resource requirements and requested quad-core CPUs with loads of memory just to serve the small web application they needed for their networking course. After the course was done, no one released the resources. And of course no student ever applied security patches to these eternally running virtual machines.
The second version of the private cloud is based on Kubernetes (k8s). In theory k8s supports multi-tenancy, but everyone understands that concept a little differently. The HAW needed LDAP authentication in k8s, so they built a small tool called kubelogin, which authenticates against an LDAP server. Authorization is managed via GitLab groups, and they built a tool which syncs these GitLab groups back into k8s. Rook.io is used for distributed storage.
They already solved many problems and the solutions have been communicated back into the Multitenancy Working Group. But some problems are still unsolved: How to handle the logs from 2000+ nodes? How to share GPU nodes? They also ran into problems with etcd – the default of 2 GB storage space is too little when you have high pod churn.
One lesson learned: even if every one of your nodes is cattle, etcd is your pet. They published all their work on their website http://hypathia.net/en/.


Lightning Talk: Gitkube: Continuous Deployment to Kubernetes using Git Push

Shahidh K. Muhammed from Hasura talked about Gitkube. He isn't happy with the current deployment flow when using k8s and wants something similar to Heroku, where you just push code to a git remote and the system does the rest: compiling, packaging, deploying. He showed us Gitkube (https://github.com/hasura/gitkube), which works by using git hooks: when you push to the git remote, a special worker in the k8s cluster builds, packages and deploys the code. He demonstrated the whole setup live. Very cool!
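Gitkube's exact mechanics aside, the git-hook mechanism it builds on can be sketched in a few lines. Git feeds a post-receive hook one `<old-sha> <new-sha> <ref>` line per updated ref on stdin, and the hook decides what to build and deploy. The function and branch names below are illustrative, not Gitkube's actual code:

```python
def refs_to_deploy(push_lines, deploy_branch="master"):
    """Parse the '<old-sha> <new-sha> <ref>' lines that git feeds a
    post-receive hook and return the commits that should trigger a
    build-and-deploy run."""
    target = f"refs/heads/{deploy_branch}"
    deploys = []
    for line in push_lines:
        old_sha, new_sha, ref = line.split()
        if ref == target:
            deploys.append(new_sha)
    return deploys

# Example: a push that updated two branches at once.
pushed = [
    "aaa bbb refs/heads/master",
    "aaa ccc refs/heads/feature-x",
]
# Only the master update would kick off the in-cluster build worker.
print(refs_to_deploy(pushed))
```

In a real setup, the hook would then hand the commit to a worker that runs the image build and a rolling update of the deployment.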


Distributed Microservices Metrics and Tracing with Istio and OpenCensus

Sandeep Dinesh from Google talked about the microservice hype, metrics, tracing and Istio. It turns out that microservices, for all their glory, have downsides: they increase complexity in infrastructure and development, and they introduce more latency. Tracing a request through multiple services and collecting metrics also gets a lot harder.
For distributed tracing you essentially need: a trace ID generator, propagation of the trace ID to downstream services, sending of spans (service and method calls) to a collector, and finally some processing of these traces and spans in the collector. An example of such a processing tool is Zipkin (https://zipkin.io/). As you don't want vendor lock-in with one tracing tool, Google created a new initiative called OpenCensus (https://opencensus.io/). It decouples the implementation (e.g. Zipkin) from the API against which the service is compiled.
Istio, which uses sidecars to instrument and trace services on k8s, also supports OpenCensus. Istio takes care, for example, of monitoring incoming and outgoing traffic. It also creates the trace ID if none is present. As Istio can't look inside the k8s service, you need to call the OpenCensus API to create the spans. Istio then merges the spans from OpenCensus with its own observed behavior and reports everything to the collector.
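The propagation scheme described above can be sketched as a toy model (this is not the OpenCensus API; the header name and the list-based collector are made up for illustration):

```python
import time
import uuid

# Hypothetical header name used to carry the trace ID between services.
TRACE_HEADER = "x-trace-id"

def handle_request(incoming_headers, collector, service_name):
    """Reuse the incoming trace ID (or start a new trace at the entry
    point), record a span for this service, and return the headers to
    pass on to downstream services."""
    trace_id = incoming_headers.get(TRACE_HEADER) or uuid.uuid4().hex
    start = time.time()
    # ... the service's actual work would happen here ...
    collector.append({
        "trace": trace_id,
        "service": service_name,
        "duration": time.time() - start,
    })
    return {TRACE_HEADER: trace_id}

collector = []
out = handle_request({}, collector, "frontend")   # entry point: fresh trace ID
handle_request(out, collector, "backend")         # downstream: same trace ID
```

Both recorded spans carry the same trace ID, which is exactly what lets the collector stitch them back together into one end-to-end trace.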
Sandeep showed all of that in a live demonstration and also emphasized that the whole stack is early development and should not be used in production.


Applying (D)DDD and CQ(R)S to Cloud Architectures

Benjamin Nothdurft talked about domain driven design (DDD) and high level architecture. He explained a technique to find the bounded contexts of your domain and gave an introduction into CQRS.
CQRS essentially splits your model in two: one for querying, one for updating. In the software Benjamin presented, they used JPA and a relational database for the update model and an Elasticsearch instance for the query model. They also split the service in two: one for updating, one for querying. When data is updated, an update event is put on a queue. The query service processes the events from the queue and applies the updates to the query model.
This, of course, complicates the whole system and makes sense when you have an asymmetric load – in this case the querying side had to be scaled independently from the updating side.
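The split described above can be sketched with in-memory stand-ins for the relational database, Elasticsearch and the queue (a minimal model of the pattern, not the presented software):

```python
from collections import deque

class WriteSide:
    """Update model: accepts writes and publishes an event per change."""

    def __init__(self, event_queue):
        self.records = {}          # stand-in for the relational database
        self.event_queue = event_queue

    def update(self, key, value):
        self.records[key] = value
        self.event_queue.append(("updated", key, value))

class ReadSide:
    """Query model: consumes events and maintains its own read-optimized
    view (stand-in for the Elasticsearch index)."""

    def __init__(self):
        self.index = {}

    def process(self, event_queue):
        while event_queue:
            _event, key, value = event_queue.popleft()
            self.index[key] = value

queue = deque()
writer, reader = WriteSide(queue), ReadSide()
writer.update("order-1", "shipped")
reader.process(queue)   # eventually consistent: the read model catches up
```

Note that between `update` and `process` the two models disagree; that window of eventual consistency is the price paid for scaling the query side independently.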


Containers on billions of devices

Alexander Sack from Pantarcor talked about containers on devices. In his talk, a device can be a router, a drone, a tablet etc. He excluded the smaller things, like embedded sensors or actors.
The way these devices are built today is as follows: design the hardware and the firmware, develop the software, assemble the hardware in the factory, put the software on it, and then never touch it until end of life. One of the things lost this way is security. A better way would be to pick general stock hardware and peripherals, assemble them in a chassis and put some general-purpose software on it. The device-specific software can then be developed and updated even after the device has been shipped to the customer.
One way to do that is - you guessed it - with containers! There are already some products which do this, for example resin.io using Docker containers. The problem with Docker containers is that they are really heavy on disk space and not that suitable for smaller devices with, say, 8 MB of flash.
Pantarcor developed a solution which is completely open source. They are packaging the whole system in containers, with a small hypervisor to orchestrate the containers and to update the base system. They are using LXC containers under the hood, which lowers the space consumption. PantaHub (https://www.pantahub.com/) is their UI to manage the devices.


Secret Management with Hashicorp's Vault

Daniel Bornkessel from innoQ talked about Vault (https://www.vaultproject.io/). Vault manages application secrets like encryption keys, database credentials and more. It also does credential rolling, revocation of credentials and auditing.
He explained the architecture and concepts of Vault: a client authenticates to the Vault server, and Vault uses a connector to read (or generate) the secret (e.g. the database credentials). Authentication is pluggable and supports multiple backends, like hard-coded secrets, Kubernetes or AWS credentials.
Vault also supports generating credentials on the fly. It can, for example, log into your PostgreSQL database, create a username and password on the fly and hand them to the client. When the client's authentication expires, Vault cleans up the user in PostgreSQL. That way there are no hard-coded passwords. This is definitely a big win for security!
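The lifecycle of such dynamic credentials can be modelled in a few lines. This is a toy stand-in, not the Vault API: in reality Vault would execute `CREATE USER` / `DROP USER` statements against PostgreSQL, and the class and method names below are invented:

```python
import secrets
import time

class DynamicCredentialBroker:
    """Toy model of Vault-style dynamic secrets: each client gets a
    fresh, short-lived database user that is removed when its lease
    expires."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.leases = {}   # username -> lease expiry timestamp

    def issue(self, now=None):
        """Hand out a freshly generated username/password pair.
        (A real broker would also CREATE USER ... in PostgreSQL.)"""
        now = time.time() if now is None else now
        username = f"app-{secrets.token_hex(4)}"
        password = secrets.token_urlsafe(16)
        self.leases[username] = now + self.ttl
        return username, password

    def revoke_expired(self, now=None):
        """Drop every user whose lease has run out.
        (A real broker would also DROP USER ... in PostgreSQL.)"""
        now = time.time() if now is None else now
        expired = [u for u, expiry in self.leases.items() if expiry <= now]
        for username in expired:
            del self.leases[username]
        return expired
```

Because every credential is generated per client and dies with its lease, a leaked password is only useful for a short window, which is the security win the talk emphasized.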

Jun 14, 2018

Impressions from SEACON 2018 - Part 2

by Harald Störrle

"Domain Driven Design" and "Taylorism"

Henning Schwentner (wps solutions GmbH) presented the concepts behind Domain Driven Design (DDD, see [2,6,8] for general references, and [7] for the slides of the talk). The general idea behind DDD is to structure applications vertically rather than horizontally into domains: design small, self-contained portions of an application domain rather than attempt to get (only?) the big picture. It doesn’t stop there, though: the domain structure ought to be established, says Schwentner, not just in the design (aka. models), but likewise in the architecture, code structure, organisation structure, and tooling (e.g. repositories). The “Design” in DDD refers to domain-level models (mostly conceptual, it appears) that constitute the ontology of a (sub-)domain and allow defining the boundaries ("Bounded Context") which are reflected in the interfaces at code level.

At first hearing, DDD reminds me a lot of the Role-Modeling approaches of the late 1990’s [1,3,4] (then absorbed into UML), or the Business Objects from the early 1990’s [5], or, even earlier, of the vision and promise of OO technology in general: closing the “semantic gap” between application and technology. Of course, DDD offers modern(ized) terminology, and there certainly is a lot of technical progress since the early days of OO, but the idea is not as new as it might seem… Still, it is a good idea, and it easily survives being renamed, rephrased, and repackaged (again). Maybe, this time around, we will finally see the convergence of application needs and technology opportunities.

Obviously, structuring organisations vertically is all the rage today. The main benefit is the increased agility of small teams, hopefully without losing the capability to tackle large-scale problems, or maybe even upgrading organisational capabilities from solving complicated to solving complex problems, never mind wicked problems. Clearly, introducing proper modules into Java 9 is an important contribution towards this goal. And it makes perfect sense to me to bet on this one, even though “module” is not quite a brand-new concept either… better late than never. I remain cautious, though, since vertical structures have downsides, too (ever heard the term “information silo”?). And I can’t see the reasons for having horizontal structures (synergy, reuse, integration) really going away for good.

Having said that, I do like the idea of starting at an (elevated) level of abstraction. In my experience, this is difficult enough at the level of models, let alone code. What I find truly interesting, though, is the breadth and prominence that the social or organisational perspective has gained at IT conferences. A side topic in Schwentner's talk, it took center stage in a talk by Frank Düsterbeck. He spoke about leadership in learning organisations ("Taylor ist tot, es lebe der Mensch – Führung in der lernenden Organisation"; "Taylor is dead, long live the human – leadership in the learning organisation"). He pointed out that there are, in fact, two types of problems:
  • Complicated problems can be tackled by applying diligence, systematic procedure and delegation. Such problems can be solved by mechanical steps in the end.
  • Complex problems, on the other hand, are by definition beyond what one person can grasp. Only self-organised teams can hope to conquer them.

Of course, in today's highly dynamic marketplaces the latter abound. With the threat of disruption just around the corner, agility is key for thriving as an organisation. So the call to take teams seriously is perfectly plausible to me. Not many organisations have embraced this idea, and many more should. Düsterbeck's plea strikes me as somewhat shallow, though. As he points out himself, a tree has fewer edges than a (connected) graph. If every edge corresponds to a communication link, then the overhead for self-organised teams grows much faster with headcount than it does for hierarchical organisations (cf. Brooks' Law: adding people to a late project makes it later). He observes that there are two types of communication:
  • Steering communication: this is unavoidable, but it is also the smaller part and thus not the key factor contributing to communication overhead.
  • Knowledge dissemination: this can, at least to some degree, be replaced by converting fluid and tacit knowledge into a more static form (aka "documentation").
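The combinatorial point about trees versus graphs is easy to check: a strict hierarchy over n people is a tree with n - 1 communication links, while a fully self-organised team where everyone talks to everyone forms a complete graph with n(n - 1)/2 links:

```python
def links_hierarchy(n):
    """Communication links in a strict hierarchy: a tree over n
    people has exactly n - 1 edges."""
    return n - 1

def links_full_mesh(n):
    """Communication links if everyone talks to everyone: a complete
    graph over n people has n * (n - 1) / 2 edges."""
    return n * (n - 1) // 2

# The gap widens quadratically with team size.
for n in (5, 10, 50):
    print(n, links_hierarchy(n), links_full_mesh(n))
```

At 10 people the mesh already needs 45 links versus 9 in a hierarchy, which is the overhead Brooks' Law warns about.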

I am not sure how much slack this distinction cuts a team. And what about those problems that are too big for one (small) team? DDD will answer: create another subdomain and establish interfaces. However, the overall picture must be established, too, and "emergent interfaces" are bound to create friction, duplication and defects of every sort. Düsterbeck also highlights that the usual T-shaped profile in technology (broad coverage with deep rooting in some place) is not enough. It must be complemented by a second dimension of domain knowledge, again T-shaped. What is more, he wants a third dimension in this picture, the social dimension of individuals and teams (see the figure taken from https://twitter.com/fduesterbeck). Indeed, the times of Taylorism are over.



[1]   Epstein, Pete, and Ravi Sandhu: “Towards a UML Based Approach to Role Engineering.” Proc. 4th ACM Workshop on Role-Based Access Control. ACM, 1999.
[2]   Evans, Eric: “Domain-Driven Design: Tackling Complexity in the Heart of Software.” Addison-Wesley, 2004.
[3]   Halpin, Terry: “Object-Role Modeling (ORM/NIAM).” Handbook on Architectures of Information Systems. Springer, Berlin, Heidelberg, 1998. 81-103.
[4]   Halpin, Terry, and Anthony Bloesch: “Data Modeling in UML and ORM: A Comparison.” Journal of Database Management (JDM) 10.4 (1999): 4-13.
[5]   Sims, Oliver: “Business Objects: Delivering Cooperative Objects for Client-Server.” McGraw-Hill, Inc., 1994.
[6]   Schwentner, Henning: Domain Storytelling website, http://domainstorytelling.org/
[7]   Schwentner, Henning: “Models, Modules, and Microservices.” speakerdeck.com/hschwentner
[8]   Vernon, Vaughn: “Domain-Driven Design Distilled.” Addison-Wesley, 2016.

Jun 12, 2018

Impressions from SEACON 2018 - Part 1

by Silke Tautorat

An extraordinary keynote

The first day’s keynote, “Umgang mit Komplexität lernen” (learning to deal with complexity), was held by two 13-year-old pupils of the Evangelische Schule Berlin and was a very interesting start for a software engineering and architecture conference.
Romy Randel and Rosalie Hermann presented, in a very professional way, everyday life at their school, where each student can decide on their own at what pace they want to learn and work on projects.
Of course, the students have a fixed time-table, but in certain units called "Lernbüro" (learning office) they can choose if they want to do Maths, German or English and on what topic they want to work on.
Every few months the students together with their tutor, usually a teacher, agree on goals for the upcoming month and check on the achievement of the past time.
One day of the week is reserved for a project that the whole class, which comprises students from 7th to 9th grade, works on together, for example improving the neighbourhood or working in a retirement home.
Another difference from other schools is the yearly “Projekt Herausforderung” (project challenge): all students must plan and organize a three-week project with a budget of 150€. This can be done alone or in a group of students and can be anything from a stay on a farm working with animals, to a canoeing trip, to a journey to Italy.
Both girls seemed to be very happy with this type of school and, according to the way they presented the keynote and answered questions, this model encourages self-confidence and self-organisation.

Eleven Lessons Learned from a large agile project at EOS

Maik Wurdel summarized the twelve lessons (he spontaneously added the twelfth) he and his team learned in a large agile project. They have been developing a major core system, which is supposed to replace the existing system shortly, and they set up the first agile teams within their company, EOS. Here is the list in my own words:

  1. Try to establish start-up conditions: few processes, cross-functional teams, all in one place. And make sure to communicate effectively what you are doing in this new agile team to the rest of the company; otherwise a lot of misunderstanding and irritation can happen.
  2. Never start without a customer. The correct understanding on what the customer really needs and wants is very important.
  3. Get an agile coach to support the teams and the transition. The coach should not be just another Scrum Master.
  4. Go live early, to learn as soon as possible.
  5. Conway is right: Organizations are constrained to produce designs which are copies of the communication structures of these organizations. Establish Feature Teams.
  6. Autonomous decisions need clear objectives, aim for high transparency.
  7. Define and measure your KPIs.
  8. Developers in agile teams do not develop faster, they learn faster.
  9. If you are using cutting-edge technology, check regularly if it is still the right choice or if you need to adjust. Start small.
  10. Agility is nothing you just do, agility is a state of mind and has its reasons. You don't do agile, you are agile - with good cause...
  11. Agility has its price: transparency and sprinting is strenuous.
  12. Agile transformation starts with the first team member. Everyone is different and has diverging experiences.