Nov 30, 2016

Continuously delivering a Go microservice with Wercker on DC/OS

Currently, I am really into the field of building cloud native applications and the associated technology stacks. Normally I would use Java as a primary language to implement such an application. But since everyone seems to be using Go at the moment, I figured it's about time to learn a new language to see how it fits into the whole cloud native universe.

So let's implement a small microservice written in Go, build a Docker image and push it to Docker Hub. We will be using the Docker based CI platform Wercker to continuously build and push the image whenever we change something in the code. The complete example source code of this article can be found on GitHub here.

Before you start

Make sure you have all the required SDKs and tools installed. Here is the list of things I used for the development of this showcase:
  • Visual Studio Code with Go language plugin installed
  • The Go SDK using Brew
  • The Docker Toolbox or native Docker, whatever you prefer
  • The Make tool (optional)
  • The Wercker CLI, for easy local development (optional)

Go micro service in 10 minutes

If you are new to the Go language, make sure you read the Go Bootcamp online book.

To build the micro service, we will only be using the 'net/http' and 'encoding/json' packages that come with the Go standard library. We define the response structure of our endpoint using a plain Go struct. The main function registers the handler function for the '/api/hello' endpoint and then listens on port 8080 for any incoming HTTP requests. The handler function takes two parameters: a response writer and a pointer to the original HTTP request. All we do in here is create and initialize the response structure, marshal this structure to JSON and finally write the data to the response stream. If no content type is set, Go's HTTP server detects it from the body, which yields 'text/plain' for a JSON payload, so we also set the 'Content-Type' HTTP header to the expected value for the JSON formatted response.

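package main

import (
    "encoding/json"
    "net/http"
)

// Hello response structure
type Hello struct {
    Message string
}

func main() {
    http.HandleFunc("/api/hello", hello)
    http.ListenAndServe(":8080", nil)
}

func hello(w http.ResponseWriter, r *http.Request) {

    m := Hello{"Welcome to Cloud Native Go."}
    b, err := json.Marshal(m)

    if err != nil {
        // Report the marshalling failure to the caller and stop
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }

    // The header has to be set before the body is written
    w.Header().Add("Content-Type", "application/json;charset=utf-8")
    w.Write(b)
}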

Now it is time to trigger the first Go build for our micro service. Open a terminal, change directory into your project folder and issue the following command:

go build -o cloud-native-go

You should now have an executable called 'cloud-native-go' in your project directory which you can use to run the micro service. You should also be able to call the '/api/hello' HTTP endpoint on localhost, e.g. curl http://localhost:8080/api/hello. Done.
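If you prefer to check the handler without starting the server, the standard 'net/http/httptest' package can call it in-process. The following is only a minimal sketch: it repeats the handler from above so that it is self-contained, and the helper name callHello is my own invention for illustration, not part of the example project.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Hello mirrors the response structure from the article.
type Hello struct {
	Message string
}

// hello is the handler from the article, repeated to keep the sketch self-contained.
func hello(w http.ResponseWriter, r *http.Request) {
	b, err := json.Marshal(Hello{"Welcome to Cloud Native Go."})
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Add("Content-Type", "application/json;charset=utf-8")
	w.Write(b)
}

// callHello (hypothetical helper) invokes the handler in-process and
// returns the status code, the Content-Type header and the body.
func callHello() (int, string, string) {
	req := httptest.NewRequest(http.MethodGet, "/api/hello", nil)
	rec := httptest.NewRecorder()
	hello(rec, req)
	return rec.Code, rec.Header().Get("Content-Type"), rec.Body.String()
}

func main() {
	status, contentType, body := callHello()
	fmt.Println(status, contentType, body)
}
```

In a real project the same calls would normally live in a '_test.go' file and run via 'go test'; the main function here just prints the three values.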

Go CI/CD pipeline using Wercker

Wercker is a Docker native CI/CD automation platform for Kubernetes, Marathon and general microservice deployments. It is pretty easy to use, allows local development and is free for community use. For the next step, make sure you have the Wercker CLI tools installed. The instructions can be found here.

Create a file called 'wercker.yml' in the root directory of your project and add the following code snippet to it to define the local development build pipeline. We specify the Docker base box to use for the build as well as the commands to build and run the app.

dev:
  # The container definition we want to use for developing our app
  box:
    id: golang:1.7.3-alpine
    cmd: /bin/sh
  steps:
    - internal/watch:
        code: |
          CGO_ENABLED=0 go build -o cloud-native-go
          ./cloud-native-go
        reload: true

In order to continuously build and run our Go microservice locally, and also watch for changes to the sources, you only have to issue the following Wercker CLI command:

wercker dev --publish 8080

This will download the base box, and then build and run the app inside the container. In case of changes Wercker will rebuild and restart the application automatically. You should now be able to call the '/api/hello' endpoint via the IP address of your local Docker host on port 8080 and see the result message, e.g. with curl.

Once the application and the development build are working, it is time to define the pipelines to build the application and to push the image to Docker Hub. The first pipeline has three basic steps: first call Go Lint, then build the application and finally copy the build artifacts to the Wercker output folder for the next pipeline to use as inputs. The following code excerpt should be pretty self-explanatory.

build:
  # The container definition we want to use for building our app
  box:
    id: golang:1.7.3-alpine
    cmd: /bin/sh
  steps:
    - wercker/golint
    - script:
        name: go build
        code: |
          CGO_ENABLED=0 go build -o cloud-native-go
    - script:
        name: copy binary
        code: cp cloud-native-go "$WERCKER_OUTPUT_DIR"

The final pipeline will use the outputs from the previous pipeline, build a new image using a different base box and then push the final image to Docker Hub. Again, there is not much YAML required to do this. But wait, where is the Dockerfile required to do this? If you pay close attention you will notice that some of the attributes of the 'internal/docker-push' step resemble the different Dockerfile keywords.

deploy:
  # The container definition we want to use to run our app
  box:
    id: alpine:3.4
    cmd: /bin/sh
  steps:
    - internal/docker-push:
        author: "M.-L. Reimer <>"
        username: $USERNAME
        password: $PASSWORD
        repository: lreimer/cloud-native-go
        tag: 1.0.0 $WERCKER_GIT_COMMIT latest
        entrypoint: /pipeline/source/cloud-native-go
        ports: "8080"

Once you have saved and pushed the 'wercker.yml' file to GitHub, create a new Wercker application and point it to this GitHub repo. Next, define the build pipeline using the Wercker web UI. Also make sure that you define the $USERNAME and $PASSWORD variables as secure ENV variables for this application and that you set them to your Docker Hub credentials. After the next 'git push' you will see the pipeline running and after a short while the final Docker images should be available on Docker Hub. Sweet!

Wercker is also capable of deploying the final Docker image to a cluster orchestrator such as Kubernetes, Marathon or Amazon ECS. So as a final step, we will enhance our pipeline with the automatic deployment to a DC/OS cluster running Marathon.

    - script:
        name: generate json
        code: chmod +x && ./
    - script:
        name: install curl
        code: apk upgrade && apk update && apk add curl
    - wercker/marathon-deploy:
        marathon-url: $MARATHON_URL
        app-name: $APP_NAME
        app-json-file: $APP_NAME.json
        instances: "3"
        auth-token: $MARATHON_AUTH_TOKEN

First, we execute a shell script that generates the Marathon service JSON definition from a template enhanced with some Wercker ENV variables. Then we install 'curl' as this tool is required by the next step and it's not included in the Alpine base image. Finally, we will use the built-in Wercker step to deploy 3 instances of our microservice to a DC/OS cluster. We use several ENV variables here, which need to be set on a deployment pipeline level. Important here are $MARATHON_URL and $MARATHON_AUTH_TOKEN, which are required to connect and authenticate to the Marathon REST API.

Summary and Outlook

Implementing simple microservices in Go is pretty straightforward. However, things like service discovery, configuration, circuit breakers or metrics aren't covered by the current showcase application yet. For real cloud native Go applications we will have a closer look at libraries such as Go-Kit or Go-Micro in the next instalment.

Stay tuned. To be continued ...


GOTO Berlin 2016 – Recap

I recently returned from Berlin where I attended the GOTO Berlin 2016 conference. Here are some of the insights I brought with me.

Diverse keynotes
There were some amazing keynotes on important topics like prejudices, (neuro)diversity and algorithms gone wrong (producing biased, unfortunate and hurtful results). I liked these talks a lot. Make sure you check out the talks given by Linda Rising, Sallyann Freudenberg and Carina C. Zona.

The Cloud is everywhere
This is no surprise. There were many talks about cloud native applications and micro services. Mary Poppendieck gave a good keynote on why these applications are so important now and in the future. On the more technical side IBM presented OpenWhisk as an alternative to Amazon's Lambda for building serverless architectures. It supports JavaScript, Swift, Python and Java right out of the box. Additionally, arbitrary executables can be added using Docker containers. What's especially notable about OpenWhisk is that it is completely open source. So you could think about switching your provider or even hosting it yourself. Of course IBM offers hosting on their very own cloud platform BlueMix.

UI in times of micro services
There have been a lot of talks covering the idea of using micro services and splitting up your application in different parts with potentially different independent development teams. Most of the time this is all about the backend. On the front end side you still end up with a monolithic, maybe single page, web application that uses these micro services.
Zalando introduced its open source framework ‘Mosaic’, a framework for microservices for the frontend that is meant to tackle these problems. They do this by replacing placeholders in a template with HTML fragments. This happens during the initial page request on the server side (asynchronous replacements via AJAX are supported). The HTML fragments can be provided by the same team that developed the backing micro service.
Mosaic currently offers two server side components. One written in Go and one in Node.js.
Side note: to make the different application fragments look the same, they still have to provide some shared library code (in their case React components).

New ways to visualize data with VR/AR/MR
There was a talk and some demos about the new Microsoft HoloLens. Philipp Bauknecht placed the HoloLens in the space of ‘mixed reality’, as the only existing device there (Pokemon Go was his example for augmented reality). His talk covered some basics about the hardware, possible usage scenarios, existing apps and how to develop new applications.
The interesting part was some completely new possibilities for displaying data, which could result in amazing new kinds of applications. This is (with VR) one of the first really new output devices for quite some time! Very exciting.

This and that

  • Ola Gasidlo mentioned PouchDB, an open-source JavaScript database inspired by Apache CouchDB. Interestingly, it enables applications to store data locally while offline, and then synchronize the data with CouchDB or compatible servers when the application is back online.
  • Ola introduced the phrase ‘Lie Fi’ to me: Lie Fi - Having a data connection, but no packets are coming through ;-)
  • Martin Kleppmann gave an interesting talk about his algorithm for merging concurrent data changes. He did this with the example of a text editor like Google Docs. The project he is currently working on is actually about using cloud technology but with encrypted data (so you don't have to trust the cloud provider that much). The project is called Trve Data.

Nov 7, 2016

Modular Software Systems with Jigsaw - Part II

With version 9, Java has finally got its long-awaited support for building software modules. The Jigsaw module system becomes part and parcel of the JDK and JRE runtime environment. This article describes how to set up statically and dynamically interchangeable software based on Jigsaw in order to design modular and component-oriented applications. Java itself uses the Jigsaw Platform Module System [JSR376] for internal modularization of the previously monolithic runtime environment (rt.jar). Applications use Jigsaw to ensure the integrity of their architecture. Moreover, applications can be deployed with a minimal JRE runtime environment, which only contains the JDK modules needed by the application. Similar to OSGi, Jigsaw also allows writing plug-in modules which provide applications with new functions not available at compile time.


Modules are independently deployable units (deployment units) hiding the implementation from the user. The core of the modularization is based on the information hiding principle: Users do not need to know the implementation details to access the module. These details are hidden behind an interface. In this way, the complexity visible to the user is reduced to the complexity of the interface. All a user needs to know about a module is contained in the module's public classes, interfaces and methods. Details of the implementation are hidden. Modules transfer the public/private principle of object orientation to entire libraries. The principle of inconspicuous implementation has been known for a long time. David Parnas described the visibility principle at module level and its advantages back in 1972 [Par72].

Fig 1: Library vs Module

A module consists of an interface and an implementation part in a single deployment unit/library. (See Fig 1.) The benefits of this way of encapsulation are the same as with object-orientation.

  • Implementation of a module can be changed without affecting the user. 
  • Complex functionality is hidden behind a simple interface. 

The result is improved testability, maintainability and understandability. Today, in the age of cloud and microservices, a modular design is mandatory! If you package the parts needed for microservice remote communication in separate modules and define module interfaces solely by application functions, then local and distributed deployment are just a mouse click away. If you want to exchange module implementations at runtime or to choose one of alternative implementations (plug-in), it’s necessary to separate interfaces and implementation into two independent modules, yielding an API module along with a potentially interchangeable implementation module. Modules exchangeable at runtime are known as plug-in modules. This in turn requires absolute separation of interface and implementation in various deployment units.

Fig II: Separation of Interface and Implementation for Plug-In Modules

Designing modular applications has long been a tradition with Java, and there are many competing approaches to designing software modules. But they all have one thing in common: a module is mapped as a library. Libraries can be realized in Java as a collection of classes, interfaces and additional resources in JARs. JARs are just ZIP files, completely open to any kind of access. Therefore, many applications define their components by a mix of several different approaches:

  • Mapping to package structures by naming conventions
  • Mapping to libraries (JARs)
  • Mapping to libraries, including meta information for checking dependencies and visibility (e.g. OSGi) 
  • Checking dependencies using analysis tools (e.g. SonarQube or Structure101)
  • Checking dependencies using build tools (e.g. Maven or Gradle) as well as
  • Using ClassLoader hierarchies for controlling visibility at runtime (e.g. Java EE) 
All of these approaches have advantages and disadvantages. However, none of them has solved the core problem: as it is, Java has no module concept. That changes with Java 9: with Jigsaw, modules can be designed which control visibility and dependencies at JAR level. Modules make some of their types available as interfaces to the outside world. The interfaces of a Jigsaw module consist of one or more packages. Compiler and JVM ensure that no access occurs past the interface directly to private types (classes, interfaces, enums, annotations).

Jigsaw provides the necessary tools for analysis and control of dependencies. With the analysis tool jdeps, dependencies between JARs and modules can be analyzed and illustrated (with DOT/GraphViz). The Java 9 runtime libraries themselves are based on Jigsaw. The previously monolithic runtime library rt.jar is now split up in Java 9. Cyclic dependencies among modules have been removed. They are forbidden in Jigsaw because they would prevent interchangeability at module level. With the jlink tool, applications can be built with a minimal Java runtime. These applications only contain the effectively utilized modules from the set of JDK modules. The core of Jigsaw is the module descriptor, which the Java compiler compiles into the class file module-info.class and which is found in the top-level package of every Jigsaw JAR archive.
This file contains a module with a name and an optional version number. With requires, a module indicates its dependencies on other modules. With provides, a module indicates that it implements the interface of the specified module. With exports, the interface is indicated as a package name. permits makes a module visible only for the specified modules. With the view section, multiple views on a module can be declared. This mechanism is necessary for downward compatibility. A module can thus support multiple versions of an interface module and remain compatible in spite of further development of old modules.

Sending Email with Jigsaw 

The simple application developed in the following sends emails. It consists of two modules:

  • The Mail module consists of one public interface and one private implementation. The interface of the module consists of one Java interface as well as the types of parameters and exceptions. It contains, in addition, a factory interface (Factory Pattern) for creating the implementation module.
  • The MailClient module uses the Mail module. It may only use the interface; direct access to the implementation classes is forbidden. 

Fig III: The Simplest Module for Sending Mails

Java 9 Jigsaw now ensures that:

  • The MailClient module only accesses exported classes/packages of the Mail module. Direct access to anything else leads to compiler errors, and to runtime errors when trying to get around this restriction using the Reflection API.
  • The Mail module only uses the specified dependencies on other modules. This decouples the module implementation from the client and makes it exchangeable.

Along with supporting internal and external views of a module, Jigsaw also prevents
  • cyclic dependencies among modules. Dependency of the Mail module on the MailClient is thus forbidden and is checked by the compiler and the JVM. 
  • uncontrolled propagation of transitive dependencies from the Mail component on to the MailClient. It is possible to control whether or not the interfaces of dependent modules are visible to the user of the interface. 

The Mail Module Example in Jigsaw Source Code 

Jigsaw introduces a new directory structure for modules in the source code. The source path is now located at the top level, defining modules, together with their sources. The directory corresponds with the module name. So the Java compiler can also find dependent modules in the source code with no cumbersome path declarations required for each module.

|–– Mail
| |–– de
| | |–– qaware
| |   |–– mail
| |     |––
| |     |––
| |     |–– impl
| |       |––
| |––
|–– MailClient
|   –– de
|     |–– qaware
|       |–– mail
|         |–– client
|           |––
|           |––

Of course, with Jigsaw, modules can be stored in any directory structure. But the chosen layout has the advantage that all modules can be compiled in one compiler run, and only one search path needs to be declared. Modules in Java 9 Jigsaw contain a special file, the module descriptor, which is compiled to module-info.class and sits in the default package of the library. In our example, the Mail component exports only one package.

The associated file looks like this:

module Mail {
    exports de.qaware.mail;
}

The exports instruction refers to a package. Across multiple export instructions, multiple packages can be defined as part of the interface. In our example, all types in the de.qaware.mail package are visible to the user while subpackages are invisible. The export instruction is not recursive. Types in the de.qaware.mail.impl sub-package are not accessible from any other module. One user of the Mail module is the MailClient.

The module descriptor looks like this:

module MailClient {
    requires Mail;
}

The requires instruction takes a module name and optionally supports the information of whether the Mail module is visible at runtime (requires … for reflection) or just at compile time (requires … for compilation). By default, the requires instruction applies to the Java compiler as well as to the JVM at runtime. As will be shown in the following, the source code of the MailClient component uses the interface of the Mail component. Part of the interface is the Java interface MailSender as well as a factory which creates an implementation object on demand. In this example, the parameters for the mail address and the message are simple Java strings. Every Jigsaw module automatically depends on the Java base module, java.base. Base packages such as java.lang are found in this module. For this reason, the use of the String class is not explicitly declared in the module.

package de.qaware.mail.client;

import de.qaware.mail.MailSender;
import de.qaware.mail.MailSenderFactory;

public class MailClient {
    public static void main(String[] args) {
        MailSender mail = new MailSenderFactory().create();
        mail.sendMail("", "Hello Jigsaw");
    }
}

Let us remind ourselves: access to private implementation classes is not possible. Any attempt to create an instance of the MailSenderImpl class directly with new or via Reflection without calling up the factory would fail with the following error message:

    ../  error:  MailSenderImpl is not visible because 
    package de.qaware.mail.impl  is not visible 
    1 error.

That is exactly what we want. From outside the Mail module, only the exported types in the "de.qaware.mail" package are accessible. Non-exported packages are invisible.

In order for modular Java programs to be compiled without an external build tool like Ant, Maven or Gradle, the javac Java compiler must be able to find dependent modules, even if they are present as source code only. Therefore the Java compiler has been extended with a module source path. The new option, -modulesourcepath, declares the compiler's search path for dependent modules.

For experienced Java programmers it is very unusual to see multiple modules in the "src" directory, in sub-directories which are named after the modules. If one were to follow JDK conventions, then these directories would be named like packages (e.g. de.qaware.mail). That can become very confusing, yet has the advantage that the module names are globally unique. This, however, plays no role in projects that are not public. Therefore, we use technically descriptive names such as Mail, MailClient or MailAPI. The great advantage of this new code structure, however, is that one single command can compile all modules.

From Mail module to Mail plug-in

In the above example, the interface of the Mail module is closely coupled with the implementation. Jigsaw knows no visibility rules within a module between interface and implementation. Bidirectional dependencies are permitted here. As it is, the Mail module is not exchangeable at runtime, but it becomes so if interface and implementation are separated into different modules (see Fig IV). This conventional plug-in design is necessary whenever there are multiple implementations of one interface:

// src/MailClient/
module MailClient {
    requires MailAPI;
}

// src/MailAPI/
module MailAPI {
    exports de.qaware.mail;
}

// src/WebMail/
module WebMail {
    requires MailAPI;
}

The MailClient module now depends on the new module, MailAPI. The MailAPI module exports the interface but has no implementation of its own. This interface is implemented by a third module, WebMail, which implements the interface rather than exporting something. Both the client and the implementation module declare the API module via requires, and this is all the compiler needs to know at compile time.

Fig IV: The Mail Module as an Exchangeable Plug-In

But now we have a problem at runtime, because the implementation classes are inaccessibly hidden in the WebMail module, and another one because the factory must be located in the MailAPI module in order to be visible to the client. Unfortunately, this leads to a cycle and a compiler error because the factory depends on the implementation. So the question is how to create an instance of a hidden implementation class.

With JDK 9 the amended ServiceLoader class in the java.util package comes in handy: a service interface can be connected with a private implementation class using the provides clause in the module descriptor of the implementation module. The ServiceLoader can then access the implementation class and instantiate it. Creation using reflection with Class.forName().newInstance() is no longer possible. This decision impacts all dependency injection frameworks, such as Spring or Guice. Today's implementations of these frameworks must be adapted for Jigsaw’s new ServiceLoader mechanism.

The client module declares the use of a service by means of the uses clause. The implementing module declares via provides which implementation may be created by the ServiceLoader, and that allows instantiation in a client module via ServiceLoader:

// src/MailAPI/
module MailAPI {
    exports de.qaware.mail;
}

// src/MailClient/
module MailClient {
    requires MailAPI;
    uses de.qaware.mail.MailSender;
}

// src/WebMail/
module WebMail {
    requires MailAPI;
    provides de.qaware.mail.MailSender
        with de.qaware.mail.smtp.SmtpSenderImpl;
}

// src/MailClient/de/qaware/mail/client/

// OK: Create  implementation  by using the java.util.ServiceLoader
MailSender  mail = ServiceLoader.load(MailSender.class).iterator().next();

// NOK:  Reflection  is not allowed:
// mail = (MailSender)  Class.forName("de.qaware.mail.impl.MailSenderImpl").getConstructors()[0].newInstance();

// NOK:  Direct  instantiation  is not allowed:
// mail = new de.qaware.mail.impl.MailSenderImpl();

Declaration of a service in the META-INF directory is no longer necessary. Direct use via Reflection is still forbidden, and will be signaled by a runtime error. Likewise the implementation class is of course private and cannot be used directly.

The module path and automatic modules

Java 9 supports the declaration of new modules at runtime. For reasons of backward compatibility, a new loading mechanism has been introduced for module JARs: the module path. Just like with the class path, JARs and/or entire directories can be declared from which modules are loaded. For JARs in the module path with no module descriptor, a default descriptor will automatically be generated. This descriptor exports everything and adds the module as a dependency to all other modules. Such a module is called an "automatic module". This approach guarantees coexistence between Jigsaw modules and normal JARs. Both can even be stored in the same directory:

# run
java -mp mlib -m MailClient

Building, packaging and executing modules

With one single command all modules of an application can be compiled and neatly stored in an output folder.

# compile
javac -d build -modulesourcepath src $(find src -name "*.java")

This command compiles the modules under the root path src and saves the generated classes in an identical directory structure in the ./build path. The contents of the ./build directory can now be packed into separate JAR files. Declaration of the start class (--main-class) is optional:

# pack
jar --create \
    --file mlib/WebMail@1.0.jar \
    --module-version 1.0 \
    -C build/WebMail .

jar --create \
    --file mlib/MailAPI@1.0.jar \
    -C build/MailAPI .

jar --create \
    --file mlib/MailClient@1.0.jar \
    --module-version 1.0 \
    --main-class de.qaware.mail.client.MailClient \
    -C build/MailClient .

Three modules are now in the mlib output directory. The JVM is able to start the application when this path is given as a module path:

# run
java -mp mlib -m MailClient

Delivering modular applications

In the past, in order to deliver a runnable application, the complete Java Runtime (JRE) had to be included. Start scripts defining the class path were necessary for the application itself, in order to be able to start it correctly with its dependent libraries. The JRE always delivered the full Java functionality, even if only a small part of it was effectively needed. Now there is the jlink command in Java 9, which allows building applications linked only with the necessary parts of the JDK. Only required modules are included, minimizing the Java runtime environment. If, for example, an application uses no CORBA, no CORBA support would be included.

# link
jlink --modulepath $JAVA_HOME/jmods:mlib --addmods MailClient,Mail \
    --output mailclient

The application can now be started with a single script. Knowledge of modules and their dependencies is not necessary. The output directory generated by jlink looks like this:

 |——  bin
 |     |——  MailClient
 |     |——  java
 |     |——  keytool
 |——  conf
 |     |——
 |     |——  security
 |     |——  java.policy
 |     |——
 |——  lib

The directory tree shows the complete minimal runtime environment of the application. In the bin directory, you can find the generated start script with which the application can be started without any parameters. All utilized modules are automatically packed into one file. The application can now be started by calling up the MailClient start script in the bin directory.

cd mailclient/bin
Sending  mail to:  message:  A message  from JavaModule  System


The team around Mark Reinhold at Oracle has done an excellent job. Using Jigsaw, modular software systems can be developed solely on the basis of built-in Java resources. The impact on existing tools and development environments is significant. Therefore it requires some effort to make popular development environments and build systems Jigsaw-compliant. But this will happen before long because Jigsaw is part and parcel of Java 9. Immature tool support, as was the case with OSGi, probably belongs to the past. Jigsaw does not relieve us of the task of designing, implementing and testing sound modules, and is therefore no panacea against monolithic, poorly maintainable software. But Jigsaw makes good software design easier and reachable for anybody.

Links and Literature

  • [JSR376] Java Specification Request 376: Java Platform Module System
  • [Par72] D. L. Parnas, On the Criteria To Be Used in Decomposing Systems into Modules, in: CACM, December 1972

Aug 19, 2016

Analyzing performance issues in a large scale system with ELK

Application overview

I’m working on a large project here at QAware. Besides a lot of source code, we have our project running on a large number of servers. 
The following figure gives you a brief overview. The rectangles are servers and the strange A* items are the best magnifying glasses I can paint by myself (and they represent analysing points).

All servers exist at least as pairs (up to larger clusters). Ignoring the relational database servers this leaves us at about 54 servers in production.
The customer has a separate operations team running the application, but in the following case we were asked for help.

Problem report

Users of a 3rd party application using our services reported performance issues. It was not clear on which occasions this happens or which services are affected.

Unfortunately the operations team does not run performance monitoring.

Analysing the problem

Fortunately we were prepared for these kinds of problems:
  • We have log files on each server and include them centrally in Elasticsearch.
  • We have Kibana running to search and visualize the data.
  • We log performance information on all servers.
  • We can relate log statements from each server to one specific request.
Visualizing this information with charts in Kibana was a huge help to track down the problem. Here are some key figures (I left out some steps).

A1 - Check if we have a problem

I searched for incoming service calls on point A1 (see application overview) and created a pie chart. Each slice represents a duration range for how long the request took.
A1 is our access point for service calls and is therefore the best spot to determine whether services are slow. I chose the pie chart to get a fast overview of all requests and the distribution of their runtime duration.

Only the large green slice represents service calls below 5s duration.

Steps in Kibana:

  • Choose Visualize
  • Choose 'Pie chart'
  • Choose 'From a new search'
  • Enter <Query for all Service on A1>
  • Under 'buckets' click on 'Split slices'
  • Set up slices as follows
    • Aggregation: Histogram
    • Field: <duration in ms field>
    • Interval: 5000
  • Press Play

We clearly had a problem. There are complex business services involved, but a response time above 5s is unacceptable. In the analysed time slot 20% of the service calls took longer!
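To make the bucketing behind the pie chart concrete, here is a minimal sketch in plain Java of what a histogram with a fixed interval does to a list of durations. It is not how Kibana or Elasticsearch implement it; the class name DurationHistogram, both method names and the sample values in the usage note are assumptions made for illustration only.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class DurationHistogram {

    // Groups durations (in ms) into fixed-width buckets keyed by the bucket's
    // lower bound, e.g. interval 5000 -> buckets 0, 5000, 10000, ...
    static Map<Long, Long> bucketize(List<Long> durationsMs, long intervalMs) {
        Map<Long, Long> buckets = new TreeMap<>();
        for (long duration : durationsMs) {
            long lowerBound = (duration / intervalMs) * intervalMs;
            buckets.merge(lowerBound, 1L, Long::sum);
        }
        return buckets;
    }

    // Share (0.0 to 1.0) of requests that took thresholdMs or longer.
    static double shareAtOrAbove(List<Long> durationsMs, long thresholdMs) {
        if (durationsMs.isEmpty()) {
            return 0.0;
        }
        long slow = durationsMs.stream().filter(d -> d >= thresholdMs).count();
        return (double) slow / durationsMs.size();
    }
}
```

With the made-up durations 120, 4999, 5000, 7300 and 12000 ms and an interval of 5000, the buckets 0, 5000 and 10000 hold 2, 2 and 1 requests. The first bucket corresponds to the "below 5s" slice of the chart.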

A2 - Show the search performance

I chose the most basic search (more or less an ID lookup) which is performed inside the application (see point A2 in the application overview) and created a line chart for the request time.
By choosing a point between application server and database, I basically split the application in half and checked where the time was lost.

This time I used a line chart with date histogram, to show if there is any relation between slow service calls and the time of the day.

Steps in Kibana:
  • Choose Visualize
  • Choose 'Line chart'
  • Choose 'From a new search'
  • Enter <Query for the basic search on A2>
  • Set up 'metrics' -> 'Y-Axis' as follows
    • Aggregation: Average
    • Field: <duration field>
  • Under 'buckets' click on 'X-Axis'
  • Set up X-Axis as follows
    • Aggregation: Date Histogram
    • Field: @timestamp
    • Interval: Auto
  • Press Play

As you can see, the duration skyrockets during certain hours, and the graph looked the same on every working day. Conclusion: There is a load problem.
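The same question ("does the average duration depend on the time of day?") can also be asked of raw log entries in code. The following is a simplified sketch, not the Kibana date histogram itself: it groups by hour of day in UTC instead of by consecutive time intervals. The Entry type and its fields are hypothetical stand-ins for whatever your log records look like.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class HourlyAverage {

    // Hypothetical log entry: a timestamp plus the request duration in ms.
    static final class Entry {
        final Instant timestamp;
        final long durationMs;

        Entry(Instant timestamp, long durationMs) {
            this.timestamp = timestamp;
            this.durationMs = durationMs;
        }
    }

    // Average duration per hour of day (0..23, UTC), sorted by hour.
    static Map<Integer, Double> averageByHour(List<Entry> entries) {
        return entries.stream().collect(Collectors.groupingBy(
                e -> e.timestamp.atZone(ZoneOffset.UTC).getHour(),
                TreeMap::new,
                Collectors.averagingLong(e -> e.durationMs)));
    }
}
```

An hour whose average is far above the others is the code equivalent of a peak in the line chart.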

A3 – Check the search performance of different SOLRs

I made another visualization for the different SOLRs we run. We have one for each language. I basically took the line chart from A2 and added a sub bucket. This way you can split up the graph by a new dimension (in our case the language) and see if it is related to the problem.

Steps in Kibana:

  • Choose Visualize
  • Choose 'Line chart'
  • Choose 'From a new search'
  • Enter <Query for all searches on A3> 
  • Set up 'metrics' -> 'Y-Axis' as follows
    • Aggregation: Average
    • Field: <duration field>
  • Under 'buckets' click on 'X-Axis'
  • Set up X-Axis as follows
    • Aggregation: Date Histogram
    • Field: @timestamp
    • Interval: Auto
  • Click on 'Add sub-buckets'
  • Click on 'Split Lines'
  • Set up 'Split Lines' as follows
    • Aggregation: Terms
    • Field: <language field>
    • Top: 20 (in our case)
  • Press Play

We could see the load problem equally distributed among all languages. That made no sense, because we have minor languages that never get much load. A quick look at some query times in the SOLRs confirmed this: the queries themselves were fast.


We knew it was a load problem and that it was not a problem of the SOLRs or the application itself. The possible bottlenecks left were the Apache reverse proxy or the network itself. Neither of them would have been my initial guess.

Shortly afterwards we helped the operations team to track down a misconfigured SOLR reverse proxy. It used file caching on a network device!


  • Visualizing the data was a crucial help for us to locate the problem. If you only look at a list of independent log entries in text form, it is much harder to draw the correct conclusions.
  • Use different charts depending on the question you want to answer.
  • Use visual log analysing tools like Kibana (ELK stack). You can use them for free and they can definitely help a lot.

Jun 15, 2016

Locking alternatives in Java 8


To provide synchronized data cache access, I discuss three alternatives in Java 8: synchronized() blocks, ReadWriteLock and StampedLock (new in Java 8). I show code snippets and compare the performance impact on a real world application.

The use case

Consider the following use case: A data cache that holds key-value pairs and needs to be accessed by several threads concurrently.

One option is to use a synchronized container like ConcurrentHashMap or Collections.synchronizedMap(map). Those have their own considerations, but will not be handled in this article.

In our use case, we want to store arbitrary objects in the cache and retrieve them by Integer keys in the range of 0..n. As memory usage and performance are critical in our application, we decided to use a good old array instead of a more sophisticated container like a Map.
A naive implementation allowing multi-threaded access without any synchronization can cause subtle, hard to find data inconsistencies:
  • Memory visibility: Threads may see the array in different states (see explanation).
  • Race conditions: Writing at the same time may cause one thread's change to be lost (see explanation)
Thus, we need to provide some form of synchronization.

To fix the problem of memory visibility, Java's volatile keyword seems to be the perfect fit. However, making an array volatile does not have the desired effect: it only applies to reads and writes of the array variable (the reference), not to accesses to the array's elements.

In case the array's payload is Integer or Long values, you might consider AtomicIntegerArray or AtomicLongArray. But in our case, we want to support arbitrary values, i.e. Objects.

Traditionally, there are two ways in Java to do synchronization: synchronized() blocks and ReadWriteLock. Java 8 provides another alternative called StampedLock. There are probably more exotic ways, but I will focus on these three, which are relatively easy to implement and well understood.

For each approach, I will provide a short explanation and a code snippet for the cache's read and write methods.


synchronized is a Java keyword that can be used to restrict the execution of code blocks or methods to one thread at a time. Using synchronized is straightforward - just make sure not to miss any code that needs to be synchronized. The downside is that you can't differentiate between read and write access (the other two alternatives can). If one thread enters the synchronized block, everyone else is blocked. On the upside, as a core language feature, it is well optimized in the JVM.

public class Cache {
  private Object[] data;
  private final Object lock = new Object();

  public Object read(int key) {
    synchronized (lock) {
      if (data.length <= key) {
        return null;
      }
      return data[key];
    }
  }

  public void write(int key, Object value) {
    synchronized (lock) {
      ensureRange(key); // enlarges the array if necessary
      data[key] = value;
    }
  }
}


ReadWriteLock is an interface. If I say ReadWriteLock, I mean its only standard library implementation ReentrantReadWriteLock. The basic idea is to have two locks: one for write access and one for read access. While writing locks out everyone else (like synchronized), multiple threads may read concurrently. If there are more readers than writers, this leads to fewer threads being blocked and therefore better performance.

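import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class Cache {
  private Object[] data;
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  public Object read(int key) {
    lock.readLock().lock(); // shared: several readers may hold it at once
    try {
      if (data.length <= key) {
        return null;
      }
      return data[key];
    } finally {
      lock.readLock().unlock();
    }
  }

  public void write(int key, Object value) {
    lock.writeLock().lock(); // exclusive: blocks readers and other writers
    try {
      ensureRange(key); // enlarges the array if necessary
      data[key] = value;
    } finally {
      lock.writeLock().unlock();
    }
  }
}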

StampedLock is a new addition in Java 8. It is similar to ReadWriteLock in that it also has separate read and write locks. The methods used to acquire locks return a "stamp" (long value) that represents a lock state. I like to think of the stamp as the "version" of the data in terms of data visibility. This makes a new locking strategy possible: the "optimistic read". An optimistic read means to acquire a stamp (but no actual lock), read without locking and afterwards validate the stamp, i.e. check if it was ok to read without a lock. If we were too optimistic and it turns out someone else wrote in the meantime, the stamp would be invalid. In this case, we have no choice but to acquire a real read lock and read the value again.

Like ReadWriteLock, StampedLock is efficient if there is more read than write access. Not having to acquire and release a lock for every read access can save a lot of overhead. On the other hand, if reading is expensive, reading twice from time to time may also hurt.

import java.util.concurrent.locks.StampedLock;

public class Cache {
  private Object[] data;
  private final StampedLock lock = new StampedLock();

  public Object read(int key) {
    long stamp = lock.tryOptimisticRead();

    // Read the value optimistically (may be outdated).
    Object value = null;
    if (data.length > key) {
      value = data[key];
    }

    // Validate the stamp - if it is outdated,
    // acquire a read lock and read the value again.
    if (lock.validate(stamp)) {
      return value;
    } else {
      stamp = lock.readLock();
      try {
        if (data.length <= key) {
          return null;
        }
        return data[key];
      } finally {
        lock.unlockRead(stamp);
      }
    }
  }

  public void write(int key, Object value) {
    long stamp = lock.writeLock();
    try {
      ensureRange(key); // enlarges the array if necessary
      data[key] = value;
    } finally {
      lock.unlockWrite(stamp);
    }
  }
}

All three alternatives are valid choices for our cache use case, because we expect more reads than writes. To find out which is best, I ran a benchmark with our application. The test machine has an Intel Core i7-5820K CPU with 6 physical cores (12 logical cores with hyper-threading). Our application spawns 12 threads that access the cache concurrently. The application is a "loader" that imports data from a database, makes calculations and stores the results into a database. The cache is not under stress 100% of the time. However, it is vital enough to show a significant impact on the application's overall runtime.

As benchmark I executed our application with reduced data. To get a good average, I ran each locking strategy three times. Here are the results:

In our use case, StampedLock provides the best performance. While a 15% difference to synchronized and a 24% difference to ReadWriteLock may not seem like much, it is relevant enough to decide whether we make the nightly batch time frame or not (using full data). I want to stress that this by no means implies that StampedLock is *the* best option in all cases. Here is a good article that has more detailed benchmarks for different reader/writer and thread combinations. Nevertheless, I believe measuring the actual application is the best approach.
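If running the full application is not an option, a crude timing harness can at least give a first impression. The sketch below is not the benchmark used for the numbers in this article; the class LockBenchmark and its methods are hypothetical. It starts a number of threads at the same moment, lets each of them run an operation repeatedly, and averages the wall-clock time over several runs. It ignores JIT warm-up and other measurement pitfalls, so a dedicated tool such as JMH is preferable for serious microbenchmarks.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class LockBenchmark {

    // Runs 'op' opsPerThread times on each of 'threads' threads and
    // returns the elapsed wall-clock time in nanoseconds.
    static long runOnceNanos(int threads, int opsPerThread, Runnable op) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await(); // all workers begin at the same time
                    for (int i = 0; i < opsPerThread; i++) {
                        op.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            long begin = System.nanoTime();
            start.countDown();
            done.await();
            return System.nanoTime() - begin;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        } finally {
            pool.shutdown();
        }
    }

    // Average elapsed time over several runs, e.g. three as in the article.
    static double averageNanos(int runs, int threads, int opsPerThread, Runnable op) {
        long total = 0;
        for (int r = 0; r < runs; r++) {
            total += runOnceNanos(threads, opsPerThread, op);
        }
        return (double) total / runs;
    }
}
```

A call such as averageNanos(3, 12, 100000, () -> cache.read(42)) would time concurrent reads against one cache variant, with cache being an instance of the implementation under test. Comparing the averages for the three variants only makes sense if the read/write mix resembles the real workload.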


In Java 8, there are at least three good alternatives to handle locking in a concurrent read-write scenario: synchronized, ReadWriteLock and StampedLock. Depending on the use case, the choice can make a substantial performance difference. As all three variants are quite simple to implement, it is good practice to measure and compare their performance.

Apr 23, 2016

How to use Docker within intelliJ

A short tutorial on how to use Docker within intelliJ and with a little help from Gradle. You can find the sample code & the full description here:


  • install intelliJ Docker plugin (
  • check that there is a default Docker Machine: docker-machine ls. If there is no default machine create one: docker-machine create --driver virtualbox default.
  • start the default Docker Machine: docker-machine start default.
  • bind the env vars to the shell: eval "$(docker-machine env default)"
  • check if everything is correct: docker ps

Using Docker within intelliJ

1) Setup Docker cloud provider in intelliJ global preferences as shown below.

Tip: You can get the API URL by executing docker-machine ls and using the shown IP & port for the default machine.

2) Check connection to Docker daemon in intelliJ "Docker" tab

3) Create new project from version control using github (

4) Create a new run configuration to deploy application to Docker as shown on the following screenshots:

  • Be sure not to forget to add the Gradle tasks as a "before launch" action as shown at the very bottom of the screenshot.
  • The same is also possible for Docker Compose files. Just point the "Deployment" dropdown to the Docker Compose file.

5) Run the configuration and inspect the Docker container. A browser will open automagically and point to the REST endpoint. Within intelliJ you can access the container's console output, environment variables, port bindings etc.


Mar 18, 2016

Building a Solr-, Spark-, Zookeeper-Cloud with Intel NUC PCs

Part 1 - Hardware

If you work with cluster, grid or cloud technologies like Mesos, Spark, Hadoop, Solr Cloud or Kubernetes as a developer, architect or technical expert, you need your own private datacenter for testing and developing. To test real-world scenarios like a failsafe and resilient Zookeeper cluster or a clustered Spark/Hadoop installation, you should have at least three independent machines. For the installation of Mesos/DCOS, a minimal setup of five machines is recommended.
There are several ways to build such an environment, each with its own drawbacks:

1) A virtualized environment running on a workstation laptop or PC

You can easily create a bunch of virtual machines and run them on a desktop or workstation. This approach works fine, is fast and cheap, but has some problems:
  1. Your laptop may have only 16 gigabytes of RAM, so each VM gets only 2-3 gigabytes. For frameworks like Apache Spark, which use caching heavily, this does not work well.
  2. The performance of a virtualized environment is not predictable. The problem is that some resources like disk, network or memory access are shared between all VMs. So even if you have a workstation with an octa-core Intel Xeon processor, IO will behave differently.

2) A cloud environment like the AWS EC2

This is the way most people work with these technologies, but it also has some specific disadvantages. If you experience a performance problem, you will likely not be able to analyze the details. Cluster software is normally very sensitive to network latency and network performance. Since AWS can't guarantee that all your machines are in the same rack, the performance between some nodes can differ.

3) A datacenter with real hardware

You can build your own cluster, but it is normally far too expensive. Even if you can afford real server hardware, this solution is not portable. In most enterprises, you will not be allowed to run such a cluster. For testing and development, it is much better to have your own private cluster that you can carry around like your own laptop.

So what is a feasible solution?

I decided to build my own 4 node cluster on Intel NUC mini PCs. Here are the technical facts:
  • NUC 6th Generation - Skylake
  • Intel Core i5 - Dual Core with Hyper-threading
  • 32 GB DDR4 RAM
  • 256 GB Samsung M.2 SSD
  • Gigabit Ethernet
The Intel NUC has to be equipped with RAM and an M.2 SSD. All these parts have to be ordered separately.

This gives you a cluster with amazing capabilities
  • 16 Hyper Threading Units (8 Cores)
  • 128 GB DDR4 RAM
  • 1 TB Solid State Disk
Since I needed a portable solution, everything had to fit into a normal business case. I found a very slim aluminium attaché case at Amazon with the right dimensions to hold the NUC PCs and the network switch.

I decided to include a monitor and a keyboard to get direct access to the first node in the cluster. The monitor is used for visualization and monitoring when the software runs. I ordered a Gechic HDMI monitor which has the right dimensions to include the monitor in front of the case.

The NUC package includes screws for mounting. They also work in such a case if you drill a small hole for each screw. For the internal wiring, you have to use flexible network cables; otherwise you will run into problems with the wiring. You also need a little skill to mount the connectors for power and network in the case, but with a little patience it works.

You can see the final result here:

This case will be my companion for the next year at all conferences, fairs and even in my office. The perfect presenter for any cluster / cloud technology.

In the next part I will describe how to install a DCOS, Solr/Spark/Zeppelin cloud and what you can do on top of such hardware.

Have fun. 

Johannes Weigend