Nov 15, 2019

A Field Report From the O’Reilly Software Architecture in Berlin

by Susanne Apel, Stephan Klapproth, Andreas Zitzelsberger

Last week, we were at the O’Reilly Software Architecture in Berlin. Apart from showing off our toys at our booth and chatting with the nice people at the conference, we attended the talks, and Leander Reimer presented a hitchhiker's guide to cloud native API gateways. In this article we share our key learnings and takeaways with you. Slides or videos are linked where available.

Photo of our booth at the O'Reilly Software Architecture

Cognitive Biases in the Architect's Life - Birgitta Böckeler

[by Susanne Apel] [Video]

I was impressed by how honestly Birgitta spoke about past communications and decisions. She started her presentation by talking openly about how she feels about getting the feedback to be ‘even more confident’ and about her immediate impulse to explain herself. This keynote was a profound answer that will hopefully get many minds contemplating cognitive biases and confidence as a concept.
I was happy to see the topic of cognitive biases in the tech world underlined with good examples. To give one: a past decision to use framework X should not be judged while disregarding the outcome bias. You do not know the future of framework X in advance (Reflux in this case). You should be aware of the source of a positive outcome: was it the decision making, or was it luck?
Birgitta is very much aware of this distinction and encourages all of us to be as well. This will lead to the point where we make fewer absolute and more relative statements.

The 3-headed dog: Architecture, Process, Structure - Allen Holub

[by Susanne Apel] [Video]

In addition to the three heads in the talk’s title, Allen also mentioned culture and physical work environment.
In agile teams, the real goal is to quickly add value to the product - and speed in gaining value can only be achieved by speeding up feedback cycles.
The teams should very much be autonomous and empowered to make decisions.
From my point of view, these are the underlying principles of agile software development, regardless of the particular framework used.
Allen points out the role of teams and offices and the real meaning of an MVP - a small solution that can be enlarged (as opposed to a throw-away product), demonstrated with impressive images of actual houses built this way. He emphasizes that if you want to change one head of the now five-headed dog, you also have to change all of the other heads.

A CRDT Primer - John Mumm

[by Susanne Apel]

John explained conflict-free replicated data types (CRDT) with a clear motivation and a nice mathematical introduction providing an intuitive grasp of the topic.

From a computer science point of view, the talk seemed very mathematical; from a mathematical point of view, it gave plausible explanations while leaving out the more subtle parts of definitions and proofs. The intuition is sufficient; the validity is proven elsewhere.

John motivated the issue with a Twitter-like application where the number of likes is to be maintained 'correctly'. This is not a trivial task for a large-scale application with many instances.
For the Twitter likes, assume that you cannot unlike a post after you have liked it. This leads to the following implementation:
Each node maintains a local copy of the like counts of every node in the cluster. When the number of likes is requested, the node sums up these counts. If a user likes a tweet, the node 'n' answering the like request increases its own counter of likes. When there 'is time', node 'n' broadcasts (gossips) the new local copy of its cluster view. The other nodes compare it with their own, see a higher number of 'n'-likes and incorporate this number into their own local copy. To be more precise, the node broadcasts its own internal state of all nodes, which makes the broadcasting more efficient; the principle of distribution just explained stays the same. The nice thing is that the broadcasting works very smoothly and you do not have to think about the order of events. A user might see old data, but there will be eventual consistency. And their own interactions are always reflected immediately.

Mathematics confirms that this works, also with data types other than counters - given that they fulfill the required mathematical relations. Roughly speaking, the relations can be put as follows: you need to define lookup, update, merge and compare methods (or variations thereof; the CRDT Wikipedia page provides a good explanation).
If all of these functions together fulfill certain rules (the states form a monotonic join semi-lattice and the compare function is a partial order), you get eventual convergence of the lookup value of the data type. Broadcasting is part of the very concept of CRDTs. The CRDTs provide the framework for the actual operations to be executed within the cluster.
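To make this concrete, here is a minimal sketch of the like counter described above as a state-based grow-only counter (G-Counter). The class and method names are my own illustration, not from the talk:

import java.util.HashMap;
import java.util.Map;

// A state-based grow-only counter (G-Counter). Each node increments
// only its own slot; merging takes the element-wise maximum, so merges
// are commutative, associative and idempotent - the monotonic join
// semi-lattice described above.
public class GCounter {
    private final String nodeId;
    private final Map<String, Long> counts = new HashMap<>();

    public GCounter(String nodeId) {
        this.nodeId = nodeId;
    }

    // update: a like handled by this node increases its own entry only
    public void increment() {
        counts.merge(nodeId, 1L, Long::sum);
    }

    // lookup: the total number of likes is the sum over all nodes
    public long value() {
        return counts.values().stream().mapToLong(Long::longValue).sum();
    }

    // merge: incorporate a gossiped state by taking the per-node maximum
    public void merge(Map<String, Long> gossipedCounts) {
        gossipedCounts.forEach((node, count) -> counts.merge(node, count, Long::max));
    }

    // the state this node broadcasts (gossips) to its peers
    public Map<String, Long> state() {
        return Map.copyOf(counts);
    }
}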

The rise and fall of microservices - Mark Richards

[by Stephan Klapproth] [Presentation]

Mark talked about how these days microservices are everywhere. DDD, continuous delivery, cloud environments, and agility in business and technology were some of the drivers of the rise of microservices. Unfortunately, projects that introduce microservice architectures often struggle with the complexity and frequently fall behind their project plans and exceed their budgets.

So before jumping on the bandwagon, you have to be aware of the challenges such a highly distributed architecture comes with. In his talk Mark outlined several pitfalls and gave some best practices to stop the decline and fall of microservices.


How do we take architectural decisions in eBay Classifieds Group - Engin Yöyen

[by Stephan Klapproth]

In his talk, Engin presented different approaches to coping with the challenges of a widely distributed team of hundreds of developers, which forced him to rethink the classical role model of a software architect.
Consensual high-level architectures, empowering the engineers to lead, architects as supporting enablers (versus architects as governance), and techniques like delegation levels and architecture decision records ensured the success of the project at eBay Classifieds Group.

Reactive domain-driven design: From implicit blocking to explicit concurrency - Vaughn Vernon

[by Andreas Zitzelsberger]

Vaughn Vernon took us on an elaborate journey to a reactive domain-driven world. I had two key takeaways:
1. An anemic model, that is, a model consisting only of data types, is not sufficient for a reactive domain-driven world. Instead, state-changing actions should be properly defined: for instance, provide a method Person.changeAddress instead of Person.setStreet, Person.setCity, … Vaughn pressed the point that this is a necessity for effective reactivity (see the sketch below the list).
2. When migrating to reactive microservices, the strangler pattern is an effective approach. Vaughn pointed out two tools that can help to enable reactivity with the strangler approach: Debezium, which turns database changes into events, and Oracle GoldenGate.
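To illustrate the first point, here is a minimal sketch of such an intention-revealing command; the types and names are hypothetical, not from Vaughn's talk:

// Illustrative types; Address and AddressChanged are assumptions.
record Address(String street, String city) {}
record AddressChanged(String personId, Address newAddress) {}

class Person {
    private final String id;
    private Address address;

    Person(String id, Address address) {
        this.id = id;
        this.address = address;
    }

    // One explicit state-changing action instead of setStreet/setCity:
    // the intent is captured and can be published as a domain event,
    // which is what a reactive model needs.
    AddressChanged changeAddress(Address newAddress) {
        this.address = newAddress;
        return new AddressChanged(id, newAddress);
    }
}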

Nov 11, 2019

DevOps Enterprise Summit 2019

by Michael Rohleder (Division Manager at QAware)

The IT transformation towards DevOps currently occupies a great many companies around the world, among them our customers BMW, Deutsche Telekom and Allianz. Reason enough to see at the DevOps Enterprise Summit which trends are occupying the DevOps community and how companies are faring in their transformation. The host of the conference is Gene Kim, founder of IT Revolution. He is known as the author of several successful books, e.g. The Phoenix Project, The DevOps Handbook and Accelerate. His brand-new book The Unicorn Project was heavily promoted at the conference - just the way you know it from the USA.

This article summarizes my personal impressions from attending the DevOps Enterprise Summit in Las Vegas and provides links to talks and further material.


Experience Reports


The three days of the conference were shaped by many vivid and impressive experience reports from IT initiatives of large enterprises. CSG showed how, over several years, they modernized their outdated mainframe IT landscape and made it fit for the future with DevOps practices. Walmart showed how they implemented their critical and difficult use case of checking article availability across their broad system landscape; the foundation was a switch from a synchronous message architecture to an event-oriented message architecture. Many more initiatives were presented by executives of companies such as Adidas, John Deere, Optum, and many more.

I was particularly pleased by the talk of our customer BMW. Ralf Waltram and Frank Ramsak presented their 100% Agile Journey at BMW. It was wonderful to see how our customer's IT is evolving, and I had the impression that the audience was very impressed by the story as well.


Psychological Safety


Companies' success depends ever more decisively on how safe and comfortable employees feel in the company. This was underlined in many talks as well as in panel discussions with executives. "Psychological safety" is the term that has long been cited as a success factor in the community. It is no surprise that the term is also a topic in Gene Kim's new book "The Unicorn Project", in which he speaks of "The Five Ideals":
  • The First Ideal - Locality and Simplicity
  • The Second Ideal - Focus, Flow, and Joy
  • The Third Ideal - Improvement of Daily Work
  • The Fourth Ideal - Psychological Safety
  • The Fifth Ideal - Customer Focus

Deloitte, too, advertises with Jonathan Smart's slogan "better value sooner safer happier"; in his talk Risk and Control is Dead, Long Live Risk and Control, he impressively explained how important psychological safety is for dealing appropriately with risk in an organization.


Productivity


The Accelerate State of DevOps Report represents the research results and data of more than 31,000 survey participants worldwide. It shows which DevOps practices and methods lead to better software delivery and operational performance (SDO). Dr. Nicole Forsgren, who is responsible for the DevOps report, presented the new results and findings at the summit and showed what distinguishes "DevOps elite performers" from "low performers". The basis for this are exactly four metrics: lead time, deployment frequency, mean time to restore (MTTR), and change fail percentage. Thoughtworks moved these four metrics from "Trial" to "Adopt" on their Technology Radar in April of this year, which amounts to a recommendation to use this technique. The book Accelerate explains the results of the DevOps report and the scientific approach behind it in more detail. The State of DevOps Report can be found at Google, along with helpful descriptions of DevOps practices and methods intended to help the community implement DevOps in their organizations.

Another finding at the conference, for me, was the Open Practice Library, a collection of current DevOps tools and practices created by the community itself.


The “Project to Product” Movement


Moving the IT organization from a project-oriented approach towards product orientation is an increasingly common building block in the IT transformation of many companies. Besides the companies' experience reports, there were also exciting talks on this topic. In his talk Project to Product, Mik Kersten presented his concept of the Flow Framework™: a new approach intended to enable companies to measure the "flow of business value" in the software delivery process in a way that both IT and the business side can understand. Further information can be found in his book Project to Product.

Dominica DeGrandis tried to provide guidance on the question: "Do You Have the Right Teams to Manage Work by Product?". She showed why you need "full stack teams" rather than "full stack engineers", which new roles should be considered when moving to product orientation, and how to deal with the team's skill set.

Dominica is also known as the author of the book Making Work Visible, in which she identifies five time thieves in software development and explains how to eliminate them.


The slides and video recordings of the DevOps Enterprise Summit are, fortunately, publicly available.


Three interesting and exciting days in the USA, where you could feel the spirit and the enormous progress of the DevOps movement in the talks and panel discussions as well as in conversations with other attendees. To top it off, there was a nice souvenir to take home to Germany: a first edition of "The Unicorn Project" as a gift from host and author Gene Kim, with a personal dedication. Thank you very much!


Mar 11, 2019

How to dispatch flux to worker in Reactor

This post shows how to dispatch a flux of items to services of separated functional domains when using Reactor in Java. The author encountered this problem while developing a larger reactive application, where a strict separation of the application's domains is key to maintaining a clean architecture.
Reactor is a library for developing reactive applications, and its reference guide is a good read to understand the basic principles of reactive programming.
The examples the author found for Reactor, or for other reactive libraries, show how to deal with a flux of items without mentioning how to dispatch and share this flux among separated functional domains. When mapping a flux to one functional domain, there is no obvious way to obtain the original flux in order to map it to another functional domain. In the following, an example will detail the problem and present a solution for the Reactor library.

An example application

This section introduces an example application which will be transformed later to a reactive one. It will dispatch some deletion tasks to independent services, which is a common feature of larger software systems.
A customer is represented by (the usual Java boilerplate such as getters, setters, equals, hashCode and toString is omitted)
public class Customer {
    CustomerId id;
    AccountId account;
    Set<InvoiceId> invoices;
}
A customer has its own account and a set of associated invoices. The classes CustomerId, AccountId and InvoiceId here are simple wrapper classes to uniquely identify the corresponding entities.
A service supposed to delete a set of customers has the interface
public interface CustomerService {
    void deleteCustomers(Set<CustomerId> customerIds);
}
An implementation of CustomerService should take care of deleting the account and the invoices as well.
public class CustomerServiceImpl implements CustomerService {
  @Override
  public void deleteCustomers(Set<CustomerId> customerIds) {
      Set<Customer> deletedCustomers = customerRepository.deleteCustomersByIds(customerIds);
      Set<AccountId> toBeDeletedAccounts = deletedCustomers.stream()
              .map(Customer::getAccount)
              .collect(Collectors.toSet());
      Set<InvoiceId> toBeDeletedInvoices = deletedCustomers.stream()
              .flatMap(customer -> customer.getInvoices().stream())
              .collect(Collectors.toSet());
      accountService.deleteAccounts(toBeDeletedAccounts);
      invoiceService.deleteInvoices(toBeDeletedInvoices);
  }
}
The deletion of the customers itself is delegated to an underlying customerRepository, which returns a collection of the deleted customers for further processing (this "find and delete" pattern is common for NoSQL databases, such as MongoDB).
Furthermore, the deletion of the associated accounts and invoices are delegated to the respective accountService and invoiceService, which have the following interface:
public interface AccountService {
    void deleteAccounts(Set<AccountId> accountIds);
}

public interface InvoiceService {
    void deleteInvoices(Set<InvoiceId> invoiceIds);
}
Note that this example application has clearly separated domains, which are the customers, the invoices and the accounts.

Reactive interfaces

Turning the service interfaces into reactive services is straightforward:
public interface ReactiveAccountService {
    Mono<Void> deleteAccounts(Flux<AccountId> accountIds);
}

public interface ReactiveInvoiceService {
    Mono<Void> deleteInvoices(Flux<InvoiceId> invoiceIds);
}

public interface ReactiveCustomerRepository {
    Flux<Customer> deleteCustomersByIds(Set<CustomerId> customerIds);
}

public interface ReactiveCustomerService {
    Mono<Void> deleteCustomers(Set<CustomerId> customerIds);
}
Note that returning a Mono<Void> is the reactive way of telling the caller that the requested operation has completed (with or without errors). Also note that the input to the ReactiveCustomerRepository stays non-reactive, as we want to focus on the reactive implementation of the CustomerService in combination with ReactiveAccountService and ReactiveInvoiceService.

Reactive implementation

A first attempt

A first attempt to implement CustomerService reactively could lead to the following code
@Override
public Mono<Void> deleteCustomers(Set<CustomerId> customerIds) {
    Flux<Customer> deletedCustomers = reactiveCustomerRepository.deleteCustomersByIds(customerIds);

    Flux<AccountId> toBeDeletedAccounts = deletedCustomers
            .map(Customer::getAccount);
    Mono<Void> accountsDeleted = reactiveAccountService.deleteAccounts(toBeDeletedAccounts);

    Flux<InvoiceId> toBeDeletedInvoices = deletedCustomers
            .flatMap(customer -> Flux.fromIterable(customer.getInvoices()));
    Mono<Void> invoicesDeleted = reactiveInvoiceService.deleteInvoices(toBeDeletedInvoices);

    return Flux.merge(accountsDeleted, invoicesDeleted).then();
}
However, when using the following dummy implementation for the reactiveCustomerRepository,
@Override
public Flux<Customer> deleteCustomersByIds(Set<CustomerId> customerIds) {
    Flux<Integer> generatedNumbers = Flux.generate(
            () -> 0,
            (state, sink) -> {
                System.out.println("Generating " + state);
                sink.next(state);
                if (state == customerIds.size() - 1)
                    sink.complete();
                return state + 1;
            });
    return generatedNumbers
            .doOnSubscribe(subscription -> {
                System.out.println("Subscribed to repository source");
            })
            .map(i -> {
                CustomerId id = new CustomerId("Customer " + i);
                return createDummyCustomerFromId(id);
            });
}
the following output is obtained:
Subscribed to repository source
Generating 0
Deleting account AccountId[id=Account CustomerId[id=Customer 0]]
Generating 1
Deleting account AccountId[id=Account CustomerId[id=Customer 1]]
Generating 2
Deleting account AccountId[id=Account CustomerId[id=Customer 2]]
Subscribed to repository source
Generating 0
Deleting invoice InvoiceId[id=Invoice CustomerId[id=Customer 0]]
Generating 1
Deleting invoice InvoiceId[id=Invoice CustomerId[id=Customer 1]]
Generating 2
Deleting invoice InvoiceId[id=Invoice CustomerId[id=Customer 2]]
This might be surprising, as the reactiveCustomerRepository is requested twice to generate the customers. If the repository weren't a dummy implementation here, the account deletion would have consumed all those deletedCustomers, and the subsequent invoice deletion would have operated on a completed stream (meaning it would do nothing at all). This is certainly undesired behavior.

Handling multiple subscribers

The reference documentation has an answer to this problem: Broadcasting to multiple subscribers with .publish(). The failing attempt should thus be modified as follows:
@Override
public Mono<Void> deleteCustomers(Set<CustomerId> customerIds) {
    Flux<Customer> deletedCustomers = reactiveCustomerRepository.deleteCustomersByIds(customerIds);

    deletedCustomers = deletedCustomers.publish().autoConnect(2);
    Flux<AccountId> toBeDeletedAccounts = deletedCustomers
            .map(Customer::getAccount);
    Mono<Void> accountsDeleted = reactiveAccountService.deleteAccounts(toBeDeletedAccounts);
    deletedCustomers = Flux.merge(deletedCustomers, accountsDeleted).map(customer -> (Customer)customer);

    deletedCustomers = deletedCustomers.publish().autoConnect(2);
    Flux<InvoiceId> toBeDeletedInvoices = deletedCustomers
            .flatMap(customer -> Flux.fromIterable(customer.getInvoices()));
    Mono<Void> invoicesDeleted = reactiveInvoiceService.deleteInvoices(toBeDeletedInvoices);
    deletedCustomers = Flux.merge(deletedCustomers, invoicesDeleted).map(customer -> (Customer)customer);

    return deletedCustomers.then();
}
As .autoConnect(2) is used, the subscription to the repository publisher only happens once two subscriptions have happened downstream. This requires the reactiveAccountService and reactiveInvoiceService to return a Mono<Void> which completes once the given input flux is consumed completely; this provides one subscription. The second subscription is achieved by merging the output together with the original input flux.
The output is then as expected
Subscribed to repository source
Generating 0
Deleting invoice InvoiceId[id=Invoice CustomerId[id=Customer 0]]
Deleting account AccountId[id=Account CustomerId[id=Customer 0]]
Generating 1
Deleting invoice InvoiceId[id=Invoice CustomerId[id=Customer 1]]
Deleting account AccountId[id=Account CustomerId[id=Customer 1]]
Generating 2
Deleting invoice InvoiceId[id=Invoice CustomerId[id=Customer 2]]
Deleting account AccountId[id=Account CustomerId[id=Customer 2]]
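To see the connection behavior of .publish().autoConnect(2) in isolation, consider this minimal, self-contained snippet (an illustration, not from the original example):

Flux<Integer> source = Flux.range(1, 3)
        .doOnSubscribe(s -> System.out.println("Upstream subscribed"));
Flux<Integer> shared = source.publish().autoConnect(2);

shared.subscribe(i -> System.out.println("A got " + i));
// Nothing happens yet: the upstream waits for the second subscriber.
shared.subscribe(i -> System.out.println("B got " + i));
// Only now is "Upstream subscribed" printed, exactly once,
// and A and B each receive the values 1 to 3.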
At this point, the reactiveAccountService and reactiveInvoiceService could now also decide to .buffer their own given flux if they wanted to delete the given accounts or invoices in batch. Each implementation is free to choose a different buffer (or batch) size on its own. This is an advantage over the non-reactive implementation, where all items have been collected in one large list beforehand and are then given in bulk to the accountService and invoiceService.
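A buffering implementation of ReactiveAccountService might look like the following sketch; the batch size and the bulk-delete helper are assumptions for illustration:

import java.util.List;

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class BatchingReactiveAccountService implements ReactiveAccountService {

    private static final int BATCH_SIZE = 100;

    @Override
    public Mono<Void> deleteAccounts(Flux<AccountId> accountIds) {
        return accountIds
                .buffer(BATCH_SIZE)           // collect up to BATCH_SIZE ids per emission
                .concatMap(this::deleteBatch) // one bulk deletion per batch, in order
                .then();                      // completes once the input flux is consumed
    }

    private Mono<Void> deleteBatch(List<AccountId> batch) {
        // hypothetical bulk deletion against the underlying store
        System.out.println("Deleting accounts in batch: " + batch);
        return Mono.empty();
    }
}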

Introducing a utility method

The above working solution has already been written such that a generic utility method can be extracted
public class ReactiveUtil {
    private ReactiveUtil() {
        // static methods only
    }

    public static <T> Flux<T> dispatchToWorker(Flux<T> input, Function<Flux<T>, Mono<Void>> worker) {
        Flux<T> splitFlux = input.publish().autoConnect(2);
        Mono<Void> workerResult = worker.apply(splitFlux);
        return Flux.mergeDelayError(Queues.XS_BUFFER_SIZE, workerResult, splitFlux)
                .map(ReactiveUtil::uncheckedCast);
    }

    @SuppressWarnings("unchecked")
    private static <T> T uncheckedCast(Object o) {
        return (T)o;
    }
}
Instead of Flux.merge, Flux.mergeDelayError is used, which handles the situation better if the worker completes with an error. In this particular use case, it's desired that deletion continues even if one worker fails to do so. The worker is also expected to return a Mono<Void> which completes once the input flux is consumed. The simplest worker function would thus be Flux::then.
The unchecked cast could not be removed, but in this circumstance it should never fail: the merged flux can only contain items of type T, as the Mono<Void> completes without emitting any items at all.
A usage example in a more reactive style of coding would be
return reactiveCustomerRepository.deleteCustomersByIds(customerIds)
        .transform(deletedCustomers -> ReactiveUtil.dispatchToWorker(
                deletedCustomers,
                workerFlux -> {
                    Flux<AccountId> toBeDeletedAccounts = workerFlux
                            .map(Customer::getAccount);
                    return reactiveAccountService.deleteAccounts(toBeDeletedAccounts);
                }
        ))
        .transform(deletedCustomers -> ReactiveUtil.dispatchToWorker(
                deletedCustomers,
                workerFlux -> {
                    Flux<InvoiceId> toBeDeletedInvoices = workerFlux
                            .flatMap(customer -> Flux.fromIterable(customer.getInvoices()));
                    return reactiveInvoiceService.deleteInvoices(toBeDeletedInvoices);
                }
        ))
        .then();
Note the pattern of using .transform together with the utility function. The output is the same as the working example above.

Conclusion

Reactive applications should still follow the overall architecture of larger applications, which are usually split into several components, one per functional domain. This approach clashes with reactive programming, where usually a single stream is mapped with operators, and dispatching work to other services is not easily supported. This post showed a solution, although using the presented utility function in Java is still somewhat clumsy.
In Kotlin, extension functions would make this utility easier to use, avoiding the rather clumsy .transform pattern above.
It also remains open whether there is a better solution to the presented problem. Comments welcome!