Event-Driven Architecture


I already have some hands-on experience with event-driven architecture, but this week I followed an interesting webinar from ThoughtWorks about this specific topic. Here are my main takeaways.

It is nothing new

Events are nothing new. Events are everywhere and life itself is actually just a set of events.

It is valuable to the business

It is a really natural way to describe your business. For example, a customer adds an item, drops an item, and then adds another item to their basket. Storing these events gives great insight into a customer’s behaviour.

It adds structure to the system

Events give you a functional way to design workflows around immutable facts. They provide a mechanism to extend workflows in a loosely coupled way. Events also let you focus on what valuable information your domain has and what information is relevant for other domains.

Event-sourcing

A great example of event-sourcing is Git. Instead of storing only the latest state, Git keeps a history of the changes that were made. You can easily go back to a previous state or browse through all the changes that happened over time.
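
To make this a bit more concrete, here is a rough sketch in PHP of what event sourcing could look like for the basket example above. The Basket class and its event format are made up for illustration; the point is that we persist the events instead of the current contents, and replay them to rebuild the state.

// A rough, hypothetical sketch of an event-sourced basket. Instead of
// persisting the current contents, we persist the events and replay them
// later to rebuild the state.
final class Basket
{
    /** @var string[] */
    private $items = [];

    /** @var array[] Newly recorded events, e.g. ['type' => 'ItemAdded', 'item' => 'book']. */
    private $recordedEvents = [];

    public function addItem(string $item): void
    {
        $this->recordAndApply(['type' => 'ItemAdded', 'item' => $item]);
    }

    public function dropItem(string $item): void
    {
        $this->recordAndApply(['type' => 'ItemDropped', 'item' => $item]);
    }

    /** Rebuild the current state from a previously stored stream of events. */
    public static function replay(array $events): self
    {
        $basket = new self();
        foreach ($events as $event) {
            $basket->apply($event);
        }

        return $basket;
    }

    /** @return array[] The events to append to the event store. */
    public function recordedEvents(): array
    {
        return $this->recordedEvents;
    }

    private function recordAndApply(array $event): void
    {
        $this->recordedEvents[] = $event;
        $this->apply($event);
    }

    private function apply(array $event): void
    {
        if ($event['type'] === 'ItemAdded') {
            $this->items[] = $event['item'];
        }

        if ($event['type'] === 'ItemDropped') {
            $key = array_search($event['item'], $this->items, true);
            if ($key !== false) {
                unset($this->items[$key]);
            }
        }
    }
}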

Event orchestration

Event orchestration is when you have a single service that is telling the other services what to do and when to do it. This is like the conductor in an orchestra who shows the musicians what to do.

Event choreography

Event choreography, on the other hand, is more decentralized. The services communicate with each other and know what they have to do in relation to one another, without any central supervision or instructions.

Event streaming

Event streaming takes it to the next level: you have real-time event streams and processes, plus smaller systems that do things like filtering streams or producing new ones. What an event stream actually is was not really clear after the webinar. But after doing some research I found out that an event stream is about persisting events and can be used to build a view of them. For example, one can have two event streams: one that contains every change made in chronological order and one that provides the latest view of an entity.

Challenges of an Event-Driven Architecture

Every architecture has challenges and working with events requires a little bit of a different mindset.

Handling failure

Things will go wrong. Let’s say you store something in a database and then publish an event. Either of these steps can fail. What can we do in that case? Did subscribers already process our event? And what do we do about missing events?

The first question you should ask yourself is: is it okay to miss the event, or is it crucial enough that we put in the extra work?

One pattern you can use is the outbox pattern. In this pattern you store the outgoing events in the database together with your data, and a separate process reads the database, scans for new entries, and publishes them to a broker in a reliable way.

Some frameworks also offer this out of the box: service A publishes an event while service B is offline, and when service B comes back online, it resumes reading events from where it stopped.
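
To make the outbox pattern a bit more concrete, here is a rough PHP sketch. The outbox table, the queries and the MessageBroker interface are made up for illustration; the idea is that the event row is written in the same database transaction as the business data, and a separate relay process publishes the pending rows to the broker.

// A rough sketch of the outbox pattern. The event row is written in the same
// database transaction as the business data; this relay runs as a separate
// process and publishes pending rows to the broker.
interface MessageBroker
{
    public function publish(string $type, string $payload): void;
}

final class OutboxRelay
{
    /** @var \PDO */
    private $db;

    /** @var MessageBroker */
    private $broker;

    public function __construct(\PDO $db, MessageBroker $broker)
    {
        $this->db = $db;
        $this->broker = $broker;
    }

    /** Called periodically by a worker loop or cron job. */
    public function publishPending(): void
    {
        $rows = $this->db
            ->query('SELECT id, type, payload FROM outbox WHERE published_at IS NULL ORDER BY id')
            ->fetchAll(\PDO::FETCH_ASSOC);

        foreach ($rows as $row) {
            // Publish first, then mark as sent. If the process crashes in between,
            // the event may be published twice, so consumers should be idempotent.
            $this->broker->publish($row['type'], $row['payload']);

            $this->db
                ->prepare('UPDATE outbox SET published_at = NOW() WHERE id = :id')
                ->execute(['id' => $row['id']]);
        }
    }
}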

Observability and debugging

In a complex system, it is crucial to be able to debug the system and have a way to see what’s happening inside. The developers from ThoughtWorks came up with a few options.

  • Have correlation ids on all events.
  • Have standardized log formats.
  • Have a good tool for searching logs with a correlation id.

Versioning

The business changes often, and so does our software. Here are a few things to consider.

  • Have contracts in place. Contracts say what an event looks like.
  • Have good communication between teams, so that you know when things change.
  • Communicate event changes to all consumers.
  • To avoid breaking other systems or forcing them to update, you can also decide to publish the event under an additional schema.

Slicing the work

Generally, we want stories to center around the business value that they add. Because events often require changes in many different services, we could end up with monster user stories.

To avoid this, we can slice the work into smaller parts while keeping the ability to show progress to the stakeholders. For example, we can publish to the event broker before any consumers exist. This gives us the opportunity to share progress and show what the change will look like. The next step is to actually build the consumers.

Legacy components

Event-driven architecture can be a useful tool to refactor legacy systems. Focus on getting the data out of the legacy system. You can do this by reading from the database and publishing events into a new event stream to integrate with the old system.

Commands

Commands tell us that we have to do something. Often people use passive-aggressive events. An example: the teapot is empty. Some subscriber picks up this event and fills the pot. What you actually want to say is: fill up the pot.

Events are just notifications, letting the consumers know something has happened. No one needs to do anything about it: a consumer can react, but the publisher might just be shouting into the wind.

When we are really saying ‘this is the state and this has to happen’, then we are looking at a command. By using passive-aggressive events we lose the choreography between events.
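
To make the difference concrete, here is a tiny, hypothetical PHP example based on the teapot story above. The event only states a fact; the command instructs a handler to do something.

// An event: a statement of fact. Nobody is obliged to react to it.
final class TeapotWasEmptied
{
    /** @var \DateTimeImmutable */
    public $occurredAt;

    public function __construct(\DateTimeImmutable $occurredAt)
    {
        $this->occurredAt = $occurredAt;
    }
}

// A command: an instruction, addressed to exactly one handler.
final class FillTeapot
{
    /** @var string */
    public $teapotId;

    public function __construct(string $teapotId)
    {
        $this->teapotId = $teapotId;
    }
}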

Final note

When deciding whether to use techniques like CQRS or event-sourcing, you do not have to follow them to the letter. It is OK to pick what you need and what works for you, and drop the parts that are not useful.

It was an interesting webinar where I learned some new terms and techniques that can kick-start my further research into topics related to event-driven architecture. For example, just having the right names for event orchestration, choreography, and so on is already steering me in the right direction.

My MobX findings

For one of my React Native projects, I started to use MobX as a state management tool. I have used Redux before, but I have to say that MobX feels a lot simpler in terms of updating and managing state via code.

MobX uses the observer pattern to signal state changes: on every change, it sends a message to every subscriber that is observing the changed observable variables.

Changing the state is as easy as just changing a property on your model. All subscribers of the property will be updated. This reactive model can be used in any JS framework or even in plain JS.

An example store

import { action, observable, runInAction } from 'mobx';

// Game is a model class defined elsewhere in the project.
class GameStore {
  // Observable properties: every observer is notified when they change.
  @observable
  loading: boolean = false;
  @observable
  games: Game[] = [];

  // Actions are the only places where the state gets modified.
  @action.bound
  async addGame(game: Game) {
    runInAction(() => {
      this.games.unshift(game);
    });
  }

  @action.bound
  removeGame(game: Game) {
    runInAction(() => {
      this.games = this.games.filter(g => g !== game);
    });
  }
}

This is the simplified version of my GameStore. It has a few actions and some observable properties.

Best practices for using MobX

While using MobX, I stick to the following best practices:

Use many specific stores

In Redux you are used to having only one store for all your application state. MobX suggests having as many stores as you want. So in my application, I have a GameStore, a NoteStore, a LocationStore, and a UserStore. A store is nothing more than a class holding a reference to the piece of state that you want to manage.

So my GameStore has a simple array of Game models and exposes some actions like addGame() or removeGame() that just add or remove a game from the internal array. MobX takes care of updating all the observers.

Change state only via actions on the model

Because changing state in MobX is so easy, you could be tempted to do it directly in your view components. However, this is generally a bad idea, as it becomes hard to track where in your application the state is changed. A better practice is to define actions on your model. Actions are simply methods on your model that define behaviour. So when I want to change the name of a user, I could just set the name property of my User model in the Profile view. But instead, I define a method called changeName() on my User model and call that. This way we have all the state changes nicely together in our model.

A pattern I am just figuring out is to pass the GameStore to my Game model so that I can just call remove() on the model. This remove action then tells the GameStore to remove that game from its list of games. This makes my code really easy to read.

About autorun

MobX makes it easy to react to state changes in your model by using reactions, and autorun is one of them. Reactions make it easy to observe some state and react to it. At the moment I am working on using autorun to implement a sync mechanism with a remote server: when a game is changed, it is updated in the local database and in the API.

Conclusion

I really like this reactive model of MobX. It takes a lot less code to change the state compared to Redux, and its reactions make it really easy to react to changes. I am sure there are good reasons for when to use Redux or when to use MobX, and MobX probably has some drawbacks as well. I have not found any yet, but I will be looking out for them as I get further into the development of the project.

https://mobx.js.org/README.html

Actions and reducers for a PHP developer.

Redux is a predictable state container for JavaScript apps. It manages state via actions and reducers.

Actions

The state cannot be updated with setters. This is done so that your code will not update the state arbitrarily, causing bugs. To change the state you have to dispatch an action. An action is a JavaScript object that describes what happened. For example:

{
   type: 'ADD_TODO',
   text: 'Go to the swimming pool'
}

Enforcing that the state only changes via actions, gives us a clear understanding of what’s going on in the app.

Reducers

To tie state and actions together, you have to write a function called a reducer. It’s just a function that takes the state and an action, applies the changes described by the action, and returns the new state.

export default (state = [], action) => {
    if (action.type === 'ADD_TODO') {
        return state.concat({text: action.text, completed: false});
    }

    return state;
}

The action contains only the minimum information about the change. The reducer is responsible for applying that change correctly. The reducer should manage other logic, like whether the item is completed; it would be a bad idea to add the completed flag to the action. Keep actions as simple as possible.

View components in your app are connected to the Redux state. When the state is updated, the connected components receive the new state and re-render.

Xdebug issues on macOS

Recently I spent a lot of time trying to configure Xdebug on my local machine. I am working on macOS Mojave with PhpStorm.

The standard installation guide from JetBrains is a really good start and works 99% of the time. But this time I could not get it to work at all.

The problem

Whenever I pressed the debug button, the debugger would never hit any breakpoint in my code, and I was getting the following error:

Connection with Xdebug 2.6.1 was not established.
Xdebug 2.6.1 extension is installed. Check configuration options. 

The solution

After troubleshooting the issue and a lot of Googling, I checked which other processes were listening on port 9000.

sudo lsof -iTCP -sTCP:LISTEN -n -P

It turned out that php-fpm was listening on port 9000 and was receiving the debug connections from my editor. I had installed PHP with php-fpm via Homebrew, and brew had configured the PHP service to start automatically.

Stopping the brew php service fixed the issue.

brew services stop php           

Happy debugging!

What makes a good Unit test?

According to Clean Code, the three things that make a clean unit test are readability, readability, and readability.

Readability in tests is probably even more important than in production code. So what makes a test readable? The same standards that make other code readable: clarity, simplicity, and density of expression.

Clean tests will contain a lot of your domain language. That means it will not focus on the technical terms of your testing framework, but rather on the business case the code is implementing.

F.I.R.S.T.

Clean tests should follow this acronym.

Fast

Fast means fast. Tests should run fast because you want to run them as frequently as possible. If your tests are slow, you will not run them often enough and the feedback loop will be longer. Lately, I have been working on a codebase that has around 5000 tests and performs around 8300 assertions in less than 3 seconds. I think that’s a good example of fast tests.

Independent

Tests should not depend on each other. It should be clear why a test fails and where you should go to fix the problem. If tests depend on each other, one failing test can cause a lot of depending tests to fail which makes diagnosis more difficult.

Repeatable

You should be able to run your tests in every environment: in your production environment, in your QA environment, and on your local machine. They should not depend on a network connection. If your tests are not repeatable, you will have an excuse for when they fail, and you will not be able to run them when the environment is not available.

Self-Validating

A self-validating test is a test with a boolean output: it either passes or fails, and you should know the result immediately. You should not have to go through a log file to see which tests failed.

Timely

Tests should be written just before the production code. If you write your tests after the production code, the chance that you will write code that’s hard to test is much bigger. Writing your tests before the production code always results in code that’s easy to test. And code that’s easy to test is also often clean, clear and simple.

How to test protected and private methods

Testing public methods with any testing framework is easy because we have access to the public interface of the test subject. Unfortunately, we don’t have access to the protected and private methods of our test subject. They are hidden from us (the client) for a reason: they are implementation details the class does not want to leak to potential clients.

Generally speaking, it is a bad idea to test private and protected methods. They are not part of the public interface and they are usually executed via the public methods of a class. So when we test the public methods, they implicitly test private and protected methods.

In TDD we are testing the behavior of a class, not the specific methods. So when we make sure we test all the things a class can do, we can rest assured that the private and protected methods are tested.

If you find yourself having large private methods which need their own test, there is probably something wrong with your design. Always aim for designing small classes that do one thing.

If you still think this is nonsense and you really want to test those methods, you can take a look here on how to do that with PHPUnit.
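
If you do decide to test them anyway, a minimal sketch with PHP’s reflection API looks roughly like this. The Calculator class and its private round() method are made up for illustration.

final class PrivateMethodTest extends \PHPUnit\Framework\TestCase
{
    public function testPrivateRoundingViaReflection(): void
    {
        $calculator = new Calculator();

        // Make the private method accessible for this test only.
        $method = new \ReflectionMethod(Calculator::class, 'round');
        $method->setAccessible(true);

        self::assertSame(2, $method->invoke($calculator, 1.6));
    }
}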

 

How to handle time in your tests?


Often we face scenarios where a process we are modelling depends on time. Take for example an expiring coupon: a coupon is no longer valid when it has expired. In PHP we could implement this as follows:

class Coupon
{
    /**
     * @var \DateTimeImmutable
     */
    private $expiryDate;

    public function __construct(\DateTimeImmutable $expiryDate)
    {
        $this->expiryDate = $expiryDate;
    }

    public function isValid(): bool
    {
        $diffInSeconds = $this->expiryDate->getTimestamp() - time();
        return $diffInSeconds > 0;
    }
}

class CouponTest extends TestCase
{
    public function testANotExpiredCouponIsValid(): void
    {
        $coupon = new Coupon(new \DateTimeImmutable('2018-10-28T07:14:49+02:00'));
        self::assertTrue($coupon->isValid());
    }

    public function testAnExpiredCouponIsNotValid(): void
    {
        $coupon = new Coupon(new \DateTimeImmutable('2017-10-28T07:14:49+02:00'));
        self::assertFalse($coupon->isValid());
    }
}

The problem

When we run the test for a valid coupon any time after 2018-10-28 (the expiry date used in the test), it will fail because the coupon has actually expired by then. That’s not what we want, because the test is supposed to cover a coupon with an expiry date in the future. So we need a way to fix that.

One way to resolve this issue would be to just change our test and pass in a date that’s 500 years in the future. It will fail again in 500 years but it will not be our problem anymore.

A better way to resolve this is to actually pass the current time as an argument to the isValid() method. Look at the new implementation:

class Coupon
{
    /**
     * @var \DateTimeImmutable
     */
    private $expiryDate;

    public function __construct(\DateTimeImmutable $expiryDate)
    {
        $this->expiryDate = $expiryDate;
    }

    public function isValid(\DateTimeImmutable $now): bool
    {
        $diffInSeconds = $this->expiryDate->getTimestamp() - $now->getTimestamp();
        return $diffInSeconds > 0;
    }
}

class CouponTest extends TestCase
{
    public function testANotExpiredCouponIsValid(): void
    {
        $coupon = new Coupon(new \DateTimeImmutable('2018-10-28T07:14:49+02:00'));
        self::assertTrue($coupon->isValid(new \DateTimeImmutable('2018-10-01T00:00:00+02:00')));
    }

    public function testAnExpiredCouponIsNotValid(): void
    {
        $coupon = new Coupon(new \DateTimeImmutable('2017-10-28T07:14:49+02:00'));
        self::assertFalse($coupon->isValid(new \DateTimeImmutable('2018-10-01T00:00:00+02:00')));
    }
}

That’s better. Now we have frozen the current time and no longer depend on the real system time. Our tests will always pass.

How to test code that needs sleep?

The problem

Let’s say we have an API with a rate limit of 1 request per second. When we have to make multiple requests, we need to build in some timeout mechanism. This mechanism checks whether the last request was less than 1 second ago and, if so, delays the new request by 1 second.

One way to achieve this in PHP is:

class Api
{
    private $lastTime = 0;
    /**
     * @var \GuzzleHttp\Client
     */
    private $client;

    public function __construct(\GuzzleHttp\ClientInterface $client)
    {
        $this->client = $client;
    }

    public function doRequest(): void
    {
        $now = time();
        // The previous request happened within the same second, so wait 1 second.
        if ($now === $this->lastTime) {
            sleep(1);
        }
        $this->lastTime = $now;
        $this->client->get('https://some.api');
    }
}

How to test it?

To verify the behaviour of this class, we can mock the HTTP client and do assertions on it. We are going to make two requests and assert that the API is called twice. It would look something like this:

class ApiTest extends TestCase
{
    public function testCallsTheApi(): void
    {
        $httpClient = $this->createMock(\GuzzleHttp\ClientInterface::class);
        $httpClient->expects($this->exactly(2))
            ->method('get');

        $api = new Api($httpClient);
        $api->doRequest();
        $api->doRequest();
    }
}

Now we have a few problems. First of all, we just slowed down our tests: unit tests are supposed to be fast, but now we waste 1 second. Second, we are not actually asserting that the second request came at least 1 second after the first. We could assert this by measuring the execution time, but that would still keep our tests slow.

A solution

To make our tests run fast and actually assert that our sleep function is called, we could do the following:

class Api
{
    private $lastTime = 0;
    /**
     * @var \GuzzleHttp\Client
     */
    private $client;
    /**
     * @var callable
     */
    private $sleep;

    public function __construct(\GuzzleHttp\ClientInterface $client, callable $sleep)
    {
        $this->client = $client;
        $this->sleep = $sleep;
    }

    public function doRequest(): void
    {
        $now = time();
        if ($now === $this->lastTime) {
            \call_user_func($this->sleep, 1);
        }

        $this->lastTime = $now;
        $this->client->get('https://some.api');
    }
}

class ApiTest extends TestCase
{
    public function testCallsTheApi(): void
    {
        $httpClient = $this->createMock(\GuzzleHttp\ClientInterface::class);
        $httpClient->expects($this->exactly(2))
            ->method('get');

        $sleepCalled = false;
        $api = new Api($httpClient, function (int $sec) use (&$sleepCalled)  {
            $sleepCalled = true;
            self::assertEquals(1, $sec);
        });
        $api->doRequest();
        $api->doRequest();

        self::assertTrue($sleepCalled);
    }
}

So what changed? We made the sleep function configurable via the constructor of the Api class. In our test we pass a spy function that asserts that the callback is called with the correct number of seconds. This test now runs fast and actually tests the needed timeout logic.

In our production code we would instantiate the Api class like this:

new Api(new Client(), function (int $sec) {
    sleep($sec);
});

Conclusion

There are surely other ways to handle this case, but I think this is an elegant solution. If you have ideas or thoughts about another approach, please let me know in the comments.

What is a Software Craftsman?

Last week I started to read part 1 of The Software Craftsman: Professionalism, Pragmatism, Pride by Sandro Mancuso. This post is my attempt to summarise what it means to be a software craftsman.

A software craftsman is more than just a coder. A coder writes code; a software craftsman crafts a software product. Software craftsmanship is about raising the bar of our industry. We all know the stories of projects that exceeded all deadlines and budgets, went into production full of bugs and did not match the client’s expectations very well.

The Software Craftsmanship Manifesto

“As aspiring Software Craftsmen we are raising the bar of professional software development by practicing it and helping others learn the craft. Through this work we have come to value:

Not only working software, but also well-crafted software.
Not only responding to change, but also steadily adding value.
Not only individuals and interactions, but also a community of professionals.
Not only customer collaboration, but also productive partnerships.

That is, in pursuit of the items on the left we have found the items on the right to be indispensable.”

Well-crafted software

We have all worked on ‘legacy’ applications. Applications that have no tests. No one knows exactly how the application works. The code is full of technical and infrastructure terms instead of expressing the business domain. Classes and methods have hundreds of lines. Developers working on these kinds of applications are crippled by fear because they don’t know where their changes will break stuff.

So the software is working and is earning a lot of money. But is it good software? Well-crafted software means that everyone working on it can easily understand the application. It does not matter whether it’s a brand new greenfield project or a ten-year-old legacy application. It has high test coverage, a simple design, and code written in the language of the domain. Adding features should be as painless as it was at the beginning of the project.

Steadily adding value

It’s not only about adding new features and fixing bugs, but also about improving the structure of the code. This means refactoring parts of the application to keep the code clean, testable and easy to maintain. It is the job of a software craftsman to make sure that the older the software gets, the more the client benefits from it. We do this by making sure we can easily add new features, so that our client can respond quickly to the changing market. Our software should not slow them down and become a source of pain. A great rule here is Uncle Bob’s Boy Scout rule: always leave the code cleaner than you found it.

The focus on well-crafted software is important if we want long-lived applications. Totally rewriting an application a few years after it has been developed is usually a bad idea and has a very poor return on investment.

Community of professionals

It’s about sharing the knowledge of building quality software with other developers. Learning from each other is the best way to become better. We can contribute to the greater good of our industry by sharing our code, contributing to open source software, writing blogs, starting local communities and pairing with other developers. Most major cities already have regular developer meetups. These meetups are great because they attract people from all kinds of backgrounds.

Besides communities this item is also about the work environment. Good developers want to work with other great developers that are better than they are. They want to work for great companies with great projects and a passionate atmosphere of improvement.

Productive partnerships

Software craftsmen don’t believe in the usual employer/employee type of relationship. Just coming to work, putting in the hours, keeping your head down and doing what you are told is not what we expect from a professional. What the contract says is just a formality. If you are a permanent employee, you should treat your employer the same way a contractor or freelancer does. Employers should expect from their employees the same great service they are getting from consultants and contractors.

Software craftsmen want to actively contribute to the success of the project. They want to question requirements, understand the business and propose improvements. Good craftsmen want a productive partnership with their clients. The advantages for the employer are enormous: a highly motivated team has a much bigger chance of making any project succeed. This is a big difference from the traditional approach where business people talk with clients and tell developers what they have to build.

Conclusion

Software craftsmen are expected to do much more than just write code. Craftsmen help their clients by proposing alternative solutions and providing ideas to improve their processes. They understand the client’s domain and share good information and knowledge. Professionals question their clients’ requirements in relation to the value they provide. They provide value to their customers at all levels.

Thoughts on integrating external services

This week I was working on the backend of a new application. One of the requirements was to convert addresses to coordinates. This process is also known as forward geocoding. Reverse geocoding is exactly the opposite, where you convert coordinates to addresses. Since we don’t have that geographical data ourselves, we decided to look at an external service. We chose a competitor of Google Maps, but that does not really matter. What I want to share with you here is the solution that we came up with for safely integrating that external service.

About integrating external services.

In my first years as a developer I would not have thought long about implementing the connection with this service. “How hard can it be? Just call the web service and use the response. What could go wrong?” Nowadays I tend to take a step back and consider a few things:

What happens when the external service is not available? Will this kill the process? What will the customer see? Should we just keep retrying until their web service is back? Should we cache the responses on our end?

The solution that we came up with.

When we receive a new address in our system, it needs to be forward geocoded immediately. In an event-driven architecture this is easy to accomplish. We just listen to the AddressWasAdded event and dispatch a GeoCodeAddress command on the async command bus. This command gets picked up by one of the workers and is handled by the correct handler. The handler tries to connect with the external service and saves the coordinates in the database. Because the external web service bills us per request, the GeoCoderService caches the resulting coordinates for about one day. So if the same address is requested again, it hits the cache and not the actual service. The real implementation is more nuanced, but broadly this is how it looks:

Diagram of the solution
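
As an illustration, here is a rough sketch of the caching part of the GeoCoderService. The ExternalGeocoder interface and the use of a PSR-16 cache are assumptions I made for this sketch; the real implementation is more involved.

// The external service bills us per request, so the GeoCoderService keeps a
// cache in front of it. ExternalGeocoder is a stand-in for the real client.
interface ExternalGeocoder
{
    /** @return array{lat: float, lng: float} */
    public function geocode(string $address): array;
}

final class GeoCoderService
{
    /** @var ExternalGeocoder */
    private $externalGeocoder;

    /** @var \Psr\SimpleCache\CacheInterface */
    private $cache;

    public function __construct(ExternalGeocoder $externalGeocoder, \Psr\SimpleCache\CacheInterface $cache)
    {
        $this->externalGeocoder = $externalGeocoder;
        $this->cache = $cache;
    }

    /** @return array{lat: float, lng: float} */
    public function forwardGeocode(string $address): array
    {
        $key = 'geocode.' . md5($address);

        $cached = $this->cache->get($key);
        if ($cached !== null) {
            return $cached;
        }

        // If the external service is down this will throw; the async command
        // is then requeued and retried later by a worker.
        $coordinates = $this->externalGeocoder->geocode($address);

        // Cache the result for one day to avoid paying for the same address twice.
        $this->cache->set($key, $coordinates, 86400);

        return $coordinates;
    }
}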

Takeaways.

The main takeaway of this solution is the introduction of the async queue. This solves most of our problems. If an error happens in the GeoCoderService, the command is requeued and will be retried at a later time until it’s handled successfully.

Because we work with an event-driven architecture and CQRS, processes are triggered via events and commands. But this solution can be implemented in any architecture. Just make sure that your application publishes a message on an async bus and let a consumer handle the rest.

I am very interested in your thoughts on this solution and if I am missing important questions about integrating external services. Please leave a comment to share your thoughts.