Microservices and DTAP environments
This article describes a strategy for using environments in a way that avoids some common difficulties in a microservice landscape.
Environment Types
First let me give you a quick overview of some common environments I’ve seen in the past.
Local : This is where the real coding gets done. The environment exists on your local machine and typically simulates your production environment with containerization or virtualization.
Dev : An environment where you want to make sure your merged branches work. This probably resembles the production environment a bit more (it might run on the same application server as production, but things like load balancing are sometimes left out).
Test : Automated and manual (regression) tests run here, validating the product to make sure bugs don’t return and everything works as expected. Most of the time this environment is also used to expose functionality to other squads so they can test their integrations, to demo new features to stakeholders, and even to let the business test workflows or preview things in a sandbox. (Note: the team’s tester also uses it frequently.)
Acceptance : This environment is very similar to production and is used for performance testing. It is often also used for UX or QA review and for the final go/no-go before launching the product.
Demo / Beta / Alpha / … : These environments are typically used in projects where big new websites are being released, and they run next to the production environment for a limited audience.
Prod : On this environment flows the money.
It hurts!
In a company with many squads owning many microservices, using environments as described above is a real pain. Let me explain.
Environment differences
When your dev environment isn’t an exact copy of your production environment, or at least properly aligned with it, you’ll keep running into unforeseen difficulties when moving to production.
Manual approval
If you need manual approval to promote to the next environment (manual testing or acceptance approval), you are blocking your deployment pipeline. This delays continuous delivery and slows down getting features, technical improvements, bugfixes or security patches into production.
Communication overhead
Testing your service integration by linking up to another microservice’s test or acceptance environment makes it hard to tell whether a problem is yours or theirs. To prevent failures on their side you’ll have some serious communication overhead. SLAs are very different from production, unexpected bugs pop up more frequently, and redeploys cause unexpected downtime. When testing an integration, you want to test your side of the application and be able to rely on the other side working as expected.
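One way to test only your side of an integration is to replace the other squad's service with a stub that returns the response you agreed on. The sketch below is a minimal illustration in Python; `ArticleClient`, `fetch_title`, the `/articles/` path and the payload shape are all hypothetical names made up for this example, not part of any real service.

```python
import json
from typing import Callable

# Hypothetical transport: takes a request path, returns the raw response body.
Transport = Callable[[str], str]


class ArticleClient:
    """Our side of the integration; the transport is injected so tests can stub it."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def fetch_title(self, article_id: int) -> str:
        body = self._transport(f"/articles/{article_id}")
        payload = json.loads(body)
        # Fail loudly if the other side breaks the agreed response shape.
        if "title" not in payload:
            raise ValueError("response is missing the agreed 'title' field")
        return payload["title"]


def stub_transport(path: str) -> str:
    """Stands in for the other squad's service and returns the agreed response."""
    return json.dumps({"id": 42, "title": "Hello"})


def broken_transport(path: str) -> str:
    """Simulates the other side violating the agreed response shape."""
    return json.dumps({"id": 42})
```

With the stub in place, a failing test points at your own code, and the broken variant shows how your service reacts when the other side misbehaves, without depending on anyone's test environment being up.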
Business consumer testing
Often I see business consumers working on a test environment because they want to try something before they do it in production. A great example is previewing an article on a site. They copy the article from production to the test environment and publish it there so they can check the layout on the test website. But often the test environment doesn’t work as expected: it is down, has changed, has been reverted, or has had its data cleaned, which gets business pissed off (and they’re right).
So what can we do?
1. Keep your environments the same
Keep your environments exactly the same. Dev is the same as test, which is the same as acceptance and production.
Infrastructure as code can help you with this. A good practice is to have your whole product in a git repository including your infrastructure description, firewall rules, VPC setup, load balancer setup and so forth.
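The idea of describing every environment from one source can be sketched in a few lines of Python. This is only an illustration of the principle, not a replacement for a real infrastructure-as-code tool; `EnvironmentSpec`, `make_environment`, `drift` and the example values are hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EnvironmentSpec:
    """Hypothetical description of one environment."""
    name: str
    app_server: str
    load_balancer: bool
    firewall_rules: tuple[str, ...]


def make_environment(name: str) -> EnvironmentSpec:
    """Build every environment from the same template; only the name varies."""
    return EnvironmentSpec(
        name=name,
        app_server="app-server-1.0",       # illustrative value
        load_balancer=True,
        firewall_rules=("allow 443/tcp",),  # illustrative value
    )


def drift(a: EnvironmentSpec, b: EnvironmentSpec) -> dict[str, tuple]:
    """Return the fields, other than the name, where two environments differ."""
    left, right = asdict(a), asdict(b)
    return {
        key: (left[key], right[key])
        for key in left
        if key != "name" and left[key] != right[key]
    }
```

A check like `drift(make_environment("dev"), make_environment("prod"))` returning an empty dict could run in the pipeline, so a hand-edited environment is caught before it surprises you in production.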
To save costs, shut your environments down when you’re not using them, or go serverless.
2. Only production is reachable
Fully automate your pipeline and environments all the way up to production.
This means nobody can reach your Test, Acc or any other environment. Your engineers shouldn’t need to be there doing manual validations, and certainly no other squad should.
You can use feature toggles when business needs to see a feature that shouldn’t be available to everyone yet. For previewing articles on a website, you might build a dedicated preview status. The goal is that the quality of your product is validated and ready for use, even for business testing.
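A feature toggle can be as small as a lookup that decides, per user, whether a feature is visible. The sketch below is a minimal in-memory version, assuming toggles are keyed by feature name and user name; `FeatureToggles` and its fields are hypothetical, and a real setup would load this state from configuration or a toggle service.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureToggles:
    """In-memory toggle store, for illustration only."""
    # Features that are switched on for everyone.
    enabled_for_all: set[str] = field(default_factory=set)
    # Features that only specific users (e.g. business users) may preview.
    preview_users: dict[str, set[str]] = field(default_factory=dict)

    def is_enabled(self, feature: str, user: str) -> bool:
        """A feature is visible if it is released, or if this user may preview it."""
        if feature in self.enabled_for_all:
            return True
        return user in self.preview_users.get(feature, set())
```

The feature ships to production switched off, business users are added to its preview list, and releasing it to everyone becomes a toggle change instead of a deployment.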
3. Manual testing in production
So if everything is automated to production this means manual exploratory testing is done on production? Hell yeah! Exploratory testing can happen continuously on production. If you do find a bug, fix it, write a test so it never returns and automate yourself through the pipeline again.
In a microservice world you’ll have to follow the principles of chaos engineering. To make sure your product can cope with network failures, latency issues or unexpected responses, run Chaos Monkey on… production.
Side note : If you haven’t read it yet, I recommend Toby Clemson’s article ‘Testing Strategies in a Microservice Architecture’ for a deep dive into that subject.
Side note 2 : I plan to write up some concrete scenarios, so if you have difficulties adopting these patterns, let me know and I can use them.
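To make "coping with failures" concrete, here is a small in-process sketch in Python: one helper injects random failures into a call, and another retries and then degrades to a fallback. This is not Chaos Monkey itself, which terminates running instances; `flaky` and `call_with_fallback` are hypothetical helpers that only imitate the effect inside a single process.

```python
import random
from typing import Callable


def flaky(call: Callable[[], str], failure_rate: float,
          rng: random.Random) -> Callable[[], str]:
    """Wrap a call so it randomly raises, imitating a network failure."""
    def wrapped() -> str:
        if rng.random() < failure_rate:
            raise ConnectionError("injected failure")
        return call()
    return wrapped


def call_with_fallback(call: Callable[[], str], retries: int,
                       fallback: str) -> str:
    """Try the call up to retries + 1 times, then degrade to a fallback value."""
    for _ in range(retries + 1):
        try:
            return call()
        except ConnectionError:
            continue
    return fallback
```

Wrapping a dependency with `flaky(...)` in a test and asserting that `call_with_fallback(...)` still returns something usable is a cheap first step before letting real failures loose on production.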