Proudly written by Wannes Rosiers — Area Manager Data @ DPG Media
This post is dedicated out of gratitude to Erik Dolstra, Mathias Lavaert, Genzo Vandervelden, Vladislav Bakayev, and a few of our stakeholders who have managed to set up a production environment in less than one day!
Reports and their impact on morning rituals today
Almost all Business Intelligence (cloud) reports at DPG Media are currently built on top of Redshift, AWS’s analytical database. Yet it lacks a few features which our consumers desire, namely:
1. Ease of performance tuning, especially regarding the performance impact of concurrent users.
2. (Near) real-time data ingestion, as Redshift is mainly designed for bulk inserts.
For many people, a typical day at DPG Media starts with opening their most relevant dashboards. This causes huge morning usage peaks on Redshift, during which the average query is queued for 15 minutes and then runs for 93 seconds. This means that you wait more than 16 minutes between opening a dashboard and seeing the results. Fair to say that a typical day at DPG Media starts with opening a relevant dashboard, getting some coffee, going to the restroom, and, currently being at home, doing your laundry…
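For those who like to check the math, here is a minimal sketch of how the "more than 16 minutes" follows from the two averages quoted above. The function and constant names are made up for illustration; only the 15 minutes and 93 seconds come from our measurements.

```python
from typing import Tuple

# Averages quoted in the text; the names are illustrative only.
QUEUE_SECONDS = 15 * 60  # average time a query spends queued
RUN_SECONDS = 93         # average time a query spends running


def total_wait(queue_seconds: int, run_seconds: int) -> Tuple[int, int]:
    """Return the total wait as (whole minutes, remaining seconds)."""
    return divmod(queue_seconds + run_seconds, 60)


minutes, seconds = total_wait(QUEUE_SECONDS, RUN_SECONDS)
print(f"Total wait: {minutes} min {seconds} s")  # Total wait: 16 min 33 s
```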
Snowflake to the rescue
That’s why the data area decided to perform a Proof Of Concept with Snowflake, a cloud-native analytical database designed for autoscaling (both horizontally and vertically) and low maintenance. And of course, we love a low-maintenance system! In only five days, we set up an environment containing our most significant data source (Snowplow tracking data), built a demo dashboard in Looker, and simulated the morning peak. The results were impressive! Both in load times…
…and in query and queue times.
“Loading huge datasets was very easy with Snowpipe. I didn’t expect such a performance increase, even with an S warehouse! You exceeded expectations.”
Quote PoC Team
From over 16 minutes to less than 1: make sure that you have a fast coffee machine from now on, or your report will be waiting for you.
The choice to go for Snowflake seems, in this case, an easy one, yet we still took some time to make a pricing estimate. On Redshift, you pay for storage and compute, with a small unknown in the pricing for concurrency scaling, whereas the cost of Snowflake is far harder to predict up front. Yet we managed to make a good enough estimate to trust that it will be in the same price range as Redshift. And then we decided to go for it!
Only two weeks after the Proof Of Concept
Two weeks after the PoC, we had finalized the contract with Snowflake and were ready to start using it, installing it via the AWS Marketplace. As news travels fast, our stakeholders from ‘Werving’ (the marketing business department) almost begged to be the guinea pig and to switch from Redshift to Snowflake as soon as possible. Hence, a few days later, we put together a mixed team for one day with a clear purpose:
- Set up the Snowflake environment
- Connect to the ‘Werving’ data source, namely an S3 bucket
- Set up the connection to the dashboard tool Looker
- Create the first report to show end-to-end functionality
And they succeeded! That gave the team even more confidence, as they stated afterward:
“Now that we’ve moved Werving’s dashboards to Snowflake, we feel even more confident that Snowflake can handle our analytics wishes and demands.”
Quote Team
Sometimes it’s as simple and as fast as that, and the only task left for a proud area manager is to say: now it is time to party! ;-)