Last year we had an incredibly successful Black Friday, with all of our e-commerce properties handling the load without a hitch.
One of our websites was responding in under 300ms with over 2,000 concurrent users.
In this article, I'll discuss our strategy for scaling for this year's Black Friday on 23 November 2018.
Previously, we relied heavily on caching (telling our clients not to dare clear the cache on the day) as well as massive application and database servers.
While the websites handled the load, this approach came with obvious drawbacks:
- Those heavily specced servers are costly and were commissioned for the entire year, so for most of the time the client was paying for capacity it did not need. A very expensive way to do things.
- Relying on cache is very problematic for those e-commerce clients whose business requires them to be extremely flexible and agile with stock management and pricing: they need to change things often and have the website reflect those changes immediately.
So what will we do differently this year?
- Where possible, we are throwing off-the-shelf e-commerce systems in the bin in favour of a custom-developed e-commerce platform built on Vue.js, Elasticsearch and a custom MySQL database. The platform is blazingly fast, even under tremendous load, without caching.
- We are hosting on Google Cloud using Docker and Kubernetes.
1) Custom e-commerce system
As a company, we have tried almost every major off-the-shelf, self-hosted e-commerce system, including WooCommerce, CS-Cart, Magento, Expresso Store, VirtueMart, OpenCart and Craft Commerce.
After many years, we finally reached the conclusion that, for high-quality enterprise e-commerce websites, the costs of these systems outweigh the benefits. Sure, if you want something quick, dirty and template-based, by all means choose WooCommerce or, even better, go for a hosted solution such as Shopify.
If, however, you are looking for an enterprise-grade, unit-tested e-commerce system that is flexible, integrates with your POS and can scale to over 5,000 concurrent users (and beyond), then all the aforementioned systems will fail you.
In summary, the problems we've had with off-the-shelf systems are:
- Bloated database design
- Poor code structure
- Outdated technology
- Slow under load, unless heavily cached
- Extremely complex to customize and to integrate with third-party systems
- Complex to implement unique designs (I'm looking at you, Magento)
For many of our enterprise clients, we have now started rolling out a custom-developed solution.
The drawback is that certain things can't be configured in the backend and require developer intervention. The benefits are that:
- The data model is sensible, clean and lightweight
- The underlying application code is robust, with strong separation between front end and backend layers
- We were able to pick and choose best-of-breed technologies as opposed to having specific technologies forced upon us by the off-the-shelf system. For example, we are utilizing Vue.js and Elasticsearch heavily, both of which are outstanding and battle-tested, yet still relatively "new" technologies
- The front end is exceptionally easy to change for each client, or even for each client's properties (think Multi Site Manager). This allows our designers and UX experts to focus completely on creating a unique user experience, without having to worry about sticking within the guidelines of a theme. Our front end developers also have no reason to touch backend code.
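To give a feel for how a storefront might lean on Elasticsearch for product listings, here is a minimal sketch that builds a search request body in the Elasticsearch query DSL. It is an illustration only, not the platform's actual code: the function name and the field names (`name`, `price`, `in_stock`) are hypothetical, and the body would still need to be sent to a search endpoint by whatever client is in use.

```python
def build_product_query(search_text, max_price=None, in_stock_only=True,
                        page=0, page_size=24):
    """Build an Elasticsearch query DSL body for a product listing.

    All field names here are assumptions made for illustration.
    """
    # Filter clauses narrow the result set without affecting relevance scoring.
    filters = []
    if in_stock_only:
        filters.append({"term": {"in_stock": True}})
    if max_price is not None:
        filters.append({"range": {"price": {"lte": max_price}}})

    return {
        "from": page * page_size,  # offset of the first hit on this page
        "size": page_size,         # number of hits per page
        "query": {
            "bool": {
                # Full-text match on the (hypothetical) product name field.
                "must": [{"match": {"name": search_text}}],
                "filter": filters,
            }
        },
    }


body = build_product_query("running shoes", max_price=100, page=2)
```

Because stock and price are plain document fields in this sketch, an update to either one would show up in search results once the document is reindexed, which fits the goal of reflecting changes without a page cache.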
2) Docker, Kubernetes and Google Cloud
Previously, we put most of our clients on dedicated servers at Hetzner. The benefit is that one gets quite a lot of firepower at low cost. The drawback is that it's not possible to quickly scale up or down, i.e. the hardware configuration is pretty much set.
This year, we plan on doing things differently. For about nine months now, we have been hosting applications on Google Cloud, using Docker and Kubernetes, having successfully load tested our systems to over 10,000 concurrent connections.
For the layman, this means that we are in a position where our websites respond automatically to load. If traffic warrants it, the system gets more firepower, typically additional "virtual machine instances" (nodes).
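For the curious, the scaling decision can be sketched in a few lines. The example below follows the rule documented for the Kubernetes Horizontal Pod Autoscaler, which scales pods (the cluster then adds nodes when the pods no longer fit): desired replicas = ceil(current replicas × current metric / target metric), with no change while the ratio stays within a tolerance (0.1 by default). This is an assumption about the kind of autoscaling involved, not a description of our exact configuration, and the function name and replica limits are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     tolerance=0.1, min_replicas=1, max_replicas=10):
    """Approximate the Horizontal Pod Autoscaler's replica calculation."""
    ratio = current_metric / target_metric

    # Skip scaling when the metric is close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    desired = math.ceil(current_replicas * ratio)

    # Clamp to the configured bounds (hypothetical values in this sketch).
    return max(min_replicas, min(max_replicas, desired))


# 4 pods averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))
```

The real autoscaler also accounts for pods that are not yet ready and for stabilization windows, which this sketch leaves out.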
So our Black Friday strategy is pretty much already set, thanks to the Kubernetes-based auto-scaling setup we are using. The one exception is that we will most likely scale the systems up manually the day before, to prevent any scaling deadlocks when a huge amount of traffic hits the websites almost instantaneously (for example, off the back of a mailer).
We will most likely be exceptionally aggressive in over-provisioning CPUs and RAM, since the cost of doing so will be for one day only (Black Friday).
We'd rather err on the side of caution and throw a large amount of hardware at the problem on this day – since our clients will only pay for this for one day before it's scaled down again.
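The cost argument is simple arithmetic, sketched below. Every number in it is made up for illustration (node counts and an hourly rate in cents), and it assumes the same price per node in both scenarios, which would not hold when comparing different providers.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365


def always_on_cost(peak_nodes, hourly_rate_cents):
    """Cost of running peak capacity for the whole year."""
    return peak_nodes * hourly_rate_cents * HOURS_PER_DAY * DAYS_PER_YEAR


def burst_cost(baseline_nodes, peak_nodes, hourly_rate_cents, burst_days=1):
    """Cost of a small baseline all year plus extra nodes on burst days only."""
    baseline = baseline_nodes * hourly_rate_cents * HOURS_PER_DAY * DAYS_PER_YEAR
    extra_nodes = peak_nodes - baseline_nodes
    burst = extra_nodes * hourly_rate_cents * HOURS_PER_DAY * burst_days
    return baseline + burst


# Hypothetical figures: 20 nodes at peak, 3 nodes the rest of the year,
# 10 cents per node per hour.
print(always_on_cost(20, 10))  # 1752000 cents
print(burst_cost(3, 20, 10))   # 266880 cents
```

With these invented figures, the single day of heavy over-provisioning adds 4,080 cents to the yearly bill, which is why being aggressive on the day costs so little compared with keeping peak capacity running year-round.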
By combining a well-designed, lightweight, purpose-built e-commerce system with a Kubernetes-based setup on Google Cloud, we are, for the first time in years, not feeling any trepidation about this year's Black Friday. Instead, we are gleefully looking forward to it.
See you on 23 November! Make sure you don't fall over!