So you want to go Serverless, or do you?
For someone wearing the “Serverless evangelist and enthusiast” hat these days, this is quite an anti-pitch. As much as I would love to win your Serverless business, I am also wary of over-promising.
So let me tell you why (or rather “when”) you should not go Serverless, and then make my pitch: assess your system requirements and adopt Serverless where it is applicable.
Before I do that though, a little story:
Years ago, as a consultant, I was asked to provide an architecture proposal for a rather unusual project. An insurance company in the UK wanted to dismantle its microservices architecture and become a monolith. I was intrigued, and some of my colleagues even scoffed at the idea. Why would anybody want to do that at a time when microservices were so fashionable?!
During discovery, the company’s representatives shared their frustrations: transaction inconsistencies, ballooning storage requirements, and the added complexity of managing microservices. So much so that their leadership and management now hated microservices. As is not uncommon, yesterday’s blue-eyed boy had become today’s villain.
Digging deeper into the requirements and SLAs, I came to appreciate their frustrations.
The requirements entailed complex, distributed transactions and undo functionality.
The microservices architecture employed the Saga and Event Sourcing/Event Store patterns. This meant (a rough sketch follows the list below):
- A number of compensating transactions had to be triggered for each domain event that was reverted.
- In some use cases, a series of later events that depended on the reverted one had occurred in the interim.
- Resurrecting all those events from the event store and undoing them was a Herculean task, and in some cases seemed impossible, leading to inconsistencies.
- Add a scenario where one of the entities involved in the transaction becomes unavailable, and you have the recipe for a perfect storm.
- Highly denormalized data, combined with recording every event, meant storage growing at a high rate.
- Orchestrating these microservices and their deployments with Docker and Kubernetes, and monitoring them, was complex and demanding.
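To make the pain a little more concrete, here is a minimal, hypothetical sketch of a saga with compensating transactions. The service names and steps below are invented for illustration and are not the client’s actual domain model; the point is simply that every forward step needs a matching undo, and a failure partway through means reversing whatever has already been committed.

```python
# Hypothetical saga sketch; names and steps are invented for illustration.

def create_policy(order): ...        # forward step 1
def charge_premium(order): ...       # forward step 2
def notify_underwriter(order): ...   # forward step 3

def void_policy(order): ...          # compensates step 1
def refund_premium(order): ...       # compensates step 2

SAGA_STEPS = [
    (create_policy, void_policy),
    (charge_premium, refund_premium),
    (notify_underwriter, None),      # nothing to undo
]

def run_saga(order):
    completed = []
    try:
        for step, compensate in SAGA_STEPS:
            step(order)
            completed.append(compensate)
    except Exception:
        # Undo in reverse order. If a compensating transaction itself fails,
        # or depends on events that arrived in the interim, the system is
        # left inconsistent: exactly the pain described in the list above.
        for compensate in reversed(completed):
            if compensate is not None:
                compensate(order)
        raise
```

In an event-sourced system the equivalent of `completed` lives in the event store, which is where resurrecting and reversing events becomes genuinely hard.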
The decision to move away from microservices and towards a monolith therefore seemed well thought out. Or rather, well thought out after the fact (hindsight is always 20/20).
My proposal for the monolith included normalizing the data into a traditional RDBMS, along with a plan to migrate the existing data. That we did not win the contract due to pricing and other factors is another matter, but the exercise yielded some very precious learning:
No, microservices are not the villain. Neither is the monolith, and neither is the latecomer on the block: Serverless.
They are all perfect technologies for their respective use cases.
It is about choosing the right tool for the job.
Talking of Serverless: not having to deal with infrastructure, procurement, maintenance, patching, and so on, while still getting cloud scaling, high availability, and disaster recovery, all without a dedicated infrastructure team, can appear irresistible to anyone. Not only that, pay-as-you-go pricing makes Serverless particularly enticing. AWS, for example, gives 1 million free requests per month for its Lambda service. Azure Functions offers a similar deal.
Is Serverless really the silver bullet, then? Why can’t you go ahead and replace anything and everything with Serverless, bidding goodbye to your infrastructure department?
Not so fast. At least not before I play my broken record: the only silver bullet is knowledge of architecture. Architecture that employs the technology and design dictated by the requirements and the SLAs.
With Serverless, costs increase with memory allocation and execution time. If your transactions involve heavy payloads, Serverless may not be cost-effective. There are also limits on how much payload a Lambda can accept, and a 15-minute execution time limit rules out long-running processes.
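For context, here is a minimal sketch (using boto3 and a hypothetical function name) of where memory and timeout are set on a Lambda. Memory allocation drives the per-millisecond price, and the timeout cannot be raised past the 900-second (15-minute) cap mentioned above.

```python
import boto3

# Hypothetical function name, for illustration only.
FUNCTION_NAME = "my-long-running-task"

lambda_client = boto3.client("lambda")

# MemorySize drives cost (billed per GB-second of execution),
# and Timeout is capped at 900 seconds (15 minutes) by the service.
lambda_client.update_function_configuration(
    FunctionName=FUNCTION_NAME,
    MemorySize=1024,  # MB; more memory means a higher per-ms price
    Timeout=900,      # seconds; values above 900 are rejected
)
```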
And this is where it gets really interesting: AWS Lambda and Azure Functions have the problem of cold starts (actually, I don’t see it as a problem; it’s by design, though others may not agree). A cold start is the time a Lambda or Function takes to warm up before it starts executing code. This time is needed because the function runs in a shared environment and compute resources must first be allocated to it. Cold starts can be mitigated to some extent with provisioned concurrency, but that comes at added cost, and even with provisioned concurrency there is no guarantee the application will meet its performance SLA.
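For reference, here is a minimal sketch (again boto3, with hypothetical function and alias names) of enabling provisioned concurrency. Note that each pre-warmed execution environment is billed whether or not it serves traffic, which is where the added cost comes from.

```python
import boto3

lambda_client = boto3.client("lambda")

# Hypothetical function and alias names, for illustration only.
# Provisioned concurrency keeps N execution environments initialized,
# removing cold starts for requests that land on them, but every
# pre-warmed environment is billed continuously, traffic or not.
lambda_client.put_provisioned_concurrency_config(
    FunctionName="checkout-api",
    Qualifier="prod",                    # a published version or alias
    ProvisionedConcurrentExecutions=10,  # number of pre-warmed environments
)
```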
So, if fast response time is critical for the application, I wouldn’t recommend AWS Lambda or Azure Functions as the first choice.
Another Serverless option in such cases is AWS Fargate or Azure Container Instances (ACI). These container technologies abstract away the provisioning of the underlying compute for containers and do not have the memory and execution time limitations of Lambdas. This is great when an application built on them scales out, as horizontal scaling is assured. The problem is that when scaling in, Fargate starts killing containers… even those that may have active user sessions! Mind you, conventional EC2 Auto Scaling has something called “instance protection” to guard active user sessions: a protected instance is not terminated during scale-in until its sessions have drained and the protection is removed.
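By way of contrast, here is a minimal sketch (boto3, with a hypothetical Auto Scaling group name and instance ID) of EC2 scale-in protection. While the flag is set, the Auto Scaling group will not pick that instance for termination during scale-in.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Hypothetical Auto Scaling group name and instance ID, for illustration.
# While ProtectedFromScaleIn is True, this instance will not be selected
# for termination during a scale-in event; the application can clear the
# flag once its active user sessions have drained.
autoscaling.set_instance_protection(
    AutoScalingGroupName="web-asg",
    InstanceIds=["i-0123456789abcdef0"],
    ProtectedFromScaleIn=True,
)
```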
Several other factors can play a role in choosing a technology for a given set of requirements: cost, maintainability, resiliency, observability, the available skill set, and more. It is not possible to cover the entire scope in one blog post, but I hope a case has been made against hasty generalization and against applying a one-size-fits-all approach when choosing a technology for your projects.