Serverless: From POC to Production

Posted on December 17, 2019

Building serverless applications means that your developers can focus on their core product instead of worrying about managing and operating servers or runtimes, either in the cloud or on-premises.

As FreeAgent begins to move to AWS, there are plenty of opportunities to take advantage of cloud-native technologies such as AWS Lambda. At the beginning of September, one such opportunity presented itself. We decided to create a new serverless implementation of our thumbnail previews for file attachments and company logos.

On-demand processing of thumbnails is well suited for serverless functions and gives the following benefits:

  • Increased agility
  • Reduced storage costs
  • Resilience to failure

We based our implementation on this excellent AWS blog post and CloudFormation Stack. Having this solid foundation gave us a significant boost in efficiency when writing our POC (proof of concept). We were able to integrate an on-demand serverless thumbnail preview service into parts of the development and integration environments in a matter of days. It took another ten weeks to finish the project. It certainly didn’t feel like an increase in agility.
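
To make the on-demand idea concrete, here is a minimal sketch of the request-validation step such a function might start with. It is not our production code: the key layout (`<width>x<height>/<original key>`), the `ALLOWED_SIZES` whitelist, and the `key` query parameter are all assumptions chosen for illustration, and the resize step itself is left out.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical whitelist: restricting sizes stops callers from requesting
# arbitrary dimensions and generating unbounded work.
ALLOWED_SIZES = {(100, 100), (200, 200), (400, 300)}

# Hypothetical key layout: "<width>x<height>/<original key>".
KEY_PATTERN = re.compile(r"^(\d+)x(\d+)/(.+)$")


class ThumbnailRequest(NamedTuple):
    width: int
    height: int
    original_key: str


def parse_thumbnail_key(key: str) -> Optional[ThumbnailRequest]:
    """Return the parsed request, or None if the key is malformed or the size is not allowed."""
    match = KEY_PATTERN.match(key)
    if match is None:
        return None
    width, height = int(match.group(1)), int(match.group(2))
    if (width, height) not in ALLOWED_SIZES:
        return None
    return ThumbnailRequest(width, height, match.group(3))


def handler(event, context):
    """Lambda-style entry point; an API Gateway proxy event shape is assumed."""
    key = (event.get("queryStringParameters") or {}).get("key", "")
    request = parse_thumbnail_key(key)
    if request is None:
        return {"statusCode": 400, "body": "unsupported thumbnail request"}
    # A real function would now fetch the original, resize it, and return
    # or redirect to the result. That part is omitted from this sketch.
    body = f"{request.width}x{request.height} of {request.original_key}"
    return {"statusCode": 200, "body": body}
```

Rejecting unknown sizes before doing any work keeps every invocation cheap and predictable, which matters when each call has a cost.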

Integration

Our POC was far from feature-complete, and it was not fully integrated into the FreeAgent application. Integrating the new serverless solution was a pitfall we kept falling into. Our application had previously generated thumbnails ahead of time, and this knowledge had leaked across multiple models, jobs, and views.

We used the following techniques to integrate the new solution:

  • Create new models to abstract business logic.
  • Use feature flags to divert the behavior to the new functionality.
  • Use a common interface to reduce the need for change in other places.
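
The three techniques above can be combined roughly as follows. This is an illustrative sketch rather than the FreeAgent code: the class names, the `on_demand_thumbnails` flag name, the URL shapes, and the `thumbnails.example.com` host are all hypothetical.

```python
from typing import Callable, Protocol


class ThumbnailSource(Protocol):
    """Common interface: callers ask for a URL and never learn how it is produced."""

    def url_for(self, attachment_id: str, size: str) -> str: ...


class PregeneratedThumbnails:
    """Legacy behaviour: thumbnails were generated ahead of time."""

    def url_for(self, attachment_id: str, size: str) -> str:
        return f"/attachments/{attachment_id}/thumbnails/{size}.png"


class OnDemandThumbnails:
    """New behaviour: the URL points at the separate thumbnail service."""

    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")

    def url_for(self, attachment_id: str, size: str) -> str:
        return f"{self.base_url}/{size}/{attachment_id}"


def thumbnail_source(flag_enabled: Callable[[str], bool]) -> ThumbnailSource:
    """Pick the implementation from a feature flag (flag name is hypothetical)."""
    if flag_enabled("on_demand_thumbnails"):
        return OnDemandThumbnails("https://thumbnails.example.com")
    return PregeneratedThumbnails()
```

Because views and jobs depend only on `url_for`, flipping the flag changes behaviour without touching them, and removing the legacy class later is a one-line change.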

Environments

We made the Thumbnail Service (as it became known) a fully independent, albeit small, system. We removed any FreeAgent business logic from it and treated it as if it were a third-party system.

Treating the system this way allowed us to:

  • Iterate changes quickly
  • Deploy independently
  • Set a good precedent for future projects

Practically speaking, this decision forced all FreeAgent application environments to talk to the production Thumbnail Service. Doing this ensured:

  • Stability across environments
  • Strong authorization*
  • Fewer surprises when deploying to production

* We used IAM Authentication with API Gateway to ensure requests from each environment were correctly authorized to access the thumbnails they were requesting.
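
As a rough illustration of per-environment authorization, an IAM policy can limit each environment's credentials to invoking only its own part of the API. The sketch below builds such a policy document. The idea that thumbnail paths are prefixed with the environment name is an assumption made for the example, as are all the identifiers passed in.

```python
import json


def invoke_policy(region: str, account_id: str, api_id: str,
                  stage: str, environment: str) -> dict:
    """Build an IAM policy allowing GET requests under one environment's path prefix.

    Assumes (hypothetically) that API paths look like /<environment>/...
    """
    resource = (
        f"arn:aws:execute-api:{region}:{account_id}:"
        f"{api_id}/{stage}/GET/{environment}/*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "execute-api:Invoke",
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    policy = invoke_policy("eu-west-1", "123456789012", "abc123", "prod", "staging")
    print(json.dumps(policy, indent=2))
```

With a policy like this attached to each environment's role, a signed request from staging for an integration thumbnail would be rejected by API Gateway before it reached the function.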

Visibility

Creating a separate service or application requires an understanding of how the service will perform in production. We’re using new technologies in AWS Lambda and API Gateway, two serverless resources from AWS that offer high scalability. Nevertheless, we must understand how our system is performing in the production environment and what impact that has.

We took the following steps to ensure the visibility of the system:

  • Define a Service Specification
  • Create Alarms based on our Specification
  • Ensure logging is accessible
  • Create business-focussed dashboards, e.g. Requests per Logo
  • Create system-focussed dashboards, e.g. API Latency
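
One way to tie alarms to a specification is to derive the alarm definition from the specification itself, so the two cannot drift apart. The sketch below assumes a hypothetical `ServiceSpec` with a single p99 latency target and builds parameters in the shape accepted by CloudWatch's `put_metric_alarm`. The target value, period, and evaluation window are illustrative, not our real figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSpec:
    """Hypothetical service specification with a single latency target."""

    api_name: str
    p99_latency_ms: float


def latency_alarm(spec: ServiceSpec) -> dict:
    """Build CloudWatch alarm parameters from the specification."""
    return {
        "AlarmName": f"{spec.api_name}-p99-latency",
        "Namespace": "AWS/ApiGateway",
        "MetricName": "Latency",  # reported by API Gateway in milliseconds
        "Dimensions": [{"Name": "ApiName", "Value": spec.api_name}],
        "ExtendedStatistic": "p99",
        "Period": 60,             # seconds per datapoint (illustrative)
        "EvaluationPeriods": 5,   # consecutive breaching periods (illustrative)
        "Threshold": spec.p99_latency_ms,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
    }
```

The resulting dictionary could be passed to `boto3.client("cloudwatch").put_metric_alarm(**params)`, or translated into a CloudFormation resource alongside the rest of the stack.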

Previously, we had a few of these metrics, but not at the granularity we have now. This visibility was vital in ensuring we delivered a stable and reliable service.

Ownership

One of the key benefits of the project was that it was run by a handful of people, who reached out to key stakeholders and the operations team only when necessary.

A single team managed deployments, application code, infrastructure, and testing. Help was still available when needed, but bottlenecks between teams were reduced considerably.

AWS services such as Lambda and API Gateway make strides towards removing the complexity of managing servers (as the name "serverless" suggests). However, it would be disingenuous to claim that moving to this new serverless stack removed all complexity. We may no longer manage the servers ourselves, but the knowledge the team needed to acquire was broad and spanned multiple domains.

Stability, Improvements & Changes

Generating on-demand thumbnails on AWS Lambda gave us more benefits:

  • Isolated environments increased stability
  • Fixing a bug benefits past and present thumbnails

However, swapping to an on-demand serverless implementation requires a mind shift.

One example was rendering invisible thumbnails. Thumbnails not shown were costing us money. Actual money. With a tangible cost. To tackle this issue, we made some improvements that cut calls to our new service roughly sixfold.

Conclusion

Overall, the project was a success. The new technologies we used created business value and lessons learned for the future. Serverless technologies may reduce the need for managing servers, but there is still a lot of peripheral work in ensuring a service is ready for production.
