As of Monday, 24th of September 2018, FreeAgent is running on the latest version of Rails, 5.2. We were inspired by Eileen’s blog post about how GitHub upgraded from Rails 3.2 to 5.2, and we wanted to share the challenges we faced and how we overcame them. Our challenges were similar to GitHub’s, and we believe they are worth reiterating to highlight their significance.
Considering that each upgrade requires considerable effort spanning multiple teams, some people might ask “is it even worth upgrading?”. That is an excellent question, and we would like to answer it with a list of the key points that drove our decision:
- Security Patches — Security is taken very seriously at FreeAgent. We aim to apply security fixes as soon as they become available and have been tested thoroughly. An older version of Rails may no longer be supported and so may not receive these fixes.
- Performance Improvements — The community constantly pushes the limits to improve the performance of libraries and frameworks. Rails is a great example where a new version can provide a significant performance boost to your application.
- New Features — Each major and minor version of Rails usually comes with a host of new features. Rails recently added the ability to manage file attachments via Active Storage, an excellent feature that allows us to migrate away from Paperclip, which is now deprecated.
- Ruby Language Improvements — Major version changes tend to support newer versions of the Ruby language. This allows us to benefit from performance improvements as well as new language features.
- Ecosystem (gems) — Ruby’s ecosystem is simply incredible, providing thousands of individual libraries that can aid you in your day-to-day job. Whilst library maintainers do an amazing job of supporting multiple versions of Rails, there is a good chance that not all features and fixes will be available to you if you run an outdated version of Rails.
Due to the structure of our development teams, changes that impact the entire codebase usually fall within the remit of our Core Services team. We think of the Core Services team as a team that provides the glue between the other units within our application.
We usually start by building a good understanding of the changes introduced by a new version. The release notes for the individual versions are a good place to start. We also found it beneficial to check the changelogs for the individual gems, which provide a more detailed list of changes. Keep in mind that incremental version changes are essential: skipping a version can mean that you miss valuable deprecation warnings and are left with errors instead.
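The idea of stepping through every intermediate version can be sketched in a few lines of Ruby. This is only an illustration, not part of our tooling: `upgrade_path` and `RELEASED_MINORS` are hypothetical names, and the list of minor releases is hard-coded rather than looked up from RubyGems.

```ruby
require "rubygems" # provides Gem::Version (loaded by default in modern Ruby)

# Illustrative, hard-coded list of Rails minor releases (an assumption for
# this sketch, not fetched from anywhere).
RELEASED_MINORS = %w[4.0 4.1 4.2 5.0 5.1 5.2].map { |v| Gem::Version.new(v) }.freeze

# Hypothetical helper: returns every minor version to stop at on the way
# from `current` to `target`, so that no round of deprecation warnings is skipped.
def upgrade_path(current, target, released = RELEASED_MINORS)
  from = Gem::Version.new(current)
  to   = Gem::Version.new(target)
  raise ArgumentError, "target must be newer than current" unless to > from

  # Keep each released minor after the current version, up to and including the target.
  released.select { |v| v > from && v <= to }.sort.map(&:to_s)
end

puts upgrade_path("5.0", "5.2").inspect # prints ["5.1", "5.2"]
```

Each entry in the returned list would be a separate upgrade with its own deprecation clean-up, rather than one jump to the final version.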
At this point we try to address as many deprecation warnings as possible before the upgrade itself. One benefit is that the commit introducing the upgrade then only includes changes that are actually related to the upgrade. Furthermore, the work of addressing deprecation warnings can be distributed across the entire development team. We added a set of tests that prevent developers from reintroducing deprecation warnings into the system. This is important because a large codebase can contain many deprecation warnings that need addressing. Working this way also results in smaller changes to the application which can be tested in isolation.
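As a rough sketch of what such a guard could look like (not our actual tests), the snippet below scans captured test output for lines starting with `DEPRECATION WARNING:`, the prefix Active Support uses, and reports any that are not on an agreed allowlist. The helper name `unexpected_deprecations`, the allowlist and the sample log lines are all made up for illustration. Rails can also be told to raise on deprecations via `config.active_support.deprecation = :raise`, which may be simpler where it fits.

```ruby
# Prefix that Active Support puts in front of deprecation messages.
DEPRECATION_PREFIX = "DEPRECATION WARNING:"

# Hypothetical helper: returns deprecation lines found in `output` that do not
# match any pattern in `allowlist` (warnings that are known and still being fixed).
def unexpected_deprecations(output, allowlist = [])
  output.each_line
        .map(&:strip)
        .select { |line| line.start_with?(DEPRECATION_PREFIX) }
        .reject { |line| allowlist.any? { |pattern| line.match?(pattern) } }
end

# Made-up sample of captured test output.
log = <<~LOG
  Running 3 tests
  DEPRECATION WARNING: render :text is deprecated (called from show at app/controllers/a.rb:4)
  DEPRECATION WARNING: alias_method_chain is deprecated (called from <top> at lib/b.rb:9)
LOG

# alias_method_chain is allowlisted here, so only the render :text line is reported.
offenders = unexpected_deprecations(log, [/alias_method_chain/])
puts offenders
```

A CI step could fail the build whenever the returned list is non-empty, with the allowlist shrinking as the remaining warnings are fixed.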
Any changes that can be addressed prior to the upgrade will be addressed at this stage as well. This could include new framework defaults that don’t manifest as deprecation warnings but could be implemented prior to the version change. The aim should be to be as compatible as possible with the version you are targeting.
Once we are confident that we have addressed everything that can be done prior to the actual upgrade, we continue by creating a new branch that contains only the changes that relate to the upgrade and can’t be addressed upfront. In an ideal world that would just be a change to your dependencies. An initial run against your CI tool will normally reveal the areas that you need to focus on. Over the following weeks we fix any issues flagged by our CI tool and periodically integrate changes that went into our master branch.
Finally, we perform a number of manual tests to ensure that the application is still operating as expected. This can take several days to complete. Any issues found during these tests need to be addressed, and further tests need to be carried out to validate the fixes.
Rolling out the changes
Once we are happy with the changes, we start to communicate our intention to roll out the upgrade to the other teams. This includes our Support team, to make them aware of potential requests from our customers that need to be fed back to the Engineering team. We usually make several announcements via Discuss (our internal Discourse-powered discussion forum), Slack and email.
On the day of the upgrade we implement a merge freeze that prevents other changes from interfering with the upgrade. The merge freeze is expected to last an entire day under normal circumstances. We keep our engineering teams up to date and lift the merge freeze earlier if everything runs smoothly.
A number of tools help us validate the health of our applications. We use New Relic, Honeybadger and some custom Grafana dashboards to see the impact of any deployments.
At this stage it is important to monitor the application and to decide how to react in the event of an issue. Based on the severity of the issue we consider rolling back to a previous version. However, due to the increased awareness of the change, we have engineers, support teams and managers ready to discuss and address any issues. We aim to move forward, and unless an issue is very severe we will not roll back the version upgrade.
- Large codebase — FreeAgent’s codebase is relatively large in size. There are over 300 controllers and nearly 1000 models to look after. Upgrades to the underlying web framework can therefore affect many areas of the application.
- Old codebase — This can be considered relative but the origins of our codebase are over 12 years old, and the app has been upgraded many times starting from Rails 1.1.2 all the way up to Rails 5.2 in incremental steps.
- Many contributors — We have many engineers working on our (majestic) monolithic Rails application. Long-running branches for Rails upgrades and merge freezes are not ideal, as we would like to continue to add value for our customers and improve our application. It is challenging to keep branches up to date and to continuously integrate changes (and fix conflicts).
- Code Coverage — Adequate code coverage will help you identify any problems that could come from an upgrade early on in the process. Conversely, a lack of test coverage can have a detrimental impact when rolling out the upgrade.
- Accumulated some technical debt — There is no codebase out there that is clear of technical debt but it is important to understand that the higher your technical debt is at this stage the more difficult it is to perform large scale upgrades like these.
- Use of private APIs — Sometimes it is necessary to use a private API to solve a particular problem. It’s important to keep in mind that these APIs are private for a reason and therefore subject to change. We identified a number of these places where we were using private APIs and it caused us problems.
Rails upgrades are a difficult undertaking and our process is by no means perfect. We strive to continuously review and improve this process. Below is a list of lessons that we learned during the updates we have performed since Rails 1.1.2:
- The use of private APIs is harmful — The use of private APIs causes more problems than it solves. What seems like a great idea at the time will come back and cause you problems further down the line. Review what options you have and see if you can provide a solution that doesn’t involve the use of private APIs.
- Incremental updates are essential — We considered going straight from Rails 5.0 to 5.2 but quickly changed our minds. Even though the two are only two minor versions apart, the changes were significant enough to cause a lot of work. Skipping a minor version not only removes valuable deprecation warnings, it also almost always results in a longer upgrade process.
- Long-running branches are not ideal — This is an area where we would like to improve. We already do a good job of integrating as many upfront changes as possible, but limited testing resources can still sometimes mean that a branch runs longer than expected. For larger upgrades we have some strategies to avoid long-running branches, which we plan to discuss in another post. GitHub have shown an alternative approach to a long-running branch that we will explore.
- Code coverage is crucial — Code coverage is your best friend to understand what is working and what is not when making large scale application changes like upgrading Rails. We identified some areas where we lacked code coverage and have since mitigated some risks by adding some additional tests.
Rails upgrades are challenging and take significant development effort, but we believe they are worth it. We hope you found our findings useful and that they help make your next upgrade easier.