Shopping for data: How thinking about supermarkets might help you to manage your Looker implementation

Posted on March 26, 2021

Cameron Diaz once said:

I can spend hours in a grocery store. I get so excited when I see food, I go crazy. I spend hours arranging my baskets so that everything fits in and nothing gets squashed. I’m really anal about it, actually.

I like to think that if Cameron worked at FreeAgent then she would feel the same about data (Cameron – please feel free to look at our current vacancies). At FreeAgent a lot of our data is queried and analysed using Looker, and I believe that there are similarities between a successful supermarket and a successful Looker implementation …

Carry varied stock

A good supermarket stocks a wide range of ingredients, which enables its customers to cook a wide range of meals. Similarly, we need to ensure that Looker has a wide range of data, in order to answer a wide range of questions. This often means combining data from various sources. For instance, by combining support ticket information with the number of article views on our knowledge base, we can better understand how to improve the information available to our users.

Keep fruit and vegetables fresh

The fruit and vegetables on display in a supermarket are seldom out-of-date. Supermarkets (presumably) manage this by a) carefully managing stock levels, and b) throwing out any expired food. We ensure that our Looker content remains “in date” by doing something similar to b) – we delete any unused looks and dashboards via our fully automated Looker Content Cleaner (LCC). Think of the LCC as supermarket staff methodically working to check each piece of fruit every day, then removing anything that looks rotten so that everything is fresh and tasty!

Store everything in its place

Supermarkets are structured to encourage you to spend as much money as possible (see: sweets displayed by the tills), but this organisation is also there to make it easier for customers to find things. So, the “cleaning” aisle will have sprays, detergents, bin bags, rubber gloves, and so on – all side-by-side. We try to replicate these “structured aisles of related produce” in Looker by having many models (the “aisles”), each containing one or more explores of related content (the “related produce”). For instance, our A/B Testing model contains the explores which show the results of our A/B tests. This means that any user who wants to understand how our tests are performing only needs to head to that “aisle” and take a look. Simple!

Cater for all (but encourage people to cook)

Not everyone enjoys cooking, and most supermarkets sell both “ready meals” and base ingredients. In Looker, you might think of pre-built reports as ready meals – someone “cooked” this dashboard so I don’t have to! However, I believe that not learning to explore the data is a bit like only eating pre-packaged sandwiches – it’ll keep you alive, but you’ll benefit from (and enjoy!) learning to cook and exploring new flavours. That’s why we run a Looker onboarding training course on a regular basis – we want to teach our staff the skills they need to “cook up” their own insights. This also makes Looker feel like their local supermarket – they know what’s on each aisle, and can quickly “find the milk” when they’re in a hurry.

Offer the convenience of delivery

Some people like visiting the supermarket, and strolling around the aisles to see what takes their fancy. Others prefer to have their grocery shop delivered straight to their door. At FreeAgent, we use many different mechanisms for data delivery: we send data from Looker to S3, and we schedule dashboards to land in email inboxes each morning at 9am sharp. But my personal favourite is Looker’s Slack integration, which we use to send analyses and dashboards from Looker directly into the conversations happening in Slack.

Hire great staff

Supermarkets have staff on hand stocking shelves, who can also help shoppers locate the item on their list that they’re struggling to find. The Analytics team members perform comparable tasks: they stock the shelves (manage our ETL process), source new produce (bring in newly requested data) and help our customers find what they’re looking for (answer questions in our Slack requests channel).

So, whether you’re designing or maintaining your Looker setup, maybe consider some of the things discussed above. Do you have the range of data necessary to build insights? Is your data “fresh”? Are you catering for all of your users? Can your users easily find what they need? Are staff on hand to help? By asking these questions, you should find that you can serve your users more efficiently and comprehensively. And don’t worry if that’s a lot to think about – tackle one thing at a time. Every little helps!

Timecop vs Rails TimeHelpers

Posted on March 25, 2021

TL;DR – You probably can’t replace Timecop with Rails’ built-in TimeHelpers, as TimeHelpers only really recreates Timecop’s freeze behaviour (its travel methods freeze time too) and can’t handle nested travelling.

Timecop is the go-to gem for testing time-dependent code, as it allows you to manipulate the current time during test runs. This is important because, without control over the time, flaky tests can emerge in your codebase.

A very simple example is testing the created_at attribute of an ActiveRecord model:

# ActiveRecord will set the model’s created_at field
# to the current time on create
model = SomeModel.create!

expect(model.created_at).to eq(Time.now)

Most of the time the above test would pass without issue. However, if the test happened to run just as the system clock ticked over to the next second, the model’s created_at timestamp would be one second earlier than Time.now in the expect comparison, causing the test to fail. A classic hard-to-reproduce flaky test.

This flakiness can be eliminated by including Timecop and calling Timecop.freeze at the beginning of this test. This will cause the current time to be constant throughout the test, regardless of what happens with the system clock. 

Timecop serves us well at FreeAgent: we use it extensively, with over 1,300 calls to Timecop across more than 500 spec files.

It’s not just us though. Timecop is well used, and well loved, throughout the Ruby community. Since its release in 2009 it has been downloaded over 81 million times according to rubygems.org, which is particularly impressive considering it has a single maintainer.

Seemingly less well known, however, is that Ruby on Rails includes its own Timecop alternative with the equally catchy name of “ActiveSupport::Testing::TimeHelpers”.

TimeHelpers was released with little fanfare in Rails 4.1, by way of a brief sentence at the bottom of the release notes in April 2014. So it’s no surprise that not many Rails developers are aware of its existence. And that includes me, as I only discovered it this week.

While searching for the Timecop documentation, I happened upon a blog post entitled “Replace Timecop With Rails’ Time Helpers in RSpec”. This instantly grabbed my attention as we have over 300 gems in our majestic monolith and any opportunity to remove a dependency is a good thing.

I spent some time getting acquainted with the TimeHelpers documentation, and it seemed on the surface that the rumours were true – TimeHelpers does indeed contain its own versions of almost all of Timecop’s features:

Timecop method   TimeHelpers equivalent
freeze           freeze_time
travel           travel / travel_to
return           unfreeze_time / travel_back
scale            N/A

The only difference is that Timecop has a scale method, which allows you to change the speed at which time passes. But three out of four ain’t bad.

Unfortunately, all is not as it seems, as I discovered upon digging deeper into TimeHelpers.

Ignoring the scale method, for which there is no equivalent in TimeHelpers, both libraries offer two ways to manipulate time: freeze and travel.

These two concepts seem self explanatory: freeze stops time, and travel offsets the current time by a given amount. That’s how Timecop works at least.

TimeHelpers sees things a little differently. freeze_time does indeed freeze time. However, TimeHelpers’ travel methods also freeze time, just with an offset applied.

There is no way to just travel in time without also freezing it when using TimeHelpers. This is because all TimeHelpers’ methods really do is stub Time.now, Date.today and DateTime.now to return the date and time specified.

Consider the following example:

include ActiveSupport::Testing::TimeHelpers

travel_to(Time.parse("2020-01-01"))

Time.current
# => 2020-01-01 00:00:00 +0000

sleep(10)

Time.current
# => 2020-01-01 00:00:00 +0000

I’d have expected the second call to Time.current to return a time 10 seconds later than the first, something like 2020-01-01 00:00:10 +0000. After all, I didn’t ask for time to be frozen, just to travel through it. It’s a reasonable assumption that time will keep on ticking after the traveling, especially as that is how Timecop works.
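The frozen behaviour follows directly from that stubbing. As a rough plain-Ruby sketch (the helper names here are made up for illustration, and this is not the actual TimeHelpers source), replacing Time.now with a method that returns a fixed value is enough to “freeze” time, offset or not:

```ruby
require "time"

# Illustrative only: stub Time.now to always return a fixed value,
# which is essentially how TimeHelpers "travels" - and why the clock
# stops ticking once you've travelled.
def stub_time_now(fixed_time)
  # Keep a reference to the real Time.now so we can restore it later
  Time.singleton_class.send(:alias_method, :original_now, :now)
  Time.define_singleton_method(:now) { fixed_time }
end

def unstub_time_now
  # Restore the real Time.now and tidy up the alias
  Time.singleton_class.send(:alias_method, :now, :original_now)
  Time.singleton_class.send(:remove_method, :original_now)
end

stub_time_now(Time.parse("2020-01-01 00:00:00 +0000"))
first = Time.now
sleep(1)
second = Time.now # still 2020-01-01 00:00:00 +0000 - time is frozen
unstub_time_now
```

The real TimeHelpers applies the same treatment to Date.today and DateTime.now as well, and restores the originals in unfreeze_time / travel_back.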

Unfortunately, this isn’t the only place that TimeHelpers fails to live up to Timecop. Timecop allows you to nest changes in time, e.g.:

Timecop.travel(Time.parse("2020-01-01")) do
  puts "First travel: #{Time.current}"

  Timecop.travel(1.day) do
    puts "Second travel: #{Time.current}"
  end
end

# => First travel: 2020-01-01 00:00:00 +0000
# => Second travel: 2020-01-02 00:00:00 +0000

Whereas trying the same with TimeHelpers raises an error:

include ActiveSupport::Testing::TimeHelpers

travel_to(Time.parse("2020-01-01")) do
  puts "First travel: #{Time.current}"

  travel(1.day) do
    puts "Second travel: #{Time.current}"
  end
end

# => RuntimeError (Calling `travel_to` with a block, when we have previously already made a call to `travel_to`, can lead to confusing time stubbing.)

Much to my disappointment, TimeHelpers is not a drop-in replacement for Timecop.

Having said that, I don’t think TimeHelpers is completely useless. In the majority of situations, freezing time is enough to reliably test time-dependent code. So if you’re starting a new Rails project, you can most likely forgo installing Timecop, and use TimeHelpers instead.

However, if you’re already using Timecop, it’s unlikely that you’ll be able to replace it with TimeHelpers.

That was the case for us at FreeAgent: we make extensive use of Timecop’s travel functionality, as well as nested time travelling, so until TimeHelpers is updated to include those features, we’ll be sticking with Timecop.

There is still a happy ending to this story, as we were able to replace Timecop with TimeHelpers in our Dev Dashboard app, which is a separate Rails codebase for managing access and authentication with our API, and now has one less gem.

We’re Gonna Need a Bigger Boat

Posted on March 9, 2021

Earlier this year, the FreeAgent marketing website www.freeagent.com was the target of a volumetric Distributed Denial of Service (DDoS) HTTP flood attack.

This was a relatively unsophisticated attack in that it targeted a particular static endpoint of our website with a massive number of HTTP GET requests from multiple remote IP addresses around the globe, as visualised on the map below.

Predominantly serving the UK small business base, FreeAgent wouldn’t normally get such a level of love and attention aimed at our website. The level of traffic directed at us was unexpected and so our standard content service and network protection resources were eventually overwhelmed, resulting in downtime and the website being unavailable for a short period.

The graphs below show a typical average level of traffic requests per minute, followed by a massive 40x jump in traffic at 22:45, triggering our automated alerting.

Our initial response was to drop traffic to the specific endpoint with a 404 Not Found HTTP response code at the load balancer layer, protecting the overall integrity of the rest of the website, while increasing the compute resource available to absorb additional capacity. This got us over the hump until traffic dropped back to normal levels at 2am – three hours after the attack began, with just the initial hour or so (77 minutes to be precise) offline.

An exact three-hour attack window is revealing in itself. Taking some of the malicious IP addresses and entering them into shodan.io reveals a pattern of vulnerable MikroTik routers being exploited. Smells like a botnet-for-hire to me.

Around the same time, our Social Media Team received an inbound Tweet which seemed suspicious and certainly not coincidental. Further analysis of the Twitter account showed little activity, apart from “following” FreeAgent, and a previous history of trying to elicit a ransom demand from other online entities, using the same wording and tactics.

So, at what point does paying for a botnet-for-hire become financially unviable? Well, at an average of often less than $10 per hour, it’s a relatively cheap and easy attack to mount. However, if the victim does not engage (nor should they!), and readily mitigates the attack, then any potential for a return soon evaporates.

But hey! Let’s up the traffic and try once more. So the attacker did later that morning, increasing the volume of traffic and pivoting to a different endpoint – including an unauthenticated endpoint on our main customer application, the forgotten password page, which almost every provider will have.

Only the static website was affected. As we had already completed the migration of our customer-facing application to AWS in December, it remained unaffected, with enough resource capacity and scalability available to meet the increased traffic demands, and we already had AWS Shield in place. For the website, we increased resources further – capacity that was luckily available, having vacated the application from the hosted data centres earlier.

Prior to this DDoS attempt, we had already begun a project to migrate our website out of our hosted data centres. As a result of the second attack, we decided to expedite this process, and the website is now hosted with a new provider, Netlify Edge, with active DDoS mitigation in place.

The Response Team also considered, and test-implemented, other alternatives, should our current mitigations prove insufficient. A sterling effort, coolly and calmly executed during a time of potential crisis and stress – a great example of our Incident Response process working like clockwork.

We have alternatives in our back pocket in case of another rainy day. Some of these include:

  • Blocking non-UK IP addresses at the Layer 3/4 level.
  • Leveraging our Runtime Application Self-Protection (RASP) technology to block at the Layer 7 level.
  • Additional traffic routing via AWS, availing of AWS Shield.
  • Hosting directly in AWS, availing of AWS Shield. 
  • Implementing alternative CDN technology.

And back to our attacker. Whatever became of them? Well, we can only surmise based on the same Twitter handle and who they followed after realising that FreeAgent wouldn’t succumb to their games. But their antics have been reported to the Police Scotland Cybercrime Operations Unit and reported on the NCSC CyberSecurity Information Sharing Partnership (CiSP) portal. Even if they have “changed their name” (but not their number!).

FreeAgent employs a number of organisational and technical measures to help protect our systems and the data of our customers. We employ a secure coding framework and strategy, with both static and dynamic security testing, regular automated and manual penetration testing and runtime application protection.

But, nothing is perfect and we are always looking to learn, iterate and improve. We operate a Responsible Disclosure program, so if you are aware of a specific vulnerability, please do reach out.

Managing Python dependencies across multiple Data Science projects with Poetry

Posted on March 5, 2021

Python is the programming language of choice for running analysis, building models and running machine learning services in production for the Data Science team at FreeAgent. A key reason we chose Python is the great ecosystem of packages available: NumPy, pandas, SciPy and scikit-learn, deep learning frameworks like TensorFlow and more bespoke options for specific tasks like Click for developing CLIs.

This wealth of options is a great strength of Python but also leads to difficulty. Managing dependencies in Python can be tricky, especially when you have multiple environments to work with (local development, cloud-based jobs and production deployments).

Our setup at FreeAgent

In the second half of last year, we made a concerted effort to rationalise how we manage Python. The goal was to make it easy to get up and running with Python in general, or with any of our Data Science projects, and to ensure that our environments could be fixed across different members of the team. The toolchain that got us to this point consists of three parts: a tool that keeps the Poetry installation isolated from the system Python, a tool that manages which version of Python we use, and Poetry itself.

While the first two tools are important, the real champion of our dependency management system is Poetry.

Using Poetry

Right away, Poetry offers a very intuitive interface similar to Bundler and Yarn, which are used elsewhere in FreeAgent to manage Ruby and JavaScript dependencies. You can create a new project that uses Poetry with poetry new <package-name>, which creates a folder with the structure of a simple Python package, including a pyproject.toml.

Poetry uses the pyproject.toml to define the dependencies of your project and you can manually add dependencies (with version constraints) under the [tool.poetry.dependencies] section. Alternatively, you can specify the dependencies interactively with poetry add <package> (with version constraints if you want).
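By way of illustration (the Python constraint here is an assumption; the package versions mirror the example project discussed below), a dependencies section might look like this:

```toml
[tool.poetry.dependencies]
python = "^3.8"
numpy = "~1.15"
matplotlib = "~3.3"
```

A caret constraint like ^3.8 allows any compatible release up to (but not including) 4.0, while a tilde constraint like ~1.15 only allows updates within the 1.15 minor series.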

Using poetry add gives you the first glimpse of the true power of Poetry. Rather than immediately trying to install the requested package and its dependencies, Poetry runs its own dependency resolver, which looks for a version of the package that satisfies the requirements without affecting the currently installed packages. If it cannot find such a version, it gives a helpful error message that should let you sort out the problem.

By way of an example, let’s suppose I have a simple project that has NumPy version 1.15.4 and Matplotlib version 3.3.4 installed. Now, there is some functionality in the latest SciPy release (1.6.1 at the time of writing) that I really want to use. I run poetry add scipy~1.6 and get the following message:

SolverProblemError
  Because no versions of scipy match >1.6,<1.6.1 || >1.6.1,<1.7
   and scipy (1.6.0) depends on numpy (>=1.16.5), scipy (>=1.6,<1.6.1 || >1.6.1,<1.7) requires numpy (>=1.16.5).
  And because scipy (1.6.1) depends on numpy (>=1.16.5), scipy (>=1.6,<1.7) requires numpy (>=1.16.5).
  So, because example-env depends on both numpy (~1.15) and scipy (~1.6), version solving failed.

This immediately tells me what went wrong and how I can fix it. I need to update the NumPy version constraint. I can do this with poetry add numpy~1.20 to get up to date and check that the tests still pass. Then poetry add scipy~1.6 works without a hitch!

Part of the poetry add process is writing the solved version constraints to a lockfile, poetry.lock. This file should be checked into version control, as it ensures that the same versions of packages are installed across environments. Someone checking out the project for the first time just needs to run poetry install and all the packages, at exactly the same versions, will be installed. This is the key to why Poetry works so well across multiple environments.

A corollary of the dependency resolver and lockfile is that Poetry is a great way to keep transitive dependencies in check. If you install your dependencies through Poetry, an upstream release of a dependency of your dependency cannot sneak into your environment unexpectedly, because every version is pinned in the lockfile. Poetry also has a tool to help you understand the dependency graph of your project, so you can see why each package is installed: poetry show --tree. Here is an example output from the simple project with NumPy, SciPy and Matplotlib installed:

❯ poetry show --tree
matplotlib 3.3.4 Python plotting package
├── cycler >=0.10
│   └── six *
├── kiwisolver >=1.0.1
├── numpy >=1.15
├── pillow >=6.2.0
├── pyparsing >=2.0.3,<2.0.4 || >2.0.4,<2.1.2 || >2.1.2,<2.1.6 || >2.1.6
└── python-dateutil >=2.1
    └── six >=1.5
numpy 1.20.1 NumPy is the fundamental package for array computing with Python.
scipy 1.6.1 SciPy: Scientific Library for Python
└── numpy >=1.16.5

So, why Poetry?

We want to ensure that our Python environments are consistent across different data scientists’ local development environments, our cloud-based jobs and any production environments that we have. While this is achievable in many different ways, Poetry makes it simple. Since rolling out Poetry, we are now using it in our Bank Transaction Classification work and across multiple ad hoc investigations. The response from the team has been positive – it has never been easier to move between different projects!