Type checking in Ruby – Part 1

Posted by on September 9, 2022

Over the course of a career in software engineering, we learn to love elements of our tooling and dislike others – that’s perfectly natural. 

As requirements change, including our own need to improve as engineers, so does what appeals to us when reaching for a new framework, language or library. 

A common path for lots of engineers will have been to learn something like C or C++, both statically typed languages where you need to declare what kind of value a variable will hold. Then onto something else, maybe Java or C#; these are more modern takes with huge standard libraries to accelerate us towards our desired results.

Then maybe we move on to more dynamic languages like Ruby or JavaScript. All of a sudden you can just build stuff with your whole focus on the logic, the nuts and bolts, and with fewer constraints to worry about, things feel quicker.

Eventually though, pieces of your once-loved toolset will appeal again, and you’ll maybe appreciate what they gave you before you moved on to the next thing.

If you’re writing Ruby…why not have both? 

Ruby type checking tools

Various tools have emerged over the last few years to add a level of type safety to Ruby.

One of the most popular is Sorbet, which can add static analysis and run-time type checking to your code, and it’s an amazing tool.

More recently, RBS (Ruby Signatures) shipped with Ruby 3.0, with other tooling, such as Steep, on the way from members of the Ruby core team. It’s an exciting addition to the Ruby landscape, so let’s take a closer look at type checking in Ruby using RBS and Steep…

RBS (Ruby Signatures)

What is RBS? Well, it’s not a type-checker – it’s just a language that lets you describe your classes and methods.

Other tools, such as Steep, perform the actual checking based on its signatures. Sorbet, by contrast, primarily relies on its own sig annotations and RBI files.

Steep and another tool, TypeProf, are being built by members of the Ruby core team to make adding types to your codebase easier. TypeProf abstractly executes your Ruby code at the type level and outputs a candidate .rbs file for you.

Some IDEs (like RubyMine) have included RBS support to implement some hinting – like the squiggly lines you see under a method invocation that isn’t correct.

So, why is type checking useful?

Type checking can serve as a low-level unit test for a method and its use in a system, and it also provides accurate documentation for your API. Take this code for example:

# lib/greeting.rb
class Greeting
  def say_hello(arg)
    puts "Hello, #{arg}"
  end
end

# main.rb
require "./lib/greeting"

greet = Greeting.new
greet.say_hello("FreeAgent")

But what is arg? It’s easy to see by looking at the code in this example that it’ll be output as a string, and thanks to Ruby it doesn’t really matter what we pass in as that parameter – it’ll work out how to output something, even if it’s just the class name/object id. 
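As a small illustration of that point, string interpolation calls to_s on whatever it is given, so the method accepts a string, a number, nil or an arbitrary object without complaint. The sketch below returns the greeting rather than printing it, purely so the results are easy to compare; the greeting_for helper is a hypothetical name and not part of the example above.

```ruby
# A minimal sketch: interpolation calls #to_s on any object it is given.
# `greeting_for` is a hypothetical helper, not part of the original example.
def greeting_for(arg)
  "Hello, #{arg}"
end

from_string  = greeting_for("FreeAgent")  # a String is used as-is
from_integer = greeting_for(42)           # Integer#to_s gives "42"
from_nil     = greeting_for(nil)          # nil.to_s is an empty string
from_object  = greeting_for(Object.new)   # falls back to class name and object id

puts from_string, from_integer, from_nil.inspect, from_object
```

None of these calls raises, which is convenient but also means Ruby gives no hint about what the author actually intended arg to be.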

So the type here isn’t too much of a concern. We can, however, leave each other clues as to what it is, and be more helpful to our team and future selves. A comment, YARD docs, a more descriptive name or a spec would all improve this.

But over time one or more of those things may change and fall out of sync with what the method is doing or what it requires.
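For instance, a YARD-annotated version of the method might look something like the sketch below. The String type in the tag is an assumption made for illustration, since the example so far never says what the parameter should be. Nothing in Ruby checks these comments against the code, which is how they can drift.

```ruby
# A sketch of documenting intent with YARD tags. The String type is an
# assumption for illustration; Ruby itself never checks these comments.
class Greeting
  # Prints a greeting for the given name.
  #
  # @param name [String] the name to greet
  # @return [void]
  def say_hello(name)
    puts "Hello, #{name}"
  end
end

Greeting.new.say_hello("FreeAgent")
Greeting.new.say_hello(:not_a_string) # still runs: the docs are not enforced
```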

Let’s change things a bit so that we can call that method with a 2nd parameter but implement it incorrectly…

say_hello("FreeAgent", "Hi!")

You might assume that the 2nd parameter is some sort of prefix for the output. But this example is even more contrived than that!

def say_hello(name, count)
  count.times do
    puts "Hello, #{name}"
  end
end

We’ve given arg a more useful name, and added a new count parameter – the type of this new parameter is important since we’re calling a method on it. That method, #times, is only defined on Integer types, so calling it on a string will lead to problems.

Despite that, say_hello("FreeAgent", "Hi!") looked correct at first glance, and it is valid Ruby after all.

Without actually taking a look at the code to see what that 2nd parameter does, you wouldn’t know it was wrong until runtime when you would see a NoMethodError exception because #times is undefined on instances of String. So, how can we avoid this and make things easier?
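To make that failure concrete, here is a minimal sketch that runs both calls and reports which error class, if any, each one raises. The error_class_for helper is a hypothetical name introduced only for this illustration.

```ruby
# A minimal sketch of the runtime failure described above.
def say_hello(name, count)
  count.times do
    puts "Hello, #{name}"
  end
end

# Hypothetical helper: returns the class of the error raised by the block,
# or nil if the block completes without raising.
def error_class_for
  yield
  nil
rescue StandardError => e
  e.class
end

# String does not define #times, so this call only fails once it is executed.
puts error_class_for { say_hello("FreeAgent", "Hi!") }.inspect

# An Integer count works as intended and no error is raised.
puts error_class_for { say_hello("FreeAgent", 2) }.inspect
```

Both calls are syntactically valid Ruby, so the mistake in the first one only shows up when that line actually runs.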

Enter type checking with RBS and Steep

Steep is a static type checker: it reads RBS files and checks that the corresponding Ruby code calls and implements each method in line with its signature, covering both the arguments passed and the value returned.

With Ruby 3 and the Steep gem installed, a Steepfile like this one is all you need to get started:

target :lib do
  signature "sig"   # signatures in the sig/ folder
  check "lib"       # type check ruby code in the lib/ folder
  check "main.rb"   # type check our main/entrypoint file
end

With Steep configured and watching our project (steep watch .), and by giving our Greeting class a type signature like this:

# sig/greeting.rbs
class Greeting
  def say_hello: (String, Integer) -> void
end

updating main.rb…

# main.rb
require "./lib/greeting"

greet = Greeting.new
greet.say_hello("Test", "5")

you’ll see this output from Steep in your console:

[error] Cannot pass a value of type `::String` as an argument of type `::Integer`
Diagnostic ID: Ruby::ArgumentTypeMismatch
greet.say_hello("Test", "5")
                        ~~~

It’s important to note that our main.rb will still run (and raise an exception) – think of the type checking step like a sort of spec run. A failing spec won’t stop you running the Ruby code it describes…but there may be trouble ahead if you ignore it. 

Let’s fix up the code…

greet.say_hello("Test", 2)
# 🔬 Type checking updated files...done

No errors. Nice! And now running ruby main.rb gives us:

Hello, Test
Hello, Test

Wrapping up

So, there we have it – a quick introduction to type checking in Ruby using RBS and Steep. 

The future of type checking in Ruby looks really exciting with some mature tools already having widespread adoption. Give it a try and see what works best for you. Have fun!

My experience as an Analytics Intern

Posted by on September 8, 2022

I was really excited to begin interning at FreeAgent and after 9 weeks in the Data Analytics team I feel I’ve learnt a lot about working in a team inside a company, and about the culture here.

I thought I would write a bit about how it was getting set up, working on my project and communicating my findings to the rest of the company.

Onboarding/set-up

There were a lot of onboarding events (especially on the first day!) but I think they mostly provided some interesting insights into FreeAgent. Especially valuable were the sessions explaining the app and its users’ motivations. I was particularly impressed by the event run by the CEO, Roan. Having the CEO meet with our small onboarding group and be open to questions made a great first impression.

The technical set-up was also surprisingly painless. I’ve had much more difficult times setting up dev environments in the past, but the IT team was really helpful and had everything moving smoothly. I was pleased by how much I ended up liking the Python package manager Poetry, as this was my first time using it. The Notion page was really helpful in getting it set up and was written in a comprehensible way. Post set-up I was able to quickly settle into my workflow and rarely had any issues.

Meeting the team

It was great meeting and working with the team. As I was in a smaller team, I got to know all of my teammates well and felt I could turn to them when I needed help. As the team was close it felt a bit tough at first to break into the flow, but after a bit I was also able to chat comfortably with them. Having an assigned “buddy” was also very useful, as I felt I had someone I could ask lots of questions (even though the other team members assured me that I could ask them as many as I wanted!).

Our daily morning stand-up meetings meant that I was in constant contact with them which was great for feedback and feeling connected, even as our team was working mostly remotely. Meeting members of the other teams was also easy, as I could join the weekly intern sprint demos or one of the several forums, such as the accessibility forum, which let me get a good view of the company.

Grabbing the data

My first task was grabbing the data I wanted to analyse from our Redshift data warehouse. I did this using SQL queries in Postico. It was exciting to be able to apply SQL skills I had learned throughout my education to industry scale data. My queries were also not without bugs! But with support from my team I was able to get the correct data and run some interesting analyses. I think it really improved my confidence with SQL.

Initial Struggles

My project was focussed on clustering customers: finding larger groups that customers belong to, which are then easier to analyse and can reveal trends. I faced quite a few difficulties at first; for example, some of the techniques I wanted to use weren’t quite suited to the data. A member of the Data Science team showed me a better technique, but it was still difficult to interpret the results due to the sheer number of data points. My manager then let me know about some work that had been done on grouping industries, which inspired me to shift to clustering industries. With some help on how to convert the data, this made for a much more easily manipulated dataset and results that were considerably easier to interpret. It also remained applicable to customers, as customers are labelled with industries.

Clustering

After switching to the industry representations I was able to effectively use PCA to visualise them and remove an outlier. I then used hierarchical clustering to group the industries. It produced a really interesting result: similar industries were seemingly being clustered together. This suggested that similar industries are actually using the app in similar ways. The computer-generated clusters had some similarities to the previous human-made groupings but were largely quite different. It was exciting to be able to apply techniques such as PCA and clustering to actual industry-level data, and it was also cool to see the produced labels being pushed to the database for others to use.

Classifier

As a side project I did some work on a subscription predictor. For this I took the first few weeks of usage from a customer and tried using it to predict whether they would be subscribed in the future. This led to some interesting considerations, such as how many initial weeks we need to feed into the model and what it means for a customer to be subscribed in the future. The answers were also not obvious, as I did this work across two different customer acquisition channels. It was nice to follow strategies I learnt about on machine learning courses, such as splitting the data into training, validation and test sets, and in the end I produced several classifiers with accuracy around 72-74%. I was very pleased with this as I had experienced difficulties with using the customer data previously.

Presenting findings

Another important lesson I learned at FreeAgent is that the work is not over once the result has been found. The important next step is communicating that result to other people! I regularly gave presentations to members of other teams to show the work that I had done, but I think the highlight of this aspect was my Town Hall presentation as I was able to give a whole company presentation on my findings. Members of my team and the Data Science team provided lots of help which let me feel really prepared on the day of the presentation. It was exciting to be able to present to so many people and I think it really improved my skills as I also had to tailor the presentation to the less data focussed members of the audience. It was great to be able to interact with members of other teams through these presentations.

Conclusion

Overall, I really enjoyed my time at FreeAgent. I think it really helped mature my skills not only in applied machine learning but also in presenting the results of those techniques. I was also able to have a good time thanks to the great members of my team and all the lovely people I met across the company. I would really recommend working at FreeAgent for anyone who’s thinking about it. Thanks to the Data Analytics team and FreeAgent for a great internship!

Fixing my first bug – the experience

Posted by on September 1, 2022

As a computer science student, I had obviously fixed bugs in code before. However, the first one I had to fix at FreeAgent felt significantly different from the corrections I had to make in my university assignments and other small projects. I think that was because whenever I worked on a project, I either wrote the whole code or the project contained just a few files. That meant I could skim through the code, fully understand it and know what each method and variable was. After a few hours of working on it, I could probably recall all that information from memory. However, working on a large codebase like FreeAgent’s was a completely different experience.

Starting with a bug

Opening the code felt a bit overwhelming. There were many files, all in different folders within folders and named using a system I didn’t know. I quickly learnt not to worry about most of the files, just the ones I was interested in modifying. If I needed any information about some variable, it was easier to check it when I needed it. I also learnt how much time I could save, and how much simpler it is, by searching for the file or keyphrase I need instead of looking for it manually. That might not be significant with five files containing a couple of lines of code each, but as soon as that increases, the time it takes to search manually grows rapidly.

The process

The projects I usually work on at university are small, so I wasn’t worried if I did something wrong. If I messed up the code too much, I would have to recreate a few lines of code in the worst-case scenario. Maybe a bit annoying but not too time-consuming. When working on a bug in FreeAgent, I was worried about ruining something by adding or deleting the wrong line of code. However, there are lots of processes in place to prevent you from crashing something by accident. I ran into a lot of problems because it was the first time I had used most of the programs, and I hadn’t programmed in Ruby before. However, all my co-workers helped me through any problems I encountered.

Finished and fixed

Finally, my changes passed all the required tests. It felt very satisfying and exciting when I managed to send off the code to fix the bug and moved the card with the task to the “Done” column. After a short celebration with virtual confetti falling down my screen, I was ready to face the next problem.

Lessons and observations

The whole experience showed me that programming as a job is even more different from university assignments than I had thought. However, I believe both have their appeal. University assignments allow you to discover and try out a wide range of algorithms in smaller, more isolated environments. And coding during an internship is more satisfying because of the real-life problems. It also showed me how significant the coding rules are, which didn’t seem that important when I was working on the small projects at university.