data science – Grinding Gears

All posts tagged with 'data science'

Combining text with numerical and categorical features for classification

Posted by Delphine Rabiller on 17 May 2024

Classification with transformer models: A common approach for classification tasks with text data is to fine-tune a pre-trained transformer model on domain-specific data. At FreeAgent we apply this approach to automatically categorise bank transactions, using raw inputs that are a mixture of text, numerical and categorical data types. The current approach is to concatenate the input features for each transaction into a single string before passing it to the model. For…

➼ Read other posts about AWS or BERT or data science or fine-tuning or hugging face or machine learning or NLP

Using API Gateway, Lambda, SageMaker and DynamoDB to build a categorisation service in AWS

Posted by Ed Berry on 19 January 2024

I’ve talked previously about the value of combining rules-based and machine learning approaches to categorisation. In short, rules-based approaches make it easy to do customer-level personalisation that complements a machine learning model trained to find patterns across customers. In this post I’ll talk about how we used AWS to build an expense categorisation service that combines machine learning with a rules-based approach. This service forms part of the Smart Capture…

➼ Read other posts about analytics or AWS or data or data science or machine learning

Combining machine learning with rules-based personalisation

Posted by Ed Berry on 27 November 2023

One of the ways we use machine learning at FreeAgent is to help automate data entry. Keeping on top of your accounts can involve slightly tedious manual tasks like categorising bank transactions or managing your expenses. Machine learning can help here by automating aspects of these tasks so our users can nail their daily admin and focus on bigger things. Personalisation with rules: In 2020 we launched our first operational…

➼ Read other posts about data science or machine learning

Combining data from different sources with SageMaker pipelines

Posted by Delphine Rabiller on 2 August 2023

Generating datasets for machine learning: Preparing data and generating datasets is a crucial step in training a machine learning model. If you are lucky, your data might come from a single .csv file. However, in most cases pulling together the input features to train your machine learning model will require combining datasets from different sources. Combining data from different sources manually can be a time-consuming, error-prone process. …

➼ Read other posts about athena or AWS or data or data science or redshift or SageMaker

The Data Science Internship Chronicles: A Starfleet-worthy Tale of Numeric Exploration

Posted by Pinar Batat Buke on 30 March 2023

In the vast expanse of the universe, I, a humble data science intern, set out on a mission to improve a classification model. As I delved deeper into the data, I encountered anomalies and outliers that threatened to disrupt my analysis. But with the guidance of my mentors and the help of advanced data tools, I navigated through the stars and uncovered the hidden patterns that led to breakthrough insights.…

➼ Read other posts about data or data science or internship

Mindfulness with GitHub

Posted by Pinar Batat Buke on 20 March 2023

I was a researcher in chemistry in my previous career, so I have a habit of labelling everything. It is important in chemistry to be organised; you don’t want to mix unknown liquids in unlabelled beakers. Can you guess why? BOOM! I apply this habit in every area of my life. Now, everything has a place and is clearly labelled. I have a place for vertically striped socks and a…

➼ Read other posts about data science or git or github

What a data science degree doesn’t teach you

Posted by Anna Cunningham on 28 July 2022

When I enrolled on my data science master’s degree I had limited statistical and coding knowledge. This course was designed to teach these skills from the bottom up. Having now worked as a software engineering intern, I have come to realise a lot of things were missed. Moving beyond ‘if it works… it works!’ Learning to code can seem very daunting. There are so many resources and even languages. Where…

➼ Read other posts about data platform or data science or internship or software engineering or university

Getting started with Jupyter Notebook

Posted by Ferdinand Becker on 12 July 2022

Jupyter Notebook is a development environment that runs in your web browser and can be used with several languages, including R and Python. In this blog post, we’ll look at some of the benefits of using Jupyter Notebook and how to start using it with Python. Benefits of Jupyter Notebook. Chunking code into cells: Instead of having to write code in large flat files, developers can use Jupyter Notebook to…

➼ Read other posts about analytics or data science or internship or jupyter notebook or python or tools or tutorial

How we structure our data teams at FreeAgent

Posted by Ed Berry on 3 June 2022

Since joining FreeAgent back in April I’ve been both impressed by and interested in how the Data organisation is structured. I’ve come from an enterprise world where you have lots of Data Engineers, a team of dedicated Data Architects and a separate Business Intelligence org. A few things that immediately struck me at FreeAgent were: No one has the title ‘Data Engineer’; Data Analytics are part of the Engineering org; No one has…

➼ Read other posts about analytics or data or data science or platform

Trading the lab coat for the computer – my journey to data science

Posted by Delphine Rabiller on 8 February 2022

I became a data scientist just over two years ago. It’s not that long since I traded my lab coat for a computer job, and a few people have asked me how I made the transition, if I could help someone get into data or if I could just answer some questions about what it’s like to work in data. So I figured I would put it all together in…

➼ Read other posts about career change or data or data science or general

Grinding Gears

Tales of code crunching from the FreeAgent Engineering team