All posts written by Delphine Rabiller
Combining text with numerical and categorical features for classification
Classification with transformer models
A common approach for classification tasks with text data is to fine-tune a pre-trained transformer model on domain-specific data. At FreeAgent we apply this approach to automatically categorise bank transactions, using raw inputs that are a mixture of text, numerical and categorical data types. The current approach is to concatenate the input features for each transaction into a single string before passing it to the model. For… Continue reading
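As a rough illustration of the concatenation step described in this excerpt, here is a minimal sketch in Python. The field names (`description`, `amount`, `transaction_type`), the separator and the labels are assumptions made for the example, not the format FreeAgent actually uses.

```python
def transaction_to_text(
    description: str,
    amount: float,
    transaction_type: str,
    sep: str = " | ",
) -> str:
    """Join one transaction's mixed-type features into a single string.

    All field names and the output layout are hypothetical.
    """
    parts = [
        description.strip(),          # free-text feature
        f"amount: {amount:.2f}",      # numerical feature, rendered as text
        f"type: {transaction_type}",  # categorical feature
    ]
    return sep.join(parts)


example = transaction_to_text("CARD PAYMENT TO ACME LTD", 42.5, "debit")
print(example)
```

The resulting string could then be tokenised and passed to a text classifier like any other piece of text.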
Combining data from different sources with SageMaker pipelines
Generating datasets for machine learning
Preparing data and generating datasets is a crucial step in training a machine learning model. If you are lucky, your data might come from a single .csv file. However, in most cases, pulling together the input features to train your machine learning model will require combining datasets from different sources. Combining data from different sources manually can be a time-consuming, error-prone process. … Continue reading
Trading the lab coat for the computer – my journey to data science
I became a data scientist just over two years ago. It’s not that long since I traded my lab coat for a computer job, and a few people have asked me how I made the transition, whether I could help someone get into data, or whether I could just answer some questions about what it’s like to work in data. So I figured I would put it all together in… Continue reading
Being an Introvert in a Meeting
A toolbox to get your voice heard
Introvert: a typically reserved or quiet person who tends to be introspective and more comfortable with a small group of people.
Context
This November, Lea [1], Lana [2] and I went to the Women of Silicon Roundabout conference. There were a few talks about developing your soft skills to do better in the world of tech. One talk that particularly interested me was… Continue reading