Introduction
Data is exploding, and so are the tools to manage it. From generating and collecting to cleaning and analysing, these tools help create valuable products for customers and give stakeholders decisive insights. As Data Engineering at FreeAgent continues to evolve, we’re focusing on providing more reliability and quality in our data products. For building data pipelines, we’ve started to move from a no-code approach towards a software-engineering-focused approach. With so many data orchestration tools available in 2025, we decided to investigate and share our opinions on what is out there. The tools we explored are Prefect, Dagster, Airflow (MWAA), and Mage.
Evaluation and Criteria
We began by defining a list of criteria that would be used to evaluate each tool. This wasn’t about picking the shiniest tool, but the most appropriate for us, at this time. It should help solve the problems we are facing now, and shape our future work and opportunities. These requirements may change over time, and we might re-evaluate these tools and others as a result. We spent a few weeks researching each tool, learning about it and using it to build simple data pipelines. We also talked directly to the companies where appropriate. We wanted to be explorative, but pragmatic.
You may have different requirements, and opinions on the tooling. Luckily, there’s a lot of different options out there.
The following is a subset of the criteria that we used to evaluate the tools. In reality we had about 30 different criteria.
- Support Software Engineering (SWE) practices (code review, CI/CD, testing, local development, multiple environments, etc.)
- Cost effective
- Integrates with other data tooling (especially DBT, the primary tool we use for transformations in our pipelines)
- Secure (SSO support, RBAC, network control, isolated environments, etc.)
- Supports complex orchestrations (schedules, reactive, DAGs)
- Supports data quality checks
- Supports unification of tooling across all data engineering
- Data Visibility and lineage
- Increased Reliability (alerts, SLOs, retries, etc.)
Mage
⚠️🚫⚠️❌✅✅⚠️⚠️✅ (at-a-glance ratings against our criteria, in the order listed above; see the Comparison section below for the legend)
Mage’s Unique Hybrid Notebook/IDE Approach:
Mage is data-centric and has a dedicated UI for a hybrid Notebook/IDE that runs in the browser (the Pro version does have a VS Code integration, but we were unable to test this). This Notebook approach makes it a different proposition from the others.
Pipeline and Block Concepts:
Mage uses pipelines and blocks. Blocks are granular components that are used to build pipelines. Each block has a corresponding source file and can be one of several types depending on what it does, e.g. a loader, transformer, or exporter. Templates provide some boilerplate, including a test placeholder. Blocks are visualised in Notebooks and can be run independently in the UI. Pipelines can be triggered and run via a schedule, an event, or the API. Pipelines are YAML files under the hood, but you primarily work with them via the UI.
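For example, a loader block’s generated source file looks roughly like the following (a minimal sketch based on Mage’s block templates; the exact imports and boilerplate may differ between versions):

```python
# Mage's templates guard the decorator imports so the same file runs both as
# a pipeline block and inside the Notebook UI.
if 'data_loader' not in globals():
    from mage_ai.data_preparation.decorators import data_loader
if 'test' not in globals():
    from mage_ai.data_preparation.decorators import test

import pandas as pd


@data_loader
def load_data(*args, **kwargs):
    # The return value is passed to downstream transformer/exporter blocks.
    return pd.DataFrame({"id": [1, 2, 3]})


@test
def test_output(output, *args) -> None:
    # The template ships with a test placeholder like this one.
    assert output is not None, 'The output is undefined'
```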
Ease of AWS Setup via IaC:
Mage provides Infrastructure as Code (IaC) code that allows us to spin it up in AWS with ease. It’s always very helpful when tools provide guidance on setting up their system in the cloud.
Convenient Data Integrations but Limited Lineage:
The data integrations with Mage were also convenient. DBT could be run as a single model, or as multiple models using the DBT CLI options. The UI provides a tree view of the pipeline to show the dependencies between blocks, and can also show charts and visualisations of the data. Their docs mention data lineage, but we could only really see it at the pipeline level; we could not see a global lineage across all assets.
Concerns Regarding Documentation, Versioning, and Security:
A number of data-focused features (such as backfilling and retries) had missing or work-in-progress documentation. Mage also has its own file versioning system alongside its git integration, which we found a little confusing. We have had bad experiences with this type of versioning (UI-based, syncing to a git repo) with other products in the past, and the multiple ways to version put us off. We also managed to get into a situation where Mage raised an error that displayed all our secrets 🥵. Ultimately, this made us question whether Mage was right for us in a production environment.
Pros | Cons |
---|---|
Easy setup in AWS due to maintained Terraform modules | Security concerns around environment secrets being logged on errors |
Friendly UI | GitHub integration required broad permissions, and versioning was complicated |
Reusable components | Difficult to pass data between SQL and Python blocks |
Data focused | Notebook-driven development |
When to use?
When you want a low-ish code/Notebook solution that’s data-focused and easy to deploy to your cloud provider.
Airflow (MWAA)
⚠️✅❌✅✅⚠️✅⚠️⚠️
Airflow: The Industry Standard:
Apache Airflow is the data orchestration tool. It has been around a long time, has a large community and is the industry standard. We were specifically looking at Amazon Managed Workflows for Apache Airflow (MWAA) as we run our infrastructure on AWS. Similar to the other tools in this post, it’s Python based. However, with Airflow you generally use a Python context manager to construct your DAGs. The other tools use decorator functions as their Python interface.
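For example, a small DAG built with the context-manager style looks roughly like this (a minimal sketch for Airflow 2.x; the task bodies and names are placeholders):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("extracting...")


def load():
    print("loading...")


with DAG(
    dag_id="example_pipeline",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Dependencies are declared between tasks, not between the data they produce.
    extract_task >> load_task
```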
Suboptimal DBT Integration via BashOperator:
The ‘official’ way to run DBT (core) from Airflow is via the BashOperator, which calls out to the DBT CLI to run the project. Unfortunately, it doesn’t provide a nice integration: we get the logs, but nothing else. You can find libraries that integrate with DBT and provide nicer interfaces (visually and in the code), but they still didn’t get us close to what we were experiencing with Dagster’s asset graph. We could see which models ran in a DAG, but we couldn’t see which models depended on each other outside of that, i.e. it was still a task-based pipeline, not a data-based one.
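For reference, the BashOperator approach is roughly the following (a sketch; the paths are placeholders, and the operator would sit inside a DAG like the one above):

```python
from airflow.operators.bash import BashOperator

# Shells out to the dbt CLI; Airflow sees a single task and its logs, not the
# individual models or the dependencies between them.
dbt_run = BashOperator(
    task_id="dbt_run",
    bash_command="dbt run --project-dir /usr/local/airflow/dbt --profiles-dir /usr/local/airflow/dbt",
)
```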
Deployment to MWAA with Inconsistencies:
Deploying to MWAA was fairly trivial, but slightly inconsistent and constrained. To update the source code, you upload objects to S3, which automatically makes them available to MWAA within about 30 seconds. However, updating your Python libraries follows a slightly different process (uploading to S3, then updating the MWAA environment) and takes between 5 and 10 minutes.
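Updating the DAG source is essentially an S3 upload (a sketch; the bucket name and key are placeholders):

```python
import boto3

# MWAA watches the configured dags/ prefix in its S3 bucket and picks up new
# or changed files automatically, typically within about 30 seconds.
s3 = boto3.client("s3")
s3.upload_file("dags/example_pipeline.py", "my-mwaa-bucket", "dags/example_pipeline.py")
```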
Dated Dependency Management on MWAA:
Dependency management on MWAA, and Airflow in general, is quite dated. Airflow recommends against using Poetry to manage dependencies (which we currently use for our Python projects), and MWAA requires a constraints definition. This severely limits the developer experience and which libraries can be used. Unfortunately, none of the DBT wrapper libraries were compatible with the constraints.
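By way of illustration, MWAA dependency updates go through a plain requirements.txt, typically pinned against the Airflow constraints file, roughly like this (the constraints URL and versions are examples only):

```text
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.10.3/constraints-3.11.txt"
dbt-core==1.8.*
```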
Anticipation for Airflow 3’s Modernization:
Airflow 3 has just become generally available, bringing a more modern UI as well as Task Isolation and Event-Driven Workflows. These all sound like great additions, but they follow features that other tools (Prefect and Dagster) already have.
Pros | Cons |
---|---|
Mature DAG framework | Clunky / Legacy UI |
Easy to set up in AWS | DBT Integration |
Security fits with existing IAM setup | More general DAG – Lack of specific Data visibility and lineage |
When to use?
Choose Airflow when you want to use the industry standard with a proven track record to work at scale.
Prefect
✅⚠️⚠️✅✅✅✅⚠️✅
Overview of Prefect’s Core Concepts:
Prefect is a modern Python based orchestration tool with a minimal Python interface that lets you get started quickly. You create flows and tasks by wrapping standard Python functions in the corresponding decorators. A flow is the entrypoint of a workflow and signifies a piece of work to be done; flows can call other flows, or tasks (which are more granular pieces of work). You deploy flows to work pools, and when you deploy a flow you can configure it with a cron schedule or an event trigger. The work pool determines where a flow runs (in process, on a cloud provider, etc.) and can be dynamic and flexible in the resources (CPU, memory) it provides; it could also be a local/in-process environment.
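As a rough illustration of the decorator interface (a minimal sketch; the task bodies and names are placeholders):

```python
from prefect import flow, task


@task(retries=3)
def extract() -> list[int]:
    # Tasks are granular units of work with built-in retry options.
    return [1, 2, 3]


@task
def load(rows: list[int]) -> None:
    print(f"loaded {len(rows)} rows")


@flow
def pipeline() -> None:
    # The flow is the entrypoint; it can call tasks or other flows.
    load(extract())


if __name__ == "__main__":
    # Runs in process for local development; deploying to a work pool with a
    # schedule or trigger is a separate step.
    pipeline()
```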
Standout Feature: Event-Driven Triggers:
A stand-out feature of Prefect was their events and triggering actions. You can trigger actions (run a deployment, sending notification etc.) based on events (such as run states, and deployment status). You can also trigger actions of events not occurring. All Prefect objects (such as flows and tasks) publish events. You can also ingest events through their API, which can then be used to trigger actions. We saw a really nice example of consuming webhooks from the Snowflake status page, which triggered an action to disable their flows that required Snowflake.
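Hooking a deployment up to an event looks roughly like this (a sketch against Prefect 3’s event triggers, reusing the flow from the earlier sketch; the event name is a hypothetical example):

```python
from prefect.events import DeploymentEventTrigger

# Serve the flow with a trigger so a run starts whenever a matching event
# arrives; "external.snowflake.recovered" is a made-up event name.
pipeline.serve(
    name="reactive-pipeline",
    triggers=[DeploymentEventTrigger(expect={"external.snowflake.recovered"})],
)
```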
Challenges with Prefect’s Terminology and Development Workflow:
From our initial testing, we found the terminology used in Prefect quite difficult to follow. It was unclear how best to run a flow in development, as there were a number of options: running a Python script that just calls the decorated function, deploying to a local in-process work pool, or deploying to Prefect Cloud. I couldn’t help but feel that Prefect’s abstractions were exposing their internals more than they needed to.
Focus on Platform Components and Lack of Data-Centric Tooling:
Prefect placed a lot of emphasis on their platform components. As a result, there were a lot of flexible and powerful tools for failure handling, provisioning dynamic infrastructure, and events. However, we felt there was a lack of data-centred tooling; for example, there was no data visibility or lineage out of the box. Prefect did integrate with DBT, but due to the lack of data lineage we preferred the integration offered by other tools. The Prefect team did tell us that lineage within a flow was on the horizon for them, so it is definitely one to watch.
Pros | Cons |
---|---|
Very flexible, modern orchestration framework | More general DAG – Lack of specific Data visibility and lineage features (although this was being worked on) |
Hybrid deployments / Bring Your Own Infrastructure | Lack of opinionated structure |
Simple to get started (decorator interface) | Script focused / Lack of uniform dev environment |
Expressive event based triggers | |
Error handling / retry mechanisms | |
When to use?
Choose Prefect when you require robust failure handling, and comprehensive event based pipelines.
Dagster
✅✅✅✅✅✅✅✅✅
Dagster’s Data-Centric Approach:
Dagster is another modern Python based orchestration tool. It involves more initial setup than Prefect, but is also more opinionated and more data-centric. Dagster’s novel idea is that you should shape your data pipelines around the data they produce, instead of the steps you take to build them. This allows Dagster to create powerful data lineage visualisations that let you clearly see asset dependencies.
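In code, an asset and its dependency are declared roughly like this (a minimal sketch; the asset names and contents are placeholders):

```python
from dagster import asset


@asset
def raw_customers() -> list[dict]:
    # An asset describes the data it produces, not just a step to execute.
    return [{"id": 1}, {"id": 2}]


@asset
def customer_count(raw_customers: list[dict]) -> int:
    # Taking raw_customers as an input creates the lineage edge that Dagster
    # renders in its asset graph.
    return len(raw_customers)
```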
Strong DBT Integration and Data-First Focus:
The DBT integration with Dagster works really well because of how closely DBT models align with Dagster assets: with a few lines of code you can have a complete asset map of your DBT models. The data-first approach doesn’t stop there. Partitioning and quality checks are built into Dagster, and it’s easy to see which assets are in what state.
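The integration is roughly as follows (a sketch using the dagster-dbt package; the manifest path is a placeholder and the exact API may differ between versions):

```python
from pathlib import Path

from dagster import AssetExecutionContext
from dagster_dbt import DbtCliResource, dbt_assets

# Placeholder path to the dbt manifest generated when the project is compiled.
DBT_MANIFEST = Path("dbt_project/target/manifest.json")


@dbt_assets(manifest=DBT_MANIFEST)
def our_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
    # Each dbt model becomes a Dagster asset, so the dbt DAG appears in the
    # global asset lineage alongside non-dbt assets.
    yield from dbt.cli(["build"], context=context).stream()
```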
Flexible Deployment and Developer Experience:
Dagster’s Hybrid Deployment model (where they host the UI and metadata, and you host the agent and your own code) provided us with flexible, secure, isolated environments. Coupled with their CI/CD for branch deployments, it gave us a strong developer experience that follows software engineering best practices. Furthermore, Dagster provides a simple dev environment setup with code reloading.
Effective Training Resources Available:
Dagster University provided us with a foundational understanding of the platform, and content we could share with our peers, making onboarding easier with less burden on us to provide the training. There are currently three courses (foundational, DBT, and testing) which help you get up and running.
Weakness in Event-Based Triggering:
One of the weaker features of Dagster was the lack of event-based triggers for jobs. Dagster has Sensors, which let you simulate event-like behaviour by running frequent interval checks for whether something needs to run, for example checking for new objects in S3. The end result should be the same, but the lack of a push-based mechanism is less satisfying and more complicated, in my opinion.
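A Sensor that polls S3 looks roughly like this (a sketch; the bucket name is a placeholder and `ingest_job` is a hypothetical job imported for illustration):

```python
import boto3
from dagster import RunRequest, SkipReason, sensor

from my_project.jobs import ingest_job  # hypothetical job for illustration


@sensor(job=ingest_job, minimum_interval_seconds=60)
def new_s3_objects_sensor(context):
    # Poll on an interval to simulate event-like behaviour: list objects and
    # request a run for keys we have not seen before.
    s3 = boto3.client("s3")
    response = s3.list_objects_v2(Bucket="my-landing-bucket", Prefix="incoming/")
    keys = [obj["Key"] for obj in response.get("Contents", [])]
    new_keys = [k for k in keys if k > (context.cursor or "")]
    if not new_keys:
        return SkipReason("No new objects found")
    context.update_cursor(max(new_keys))
    return RunRequest(run_key=max(new_keys))
```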
Pros | Cons |
---|---|
Data focused | Reactive triggers via Sensors instead of events |
Hybrid deployments / Bring Your Own Infrastructure | |
Opinionated (assets, resources, definitions) | |
Dev / Local environment | |
DBT Integration | |
Training | |
When to use?
Choose Dagster when you want to focus on the data, and create a view of your entire estate.
Comparison
This table reflects our assessment based on the criteria outlined above. The 🚫 indicates that we did not come to a conclusion about that specific criterion, because we abandoned the evaluation after a vital criterion was not met. In this case, Mage failed the security check due to exposing credentials in the UI.
Legend: ✅ Meets requirements | ⚠️ Partial/concerns | ❌ Does not meet requirements | 🚫 Evaluation abandoned
Criteria | Prefect | Dagster | Airflow (MWAA) | Mage |
---|---|---|---|---|
SWE Practices | ✅ | ✅ | ⚠️ | ⚠️ |
Cost | ⚠️ | ✅ | ✅ | 🚫 |
DBT Integration | ⚠️ | ✅ | ❌ | ⚠️ |
Security | ✅ | ✅ | ✅ | ❌ |
Orchestration Capabilities | ✅ | ✅ | ✅ | ✅ |
Data Quality Testing | ✅ | ✅ | ⚠️ | ✅ |
Support Tooling Unification | ✅ | ✅ | ✅ | ⚠️ |
Data Visibility and Lineage | ⚠️ | ✅ | ⚠️ | ⚠️ |
Increased Reliability | ✅ | ✅ | ⚠️ | ✅ |
Broader Trends
We noted a number of similarities between the tools, highlighting trends across the data ecosystem. They include:
- Software Engineering practices (CI/CD, testing, multiple environments, etc.) are becoming more dominant in data tooling.
- Products tend to have an OSS core with a premium tier available (Prefect, Dagster, Mage, DBT, DLT). We did notice that most of the security features (SSO and RBAC) are hidden behind paywalls, which is disappointing but understandable.
- Most included IaC for deploying their service to popular cloud providers, although some of it wasn’t always up to date or working.
- Embracing a data-first approach. Dagster is ahead of the pack here, with really powerful global lineage features out of the box, while the others either have partial features already or have them on their roadmap.
- Hybrid architecture. We take security very seriously at FreeAgent, and the rise of hybrid architectures, which let us secure our platform while offloading less sensitive components to SaaS providers, is very convenient and seemingly the best of both worlds. In the case of Prefect and Dagster, the flexibility to define ephemeral resources means we can scale each task appropriately.
Conclusion
With the plethora of tooling options available, there will be one that suits your needs. No matter which tool you choose to build your data pipelines, you should understand how you intend to use it and how it will benefit your situation.
We approached selecting a tool by first defining a list of criteria, then evaluating a number of prominent tools against them. This helped broaden our understanding of the data tooling ecosystem in 2025, and ultimately identified the tool that we believe is the best fit for our team based on this particular evaluation. In our case, for our current requirements, this was Dagster. Other teams with different priorities and contexts might make different choices, and we encourage readers to use our findings as one data point among many.
Let us know in the comments: what factors are most important in your tool selection? And what are your experiences with these tools?