Building a Data Mart with dbt, Lightdash, and Paradime

Recently, I participated in the dbt Fantasy Football Data Modeling Challenge, hosted by Paradime & Lightdash. Over 300 SQL and dbt experts tackled real NFL and fantasy football data. My approach, which placed 2nd overall, focused on building a self-service data mart for dynamic exploration of scoring trends and player performance. I used dbt for data transformation, Lightdash for visualization, and Paradime.io as the IDE for development. This analytics stack made complex fantasy football data accessible to everyone.

I’ll walk through how I structured this data mart, the challenges I faced, and how dbt and Lightdash empower self-serve analytics. Whether working with fantasy sports data or business intelligence, the same principles apply to designing an intuitive, flexible data platform.

Structuring Common Objects for Intuitive Data Exploration

In the debate of One Big Table (OBT) vs. Star Schema, Star Schema comes out ahead in most contexts. Building a semantic layer with dbt and Lightdash is no exception, but it requires planning to ensure users can intuitively navigate and pull data without friction.

To create a logical and user-friendly data model, I structured core objects—teams, players, games, and plays—in a way that made sense for end users. This meant:

Ensuring commonly joined tables (e.g., Players → Teams) were straightforward and required minimal transformations.
Maintaining consistent data granularity, so each view had a single level of detail—avoiding confusion over perceived “duplicate” rows.

Granularity is crucial but often overlooked in the end-user experience. Imagine a user querying NFL teams. They select City and Stadium from the Team table but mistakenly pull TeamName from the NFLPlayer table. The result is a dataset with one row per player instead of one row per team.

While this might seem like an easy mistake to fix for a data expert, we can’t assume the average end user will recognize the issue. Instead of expecting users to troubleshoot these problems, we should proactively design our models to prevent them.When structuring dbt models, we have two options:

Abandon the star schema and move to a one big table (OBT) approach—losing the benefits of a well-structured schema.
Enforce granularity rules, ensuring that joins only happen at appropriate levels.

In the dbt model below, we take the second approach. We prevent joins when a user selects the Teams table in Lightdash, effectively forcing users to avoid one-to-many joins from base tables. This design choice helps maintain data integrity and ensures a smoother self-service experience.

 - name: teams
    description: "A table containing information about sports teams"
	...
 - name: players
    description: Aggregates the latest player stats for each game from the source table.
    meta:
      joins:
        - join: teams
          alias: Team
          sql_on: ${players.team} = ${Team.abvr}
	...

While selecting the Teams table, the user can’t pull in any information which might create rows they don’t expect.

Selecting the Players table allows for the user to to bring in team information as this sits at a higher level of granularity.

Using paradime.io’s lineage view, which is built right into the IDE, makes the design process a bit easier. If you have used Altimate’s Power User VSCode extension this will come naturally to you. Visualizing relationships, refreshing downstream tables, and viewing the mart’s catalog all can be done directly in your IDE.

Enhancing Usability Under the Hood

URLs should be hyperlinked, team colors preloaded for plotting, and small details handled automatically for end users. Leveraging underutilized features in Lightdash and dbt makes analytics more intuitive.

To connect raw data with external resources, I embedded clickable hyperlinks to sites like Pro Football Reference, Sleeper, and ESPN. This let users cross-reference websites without manual searches.

You’re not limited to a column’s value—you can also pull information from other columns in the same row.

 - name: pfr_id
        description: Pro Football Reference identifier for the player.
        tests:
          - unique
        meta:
          dimension:
            type: string
            groups:
              - Identifiers
            urls:
              - label: Open in Pro Football Reference
                url: >-
                  https://www.pro-football-reference.com/players/${
                  row.table_name.player_name|first }/${ value.raw }.htm

Additionally, I defined team colors in the mart’s .yml file, making visualizations more engaging and consistent. Instead of manually searching for hex codes every time, dashboards automatically applied the correct team colors, streamlining the visualization process.

While you can set the theme at an organizational level in Lightdash, don’t forget you can also set colors at the column level.

 - name: abvr
        description: "The abbreviation of the team"
        meta:
          dimension:
            label: 'Team Abbreviation'
            type: string
            colors:
              DET: '#0076B6'
              NYJ: '#17B03B'
              LAC: '#1D64B6'
              HOU: '#03202F'
              TEN: '#4B92DB'
              KC: '#E31837'
 ...

Pre-Built Dashboards for Less Technical Users

Not all end users will be able to conceptualize how to pull data by tables, and fewer will be able to use Lightdash’s SQL runner. To help less technical users get started, I built out-of-the-box dashboards focusing on:

Team Performance: Key stats, trends, and scoring breakdowns for each NFL team.
Quarterback Insights: Deep dives into QB performance, highlighting passing trends and efficiency metrics.

Pinned on the homepage these dashboards act as a launchpad for deeper exploration, empowering users to gain insights without technical knowledge.

Both the team and quarterback snapshots utilize dashboard level filters. Given the joins we built above applying the filters to each visualization doesn’t require restructuring out dbt models to add fields.

For the purpose of this project, I saved each of the above charts and tables directly to the dashboard to keep the Space clean. This helps keep public spaces clean. End users can always hit “Explore from here” on any chart to derive the original object.

Data Quality & Testing for Future NFL Seasons

A well-designed data model should not only serve the current dataset but also be future-proof. Data teams must proactively implement validations and tests to ensure new data doesn’t break existing models. A strong data mart requires rigorous data validation. I implemented ~50 automated tests in dbt to maintain data accuracy and consistency across seasons. Key tests included:

Primary key validations to prevent duplicates in core tables.
Team name checks to avoid inconsistencies (e.g., “SF” vs. “San Francisco 49ers”).
Jersey number validation ensures values stay within the 0-99 range.

While numerical validation is relatively common, utilizing the accepted_values out of the box test for categorical variables is just as important.

- name: abvr
    description: "The abbreviation of the team"
    meta:
        dimension:
            label: 'Team Abbreviation'
            type: string
    tests:
          - unique
          - not_null
          - accepted_values:
              values:
                - DET
                - NYJ
                - LAC
                - HOU

Paradime’s anomaly detection wasn’t crucial for this project, but I’d recommend it for most others. It monitors key metrics like row counts, null rates, and averages to detect unusual patterns.

The process groups data into time-based buckets (e.g., daily or hourly) and analyzes a training period to set a baseline. The test then compares recent data within a detection period to this historical baseline.

If we extended this project with live game data, this feature would be essential for maintaining high data quality.

Building a Scalable, User-Friendly Data Mart

Turning raw data into a self-service data mart means building something that’s easy to use, scalable, and reliable for everyone. The approach I used for the dbt Data Modeling Challenge applies far beyond fantasy football. Whether you’re analyzing sports data or business metrics, the same principles can help you build a robust, flexible analytics platform with dbt, Lightdash, and paradime.io. These tools played a crucial role in streamlining development, improving collaboration, and ensuring data accuracy throughout the process.

At Driftwave, we help organizations unlock the potential of self-serve analytics by hosting and managing powerful BI tools like Lightdash. For teams already using dbt, we offer three months of free Lightdash hosting, making it easier to transition to a decentralized analytics model without the hassle of infrastructure management.

With Driftwave, you get a strategic partner dedicated to enabling modern, scalable analytics strategies. Our focus is on empowering small to medium-sized organizations to work independently while maintaining consistency, governance, and reliability across their data ecosystem.

If you’re ready to take control of your analytics stack, let’s talk!

Get 3 Months of Lightdash Free For teams using dbt