Column-Level Lineage

Trace column-level lineage and search your full data catalog directly from DinoAI. Understand upstream and downstream dependencies and find related assets across your data stack.

The Column-Level Lineage Tool allows DinoAI to trace how individual columns flow through your data pipeline and search across your full data catalog, giving you precise visibility into data dependencies directly from the Code IDE.

This tool lets DinoAI answer questions about where a column originates, what it feeds into downstream, and which assets in your warehouse relate to the work you're building.

Capabilities

The Column-Level Lineage Tool exposes two underlying operations:

get_column_level_lineage fetches the full lineage graph for a specific model and column combination. It returns a JSON structure of nodes and edges, each carrying the file path, model name, column name, and upstream or downstream direction. Requests time out after 60 seconds.
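As a rough illustration of working with a nodes-and-edges result, the sketch below parses a sample payload and groups columns by direction. The exact schema is not documented here, so every field name (`nodes`, `edges`, `id`, `file_path`, `model`, `column`, `direction`, `source`, `target`) and the sample values are assumptions made for illustration only.

```python
import json
from collections import defaultdict

# Hypothetical payload: all keys and values are illustrative assumptions,
# not the tool's documented response schema.
SAMPLE_RESPONSE = """
{
  "nodes": [
    {"id": "n1", "file_path": "models/staging/stg_orders.sql",
     "model": "stg_orders", "column": "amount", "direction": "upstream"},
    {"id": "n2", "file_path": "models/marts/fct_orders.sql",
     "model": "fct_orders", "column": "revenue"},
    {"id": "n3", "file_path": "models/marts/rpt_revenue.sql",
     "model": "rpt_revenue", "column": "total_revenue", "direction": "downstream"}
  ],
  "edges": [
    {"source": "n1", "target": "n2"},
    {"source": "n2", "target": "n3"}
  ]
}
"""


def group_by_direction(payload: str) -> dict[str, list[str]]:
    """Group 'model.column' labels by the direction recorded on each node.

    Nodes without a direction (assumed here to be the requested column
    itself) are grouped under the key 'root'.
    """
    graph = json.loads(payload)
    grouped = defaultdict(list)
    for node in graph["nodes"]:
        label = f'{node["model"]}.{node["column"]}'
        grouped[node.get("direction", "root")].append(label)
    return dict(grouped)


groups = group_by_direction(SAMPLE_RESPONSE)
print(groups)
```

In practice DinoAI interprets the graph for you; a helper like this is only useful if you want to reason about the structure yourself.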

search_catalog searches the data catalog across all dbt assets (models, sources, tests, and macros) as well as third-party integrations including Looker, Tableau, and Fivetran. It supports a text query, a result limit between 1 and 10 (default 3), and offset-based pagination. Results include asset metadata such as descriptions, tags, relationships, and lineage. Catalog data is sourced from the latest manifest.json and catalog.json; third-party data refreshes daily.
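The limit and pagination rules above can be modeled with a small sketch. The class name `CatalogSearchRequest`, its field names, and the `next_page` helper are hypothetical; only the 1 to 10 limit range, the default of 3, and the use of offsets come from the description above. Rejecting empty queries and negative offsets is an added assumption.

```python
from dataclasses import dataclass

# Bounds taken from the description: limit between 1 and 10, default 3.
MIN_LIMIT, MAX_LIMIT, DEFAULT_LIMIT = 1, 10, 3


@dataclass(frozen=True)
class CatalogSearchRequest:
    """Hypothetical container for search_catalog parameters."""

    query: str
    limit: int = DEFAULT_LIMIT
    offset: int = 0

    def __post_init__(self) -> None:
        # Assumption: an empty query and a negative offset are invalid.
        if not self.query.strip():
            raise ValueError("query must not be empty")
        if not MIN_LIMIT <= self.limit <= MAX_LIMIT:
            raise ValueError(f"limit must be between {MIN_LIMIT} and {MAX_LIMIT}")
        if self.offset < 0:
            raise ValueError("offset must be non-negative")

    def next_page(self) -> "CatalogSearchRequest":
        """Return a request for the following page by advancing the offset."""
        return CatalogSearchRequest(self.query, self.limit, self.offset + self.limit)


first = CatalogSearchRequest("orders revenue")
second = first.next_page()
print(first, second)
```

With the default limit of 3, successive pages start at offsets 0, 3, 6, and so on, which is why asking DinoAI for "more results" on a broad query is often worthwhile.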

Using the Column-Level Lineage Tool

  1. Open DinoAI in the right panel of the Code IDE

  2. Provide the model name and column you want to trace, or describe the asset you want to find

  3. Add your prompt describing what you want DinoAI to do with the lineage or catalog results

  4. Review DinoAI's findings and apply them to your development work

Example Use Cases

Tracing a Column's Upstream Sources

Prompt

Get the column-level lineage for the revenue column in the fct_orders model.

Result: DinoAI fetches the full lineage graph for that column, identifying every upstream model and source that contributes to it so you can understand data origin and spot potential issues.

Understanding Downstream Impact

Prompt

Result: DinoAI maps the downstream nodes in the lineage graph, surfacing every model and column that would be affected by a change to that field.
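Conceptually, finding downstream impact is a graph traversal from the changed column. The sketch below uses a breadth-first search over a made-up edge list; the model and column names and the `downstream_of` helper are hypothetical, and this is not a description of how DinoAI implements the lookup.

```python
from collections import deque

# Hypothetical lineage edges as (source, target) pairs of "model.column" labels.
EDGES = [
    ("stg_orders.amount", "fct_orders.revenue"),
    ("fct_orders.revenue", "rpt_revenue.total_revenue"),
    ("fct_orders.revenue", "dim_customers.lifetime_value"),
    ("dim_customers.lifetime_value", "rpt_customers.ltv_bucket"),
]


def downstream_of(start: str, edges: list[tuple[str, str]]) -> list[str]:
    """Return every column reachable from `start`, in breadth-first order."""
    children: dict[str, list[str]] = {}
    for source, target in edges:
        children.setdefault(source, []).append(target)

    seen = {start}
    order: list[str] = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:  # guard against revisiting shared descendants
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


affected = downstream_of("fct_orders.revenue", EDGES)
print(affected)
```

Reversing each edge pair before calling the same function would give the upstream view instead.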

Searching the Data Catalog

Prompt

Result: DinoAI queries the catalog across dbt models, sources, and third-party integrations, returning matching assets with their descriptions, tags, and lineage relationships.

Working with Other Tools

The Column-Level Lineage Tool works well alongside DinoAI's other capabilities to support your full development workflow:

  • Combine with the SQL Execution Tool to validate query logic against columns you've traced through lineage

  • Combine with the Google Docs Tool or Notion Tool to cross-reference lineage findings against your data modeling specs

  • Use alongside Git Lite to commit and push lineage-informed changes before opening a pull request

Best Practices

Be specific with model and column names: Provide the exact model name and column name to ensure DinoAI retrieves the correct lineage graph.

Use catalog search to explore before you build: Run a catalog search before creating new models to check whether a similar asset already exists in your workspace.

Account for refresh cadence: Lineage data reflects the latest manifest.json and catalog.json; third-party integration data refreshes daily, so very recent changes may not yet appear.

Verify complex graphs: For models with many upstream dependencies, review DinoAI's lineage summary carefully before making structural changes.
