
Cortex Code in Practice: Maintaining dbt Projects as They Scale 

When Snowflake introduced Cortex Code, the appeal was immediate for teams already deep in the Snowflake ecosystem. With Cortex Code (CoCo), Snowflake provides an in-platform capability that can reason over SQL, schemas, dbt projects, and metadata, without moving data outside the platform. That lines up closely with how Passerelle’s analytics engineering teams already work. 

The question wasn’t whether the concept made sense. It was whether the capability would hold up inside real projects, where active development, operational constraints, and accumulated complexity are the norm. 

Instead of evaluating Cortex Code in isolation, our team put it directly into active Snowflake AI Data Cloud and dbt environments. Passerelle consultants Harris Plaisted, Alexis Richer, and Mark Pelletier used ongoing client work as the test case to show how CoCo can support enterprise-grade analytics engineering workflows. 

What stood out 

What emerged was a tool that can systematically evaluate dbt project health, helping engineers stay focused on the complex, judgment-based decisions that show up as projects scale. 

Cortex Code doesn’t behave like a general assistant that can absorb an entire repository and return a single clean answer. The most effective pattern looked more like an evaluation layer: grounded in metadata and project conventions, run repeatedly as the system changes. 

That distinction shaped how our team approached the work. 

Broad prompts led to generic results 

Our first pass followed a familiar pattern. Passerelle engineers pointed Cortex Code at an entire project and asked: 

“Look at this project and tell me what’s wrong.” 

The response was reasonable, but not especially actionable. It surfaced general issues without enough context to prioritize fixes, and the signal blended into the noise. 

Better results came from narrowing the scope instead of endlessly refining the wording. Looking at specific directories (rather than the entire repo), grounding evaluation in dbt structure and naming conventions, and rerunning checks as the project evolved all changed the outcome significantly. 

With a clear frame of reference, results shifted from broad observations to specific findings the team could act on. 

Real value showed up in dbt project health 

The most consistent value came from evaluating dbt projects that had already been in use for a while. 

In these environments, the need for cleanup was clear: multiple contributors, changing requirements, and day-to-day development tend to leave behind small inconsistencies that compound over time. 

Running Cortex Code against those projects surfaced familiar issues, but did so systematically: 

  • Models missing primary key or relationship tests 
  • Sources without freshness checks 
  • Models with incomplete or missing documentation 
  • Configuration patterns drifting across layers 

The challenge isn’t awareness; it’s tracking these issues consistently as the project grows. 
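To make those checks concrete, here is a minimal sketch of the same kind of evaluation written against dbt's `manifest.json` artifact. It is not how Cortex Code works internally; it only illustrates what "grounded in metadata" can look like. The function name `find_health_gaps` is hypothetical, and the sketch counts any attached test as coverage rather than looking specifically for primary key or relationship tests.

```python
from typing import Any


def find_health_gaps(manifest: dict[str, Any]) -> dict[str, list[str]]:
    """Return unique IDs of models and sources that miss basic health expectations.

    Expects a dict shaped like dbt's manifest.json (for example, loaded with
    json.load from target/manifest.json).
    """
    nodes = manifest.get("nodes", {})

    # Collect every node that at least one test depends on.
    tested: set[str] = set()
    for node in nodes.values():
        if node.get("resource_type") == "test":
            tested.update(node.get("depends_on", {}).get("nodes", []))

    models = {uid: n for uid, n in nodes.items() if n.get("resource_type") == "model"}

    # A source counts as covered only if a warn or error threshold is set.
    stale_sources = []
    for uid, source in manifest.get("sources", {}).items():
        freshness = source.get("freshness") or {}
        has_threshold = any(
            (freshness.get(key) or {}).get("count") is not None
            for key in ("warn_after", "error_after")
        )
        if not has_threshold:
            stale_sources.append(uid)

    return {
        "models_without_tests": sorted(uid for uid in models if uid not in tested),
        "models_without_description": sorted(
            uid for uid, n in models.items() if not (n.get("description") or "").strip()
        ),
        "sources_without_freshness": sorted(stale_sources),
    }


# Tiny hand-written sample in the manifest's shape (illustrative data only).
sample_manifest = {
    "nodes": {
        "model.demo.stg_orders": {"resource_type": "model", "description": "Staged orders."},
        "model.demo.fct_orders": {"resource_type": "model", "description": ""},
        "test.demo.unique_stg_orders_id": {
            "resource_type": "test",
            "depends_on": {"nodes": ["model.demo.stg_orders"]},
        },
    },
    "sources": {
        "source.demo.raw.orders": {"freshness": {"warn_after": {"count": 12, "period": "hour"}}},
        "source.demo.raw.customers": {"freshness": None},
    },
}

gaps = find_health_gaps(sample_manifest)
```

Because the output is a plain dictionary of identifiers, the same check can be rerun after each change and the results compared over time.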

Key findings from using Cortex Code in enterprise dbt projects 

The team used Cortex Code to separate mechanical fixes from decisions that require judgment. Low-risk changes, such as adding missing tests, filling in empty descriptions, and standardizing simple configuration gaps, could be generated and reviewed quickly. More complex work, like incremental logic, model grain, and transformation design, stayed entirely in human hands. 

That boundary made the workflow practical: automated suggestions improved consistency, while engineers retained control over anything that changed how the business is modeled. 
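One way to picture that boundary is as an explicit allow-list. The sketch below is an illustration only: the `Finding` type, the finding kinds, and the `triage` function are hypothetical names, not part of Cortex Code or dbt. The design choice it shows is that anything not explicitly marked low-risk defaults to engineer review.

```python
from dataclasses import dataclass

# Hypothetical finding kinds; the low-risk set mirrors the examples in the text
# (missing tests, empty descriptions, simple configuration gaps).
LOW_RISK_KINDS = {"missing_test", "empty_description", "config_gap"}


@dataclass(frozen=True)
class Finding:
    model: str
    kind: str
    detail: str


def triage(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Split findings into (reviewable auto-fix candidates, engineer-owned decisions)."""
    mechanical: list[Finding] = []
    judgment: list[Finding] = []
    for finding in findings:
        # Anything not explicitly allow-listed defaults to human review.
        if finding.kind in LOW_RISK_KINDS:
            mechanical.append(finding)
        else:
            judgment.append(finding)
    return mechanical, judgment


findings = [
    Finding("stg_orders", "missing_test", "no unique test on order_id"),
    Finding("fct_orders", "incremental_logic", "merge key may not match grain"),
    Finding("dim_customers", "empty_description", "model has no description"),
]
auto_fix_candidates, needs_engineer = triage(findings)
```

Even the auto-fix candidates are only candidates here: they are meant to be generated, reviewed, and then applied, never applied blindly.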

Internally, the team started referring to this pattern as a “dbt project health agent.” In practice, that doesn’t mean an autonomous developer; it means a repeatable evaluation workflow that checks a project against known expectations: tests, documentation, freshness, configuration consistency, and convention drift. 
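Convention drift is the easiest of those expectations to express as a rule. The sketch below assumes a common dbt layout in which the directory name implies a model-name prefix (`stg_`, `int_`, `fct_`/`dim_`). That mapping is an assumption for illustration, not a convention stated by the team, and `find_naming_drift` is a hypothetical helper.

```python
from pathlib import PurePosixPath

# Assumed convention (not from the article): directory name -> allowed name prefixes.
LAYER_PREFIXES = {
    "staging": ("stg_",),
    "intermediate": ("int_",),
    "marts": ("fct_", "dim_"),
}


def find_naming_drift(model_paths: list[str]) -> list[str]:
    """Return model file paths whose file name does not match its layer's prefix."""
    drifted = []
    for raw_path in model_paths:
        path = PurePosixPath(raw_path)
        # Use the first path component that names a known layer, if any.
        layer = next((part for part in path.parts if part in LAYER_PREFIXES), None)
        if layer is None:
            continue  # Outside the layers this check knows about.
        if not path.stem.startswith(LAYER_PREFIXES[layer]):
            drifted.append(raw_path)
    return drifted


drift = find_naming_drift(
    [
        "models/staging/stg_orders.sql",
        "models/staging/orders_clean.sql",
        "models/marts/fct_orders.sql",
        "models/marts/finance/revenue.sql",
        "macros/util.sql",
    ]
)
```

A project with different conventions would only need a different `LAYER_PREFIXES` mapping; the point is that the expectation is written down once and checked the same way every time.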

The goal is to make issues visible and actionable before small inconsistencies become expensive cleanup work. Cortex Code didn’t design new models or replace engineering judgment, but it helped teams maintain project integrity as changes accumulated. 

Developer insight: “Cortex Code works best when you turn repeatable checks into a guided workflow—using skills (and artifacts like dbt project evaluator outputs) to consistently steer model work instead of relying on one-off prompts.” — Harris Plaisted 

A second-order effect showed up in how often these checks ran. Manual reviews tend to happen periodically, usually triggered by a problem. With the effort reduced, health checks became something the team could run far more frequently, tightening the feedback loop. 

CLI vs. Snowsight: same capability, different workflows 

The interface shaped outcomes more than we expected. 

We ended up using both the CLI and Snowsight, but in different situations. To make that explicit, we mapped common workflows to each interface.

| dbt scenario | CLI | CoCo (Snowsight) | Notes |
| --- | --- | --- | --- |
| Active dbt development & refactoring | Best Fit | Limited | CLI handles full repo traversal and fast iteration |
| dbt project health & convention checks | Strong | Strong | Both effective when scoped to metadata |
| Auto-fixing safe dbt issues | Preferred | Guarded | CLI easier for batch fixes; UI adds safety |
| Incremental strategies & grain logic | Assist Only | Assist Only | Human judgment required |
| Large dbt repos / deep traversal | Native Strength | UI Friction | Snowsight struggles with heavy traversal |
| Governed client environments | Often Restricted | Best Fit | Snowsight aligns with role-based governance |
| Stakeholder reviews & demos | Not Ideal | Best Fit | UI matters more than speed |

The distinction is less about capability and more about fit. 

For active development and refactoring, the CLI worked better. Full repository traversal, faster iteration, and fewer constraints mattered, especially in larger projects. Snowsight could support that work, but friction showed up quickly. 

For project health checks and convention validation, both interfaces performed well when scoped correctly. Once evaluation was grounded in metadata and clear expectations, results were consistent across both. 

Auto-fixing introduced a tradeoff. The CLI made batch fixes easier to generate and apply, while Snowsight added a layer of visibility and control that worked better in environments where changes needed review. 

Some areas didn’t change regardless of interface. Incremental strategies, model grain, and transformation logic still required human judgment. 

At scale, the differences became more pronounced. Large repositories highlighted the CLI’s strength in deep traversal, while Snowsight introduced friction under heavier workloads. In governed environments, that tradeoff often reversed; Snowsight aligned more naturally with role-based access and review patterns. 

For stakeholder-facing work, the split was clear: the CLI isn’t a good fit for demos or collaborative review, while Snowsight makes those interactions easier. 

Most teams will end up using both, depending on context. 

Engineering judgment still drives the work 

Cortex Code works best as a first pass, not a final answer. Generated changes were reviewed, and evaluations were rerun after fixes. Nothing moved forward without an engineer validating the outcome. 

Developer insight: “The real unlock is safe, in-platform learning. People can explore and generate the ‘how’ without moving data into external AI tools, then hand off anything risky to the data team.” — Mark Pelletier 

This approach keeps the division of responsibility clear. Cortex Code can identify patterns and inconsistencies quickly; engineers still own how data is modeled and how those models reflect the business. 

Used this way, the tool removes repetitive work without introducing unnecessary risk. 

Usage patterns shaped cost and value 

Because Cortex Code runs on a consumption model, usage patterns directly influence both cost and value. 

Early experiments that scanned entire environments or passed overly broad context produced more output, but not better outcomes. More targeted runs, focused on specific parts of a project and repeated over time, produced more useful results and more predictable usage. 

This aligns with how teams already manage compute in Snowflake: scoped workloads and repeatable processes tend to outperform large, one-off runs. Suggested workflows include: 

  • Run checks by directory or layer, not against the whole repo. 
  • Save useful prompts or workflows as repeatable checks. 
  • Rerun after fixes to validate whether drift was reduced. 
  • Treat broad scans as discovery only, not as the normal operating pattern. 
  • Track whether repeated checks reduce recurring review burden. 
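The scoping and rerun steps in that list can be sketched in a few lines. This is a hedged illustration, not a Cortex Code feature: `count_by_layer` and `compare_runs` are hypothetical helpers, and findings are assumed to be identified by model file path.

```python
from collections import Counter
from pathlib import PurePosixPath


def count_by_layer(finding_paths: list[str], root: str = "models") -> Counter:
    """Count findings per first-level directory under the models root."""
    counts: Counter = Counter()
    for raw_path in finding_paths:
        parts = PurePosixPath(raw_path).parts
        if len(parts) > 2 and parts[0] == root:
            layer = parts[1]
        else:
            layer = "(unscoped)"  # Not inside a layer directory.
        counts[layer] += 1
    return counts


def compare_runs(before: set[str], after: set[str]) -> dict[str, list[str]]:
    """Compare two runs of the same check, keyed by finding identifier."""
    return {
        "resolved": sorted(before - after),
        "remaining": sorted(before & after),
        "new": sorted(after - before),
    }


first_run = {
    "models/staging/stg_orders.sql",
    "models/staging/stg_customers.sql",
    "models/marts/fct_orders.sql",
}
second_run = {
    "models/marts/fct_orders.sql",
    "models/marts/dim_products.sql",
}

per_layer = count_by_layer(sorted(first_run))
delta = compare_runs(first_run, second_run)
```

Tracking the `resolved`, `remaining`, and `new` buckets across runs gives a simple answer to whether drift is actually shrinking, and the per-layer counts show where the next scoped run should focus.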

Teams that treat project health evaluation as an ongoing workload, not an occasional exercise, tend to get more consistent value. 

dbt project quality became more concrete 

One of the more useful changes had nothing to do with automation. 

A consistent evaluation layer changed how the team talked about project quality. 

Instead of relying on general impressions, discussions started with specific observations, such as which models lack tests, where documentation is incomplete, and which configurations have drifted. That level of specificity made it easier to prioritize work and to explain technical debt in a way that stakeholders could understand. 

What we’re watching next 

The open question is not whether Cortex Code can generate SQL or summarize a project. Those capabilities are already expected. 

The more interesting question is whether we can productionize an AI-supported engineering layer within Snowflake. Continuous evaluation of project health, earlier detection of risk, and tighter feedback loops between development and maintenance all become possible if the workflow holds. 

Developer insight: “CoCo provides a fast way to prototype and compare data-quality checks inside Snowflake. It can speed up investigation and turn findings into repeatable steps, without needing to move context into an external AI tool.” Alexis Richer 

There are still boundaries to define: how much automation makes sense, where human judgment should always apply, and how teams integrate this into daily development practices. 

Cortex Code does not change what good analytics engineering looks like; it makes it easier to maintain those standards as projects grow, teams expand, and small inconsistencies accumulate. The value is not in a single run. The value shows up when project health becomes something that can be checked, improved, and revisited continuously.
