When Snowflake introduced Cortex Code (CoCo), the appeal was immediate for teams already deep in the Snowflake ecosystem. It provides an in-platform capability that can reason over SQL, schemas, dbt projects, and metadata without moving data outside the platform. That lines up closely with how Passerelle’s analytics engineering teams already work.
The question wasn’t whether the concept made sense. It was whether the capability would hold up inside real projects, where active development, operational constraints, and accumulated complexity are the norm.
Instead of evaluating Cortex Code in isolation, our team put it directly into active Snowflake AI Data Cloud and dbt environments. Passerelle consultants Harris Plaisted, Alexis Richer, and Mark Pelletier used ongoing client work as the test case to show how CoCo can support enterprise-grade analytics engineering workflows.
What stood out
What emerged was a tool that can systematically evaluate dbt project health, helping engineers stay focused on the complex, judgment-based decisions that show up as projects scale.
Cortex Code doesn’t behave like a general assistant that can absorb an entire repository and return a single clean answer. The most effective pattern looked more like an evaluation layer: grounded in metadata and project conventions, run repeatedly as the system changes.
That distinction shaped how our team approached the work.
Broad prompts led to generic results
Our first pass followed a familiar pattern. Passerelle engineers pointed Cortex Code at an entire project and asked:
“Look at this project and tell me what’s wrong.”
The response was reasonable, but not especially actionable. It surfaced general issues without enough context to prioritize fixes, and the signal blended into the noise.
Better results came from narrowing the scope instead of endlessly refining the wording. Looking at specific directories (rather than the entire repo), grounding evaluation in dbt structure and naming conventions, and rerunning checks as the project evolved all changed the outcome significantly.
With a clear frame of reference, results shifted from broad observations to specific findings the team could act on.
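To make the idea of a scoped, convention-grounded check concrete, here is a minimal Python sketch. It is not how Cortex Code works internally; it only illustrates checking one directory against naming rules. The layer names and the `stg_`, `int_`, `fct_`, and `dim_` prefixes are assumed conventions, and `check_naming` is a hypothetical helper.

```python
import re
from pathlib import PurePosixPath

# Hypothetical conventions: layer directory -> required model-name pattern.
# These prefixes are common in dbt projects but are an assumption here.
CONVENTIONS = {
    "staging": re.compile(r"^stg_[a-z0-9_]+$"),
    "intermediate": re.compile(r"^int_[a-z0-9_]+$"),
    "marts": re.compile(r"^(fct|dim)_[a-z0-9_]+$"),
}


def check_naming(model_paths, scope):
    """Return paths under `scope` whose file name breaks its layer's convention."""
    scope_path = PurePosixPath(scope)
    violations = []
    for raw in model_paths:
        path = PurePosixPath(raw)
        # Only evaluate SQL files that live somewhere inside the scoped directory.
        if scope_path not in path.parents or path.suffix != ".sql":
            continue
        # The first path segment that names a known layer decides the rule.
        layer = next((part for part in path.parts if part in CONVENTIONS), None)
        if layer is None:
            continue
        if not CONVENTIONS[layer].match(path.stem):
            violations.append(raw)
    return violations
```

Calling `check_naming(paths, "models/staging")` reports only staging models, which mirrors the narrowing described above: the same rules applied to a smaller scope produce a shorter, more actionable list.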
Real value showed up in dbt project health
The most consistent value came from evaluating dbt projects that had already been in use for a while.
In these environments, the need for cleanup was clear: multiple contributors, changing requirements, and day-to-day development tend to leave behind small inconsistencies that compound over time.
Running Cortex Code against those projects surfaced familiar issues, such as missing tests, empty descriptions, and configuration gaps, but did so systematically.
The challenge isn’t awareness; it’s tracking these issues consistently as the project grows.
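As an illustration of what a systematic check can look like, the sketch below scans a dictionary shaped loosely like dbt's `manifest.json` and lists models that have no tests or no description. The field names (`nodes`, `resource_type`, `description`, `depends_on`) are assumed to match your dbt version and should be verified; `find_health_issues` is a hypothetical helper, not part of dbt or Cortex Code.

```python
def find_health_issues(manifest):
    """List (model_name, issue) pairs for models lacking tests or a description."""
    nodes = manifest.get("nodes", {})

    # Collect the unique IDs of every node that at least one test depends on.
    tested = set()
    for node in nodes.values():
        if node.get("resource_type") == "test":
            tested.update(node.get("depends_on", {}).get("nodes", []))

    issues = []
    # Sort by unique ID so repeated runs produce a stable, comparable report.
    for unique_id, node in sorted(nodes.items()):
        if node.get("resource_type") != "model":
            continue
        if unique_id not in tested:
            issues.append((node["name"], "missing_tests"))
        if not (node.get("description") or "").strip():
            issues.append((node["name"], "missing_description"))
    return issues
```

Because the output is deterministic, the same check can be rerun after each change and compared against the previous run, which is the tracking problem described above.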
Key findings from using Cortex Code in enterprise dbt projects
The team used Cortex Code to separate mechanical fixes from decisions that require judgment. Low-risk changes, such as adding missing tests, filling in empty descriptions, and standardizing simple configuration gaps, could be generated and reviewed quickly. More complex work, like incremental logic, model grain, and transformation design, stayed entirely in human hands.
That boundary made the workflow practical: automated suggestions improved consistency, while engineers retained control over anything that changed how the business is modeled.
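The boundary between mechanical fixes and judgment calls can be written down explicitly. The sketch below is one hypothetical way to do that in Python; the issue labels and the `triage` function are illustrative assumptions, not anything Cortex Code exposes.

```python
# Hypothetical labels for low-risk, mechanical issues. Everything else,
# including incremental logic, model grain, and transformation design,
# is routed to an engineer by default.
MECHANICAL = {"missing_tests", "missing_description", "config_gap"}


def triage(findings):
    """Split (model, issue) pairs into auto-fix candidates and human-only work."""
    buckets = {"auto_fix_candidates": [], "needs_engineer": []}
    for model, issue in findings:
        # Unknown issue types fall through to human review, never to auto-fix.
        key = "auto_fix_candidates" if issue in MECHANICAL else "needs_engineer"
        buckets[key].append((model, issue))
    return buckets
```

Defaulting unknown issue types to `needs_engineer` keeps the safe side as the fallback: a new kind of finding has to be deliberately added to the mechanical set before any fix is generated for it, and even then the generated change is still reviewed.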
Internally, the team started referring to this pattern as a “dbt project health agent.” In practice, that doesn’t mean an autonomous developer; it means a repeatable evaluation workflow that checks a project against known expectations: tests, documentation, freshness, configuration consistency, and convention drift.
The goal is to make issues visible and actionable before small inconsistencies become expensive cleanup work. Cortex Code didn’t design new models or replace engineering judgment, but it helped teams maintain project integrity as changes accumulated.
Developer insight: “Cortex Code works best when you turn repeatable checks into a guided workflow—using skills (and artifacts like dbt project evaluator outputs) to consistently steer model work instead of relying on one-off prompts.” — Harris Plaisted
A second-order effect showed up in how often these checks ran. Manual reviews tend to happen periodically, usually triggered by a problem. With the effort reduced, health checks became something the team could run far more frequently, tightening the feedback loop.
CLI vs. Snowsight: same capability, different workflows
The interface shaped outcomes more than we expected.
We ended up using both the CLI and Snowsight, but in different situations. To make that explicit, we mapped common workflows to each interface.
| dbt scenario | CLI | Snowsight | Notes |
|---|---|---|---|
| Active dbt development & refactoring | Best fit | Limited | CLI handles full repo traversal and fast iteration |
| dbt project health & convention checks | Strong | Strong | Both effective when scoped to metadata |
| Auto-fixing safe dbt issues | Preferred | Guarded | CLI easier for batch fixes; UI adds safety |
| Incremental strategies & grain logic | Assist only | Assist only | Human judgment required |
| Large dbt repos / deep traversal | Native strength | UI friction | Snowsight struggles with heavy traversal |
| Governed client environments | Often restricted | Best fit | Snowsight aligns with role-based governance |
| Stakeholder reviews & demos | Not ideal | Best fit | UI matters more than speed |
The distinction is less about capability and more about fit.
For active development and refactoring, the CLI worked better. Full repository traversal, faster iteration, and fewer constraints mattered, especially in larger projects. Snowsight could support that work, but friction showed up quickly.
For project health checks and convention validation, both interfaces performed well when scoped correctly. Once evaluation was grounded in metadata and clear expectations, results were consistent across both.
Auto-fixing introduced a tradeoff. The CLI made batch fixes easier to generate and apply, while Snowsight added a layer of visibility and control that worked better in environments where changes needed review.
Some areas didn’t change regardless of interface. Incremental strategies, model grain, and transformation logic still required human judgment.
At scale, the differences became more pronounced. Large repositories highlighted the CLI’s strength in deep traversal, while Snowsight introduced friction under heavier workloads. In governed environments, that tradeoff often reversed; Snowsight aligned more naturally with role-based access and review patterns.
For stakeholder-facing work, the split was clear: the CLI isn’t a good fit for demos or collaborative review, while Snowsight makes those interactions easier.
Most teams will end up using both, depending on context.
Engineering judgment still drives the work
Cortex Code works best as a first pass, not a final answer. Generated changes were reviewed, and evaluations were rerun after fixes. Nothing moved forward without an engineer validating the outcome.
Developer insight: “The real unlock is safe, in-platform learning. People can explore and generate the ‘how’ without moving data into external AI tools, then hand off anything risky to the data team.” — Mark Pelletier
This approach keeps the division of responsibility clear. Cortex Code can identify patterns and inconsistencies quickly; engineers still own how data is modeled and how those models reflect the business.
Used this way, the tool removes repetitive work without introducing unnecessary risk.
Usage patterns shaped cost and value
Because Cortex Code runs on a consumption model, usage patterns directly influence both cost and value.
Early experiments that scanned entire environments or passed overly broad context produced more output, but not better outcomes. More targeted runs, focused on specific parts of a project and repeated over time, produced more useful results and more predictable usage.
This aligns with how teams already manage compute in Snowflake: scoped workloads and repeatable processes tend to outperform large, one-off runs. Suggested workflows include:
Teams that treat project health evaluation as an ongoing workload, not an occasional exercise, tend to get more consistent value.
dbt project quality became more concrete
One of the more useful changes had nothing to do with automation.
A consistent evaluation layer changed how the team talked about project quality.
Instead of relying on general impressions, discussions started with specific observations, such as which models lack tests, where documentation is incomplete, and which configurations have drifted. That level of specificity made it easier to prioritize work and to explain technical debt in a way that stakeholders could understand.
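One way to turn those observations into numbers a stakeholder can follow is a simple coverage summary. The sketch below is illustrative only: the record fields (`name`, `test_count`, `description`) and the `coverage_summary` function are assumptions, and in practice the inputs would come from whatever evaluation output the team already produces.

```python
def coverage_summary(models):
    """Summarize test and documentation coverage for a list of model records."""
    total = len(models)
    untested = sorted(m["name"] for m in models if m["test_count"] == 0)
    undocumented = sorted(m["name"] for m in models if not m["description"].strip())

    def pct(covered):
        # Report 0.0 for an empty project instead of dividing by zero.
        return round(100 * covered / total, 1) if total else 0.0

    return {
        "models": total,
        "tested_pct": pct(total - len(untested)),
        "documented_pct": pct(total - len(undocumented)),
        "untested": untested,
        "undocumented": undocumented,
    }
```

Tracking these percentages across runs gives a plain way to show whether technical debt is shrinking or growing, while the name lists point to where the work is.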
What we’re watching next
The open question is not whether Cortex Code can generate SQL or summarize a project. Those capabilities are already expected.
The more interesting question is whether we can productionize an AI-supported engineering layer within Snowflake. Continuous evaluation of project health, earlier detection of risk, and tighter feedback loops between development and maintenance all become possible if the workflow holds.
Developer insight: “CoCo provides a fast way to prototype and compare data-quality checks inside Snowflake. It can speed up investigation and turn findings into repeatable steps, without needing to move context into an external AI tool.” — Alexis Richer
There are still boundaries to define: how much automation makes sense, where human judgment should always apply, and how teams integrate this into daily development practices.
Cortex Code does not change what good analytics engineering looks like; it makes it easier to maintain those standards as projects grow, teams expand, and small inconsistencies accumulate. The value is not in a single run. The value shows up when project health becomes something that can be checked, improved, and revisited continuously.