<< goback()

Snowflake is proving AI for data needs more than coding agents

Kostas Pardalis

Kostas Pardalis

Co-Founder
Snowflake is proving AI for data needs more than coding agents

This year’s Snowflake Summit was different.

First, we didn’t have any drama driven by acquisitions between Snowflake and Databricks, which is a good thing if you ask me. The focus was much more on product announcements, and what made it interesting was the huge focus on AI.

Second, the expo this year was quite an experience. It felt very familiar but also very different in a surreal way.

Familiar because the majority of the vendors there have been around Snowflake for a long time, so you would see many familiar logos.

Surreal because every one of them was rebranding into an AI company that sells some kind of infrastructure for agents to work with your data.

If you didn’t know what these companies were doing before, you would assume they were all competing with each other, although one might be selling a catalog and the other cloud-hosted pipelines.

It is very interesting to see the effect AI has on the industry.

But anyway, the most interesting part was not the expo. It was what Snowflake announced and what those announcements tell us about the future of AI for data.

The biggest takeaway for me was this:

To make AI effective on data, we need different infrastructure, workflows and validation loops than what coding agents are designed for.

Snowflake is moving in this direction and proves it by providing a lot of what is needed for retrieval and context. But this is not enough.

There is more needed in the retrieval itself, and one big piece is still not first-class: validation.

Without validation, we won’t be able to generate the same success that we see with AI in other engineering disciplines.

This is the problem Typedef is built around.

Typedef gives data teams the infrastructure to let agents work across fragmented enterprise context, reason through complex data tasks, validate their claims against governed evidence and leave behind an auditable trail that humans and agents can inspect.

Validation is a big part of it, but the broader goal is making AI on data reliable enough to use in real workflows.

Snowflake is moving from AI features to an agentic operating layer

Based on the AI-related announcements, Snowflake seems to be moving from “AI features inside the data platform” toward “an agentic operating layer on top of the data platform.”

This is important because it explains why the investment in the core platform is not irrelevant to the AI strategy. It is the complete opposite.

For agents to operate on top of the data platform, the platform has to transform into a system that is built with agents as one of the primary personas interacting with it.

This is exactly what is happening here.

Snowflake is not just saying “use LLMs on your data.” They are moving toward a world where agents can reason across governed context, invoke tools, build assets, trigger workflows and interact with the platform through different surfaces.

The announcements make more sense when you look at them through that lens.

The first important category is agent surfaces.

Snowflake announced CoCo and CoWork, which are basically two different agent experiences for two different personas. CoCo is the agent for builders and data teams. CoWork is the agent for knowledge workers and business users.

This matters because data work is not one generic workflow.

A business user, a data engineer and a data scientist might interact with the same underlying assets, but they operate in very different worlds. The agent has to change based on who is using it and what they are trying to do.

The second category, and probably the most important one, is context.

Snowflake announced Cortex Sense and Horizon Context, and this is where the strategy becomes more interesting.

Snowflake is trying to make business definitions, semantic meaning, lineage, query history, dashboards, operational knowledge and agent trajectories available to agents.

This is a very important signal.

Snowflake appears to believe that the way to make agents useful on enterprise data is not just by giving them a better model. It is by giving them better governed context.

The third category is governance and security.

If agents are going to act on enterprise data, then access, identity, policy, lineage and auditability become core infrastructure.

This is why announcements like Agent Identity, Horizon Catalog improvements and AI security controls matter.

The moment agents move from answering questions to taking actions, governance becomes much more than a checkbox. It becomes part of the runtime.

The fourth category is platform foundation.

Snowflake also announced a lot of improvements around streaming, Iceberg, zero-copy integrations, Dynamic Tables, dbt Projects, Openflow and Snowpark.

These might look like regular data platform announcements, but they are connected to the AI strategy.

Agents are only useful if they can operate on fresh, governed and connected data. So the platform has to become more real time, more open, more connected and easier to build on.

The fifth category is model access.

Snowflake will keep giving customers access to models from OpenAI, Anthropic and open-weight providers, and they will continue investing in Cortex Training and Snowflake ML.

But I don’t think this is the real moat.

The more interesting part is not that Snowflake can expose models.

The more interesting part is how those models are grounded in enterprise context, connected to workflows and governed inside the data platform.

There is an obvious new narrative for Snowflake, and it is all about agentic AI.

This is probably not surprising anyone. But what is important here is that there is also a strong direction toward platform continuity.

The AI side has the most new branding and positioning with things like CoCo, CoWork, Cortex Sense and Horizon Context.

The data platform side is mostly about making the existing Snowflake foundation more real time, more open, more governed, more connected and easier to develop on.

These two things are connected.

The platform has to change because the primary interaction model is changing.

Coding agents are not enough for data

The most obvious emerging theme is that agentic AI is becoming the main product narrative.

Instead of Snowflake saying “use LLMs on your data,” now we have agents positioned as the main interaction model with the platform.

This is very interesting if you consider the history of Snowflake.

One of the reasons they became what they are today was their obsession with user experience. They took a highly technical product category and made it approachable to a much broader audience.

Their SaaS and UI/UX obsession was a big part of that strategy, and now they have to reinvent themselves around a different interaction model.

It will be interesting to see how they execute on that.

The announcement of CoCo and CoWork is also interesting because it follows a broader industry pattern: agent specialization based on the user persona.

Different users and different use cases require different agents too.

The way this happens is by creating a different harness.

The models might remain the same, but the emerging agent changes based on who is interacting with it and for what reason.

This is important because it signals where things are going as models get even more commoditized and what it means to build products and software in this new world of agents.

It also makes something very clear:

Coding agents are not the right tool for AI in data by themselves.

A coding agent can be very effective when the task is contained inside a software engineering workflow. But data work is different.

It involves business definitions, metric meaning, lineage, freshness, governance, access controls, external systems, historical context and a lot of ambiguity.

A business user, a data engineer and a data scientist operate in completely different worlds, although they interact with the same underlying assets.

That requires a different architecture.

New harnesses, new workflows, new retrieval systems, new tools and most importantly, new validation loops.

Context is king

This is probably my favorite part.

Snowflake announced Cortex Sense, Horizon Context, improvements in the semantic layer, and spent a lot of time talking about lineage, query history, business definitions and agent trajectories.

Snowflake doesn’t want to compete on models, and probably not even on inference, although they will keep doing that as long as capacity is a hot commodity and they have the infrastructure.

Snowflake appears to believe that the way for them to win in enterprise AI is to take control over the context layer and provide governance over it.

Horizon Context is their attempt to make the catalog a semantic operating layer instead of just a metadata inventory.

They want to add another layer on top of the metadata that already exists, one that enriches the catalog with business meaning, definitions, relationships, authoritative metrics and more.

To do that, the catalog becomes capable of collecting, enriching and serving context.

Why is this important?

Because by centralizing the context, it can be governed.

And this is really important to the success of AI in enterprise data.

Cortex Sense seems to be more agent-runtime oriented.

It is a bit confusing from their content to understand how it differs from Horizon Context, but Sense seems to be more about retrieval and synthesis while the agent is running.

Another way to think about this is that Horizon Context is long-term context management, while Cortex Sense is what is exposed to the agentic harness so agents can retrieve the right context for each task.

What I find really interesting here is the numbers they shared.

CoCo and CoWork reached 83% accuracy on complex enterprise queries when using Cortex Sense, compared to 47% when not using it and 23% for frontier coding agents using Snowflake MCP.

This is a very strong signal that we need a different approach if we want to bring AI to data.

It is also a very strong signal that coding agents are not the right tool for this by themselves.

Enterprise context as the AI moat

The biggest pattern that emerges is that Snowflake is trying to make enterprise context its AI moat.

A lot of vendors can plug in OpenAI, Anthropic or open-weight models.

Snowflake is trying to position itself as:

We already hold your governed data, metadata, lineage, security rules, semantic definitions, query history, pipelines and business context. So our agents can act with enterprise-grade context and controls. They will perform better inside a governance environment that you already trust.

This is interesting because it seems that Snowflake believes solving AI for data is not six months away and another major release from a frontier lab.

And I agree with that.

The bottleneck is not just the model.

The bottleneck is whether the agent can understand the environment it operates in.

For enterprise data, that environment is messy.

Context is extremely fragmented.

Snowflake can bring things together in one place as long as everything is within their platform. This is good for them, but not very realistic as the whole answer.

Even if you migrate all your data platform into their environment, you still have context that exists in external systems.

That context can live in BI tools, notebooks, Slack, docs, catalogs, CRM systems, pipelines, tickets, dashboards, business processes and people’s heads.

This is why context is not just the artifacts you put together.

Context is also the harness you build around it, the tooling, the workflows, the validators and much more.

Retrieval is not enough

This is the part I think matters most.

Context is king, but context is not just retrieval.

Retrieval is one part of context. The other one is validation.

And I would argue that for AI in data, validation is even more important.

It is one thing to ground an agent to better information. It is another thing to be able to reason about the validity of the claims the agent makes based on the grounded information.

A data agent can retrieve the right dashboard, the right metric definition and the right table, and still make a mistake.

It can choose the wrong grain.

It can join tables incorrectly.

It can use a stale definition.

It can compare the wrong time periods.

It can confuse bookings, revenue, ARR and recognized revenue.

It can produce a plausible narrative that is not actually supported by the underlying data.

It can take an action based on a claim that was never validated.

This is why validation has to be part of the agentic loop itself, this is not an offline eval problem.

Offline evals are useful, but they are not enough for AI in data. Validation has to happen while the agent is doing the work.

For data agents, validation means checking claims against source systems, metric definitions, lineage, query results, business rules, historical behavior and the actual evidence the agent used.

It also means making the work auditable.

Both humans and agents should be able to inspect what was claimed, what evidence was used, what checks were performed, what assumptions were made and where the work might have failed.

Snowflake is moving aggressively in the right direction with context, retrieval, governance and security.

But the validation layer is not yet the center of the story.

And without that, I don’t believe we will be able to generate the same success that we see with AI in other engineering disciplines.

What data teams can learn from this

First, coding agents are not the right tool for AI in data.

They are part of the story, but they are not enough.

New architectures, new harnesses and different workflows are needed.

Second, context is king, but context is not one single thing.

A business user, a data engineer and a data scientist operate in completely different worlds, although they interact with the same underlying assets.

So the context they need is different too.

Third, context is extremely fragmented.

Snowflake can bring things together in one place as long as everything is within their platform. This is good for them, but not very realistic.

Even if you migrate all your data platform into their environment, you still have context that exists in external systems.

Fourth, retrieval is just one part of context.

Another important one is validation.

And for AI in data, validation might be the most important piece.

Finally, there is an important question every data leader has to ask themselves:

Should context be Snowflake’s moat, or should it be their own moat in a world where unique business data becomes the only defensible asset a business has?

Final thoughts

Seeing the direction Snowflake is taking made me really happy because it validates many of the assumptions behind Typedef.

I started Typedef because I believed that AI in data is a fundamentally different problem that cannot be solved by coding agents alone.

I do believe that context is part of the solution, but we need more than solving the retrieval part of it.

We need to be able to equip agents with the right tooling to reason across systems, for complex long-horizon tasks, in an always shifting environment.

We also need to be able to continuously validate the claims our agents are making and do that in a governed way, so both humans and agents can audit the work that is being done and keep improving and adapting.

This is where Typedef comes in.

Typedef is building the validation and orchestration layer for AI on enterprise data.

It gives agents the ability to retrieve context across fragmented systems, reason through long-running data tasks, validate claims against governed evidence and leave behind an auditable trail that both humans and agents can inspect.

Snowflake is showing why context matters.

Typedef is focused on the next layer: making sure the work agents do with that context is actually valid.

And we need to do all that while data teams maintain total control and ownership over the data of the organization and turn it into the asset that will enable their companies to thrive in the new AI world.

To learn more about how Typedef can help you achieve this, reach out.