Exploring Spec-Driven Development: The AI Agent Context Drift Problem

Spec-Driven Development (SDD) is rapidly reshaping the software engineering lifecycle. While using product requirements to guide development is standard practice, structuring those requirements specifically to ground an AI agent’s reasoning represents a fundamental shift. Whether you embrace it or resist it, this agent-native workflow is here to stay, forcing teams to rethink how context is managed and shared.

What is Spec-Driven Development?

Spec-Driven Development (SDD) is an approach that emerged alongside the rise of AI coding agents. Instead of relying on traditional, unstructured requirements that leave significant room for interpretation, SDD focuses on creating detailed, explicit specifications that serve as the primary input for AI agents. The goal is to ground the agent’s reasoning, reduce ambiguity, and ensure more consistent, high-quality code generation.

At its core, SDD bridges the gap between human intent and machine execution. It treats the specification not as a static document, but as the foundational artifact around which the entire development process revolves. This approach acknowledges that while AI agents are powerful, they are not infallible. Without clear, structured guidance, they revert to “vibe-coding,” generating code based on probabilistic patterns rather than deliberate reasoning. SDD provides that necessary structure, creating a framework where human oversight and AI capabilities can work in tandem.

Our Journey

Today, we rely on our own custom spec-driven process. We ended up building it after exploring external frameworks like Speckit and BMAD. While those tools are decent out of the box, they ultimately suffer from the same limitation: they’re too generic to fit our specific workflow.

For us, spec-driven development has brought some much-needed order to the chaos of vibe-coding, leaving a clear trail of breadcrumbs for implementation decisions. When auditors ask why we built something a certain way, having a structured record is a far better starting point than trying to explain the guesswork of vibe-coding.

The real value, however, is the repeatable process we’ve built around it. Feature requirements are co-created and refined with our product team, then handed over to the tech lead and architect to write a technical specification. Once the spec passes through its own refinement loop, we generate user stories, placing both specs and stories in a workspace where our AI agents can easily access them.

Figure 1: The Spec-Driven Development Lifecycle Pipeline

Early Challenges

Spec-driven development is still a fairly new practice, which means we don’t have many established industry patterns to draw from. Here are a few of the hurdles we encountered early on:

Solo Specification Generation

Building specifications in isolation tends to produce one-sided, narrow results. We found that the process works significantly better when multiple stakeholders participate in refining the specifications, ensuring all angles are considered.

Technical Capabilities

Product managers with a less technical background often have a harder time using spec generation tools effectively. Conversely, those who are more technically savvy are better equipped to leverage AI, accelerate the spec generation process, and participate more actively.

Process Overhead

Our initial endeavors with this process were bumpy. Creating specifications without the context of existing functionality or broader business goals resulted in specs that were far off track. Including business goals, high-level architecture, and standardized templates helped tremendously. To support this, we ended up configuring dedicated product repositories to house all the necessary artifacts.
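As a rough illustration of what a standardized template buys you, the sketch below checks a draft spec for required sections before it enters review. The section names and the `missing_sections` helper are hypothetical and only stand in for whatever a team's own template defines.

```python
# Hypothetical section names; a real template would define its own.
REQUIRED_SECTIONS = (
    "Business Goal",
    "Existing Functionality",
    "Architecture Context",
    "Acceptance Criteria",
)


def missing_sections(spec_text: str, required: tuple = REQUIRED_SECTIONS) -> list:
    """Return the required headings that do not appear as a line in the spec."""
    # Normalize each line: drop surrounding whitespace and any leading '#' markers.
    present = {
        line.strip().lstrip("#").strip().lower()
        for line in spec_text.splitlines()
    }
    return [name for name in required if name.lower() not in present]


draft = """# Business Goal
Reduce churn.

# Acceptance Criteria
- Export completes in under five seconds.
"""

gaps = missing_sections(draft)
print(gaps)  # ['Existing Functionality', 'Architecture Context']
```

A check like this could run as a pre-review step in the product repository, so that drafts lacking business or architectural context are caught before stakeholders spend time on them.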

Feature Scoping

The initial specifications we created were simply too broad. Because each one covered so much ground, it could not be specific about any single part, which gave the agents too much freedom during implementation. It felt a bit like stepping back in time to waterfall development. The knock-on effect was massive pull requests and painfully long review times. As we refined the process and began scoping features appropriately, we reduced both PR sizes and review times.

Managing Context vs. Tracking Work

Application Lifecycle Management (ALM) tools like Jira and Azure DevOps (ADO) are designed for tracking the progress of work, not for storing long-lived context. They aren’t the right place for complex specifications.

We realized we needed a curated, central area for context management. For example, if you generate a specification and then create corresponding stories in a tracking system, where does the spec actually live? Does the architectural spec live in your codebase? How does that impact access controls? While small organizations might not face this issue, larger ones absolutely do. If a spec needs to change, do you update the existing feature in ADO or create a new feature? What happens if someone decides to clean up your backlog and all the associated context is destroyed? Separating tracking of software changes from context storage became a crucial distinction for us.
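One way to picture the separation is a tracker entry that holds only a stable pointer to a spec, while the spec text lives in a separate store. The sketch below is a minimal model under that assumption. `SpecRef`, `WorkItem`, `ContextStore`, and the ID formats are hypothetical and do not describe any particular tool's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SpecRef:
    """Stable pointer to a spec in the context store (hypothetical scheme)."""
    spec_id: str   # e.g. "FS-012"; the ID format is an assumption
    revision: str  # e.g. a commit hash or tag in the spec repository


@dataclass
class WorkItem:
    """A tracker entry: status plus a pointer, never the spec text itself."""
    key: str
    title: str
    spec: SpecRef
    status: str = "todo"


@dataclass
class ContextStore:
    """Long-lived spec storage, independent of the tracker's lifecycle."""
    _specs: dict = field(default_factory=dict)

    def put(self, ref: SpecRef, body: str) -> None:
        self._specs[ref] = body

    def get(self, ref: SpecRef) -> str:
        return self._specs[ref]


store = ContextStore()
ref = SpecRef("FS-012", "a1b2c3d")
store.put(ref, "Users can export invoices as PDF.")

backlog = [WorkItem("STORY-1", "Add export button", ref)]
backlog.clear()  # someone cleans up the backlog

# The spec survives because the tracker only ever held a reference to it.
print(store.get(ref))  # Users can export invoices as PDF.
```

Pinning a revision alongside the ID also gives one possible answer to the "update or create a new feature" question: the work item keeps pointing at the revision it was planned against, and a changed spec simply gets a new revision.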

Organizational Structure

Team structure also has a massive impact on the success of this process. Teams that lack “T-shaped” engineers tend to struggle with spec-driven development. We set up our processes so that an engineer could pick up a feature and complete it end-to-end. Having a solid understanding of each part of the application, build system, or infrastructure it touches (like infrastructure as code, deployment pipelines, and security scanning) is essential for evaluating whether the agent’s output is of a sufficiently high quality.

Current Gaps

The scenarios below describe gaps that remain when specs are managed in remote repositories:

Scenario 1: Parallel Specification Refinement

A Product Manager (PM) creates a new functional specification within a dedicated repository, isolating it on a feature branch. They submit the draft to stakeholders for review. While stakeholders comment on the first draft, the PM starts a second functional specification on a separate branch. As feedback arrives for the first spec, the PM commits their in-progress work on the second branch, switches back to the original branch, and refines the initial spec. Once the first specification completes its review cycle without further comments, it is merged into the main branch. The PM then returns to the second branch to repeat the cycle.

Meanwhile, the technical lead has started work on the technical specification, based on the new functional specification on main. The technical spec goes through the same review cycle, potentially with a different set of stakeholders. When its refinement is complete, the lead generates stories and schedules them to be picked up by the developers.

As the developers pick up the stories, they start breaking them down into tasks with the agent. This breakdown requires the agent to read the current functional and technical specifications. During this period, the second functional specification is merged into main. If all goes well, the planning agent focuses only on the functional specification relevant to the story. This is where the first issue appears: if the planning agent reads both specifications and they conflict, its plan can drift from the intended feature. One way to solve this is to move specifications into main only when a feature is completed, but this adds overhead and coordination. Another option is to explicitly download the current specification to a workspace and work on it only there. Both approaches are somewhat clunky and error-prone.
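The second option, working only from an explicitly chosen specification, can be sketched as a small selection step in front of the planning agent. This is an illustration under assumptions, not a description of our tooling: it presumes each story records the IDs of the specs it was generated from, and `Spec`, `Story`, and `select_context` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    spec_id: str
    kind: str  # "functional" or "technical"
    body: str


@dataclass(frozen=True)
class Story:
    key: str
    spec_ids: tuple  # IDs of the specs this story was generated from (assumed)


def select_context(story: Story, specs_on_main: list) -> list:
    """Return only the specs the story pins, failing loudly if one is missing."""
    by_id = {spec.spec_id: spec for spec in specs_on_main}
    missing = [sid for sid in story.spec_ids if sid not in by_id]
    if missing:
        raise LookupError(f"{story.key} pins unknown specs: {missing}")
    return [by_id[sid] for sid in story.spec_ids]


# State of main after the second functional spec has been merged.
main = [
    Spec("FS-001", "functional", "Invoices can be exported as PDF."),
    Spec("TS-001", "technical", "Render PDFs in a background worker."),
    Spec("FS-002", "functional", "Invoices can be exported as CSV only."),
]

story = Story("STORY-7", ("FS-001", "TS-001"))
context = select_context(story, main)
print([spec.spec_id for spec in context])  # ['FS-001', 'TS-001']
```

Because the planning agent receives only the pinned specs, the conflicting `FS-002` never reaches its context. The cost is the same one noted above: someone has to maintain the pinning, which is itself a manual step that can go wrong.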

Figure 2: Parallel Specification Branching causing Context Drift

Scenario 2: Post-Implementation Feedback Loop (Upstream Synchronization)

During implementation, a developer uncovers an architectural limitation that invalidates part of the active technical and functional specifications. The team is faced with two choices:

Retroactive Updates: Pause active development to update both the functional and technical specifications, then sync all uncompleted user stories. This works in isolation but risks a cascading coordination headache if downstream specifications have already branched from the original documents.
Superseding Specs: Complete the feature in its current state, then generate new functional and technical specifications that officially supersede the old ones. While this works well for minor course corrections, significant pivots can result in extensive rework, and leaving deprecated specifications in the repository risks poisoning the AI agent’s context window during future tasks.
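For the superseding approach, one mitigation for stale context is to record the replacement relationship in the spec itself and filter on it before anything is handed to an agent. The sketch below assumes a hypothetical `supersedes` field on each spec record. It is one possible convention, not an established standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpecRecord:
    spec_id: str
    supersedes: Optional[str] = None  # ID of the spec this one replaces, if any


def active_specs(records: list) -> list:
    """Drop every spec that some other spec declares it supersedes."""
    superseded = {r.supersedes for r in records if r.supersedes is not None}
    return [r for r in records if r.spec_id not in superseded]


repo = [
    SpecRecord("FS-001"),
    SpecRecord("FS-002"),
    SpecRecord("FS-003", supersedes="FS-001"),
]

print([r.spec_id for r in active_specs(repo)])  # ['FS-002', 'FS-003']
```

The deprecated spec stays in the repository for audit purposes but is excluded from the set an agent can read. This only helps if every superseding spec reliably declares what it replaces.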

Figure 3: The Upstream Synchronization Dilemma

The Path Forward

Based on the scenarios and gaps identified above, it’s clear that relying purely on traditional version control and project management tools isn’t enough to solve the context drift problem in a scaled, agentic workflow. The current workarounds—whether delaying merges, retroactively synchronizing, or manually curating workspaces—introduce too much friction and error into the development pipeline.

In our next post, we will explore a proposed solution to these challenges. We’ll discuss moving beyond static document management and looking towards more dynamic, context-aware systems designed specifically for AI agents. Stay tuned as we dive into new architectural approaches and tools that can help overcome these hurdles and unlock the full potential of Spec-Driven Development at scale.
