Context Drift: The Distributed-Team Killer | JAM Creative
By Jordan Hauge — Published May 13, 2026 — Category: AI Strategy, Distributed Teams
Distributed teams have been losing the same battle for 15 years. The numbers have not budged. The tools have multiplied, the practices have matured, the documentation discipline has improved, and knowledge workers still spend a third of their time looking for things their teammates already know. The reason is not time zones. The reason is context drift, the same failure mode that breaks multi-agent AI systems, applied to humans. AI context engines are the first tool that could finally solve it. Most teams will deploy them wrong and lose the win.
Last sprint, a designer in Ukraine spent two days redoing a flow that was already finalized. The same week, an engineer in Brazil started building a feature that had been deprecated three weeks earlier during a Tuesday afternoon discussion in Kosovo, and a new hire in the US asked a Slack question that had already been answered four times across four different threads that nobody could find.

This is the regression pattern that defines distributed work in 2026, and the real damage is not the rework itself. It's the trust erosion that follows when someone realizes their best two days produced nothing because the team's left hand did not know what the right hand decided. People stop trusting the system and start hedging against it, double-checking everything and re-asking questions they already know the answers to, just to make sure nothing has shifted underneath them. The team gets slower because everyone is trying to protect themselves from rebuilding the last thing they rebuilt.

I have been running distributed teams across multiple continents for years and have lost count of the regressions. Every manager I know running a remote-first team has a version of this story, and most of them blame time zones. It may look that way on the surface, but the data does not support that explanation. The actual problem has a name and a formal definition, and until very recently nobody had the right tool to fix it.

The 15-Year-Old Problem Nobody Has Solved

The empirical evidence is striking once you put the timeline together. McKinsey measured knowledge workers' time spent searching in 2012 and found employees spent 1.8 hours every day, roughly 9.3 hours per week, looking for information. Slack ran its own Workforce Lab study in 2026 and found 33% of worker time still goes to searching for information. Forrester's research for Airtable put the number at 30%.
Asana's current Anatomy of Work Index found knowledge workers spend 60 percent of their time on what the report calls "work about work," averaging 209 hours per year per person on duplicative work and 352 hours per year talking about work.

Fifteen years of new tools. Notion went from a side project to a public company. Confluence matured into an enterprise standard. Slack hit 120 million daily active users. Loom became universal for async video. Linear and Jira replaced one another in cycles. Architecture Decision Records moved from obscure pattern to recommended practice. AI search layers were added to all of it.

The number did not move.

There is a reason for that.

Every solution attempted shares the same architectural assumption: humans will author the knowledge layer. Notion pages, Confluence spaces, ADRs, and Slack threads all depend on people writing them down and keeping them current, which mostly does not happen at the pace work actually moves. The faster the team moves, the wider the gap grows between decisions actually made and decisions actually documented. Authoring discipline has never scaled with team velocity, and we have been trying to make it scale for fifteen years with a clear track record of failure.

The cost is extremely high. A 2026 research finding from Worklytics, tracking enterprise developer usage, showed that 67% of new developer task items involve learning information someone else already knows. Two-thirds of the work on a typical engineering team is people re-deriving knowledge that already exists somewhere.
One engineering team that migrated from Notion 3.0 to Confluence in late 2025 published their data: 18% of engineering hours were being spent on broken documentation workflows before the migration.

Slack itself, in its own materials, acknowledges that institutional knowledge gets buried in busy channels and that "a product decision made in a busy general thread six months ago is effectively gone." That is the platform with 120 million daily active users admitting, in writing, that decisions made on it disappear within months. Notion and Confluence workspaces both suffer documented "content decay," with pages going stale faster than humans can refresh them. The siit.io research on knowledge management tools calls the result "digital graveyards where information exists but nobody finds it."

This is the world distributed teams are operating in. The problem is not a tooling shortage; it is a structural mismatch between the speed of decision-making and the speed of decision-capture, and the fifteen-year track record of failed solutions is the proof.

Naming the Failure: Context Drift in Human Systems

Context drift is a term that emerged in AI systems research over the last two years to describe what happens when the background information a system needs to operate correctly degrades faster than it gets refreshed. Atlan defined it in a recent piece as the silent failure mode that nobody is monitoring, where semantic definitions diverge across systems and a multi-agent workflow resolves the conflict by picking whichever definition its context layer encountered first, with nobody approving the resolution or even being aware it happened.

The arXiv paper formalizing this for multi-agent LLM systems, on the Agent Stability Index, defines behavioral drift as the condition in which "the system's decision-making patterns progressively deviate from design specifications without explicit parameter changes or system failures."
The drift compounds across handoffs: one agent's summary becomes the next agent's starting context, and any error that accumulated in the first step gets handed to the second as fact.

Read that paragraph again with a distributed human team in mind. It describes exactly what happens between time zones every week.

A 2026 piece from MongoDB on multi-agent memory engineering put it most precisely: "Most multi-agent AI systems fail not because agents can't communicate, but because they can't remember." Once you read that line, you cannot un-apply it to distributed teams. Slack, Zoom, and Loom mean a distributed team can talk all day if they want to. The harder thing, and the thing that actually breaks, is remembering together across time zones and weeks with enough fidelity that the team's collective understanding stays coherent.

Three failure modes are worth naming explicitly, borrowing the framework directly from the AI research.

Decision drift is when decisions get made but do not propagate. The architect resolves a debate on Tuesday, and by Friday two engineers in another region are still operating as though the matter is unresolved. The decision exists somewhere, just not where the work is happening.

State drift is when the team's understanding of where the project is diverges across members. Your PM is operating as though feature A is in QA, your designer thinks it is still being scoped, and your engineer is convinced it was descoped last week. All three are working from a snapshot that was correct on a different day, and none of them realize they are out of sync until something visible breaks.

Context drift is when the background knowledge required to evaluate new work is unevenly distributed. The senior engineer remembers why the team rejected a particular architecture six months ago, but the new hire has no idea. So the new hire proposes the rejected architecture, the senior engineer has to re-litigate a debate the team thought was settled, and a week disappears.
Multiply this by every senior-to-junior interaction across every time zone and the cost becomes visible.

These three drift modes are the actual failure surface of distributed work. Time zones act as a multiplier on these failures rather than as their cause, which is why a co-located team with poor decision capture and high turnover suffers the same problem at smaller scale, and a distributed team with rigorous decision capture and strong ingestion practices can run cleanly across eight time zones.

Why AI Context Engines Are Structurally Different

Every previous knowledge tool was an authoring tool, which meant the team had to write the knowledge, maintain it, and keep it current, with the team itself as the bottleneck the whole way through.

AI context engines invert this. Work happens in the places it has always happened, the engine ingests from those places, and the team queries on demand. There is no authoring step at the front and no maintenance burden in the middle. The wiki writes itself, in a manner of speaking, from the actual artifacts of work: Slack threads, Linear tickets, GitHub PRs, design files, meeting transcripts, code commits. The question stops being "did you remember to write this down" and becomes "did the work happen in a place the engine can see."

This is not theoretical anymore. Anthropic published internal data showing that engineering teams using Claude Code for tasks that previously required Google searches and tab-switching are getting answers 80% faster.

Hannah Stulberg, a PM at DoorDash, published a detailed walkthrough of what she calls a Team OS, where her entire team, including non-technical members, checks every artifact into a shared repository that the team queries collectively. The data point she shares is striking: a query about customer data consumed 3 percent of the context window, because the repository was structured so the engine could navigate directly to the right files.
Without that structure, the same query would have burned half the context window on exploration before producing a worse answer.

The most telling detail in Hannah's published account is who is participating in the shared context: engineers, designers, PMs, data scientists, and a strategy partner who had never opened GitHub two months earlier and is now putting up pull requests every day. What matters here is not that the tool turned a non-technical person into a developer, but that properly structured shared context softens the boundaries between roles, because everyone has access to the same information at the same depth.

This is the structural shift. The conversation is no longer about "AI added to your knowledge base." It is about a fundamentally different kind of infrastructure: shared working memory for the team, ingested from where work actually happens, queryable with provenance, scaling with team velocity instead of fighting against it.

The Contrarian Truth: Most Teams Will Fail at This

I want to be honest about the current state of this. The frontier is real and the wins are real, but most teams deploying AI context engines in 2026 are going to fail to capture those wins because they are going to deploy them wrong.

The clearest evidence comes from inside the most sophisticated AI shops, not outside them. There is a public GitHub issue on the Claude Code repository, filed by an engineering manager running two teams of 14 engineers, that opens with this line: "Claude Code's memory system is individual-only. In real engineering teams, knowledge flows constantly between people, through handoffs, consultations, reviews, and investigations. Today, none of that context transfers at the agent level."
He describes shared team memory as "the single biggest efficiency bottleneck for teams adopting Claude Code seriously" and "an unsolved problem across the AI-assisted development industry." This is an engineering manager at the frontier, with the most advanced AI tooling available, naming the problem out loud and acknowledging it is not solved yet. That is the honest state of the practice.

The MongoDB framing I cited earlier deserves quoting in full: "Most multi-agent AI systems fail not because agents can't communicate, but because they can't remember." The piece argues that memory engineering is the missing architectural foundation, and that production deployments have shown agents duplicating work, overlapping computation, and cascade failures spreading context pollution across teams. Replace "agents" with "engineers" and the sentence describes every regression-prone distributed team I have ever encountered.

Amit Kothari, writing about Claude Projects in November 2025, put it bluntly: "Knowledge stays locked in people's heads. Code reviews drag on. Notion pages nobody updates. Slack threads nobody can find. README files nobody reads. Same rubbish, different packaging." His point is that AI context engines only work if you stop treating them as documentation tools and start treating them as shared working memory. The teams that fail are the teams that deploy Claude Projects or Claude Code or shared repositories the same way they deployed Notion in 2019, expecting the tool to compensate for the lack of practice.

The tool does not compensate for the absence of practice; it amplifies it, and it does so faster and more visibly than the previous generation of tools did. A Notion page that nobody updates gets stale slowly and quietly. A shared AI context that nobody curates gets actively wrong, because the engine confidently surfaces outdated information without flagging it as outdated.
The teams that fail with this approach do not fail because the tool sits unused. They fail because the tool returns confident answers that are wrong, and the team acts on them.

Three Principles for Making This Work

I am experimenting with the practice across my own distributed team, and what I have learned so far comes down to three principles. The tool choice is mostly secondary, because the principles are really about how to operate any context engine once you have picked one.

Ingest, do not author. The single largest mistake I see teams make is recreating their Notion structure inside Claude Projects or a shared repository. They take the same approach that has failed for fifteen years and assume that adding AI will fix it. It will not. The shift is that the work itself becomes the documentation, with Slack decisions, Linear tickets, meeting transcripts, and other native work artifacts feeding directly into the shared memory. The artifacts produced by the team in the course of doing work are the inputs to the system, which means humans should not be writing wiki pages that describe what happened. The engine should be capturing what happened and synthesizing it on demand.

Treat decisions as first-class objects. Most of what gets captured in distributed work is conversation, which is noise. The signal you actually need to preserve is decisions, and the single highest-leverage practice on a distributed team is capturing them in a consistent format immediately when they happen, before the team disperses across time zones. Architecture Decision Records have been a recommended practice since 2011 and remain underused because they were positioned as an engineering tool, when they are really a distributed-team tool that the engineering community happened to invent first.
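To make "decisions as first-class objects" concrete, here is a minimal sketch of one decision captured in a consistent, serializable shape. The record type, field names, and source link are illustrative assumptions, not any particular tool's schema:

```python
from dataclasses import dataclass, asdict
from datetime import date
import json

@dataclass
class DecisionRecord:
    """One team decision, captured as a first-class object.

    The four core fields mirror a light ADR: what was decided,
    what else was considered, why, and what would reopen it.
    """
    decided: str             # the decision itself
    alternatives: list[str]  # options considered and rejected
    context: str             # what drove the choice
    revisit_if: str          # conditions under which to reopen it
    decided_on: str = ""     # ISO date, e.g. "2026-05-12"
    source: str = ""         # hypothetical link to the thread or meeting

# Capture happens at decision time, before the team disperses:
record = DecisionRecord(
    decided="Use event sourcing for the billing service",
    alternatives=["CRUD with audit table", "change-data-capture"],
    context="Billing disputes require replaying exact state history",
    revisit_if="Event volume exceeds what the team can operate",
    decided_on=str(date(2026, 5, 12)),
    source="slack://billing-arch thread, 2026-05-12",
)

# A consistent serialized stream is what makes decisions queryable later:
print(json.dumps(asdict(record), indent=2))
```

The specific container matters far less than the consistency: a stream of records in one shape is something an engine can query; the same facts scattered across threads are not.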
Every meaningful decision the team makes deserves a four-paragraph capture covering what was decided, what alternatives were considered, what context drove the choice, and what would need to change to revisit it. Once you have a stream of decisions in a consistent format, the AI engine can answer "why did we choose X" with precision, because the decisions are first-class queryable objects instead of buried Slack messages. Decision loss is the single biggest source of regression in distributed work, and solving decision capture is what makes most of the drift problem dissolve.

Provenance is non-negotiable. Every answer the AI gives must come with a source, not in the vague sense of "according to the team's records," but specifically: this thread on this date, this commit, this Linear ticket, this section of the meeting transcript. Without that audit trail, the AI is just confident hallucination wearing a knowledge-base costume, and the team will learn to ignore it within weeks. With it, the AI becomes the team's shared memory with a verifiable record, which means the team will trust it, which means they will actually use it, which means it gets better through use. That is the only loop in this whole system that compounds in the right direction, and it depends entirely on provenance being treated as load-bearing rather than optional.

These three principles only work as a set. Ingestion without decision capture produces a system that knows what happened but cannot tell you what was decided. Decision capture without ingestion produces a beautiful decision log that the rest of the work artifacts contradict. Provenance without either of the others produces fast wrong answers with footnotes. The teams getting the 80% research-time reductions and the 3% context-utilization wins are running all three principles at once.

Where This Breaks

The honest constraints are worth naming.

The AI engine is only as good as what it ingests.
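What that means mechanically is easiest to see in a toy sketch: the engine can only answer from artifacts that were ingested, and each answer is only as trustworthy as the provenance attached to those artifacts. All names, connector sources, and the keyword-matching retrieval below are hypothetical simplifications, not a specific tool's API:

```python
from dataclasses import dataclass

@dataclass
class Artifact:
    text: str       # the content the engine can search
    source: str     # provenance: where it came from, verbatim
    timestamp: str  # when the artifact was produced

class ContextStore:
    """Toy shared-memory store: ingest work artifacts, answer with sources."""

    def __init__(self):
        self.artifacts: list[Artifact] = []

    def ingest(self, artifact: Artifact) -> None:
        # No authoring step: artifacts arrive as-is from where work happens.
        self.artifacts.append(artifact)

    def query(self, term: str) -> list[tuple[str, str]]:
        # Return matching text WITH its provenance. A real engine would use
        # semantic retrieval here, but the provenance contract is identical:
        # no answer leaves the store without its source attached.
        term = term.lower()
        return [(a.text, a.source) for a in self.artifacts if term in a.text.lower()]

store = ContextStore()
# Hypothetical connectors would feed artifacts like these automatically:
store.ingest(Artifact("Deprecated the legacy export flow", "slack://product, 2026-04-21", "2026-04-21"))
store.ingest(Artifact("Export rewrite tracked in LIN-482", "linear://LIN-482", "2026-04-22"))

for text, source in store.query("export"):
    print(f"{text}  [{source}]")
```

Note what the sketch cannot do: nothing reaches `query` unless a connector saw it happen, which is exactly the constraint the next paragraph describes.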
Decisions made in DMs, in offline meetings, or in someone's head do not exist in any system the engine can read. A senior engineer who resolves an architecture debate during a 1:1 walk has just produced an invisible decision, and no AI in the world can surface it. The practice has to come first.

Provenance is brittle. A link to a Slack message that gets deleted, a reference to a Linear ticket that gets archived, a citation to a PR that gets force-pushed: all of these break the audit trail. The team has to operate with a "decisions are permanent" discipline that most teams do not currently maintain.

Scale is a real variable. The engineering manager in the Claude Code GitHub issue runs 14 engineers across two teams and is hitting walls. Solo operators and small teams are reporting easier wins because the surface area is smaller. The wins are real, but the curve is not linear, and the tooling at the team-memory layer is still being built. Anyone deploying this at 50-engineer scale today is at the frontier, and the frontier is uneven.

The model drifts. Even with perfect ingestion and perfect provenance, the AI engine you query today and the one you query in six months will not behave identically, because outputs, confidence calibrations, and retrieval strategies all shift as the underlying models update. The discipline has to include periodic audits where the team checks whether the answers it is getting still match ground truth.

None of these constraints break the thesis; they temper it. The win is real, and it is available to teams that do the work. This is not a magic tool but a structurally new kind of tool that requires a new kind of practice, and the teams that figure out the practice will compound advantage over the teams that do not.

What to Take to Your Team This Week

Three concrete moves, in priority order.

Audit your decision capture.
Pick five significant decisions your team made in the last 30 days and try to find them in writing, along with the alternatives considered, the context behind the choice, and the conditions under which you would revisit it. Whatever you find or do not find is your starting baseline. Most teams will find that two of the five decisions are documented, one is captured ambiguously in a Slack thread somewhere, and two are nowhere at all. That gap is what the AI engine cannot close on its own.

Pick one decision format and use it for the next 30 days. Light ADR style works fine, with four sections covering what was decided, alternatives considered, context driving the choice, and conditions for revisiting. Use it for every decision, not just architectural ones: product, design, process, hiring, vendor selection, whatever the team is choosing between. The format matters less than the consistency, and if the team commits to the practice for 30 days, you will have a queryable decision stream that did not exist before.

Pick a context engine to experiment with. Claude Projects, a shared Claude Code repository, a self-hosted RAG system, whatever fits your stack. The tool matters less than the practice it is wrapping. Wire it to the places work actually happens, verify provenance on every answer the team relies on, audit it periodically against ground truth, and treat the whole thing as shared working memory rather than as Notion 2.0.

The fifteen-year stagnation in distributed-team knowledge management is about to break. The teams that figure out the practice in 2026 will compound velocity advantages over the teams still trying to make Confluence work, and the gap is going to widen quickly. The tool is finally available. The hard part is still the discipline around the tool, and that part has not changed in fifteen years either.