AI Coding Copilots: State of the Practice
A practitioner-grounded analysis of the AI coding assistant landscape, productivity claims vs. measured reality, and strategic implications for enterprise technical leadership.
The "coding copilot" category no longer describes the competitive landscape. Over the past four months, what began as IDE-embedded code completion has bifurcated into two distinct product categories, developer tooling on one side and general-purpose knowledge-work agents on the other, with different users, use cases, and evaluation criteria. This is not incremental evolution; it is a structural reframing of what AI tools are and who they serve.
- The AI copilot market has consolidated around four categories: IDE-native agents (Cursor, Windsurf), terminal-based assistants (Claude Code), enterprise platforms (GitHub Copilot), and specialized tools (Amazon Q, Tabnine).
- Productivity claims diverge sharply from measured reality: the METR RCT found experienced developers were 19% slower with AI tools, while believing they were 20% faster, creating a 39-point perception gap.
- Security remains a critical concern: 29.1% of AI-generated Python code contains vulnerabilities; repositories with Copilot show 40% higher secret leak rates than baseline.
- Enterprise adoption is proceeding despite mixed evidence, with 90% of Fortune 100 companies using AI coding assistants and 65% of developers using them weekly.
- The strategic opportunity lies not in wholesale adoption but in targeted deployment where AI demonstrably adds value: boilerplate generation, test writing, unfamiliar codebase navigation, and junior developer acceleration.
How to Read This Document
What This Is
A practitioner-led intelligence briefing synthesizing primary research, market signals, and expert interviews into actionable strategic guidance. Updated quarterly with breaking signals as events warrant.
What This Is Not
Not a vendor comparison or buying guide. Not sponsored research. We take no vendor money and maintain editorial independence through subscriber funding alone.
Intended Audience
CTOs, VPs of Engineering, and technical leadership at enterprises evaluating AI coding assistant adoption. Assumes familiarity with software development practices and enterprise procurement considerations.
The Coding Copilot Category Is Dissolving
The escape from the IDE. Anthropic's progression from Claude Code (terminal, mid-2025) through Cowork (desktop, Jan 2026) to Chrome, Excel, and PowerPoint integrations demonstrates a single agent architecture expanding from developer tooling into general knowledge work. OpenAI's GPT-5.3-Codex is explicitly positioned as moving "beyond code to computer operation." Bloomberg attributed a $285B software stock selloff to Cowork's launch. Category boundaries are no longer reliable for procurement decisions.
Coding becomes a shared capability. Boris Cherny, head of Claude Code at Anthropic, anticipates coding becoming a shared capability across roles rather than the exclusive domain of software engineers. Cowork already enables non-developers to execute multi-step file management, data analysis, and document creation workflows. Markets are pricing this as organizational restructuring, not just tool adoption.
Primary Evidence Base
This briefing synthesizes evidence from peer-reviewed research, controlled trials, vendor documentation, practitioner interviews, and enterprise telemetry. We weight independent research over vendor-sponsored studies, and measured outcomes over self-reported productivity gains.
| Source | Type | Date | Citation |
|---|---|---|---|
| METR Developer Productivity Study | RCT | Jul 2025 | metr.org |
| Faros AI Engineering Intelligence | Telemetry | 2025 | faros.ai |
| Google DORA Report | Survey | 2024 | dora.dev/research |
| GitGuardian State of Secrets Sprawl | Telemetry | 2025 | blog.gitguardian.com |
| GitClear Code Quality Analysis | Telemetry | 2024-25 | gitclear.com |
| Stack Overflow Developer Survey | Survey | 2025 | survey.stackoverflow.co |
| MIT Technology Review Investigation | Journalism | Jan 2026 | technologyreview.com |
| Aim Security EchoLeak Disclosure | Security | Jun 2025 | fortune.com |
| Copilot Code Review Evaluation (arXiv) | Academic | Sep 2025 | arxiv.org/html/2509.13650v1 |
| DX Engineering Enablement Analysis | Analysis | Jul 2025 | newsletter.getdx.com |
Evidence weighting:
- High: large-scale telemetry from independent sources (Faros, GitClear, GitGuardian)
- Medium: industry surveys with large n (DORA, Stack Overflow)
- Lower: vendor-sponsored research (noted where cited)
Key Market and Research Signals
We track signals across adoption metrics, productivity evidence, security findings, and competitive dynamics. The strongest signals from Q4 2025 through Q1 2026 draw on the following sources:
- metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study
- faros.ai (telemetry analysis)
- blog.gitguardian.com/github-copilot-security-and-privacy
- dora.dev/research
- gitclear.com
- survey.stackoverflow.co
- technologyreview.com
- fortune.com (exclusive report)
Emergent Patterns
Cross-referencing signals reveals five distinct patterns shaping the copilot landscape. These patterns inform our strategic recommendations.
Tool Landscape: Q1 2026
The market has consolidated around four distinct categories, each optimized for different workflows and organizational contexts.
| Tool | Category | Pricing | Best For |
|---|---|---|---|
| GitHub Copilot | Enterprise Platform | $10-39/mo | Microsoft-centric orgs, broad IDE support, compliance |
| Cursor | IDE-Native Agent | $20-40/mo | Multi-file refactoring, AI-first development, complex projects |
| Windsurf | IDE-Native Agent | $15/mo | Cost-conscious teams, Cascade Flow automation, JetBrains users |
| Claude Code | Terminal-Based | $20/mo (Pro) | Terminal workflows, 200K context, complex reasoning |
| Amazon Q Developer | Cloud-Specific | $19/mo | AWS-centric development, infrastructure code |
| Tabnine | Privacy-First | $12/mo | Air-gapped environments, on-premises deployment |
Analytical Lenses for Evaluation
Beyond the raw signals, experienced practitioners apply specific analytical frameworks to interpret AI copilot value. These frameworks, drawn from operational experience, help technical leadership avoid common evaluation errors.
Cory Doctorow articulates a framing that upends conventional productivity thinking: code is a liability, not an asset. Code's capabilities are assets.
Every line of code represents ongoing maintenance burden: understanding by future maintainers, testing when dependencies change, updating when upstream systems evolve, and revisiting when assumptions change (Y2K, API deprecations, security patches).
Implication: If AI produces code 10x faster, it may be producing liability 10x faster. Measuring lines of code, PRs merged, or tasks completed without measuring maintenance burden is measuring the wrong thing entirely.
"Writing code" is about making code that runs well: breaking down complex tasks into discrete steps a computer can perform, optimizing resource usage.
"Software engineering" is about making code that fails well: upstream processes generating data, downstream processes receiving output, adjacent systems sharing data flows, how the world will change around the code, and legibility for future maintainers.
Implication: AI can write code. AI cannot do software engineering. Software engineering requires context that extends far beyond any prompt. The productivity paradox (individual gains, organizational stagnation) may reflect this distinction.
Centaur: A person assisted by a machine. They choose when to use AI, at what pace, for which tasks, and apply judgment to verify outputs. A senior developer using AI for boilerplate they've written hundreds of times, then reviewing with intuitive expertise, is a centaur.
Reverse centaur: A person conscripted into assisting a machine. Ordered to produce at 10x previous rate, must use AI to achieve it, cannot possibly review output adequately. They become the "accountability sink" for AI's mistakes.
Implication: Studies measure averages across mixed populations. Senior developers may be centaurs; juniors pressured to use AI may be reverse centaurs. This explains why experience level matters so much in productivity findings.
Teams appear to be generating activity (commits, PRs, deployments) but are not creating value. This connects to Brooks' Law: adding manpower to a late project makes it later. More people produce more code that must be integrated, reviewed, and maintained.
The AI parallel: adding AI to a struggling team may be adding fuel to the fire. The Faros AI finding (21% more tasks, 98% more PRs, 91% longer reviews, flat delivery) is exactly what "movement not progress" looks like in telemetry data.
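The pattern above can be encoded as a simple telemetry check. A hedged sketch: the deltas carry the Faros AI figures cited above, while `movement_not_progress` and its thresholds are illustrative assumptions, not Faros API names.

```python
# Sketch: flag "movement not progress" when activity metrics jump while
# delivery stays flat. Deltas are the Faros AI figures cited above;
# the threshold values are illustrative assumptions.
deltas = {"tasks": 0.21, "prs": 0.98, "review_time": 0.91, "delivery": 0.00}

def movement_not_progress(d, activity_min=0.15, delivery_min=0.05):
    """True when task/PR activity rises noticeably but delivery is flat."""
    activity_up = d["tasks"] > activity_min or d["prs"] > activity_min
    delivery_flat = abs(d["delivery"]) < delivery_min
    return activity_up and delivery_flat
```

On the Faros numbers, this check fires: activity is sharply up while delivery is unchanged.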
AI Copilot Threat Model
AI coding assistants introduce threat categories absent from traditional development. The trust boundary expands from "developer to code" to include tool vendors, model providers, and training data provenance.
| Threat Category | Vector | Evidence |
|---|---|---|
| Prompt Injection | Malicious instructions in code comments, docs, error messages | EchoLeak zero-click attack (Jun 2025) |
| Data Exfiltration | Proprietary code sent to model providers, MCP servers | 77% of orgs report AI-related breaches (HiddenLayer) |
| Vulnerable Output | AI generates code with security flaws | 29.1% of AI Python has vulnerabilities (Gartner) |
| Secret Leakage | Credentials in prompts, training data poisoning | 40% higher leak rate with Copilot (GitGuardian) |
| Package Hallucination | AI suggests non-existent packages (typosquatting risk) | Documented in security research |
| Shadow AI | Unsanctioned tools with proprietary code | Significant enterprise concern |
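As one concrete mitigation for the package-hallucination row, a dependency allowlist can sit between the assistant and the installer. This is a minimal sketch under assumed names (`APPROVED`, `vet_dependencies`); a production gate would consult a private registry or lockfile rather than a hard-coded set.

```python
# Sketch: vet AI-suggested dependencies against an approved allowlist
# before installation, catching hallucinated or typosquatted names.
# The allowlist contents here are illustrative.
APPROVED = {"requests", "numpy", "pandas", "boto3"}

def vet_dependencies(suggested):
    """Split suggested package names into approved and flagged lists."""
    approved = [p for p in suggested if p.lower() in APPROVED]
    flagged = [p for p in suggested if p.lower() not in APPROVED]
    return approved, flagged

ok, suspect = vet_dependencies(["requests", "requets", "numpyy"])
```

Here the two misspelled names are flagged for human review instead of reaching `pip install`.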
Future Directions: Systematic Predictions
Based on current trajectory analysis, market signals, and research trends, we provide directional predictions at four time horizons. Confidence decreases with distance; these are working hypotheses to be validated against emerging evidence.
Capability: Context windows expand to 500K+ tokens in production tools. Multi-file editing becomes table stakes across all major IDE copilots.
Watch: Cowork enterprise features (audit logs, compliance API, org-wide plugin management). Availability determines enterprise readiness timeline for knowledge work agents.
Verification: Google Conductor adoption signals whether verification-integrated approach gains traction. If so, expect Anthropic and OpenAI to follow with similar capabilities.
Watch: Amazon Kiro post-mortem. Whether Amazon publishes detailed incident analysis or changes autonomy policies will signal industry direction on Level 3 governance.
Inference Economics: Specialized silicon (Taalas, Groq, Cerebras) drives 5-10x inference cost reduction. Model the cost trajectory, not the vendor. This changes TCO calculations and enables new deployment patterns previously uneconomical.
Market Structure: 3-4 dominant players emerge (likely: GitHub Copilot, Cursor, Claude Code, one Chinese player). ACP adoption trajectory determines multi-vendor interoperability.
Infrastructure: Heterogeneous inference stacks emerge: specialized silicon for high-volume stable workloads alongside flexible GPUs for frontier models. Model-specific hardware raises e-waste concerns with 12-24 month depreciation cycles vs. traditional 3-5 year hardware.
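The depreciation concern reduces to straight-line amortization arithmetic. A hedged sketch: the equal-capex figure is an illustrative assumption, not a vendor quote.

```python
# Sketch: amortized monthly hardware cost at equal capex, contrasting
# model-specific silicon (12-24 month useful life) with general-purpose
# GPUs (3-5 years). The $100K capex figure is an illustrative assumption.
def monthly_amortization(capex_usd, lifetime_months):
    """Straight-line amortization: capex spread evenly over useful life."""
    return capex_usd / lifetime_months

specialized = monthly_amortization(100_000, 18)  # midpoint of 12-24 months
gpu = monthly_amortization(100_000, 48)          # midpoint of 3-5 years
# Specialized silicon carries roughly 2.7x higher monthly amortization at
# equal capex, so its inference savings must clear that bar to pay off.
```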
Workforce: Labor-market restructuring data emerges. Watch for concrete headcount and role changes tied to agent adoption (not just productivity claims). The "coding as shared capability" thesis is either validated or refuted by organizational outcomes.
Organizational Readiness Assessment
The DORA 2025 Report introduced a critical framing: AI acts as both "mirror and multiplier." In cohesive organizations with solid foundations, AI boosts efficiency. In fragmented organizations, AI highlights and amplifies weaknesses. Assess readiness before rollout.
| Enabler | Assessment Questions | Score (1-5) |
|---|---|---|
| Clear AI Stance | Do developers know which tools are permitted? Are expectations documented? | ___ |
| Healthy Data Ecosystems | Is internal data quality high, accessible, and unified? | ___ |
| AI-Accessible Internal Data | Can AI tools access codebase context beyond generic assistance? | ___ |
| Strong Version Control | Are workflows mature? Can you rollback confidently? | ___ |
| Small Batch Discipline | Do teams maintain incremental change practices? | ___ |
| User-Centric Focus | Is product strategy clear despite accelerated velocity? | ___ |
| Quality Internal Platforms | Do technical foundations enable scale? | ___ |
Score <21: Strengthen foundational enablers before piloting; per the DORA framing, AI will amplify existing weaknesses.
Score 21-28: Pilot with strongest teams, fix gaps in parallel.
Score >28: Ready for broader rollout with monitoring.
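The rubric can be operationalized as a small scoring helper. A hedged sketch: the 21 and 28 boundaries follow the thresholds stated in this briefing, and the sub-21 recommendation is our reading of the DORA "mirror and multiplier" framing rather than a published threshold.

```python
# Sketch: sum the seven enabler scores (1-5 each) and map the total to a
# rollout recommendation. The sub-21 tier wording is an assumption based
# on the DORA "mirror and multiplier" framing, not a DORA rule.
def readiness_tier(scores):
    """Return (total, recommendation) for seven 1-5 enabler scores."""
    if len(scores) != 7 or not all(1 <= s <= 5 for s in scores):
        raise ValueError("expected seven scores, each between 1 and 5")
    total = sum(scores)
    if total > 28:
        return total, "broader rollout with monitoring"
    if total >= 21:
        return total, "pilot with strongest teams, fix gaps in parallel"
    return total, "strengthen foundations before piloting"
```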
Strategic Implications
Strategic Recommendations
Based on current evidence, we provide explicit guidance for enterprise technical leadership. Each recommendation includes context, verdict, and rationale tied to specific evidence.
Starting points vary by existing infrastructure and workforce composition. Select the profile closest to your organization.
Strategic Trade-offs
Risks
- Productivity theater: teams report gains that do not materialize in delivery metrics
- Security debt accumulation: AI-generated vulnerabilities compound over time
- Skill atrophy: over-reliance on AI may erode foundational coding skills in juniors
- Shadow AI: developers using unsanctioned tools with proprietary code
- Vendor lock-in: deep integration with specific copilots creates switching costs
- Zero-click attacks: emerging vulnerability class (EchoLeak) affects AI agents
Opportunities
- Junior developer acceleration: clear evidence of benefit for less experienced engineers
- Onboarding velocity: AI reduces time to productivity in unfamiliar codebases
- Test coverage: AI-generated tests improve baseline coverage cost-effectively
- Documentation: AI excels at generating and maintaining documentation
- Legacy modernization: AI assists in understanding and refactoring old code
- Cost arbitrage: multi-tool strategy can reduce per-developer costs by 30-40%
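The cost-arbitrage figure is easy to sanity-check against the list prices in the tool-landscape table. A hedged sketch: the role mix and tool assignments below are illustrative assumptions, not recommendations.

```python
# Sketch: blended per-seat cost of a role-tiered multi-tool strategy vs
# a uniform top-tier seat, using list prices from the table above.
# Role shares and tool assignments are illustrative assumptions.
UNIFORM_SEAT = 39  # GitHub Copilot top tier, $/developer/month

TIERED = {  # role -> (headcount share, seat price $/month)
    "platform": (0.3, 40),  # Cursor top tier for heavy multi-file work
    "product":  (0.4, 20),  # Claude Code Pro for mainstream development
    "support":  (0.3, 15),  # Windsurf for cost-conscious teams
}

def blended_cost(tiers):
    """Headcount-weighted average seat price."""
    return sum(share * price for share, price in tiers.values())

savings = 1 - blended_cost(TIERED) / UNIFORM_SEAT  # lands inside 30-40%
```

Under this illustrative mix, the blended seat costs $24.50/month against $39 uniform, a saving of roughly 37%, consistent with the claimed range.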
Practitioner frameworks (Code as Liability, Centaur/Reverse Centaur, Movement vs Progress) are drawn from the Peerlabs Agentic Programming Guide, synthesizing operational experience from enterprise technical leadership. The Readiness Assessment framework derives from DORA 2025 research on AI adoption enablers.
Integration status: This v1.1-integrated briefing incorporates the Agent Taxonomy (5-axis framework), Agents at the Gate intelligence brief (vendor strategies, 90-day playbook), AI-GDP Measurement Gap signal (SB-2026-009), and Taalas inference silicon briefing note. Ethnographic interview reconciliation remains pending for v1.2.
Further Reading
| Resource | Type | Relevance |
|---|---|---|
| Peerlabs Agentic Programming Guide | Internal | Full practitioner frameworks, implementation guidance |
| A17: Team Adoption and Organizational Rollout | Guide Appendix | Phased rollout strategy, pilot structure, success criteria |
| A4: Security in Generative AI | Guide Appendix | Complete threat model, mitigations, secure workflows |
| A18: Research Limitations | Guide Appendix | Deep dive on measurement problems, research gaps |
| A12: Evaluation & Benchmarks | Guide Appendix | SWE-bench interpretation, production metrics |
| A22: Tool Design | Guide Appendix | Designing tools for agents, MCP patterns |