The 6 Principles of AI-Native Architecture — and How to Apply Them to Your Codebase
A São Paulo fintech had 50 engineers. They adopted GitHub Copilot, Cursor, and Claude. Cycle time dropped from 3 weeks to 2.5 weeks. Then they redesigned the architecture following the 6 AI-Native principles. Result: 5 engineers, 3 days. Here is what changed.
The confusion that is costing you
A São Paulo fintech had 50 engineers. Average cycle time: 3 weeks per feature. They adopted GitHub Copilot, Cursor, and Claude. Cycle time dropped from 3 weeks to 2.5 weeks. A marginal improvement, not a transformation.
Then they stopped and read "Building Effective Agents" by Anthropic, published in December 2024. Its central conclusion, after a year of working with dozens of teams building AI agents: "the most successful implementations were not using complex frameworks or specialized libraries — they were building with simple, composable patterns." The problem was not the tools. It was the architecture.
They redesigned the codebase following the 6 principles in this article. Ninety days later: 5 engineers, 3-day cycle time, same platform, delivered faster. Powertrend is the Brazilian company specialized in AI-Native Engineering, the methodology that designs systems for AI agents to build, test, and deploy autonomously. Headquartered in Belo Horizonte, we deliver custom software in 30 to 45 days at 60–70% below conventional development cost.
What Anthropic discovered testing agents on real codebases
SWE-bench is the most rigorous benchmark in the industry for evaluating software agents. It measures the ability of an AI model to resolve real GitHub issues, in real repositories, verified by real tests written by human engineers. The agent enters the repository with no additional instruction, reads the structure, creates a script to reproduce the problem, modifies the source code, and runs the tests to confirm the fix — all autonomously.
Claude 3.5 Sonnet achieved 49% on SWE-bench Verified (January 2025), the highest industry score at the time of publication. But the data point most people miss: the agent does not work equally well on just any codebase. It works on codebases with the right structure: explicit types, clear module boundaries, deterministic tests, and documentation that describes the system. Exactly what AI-Native Architecture defines.
Principle 1 — Explicit over Implicit
Xu Hao, Head of Technology at Thoughtworks China, documented in an article published on Martin Fowler's blog how he uses LLMs to generate consistent, correct code. The key is not the tool. It is what he calls the "implementation strategy": a prompt that explicitly describes the architecture, patterns, and conventions the code should follow. "Most of the prompt is setting out the design guidelines that I want the model to follow," Xu Hao explains. The model only delivers correct code when the architecture is explicit.
This reveals the inside-out problem: you need to make the architecture explicit to feed the agent because in conventional systems it does not exist explicitly. Business rules live in engineers' heads. Invariants live in outdated comments. Contracts live in Slack. In AI-Native Architecture, everything that matters is in the code: TypeScript types with business semantics, documented interfaces, business rules named and encapsulated in descriptive functions.
How to apply: list how many architecture decisions in your system exist only as tacit team knowledge. Each one is a blind spot for an agent — and an operational risk when that engineer leaves.
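A minimal sketch of what "explicit over implicit" looks like in TypeScript. All names and the transfer limit here are hypothetical, invented for illustration: the point is that the invariant has a name, a type, and a single home in the code instead of living in someone's head.

```typescript
// Branded type: a value that has passed domain validation, not just any number.
// (Illustrative domain: Brazilian Pix transfers.)
type Brl = number & { readonly __brand: "BRL" };

// The invariant is named and lives in exactly one place an agent can find.
const MAX_PIX_TRANSFER_BRL = 20_000; // illustrative limit, not a real regulation

function asBrl(amount: number): Brl {
  if (!Number.isFinite(amount) || amount < 0) {
    throw new RangeError(`Invalid BRL amount: ${amount}`);
  }
  return amount as Brl;
}

// A business rule as a descriptive function, not a comment or a Slack thread.
function isWithinPixTransferLimit(amount: Brl): boolean {
  return amount <= MAX_PIX_TRANSFER_BRL;
}

console.log(isWithinPixTransferLimit(asBrl(5_000)));  // true
console.log(isWithinPixTransferLimit(asBrl(50_000))); // false
```

An agent reading this file learns the rule, its boundary, and its validation path without asking anyone.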
Principle 2 — Composable Modules (what Anthropic calls "simple, composable patterns")
The most important sentence in the Anthropic agents paper is not about language models. It is about software architecture: "the most successful implementations were built with simple, composable patterns." This is not an aesthetic preference — it is a functional requirement. An agent operates within a limited context window. If your payment module is coupled to the user module which is coupled to the notification module, the agent cannot modify payments without understanding the entire system. Every hidden dependency is a failure point.
How to apply: bounded contexts with explicit interface contracts, dependency injection instead of global singletons, zero coupling via shared variables. The practical test: an agent must be able to modify a module by reading only that module's contract — not the entire system.
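The practical test above can be sketched in a few lines. This is a hypothetical example (module and method names are invented): the payments module depends on a narrow contract, injected via the constructor, so nothing outside that contract needs to fit in the agent's context window.

```typescript
// The only thing payments knows about notifications is this interface.
interface PaymentNotifier {
  paymentCompleted(paymentId: string, amount: number): void;
}

class PaymentService {
  // Dependency injected via the constructor: no global singleton,
  // no hidden coupling an agent would have to discover by reading other modules.
  constructor(private readonly notifier: PaymentNotifier) {}

  complete(paymentId: string, amount: number): void {
    // ...persist the payment (elided)...
    this.notifier.paymentCompleted(paymentId, amount);
  }
}

// An agent (or a test) can satisfy the contract without loading the real module.
const sent: string[] = [];
const fakeNotifier: PaymentNotifier = {
  paymentCompleted: (id) => { sent.push(id); },
};

new PaymentService(fakeNotifier).complete("pay_123", 100);
console.log(sent); // ["pay_123"]
```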
Principle 3 — Contract-First Development
In SWE-bench, the first step of the Anthropic agent is not to write code. It is to explore the repository structure. "As a first step, it might be a good idea to explore the repo to familiarize yourself with its structure." The agent reads the types, interfaces, schemas — before any implementation. It is looking for contracts that define what the system expects.
Contract-first: before writing any implementation, define the contract — the TypeScript interface, the OpenAPI schema, the input and output types. This creates the anchor the agent uses to verify its implementation is correct. The agent implements knowing exactly what it needs to satisfy. And it can verify automatically via contract tests.
How to apply: if you write the implementation before the interface, you are building backwards. Reverse it. The contract is what transforms code into an agent-readable system.
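A sketch of the contract-first order, with a hypothetical quote endpoint (the types, the function, and the 1% fee are all invented for illustration). The contract comes first, the implementation is written to satisfy it, and a contract test gives the agent an automatic check.

```typescript
// 1. Contract first: input and output types, before any implementation.
interface QuoteRequest {
  amountCents: number;
  currency: "BRL" | "USD";
}
interface QuoteResponse {
  amountCents: number;
  feeCents: number;
  totalCents: number;
}

// 2. Implementation written to satisfy the contract.
function quote(req: QuoteRequest): QuoteResponse {
  const feeCents = Math.ceil(req.amountCents * 0.01); // illustrative 1% fee
  return {
    amountCents: req.amountCents,
    feeCents,
    totalCents: req.amountCents + feeCents,
  };
}

// 3. Contract test: a property the agent can verify without human review.
const res = quote({ amountCents: 10_000, currency: "BRL" });
console.log(res.totalCents === res.amountCents + res.feeCents); // true
```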
Principle 4 — Deterministic Testing (the agent's automatic verifier)
Anthropic explains why tests are central to software agents: "code solutions are verifiable through automated tests; agents can iterate on solutions using test results as feedback; output quality can be measured objectively." This is why agents work better in software development than in any other domain — the feedback loop is automatic and objective.
But this is only true if the tests are deterministic. A test suite that fails randomly — due to time, network, execution order, or external state — is not a verifier. It is noise. The agent cannot distinguish "my change broke something" from "the test failed due to flakiness." When the oracle lies, the agent fails.
How to apply: explicit mocks of all external dependencies, deterministic fixtures, tests in a pyramid (many fast unit tests, few slow E2E tests), naming that describes exactly the behavior being tested. In AI-Native Architecture, deterministic tests are not a best practice — they are an architectural requirement.
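One way to remove the most common source of flakiness, time, is to inject the clock. A minimal sketch with hypothetical names: the test uses a fixed clock fixture, so it gives the same answer on every machine, at every hour.

```typescript
// The clock is a dependency, not a hidden call to new Date() inside the logic.
interface Clock {
  now(): Date;
}

function isSessionExpired(
  startedAt: Date,
  clock: Clock,
  ttlMs = 30 * 60_000 // illustrative 30-minute TTL
): boolean {
  return clock.now().getTime() - startedAt.getTime() > ttlMs;
}

// Deterministic fixture: a fixed clock instead of the real one.
const fixedClock: Clock = { now: () => new Date("2025-01-01T01:00:00Z") };
const startedAt = new Date("2025-01-01T00:00:00Z");

console.log(isSessionExpired(startedAt, fixedClock)); // true (60 min > 30 min TTL)
```

The same pattern applies to network, randomness, and external state: make the nondeterministic thing a parameter, then pin it in tests.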
Principle 5 — Self-Describing Systems and the AGENTS.md
Anthropic has a principle called the "agent-computer interface" (ACI): tool interfaces for models deserve the same care as interfaces for humans. "Think about how much effort goes into human-computer interfaces — and plan to invest just as much effort in creating good agent-computer interfaces." This applies not just to external tools but to the codebase itself.
A self-describing system includes: schemas that describe the data structure, types that carry domain semantics, errors with enough context for automatic diagnosis. And, most importantly: an AGENTS.md file at the root of the repository. The AGENTS.md is the agent's onboarding — it describes the general architecture, the main modules, what is safe to modify and what is critical, how to run tests locally, the project's naming conventions. A human developer takes days absorbing this context through informal conversations. An agent reads the AGENTS.md in seconds and operates with the same level of orientation.
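An illustrative skeleton of what such a file can contain. The paths, module names, and commands below are assumptions for the example, not a prescribed standard; adapt the sections to your own repository.

```markdown
# AGENTS.md (illustrative skeleton)

## Architecture
Modular monolith in TypeScript. Each module under `src/modules/` owns its data
and exposes a public contract in `contract.ts`.

## Safe to modify
- Anything under `src/modules/*/`, as long as that module's contract tests stay green.

## Critical — do not change without human review
- `src/modules/payments/contract.ts`

## Verification
Run `npm run verify` (lint, type-check, tests, build). Only close a task when it passes.

## Conventions
- Business rules are named functions in `rules.ts`, one invariant per function.
- Tests describe behavior: `rejects_transfer_above_limit`, not `test1`.
```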
Principle 6 — Automated Verification Pipeline (what closes the loop)
The Anthropic agent in SWE-bench follows an explicit loop: explores the repository, creates a script to reproduce the problem, modifies the source code, runs the script again to confirm the fix. The agent only considers the task done when the tests pass. This automated verification loop is exactly what enables real autonomy.
In AI-Native Architecture, this pipeline is part of the codebase: lint, type-check, unit tests, integration tests, build — in a single command the agent runs locally. The agent only closes the task when the pipeline is green. This eliminates an entire category of errors and removes the need for manual review on routine changes. The human engineer reviews design decisions — not syntax, types, or test coverage. This is what compresses cycle time from 3 weeks to 3 days.
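In a Node/TypeScript project, the single command can be a script that chains the checks and fails fast on the first red step. A sketch of the relevant `package.json` fragment; the specific tools (ESLint, Vitest) are assumptions for the example, not requirements:

```json
{
  "scripts": {
    "verify": "npm run lint && npm run typecheck && npm run test && npm run build",
    "lint": "eslint .",
    "typecheck": "tsc --noEmit",
    "test": "vitest run",
    "build": "tsc -p tsconfig.build.json"
  }
}
```

Because `&&` short-circuits, the agent gets one binary signal: `npm run verify` exits 0 only when every stage passes.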
The 6 principles together: from 3 weeks to 3 days
The São Paulo fintech did not reduce the team from 50 to 5 engineers by swapping tools. They redesigned the architecture: made knowledge explicit in types (Principle 1), separated modules with clear contracts (Principle 2), defined contracts before implementations (Principle 3), made tests deterministic (Principle 4), created a complete AGENTS.md (Principle 5), and configured a local verification pipeline (Principle 6).
Individually, each principle improves code quality. Together, they create what Anthropic describes as the ideal environment for software agents: "well-defined and structured, with output quality that can be measured objectively." An architecture where the agent does not need to ask anyone — because the system already answers. The result is not just more speed. It is a system that never accumulates legacy debt the same way conventional systems do — because every change goes through the same architectural guardrails.
How to evaluate if your codebase is ready
There is an evaluation framework called AI-Readiness Score that measures which of these principles your codebase already aligns with and which need work. A score above 70 generally means AI agents can operate in the system with minimal supervision. Below that, we identify where the biggest blind spots are — and what to do to fix them.
- See also: AI-Readiness Score — how to measure if your codebase is ready for AI → /blog/ai-readiness-score
- See also: AGENTS.md — the most important file in an AI-Native codebase → /blog/agents-md-codebase-ai-native
- See also: From 50 to 5 engineers — how AI-Native Architecture changed the model → /blog/time-encolheu-entregou-mais-rapido
- Learn about our AI-Native Engineering service → /software-engineering