Who is Fernando F. Azevedo?

Fernando F. Azevedo is a Senior Solutions Architect at Banco Itaú with 16+ years of experience across AWS, event-driven architecture, DevSecOps, Data Mesh, AI and financial systems.

What technical topics does Fernando work with?

Fernando works with AWS, Kubernetes, Kafka, Data Mesh, Amazon Bedrock, RAG, DevSecOps, observability, financial systems and architecture communication using C4, ADRs and trade-off analysis.

Is Fernando available for professional conversations?

Fernando is currently building at Banco Itaú and is open to thoughtful conversations about architecture, cloud, AI, engineering leadership, community, podcasts and technical collaboration.

The AI Architect Track

Module 5 · Apply + Project · Lesson 20/22

When (not) to use AI: the architecture decision

Maturity is knowing when AI is the right tool — and when an if/else solves it better.

6 min read

The most valuable skill of an AI architect is not knowing how to build agents — it's knowing when not to. AI solves real problems, but it also creates unnecessary complexity, cost, and risk when applied in the wrong place. This lesson is about judgment: the criterion that separates a well-designed system from an expensive project that fails silently.

Where AI genuinely shines

AI — especially LLMs — excels at problems where the correct answer doesn't fit into an explicit rule. Think of classifying the intent of a support message with ambiguous language, extracting structured fields from a scanned PDF contract, summarizing 40 pages of incident logs into three actionable paragraphs, or answering questions about a knowledge base that changes every week.

The common denominator: ambiguity, natural language, variation in form with stable meaning. In these cases, writing manual rules is fragile — you spend weeks covering edge cases and still miss new ones. A model trained on human language generalizes naturally.

Other strong cases: draft generation (where a human reviews), semantic search (lesson 04), fuzzy classification with many categories, and synthesis of distributed information. The pattern is always the same: variable input, output tolerant of small errors, and a human or downstream system able to absorb imperfections.

If your problem has these characteristics, AI is not hype — it's the right tool.

Where AI is the wrong choice

There are three situations where I actively recommend not using AI, and I learned each one the hard way.

A deterministic rule exists and is stable. If the logic is if status == 'APPROVED' and amount < 1000: auto_release(), an LLM only adds latency, cost, and non-determinism. if/else is faster, testable, auditable, and doesn't hallucinate.

Critical accuracy with no verification mechanism. Tax calculation, medication dosage, financial transfer — any domain where errors are costly and there's no human or system checking the output. LLMs make arithmetic mistakes, confuse dates, and fabricate references. If you don't have robust evals (lesson 09) and a guardrail (lesson 10) covering the critical path, don't put AI in that flow.

Cost or latency doesn't work out. An endpoint that needs to respond in 80 ms at $0.0001 per call is not a candidate for a 70B-parameter LLM. Do the math before prototyping. Sometimes a smaller model, a traditional classifier, or simply a search index solves it at 1/100th the cost.

The most common mistake I see: teams using an LLM to parse JSON that already comes structured from the API, or to validate a field that has a three-line regex. That's complexity with no benefit.
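As a minimal sketch of the deterministic alternative, the rule quoted above, a regex field check, and plain JSON parsing fit in a few lines of standard-library Python. The function names and the `ORD-` identifier format are hypothetical, chosen only for illustration.

```python
import json
import re

# Hypothetical format for illustration: "ORD-" followed by exactly 8 digits.
ORDER_ID_PATTERN = re.compile(r"ORD-\d{8}")


def should_auto_release(status: str, amount: float) -> bool:
    """The rule from the text: approved and below the 1000 threshold."""
    return status == "APPROVED" and amount < 1000


def is_valid_order_id(order_id: str) -> bool:
    """Field validation with a regex instead of a model call."""
    return ORDER_ID_PATTERN.fullmatch(order_id) is not None


def parse_payment(raw: str) -> dict:
    """Structured JSON from an API needs a parser, not an LLM."""
    payload = json.loads(raw)
    return {"status": payload["status"], "amount": float(payload["amount"])}


payment = parse_payment('{"status": "APPROVED", "amount": 250.0}')
print(should_auto_release(**payment))      # True
print(is_valid_order_id("ORD-12345678"))   # True
print(is_valid_order_id("ORD-12"))         # False
```

Each function is pure, so a unit test pins its behavior exactly, which is the auditability an LLM call cannot offer for the same job.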

Decision flow: use AI or not?

Walk through this flow for any requirement before choosing an approach. Each node is a real architecture question.

🔍 Input: problem analysis

Requirement arrives
Is there a deterministic rule?
Use if/else or regex

⚖️ Evaluation: tolerance and cost

Is the input ambiguous or natural language?
Does the system tolerate occasional errors?
Do cost and latency work out?
Don't use AI (revisit the requirement)

✅ Output: chosen approach

Hybrid pattern: AI + rule + human
AI as primary component
Add evals and guardrails
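The flow can also be read as a small function. The text lists the nodes but not every edge, so the branching order below is one plausible reading and an assumption, as are the names `Requirement`, `Approach`, and `choose_approach`.

```python
from dataclasses import dataclass
from enum import Enum


class Approach(Enum):
    DETERMINISTIC = "use if/else or regex"
    NO_AI = "don't use AI (revisit the requirement)"
    HYBRID = "hybrid pattern: AI + rule + human"
    AI_PRIMARY = "AI as primary component, with evals and guardrails"


@dataclass(frozen=True)
class Requirement:
    has_deterministic_rule: bool
    input_is_ambiguous: bool
    tolerates_occasional_errors: bool
    cost_and_latency_fit: bool


def choose_approach(req: Requirement) -> Approach:
    # A stable rule wins before any other question is asked.
    if req.has_deterministic_rule:
        return Approach.DETERMINISTIC
    # Assumed edge: no rule and no ambiguity means the requirement is unclear.
    if not req.input_is_ambiguous:
        return Approach.NO_AI
    # If the budget or the SLA cannot be met, stop here.
    if not req.cost_and_latency_fit:
        return Approach.NO_AI
    # Assumed edge: low error tolerance calls for rules and human review.
    if not req.tolerates_occasional_errors:
        return Approach.HYBRID
    return Approach.AI_PRIMARY


ticket_triage = Requirement(
    has_deterministic_rule=False,
    input_is_ambiguous=True,
    tolerates_occasional_errors=True,
    cost_and_latency_fit=True,
)
print(choose_approach(ticket_triage).value)
```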

Use AI vs. don't use AI: comparative analysis

AI as primary component

Pros

Handles natural language and ambiguity without manual rules
Generalizes to cases not anticipated at design time
Scales capability without scaling rule engineering

Cons

Non-deterministic: same input may produce different outputs
Per-token cost accumulates at high volume
Requires evals, guardrails, and continuous monitoring

Right for: fuzzy classification, text extraction, synthesis, semantic search, generation with review

Deterministic logic (if/else, regex, rules)

Pros

100% predictable and auditable behavior
Near-zero marginal cost at scale
Testable with simple unit tests

Cons

Fragile to unanticipated form variations (free language)
Maintenance grows with rule complexity
Doesn't scale to open-ended domains

Right for: field validation, status-based routing, financial calculations, fixed-format parsing

Hybrid pattern (AI + rule + human)

Pros

AI handles most cases; rules cover critical ones
Human enters only on low-confidence cases
Reduces risk without sacrificing automation

Cons

More components = more failure surface to manage
Requires defining confidence thresholds and human-review SLAs

Right for: content moderation, medical triage, credit approval, any high-impact flow

The decision framework in four questions

Before any line of code, answer these four questions in order. They work as a filter — each negative answer shortens the path.

1. Does the problem require non-determinism? If the correct answer is always the same given the same input, you don't need AI. A parser, a SQL query, or a pure function solves it better.

2. Does the system tolerate occasional errors? LLMs make mistakes. If an error causes financial, legal, or health damage without a containment mechanism, the risk isn't justified — unless you add mandatory output verification (evals + human in the loop).

3. Is there a source of truth to ground the answer? If yes, RAG (lesson 06) or grounding (lesson 19) reduces hallucination. If there's no source and accuracy is critical, rethink the approach.

4. Do the cost per call and the latency fit the SLA and budget? Calculate: monthly volume × average tokens per call × price per token. Compare with the value generated. If the math doesn't work even with the cheapest model on Bedrock (lesson 16), the problem may not be an AI problem, or it may need a different architecture (cache, smaller model, pre-computation).
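The arithmetic in question 4 is short enough to run before any prototype exists. This sketch assumes input and output tokens are priced separately per 1,000 tokens, a common scheme. The prices and volumes are placeholders, not real Bedrock rates.

```python
def cost_per_call_usd(
    avg_input_tokens: float,
    avg_output_tokens: float,
    input_price_per_1k: float,
    output_price_per_1k: float,
) -> float:
    """Average tokens times price per token, split by direction."""
    return (avg_input_tokens / 1000) * input_price_per_1k + (
        avg_output_tokens / 1000
    ) * output_price_per_1k


def monthly_cost_usd(calls_per_month: int, per_call_usd: float) -> float:
    """Monthly volume times cost per call."""
    return calls_per_month * per_call_usd


def fits_budget(per_call_usd: float, budget_per_call_usd: float) -> bool:
    return per_call_usd <= budget_per_call_usd


# Placeholder numbers: 2M calls/month, 800 input and 200 output tokens per call.
per_call = cost_per_call_usd(800, 200, input_price_per_1k=0.0005, output_price_per_1k=0.0015)
monthly = monthly_cost_usd(2_000_000, per_call)

print(f"{per_call:.4f}")               # 0.0007
print(f"{monthly:,.2f}")               # 1,400.00
print(fits_budget(per_call, 0.0001))   # False
```

With these placeholder inputs the call costs seven times the $0.0001 target mentioned earlier, so the options are a cheaper model, a cache, or a different architecture.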

These four questions don't eliminate creativity — they direct energy to where AI genuinely delivers value.

In practice: the pattern I use most

In practice, most production systems that work well use the hybrid pattern: AI processes the general case (80-90% of volume), deterministic rules block or redirect critical cases, and humans review low-confidence ones. This pattern isn't weakness — it's mature engineering. I never put AI in a critical path without at least one deterministic verification layer after it. The model can be wrong; the system cannot.
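A minimal sketch of that routing, under two assumptions: the model call returns a label with a usable confidence score (many LLM APIs do not provide one directly), and the names `ModelResult`, `route`, and the stub functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class ModelResult:
    label: str
    confidence: float  # assumed to be a calibrated score in [0, 1]


def route(
    text: str,
    classify: Callable[[str], ModelResult],
    is_critical: Callable[[str], bool],
    allowed_labels: FrozenSet[str],
    min_confidence: float = 0.85,
) -> str:
    # Deterministic rule first: critical cases never reach the model.
    if is_critical(text):
        return "rule:escalate"

    result = classify(text)

    # Deterministic verification layer after the model.
    if result.label not in allowed_labels:
        return "human:invalid_label"
    if result.confidence < min_confidence:
        return "human:low_confidence"
    return f"auto:{result.label}"


# Stubs standing in for a real model and a real business rule.
def fake_classify(text: str) -> ModelResult:
    if "refund" in text:
        return ModelResult("billing", 0.95)
    return ModelResult("other", 0.40)


def mentions_fraud(text: str) -> bool:
    return "fraud" in text.lower()


LABELS = frozenset({"billing", "other"})
print(route("I want a refund", fake_classify, mentions_fraud, LABELS))            # auto:billing
print(route("something odd happened", fake_classify, mentions_fraud, LABELS))     # human:low_confidence
print(route("Possible FRAUD on my card", fake_classify, mentions_fraud, LABELS))  # rule:escalate
```

The 0.85 threshold is arbitrary here. In a real system it would come from evals and from the human-review capacity the team can actually sustain.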

How to evaluate a new requirement

1. Describe the problem in one sentence without mentioning AI. If the description already implies a clear rule, it's probably not an AI case. E.g., 'reject order if CPF is invalid' → a format regex plus the check-digit calculation.

2. List failure cases and their impact. How much does a false positive cost? A false negative? If both are expensive, you need verification, and AI may not be the right component for the final decision.

3. Estimate cost before prototyping. Volume × tokens × price. Add network latency and cold start if serverless. Compare with the cost of the equivalent deterministic solution.

4. Define the success criterion before the first prompt. Without an evaluation metric (lesson 09), you won't know when to stop iterating. Define: minimum acceptable accuracy, maximum latency, cost per transaction.

5. Design the fallback before the happy path. What happens when the model returns low confidence, times out, or returns an invalid response? Define this before building the main flow.
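Step 5 can be sketched as a wrapper that names every failure mode before the happy path exists. This assumes the model is asked to reply with JSON containing `decision` and `confidence` fields. That response shape, the `manual_review` fallback, and the function names are illustrative assumptions.

```python
import json
from typing import Callable


def decide_with_fallback(
    call_model: Callable[[str], str],
    prompt: str,
    min_confidence: float = 0.8,
) -> dict:
    def fallback(reason: str) -> dict:
        # Every failure mode lands on the same safe default.
        return {"decision": "manual_review", "source": "fallback", "reason": reason}

    try:
        raw = call_model(prompt)
    except TimeoutError:
        return fallback("timeout")

    try:
        data = json.loads(raw)
        decision = str(data["decision"])
        confidence = float(data["confidence"])
    except (ValueError, KeyError, TypeError):
        # Covers malformed JSON, missing fields, and wrong types.
        return fallback("invalid_response")

    if confidence < min_confidence:
        return fallback("low_confidence")
    return {"decision": decision, "source": "model", "reason": "ok"}


def timing_out_model(prompt: str) -> str:
    raise TimeoutError("model did not answer in time")


print(decide_with_fallback(timing_out_model, "classify this")["reason"])         # timeout
print(decide_with_fallback(lambda p: "not json", "classify this")["reason"])     # invalid_response
```

Returning a reason with every fallback makes the failure rate per mode observable from the first day in production.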

Frequent architecture questions

What if I don't know the volume before production?

Use the cheapest model that meets minimum quality, add cache for repeated inputs (lesson 19), and instrument everything from day one. Surprise AI costs almost always come from lack of observability, not unexpected volume.
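One way to combine the cache and the day-one instrumentation is a thin wrapper around the model call. This is a sketch with an in-memory dictionary and hypothetical names. A production system would more likely use a shared cache and a metrics backend, and normalizing prompts by case and whitespace is an assumption that only holds when such differences do not change the answer.

```python
import hashlib
from collections import Counter
from typing import Callable, Dict


class CachedModel:
    """Wraps a model call with an exact-match cache and basic counters."""

    def __init__(self, call_model: Callable[[str], str]) -> None:
        self._call_model = call_model
        self._cache: Dict[str, str] = {}
        self.metrics: Counter = Counter()

    def __call__(self, prompt: str) -> str:
        # Assumption: case and surrounding whitespace do not change the answer.
        normalized = prompt.strip().lower()
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()

        if key in self._cache:
            self.metrics["cache_hits"] += 1
            return self._cache[key]

        self.metrics["model_calls"] += 1
        self.metrics["prompt_chars"] += len(prompt)
        response = self._call_model(prompt)
        self._cache[key] = response
        return response


def fake_model(prompt: str) -> str:
    # Stub standing in for a real, billable model call.
    return prompt.upper()


model = CachedModel(fake_model)
model("What is my balance?")
model("  what is my balance?  ")
print(model.metrics["model_calls"], model.metrics["cache_hits"])  # 1 1
```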

Can I use AI for critical business logic if a human is reviewing?

Yes — that's the hybrid pattern. The condition is that human review is real, with a defined SLA, and not a checkbox nobody reads. If the human approves everything without reading, you don't have review — you have security theater.

Traditional classifier (classic ML) vs. LLM: when to use each?

Trained classifier: when you have enough labeled data, latency under 50 ms is a requirement, and categories are stable. LLM: when categories change, labeled data is scarce, or classification requires reasoning over broad context. An LLM is more flexible; a trained classifier is more predictable and cheaper.

Opening the final module: from prototype to production

This lesson closes the foundations cycle and opens the final module of the track. You now have the complete map: you understand how models learn (lessons 01-03), how to represent and retrieve knowledge (04-06), how to connect AI to the world (07-08), how to evaluate and protect systems (09-10), how to build agents (11-15), and how to use AWS infrastructure for all of it (16-19).

What remains is the transition from prototype to a system that works in production with real users — and the guided project that consolidates all of this into practice.

In lesson 21, we'll cover what changes when you leave the notebook: observability, prompt versioning, CI/CD for AI systems, cost management at scale, and the deployment patterns that work on Bedrock AgentCore. Not theory — these are the decisions you'll make in the first weeks of every real project.

Lesson 22 is the guided project: you'll design a RAG + agent system from scratch, making each architecture decision with the criteria you learned here. It includes a final exam that tests judgment, not memorization.

Maturity in AI is not knowing how to use every feature — it's knowing how to choose the right ones for the right problem, with explicit trade-offs. You've arrived.

Architect's verdict

AI is a powerful tool with a specific risk profile: non-deterministic, expensive at scale, and silently wrong when it fails. Use it where ambiguity and natural language make manual rules impractical. Avoid it where accuracy is critical without verification, where the rule already exists, or where the cost doesn't work out. The hybrid pattern — AI for the general case, rules for the critical, human for the uncertain — is the most mature architecture I know for production systems. Judgment is what differentiates an architect from someone who just knows how to call the API.
