# aws-ai-reference-architectures

Six AWS AI reference architectures — diagrams, IaC skeletons, and Well-Architected analysis.

- URL: https://fernando.moretes.com/open-source/aws-ai-reference-architectures

- Markdown: https://fernando.moretes.com/open-source/aws-ai-reference-architectures/guide.md?lang=en

- GitHub: https://github.com/fernandofatech/aws-ai-reference-architectures

- Homepage: https://ai-architectures.moretes.com

- Language: HCL

- Topics: ai, architecture, aws, bedrock, github-actions, mlops, moretes, portfolio, reference-architecture, sagemaker, solution-architecture, terraform, well-architected

- Stars: 0

- Forks: 0

- Updated: 2026-05-16T02:23:15Z

---

aws-ai-reference-architectures is a bilingual portfolio of six reference architectures for AI workloads on AWS, each with a Mermaid diagram, justified architectural decisions, cost analysis at three scales, a Well-Architected review, and a working Terraform skeleton.

## Why this repository exists

Most publicly available AWS AI examples fall into one of two extremes: toy notebooks with no IaC or security, or 200-page enterprise white papers that nobody finishes reading. This repository deliberately occupies the middle ground.

Each architecture answers a fixed set of questions: what is the problem, which services and why, what are the three to five decisions that actually matter (with rationale and alternatives), what does it cost at S/M/L sizes, what does the Well-Architected Framework surface, and when *not* to use this pattern.

The format is consistent across all six architectures, so you can compare patterns side by side or pull a specific section (for example, the cost table or the MADR-formatted decisions) without reading everything. The IaC is intentionally skeletal: it shows resources and wiring, and does not try to be a generic Terraform module, because no real team uses such a module as-is.

## What is included

- **Six patterns covered:** RAG with Bedrock + OpenSearch, multi-agent orchestration, streaming inference, event-driven AI processing, fine-tuning pipeline with SageMaker/MLflow, and a secure agentic system with Guardrails.
- **Architectural decisions in MADR format** — each decision includes context, considered options, rationale, and consequences, ready to copy into your own ADR.
- **Cost analysis at three scales** — tables with explicit assumptions for S, M, and L sizes, useful in budget conversations with stakeholders.
- **Well-Architected review** per architecture covering all six pillars — a complement to, not a replacement for, the formal AWS Well-Architected Tool.
- **Full CI pipeline** with CodeQL, Trivy, Gitleaks, dependency review, and automatic deploy to GitHub Pages (docs) and Vercel (landing).
- **Bilingual site** (PT/EN) published at ai-architectures.moretes.com, built with plain HTML/CSS/JS and no framework dependencies.

## How the repository is organized

Each architecture is a self-contained folder with documentation, diagram, and IaC. Two publishing targets are maintained in parallel via GitHub Actions.

### 📁 Repo

- `architectures/`: architectures 01–06 (data)
- `docs/`: MkDocs Material site (frontend)
- `frontend/`: static landing page (frontend)
- `.github/workflows/`: CI + security (ci)

### ☁️ Publish

- GitHub Pages: docs site (external)
- Vercel: landing site (edge)

### 🔒 Security

- CodeQL: SAST (security)
- Trivy: filesystem scan (security)
- Gitleaks: secret scan (security)

### Flows

- arch -> docs: content source
- arch -> frontend: catalog
- ci -> docs: build + deploy
- ci -> vercel: auto deploy
- docs -> pages: publishes
- frontend -> vercel: publishes
- ci -> codeql: runs
- ci -> trivy: runs
- ci -> gitleaks: runs

## The six architectures in detail

**01 — RAG with Bedrock + OpenSearch** covers the retrieval-augmented generation pattern for internal knowledge bases. It is the most common entry point for teams that want Q&A over corporate documentation without fine-tuning.

**02 — Multi-agent orchestration** uses Bedrock Agents combined with Step Functions for long-running workflows that need durable state across steps — the case where a single LLM call is not enough.

**03 — Streaming AI inference** shows how to wire API Gateway, Lambda, and Bedrock's native streaming for token-level chat UIs without polling.

**04 — Event-driven AI processing** is the async pattern: EventBridge + SQS + Lambda + Bedrock for classification, enrichment, and content moderation at scale, decoupled from the producer.

**05 — Fine-tuning pipeline** covers the full cycle with SageMaker, S3, and MLflow — data preparation, training, experiment tracking, and model promotion. Useful when a generic foundation model does not reach the required quality bar.

**06 — Secure agentic system** is the most complex pattern: Bedrock Agents with Guardrails, VPC isolation, and multi-tenant controls. It is the starting point for anyone who needs to put an agent in production with real security and compliance constraints.

## How to install and use locally

1. **Clone the repository** — Run `git clone https://github.com/fernandofatech/aws-ai-reference-architectures.git` and enter the folder with `cd aws-ai-reference-architectures`.

2. **Read an architecture directly in the terminal or editor** — Each architecture is self-documented. Open, for example, `architectures/01-rag-bedrock-opensearch/README.md` in your editor or use `cat` to inspect the content. There are no runtime dependencies for reading.

3. **Serve the documentation site locally (optional)** — Install Python dependencies with `pip install mkdocs-material` and run `mkdocs serve` from the root. The site will be available at `http://127.0.0.1:8000`.

4. **Serve the catalog frontend locally (optional)** — Enter `cd frontend` and run any static HTTP server, for example `python3 -m http.server 8080`. There is no build step — the frontend is plain HTML/CSS/JS.

5. **Inspect and adapt the Terraform skeleton** — Each architecture folder contains a `terraform/` subdirectory with `.tf` files declaring the main resources. Run `terraform init` followed by `terraform validate` inside any of them to check the syntax and internal consistency; `init` alone only downloads providers and does not validate the configuration. Adjust names, tags, IAM policies, and remote state before any `terraform apply`.

6. **Run security checks locally** — Install Trivy and Gitleaks and run `trivy fs .` and `gitleaks detect --source .` from the root to replicate the checks the CI runs on every push.

_Full flow: clone, read, and serve the docs site_

```bash
# Clone
git clone https://github.com/fernandofatech/aws-ai-reference-architectures.git
cd aws-ai-reference-architectures

# Read an architecture (no dependencies needed)
less architectures/01-rag-bedrock-opensearch/README.md

# Serve the MkDocs documentation site locally (blocks; Ctrl+C to stop)
pip install mkdocs-material
mkdocs serve
# → open http://127.0.0.1:8000

# Serve the static catalog frontend (blocks; Ctrl+C to stop)
cd frontend && python3 -m http.server 8080
# → open http://127.0.0.1:8080

# Validate a Terraform skeleton; -backend=false keeps init away from remote state and AWS credentials
cd ../architectures/04-event-driven-ai-processing/terraform
terraform init -backend=false
terraform validate
```
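
To check all six skeletons in one pass rather than one folder at a time, a loop along these lines may help. It is a minimal sketch, not part of the repository: the function name `validate_skeletons` is hypothetical, and it assumes the `architectures/<name>/terraform/` layout described in this README and a Terraform version that supports the `-chdir` option.

```bash
#!/usr/bin/env bash
# Sketch: validate every Terraform skeleton under architectures/.
# Assumes the layout architectures/<name>/terraform/ described in this README.

validate_skeletons() {
  local root="${1:-architectures}"
  local failed=0
  local dir

  for dir in "$root"/*/terraform; do
    [ -d "$dir" ] || continue   # skip when the glob matched nothing
    echo "==> $dir"
    # -backend=false skips backend setup, so init only fetches providers
    if terraform -chdir="$dir" init -backend=false -input=false >/dev/null \
       && terraform -chdir="$dir" validate; then
      echo "ok: $dir"
    else
      echo "FAILED: $dir" >&2
      failed=1
    fi
  done

  return "$failed"
}
```

Source the file and call `validate_skeletons` from the repository root. The function keeps going after a failure and returns non-zero at the end, so a single broken skeleton does not hide problems in the others.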

> **The IaC is a skeleton, not a module:** The Terraform files show the correct resources and wiring for each pattern, but they are not ready for `terraform apply` in production. Each team needs to adapt: resource names, mandatory tags, IAM policies with real least privilege, remote state configuration (S3 + DynamoDB), and integration with the existing CI/CD pipeline. Applying the skeleton to production without these adaptations leaves you with placeholder names, IAM permissions that were never reviewed, and state that is not stored or locked remotely.

## How the repository works internally

The repository has three content layers that are kept separate by design.

The **architecture** layer lives in `architectures/` and is the core of the project. Each subfolder contains a structured `README.md` with the fixed sections described above, a `diagram.mmd` file with the Mermaid diagram, and a `terraform/` directory with the IaC skeleton. All technical documentation lives here — the rest of the repository is publishing infrastructure.
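
The per-folder layout described above can be checked mechanically. The following is a minimal sketch under that assumption only; the function name `check_architecture_layout` is hypothetical and is not something the repository ships.

```bash
#!/usr/bin/env bash
# Sketch: confirm each architecture folder has the three artifacts
# named above (README.md, diagram.mmd, terraform/).

check_architecture_layout() {
  local root="${1:-architectures}"
  local missing=0
  local arch

  for arch in "$root"/*/; do
    [ -d "$arch" ] || continue   # skip when the glob matched nothing
    arch="${arch%/}"             # drop the trailing slash from the glob
    [ -f "$arch/README.md" ]   || { echo "missing: $arch/README.md";   missing=1; }
    [ -f "$arch/diagram.mmd" ] || { echo "missing: $arch/diagram.mmd"; missing=1; }
    [ -d "$arch/terraform" ]   || { echo "missing: $arch/terraform/";  missing=1; }
  done

  return "$missing"
}
```

Run `check_architecture_layout` from the repository root. It prints one line per missing artifact and returns non-zero if anything is absent, which makes it usable as a pre-commit check when adding a seventh pattern to your own fork.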

The **documentation** layer in `docs/` uses MkDocs Material to generate a navigable site published to GitHub Pages. The `docs.yml` workflow runs a strict build (fails on warnings) and deploys automatically on every push to `main`.

The **frontend** layer in `frontend/` is a static catalog with no framework dependencies, connected to Vercel via Git integration. Automatic previews are generated for every pull request; production deploy happens on merge to `main`.

The CI pipeline covers four distinct concerns: frontend quality (lint + audit), documentation quality (strict build), security (CodeQL + Trivy + Gitleaks + dependency review), and maintenance (Dependabot for Actions and frontend dependencies). The workflows are independent — a security scan failure does not block the docs deploy, and vice versa.
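
The same independence can be approximated locally: run each check on its own, record the result, and report everything at the end instead of stopping at the first failure. This is a sketch, not a copy of the workflows. The function names are hypothetical, the `--exit-code 1` flag on Trivy is an assumption about how you want findings to surface, and the frontend lint step is omitted because its command is not documented here.

```bash
#!/usr/bin/env bash
# Sketch: run local approximations of the CI checks independently,
# so one failing check does not prevent the others from running.

run_check() {
  # $1 = label, remaining arguments = command to run
  local label="$1"; shift
  if "$@"; then
    echo "PASS  $label"
    return 0
  fi
  echo "FAIL  $label"
  return 1
}

run_local_checks() {
  local failures=0
  run_check "docs strict build" mkdocs build --strict      || failures=$((failures + 1))
  run_check "trivy fs scan"     trivy fs --exit-code 1 .   || failures=$((failures + 1))
  run_check "gitleaks scan"     gitleaks detect --source . || failures=$((failures + 1))
  echo "$failures check(s) failed"
  [ "$failures" -eq 0 ]
}
```

Call `run_local_checks` from the repository root with MkDocs Material, Trivy, and Gitleaks installed. Every check runs regardless of earlier results, and the function returns non-zero if any of them failed.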

## Frequently asked questions

### Can I use the Terraform code directly in a client project?

Yes, with adaptations. The skeleton shows the correct resources and wiring between them, but you will need to adjust IAM policies, resource names, tags, remote state, and pipeline integration before any deploy. Treat it as a starting point, not a ready-made module.
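
As one illustration of the remote state adaptation, Terraform's partial backend configuration lets you keep bucket and table names out of the `.tf` files and pass them at `init` time. The sketch below assumes you have added an empty `backend "s3" {}` block to the skeleton yourself. The function name, bucket, lock table, default region, and state key scheme are all hypothetical. Check the S3 backend documentation for your Terraform version, since locking options have changed across releases.

```bash
#!/usr/bin/env bash
# Sketch: initialise a skeleton against remote state via partial backend
# configuration. Assumes an empty `backend "s3" {}` block was added to the
# skeleton. All names and the key scheme are hypothetical.

init_remote_state() {
  local dir="$1"         # e.g. architectures/01-rag-bedrock-opensearch/terraform
  local bucket="$2"      # hypothetical state bucket
  local lock_table="$3"  # hypothetical DynamoDB lock table
  local region="${4:-us-east-1}"

  # One state key per architecture folder, derived from the path
  local key
  key="$(basename "$(dirname "$dir")")/terraform.tfstate"

  terraform -chdir="$dir" init \
    -backend-config="bucket=$bucket" \
    -backend-config="key=$key" \
    -backend-config="region=$region" \
    -backend-config="dynamodb_table=$lock_table" \
    -backend-config="encrypt=true"
}
```

For example, `init_remote_state architectures/01-rag-bedrock-opensearch/terraform my-team-tfstate my-team-tf-locks` would store that skeleton's state under the key `01-rag-bedrock-opensearch/terraform.tfstate`. Unlike the validation steps earlier, this call needs AWS credentials with access to the bucket and table.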

### Why six architectures and not more?

Six patterns cover most AI workloads I encounter in practice. More patterns without the same depth of analysis do not add value. If your use case does not fit any of the six, open an issue with the context.

### Does the Well-Architected review replace the AWS Well-Architected Tool?

No. The review here highlights the most relevant findings for each specific pattern and serves as preparation or a complement. Run the AWS Well-Architected Tool against your actual workload before any production launch.

### Does the site work without JavaScript?

The catalog frontend uses JavaScript for navigation features, but its main content remains readable with JavaScript disabled. The documentation site (MkDocs Material) requires JavaScript for full search and navigation.

> **How to use in existing design reviews:** For reviews of systems already in production (brownfield), go directly to the Well-Architected section of the architecture closest to your current pattern and compare point by point with what you have. The trade-offs sections are also useful for justifying or questioning past decisions in a technical review context.

## Who this repository is for

This repository is useful for solutions architects and senior engineers who need a structured starting point for AI designs on AWS: not a tutorial, not a white paper, but something you can open in a design meeting and use directly. It is equally useful for anyone evaluating my work as an architect: every decision is justified, every trade-off is explicit, and the CI pipeline reflects the practices I apply in real projects.

If you are getting started with AI on AWS and want to understand the patterns before writing code, start with architecture 01 or 04. If you already have a system in production and want to identify gaps, go to the Well-Architected section of the corresponding architecture.

## References

- [GitHub — fernandofatech/aws-ai-reference-architectures](https://github.com/fernandofatech/aws-ai-reference-architectures)
- [Portfolio site — ai-architectures.moretes.com](https://ai-architectures.moretes.com)
- [Project documentation — GitHub Pages](https://fernandofatech.github.io/aws-ai-reference-architectures/)
- [Amazon Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html)
- [AWS Well-Architected Framework](https://docs.aws.amazon.com/wellarchitected/latest/framework/welcome.html)
- [MADR — Markdown Architectural Decision Records](https://adr.github.io/madr/)
- [MkDocs Material](https://squidfunk.github.io/mkdocs-material/)
