AWS Documentation References

Official AWS documentation supporting the NorthBuilt RAG System architecture.

Table of Contents

  1. Overview
  2. Amazon S3 Vectors
    1. Core Documentation
    2. Configuration and Limits
    3. Integration
    4. Key Limits (as of 2025)
    5. Blog Posts and Announcements
  3. Amazon Bedrock Knowledge Bases
    1. Core Documentation
    2. Data Source Configuration
    3. Chunking Strategies
    4. Query and Retrieval
    5. API Reference
    6. Product Resources
  4. Metadata Filtering
    1. Documentation
    2. Supported Filter Operators
    3. Blog Posts
  5. Multi-Tenancy
    1. AWS Blog Posts
    2. Patterns Comparison
  6. Reranking
    1. Documentation
    2. Available Models
    3. Blog Posts
  7. Amazon Titan Embeddings
    1. Documentation
    2. Model Specifications
    3. Blog Posts
  8. Claude on Amazon Bedrock
    1. Documentation
    2. Models Used
  9. Amazon Cognito
    1. Documentation
  10. Additional Resources
    1. AWS Samples and Examples
    2. Architecture Guidance
  11. How We Use These References

Overview

This page provides direct links to official AWS documentation that supports and validates our architectural decisions. These references are authoritative sources for understanding the capabilities, limitations, and best practices of the AWS services we use.


Amazon S3 Vectors

S3 Vectors is the vector storage backend for our Bedrock Knowledge Base, providing purpose-built, cost-optimized storage for semantic search.

Core Documentation

Resource | Description
Working with S3 Vectors and vector buckets | Main user guide for S3 Vectors
Getting Started with S3 Vectors | Tutorial for initial setup
Vector indexes | Index configuration and management
Vectors | Vector operations and storage

Configuration and Limits

Resource | Description
Metadata filtering | How to filter queries by metadata
Limitations and restrictions | Service constraints and quotas
Regions, endpoints, and quotas | Regional availability and limits

Integration

Resource | Description
Using S3 Vectors with Amazon Bedrock Knowledge Bases | Integration guide for Bedrock KB

Key Limits (as of 2025)

Limit | Value | Source
Vectors per index | 2 billion | Quotas
Vector dimensions | 1-4,096 | Limitations
Filterable metadata per vector | 2 KB | Metadata filtering
Total metadata per vector | 40 KB | Metadata filtering
Non-filterable keys per index | 10 | Metadata filtering
Vector indexes per bucket | 10,000 | Quotas
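
As a rough illustration of how these limits might be enforced before ingestion, the sketch below checks one vector and its metadata against the values in the table. The function name `check_vector`, the use of compact JSON byte length as the size measure, and the reading of 1 KB as 1,024 bytes are all assumptions made for illustration; the AWS pages listed under Source define how the service actually measures metadata.

```python
import json

# Limits copied from the Key Limits table above.
MAX_DIMENSIONS = 4096
MAX_FILTERABLE_METADATA_BYTES = 2 * 1024   # assumes 1 KB = 1,024 bytes
MAX_TOTAL_METADATA_BYTES = 40 * 1024       # assumes 1 KB = 1,024 bytes
MAX_NON_FILTERABLE_KEYS = 10


def _json_size(obj):
    """Bytes in the compact UTF-8 JSON encoding (an assumed size measure)."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def check_vector(embedding, metadata, non_filterable_keys=()):
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if not 1 <= len(embedding) <= MAX_DIMENSIONS:
        problems.append(f"dimension {len(embedding)} is outside 1-{MAX_DIMENSIONS}")
    if len(non_filterable_keys) > MAX_NON_FILTERABLE_KEYS:
        problems.append(f"more than {MAX_NON_FILTERABLE_KEYS} non-filterable keys")
    # Everything not declared non-filterable counts toward the filterable budget.
    filterable = {k: v for k, v in metadata.items() if k not in non_filterable_keys}
    if _json_size(filterable) > MAX_FILTERABLE_METADATA_BYTES:
        problems.append("filterable metadata exceeds 2 KB")
    if _json_size(metadata) > MAX_TOTAL_METADATA_BYTES:
        problems.append("total metadata exceeds 40 KB")
    return problems
```

A check like this is only a pre-flight guard: the service remains the source of truth, so a vector that passes here can still be rejected on write.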

Blog Posts and Announcements

Resource | Description
S3 Vectors GA Announcement | General availability announcement with features
Amazon S3 Vectors Product Page | Product overview and pricing

Amazon Bedrock Knowledge Bases

Bedrock Knowledge Bases provides the managed RAG orchestration layer, handling document ingestion, chunking, embedding, and retrieval.

Core Documentation

Resource | Description
Retrieve data and generate AI responses | Main Knowledge Base documentation
How Amazon Bedrock knowledge bases work | Architecture and workflow explanation
Create a knowledge base | Step-by-step creation guide
Deploy your knowledge base | Deployment best practices

Data Source Configuration

Resource | Description
Prerequisites for your data | Data source requirements
Supported models and Regions | Model and region availability
Multimodal content | Image, audio, and video support

Chunking Strategies

Resource | Description
How content chunking works | Chunking strategy guide
ChunkingConfiguration API | API reference for chunking

Chunking Strategy Options:

  • Fixed-size: Token count with overlap percentage
  • Hierarchical: Parent-child chunk relationships
  • Semantic: NLP-based meaning boundaries
  • No chunking: Treat document as single chunk
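
To make the four options concrete, here is a minimal sketch that builds the `chunkingConfiguration` dictionary for each strategy. The field names follow the ChunkingConfiguration API reference as we understand it and should be verified against that page; the helper name `chunking_configuration`, its keyword arguments, and every default value are hypothetical and are not our production settings.

```python
def chunking_configuration(strategy, **options):
    """Build a chunkingConfiguration dict for one of the four strategies."""
    if strategy == "FIXED_SIZE":
        return {
            "chunkingStrategy": "FIXED_SIZE",
            "fixedSizeChunkingConfiguration": {
                # Token count per chunk and overlap between neighbours (illustrative).
                "maxTokens": options.get("max_tokens", 300),
                "overlapPercentage": options.get("overlap_percentage", 20),
            },
        }
    if strategy == "HIERARCHICAL":
        return {
            "chunkingStrategy": "HIERARCHICAL",
            "hierarchicalChunkingConfiguration": {
                # First level is the parent chunk, second level the child chunk.
                "levelConfigurations": [
                    {"maxTokens": options.get("parent_max_tokens", 1500)},
                    {"maxTokens": options.get("child_max_tokens", 300)},
                ],
                "overlapTokens": options.get("overlap_tokens", 60),
            },
        }
    if strategy == "SEMANTIC":
        return {
            "chunkingStrategy": "SEMANTIC",
            "semanticChunkingConfiguration": {
                "maxTokens": options.get("max_tokens", 300),
                "bufferSize": options.get("buffer_size", 0),
                "breakpointPercentileThreshold": options.get("threshold", 95),
            },
        }
    if strategy == "NONE":
        # The whole document becomes a single chunk.
        return {"chunkingStrategy": "NONE"}
    raise ValueError(f"unknown chunking strategy: {strategy!r}")
```

The resulting dictionary is the kind of value that sits under `vectorIngestionConfiguration` when a data source is created.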

Query and Retrieval

Resource | Description
Configure queries and response generation | Query configuration options
Query and retrieve data | Retrieval operations guide

API Reference

Resource | Description
RetrieveAndGenerate API | Combined retrieve and generate
RetrievalFilter API | Metadata filter specification
KnowledgeBase API | Knowledge Base management

Product Resources

Resource | Description
Amazon Bedrock Knowledge Bases Product Page | Product overview and pricing
AWS Prescriptive Guidance - RAG Options | Best practices guide

Metadata Filtering

Metadata filtering enables multi-tenant isolation and targeted retrieval in our RAG system.

Documentation

Resource | Description
RetrievalFilter API Reference | Filter syntax and operators
S3 Vectors Metadata Filtering | S3 Vectors-specific filtering

Supported Filter Operators

Operator | Description | S3 Vectors Support
equals | Exact match | Yes
notEquals | Exclusion | Yes
greaterThan | Numeric comparison | Yes
lessThan | Numeric comparison | Yes
in | Match any in list | Yes
notIn | Exclude any in list | Yes
andAll | Logical AND | Yes
orAll | Logical OR | Yes
startsWith | Prefix match | No
stringContains | Substring match | No
listContains | List membership | Yes

Important: When using S3 Vectors as the vector store, startsWith and stringContains operators are not supported. See RetrievalFilter documentation.
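
One way to catch this early is to scan a filter for the two unsupported operators before sending it. The sketch below assumes the nested-dictionary filter shape described in the RetrievalFilter reference, where `andAll` and `orAll` hold lists of sub-filters; the helper name `unsupported_operators` and the metadata keys in the example are hypothetical.

```python
# Operators marked "No" in the table above.
UNSUPPORTED_ON_S3_VECTORS = {"startsWith", "stringContains"}
# Operators whose operand is a list of nested filters.
LOGICAL_OPERATORS = {"andAll", "orAll"}


def unsupported_operators(retrieval_filter):
    """Return the set of operators in the filter that S3 Vectors does not support."""
    found = set()
    for operator, operand in retrieval_filter.items():
        if operator in LOGICAL_OPERATORS:
            # Recurse into every nested sub-filter.
            for child in operand:
                found |= unsupported_operators(child)
        elif operator in UNSUPPORTED_ON_S3_VECTORS:
            found.add(operator)
    return found


# Example filter with hypothetical metadata keys.
example_filter = {
    "andAll": [
        {"equals": {"key": "tenant_id", "value": "acme"}},
        {"startsWith": {"key": "source_path", "value": "contracts/"}},
    ]
}
```

Calling `unsupported_operators(example_filter)` flags `startsWith`, so the caller can reject or rewrite the filter instead of discovering the problem at query time.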

Blog Posts

Resource | Description
Metadata filtering for improved retrieval accuracy | Feature introduction
Access control with metadata filtering | Security patterns

Multi-Tenancy

Multi-tenancy patterns enable serving multiple clients from a single Knowledge Base while maintaining data isolation.

AWS Blog Posts

Resource | Description
Multi-tenancy with metadata filtering | Our primary implementation pattern
Multi-tenant RAG patterns | Silo vs Pool patterns
Multi-tenant vector search with Aurora | Aurora PostgreSQL approach
Multi-tenant RAG with JWT | JWT-based access control

Patterns Comparison

Pattern | Description | Use Case
Pool (our approach) | Single KB with metadata filtering | Cost-effective, simpler management
Silo | Separate KB per tenant | Maximum isolation, per-tenant encryption
Hybrid | Shared infrastructure, separate data sources | Balance of isolation and efficiency
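
In the Pool pattern, isolation depends on every query carrying a tenant filter. The sketch below shows one way that could be enforced: a helper that always wraps the caller's filter in a mandatory tenant clause and returns a retrieval configuration in the shape the Retrieve API expects. The metadata key `tenant_id`, the helper name, and the default result count are assumptions for illustration, not a description of our deployed code.

```python
def tenant_scoped_retrieval_config(tenant_id, extra_filter=None, number_of_results=5):
    """Build a retrievalConfiguration whose filter is always limited to one tenant."""
    if not tenant_id:
        # Refuse to build an unscoped query in a shared Knowledge Base.
        raise ValueError("tenant_id is required for pool-pattern retrieval")
    tenant_filter = {"equals": {"key": "tenant_id", "value": tenant_id}}
    if extra_filter is None:
        combined = tenant_filter
    else:
        # The tenant clause is ANDed with whatever the caller asked for.
        combined = {"andAll": [tenant_filter, extra_filter]}
    return {
        "vectorSearchConfiguration": {
            "numberOfResults": number_of_results,
            "filter": combined,
        }
    }
```

Deriving `tenant_id` from a verified identity token on the server, rather than from request parameters, keeps a client from choosing another tenant's value.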

Reranking

Reranking improves retrieval relevance by re-scoring the initially retrieved chunks against the query with a dedicated reranker model and returning them in the new order.

Documentation

Resource | Description
Improve relevance with reranking | Main reranking guide
Use a reranker model | Implementation guide
Supported models and Regions | Model availability

Available Models

Model | Availability | Notes
Cohere Rerank 3.5 | us-east-1, us-west-2, ca-central-1, eu-central-1, ap-northeast-1 | Used in our system
Amazon Rerank 1.0 | Not available in us-east-1 | Not available in our region
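
As a hedged sketch of how a reranker is attached to a Knowledge Base query, the helper below builds a reranking configuration for Cohere Rerank 3.5. The structure and the model identifier `cohere.rerank-v3-5:0` reflect our reading of the Bedrock documentation and should be checked against the "Use a reranker model" guide; the helper name and the default result count are hypothetical.

```python
# Regions listed for Cohere Rerank 3.5 in the table above.
COHERE_RERANK_REGIONS = {
    "us-east-1", "us-west-2", "ca-central-1", "eu-central-1", "ap-northeast-1",
}


def rerank_configuration(region="us-east-1", number_of_results=5):
    """Build a rerankingConfiguration dict for Cohere Rerank 3.5."""
    if region not in COHERE_RERANK_REGIONS:
        raise ValueError(f"Cohere Rerank 3.5 is not listed for region {region!r}")
    # Foundation-model ARNs have an empty account field.
    model_arn = f"arn:aws:bedrock:{region}::foundation-model/cohere.rerank-v3-5:0"
    return {
        "type": "BEDROCK_RERANKING_MODEL",
        "bedrockRerankingConfiguration": {
            "modelConfiguration": {"modelArn": model_arn},
            "numberOfRerankedResults": number_of_results,
        },
    }
```

In a Retrieve request this dictionary would sit under `retrievalConfiguration.vectorSearchConfiguration.rerankingConfiguration`.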

Blog Posts

Resource | Description
Cohere Rerank 3.5 on Amazon Bedrock | Model capabilities
Rerank API announcement | Feature launch

Amazon Titan Embeddings

Titan Embeddings V2 generates the vector representations for semantic search.

Documentation

Resource | Description
Amazon Titan Text Embeddings models | Model specifications
Titan Embeddings parameters | API parameters

Model Specifications

Specification | Value
Model ID | amazon.titan-embed-text-v2:0
Max input | 8,192 tokens
Default dimensions | 1,024
Available dimensions | 256, 512, 1,024
Accuracy retained at 512 dimensions | ~99% of the 1,024-dimension accuracy
Accuracy retained at 256 dimensions | ~97% of the 1,024-dimension accuracy
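
The specifications above translate into a small request body. The sketch below builds that body and rejects dimension values outside the three listed sizes; the field names `inputText`, `dimensions`, and `normalize` follow the Titan Embeddings parameters page as we understand it, while the helper name and its defaults are hypothetical.

```python
import json

MODEL_ID = "amazon.titan-embed-text-v2:0"
ALLOWED_DIMENSIONS = (256, 512, 1024)


def titan_v2_request_body(text, dimensions=1024, normalize=True):
    """Return the JSON request body for a Titan Text Embeddings V2 call."""
    if not text:
        raise ValueError("text must be a non-empty string")
    if dimensions not in ALLOWED_DIMENSIONS:
        raise ValueError(f"dimensions must be one of {ALLOWED_DIMENSIONS}")
    return json.dumps(
        {"inputText": text, "dimensions": dimensions, "normalize": normalize}
    )
```

The returned string can be passed as the `body` argument of the Bedrock runtime `invoke_model` call together with `modelId=MODEL_ID`. Choosing 512 or 256 dimensions trades a small amount of retrieval accuracy for smaller vectors, and the index must be created with the same dimension.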

Blog Posts

Resource | Description
Getting started with Titan Embeddings V2 | Model introduction
Titan Embeddings V2 announcement | Feature capabilities

Claude on Amazon Bedrock

Claude Sonnet 4.5 powers our response generation, and Claude Haiku handles query understanding at lower cost.

Documentation

Resource | Description
Claude models on Bedrock | Claude API parameters
Anthropic Claude Documentation | Anthropic’s official docs

Models Used

Model | Purpose | Model ID
Claude Sonnet 4.5 | Response generation | us.anthropic.claude-sonnet-4-5-20250929-v1:0
Claude Haiku | Query understanding (cost-efficient) | anthropic.claude-3-haiku-20240307-v1:0

Amazon Cognito

Cognito provides authentication with Google OAuth integration.

Documentation

Resource | Description
Amazon Cognito User Guide | Main documentation
Social sign-in with Google | Google OAuth setup
JWT tokens | Token handling

Additional Resources

AWS Samples and Examples

Resource | Description
Amazon Bedrock Samples (GitHub) | Code examples
RAG chunking strategies | Chunking cookbook

Architecture Guidance

Resource | Description
RAG with Amazon Bedrock | AWS Prescriptive Guidance
Well-Architected RAG Lens | Best practices

How We Use These References

Throughout our documentation, we link to these AWS sources to:

  1. Validate architectural decisions - Our choices are backed by official AWS guidance
  2. Explain limitations - Service constraints are documented by AWS
  3. Provide deeper context - Readers can explore official docs for more detail
  4. Stay current - AWS documentation reflects the latest service capabilities

When you see a claim in our documentation, look for the accompanying AWS reference link to verify and learn more.


Last updated: 2026-01-01