rag-app-on-aws
Build and deploy a full-stack RAG app on AWS with Terraform, using the free-tier Gemini Pro model, real-time web search via a remote MCP server, and a Streamlit UI with token-based authentication.
claude mcp add --transport stdio genieincodebottle-rag-app-on-aws -- docker run -i \
  --env TF_LOG="INFO" \
  --env AWS_REGION="your-aws-region" \
  --env TF_VAR_example="value" \
  rag-app-on-aws-image
How to use
This MCP server represents the AWS-based Retrieval-Augmented Generation (RAG) deployment described in the Rag App on AWS repository. Rather than a single executable service, it points to a Terraform-driven infrastructure stack that provisions an end-to-end backend (API Gateway, Lambda functions, S3, RDS with pgvector) and a Streamlit UI for user interaction. The primary capabilities come from the deployed components: document ingestion and embedding generation, vector storage and semantic search, authentication via Cognito, and a Streamlit-based front end for login, upload, querying, and evaluation dashboards. Use the server to provision, manage, and operate the full RAG pipeline against your AWS environment, or to integrate with the UI and other integration points described in the repository.
When you run this MCP server, you’ll gain access to the tooling and workflows that orchestrate: (1) IaC with Terraform for multi-environment deployments (dev, staging, prod); (2) Lambda-backed backend logic for document processing, uploading, querying, and authentication; (3) a Postgres RDS instance with pgvector for embedding storage and vector search; (4) Streamlit UI components for interactive document upload and RAG evaluation dashboards. You can use the provided Terraform modules to adjust network, compute, and storage configurations, and you can extend or replace Lambda handlers as your use case evolves.
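The retrieval step at the heart of the pipeline can be illustrated with a minimal, self-contained sketch. Plain Python stands in here for the pgvector-backed search in RDS; the in-memory vectors and document IDs are placeholder assumptions, not the repository's actual schema or code:

```python
import math

def cosine_similarity(a, b):
    # The similarity metric pgvector exposes through its distance operators.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, doc_vecs, k=2):
    # In the deployed stack this ranking is a SQL query against RDS/pgvector;
    # here we rank an in-memory dict of {doc_id: embedding} the same way.
    ranked = sorted(doc_vecs.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

docs = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.05, 0.0], docs))  # → ['doc-a', 'doc-b']
```

The retrieved chunks would then be passed as context to Gemini Pro to generate the answer.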
How to install
Prerequisites:
- Terraform installed and configured with AWS credentials
- AWS CLI installed and configured (or appropriate environment-based credentials)
- Python (for local tooling and tests, if you plan to run tests or utility scripts)
Installation steps:
1. Prepare AWS credentials and environment:
- Configure your AWS region (e.g., export AWS_REGION=us-east-1)
- Create or select an AWS account where you want to deploy the resources
2. Clone the repository (or fetch the Rag App on AWS codebase):
   git clone https://github.com/genieincodebottle/rag-app-on-aws.git
   cd rag-app-on-aws
3. Initialize Terraform and install modules:
   cd environments/dev
   terraform init
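If you script these steps (for example, from CI or a local helper), the Terraform invocations can be built programmatically. A minimal sketch, assuming the terraform binary is on PATH and using its `-chdir` global option to target an environment directory:

```python
import subprocess

def terraform_cmd(action, env_dir, variables=None):
    # Build a terraform command line for a given environment directory
    # (e.g. environments/dev); variables are passed as -var flags.
    cmd = ["terraform", f"-chdir={env_dir}", action]
    for key, value in (variables or {}).items():
        cmd += ["-var", f"{key}={value}"]
    return cmd

def run(action, env_dir, variables=None):
    # Actually invoke terraform; requires the binary and AWS credentials.
    return subprocess.run(terraform_cmd(action, env_dir, variables), check=True)

print(terraform_cmd("plan", "environments/dev", {"region": "us-east-1"}))
```

The same helper covers init, plan, and apply across the dev/staging/prod environment directories.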
4. Review and customize variables for your environment:
- Open environments/dev/variables.tf and environments/dev/main.tf to align with your account (VPC, subnets, secrets, Cognito, etc.).
- Set any required variables in a terraform.tfvars file or via -var flags. Example: terraform plan -var 'region=us-east-1' -var 'db_password=YOUR_PASSWORD'
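For repeatable deployments, the terraform.tfvars file itself can be generated from a dict instead of hand-edited. A small sketch (the variable names shown are examples, and the quoting rules cover only strings, numbers, and booleans):

```python
def render_tfvars(variables):
    # Render a dict as terraform.tfvars content: strings are quoted,
    # booleans lowered to true/false, numbers emitted as-is.
    lines = []
    for key, value in variables.items():
        if isinstance(value, bool):
            lines.append(f"{key} = {str(value).lower()}")
        elif isinstance(value, str):
            lines.append(f'{key} = "{value}"')
        else:
            lines.append(f"{key} = {value}")
    return "\n".join(lines) + "\n"

content = render_tfvars({"region": "us-east-1", "db_password": "YOUR_PASSWORD"})
print(content)
# e.g. pathlib.Path("environments/dev/terraform.tfvars").write_text(content)
```

Writing the file next to environments/dev/main.tf lets terraform pick it up automatically, avoiding long -var command lines.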
5. Apply the infrastructure:
   terraform apply
6. Post-install validation:
- Verify that API Gateway endpoints, Lambda functions, and the RDS instance are healthy.
- Access the Streamlit UI endpoint and complete the authentication flow via Cognito as configured in the infra.
Note: If you prefer to run components locally for development, you can work with the individual Lambda function code under src/ and use local testing harnesses as described in the repository.
Additional notes
Tips and common considerations:
- The infrastructure uses Terraform modules organized under modules/ to provision networking, compute (Lambda), database (RDS with pgvector), storage, and monitoring. Use the dev/staging/prod environment separation to avoid cross-environment interference.
- Secrets are typically managed via AWS Secrets Manager and Cognito for authentication; ensure proper IAM permissions and secret rotation policies.
- The Rag App UI (rag_ui) is Streamlit-based; for local testing, you can run the UI against the deployed backend endpoints or point it at a local mock backend during development.
- If you encounter Terraform drift or errors caused by pre-existing resources, run terraform state list and terraform import to bring those resources under management, then terraform plan to reconcile state.
- Ensure network boundaries (private subnets for Lambda, NAT Gateway for outbound internet access) are consistent with your security requirements and Gemini API access (or any external dependencies).
- Costs can accrue from RDS, NAT Gateway, and Lambdas; monitor usage and adjust autoscaling and timeouts accordingly.
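The tip above about pointing the Streamlit UI at a local mock backend can be handled with a small config resolver. A sketch under one assumption: RAG_API_BASE is a hypothetical environment-variable name, not one defined by the repository:

```python
import os

def resolve_backend(default_local="http://localhost:8000"):
    # Target either the deployed API Gateway stage or a local mock backend,
    # depending on whether RAG_API_BASE (an assumed variable name) is set.
    return os.environ.get("RAG_API_BASE", default_local).rstrip("/")

os.environ.pop("RAG_API_BASE", None)
print(resolve_backend())  # → http://localhost:8000 (local mock fallback)
os.environ["RAG_API_BASE"] = "https://example.execute-api.us-east-1.amazonaws.com/dev/"
print(resolve_backend())
```

Keeping the backend URL out of the UI code makes it easy to switch between dev, staging, and prod deployments without edits.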
Related MCP Servers
sdk-typescript
A model-driven approach to building AI agents in just a few lines of code.
mcp-gateway
A plugin-based gateway that orchestrates other MCPs and lets developers build enterprise-grade agents on top of it.
Lambda
Creates a simple MCP tool server using streamable HTTP.
MCP2Lambda
Run any AWS Lambda function as a Large Language Model (LLM) tool without code changes using Anthropic's Model Context Protocol (MCP).
sample-agentic-ai-web
This project demonstrates how to use AWS Bedrock with Anthropic Claude and Amazon Nova models to create a web automation assistant with tool use, human-in-the-loop interaction, and vision capabilities.
gemini-webapi
MCP server for Google Gemini — free image generation, editing & chat via browser cookies. No API keys needed.