What is AWS Bedrock?

Amazon Bedrock provides access to foundation models from multiple AI providers through a single API for building generative AI applications.

Which tasks can Bedrock handle?

Text generation, embeddings, and image generation across models like Claude, Titan, Llama, Mistral, and Stable Diffusion.

How do I get started?

Enable the models in the Bedrock console, choose an appropriate inference type, and invoke models or generate embeddings via the AWS CLI or boto3 to prototype quickly.

npx machina-cli add skill itsmostafa/aws-agent-skills/bedrock --openclaw

AWS Bedrock

Amazon Bedrock provides access to foundation models (FMs) from AI companies through a unified API. Build generative AI applications with text generation, embeddings, and image generation capabilities.

Core Concepts
Common Patterns
CLI Reference
Best Practices
Troubleshooting
References

Core Concepts

Foundation Models

Pre-trained models available through Bedrock:

Claude (Anthropic): Text generation, analysis, coding
Titan (Amazon): Text, embeddings, image generation
Llama (Meta): Open-weight text generation
Mistral: Efficient text generation
Stable Diffusion (Stability AI): Image generation

Model Access

Models must be enabled in your account before use:

Request access in Bedrock console
Some models require acceptance of EULAs
Access is region-specific

Inference Types

Type	Use Case	Pricing
On-Demand	Variable workloads	Per token
Provisioned Throughput	Consistent high-volume	Hourly commitment
Batch Inference	Async large-scale	Discounted per token

Common Patterns

Invoke Model (Text Generation)

AWS CLI:

# Invoke Claude (AWS CLI v2 needs --cli-binary-format to accept a raw JSON body)
aws bedrock-runtime invoke-model \
  --model-id anthropic.claude-3-sonnet-20240229-v1:0 \
  --content-type application/json \
  --accept application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Explain AWS Lambda in 3 sentences."}
    ]
  }' \
  response.json

# Print the generated text
jq -r '.content[0].text' response.json

boto3:

import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def invoke_claude(prompt, max_tokens=1024):
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': max_tokens,
            'messages': [
                {'role': 'user', 'content': prompt}
            ]
        })
    )

    result = json.loads(response['body'].read())
    return result['content'][0]['text']

# Usage
response = invoke_claude('What is Amazon S3?')
print(response)

Streaming Response

import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def stream_claude(prompt):
    response = bedrock.invoke_model_with_response_stream(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': 1024,
            'messages': [
                {'role': 'user', 'content': prompt}
            ]
        })
    )

    for event in response['body']:
        chunk = json.loads(event['chunk']['bytes'])
        if chunk['type'] == 'content_block_delta':
            yield chunk['delta'].get('text', '')

# Usage
for text in stream_claude('Write a haiku about cloud computing.'):
    print(text, end='', flush=True)

Generate Embeddings

import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def get_embedding(text):
    response = bedrock.invoke_model(
        modelId='amazon.titan-embed-text-v2:0',
        contentType='application/json',
        accept='application/json',
        body=json.dumps({
            'inputText': text,
            'dimensions': 1024,
            'normalize': True
        })
    )

    result = json.loads(response['body'].read())
    return result['embedding']

# Usage
embedding = get_embedding('AWS Lambda is a serverless compute service.')
print(f'Embedding dimension: {len(embedding)}')
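
Embeddings are typically compared with cosine similarity when used for search or similarity ranking. The following is a minimal pure-Python sketch of that comparison; the helper names `cosine_similarity` and `rank_by_similarity` are illustrative and not part of boto3 or Bedrock, and the vectors are assumed to be plain lists of floats such as those returned by `get_embedding`.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError('vectors must have the same length')
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as dissimilar to everything
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

For larger document sets a vector database or a numerical library would likely be a better fit than a Python loop; this sketch only shows the comparison itself.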

Conversation with History

import boto3
import json

bedrock = boto3.client('bedrock-runtime')

class Conversation:
    def __init__(self, system_prompt=None):
        self.messages = []
        self.system = system_prompt

    def chat(self, user_message):
        self.messages.append({
            'role': 'user',
            'content': user_message
        })

        body = {
            'anthropic_version': 'bedrock-2023-05-31',
            'max_tokens': 1024,
            'messages': self.messages
        }

        if self.system:
            body['system'] = self.system

        response = bedrock.invoke_model(
            modelId='anthropic.claude-3-sonnet-20240229-v1:0',
            contentType='application/json',
            accept='application/json',
            body=json.dumps(body)
        )

        result = json.loads(response['body'].read())
        assistant_message = result['content'][0]['text']

        self.messages.append({
            'role': 'assistant',
            'content': assistant_message
        })

        return assistant_message

# Usage
conv = Conversation(system_prompt='You are an AWS solutions architect.')
print(conv.chat('What database should I use for a chat application?'))
print(conv.chat('What about for time-series data?'))

List Available Models

# List all foundation models
aws bedrock list-foundation-models \
  --query 'modelSummaries[*].[modelId,modelName,providerName]' \
  --output table

# Filter by provider
aws bedrock list-foundation-models \
  --by-provider anthropic \
  --query 'modelSummaries[*].modelId'

# Get model details
aws bedrock get-foundation-model \
  --model-identifier anthropic.claude-3-sonnet-20240229-v1:0
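
The same listing can be done from Python. The sketch below assumes `bedrock_client` is a control-plane client created with `boto3.client('bedrock')` (not `bedrock-runtime`); the helper name `list_model_ids` is hypothetical.

```python
def list_model_ids(bedrock_client, provider=None):
    """Return the model IDs offered in the client's region."""
    kwargs = {}
    if provider:
        # Same filter as the CLI's --by-provider option
        kwargs['byProvider'] = provider
    response = bedrock_client.list_foundation_models(**kwargs)
    return [summary['modelId'] for summary in response.get('modelSummaries', [])]

# Usage (requires AWS credentials and boto3):
# import boto3
# print(list_model_ids(boto3.client('bedrock'), provider='anthropic'))
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object when no AWS credentials are available.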

Request Model Access

# List the agreement offers (license terms) available for a model
aws bedrock list-foundation-model-agreement-offers \
  --model-id anthropic.claude-3-sonnet-20240229-v1:0

CLI Reference

Bedrock (Control Plane)

Command	Description
`aws bedrock list-foundation-models`	List available models
`aws bedrock get-foundation-model`	Get model details
`aws bedrock list-custom-models`	List fine-tuned models
`aws bedrock create-model-customization-job`	Start fine-tuning
`aws bedrock list-provisioned-model-throughputs`	List provisioned capacity

Bedrock Runtime (Data Plane)

Command	Description
`aws bedrock-runtime invoke-model`	Invoke model synchronously
`aws bedrock-runtime invoke-model-with-response-stream`	Invoke with streaming
`aws bedrock-runtime converse`	Multi-turn conversation API
`aws bedrock-runtime converse-stream`	Streaming conversation
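
The Converse API uses one request shape across models, so the body does not need to follow a provider-specific schema as it does with `invoke-model`. Here is a minimal boto3 sketch, assuming `runtime_client` is `boto3.client('bedrock-runtime')`; the helper name `converse_once` is hypothetical.

```python
def converse_once(runtime_client, model_id, prompt, system_prompt=None, max_tokens=1024):
    """Send one user turn through the Converse API and return the reply text."""
    request = {
        'modelId': model_id,
        # Converse content is a list of content blocks, not a bare string
        'messages': [{'role': 'user', 'content': [{'text': prompt}]}],
        'inferenceConfig': {'maxTokens': max_tokens},
    }
    if system_prompt:
        request['system'] = [{'text': system_prompt}]
    response = runtime_client.converse(**request)
    return response['output']['message']['content'][0]['text']

# Usage (requires AWS credentials and boto3):
# import boto3
# client = boto3.client('bedrock-runtime')
# print(converse_once(client, 'anthropic.claude-3-sonnet-20240229-v1:0', 'What is Amazon S3?'))
```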

Bedrock Agent Runtime

Command	Description
`aws bedrock-agent-runtime invoke-agent`	Invoke a Bedrock agent
`aws bedrock-agent-runtime retrieve`	Query knowledge base
`aws bedrock-agent-runtime retrieve-and-generate`	RAG query

Best Practices

Cost Optimization

Use appropriate models: Smaller models for simple tasks
Set max_tokens: Limit output length when possible
Cache responses: For repeated identical queries
Batch when possible: Use batch inference for bulk processing
Monitor usage: Set up CloudWatch alarms for cost
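
Caching repeated identical queries could look roughly like the sketch below. It keeps results in memory, keyed on a hash of the model ID and the exact request body. `ResponseCache` and the `invoke_fn` callable are hypothetical names introduced for illustration; `invoke_fn(model_id, body)` is assumed to wrap whatever Bedrock call the application already makes.

```python
import hashlib
import json

class ResponseCache:
    """In-memory cache keyed on the model ID plus the exact request body."""

    def __init__(self, invoke_fn):
        # invoke_fn(model_id, body_dict) -> result; supplied by the caller
        self._invoke_fn = invoke_fn
        self._store = {}

    @staticmethod
    def _key(model_id, body):
        # sort_keys makes the key independent of dict key ordering
        payload = json.dumps({'model': model_id, 'body': body}, sort_keys=True)
        return hashlib.sha256(payload.encode('utf-8')).hexdigest()

    def invoke(self, model_id, body):
        key = self._key(model_id, body)
        if key not in self._store:
            # Cache miss: call the model once and remember the result
            self._store[key] = self._invoke_fn(model_id, body)
        return self._store[key]
```

This only helps when requests are byte-for-byte equivalent, and it has no eviction or expiry; a production setup would probably add both.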

Performance

Use streaming: For better user experience with long outputs
Connection pooling: Reuse boto3 clients
Regional deployment: Use closest region to reduce latency
Provisioned throughput: For consistent high-volume workloads

Security

Least privilege IAM: Only grant needed model access
VPC endpoints: Keep traffic private
Guardrails: Implement content filtering
Audit with CloudTrail: Track model invocations

IAM Permissions

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
      ]
    }
  ]
}
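
When the list of allowed models changes often, a policy like the one above can be generated rather than edited by hand. A small sketch, using a hypothetical helper `build_invoke_policy` and the same ARN format shown in the policy:

```python
import json

def build_invoke_policy(region, model_ids):
    """Build an IAM policy document allowing invocation of the given models."""
    resources = [
        f'arn:aws:bedrock:{region}::foundation-model/{model_id}'
        for model_id in model_ids
    ]
    return {
        'Version': '2012-10-17',
        'Statement': [{
            'Effect': 'Allow',
            'Action': [
                'bedrock:InvokeModel',
                'bedrock:InvokeModelWithResponseStream',
            ],
            'Resource': resources,
        }],
    }

policy = build_invoke_policy('us-east-1', [
    'anthropic.claude-3-sonnet-20240229-v1:0',
    'amazon.titan-embed-text-v2:0',
])
print(json.dumps(policy, indent=2))
```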

Troubleshooting

AccessDeniedException

Causes:

Model access not enabled in console
IAM policy missing bedrock:InvokeModel
Wrong model ID or region

Debug:

# Confirm the model ID is offered in the current region (this does not show whether access is granted)
aws bedrock list-foundation-models \
  --query 'modelSummaries[?modelId==`anthropic.claude-3-sonnet-20240229-v1:0`]'

# Test IAM permissions
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:role/my-role \
  --action-names bedrock:InvokeModel \
  --resource-arns "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"

ModelNotReadyException

Cause: Model is still being provisioned or temporarily unavailable.

Solution: Implement retry with exponential backoff:

import json
import time
from botocore.exceptions import ClientError

def invoke_with_retry(bedrock, body, max_retries=3):
    for attempt in range(max_retries):
        try:
            return bedrock.invoke_model(
                modelId='anthropic.claude-3-sonnet-20240229-v1:0',
                body=json.dumps(body)
            )
        except ClientError as e:
            # Retry only ModelNotReadyException; re-raise everything else
            if e.response['Error']['Code'] != 'ModelNotReadyException':
                raise
            # Back off 1s, 2s, 4s, ... before the next attempt
            time.sleep(2 ** attempt)
    raise Exception('Max retries exceeded')

ThrottlingException

Causes:

Exceeded on-demand quota
Too many concurrent requests

Solutions:

Request quota increase
Implement exponential backoff
Consider provisioned throughput
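
One way to implement the backoff suggested above is exponential delay with random jitter, so that concurrent clients do not all retry at the same moment. The sketch below is generic: `call_with_backoff`, `RETRYABLE_CODES`, and the injectable `sleep` parameter are illustrative names, and the error code is read from the `response['Error']['Code']` layout that botocore's `ClientError` uses.

```python
import random
import time

RETRYABLE_CODES = {'ThrottlingException', 'ModelNotReadyException'}

def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying retryable errors with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            response = getattr(exc, 'response', None)
            code = response.get('Error', {}).get('Code') if isinstance(response, dict) else None
            # Give up on non-retryable errors or once attempts are exhausted
            if code not in RETRYABLE_CODES or attempt == max_attempts - 1:
                raise
            # Delay cap doubles each attempt; the actual wait is random below the cap
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))

# Usage (assumes an existing boto3 bedrock-runtime client named bedrock and a JSON body string):
# result = call_with_backoff(lambda: bedrock.invoke_model(modelId='...', body=body_json))
```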

ValidationException

Common issues:

Invalid model ID
Malformed request body
max_tokens exceeds model limit

Debug:

# Check model-specific requirements
aws bedrock get-foundation-model \
  --model-identifier anthropic.claude-3-sonnet-20240229-v1:0 \
  --query 'modelDetails.inferenceTypesSupported'
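
Some of these issues can be caught locally before the request is sent. The sketch below checks a Claude-style request body of the shape used in the examples in this document. `validate_claude_body` is a hypothetical helper, and `max_output_tokens` must be supplied by the caller because the limit depends on the model; the value in the usage comment is only illustrative.

```python
def validate_claude_body(body, max_output_tokens):
    """Return a list of problems found in an Anthropic Messages request body."""
    problems = []
    if 'anthropic_version' not in body:
        problems.append('missing anthropic_version')

    max_tokens = body.get('max_tokens')
    if not isinstance(max_tokens, int) or max_tokens < 1:
        problems.append('max_tokens must be a positive integer')
    elif max_tokens > max_output_tokens:
        problems.append(f'max_tokens {max_tokens} exceeds limit {max_output_tokens}')

    messages = body.get('messages')
    if not isinstance(messages, list) or not messages:
        problems.append('messages must be a non-empty list')
    else:
        for i, message in enumerate(messages):
            if message.get('role') not in ('user', 'assistant'):
                problems.append(f'messages[{i}] has invalid role')
            if not message.get('content'):
                problems.append(f'messages[{i}] has empty content')
    return problems

# Usage (4096 is an illustrative limit, check the documentation for your model):
# problems = validate_claude_body(body, max_output_tokens=4096)
# if problems:
#     raise ValueError('; '.join(problems))
```

An empty list means no problems were found by these checks; the service may still reject the request for other reasons.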

References

Source

https://github.com/itsmostafa/aws-agent-skills/blob/main/skills/bedrock/SKILL.md

Overview

Bedrock offers a unified API to access multiple foundation models (Claude, Titan, Llama, Mistral, Stable Diffusion) for generative AI tasks. It enables building AI apps with text generation, embeddings, and image generation, plus options to configure model access and support RAG workflows.

How This Skill Works

Bedrock exposes a single API to call different foundation models after you enable access in your account. Choose an inference type (On-Demand, Provisioned Throughput, or Batch Inference) and send a structured payload via the AWS CLI or an SDK such as boto3. You can also stream responses and generate embeddings with a model ID such as amazon.titan-embed-text-v2:0.

When to Use It

Invoking a foundation model for text generation or analysis in a live app
Building AI-powered applications that rely on generation, embeddings, or image generation
Creating embeddings for search, similarity, or RAG pipelines
Configuring which models are enabled in your AWS Bedrock account and managing access/EULA requirements
Implementing RAG patterns by combining retrieval with Bedrock inference

Quick Start

Step 1: Enable desired Bedrock models in the Bedrock console and accept any required EULAs
Step 2: Select an inference type (On-Demand, Provisioned Throughput, or Batch) based on your workload
Step 3: Call bedrock-runtime invoke-model (CLI) or the boto3 client to run generation or embedding tasks and process the response

Best Practices

Enable desired models in the Bedrock console and complete any required EULA agreements
Choose the correct inference type based on workload and cost: On-Demand, Provisioned Throughput, or Batch
Use embeddings (e.g., Titan embed) for indexing, retrieval, and RAG readiness
Test prompts and output safety; monitor latency and quotas across regions
Document modelIds, versions, and access controls for reproducible automation using CLI or boto3

Example Use Cases

Invoke Claude via bedrock-runtime to power a chat assistant
Generate text embeddings with amazon.titan-embed-text-v2:0 for document similarity search
Stream Claude responses in real time using invoke_model_with_response_stream
Build a simple RAG pipeline by indexing documents with embeddings and querying Bedrock for generation
Enable and manage model access (e.g., Claude, Titan) in the Bedrock console for a region
