AI Slack Bot Agent
A serverless AI agent on AWS that receives natural language questions via Slack, reasons using AWS Bedrock (Claude Haiku), calls real tools (DynamoDB, S3, web search), and posts structured answers back to the channel.
Estimated cost: ~$3–6/mo | Runtime: Node.js 20 on AWS Lambda | AI model: Claude Haiku via AWS Bedrock | Region: us-east-1
How it works
A user mentions @docsbot in Slack with a natural language question. API Gateway receives the webhook and passes it to the Router Lambda, which validates the Slack signature, enqueues the job to SQS, and returns HTTP 200 right away so that Slack’s 3-second acknowledgment deadline is met.
The Worker Lambda picks up the message, calls AWS Bedrock with the user’s question, and lets Claude Haiku decide which tools to invoke — querying DynamoDB for structured business data, reading documents from S3, or using previous conversation history. The final answer is posted back to Slack as a structured reply.
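The tool-use loop described above can be sketched as follows. This is a minimal illustration, not the project's actual worker code: the message shapes are trimmed-down versions modeled on the Bedrock Converse API, and `runAgent`, `callModel`, the `tools` registry, and the `maxTurns` limit are hypothetical names introduced here. The model call is injected so the loop can be exercised without AWS credentials.

```typescript
// Trimmed-down message shapes modeled on the Bedrock Converse API (assumption).
type ContentBlock =
  | { text: string }
  | { toolUse: { toolUseId: string; name: string; input: unknown } }
  | { toolResult: { toolUseId: string; content: { json: unknown }[] } };

interface Message {
  role: "user" | "assistant";
  content: ContentBlock[];
}

interface ModelReply {
  stopReason: string; // "tool_use" when the model wants a tool to be run
  message: Message;
}

// Hypothetical wrapper around the Bedrock call, and a hypothetical tool type.
type CallModel = (messages: Message[]) => Promise<ModelReply>;
type Tool = (input: unknown) => Promise<unknown>;

async function runAgent(
  question: string,
  callModel: CallModel,
  tools: Record<string, Tool>,
  maxTurns = 5, // illustrative safety limit against endless tool loops
): Promise<string> {
  const messages: Message[] = [{ role: "user", content: [{ text: question }] }];

  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await callModel(messages);
    messages.push(reply.message);

    if (reply.stopReason !== "tool_use") {
      // No more tools requested: join the text blocks into the final answer.
      return reply.message.content
        .map((block) => ("text" in block ? block.text : ""))
        .join("");
    }

    // Run each requested tool and return the results as the next user message.
    const results: ContentBlock[] = [];
    for (const block of reply.message.content) {
      if (!("toolUse" in block)) continue;
      const { toolUseId, name, input } = block.toolUse;
      const tool = tools[name];
      const output = tool
        ? await tool(input)
        : { error: `unknown tool: ${name}` };
      results.push({ toolResult: { toolUseId, content: [{ json: output }] } });
    }
    messages.push({ role: "user", content: results });
  }

  return "Sorry, I could not finish answering within the turn limit.";
}
```

In the worker, each entry in `tools` would wrap one capability (a DynamoDB query, an S3 document read, a conversation-history lookup), and the returned string would be posted back to Slack.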
Services used
Compute
| Service | Role |
|---|---|
| Lambda — router | Receives Slack event, validates signature, enqueues job, returns 200 OK within Slack's 3-second window |
| Lambda — worker | Reads SQS, calls Bedrock, executes tool calls, posts reply to Slack |
AI & agent
| Service | Role |
|---|---|
| AWS Bedrock | Hosts Claude Haiku model — handles reasoning and tool-use decisions |
| Bedrock tool use | Agent decides which tools to call and in what order based on the question |
API & messaging
| Service | Role |
|---|---|
| API Gateway | Public HTTPS endpoint that Slack posts events to |
| SQS queue | Decouples router from worker — solves the Slack 3-second timeout problem |
| SQS DLQ | Dead-letter queue captures failed jobs for inspection and retry |
Data & storage
| Service | Role |
|---|---|
| DynamoDB — conversations | Stores message history per user so the agent has memory across turns |
| DynamoDB — business data | Stores queryable structured data (e.g. sales reps, products) for agent tools |
| S3 — documents | Stores PDFs and reports that the agent's summarizer tool reads |
| S3 — Terraform state | Remote backend for Terraform state files |
Security & config
| Service | Role |
|---|---|
| Secrets Manager | Stores Slack bot token and signing secret — never in env vars |
| IAM roles | Separate least-privilege role per Lambda — no shared credentials |
| VPC + private subnets | Lambda runs in private subnet — no public internet exposure |
| KMS | Encryption at rest for DynamoDB and S3 |
Observability
| Service | Role |
|---|---|
| CloudWatch Logs | Captures all Lambda stdout — structured JSON logging |
| X-Ray tracing | Traces the full agent reasoning chain — shows each tool call as a span |
| CloudWatch Alarms | Alerts on DLQ depth > 0 and Lambda error rate > 5% |
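As an illustration of the structured JSON logging mentioned in the table, the sketch below writes one JSON object per line to stdout, which Lambda forwards to CloudWatch Logs. The function names and field names (`timestamp`, `level`, `message`) are assumptions made for this example, not the project's actual log schema.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

// Formats one log record as a single JSON line.
// Core keys are written last so caller-supplied fields cannot overwrite them.
function formatLog(
  level: LogLevel,
  message: string,
  fields: Record<string, unknown> = {},
): string {
  return JSON.stringify({
    ...fields,
    timestamp: new Date().toISOString(),
    level,
    message,
  });
}

// Writes the record to stdout.
function log(
  level: LogLevel,
  message: string,
  fields: Record<string, unknown> = {},
): void {
  console.log(formatLog(level, message, fields));
}

log("info", "tool call finished", { tool: "query_business_data", durationMs: 42 });
```

Keeping every record on one line with stable keys makes it straightforward to filter by field in CloudWatch Logs.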
CI/CD pipeline
| Service | Role |
|---|---|
| CodeCommit | Git source repository with branch protection on main |
| CodePipeline | Orchestrates: source → build → terraform plan → approval → terraform apply |
| CodeBuild | Runs terraform fmt, terraform validate, tfsec, checkov, unit tests |
| Manual approval gate | Required between terraform plan and terraform apply for prod |
Key design decisions
Two-Lambda pattern: The Router Lambda enqueues the job to SQS and returns HTTP 200 to Slack well within the 3-second limit. The Worker Lambda then processes the job asynchronously, free of that deadline. This satisfies Slack’s strict acknowledgment timeout.
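A minimal sketch of the router's decision logic, assuming the Slack signature has already been verified. `routeSlackEvent`, `Enqueue`, and the job fields are hypothetical names. In a deployed handler, `enqueue` would wrap an SQS send call. The sketch awaits the enqueue before returning, on the assumption that a Lambda handler should not rely on work continuing after its response has been sent.

```typescript
// Minimal shapes for illustration only (hypothetical names).
interface HttpResponse {
  statusCode: number;
  body: string;
}
type Enqueue = (messageBody: string) => Promise<void>;

async function routeSlackEvent(
  rawBody: string,
  enqueue: Enqueue,
): Promise<HttpResponse> {
  const payload = JSON.parse(rawBody);

  // Slack's URL verification handshake: echo the challenge value back.
  if (payload.type === "url_verification") {
    return { statusCode: 200, body: String(payload.challenge) };
  }

  if (payload.type === "event_callback" && payload.event?.type === "app_mention") {
    // Hand the slow work to the worker; only the fields it needs are queued.
    await enqueue(
      JSON.stringify({
        channel: payload.event.channel,
        user: payload.event.user,
        text: payload.event.text,
        // Reply in the existing thread if there is one, else start one.
        threadTs: payload.event.thread_ts ?? payload.event.ts,
      }),
    );
  }

  // Acknowledge every verified request so Slack does not retry it.
  return { statusCode: 200, body: "" };
}
```

Because the router does nothing slower than a single queue write, its response time stays far below the 3-second limit even when the model call in the worker takes much longer.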
Bedrock over direct API calls: All AI requests stay inside the AWS network, and Bedrock does not store or train on your prompts. This gives better data governance than calling the Anthropic or OpenAI APIs directly.
DynamoDB on-demand — No minimum capacity cost. At portfolio-scale traffic the database cost is effectively zero. Scales automatically if traffic spikes.
Least-privilege IAM: Each Lambda has its own role with only the permissions it needs. Router Lambda can only write to SQS and read Secrets Manager. Worker Lambda can read SQS, call Bedrock, read/write DynamoDB, read S3, and read the Slack bot token from Secrets Manager so that it can call Slack’s API. Nothing more.
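To make the router's permission set concrete, here is a sketch of an IAM policy document written as a TypeScript object. The account ID, queue name, and secret name in the ARNs are placeholders, and `findWildcards` is a hypothetical helper. The project itself would define the equivalent policy in Terraform.

```typescript
interface Statement {
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string[];
}

interface PolicyDocument {
  Version: "2012-10-17";
  Statement: Statement[];
}

// Placeholder ARNs: account ID, queue name, and secret name are hypothetical.
const QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:slack-jobs";
const SECRET_ARN =
  "arn:aws:secretsmanager:us-east-1:123456789012:secret:slack/signing-secret";

// Router role: may only send to the job queue and read the signing secret.
const routerPolicy: PolicyDocument = {
  Version: "2012-10-17",
  Statement: [
    { Effect: "Allow", Action: ["sqs:SendMessage"], Resource: [QUEUE_ARN] },
    {
      Effect: "Allow",
      Action: ["secretsmanager:GetSecretValue"],
      Resource: [SECRET_ARN],
    },
  ],
};

// Hypothetical lint helper: lists any action or resource that uses a wildcard.
function findWildcards(policy: PolicyDocument): string[] {
  const found: string[] = [];
  for (const statement of policy.Statement) {
    for (const value of statement.Action.concat(statement.Resource)) {
      if (value.indexOf("*") !== -1) found.push(value);
    }
  }
  return found;
}

console.log(findWildcards(routerPolicy).length === 0 ? "no wildcards" : "wildcards found");
```

A check like `findWildcards` could run alongside the unit tests in CodeBuild. A real Lambda role usually also needs permission to write its own CloudWatch Logs, which is omitted here for brevity.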
Signing secret validation — Every incoming Slack request is verified using HMAC-SHA256 against the Slack signing secret before any processing begins. Random HTTP requests to the API Gateway endpoint are rejected immediately.
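The check described above can be sketched with Node's built-in `crypto` module. Slack signs the string `v0:<timestamp>:<raw body>` with the signing secret and sends the result in the `X-Slack-Signature` header, with the timestamp in `X-Slack-Request-Timestamp`. The function name, parameter names, and the five-minute replay window constant are choices made for this example rather than the project's actual code.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Requests older than this are rejected to limit replay attacks (assumed window).
const MAX_AGE_SECONDS = 60 * 5;

function verifySlackSignature(
  signingSecret: string,
  timestamp: string, // value of the X-Slack-Request-Timestamp header
  rawBody: string, // request body exactly as received, before JSON parsing
  signature: string, // value of the X-Slack-Signature header
  nowSeconds: number = Math.floor(Date.now() / 1000),
): boolean {
  const ts = Number(timestamp);
  // Reject non-numeric timestamps and requests outside the allowed window.
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > MAX_AGE_SECONDS) {
    return false;
  }

  const expected =
    "v0=" +
    createHmac("sha256", signingSecret)
      .update(`v0:${timestamp}:${rawBody}`)
      .digest("hex");

  const expectedBytes = Buffer.from(expected, "utf8");
  const actualBytes = Buffer.from(signature, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return (
    expectedBytes.length === actualBytes.length &&
    timingSafeEqual(expectedBytes, actualBytes)
  );
}
```

The signature must be computed over the raw request body. Re-serializing a parsed JSON payload can change whitespace or key order and make valid requests fail the check.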