chore: add ralph sidebar workflow setup files
@@ -0,0 +1,216 @@
---
name: ralph-setup
description: Set up autonomous AI development tasks using the Ralph Wiggum technique. Use when the user wants to create a RALPH orchestration — either a simple looping prompt or a multi-hat coordinated workflow. Interviews the user to understand requirements, decides the appropriate mode, and generates all necessary configuration files (ralph.yml, hats.yml, PROMPT.md). Triggers on mentions of "ralph", "autonomous loop", "hat-based", "orchestration", or requests to set up iterative AI agent tasks.
---

# Ralph Setup Skill

Set up autonomous AI development tasks using the Ralph Wiggum technique — either as a simple iterating prompt or a coordinated hat-based workflow.

## Background

Ralph implements the Ralph Wiggum technique: give an AI agent a task, loop it until it's done. The orchestrator is deliberately thin — it trusts the agent to do the work and enforces quality through backpressure (tests, lint, typecheck must pass).

There are two modes:

| Mode | What It Does | Best For |
|------|-------------|----------|
| **Traditional (Simple Prompt)** | Single loop — agent iterates until LOOP_COMPLETE | Quick tasks, single-concern work, anything one agent can handle in a straight line |
| **Hat-Based** | Specialised personas coordinate through typed events | Complex workflows, multi-step processes, tasks needing distinct planning/building/reviewing phases |

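The single-loop mode can be pictured with a minimal Python sketch. This is an illustration of the idea only, not Ralph's actual implementation: `ralph_loop`, `run_agent`, and the `gates` callables are hypothetical names, and the agent is modelled as a plain function that takes the prompt text and returns its output.

```python
from typing import Callable, Optional, Sequence


def ralph_loop(
    run_agent: Callable[[str], str],
    prompt: str,
    gates: Sequence[Callable[[], bool]] = (),
    completion_promise: str = "LOOP_COMPLETE",
    max_iterations: int = 100,
) -> Optional[int]:
    """Return the iteration on which the task completed, or None if the limit was hit."""
    for iteration in range(1, max_iterations + 1):
        # Fresh context: the agent only ever sees the prompt; progress lives on disk.
        output = run_agent(prompt)
        claims_done = completion_promise in output
        # Backpressure: a completion claim only counts if every gate passes.
        if claims_done and all(gate() for gate in gates):
            return iteration
    return None


# Hypothetical stand-in agent that "finishes" on its third call.
calls = {"n": 0}


def fake_agent(prompt: str) -> str:
    calls["n"] += 1
    return "LOOP_COMPLETE" if calls["n"] >= 3 else "still working"
```

Calling `ralph_loop(fake_agent, "do the task")` runs the stand-in agent until it reports completion. A gate that always fails keeps the loop going until `max_iterations` is exhausted, which is why an iteration limit matters.
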
## Core Tenets (Apply to Both Modes)

These six tenets guide every Ralph setup. Reference them when making decisions:

1. **Fresh Context Is Reliability** — Each iteration clears context. The prompt must be self-contained enough to re-read, re-plan, and re-execute every cycle.
2. **Backpressure Over Prescription** — Don't prescribe HOW to do the work. Create gates that reject bad work (tests pass, lint clean, types check).
3. **The Plan Is Disposable** — Regeneration costs one planning loop. Cheap. Don't over-invest in preserving plans.
4. **Disk Is State, Git Is Memory** — Files are the handoff mechanism between iterations. Git provides checkpointing and rollback.
5. **Steer With Signals, Not Scripts** — Add signs (success criteria, quality gates), not step-by-step scripts.
6. **Let Ralph Ralph** — Sit ON the loop, not IN it. The orchestrator coordinates; the agent does the work.

## Workflow

### Phase 1: Interview the User

Before generating anything, you need to understand the task. Ask targeted questions to fill in these blanks:

**Essential information:**
- What is the task? (Be specific — "build an API" is too vague; "build a REST API for user management with Express.js and TypeScript" is good)
- What does "done" look like? (Measurable success criteria — tests pass, endpoints respond, specific files exist)
- What language/framework/tools are involved?
- Does the project already exist, or is this greenfield?
- Are there existing tests, linting, or type-checking set up?

**Information that helps you decide the mode:**
- How many distinct phases or concerns does this task have? (1-2 = simple prompt; 3+ = consider hats)
- Does the task need planning before building? (If yes, hat-based is likely better)
- Does the task need a review/QA step separate from building? (If yes, hat-based)
- Is there a spec or design document to follow? (Spec-driven development suits hats well)
- How complex is the codebase? (Large existing codebase with multiple modules = hat-based)

**Don't over-interview.** If the user gives you a clear, well-scoped task, you may have enough after 1-2 questions. If the task is vague, probe until you can write a crisp PROMPT.md.

### Phase 2: Decide the Mode

Use this decision framework:

**Choose Simple Prompt when:**
- The task is a single concern (add a feature, fix a bug, write a script)
- One agent can handle it start to finish without distinct phases
- The success criteria are straightforward (tests pass, script runs)
- The user explicitly wants something quick and simple
- The task can be fully described in a PROMPT.md under ~50 lines

**Choose Hat-Based when:**
- The task has 3+ distinct phases (plan → build → test → review)
- Different phases need different "mindsets" (architect vs implementer vs reviewer)
- The task involves spec-driven development (spec → implement → verify)
- There's a TDD workflow (write tests → implement → verify)
- The task is large enough that a single prompt would be overwhelming
- Multiple files/modules need coordinated changes
- The user explicitly asks for hats or a structured workflow

**When in doubt:** Start with Simple Prompt. You can always add hats later. Simpler is more robust.

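As a rough illustration, the decision framework above can be written as a small function. This is a sketch under stated assumptions: `TaskProfile` and `choose_mode` are hypothetical names, the fields are a simplified subset of the interview answers, and treating any single hat signal as sufficient is one possible reading of the checklist rather than a rule the skill defines.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Simplified interview answers (hypothetical field names)."""
    distinct_phases: int = 1
    needs_planning: bool = False
    needs_separate_review: bool = False
    spec_or_tdd_driven: bool = False
    user_requested_hats: bool = False
    user_wants_quick: bool = False


def choose_mode(task: TaskProfile) -> str:
    """Return "hat-based" or "simple", defaulting to "simple" when in doubt."""
    # Explicit user preferences win over heuristics.
    if task.user_requested_hats:
        return "hat-based"
    if task.user_wants_quick:
        return "simple"
    hat_signals = [
        task.distinct_phases >= 3,
        task.needs_planning,
        task.needs_separate_review,
        task.spec_or_tdd_driven,
    ]
    return "hat-based" if any(hat_signals) else "simple"
```

A profile with no signals falls through to `"simple"`, which matches the "when in doubt" guidance.
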
### Phase 3: Generate the Files

Generate the appropriate files into the user's project directory. Always explain what you're creating and why.

Read the appropriate reference file before generating:
- For Simple Prompt: `references/simple-prompt-reference.md`
- For Hat-Based: `references/hat-based-reference.md`

#### Files to Generate

**Both modes:**
- `ralph.yml` — Main configuration
- `PROMPT.md` — The task definition

**Hat-Based mode additionally:**
- `hats.yml` — Hat definitions with triggers, publishes, and instructions

### Phase 4: Review with the User

After generating the files, walk the user through what you created:
- Summarise the task as you understood it
- Explain the mode choice and why
- Highlight the success criteria / completion promise
- For hat-based: explain the event flow between hats
- Ask if anything needs adjusting before they run it

Then tell them how to run it:
```bash
# Simple prompt
ralph run

# Hat-based
ralph run --config hats.yml

# With iteration limit
ralph run --max-iterations 50
```

## Writing Good Prompts (PROMPT.md)

The PROMPT.md is the most important file. It must be:

**Self-contained:** Every iteration starts fresh. The prompt must contain everything the agent needs to understand the task, check progress, and continue.

**Outcome-focused:** Define WHAT, not HOW. Let the agent figure out the approach.

**Measurable:** Include concrete success criteria the agent can verify:
- "All tests pass" (not "write good tests")
- "The /users endpoint returns 200 with valid JSON" (not "make the API work")
- "TypeScript compiles with zero errors" (not "fix the types")

**Structured but not prescriptive:** Use sections like Task, Requirements, Success Criteria, Constraints. Don't write step-by-step instructions.

### Prompt Template (Simple)

```markdown
# Task: [Clear, specific title]

[2-3 sentence description of what needs to be built/done]

## Requirements

- [Specific requirement 1]
- [Specific requirement 2]
- [Specific requirement 3]

## Success Criteria

All of the following must be true:
- [ ] [Measurable criterion 1]
- [ ] [Measurable criterion 2]
- [ ] [Measurable criterion 3]

## Constraints

- [Technology constraints]
- [Style/convention constraints]
- [Performance constraints if any]

## Status

Track your progress here. Mark items complete as you go.
When all success criteria are met, print LOOP_COMPLETE.
```

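Because the template expresses success criteria as Markdown checkboxes, progress can be read back from the file itself. The sketch below is a hypothetical helper, not part of Ralph: it assumes the agent ticks boxes as `- [x]` under a `## Success Criteria` heading, and `unmet_criteria` and `EXAMPLE` are illustrative names.

```python
import re

# Matches "- [ ] text" and "- [x] text" list items.
CHECKBOX = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def unmet_criteria(prompt_md: str) -> list:
    """Return unchecked items found under the '## Success Criteria' heading."""
    unmet = []
    in_section = False
    for line in prompt_md.splitlines():
        if line.startswith("## "):
            # Entering a new second-level section; track whether it is the one we want.
            in_section = line.strip() == "## Success Criteria"
            continue
        match = CHECKBOX.match(line)
        if in_section and match and match.group(1) == " ":
            unmet.append(match.group(2).strip())
    return unmet


EXAMPLE = """# Task: Demo

## Success Criteria

- [x] All tests pass
- [ ] TypeScript compiles with zero errors

## Status

- [ ] not a criterion
"""
```

Here `unmet_criteria(EXAMPLE)` reports only the unchecked TypeScript item; checkboxes outside the Success Criteria section are ignored.
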
## Designing Hat Systems

When creating hats, follow these principles:

**Each hat should have a single responsibility.** Don't create a hat that plans AND builds.

**Events flow forward.** The event chain should be a clear pipeline: task.start → plan.ready → build.done → review.complete → task.done. A feedback edge (for example, a reviewer sending work back to the planner) is fine as long as the cycle has an exit to completion.

**Instructions should be specific to the hat's role.** The planner hat gets planning instructions, the builder gets building instructions.

**Keep it minimal.** 2-4 hats is typical. More than 5 is usually overengineered.

### Common Hat Patterns

**Plan → Build (2 hats):**
Good for tasks that need architectural thinking before coding.

**Plan → Build → Review (3 hats):**
Good for tasks that need quality assurance.

**Spec → Implement → Verify (3 hats):**
Good for spec-driven development.

**Test → Implement → Verify (3 hats):**
Good for TDD workflows.

See `references/hat-based-reference.md` for full configuration examples.

## Backpressure Configuration

Backpressure gates reject incomplete work. Common gates:

```yaml
backpressure:
  gates:
    - name: "tests"
      command: "npm test"
      on_fail: "retry"
    - name: "lint"
      command: "npm run lint"
      on_fail: "retry"
    - name: "typecheck"
      command: "npx tsc --noEmit"
      on_fail: "retry"
```

Only add gates for tools that exist in the project. If there are no tests yet, don't add a test gate (unless the task IS to create tests).

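To make the gate idea concrete, here is a minimal sketch of how a list of gates shaped like the YAML above could be evaluated. It is an assumption-laden illustration, not Ralph's own gate runner: `failed_gates` and `DEMO_GATES` are hypothetical names, and a gate is treated as passing only when its command exits with status 0.

```python
import shlex
import subprocess
import sys


def failed_gates(gates):
    """Run each gate's command and return the names of gates that did not exit 0."""
    failed = []
    for gate in gates:
        try:
            result = subprocess.run(shlex.split(gate["command"]), capture_output=True)
            ok = result.returncode == 0
        except OSError:
            # The tool is missing or not executable, so the gate cannot pass.
            ok = False
        if not ok:
            failed.append(gate["name"])
    return failed


# Demo gates built from the current Python interpreter so the sketch runs anywhere.
DEMO_GATES = [
    {"name": "passes", "command": shlex.join([sys.executable, "-c", "raise SystemExit(0)"])},
    {"name": "fails", "command": shlex.join([sys.executable, "-c", "raise SystemExit(1)"])},
    {"name": "missing-tool", "command": "definitely-not-a-real-command-xyz"},
]
```

The `missing-tool` entry shows why a gate should only be added for tooling the project really has: a gate whose command cannot run can never pass.
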
## Cost and Safety

Always configure iteration limits. Remind the user:
- Default max iterations: 100
- Default max runtime: 4 hours
- A 50-iteration cycle on a large codebase can cost $50-100+ in API credits
- Recommend starting with `--max-iterations 30` for new setups and increasing if needed
- Git checkpointing is on by default — the user can always roll back

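For a quick budget conversation, the figure above can be scaled to other iteration limits. The sketch below assumes cost grows linearly with iteration count, which is a simplification introduced here; real cost depends on codebase size, model, and how much each iteration reads and writes. `estimate_cost_range` is a hypothetical helper name.

```python
def estimate_cost_range(iterations: int, low_per_50: float = 50.0, high_per_50: float = 100.0):
    """Scale the '$50-100 per 50 iterations' figure linearly to another iteration count."""
    per_iteration_low = low_per_50 / 50
    per_iteration_high = high_per_50 / 50
    return (iterations * per_iteration_low, iterations * per_iteration_high)
```

Under that assumption, the recommended starting limit of 30 iterations works out to roughly $30 to $60 on a large codebase, and the default of 100 iterations to roughly $100 to $200.
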
@@ -0,0 +1,335 @@
# Hat-Based Reference

## Overview

Hat-based mode uses specialised personas ("hats") that coordinate through typed events. Each hat triggers on specific events and publishes new events when done, creating a pipeline of distinct phases.

Use this when the task genuinely benefits from separating concerns — e.g., planning separately from building, or reviewing separately from implementing.

## hats.yml Structure

```yaml
cli:
  backend: "claude"

event_loop:
  starting_event: "task.start"          # First event that kicks off the pipeline
  completion_promise: "LOOP_COMPLETE"   # String that signals completion
  max_iterations: 100                   # Safety limit

hats:
  hat_name:
    name: "Human-Readable Name"
    triggers: ["event.that.activates.this.hat"]
    publishes: ["event.this.hat.emits.when.done"]
    instructions: |
      Detailed instructions for what this hat should do.
      Must be self-contained — the hat gets fresh context each time.
      Should reference PROMPT.md for the overall task.
      Should specify what "done" means for this hat.
```

### Key Rules

- **triggers**: List of events that activate this hat. A hat runs when ANY of its trigger events fire.
- **publishes**: List of events this hat emits when it completes its work.
- **instructions**: The prompt for this hat. Must be specific to the hat's role.
- Events flow forward through the pipeline. Avoid circular event chains that have no exit to LOOP_COMPLETE.
- The last hat in the pipeline should print LOOP_COMPLETE when the overall task is done.

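The trigger and publish rules can be illustrated with a small routing sketch. This is not Ralph's dispatcher: `HATS` is a plain dictionary that mirrors the triggers and publishes of the Plan → Build → Review pattern in this reference, and `hats_for_event` and `unrouted_events` are hypothetical helper names.

```python
# Triggers and publishes copied from the Plan -> Build -> Review pattern.
HATS = {
    "planner": {
        "triggers": ["task.start", "review.changes_requested"],
        "publishes": ["plan.ready"],
    },
    "builder": {"triggers": ["plan.ready"], "publishes": ["build.done"]},
    "reviewer": {
        "triggers": ["build.done"],
        "publishes": ["review.approved", "review.changes_requested"],
    },
}


def hats_for_event(hats, event):
    """A hat runs when ANY of its trigger events fires."""
    return [hat_id for hat_id, hat in hats.items() if event in hat["triggers"]]


def unrouted_events(hats):
    """Published events that no hat triggers on (terminal events or possible typos)."""
    triggers = {event for hat in hats.values() for event in hat["triggers"]}
    published = {event for hat in hats.values() for event in hat["publishes"]}
    return sorted(published - triggers)
```

In this configuration `review.changes_requested` routes back to the planner, while `review.approved` is picked up by no hat. An unrouted event is acceptable when it marks the end of the pipeline, but an unexpected one usually points to a misspelled event name.
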
## Common Patterns

### Pattern 1: Plan → Build (2 Hats)

Best for tasks that need architectural thinking before coding.

```yaml
cli:
  backend: "claude"

event_loop:
  starting_event: "task.start"
  completion_promise: "LOOP_COMPLETE"

hats:
  planner:
    name: "Planner"
    triggers: ["task.start"]
    publishes: ["plan.ready"]
    instructions: |
      You are the Planner. Read PROMPT.md to understand the task.

      Your job:
      1. Analyse the requirements and existing codebase
      2. Create a clear implementation plan in .ralph/plan.md
      3. Break the work into concrete steps with file-level detail
      4. Identify any risks or unknowns

      Write the plan to .ralph/plan.md then emit plan.ready.

      Do NOT write any code. Planning only.

  builder:
    name: "Builder"
    triggers: ["plan.ready"]
    publishes: ["task.done"]
    instructions: |
      You are the Builder. Read PROMPT.md for the task and .ralph/plan.md
      for the implementation plan.

      Your job:
      1. Follow the plan step by step
      2. Write clean, tested code
      3. Run tests after each significant change
      4. Update .ralph/plan.md to mark completed steps

      When all success criteria from PROMPT.md are met and all tests pass,
      print LOOP_COMPLETE.
```

### Pattern 2: Plan → Build → Review (3 Hats)

Adds a review phase for quality assurance.

```yaml
cli:
  backend: "claude"

event_loop:
  starting_event: "task.start"
  completion_promise: "LOOP_COMPLETE"

hats:
  planner:
    name: "Planner"
    triggers: ["task.start", "review.changes_requested"]
    publishes: ["plan.ready"]
    instructions: |
      You are the Planner. Read PROMPT.md to understand the task.

      If triggered by review.changes_requested, read .ralph/review.md
      for feedback and update the plan accordingly.

      Create or update .ralph/plan.md with a clear implementation plan.
      Emit plan.ready when done. Do NOT write code.

  builder:
    name: "Builder"
    triggers: ["plan.ready"]
    publishes: ["build.done"]
    instructions: |
      You are the Builder. Read PROMPT.md and .ralph/plan.md.

      Implement the plan. Write tests. Run them.
      When implementation is complete, emit build.done.

      Do NOT assess overall quality — that's the Reviewer's job.

  reviewer:
    name: "Reviewer"
    triggers: ["build.done"]
    publishes: ["review.approved", "review.changes_requested"]
    instructions: |
      You are the Reviewer. Read PROMPT.md for requirements.

      Review the current state of the codebase against the success criteria:
      1. Do all tests pass?
      2. Are all requirements met?
      3. Is the code clean and following project conventions?
      4. Are there edge cases not covered?

      If everything passes, write your review to .ralph/review.md
      and print LOOP_COMPLETE.

      If changes are needed, write specific feedback to .ralph/review.md
      and emit review.changes_requested.
```

### Pattern 3: Spec → Implement → Verify (3 Hats)

For spec-driven development — good when working from a design document.

```yaml
cli:
  backend: "claude"

event_loop:
  starting_event: "task.start"
  completion_promise: "LOOP_COMPLETE"

hats:
  spec_writer:
    name: "Spec Writer"
    triggers: ["task.start", "verify.gaps_found"]
    publishes: ["spec.ready"]
    instructions: |
      You are the Spec Writer. Read PROMPT.md for the high-level task.

      If triggered by verify.gaps_found, read .ralph/verification.md
      for gaps and update the spec to address them.

      Write a detailed technical specification to .ralph/spec.md:
      - API contracts (endpoints, request/response shapes)
      - Data models
      - Error handling behaviour
      - Test scenarios

      Emit spec.ready when done. Do NOT write implementation code.

  implementer:
    name: "Implementer"
    triggers: ["spec.ready"]
    publishes: ["implementation.done"]
    instructions: |
      You are the Implementer. Read PROMPT.md for the overall task and
      .ralph/spec.md for the specification.

      Implement exactly what the spec describes. Write tests that verify
      each specification point. Run tests after each change.

      Emit implementation.done when the spec is fully implemented.

  verifier:
    name: "Verifier"
    triggers: ["implementation.done"]
    publishes: ["verify.passed", "verify.gaps_found"]
    instructions: |
      You are the Verifier. Read .ralph/spec.md and PROMPT.md.

      Verify that the implementation matches the spec:
      1. Run all tests — they must pass
      2. Check each spec point against the code
      3. Verify success criteria from PROMPT.md

      If everything checks out, print LOOP_COMPLETE.

      If there are gaps, write them to .ralph/verification.md
      and emit verify.gaps_found.
```

### Pattern 4: TDD — Test → Implement → Verify (3 Hats)

For test-driven development workflows.

```yaml
cli:
  backend: "claude"

event_loop:
  starting_event: "task.start"
  completion_promise: "LOOP_COMPLETE"

hats:
  test_writer:
    name: "Test Writer"
    triggers: ["task.start", "verify.tests_needed"]
    publishes: ["tests.ready"]
    instructions: |
      You are the Test Writer. Read PROMPT.md for requirements.

      Write failing tests FIRST that describe the desired behaviour.
      Tests should be comprehensive and cover edge cases.

      If triggered by verify.tests_needed, read .ralph/verification.md
      for the specific test gaps to fill.

      Write tests, verify they fail (red phase), then emit tests.ready.
      Do NOT write implementation code.

  implementer:
    name: "Implementer"
    triggers: ["tests.ready"]
    publishes: ["implementation.done"]
    instructions: |
      You are the Implementer. Read PROMPT.md for the overall task.
      Your goal is to make the tests pass.

      Read the test files to understand what behaviour is expected.
      Write the minimum code to make all tests pass (green phase).

      Run tests after each change. When all tests pass,
      emit implementation.done.

  verifier:
    name: "Verifier"
    triggers: ["implementation.done"]
    publishes: ["verify.passed", "verify.tests_needed"]
    instructions: |
      You are the Verifier. Read PROMPT.md for the full requirements.

      Check:
      1. All tests pass
      2. Test coverage is adequate for the requirements
      3. All success criteria from PROMPT.md are met
      4. Code is clean (refactor phase if needed)

      If complete, print LOOP_COMPLETE.
      If more tests are needed, write gaps to .ralph/verification.md
      and emit verify.tests_needed.
```

## Backpressure with Hats

Backpressure gates can be applied globally or per-hat:

```yaml
# Global backpressure — applies to all hats
backpressure:
  gates:
    - name: "tests"
      command: "npm test"
      on_fail: "retry"
    - name: "lint"
      command: "npm run lint"
      on_fail: "retry"

# Per-hat backpressure
hats:
  builder:
    triggers: ["plan.ready"]
    publishes: ["build.done"]
    backpressure:
      gates:
        - name: "typecheck"
          command: "npx tsc --noEmit"
          on_fail: "retry"
    instructions: |
      ...
```

## Memories

Hats can use persistent memories stored in `.ralph/agent/memories.md`. These survive across iterations and sessions:

```yaml
hats:
  builder:
    memory:
      path: ".ralph/agent/memories.md"
      scope: "hat"  # or "global" to share across hats
```

Memories are useful for capturing lessons learned, recording decisions, and avoiding repeated mistakes.

## Running Hat-Based Workflows

```bash
# Run with hats config
ralph run --config hats.yml

# With iteration limit
ralph run --config hats.yml --max-iterations 50

# Resume interrupted session
ralph run --config hats.yml --continue
```

## Anti-Patterns

**Too many hats.** If you have more than 5, you're probably overengineering. Each hat adds coordination overhead.

**Circular event chains without an exit.** Every cycle must have a path to LOOP_COMPLETE. If planner → builder → reviewer → planner, the reviewer must sometimes emit completion instead of always cycling back.

**Hats that duplicate work.** If the builder is also doing planning, your planner hat is wasted.

**Overly prescriptive hat instructions.** The instructions should say WHAT to achieve, not HOW. Let the agent figure out the approach.

**Missing the PROMPT.md reference.** Hat instructions should always tell the agent to read PROMPT.md for the overall task context. Without it, hats lose sight of the bigger picture.

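Several of these anti-patterns can be caught mechanically before a run. The following is a rough sketch of such a check, assuming the hats have already been loaded into a dictionary keyed by hat id; `lint_hats` is a hypothetical helper, not a Ralph command, and its text matching is deliberately naive.

```python
def lint_hats(hats, completion_promise="LOOP_COMPLETE", max_hats=5):
    """Return human-readable warnings for a few of the anti-patterns above."""
    warnings = []
    # Too many hats.
    if len(hats) > max_hats:
        warnings.append(
            f"{len(hats)} hats defined; more than {max_hats} is usually overengineered"
        )
    # Missing the PROMPT.md reference.
    for hat_id, hat in hats.items():
        if "PROMPT.md" not in hat.get("instructions", ""):
            warnings.append(f"{hat_id}: instructions never mention PROMPT.md")
    # No exit: at least one hat must be able to signal completion.
    if not any(completion_promise in hat.get("instructions", "") for hat in hats.values()):
        warnings.append(f"no hat can signal {completion_promise}; every cycle needs an exit")
    return warnings
```

The check only inspects instruction text, so it cannot tell whether a hat duplicates another hat's work or is overly prescriptive; those still need a human read.
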
@@ -0,0 +1,167 @@
# Simple Prompt Reference

## Overview

Traditional mode is Ralph at its simplest: a single agent loops against a PROMPT.md until it outputs LOOP_COMPLETE or hits the iteration limit. No hats, no events — just a loop.

This is the right choice for most tasks. Don't reach for hats unless you genuinely need distinct phases with different mindsets.

## ralph.yml Configuration

```yaml
cli:
  backend: "claude"  # or: kiro, gemini, codex, amp, copilot, opencode

event_loop:
  completion_promise: "LOOP_COMPLETE"
  max_iterations: 50  # Start conservative, increase if needed
```

### Backend Options

| Backend | CLI Tool | Notes |
|---------|----------|-------|
| claude | Claude Code | Recommended. Best reasoning, large context window |
| kiro | Kiro | AWS-integrated |
| gemini | Gemini CLI | Cost-effective |
| codex | Codex | OpenAI agent |
| amp | Amp | Sourcegraph agent |
| copilot | Copilot CLI | GitHub integrated |
| opencode | OpenCode | Open source |

## PROMPT.md Examples

### Example 1: Build a Feature

```markdown
# Task: Add User Authentication to Express API

Add JWT-based authentication to the existing Express.js API.

## Requirements

- POST /auth/login accepts email + password, returns JWT
- POST /auth/register creates a new user account
- Middleware protects all /users/* routes
- Tokens expire after 24 hours
- Passwords are hashed with bcrypt

## Success Criteria

All of the following must be true:
- [ ] POST /auth/register creates a user and returns 201
- [ ] POST /auth/login returns a valid JWT for correct credentials
- [ ] POST /auth/login returns 401 for incorrect credentials
- [ ] Protected routes return 401 without a valid token
- [ ] Protected routes work normally with a valid token
- [ ] All existing tests still pass
- [ ] New tests cover all auth endpoints
- [ ] TypeScript compiles with zero errors

## Constraints

- Use jsonwebtoken for JWT handling
- Use bcrypt for password hashing
- Follow existing code patterns in src/
- Do not modify existing endpoint behaviour

## Status

Track progress here. When all success criteria are met, print LOOP_COMPLETE.
```

### Example 2: Fix a Bug

```markdown
# Task: Fix Race Condition in WebSocket Handler

The WebSocket message handler has a race condition where concurrent connections
can corrupt shared state. Messages are being delivered to wrong clients.

## Current Behaviour

When 2+ clients send messages simultaneously, responses sometimes go to the
wrong client. See issue #247 for reproduction steps.

## Expected Behaviour

Each client receives only their own responses, regardless of concurrency.

## Success Criteria

- [ ] Concurrent WebSocket test passes (test/ws-concurrent.test.ts)
- [ ] Existing WebSocket tests still pass
- [ ] No shared mutable state between connection handlers
- [ ] Load test with 50 concurrent connections shows zero cross-talk

## Constraints

- Do not change the public WebSocket API
- Fix must work with the existing Redis pub/sub setup

## Status

Track progress here. When all success criteria are met, print LOOP_COMPLETE.
```

### Example 3: Write a Script

```markdown
# Task: CSV Data Migration Script

Create a Python script that migrates data from the legacy CSV format to the
new database schema.

## Requirements

- Read CSV files from data/legacy/*.csv
- Transform fields according to the mapping in docs/migration-map.md
- Insert into PostgreSQL using the existing SQLAlchemy models
- Handle duplicates by updating existing records
- Log all skipped/failed rows to migration_errors.log

## Success Criteria

- [ ] Script processes all CSV files in data/legacy/
- [ ] All valid rows are inserted or updated in the database
- [ ] Duplicate handling works correctly (update, don't duplicate)
- [ ] Error log captures all skipped rows with reasons
- [ ] Script completes without unhandled exceptions
- [ ] Unit tests cover the transformation logic

## Constraints

- Python 3.11+
- Use existing SQLAlchemy models from src/models/
- Must be idempotent (safe to run multiple times)

## Status

Track progress here. When all success criteria are met, print LOOP_COMPLETE.
```

## Running

```bash
# Basic run
ralph run

# With iteration limit
ralph run --max-iterations 30

# Resume an interrupted session
ralph run --continue

# Quiet mode (no TUI)
ralph run -q
```

## When to Upgrade to Hats

If you find the simple prompt struggling because:
- The agent keeps flip-flopping between planning and coding
- It loses track of the overall architecture while implementing details
- It writes code but never stops to review/test properly
- The task is too large for a single coherent prompt

...then consider switching to hat-based mode. But try simplifying the prompt first — often the issue is a vague prompt, not a need for hats.