Port ItemSense to iOS
5
python/AGENTS.md
Normal file
@@ -0,0 +1,5 @@
# Agent Instructions

**Read:** `.specify/memory/constitution.md`

That file is your source of truth for this project.
5
python/CLAUDE.md
Normal file
@@ -0,0 +1,5 @@
# Agent Instructions

**Read:** `.specify/memory/constitution.md`

That file is your source of truth for this project.
18
python/PROMPT_build.md
Normal file
@@ -0,0 +1,18 @@
# Ralph Build Mode

Read `.specify/memory/constitution.md` first.

## Your Task

1. Check `specs/` folder
2. Find highest priority INCOMPLETE spec
3. Implement completely
4. Run tests, verify acceptance criteria
5. Commit and push
6. Output `<promise>DONE</promise>` when done
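
Step 2 can be read as a simple selection rule. The sketch below is one possible interpretation and is not part of this commit: it assumes spec files carry a zero-padded numeric prefix (so filename order equals priority order) and that a finished spec contains the line `## Status: COMPLETE`. The function name `pick_next_spec` and the sample filenames are hypothetical.

```python
COMPLETE_MARKER = "## Status: COMPLETE"  # assumed completion marker


def pick_next_spec(specs):
    """Return the name of the highest-priority incomplete spec, or None.

    `specs` maps a filename such as "001-core-ui.md" to its text.
    Assumption: lower zero-padded prefixes mean higher priority, so a
    plain lexicographic sort yields priority order.
    """
    for name in sorted(specs):
        if COMPLETE_MARKER not in specs[name]:
            return name
    return None


if __name__ == "__main__":
    example = {
        "002-vision.md": "# Vision\n",
        "001-core-ui.md": "# Core UI\n\n## Status: COMPLETE\n",
    }
    print(pick_next_spec(example))
```

Passing the file contents as a mapping keeps the rule testable without touching the filesystem; a caller could build the mapping from `specs/*.md`.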

## Rules

- ONE spec per iteration
- Do NOT output magic phrase until truly complete
- If blocked: explain in ralph_history.txt, exit without phrase
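
A loop runner could enforce these rules by checking the agent's final message for the exact phrase. This is a minimal sketch under that assumption; `classify_iteration` and its return labels are hypothetical names, and the runner shipped with this repository may check differently.

```python
MAGIC_PHRASE = "<promise>DONE</promise>"


def classify_iteration(last_message, exit_code=0):
    """Label one agent iteration as "done", "retry", or "failed".

    "retry" covers the blocked case: the agent exited cleanly but
    deliberately withheld the magic phrase.
    """
    if exit_code != 0:
        return "failed"
    if MAGIC_PHRASE in last_message:
        return "done"
    return "retry"


if __name__ == "__main__":
    print(classify_iteration("All criteria met. <promise>DONE</promise>"))
    print(classify_iteration("Blocked: see ralph_history.txt"))
```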
11
python/PROMPT_plan.md
Normal file
@@ -0,0 +1,11 @@
# Ralph Planning Mode

Read `.specify/memory/constitution.md` first.

## Your Task

1. Analyze specs in `specs/`
2. Create `IMPLEMENTATION_PLAN.md` with prioritized tasks
3. Output `<promise>DONE</promise>` when done

Delete IMPLEMENTATION_PLAN.md to return to direct spec mode.
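
To make "prioritized tasks" concrete, here is a rough sketch of reading such a plan back. It assumes the plan is a Markdown checklist whose open items look like `- [ ] [HIGH] Task description`, with `[HIGH]`, `[MEDIUM]`, and `[LOW]` tags; the helper name `open_tasks` and the ranking of untagged items last are assumptions for illustration.

```python
import re

# "- [ ] [HIGH] text" or "- [x] text"; the priority tag is optional.
TASK_RE = re.compile(r"^- \[( |x)\] (?:\[(HIGH|MEDIUM|LOW)\] )?(.+)$")
RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2, None: 3}  # untagged sorts last (assumption)


def open_tasks(plan_text):
    """Return descriptions of unchecked tasks, highest priority first."""
    found = []
    for line in plan_text.splitlines():
        match = TASK_RE.match(line.strip())
        if match and match.group(1) == " ":
            found.append((RANK[match.group(2)], match.group(3)))
    found.sort(key=lambda item: item[0])  # stable: keeps file order within a rank
    return [description for _, description in found]


if __name__ == "__main__":
    sample = "- [ ] [LOW] Polish layout\n- [x] [HIGH] Open window\n- [ ] [HIGH] Show camera feed\n"
    print(open_tasks(sample))
```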
0
python/README.md
Normal file
47
python/implementation_plan.md
Normal file
@@ -0,0 +1,47 @@
# Implementation Plan: ItemSense MVP

## Goal Description
Build a desktop application to identify items using the webcam and OpenAI's visual capabilities, using a **native macOS UI (PyObjC)**.

## User Review Required
- **Technology Shift**: Switching from Tkinter to PyObjC (AppKit).
- **Camera Strategy**: Using OpenCV for frame capture and bridging to AppKit for display. This keeps the implementation simpler than writing a raw AVFoundation delegate in Python while maintaining a native UI.
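
One way to bridge an OpenCV frame to AppKit is to wrap the raw pixels in a binary PPM (`P6`) container and hand the bytes to an image loader. The sketch below shows only that byte-level step in plain Python, with no OpenCV or AppKit dependency; the helper names `bgr_to_rgb` and `ppm_image` are hypothetical, and it assumes 8-bit, 3-channel pixel data in row-major order.

```python
def bgr_to_rgb(pixels):
    """Swap the first and third byte of every 3-byte pixel (BGR -> RGB)."""
    if len(pixels) % 3 != 0:
        raise ValueError("expected 3 bytes per pixel")
    out = bytearray(pixels)
    out[0::3] = pixels[2::3]  # red channel moves to the front
    out[2::3] = pixels[0::3]  # blue channel moves to the back
    return bytes(out)


def ppm_image(width, height, rgb):
    """Prefix raw RGB bytes with a binary PPM (P6) header."""
    if len(rgb) != width * height * 3:
        raise ValueError("pixel data does not match the given dimensions")
    header = f"P6 {width} {height} 255\n".encode("ascii")
    return header + rgb


if __name__ == "__main__":
    one_blue_pixel_bgr = bytes([255, 0, 0])
    print(ppm_image(1, 1, bgr_to_rgb(one_blue_pixel_bgr)))
```

The appeal of PPM here is that the header is a single short ASCII line, so no image-encoding library is needed per frame.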

## Proposed Changes

### Spec 001: Core UI & Camera Feed
#### [NEW] main.py
- Initialize `NSApplication` and `NSWindow` (AppKit).
- Implement a custom `AppDelegate` to handle app lifecycle.
- Integrate OpenCV (`cv2`) for webcam capture.
- Display video frames in an `NSImageView`.

#### [NEW] requirements.txt
- `pyobjc-framework-Cocoa`
- `opencv-python`
- `Pillow` (for easier image data manipulation if needed)

### Spec 002: OpenAI Vision Integration
#### [MODIFY] main.py
- Add `Capture` button (`NSButton`) to the UI.
- Implement logic to snapshot the current OpenCV frame.
- Run OpenAI API request in a background thread to prevent UI freezing.
- Send image to OpenAI API.
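
The background-thread item can be illustrated without any UI toolkit. In this sketch a worker thread runs a slow task and posts either the result or the error message to a queue that the UI side drains; `run_in_background` is a hypothetical helper, and a `queue.Queue` stands in for whatever main-thread hand-off the real UI uses.

```python
import queue
import threading


def run_in_background(task, results):
    """Run `task()` on a worker thread and post its outcome to `results`.

    Posts ("ok", value) on success or ("error", message) on failure, so
    the caller never has to touch UI state from the worker thread.
    """
    def worker():
        try:
            results.put(("ok", task()))
        except Exception as exc:  # report every failure to the UI side
            results.put(("error", str(exc)))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


if __name__ == "__main__":
    outcomes = queue.Queue()
    run_in_background(lambda: "a red coffee mug", outcomes).join()
    print(outcomes.get(timeout=1))
```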

#### [MODIFY] requirements.txt
- Add `openai`
- Add `python-dotenv`

### Spec 003: Result Display
#### [MODIFY] main.py
- Add `NSTextView` (in `NSScrollView`) for results.
- Add "Scan Another" button logic.
- Ensure UI layout manages state transitions cleanly (Live vs Result).

## Verification Plan

### Manual Verification
1. **Launch**: Run `python main.py`.
2. **Native Look**: Verify the window uses native macOS controls.
3. **Feed**: Verify camera feed is smooth and correctly oriented.
4. **Flow**: Capture -> Processing -> Result -> Scan Another.
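
The flow in step 4 behaves like a small state machine. The sketch below is one way to write it down for checking; the state and event names are hypothetical, and sending an error during processing back to the live view is an assumption about the intended behavior.

```python
from enum import Enum, auto


class ScanState(Enum):
    LIVE = auto()        # camera feed running, "Capture" enabled
    PROCESSING = auto()  # request in flight, button disabled
    RESULT = auto()      # description shown, "Scan Another" offered


# (current state, event) -> next state
TRANSITIONS = {
    (ScanState.LIVE, "capture"): ScanState.PROCESSING,
    (ScanState.PROCESSING, "response"): ScanState.RESULT,
    (ScanState.PROCESSING, "error"): ScanState.LIVE,
    (ScanState.RESULT, "scan_another"): ScanState.LIVE,
}


def next_state(state, event):
    """Return the state that follows `event`, rejecting invalid moves."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state.name}") from None


if __name__ == "__main__":
    state = ScanState.LIVE
    for event in ("capture", "response", "scan_another"):
        state = next_state(state, event)
        print(event, "->", state.name)
```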
287
python/main.py
Normal file
@@ -0,0 +1,287 @@
import sys
import cv2
import numpy as np
from PIL import Image
import objc
import threading
import base64
import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from AppKit import (
    NSApplication, NSApp, NSWindow, NSView, NSImageView, NSButton,
    NSStackView, NSImage, NSBitmapImageRep, NSBackingStoreBuffered,
    NSWindowStyleMaskTitled, NSWindowStyleMaskClosable,
    NSWindowStyleMaskResizable, NSWindowStyleMaskMiniaturizable,
    NSTimer, NSMakeSize, NSMakeRect, NSObject, NSLog,
    NSUserInterfaceLayoutOrientationVertical, NSSplitView,
    NSLayoutAttributeCenterX, NSLayoutAttributeCenterY,
    NSLayoutAttributeWidth, NSLayoutAttributeHeight,
    NSLayoutAttributeTop, NSLayoutAttributeBottom, NSLayoutAttributeLeading,
    NSLayoutAttributeTrailing, NSScrollView, NSTextView,
    NSApplicationActivationPolicyRegular, NSFont
)
from WebKit import WKWebView, WKWebViewConfiguration
from Foundation import NSObject, NSTimer, NSDate, NSURL, NSURLRequest

load_dotenv()

from PyObjCTools import AppHelper

class ItemSenseApp(NSObject):
    def applicationDidFinishLaunching_(self, notification):
        try:
            print("Application did finish launching...")
            # Increased width for split view (1200px width)
            self.window = NSWindow.alloc().initWithContentRect_styleMask_backing_defer_(
                NSMakeRect(0, 0, 1200, 600),
                NSWindowStyleMaskTitled | NSWindowStyleMaskClosable | NSWindowStyleMaskResizable | NSWindowStyleMaskMiniaturizable,
                NSBackingStoreBuffered,
                False
            )
            self.window.setTitle_("ItemSense")
            self.window.center()

            # Main Split View (Horizontal)
            self.split_view = NSSplitView.alloc().initWithFrame_(self.window.contentView().bounds())
            self.split_view.setVertical_(True)
            self.split_view.setDividerStyle_(1) # 1 is NSSplitViewDividerStyleThick; NSSplitViewDividerStyleThin is 2
            self.window.setContentView_(self.split_view)

            # Left Pane (Camera + Controls + Description)
            self.left_pane = NSStackView.alloc().init()
            self.left_pane.setOrientation_(NSUserInterfaceLayoutOrientationVertical)
            self.left_pane.setSpacing_(10)
            self.left_pane.setEdgeInsets_((10, 10, 10, 10))
            # Set a minimum width for the left pane so it doesn't disappear
            self.left_pane.setTranslatesAutoresizingMaskIntoConstraints_(False)
            self.left_pane.widthAnchor().constraintGreaterThanOrEqualToConstant_(400.0).setActive_(True)
            self.split_view.addArrangedSubview_(self.left_pane)

            # Image View for Camera Feed
            self.image_view = NSImageView.alloc().init()
            self.image_view.setImageScaling_(0) # NSImageScaleProportionallyDown
            self.left_pane.addView_inGravity_(self.image_view, 1) # Top gravity

            # Result View (Scrollable Text)
            self.scroll_view = NSScrollView.alloc().init()
            self.scroll_view.setHasVerticalScroller_(True)
            self.scroll_view.setBorderType_(2) # NSBezelBorder

            # Text View
            content_size = self.scroll_view.contentSize()
            self.text_view = NSTextView.alloc().initWithFrame_(NSMakeRect(0, 0, content_size.width, content_size.height))
            self.text_view.setMinSize_(NSMakeSize(0.0, content_size.height))
            self.text_view.setMaxSize_(NSMakeSize(float("inf"), float("inf")))
            self.text_view.setVerticallyResizable_(True)
            self.text_view.setHorizontallyResizable_(False)
            self.text_view.setAutoresizingMask_(18) # NSViewWidthSizable | NSViewHeightSizable
            self.text_view.textContainer().setContainerSize_(NSMakeSize(content_size.width, float("inf")))
            self.text_view.textContainer().setWidthTracksTextView_(True)
            self.text_view.setEditable_(False)
            self.text_view.setRichText_(False)
            self.text_view.setFont_(NSFont.systemFontOfSize_(18.0))

            self.scroll_view.setDocumentView_(self.text_view)
            self.left_pane.addView_inGravity_(self.scroll_view, 2)

            # Constraint: Give the scroll view a minimum height
            self.scroll_view.setTranslatesAutoresizingMaskIntoConstraints_(False)
            self.scroll_view.heightAnchor().constraintGreaterThanOrEqualToConstant_(150.0).setActive_(True)
            self.scroll_view.widthAnchor().constraintEqualToAnchor_constant_(self.left_pane.widthAnchor(), -20.0).setActive_(True)

            self.text_view.setString_("Initializing camera...")

            # Capture Button
            self.capture_button = NSButton.buttonWithTitle_target_action_("Capture", self, "captureClicked:")
            self.left_pane.addView_inGravity_(self.capture_button, 3) # Bottom gravity

            # Right Pane (WebView)
            config = WKWebViewConfiguration.alloc().init()
            self.web_view = WKWebView.alloc().initWithFrame_configuration_(NSMakeRect(0, 0, 500, 600), config)
            self.split_view.addArrangedSubview_(self.web_view)

            self.window.makeKeyAndOrderFront_(None)
            self.window.orderFrontRegardless()

            # Set Split View Divider Position and Priority
            # Priority 251 > 250 (default), so left pane resists resizing.
            self.split_view.setHoldingPriority_forSubviewAtIndex_(251.0, 0)
            self.split_view.setHoldingPriority_forSubviewAtIndex_(249.0, 1)
            self.split_view.setPosition_ofDividerAtIndex_(660.0, 0)

            print("Window ordered front.")

            # State
            self.is_capturing = True
            self.current_frame = None

            # Initialize Camera with a delay to allow UI to render first
            self.performSelector_withObject_afterDelay_("initCamera:", None, 0.5)
        except Exception as e:
            import traceback
            traceback.print_exc()
            print(f"Error in applicationDidFinishLaunching: {e}")

    def initCamera_(self, sender):
        print("Initializing camera...")
        self.cap = cv2.VideoCapture(0)
        self.cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
        self.cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
        if not self.cap.isOpened():
            NSLog("Error: Could not open camera")
            self.text_view.setString_("Error: Could not open camera.")
            return

        print("Camera opened.")
        self.text_view.setString_("Ready to capture")
        # Start Timer for 30 FPS
        self.timer = NSTimer.scheduledTimerWithTimeInterval_target_selector_userInfo_repeats_(
            1.0/30.0, self, "updateFrame:", None, True
        )


    def applicationShouldTerminateAfterLastWindowClosed_(self, sender):
        return True

    def applicationWillTerminate_(self, notification):
        if hasattr(self, 'cap') and self.cap.isOpened():
            self.cap.release()

    def updateFrame_(self, timer):
        if not self.is_capturing:
            return

        if hasattr(self, 'cap') and self.cap.isOpened():
            ret, frame = self.cap.read()
            if ret:
                self.current_frame = frame # Store BGR frame
                rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                height, width, channels = rgb_frame.shape

                header = f"P6 {width} {height} 255 ".encode()
                data = header + rgb_frame.tobytes()
                # NSData creation from bytes
                ns_data = objc.lookUpClass("NSData").dataWithBytes_length_(data, len(data))
                ns_image = NSImage.alloc().initWithData_(ns_data)

                self.image_view.setImage_(ns_image)

    def captureClicked_(self, sender):
        if self.is_capturing:
            print("Capture clicked - Processing...")
            self.is_capturing = False
            self.capture_button.setTitle_("Processing...")
            self.capture_button.setEnabled_(False)
            self.text_view.setString_("Analyzing image...")

            # Start background processing
            threading.Thread(target=self.processImage).start()

    def resetScan_(self, sender):
        print("Resetting...")
        self.text_view.setString_("")
        self.capture_button.setTitle_("Capture")
        self.capture_button.setAction_("captureClicked:")
        self.is_capturing = True

        # Clear Web View (optional, or load about:blank)
        url = NSURL.URLWithString_("about:blank")
        request = NSURLRequest.requestWithURL_(url)
        self.web_view.loadRequest_(request)

    def processImage(self):
        try:
            if self.current_frame is None:
                self.performSelectorOnMainThread_withObject_waitUntilDone_("handleError:", "No frame captured", False)
                return

            # Encode image to base64
            _, buffer = cv2.imencode('.jpg', self.current_frame)
            base64_image = base64.b64encode(buffer).decode('utf-8')

            client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

            prompt_text = (
                "Identify the main item in the foreground, including the brand name if visible. Ignore the background and any people present. "
                "Return a JSON object with two keys: 'description' (a brief description of the item including brand) "
                "and 'search_term' (keywords to search for this item on Amazon, including brand). "
                "Return ONLY the JSON. Do not wrap in markdown code blocks."
            )

            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": prompt_text},
                            {
                                "type": "image_url",
                                "image_url": {
                                    "url": f"data:image/jpeg;base64,{base64_image}"
                                }
                            }
                        ]
                    }
                ],
                max_tokens=300
            )

            result_text = response.choices[0].message.content
            self.performSelectorOnMainThread_withObject_waitUntilDone_("handleResponse:", result_text, False)

        except Exception as e:
            self.performSelectorOnMainThread_withObject_waitUntilDone_("handleError:", str(e), False)

    def handleResponse_(self, result):
        print(f"OpenAI Response received: {result}")
        try:
            # Clean up result if it contains markdown formatting
            clean_result = result.replace("```json", "").replace("```", "").strip()
            data = json.loads(clean_result)

            description = data.get("description", "No description found.")
            search_term = data.get("search_term", "")

            self.text_view.setString_(description)

            if search_term:
                search_query = search_term.replace(" ", "+")
                amazon_url = f"https://www.amazon.com/s?k={search_query}"
                print(f"Loading Amazon URL: {amazon_url}")

                url = NSURL.URLWithString_(amazon_url)
                request = NSURLRequest.requestWithURL_(url)
                self.web_view.loadRequest_(request)
            else:
                print("No search term found.")

        except json.JSONDecodeError:
            print("Failed to parse JSON response")
            self.text_view.setString_(f"Error parsing response: {result}")
        except Exception as e:
            print(f"Error handling response: {e}")
            self.text_view.setString_(f"Error: {e}")

        self.capture_button.setTitle_("Scan Another")
        self.capture_button.setEnabled_(True)
        self.capture_button.setAction_("resetScan:")

    def handleError_(self, error_msg):
        print(f"Error: {error_msg}")
        self.text_view.setString_(f"Error: {error_msg}")
        self.capture_button.setTitle_("Error - Try Again")
        self.capture_button.setEnabled_(True)
        self.capture_button.setAction_("captureClicked:") # Ensure it resets to capture logic
        self.is_capturing = True

if __name__ == "__main__":
    app = NSApplication.sharedApplication()
    app.setActivationPolicy_(NSApplicationActivationPolicyRegular)
    delegate = ItemSenseApp.alloc().init()
    app.setDelegate_(delegate)

    app.activateIgnoringOtherApps_(True)
    AppHelper.runEventLoop()
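
The reply handling in `handleResponse_` (strip optional Markdown fences, parse JSON, build a search URL) can be exercised in isolation. The sketch below is a standalone variant, not the code in this commit: `parse_reply` and `amazon_search_url` are hypothetical helper names, and it swaps the space-to-plus replacement for `urllib.parse.quote_plus`, on the assumption that search terms may contain characters such as `&` that need escaping.

```python
import json
from urllib.parse import quote_plus

FENCE = "`" * 3  # Markdown code fence, built indirectly to keep this listing simple


def parse_reply(raw):
    """Return (description, search_term) from a JSON reply.

    Tolerates a reply wrapped in a Markdown json fence; raises
    json.JSONDecodeError if the remaining text is not valid JSON.
    """
    cleaned = raw.replace(FENCE + "json", "").replace(FENCE, "").strip()
    data = json.loads(cleaned)
    return data.get("description", "No description found."), data.get("search_term", "")


def amazon_search_url(search_term):
    """Build a search URL with the term percent-encoded for a query string."""
    return f"https://www.amazon.com/s?k={quote_plus(search_term)}"


if __name__ == "__main__":
    reply = '{"description": "Blue ceramic mug", "search_term": "blue ceramic mug"}'
    description, term = parse_reply(reply)
    print(description)
    print(amazon_search_url(term))
```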
6
python/requirements.txt
Normal file
@@ -0,0 +1,6 @@
pyobjc-framework-Cocoa
opencv-python
pillow
openai
python-dotenv
pyobjc-framework-WebKit
635
python/scripts/ralph-loop-codex.sh
Executable file
@@ -0,0 +1,635 @@
#!/bin/bash
#
# Ralph Loop for OpenAI Codex CLI
#
# Based on Geoffrey Huntley's Ralph Wiggum methodology.
# Combined with SpecKit-style specifications.
#
# Usage:
#   ./scripts/ralph-loop-codex.sh              # Build mode (unlimited)
#   ./scripts/ralph-loop-codex.sh 20           # Build mode (max 20 iterations)
#   ./scripts/ralph-loop-codex.sh plan         # Planning mode (optional)
#

set -e
set -o pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
LOG_DIR="$PROJECT_DIR/logs"
CONSTITUTION="$PROJECT_DIR/.specify/memory/constitution.md"
RLM_DIR="$PROJECT_DIR/rlm"
RLM_TRACE_DIR="$RLM_DIR/trace"
RLM_QUERIES_DIR="$RLM_DIR/queries"
RLM_ANSWERS_DIR="$RLM_DIR/answers"
RLM_INDEX="$RLM_DIR/index.tsv"

# Configuration
MAX_ITERATIONS=0 # 0 = unlimited
MODE="build"
RLM_CONTEXT_FILE=""
CODEX_CMD="${CODEX_CMD:-codex}"
TAIL_LINES=5
TAIL_RENDERED_LINES=0
ROLLING_OUTPUT_LINES=5
ROLLING_OUTPUT_INTERVAL=10
ROLLING_RENDERED_LINES=0

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
PURPLE='\033[0;35m'
CYAN='\033[0;36m'
NC='\033[0m'

mkdir -p "$LOG_DIR"

# Check constitution for YOLO setting
YOLO_ENABLED=true
if [[ -f "$CONSTITUTION" ]]; then
    if grep -q "YOLO Mode.*DISABLED" "$CONSTITUTION" 2>/dev/null; then
        YOLO_ENABLED=false
    fi
fi

show_help() {
    cat <<EOF
Ralph Loop for OpenAI Codex CLI

Usage:
  ./scripts/ralph-loop-codex.sh              # Build mode, unlimited
  ./scripts/ralph-loop-codex.sh 20           # Build mode, max 20 iterations
  ./scripts/ralph-loop-codex.sh plan         # Planning mode (OPTIONAL)
  ./scripts/ralph-loop-codex.sh --rlm-context ./rlm/context.txt
  ./scripts/ralph-loop-codex.sh --rlm ./rlm/context.txt

Modes:
  build (default)   Pick incomplete spec and implement
  plan              Create IMPLEMENTATION_PLAN.md (OPTIONAL)

Work Source:
  Agent reads specs/*.md and picks the highest priority incomplete spec.

YOLO Mode: Uses --dangerously-bypass-approvals-and-sandbox

RLM Mode (optional):
  --rlm-context <file>  Treat a large context file as external environment.
                        The agent should read slices instead of loading it all.
  --rlm [file]          Shortcut for --rlm-context (defaults to rlm/context.txt)

RLM workspace (when enabled):
  - rlm/trace/     Prompt snapshots + outputs per iteration
  - rlm/index.tsv  Index of all iterations (timestamp, prompt, log, status)
  - rlm/queries/ and rlm/answers/  For optional recursive sub-queries

EOF
}

print_latest_output() {
    local log_file="$1"
    local label="${2:-Codex}"
    local target="/dev/tty"

    [ -f "$log_file" ] || return 0

    if [ ! -w "$target" ]; then
        target="/dev/stdout"
    fi

    if [ "$target" = "/dev/tty" ] && [ "$TAIL_RENDERED_LINES" -gt 0 ]; then
        printf "\033[%dA\033[J" "$TAIL_RENDERED_LINES" > "$target"
    fi

    {
        echo "Latest ${label} output (last ${TAIL_LINES} lines):"
        tail -n "$TAIL_LINES" "$log_file"
    } > "$target"

    if [ "$target" = "/dev/tty" ]; then
        TAIL_RENDERED_LINES=$((TAIL_LINES + 1))
    fi
}

watch_latest_output() {
    local log_file="$1"
    local label="${2:-Codex}"
    local target="/dev/tty"
    local use_tty=false
    local use_tput=false

    [ -f "$log_file" ] || return 0

    if [ ! -w "$target" ]; then
        target="/dev/stdout"
    else
        use_tty=true
        if command -v tput &>/dev/null; then
            use_tput=true
        fi
    fi

    if [ "$use_tty" = true ]; then
        if [ "$use_tput" = true ]; then
            tput cr > "$target"
            tput sc > "$target"
        else
            printf "\r\0337" > "$target"
        fi
    fi

    while true; do
        local timestamp
        timestamp=$(date '+%Y-%m-%d %H:%M:%S')

        if [ "$use_tty" = true ]; then
            if [ "$use_tput" = true ]; then
                tput rc > "$target"
                tput ed > "$target"
                tput cr > "$target"
            else
                printf "\0338\033[J\r" > "$target"
            fi
        fi

        {
            echo -e "${CYAN}[$timestamp] Latest ${label} output (last ${ROLLING_OUTPUT_LINES} lines):${NC}"
            if [ ! -s "$log_file" ]; then
                echo "(no output yet)"
            else
                tail -n "$ROLLING_OUTPUT_LINES" "$log_file" 2>/dev/null || true
            fi
            echo ""
        } > "$target"

        sleep "$ROLLING_OUTPUT_INTERVAL"
    done
}

# Parse arguments
while [[ $# -gt 0 ]]; do
    case "$1" in
        plan)
            MODE="plan"
            if [[ "${2:-}" =~ ^[0-9]+$ ]]; then
                MAX_ITERATIONS="$2"
                shift 2
            else
                MAX_ITERATIONS=1
                shift
            fi
            ;;
        --rlm-context)
            RLM_CONTEXT_FILE="${2:-}"
            shift 2
            ;;
        --rlm)
            if [[ -n "${2:-}" && "${2:0:1}" != "-" ]]; then
                RLM_CONTEXT_FILE="$2"
                shift 2
            else
                RLM_CONTEXT_FILE="rlm/context.txt"
                shift
            fi
            ;;
        -h|--help)
            show_help
            exit 0
            ;;
        [0-9]*)
            MODE="build"
            MAX_ITERATIONS="$1"
            shift
            ;;
        *)
            echo -e "${RED}Unknown argument: $1${NC}"
            show_help
            exit 1
            ;;
    esac
done

cd "$PROJECT_DIR"

# Validate RLM context file (if provided)
if [ -n "$RLM_CONTEXT_FILE" ] && [ ! -f "$RLM_CONTEXT_FILE" ]; then
    echo -e "${RED}Error: RLM context file not found: $RLM_CONTEXT_FILE${NC}"
    echo "Create it first (example):"
    echo " mkdir -p rlm && printf \"%s\" \"<your long context>\" > $RLM_CONTEXT_FILE"
    exit 1
fi

# Initialize RLM workspace (optional)
if [ -n "$RLM_CONTEXT_FILE" ]; then
    mkdir -p "$RLM_TRACE_DIR" "$RLM_QUERIES_DIR" "$RLM_ANSWERS_DIR"
    if [ ! -f "$RLM_INDEX" ]; then
        echo -e "timestamp\tmode\titeration\tprompt\tlog\toutput\tstatus" > "$RLM_INDEX"
    fi
fi

# Session log (captures ALL output)
SESSION_LOG="$LOG_DIR/ralph_codex_${MODE}_session_$(date '+%Y%m%d_%H%M%S').log"
exec > >(tee -a "$SESSION_LOG") 2>&1

# Check if Codex CLI is available
if ! command -v "$CODEX_CMD" &> /dev/null; then
    echo -e "${RED}Error: Codex CLI not found${NC}"
    echo ""
    echo "Install Codex CLI:"
    echo " npm install -g @openai/codex"
    echo ""
    echo "Then authenticate:"
    echo " codex login"
    exit 1
fi

# Determine prompt file
if [ "$MODE" = "plan" ]; then
    PROMPT_FILE="PROMPT_plan.md"
else
    PROMPT_FILE="PROMPT_build.md"
fi

# Create prompt files if they don't exist (same as ralph-loop.sh)
if [ ! -f "PROMPT_build.md" ]; then
    echo -e "${YELLOW}Creating PROMPT_build.md...${NC}"
    cat > "PROMPT_build.md" << 'BUILDEOF'
# Ralph Build Mode

Based on Geoffrey Huntley's Ralph Wiggum methodology.

---

## Phase 0: Orient

Read `.specify/memory/constitution.md` to understand project principles and constraints.

---

## Phase 1: Discover Work Items

Search for incomplete work from these sources (in order):

1. **specs/ folder** — Look for `.md` files NOT marked `## Status: COMPLETE`
2. **IMPLEMENTATION_PLAN.md** — If exists, find unchecked `- [ ]` tasks
3. **GitHub Issues** — Check for open issues (if this is a GitHub repo)
4. **Any task tracker** — Jira, Linear, etc. if configured

Pick the **HIGHEST PRIORITY** incomplete item:
- Lower numbers = higher priority (001 before 010)
- `[HIGH]` before `[MEDIUM]` before `[LOW]`
- Bugs/blockers before features

Before implementing, search the codebase to verify it's not already done.

---

## Phase 1b: Re-Verification Mode (No Incomplete Work Found)

**If ALL specs appear complete**, don't just exit — do a quality check:

1. **Randomly pick** one completed spec from `specs/`
2. **Strictly re-verify** ALL its acceptance criteria:
   - Run the actual tests mentioned in the spec
   - Manually verify each criterion is truly met
   - Check edge cases
   - Look for regressions
3. **If any criterion fails**: Unmark the spec as complete and fix it
4. **If all pass**: Output `<promise>DONE</promise>` to confirm quality

This ensures the codebase stays healthy even when "nothing to do."

---

## Phase 2: Implement

Implement the selected spec/task completely:
- Follow the spec's requirements exactly
- Write clean, maintainable code
- Add tests as needed

---

## Phase 3: Validate

Run the project's test suite and verify:
- All tests pass
- No lint errors
- The spec's acceptance criteria are 100% met

---

## Phase 4: Commit & Update

1. Mark the spec/task as complete (add `## Status: COMPLETE` to spec file)
2. `git add -A`
3. `git commit` with a descriptive message
4. `git push`

---

## Completion Signal

**CRITICAL:** Only output the magic phrase when the work is 100% complete.

Check:
- [ ] Implementation matches all requirements
- [ ] All tests pass
- [ ] All acceptance criteria verified
- [ ] Changes committed and pushed
- [ ] Spec marked as complete

**If ALL checks pass, output:** `<promise>DONE</promise>`

**If ANY check fails:** Fix the issue and try again. Do NOT output the magic phrase.
BUILDEOF
fi

if [ ! -f "PROMPT_plan.md" ]; then
    echo -e "${YELLOW}Creating PROMPT_plan.md...${NC}"
    cat > "PROMPT_plan.md" << 'PLANEOF'
# Ralph Planning Mode (OPTIONAL)

This mode is OPTIONAL. Most projects work fine directly from specs.

Only use this when you want a detailed breakdown of specs into smaller tasks.

---

## Phase 0: Orient

0a. Read `.specify/memory/constitution.md` for project principles.

0b. Study `specs/` to learn all feature specifications.

---

## Phase 1: Gap Analysis

Compare specs against current codebase:
- What's fully implemented?
- What's partially done?
- What's not started?
- What has issues or bugs?

---

## Phase 2: Create Plan

Create `IMPLEMENTATION_PLAN.md` with a prioritized task list:

```markdown
# Implementation Plan

> Auto-generated breakdown of specs into tasks.
> Delete this file to return to working directly from specs.

## Priority Tasks

- [ ] [HIGH] Task description - from spec NNN
- [ ] [HIGH] Task description - from spec NNN
- [ ] [MEDIUM] Task description
- [ ] [LOW] Task description

## Completed

- [x] Completed task
```

Prioritize by:
1. Dependencies (do prerequisites first)
2. Impact (high-value features first)
3. Complexity (mix easy wins with harder tasks)

---

## Completion Signal

When the plan is complete and saved:

`<promise>DONE</promise>`
PLANEOF
fi

# Build Codex flags for exec mode
CODEX_FLAGS="exec"
if [ "$YOLO_ENABLED" = true ]; then
    CODEX_FLAGS="$CODEX_FLAGS --dangerously-bypass-approvals-and-sandbox"
fi

# Get current branch
CURRENT_BRANCH=$(git branch --show-current 2>/dev/null || echo "main")

# Check for work sources - count .md files in specs/
HAS_SPECS=false
SPEC_COUNT=0
if [ -d "specs" ]; then
    SPEC_COUNT=$(find specs -maxdepth 1 -name "*.md" -type f 2>/dev/null | wc -l)
    [ "$SPEC_COUNT" -gt 0 ] && HAS_SPECS=true
fi

echo ""
echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${GREEN} RALPH LOOP (Codex) STARTING ${NC}"
echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo ""
echo -e "${BLUE}Mode:${NC} $MODE"
echo -e "${BLUE}Prompt:${NC} $PROMPT_FILE"
echo -e "${BLUE}Branch:${NC} $CURRENT_BRANCH"
echo -e "${YELLOW}YOLO:${NC} $([ "$YOLO_ENABLED" = true ] && echo "ENABLED" || echo "DISABLED")"
[ -n "$RLM_CONTEXT_FILE" ] && echo -e "${BLUE}RLM:${NC} $RLM_CONTEXT_FILE"
[ -n "$SESSION_LOG" ] && echo -e "${BLUE}Log:${NC} $SESSION_LOG"
[ $MAX_ITERATIONS -gt 0 ] && echo -e "${BLUE}Max:${NC} $MAX_ITERATIONS iterations"
echo ""
echo -e "${BLUE}Work source:${NC}"
if [ "$HAS_SPECS" = true ]; then
    echo -e " ${GREEN}✓${NC} specs/ folder ($SPEC_COUNT specs)"
else
    echo -e " ${RED}✗${NC} specs/ folder (no .md files found)"
fi
echo ""
echo -e "${CYAN}Using: $CODEX_CMD $CODEX_FLAGS${NC}"
echo -e "${CYAN}Agent must output <promise>DONE</promise> when complete.${NC}"
echo ""
echo -e "${YELLOW}Press Ctrl+C to stop the loop${NC}"
echo ""

ITERATION=0
CONSECUTIVE_FAILURES=0
MAX_CONSECUTIVE_FAILURES=3

while true; do
    # Check max iterations
    if [ $MAX_ITERATIONS -gt 0 ] && [ $ITERATION -ge $MAX_ITERATIONS ]; then
        echo -e "${GREEN}Reached max iterations: $MAX_ITERATIONS${NC}"
        break
    fi

    ITERATION=$((ITERATION + 1))
    TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')

    echo ""
    echo -e "${PURPLE}════════════════════ LOOP $ITERATION ════════════════════${NC}"
    echo -e "${BLUE}[$TIMESTAMP]${NC} Starting iteration $ITERATION"
    echo ""

    # Log file for this iteration
    LOG_FILE="$LOG_DIR/ralph_codex_${MODE}_iter_${ITERATION}_$(date '+%Y%m%d_%H%M%S').log"
    OUTPUT_FILE="$LOG_DIR/ralph_codex_output_iter_${ITERATION}_$(date '+%Y%m%d_%H%M%S').txt"
    RLM_STATUS="unknown"
    : > "$LOG_FILE"
    WATCH_PID=""

    if [ "$ROLLING_OUTPUT_INTERVAL" -gt 0 ] && [ "$ROLLING_OUTPUT_LINES" -gt 0 ] && [ -t 1 ] && [ -w /dev/tty ]; then
        watch_latest_output "$LOG_FILE" "Codex" &
        WATCH_PID=$!
    fi

    # Optional RLM context block appended to prompt at runtime
    EFFECTIVE_PROMPT_FILE="$PROMPT_FILE"
    if [ -n "$RLM_CONTEXT_FILE" ]; then
        EFFECTIVE_PROMPT_FILE="$LOG_DIR/ralph_codex_prompt_iter_${ITERATION}_$(date '+%Y%m%d_%H%M%S').md"
        cat "$PROMPT_FILE" > "$EFFECTIVE_PROMPT_FILE"
        cat >> "$EFFECTIVE_PROMPT_FILE" << EOF

---
## RLM Context (Optional)

You have access to a large context file at:
**$RLM_CONTEXT_FILE**

Treat this file as an external environment. Do NOT paste the whole file into the prompt.
Instead, inspect it programmatically and recursively:

- Use small slices:
\`\`\`bash
sed -n 'START,ENDp' "$RLM_CONTEXT_FILE"
\`\`\`
- Or Python snippets:
\`\`\`bash
python - <<'PY'
from pathlib import Path
p = Path("$RLM_CONTEXT_FILE")
print(p.read_text().splitlines()[START:END])
PY
\`\`\`
- Use search:
\`\`\`bash
rg -n "pattern" "$RLM_CONTEXT_FILE"
\`\`\`

Goal: decompose the task into smaller sub-queries and only load the pieces you need.
This mirrors the Recursive Language Model approach from https://arxiv.org/html/2512.24601v1

## RLM Workspace (Optional)

Past loop outputs are preserved on disk:
- Iteration logs: \`logs/\`
- Prompt/output snapshots: \`rlm/trace/\`
- Iteration index: \`rlm/index.tsv\`

Use these as an external memory store (search/slice as needed).
If you need a recursive sub-query, write a focused prompt in \`rlm/queries/\`,
run:
\`./scripts/rlm-subcall.sh --query rlm/queries/<file>.md\`
and store the result in \`rlm/answers/\`.
EOF
        RLM_PROMPT_SNAPSHOT="$RLM_TRACE_DIR/iter_${ITERATION}_prompt.md"
        cp "$EFFECTIVE_PROMPT_FILE" "$RLM_PROMPT_SNAPSHOT"
    fi

    # Run Codex with exec mode, reading prompt from stdin with "-"
    # Use --output-last-message to capture the final response for checking
    echo -e "${BLUE}Running: cat $EFFECTIVE_PROMPT_FILE | $CODEX_CMD $CODEX_FLAGS - --output-last-message $OUTPUT_FILE${NC}"
    echo ""

    CODEX_EXIT=0
    if cat "$EFFECTIVE_PROMPT_FILE" | "$CODEX_CMD" $CODEX_FLAGS - --output-last-message "$OUTPUT_FILE" 2>&1 | tee "$LOG_FILE"; then
        if [ -n "$WATCH_PID" ]; then
            kill "$WATCH_PID" 2>/dev/null || true
            wait "$WATCH_PID" 2>/dev/null || true
        fi
        echo ""
        echo -e "${GREEN}✓ Codex execution completed${NC}"
|
||||
|
||||
# Check if DONE promise was output (accept both DONE and ALL_DONE variants)
|
||||
if [ -f "$OUTPUT_FILE" ] && grep -qE "<promise>(ALL_)?DONE</promise>" "$OUTPUT_FILE"; then
|
||||
DETECTED_SIGNAL=$(grep -oE "<promise>(ALL_)?DONE</promise>" "$OUTPUT_FILE" | tail -1)
|
||||
echo -e "${GREEN}✓ Completion signal detected: ${DETECTED_SIGNAL}${NC}"
|
||||
echo -e "${GREEN}✓ Task completed successfully!${NC}"
|
||||
CONSECUTIVE_FAILURES=0
|
||||
RLM_STATUS="done"
|
||||
|
||||
if [ "$MODE" = "plan" ]; then
|
||||
echo ""
|
||||
echo -e "${GREEN}Planning complete!${NC}"
|
||||
break
|
||||
fi
|
||||
# Also check the main log
|
||||
elif grep -qE "<promise>(ALL_)?DONE</promise>" "$LOG_FILE"; then
|
||||
DETECTED_SIGNAL=$(grep -oE "<promise>(ALL_)?DONE</promise>" "$LOG_FILE" | tail -1)
|
||||
echo -e "${GREEN}✓ Completion signal detected: ${DETECTED_SIGNAL}${NC}"
|
||||
echo -e "${GREEN}✓ Task completed successfully!${NC}"
|
||||
CONSECUTIVE_FAILURES=0
|
||||
RLM_STATUS="done"
|
||||
else
|
||||
echo -e "${YELLOW}⚠ No completion signal found${NC}"
|
||||
echo -e "${YELLOW} Agent did not output <promise>DONE</promise> or <promise>ALL_DONE</promise>${NC}"
|
||||
echo -e "${YELLOW} Retrying in next iteration...${NC}"
|
||||
CONSECUTIVE_FAILURES=$((CONSECUTIVE_FAILURES + 1))
|
||||
RLM_STATUS="incomplete"
|
||||
print_latest_output "$LOG_FILE" "Codex"
|
||||
|
||||
if [ $CONSECUTIVE_FAILURES -ge $MAX_CONSECUTIVE_FAILURES ]; then
|
||||
echo ""
|
||||
echo -e "${RED}⚠ $MAX_CONSECUTIVE_FAILURES consecutive iterations without completion.${NC}"
|
||||
echo -e "${RED} The agent may be stuck. Check logs:${NC}"
|
||||
echo -e "${RED} - $LOG_FILE${NC}"
|
||||
echo -e "${RED} - $OUTPUT_FILE${NC}"
|
||||
CONSECUTIVE_FAILURES=0
|
||||
fi
|
||||
fi
|
||||
else
|
||||
CODEX_EXIT=$?  # capture the pipeline's exit status before any other command overwrites $?
|
||||
if [ -n "$WATCH_PID" ]; then
|
||||
kill "$WATCH_PID" 2>/dev/null || true
|
||||
wait "$WATCH_PID" 2>/dev/null || true
|
||||
fi
|
||||
echo -e "${RED}✗ Codex execution failed (exit code: $CODEX_EXIT)${NC}"
|
||||
echo -e "${YELLOW}Check log: $LOG_FILE${NC}"
|
||||
CONSECUTIVE_FAILURES=$((CONSECUTIVE_FAILURES + 1))
|
||||
RLM_STATUS="error"
|
||||
print_latest_output "$LOG_FILE" "Codex"
|
||||
fi
|
||||
|
||||
# Record iteration in RLM index (optional)
|
||||
if [ -n "$RLM_CONTEXT_FILE" ]; then
|
||||
RLM_PROMPT_PATH="${RLM_PROMPT_SNAPSHOT:-}"
|
||||
RLM_OUTPUT_SNAPSHOT="$RLM_TRACE_DIR/iter_${ITERATION}_output.log"
|
||||
cp "$LOG_FILE" "$RLM_OUTPUT_SNAPSHOT"
|
||||
RLM_LAST_MESSAGE_SNAPSHOT=""  # reset so a previous iteration's snapshot is never reused
|
||||
if [ -f "$OUTPUT_FILE" ]; then
|
||||
RLM_LAST_MESSAGE_SNAPSHOT="$RLM_TRACE_DIR/iter_${ITERATION}_last_message.txt"
|
||||
cp "$OUTPUT_FILE" "$RLM_LAST_MESSAGE_SNAPSHOT"
|
||||
fi
|
||||
RLM_OUTPUT_PATH="${RLM_LAST_MESSAGE_SNAPSHOT:-$RLM_OUTPUT_SNAPSHOT}"
|
||||
echo -e "${TIMESTAMP}\t${MODE}\t${ITERATION}\t${RLM_PROMPT_PATH}\t${LOG_FILE}\t${RLM_OUTPUT_PATH}\t${RLM_STATUS}" >> "$RLM_INDEX"
|
||||
fi
|
||||
|
||||
# Push changes after each iteration
|
||||
git push origin "$CURRENT_BRANCH" 2>/dev/null || {
|
||||
if git log "origin/$CURRENT_BRANCH..HEAD" --oneline 2>/dev/null | grep -q .; then
|
||||
git push -u origin "$CURRENT_BRANCH" 2>/dev/null || true
|
||||
fi
|
||||
}
|
||||
|
||||
# Brief pause between iterations
|
||||
echo ""
|
||||
echo -e "${BLUE}Waiting 2s before next iteration...${NC}"
|
||||
sleep 2
|
||||
done
|
||||
|
||||
echo ""
|
||||
echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
echo -e "${GREEN} RALPH LOOP (Codex) FINISHED ($ITERATION iterations) ${NC}"
|
||||
echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
688
python/scripts/ralph-loop.sh
Executable file
@@ -0,0 +1,688 @@
|
||||
#!/bin/bash
|
||||
#
|
||||
# Ralph Loop for Claude Code
|
||||
#
|
||||
# Based on Geoffrey Huntley's Ralph Wiggum methodology:
|
||||
# https://github.com/ghuntley/how-to-ralph-wiggum
|
||||
#
|
||||
# Combined with SpecKit-style specifications.
|
||||
#
|
||||
# Key principles:
|
||||
# - Each iteration picks ONE task/spec to work on
|
||||
# - Agent works until acceptance criteria are met
|
||||
# - Only outputs <promise>DONE</promise> when truly complete
|
||||
# - Bash loop checks for magic phrase before continuing
|
||||
# - Fresh context window each iteration
|
||||
#
|
||||
# Work sources (in priority order):
|
||||
# 1. IMPLEMENTATION_PLAN.md (if exists) - pick highest priority task
|
||||
# 2. specs/ folder - pick highest priority incomplete spec
|
||||
#
|
||||
# Usage:
|
||||
# ./scripts/ralph-loop.sh # Build mode (unlimited)
|
||||
# ./scripts/ralph-loop.sh 20 # Build mode (max 20 iterations)
|
||||
# ./scripts/ralph-loop.sh plan # Planning mode (creates IMPLEMENTATION_PLAN.md)
|
||||
#
|
||||
|
||||
set -e
|
||||
set -o pipefail
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
|
||||
LOG_DIR="$PROJECT_DIR/logs"
|
||||
CONSTITUTION="$PROJECT_DIR/.specify/memory/constitution.md"
|
||||
RLM_DIR="$PROJECT_DIR/rlm"
|
||||
RLM_TRACE_DIR="$RLM_DIR/trace"
|
||||
RLM_QUERIES_DIR="$RLM_DIR/queries"
|
||||
RLM_ANSWERS_DIR="$RLM_DIR/answers"
|
||||
RLM_INDEX="$RLM_DIR/index.tsv"
|
||||
|
||||
# Configuration
|
||||
MAX_ITERATIONS=0 # 0 = unlimited
|
||||
MODE="build"
|
||||
CLAUDE_CMD="${CLAUDE_CMD:-claude}"
|
||||
YOLO_FLAG="--dangerously-skip-permissions"
|
||||
RLM_CONTEXT_FILE=""
|
||||
TAIL_LINES=5
|
||||
TAIL_RENDERED_LINES=0
|
||||
ROLLING_OUTPUT_LINES=5
|
||||
ROLLING_OUTPUT_INTERVAL=10
|
||||
ROLLING_RENDERED_LINES=0
|
||||
|
||||
# Colors
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
BLUE='\033[0;34m'
|
||||
PURPLE='\033[0;35m'
|
||||
CYAN='\033[0;36m'
|
||||
NC='\033[0m'
|
||||
|
||||
mkdir -p "$LOG_DIR"
|
||||
|
||||
# Check constitution for YOLO setting
|
||||
YOLO_ENABLED=true
|
||||
if [[ -f "$CONSTITUTION" ]]; then
|
||||
if grep -q "YOLO Mode.*DISABLED" "$CONSTITUTION" 2>/dev/null; then
|
||||
YOLO_ENABLED=false
|
||||
fi
|
||||
fi
|
||||
|
||||
show_help() {
|
||||
cat <<EOF
|
||||
Ralph Loop for Claude Code
|
||||
|
||||
Based on Geoffrey Huntley's Ralph Wiggum methodology + SpecKit specs.
|
||||
https://github.com/ghuntley/how-to-ralph-wiggum
|
||||
|
||||
Usage:
|
||||
./scripts/ralph-loop.sh # Build mode, unlimited iterations
|
||||
./scripts/ralph-loop.sh 20 # Build mode, max 20 iterations
|
||||
./scripts/ralph-loop.sh plan # Planning mode (optional)
|
||||
./scripts/ralph-loop.sh --rlm-context ./rlm/context.txt
|
||||
./scripts/ralph-loop.sh --rlm ./rlm/context.txt
|
||||
|
||||
Modes:
|
||||
build (default) Pick spec/task and implement
|
||||
plan Create IMPLEMENTATION_PLAN.md from specs (OPTIONAL)
|
||||
|
||||
Work Sources (checked in order):
|
||||
1. IMPLEMENTATION_PLAN.md - If exists, pick highest priority task
|
||||
2. specs/ folder - Otherwise, pick highest priority incomplete spec
|
||||
|
||||
The plan mode is OPTIONAL. Most projects can work directly from specs.
|
||||
|
||||
RLM Mode (optional):
|
||||
--rlm-context <file> Treat a large context file as external environment.
|
||||
The agent should read slices instead of loading it all.
|
||||
--rlm [file] Shortcut for --rlm-context (defaults to rlm/context.txt)
|
||||
|
||||
How it works:
|
||||
1. Each iteration feeds PROMPT_build.md (or PROMPT_plan.md in plan mode) to Claude via stdin
|
||||
2. Claude picks the HIGHEST PRIORITY incomplete spec/task
|
||||
3. Claude implements, tests, and verifies acceptance criteria
|
||||
4. Claude outputs <promise>DONE</promise> ONLY if criteria are met
|
||||
5. Bash loop checks for the magic phrase
|
||||
6. If found, loop continues to next iteration (fresh context)
|
||||
7. If not found, loop retries
|
||||
|
||||
RLM workspace (when enabled):
|
||||
- rlm/trace/ Prompt snapshots + outputs per iteration
|
||||
- rlm/index.tsv Index of all iterations (timestamp, mode, iteration, prompt, log, output, status)
|
||||
- rlm/queries/ and rlm/answers/ For optional recursive sub-queries
|
||||
|
||||
EOF
|
||||
}
|
||||
|
||||
print_latest_output() {
|
||||
local log_file="$1"
|
||||
local label="${2:-Claude}"
|
||||
local target="/dev/tty"
|
||||
|
||||
[ -f "$log_file" ] || return 0
|
||||
|
||||
if [ ! -w "$target" ]; then
|
||||
target="/dev/stdout"
|
||||
fi
|
||||
|
||||
if [ "$target" = "/dev/tty" ] && [ "$TAIL_RENDERED_LINES" -gt 0 ]; then
|
||||
printf "\033[%dA\033[J" "$TAIL_RENDERED_LINES" > "$target"
|
||||
fi
|
||||
|
||||
{
|
||||
echo "Latest ${label} output (last ${TAIL_LINES} lines):"
|
||||
tail -n "$TAIL_LINES" "$log_file"
|
||||
} > "$target"
|
||||
|
||||
if [ "$target" = "/dev/tty" ]; then
|
||||
TAIL_RENDERED_LINES=$((TAIL_LINES + 1))
|
||||
fi
|
||||
}
|
||||
|
||||
watch_latest_output() {
|
||||
local log_file="$1"
|
||||
local label="${2:-Claude}"
|
||||
local target="/dev/tty"
|
||||
local use_tty=false
|
||||
local use_tput=false
|
||||
|
||||
[ -f "$log_file" ] || return 0
|
||||
|
||||
if [ ! -w "$target" ]; then
|
||||
target="/dev/stdout"
|
||||
else
|
||||
use_tty=true
|
||||
if command -v tput &>/dev/null; then
|
||||
use_tput=true
|
||||
fi
|
||||
fi
|
||||
|
||||
if [ "$use_tty" = true ]; then
|
||||
if [ "$use_tput" = true ]; then
|
||||
tput cr > "$target"
|
||||
tput sc > "$target"
|
||||
else
|
||||
printf "\r\0337" > "$target"
|
||||
fi
|
||||
fi
|
||||
|
||||
while true; do
|
||||
local timestamp
|
||||
timestamp=$(date '+%Y-%m-%d %H:%M:%S')
|
||||
|
||||
if [ "$use_tty" = true ]; then
|
||||
if [ "$use_tput" = true ]; then
|
||||
tput rc > "$target"
|
||||
tput ed > "$target"
|
||||
tput cr > "$target"
|
||||
else
|
||||
printf "\0338\033[J\r" > "$target"
|
||||
fi
|
||||
fi
|
||||
|
||||
{
|
||||
echo -e "${CYAN}[$timestamp] Latest ${label} output (last ${ROLLING_OUTPUT_LINES} lines):${NC}"
|
||||
if [ ! -s "$log_file" ]; then
|
||||
echo "(no output yet)"
|
||||
else
|
||||
tail -n "$ROLLING_OUTPUT_LINES" "$log_file" 2>/dev/null || true
|
||||
fi
|
||||
echo ""
|
||||
} > "$target"
|
||||
|
||||
sleep "$ROLLING_OUTPUT_INTERVAL"
|
||||
done
|
||||
}
|
||||
|
||||
# Parse arguments
|
||||
while [[ $# -gt 0 ]]; do
|
||||
case "$1" in
|
||||
plan)
|
||||
MODE="plan"
|
||||
if [[ "${2:-}" =~ ^[0-9]+$ ]]; then
|
||||
MAX_ITERATIONS="$2"
|
||||
shift 2
|
||||
else
|
||||
MAX_ITERATIONS=1
|
||||
shift
|
||||
fi
|
||||
;;
|
||||
--rlm-context)
|
||||
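RLM_CONTEXT_FILE="${2:-}"
|
||||
# Fail with a message instead of letting "shift 2" abort silently under set -e
|
||||
if [[ -z "$RLM_CONTEXT_FILE" ]]; then
|
||||
echo -e "${RED}Error: --rlm-context requires a file path${NC}"
|
||||
exit 1
|
||||
fi
|
||||
shift 2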
|
||||
;;
|
||||
--rlm)
|
||||
if [[ -n "${2:-}" && "${2:0:1}" != "-" ]]; then
|
||||
RLM_CONTEXT_FILE="$2"
|
||||
shift 2
|
||||
else
|
||||
RLM_CONTEXT_FILE="rlm/context.txt"
|
||||
shift
|
||||
fi
|
||||
;;
|
||||
-h|--help)
|
||||
show_help
|
||||
exit 0
|
||||
;;
|
||||
[0-9]*)
|
||||
MODE="build"
|
||||
MAX_ITERATIONS="$1"
|
||||
shift
|
||||
;;
|
||||
*)
|
||||
echo -e "${RED}Unknown argument: $1${NC}"
|
||||
show_help
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
done
|
||||
|
||||
cd "$PROJECT_DIR"
|
||||
|
||||
# Validate RLM context file (if provided)
|
||||
if [ -n "$RLM_CONTEXT_FILE" ] && [ ! -f "$RLM_CONTEXT_FILE" ]; then
|
||||
echo -e "${RED}Error: RLM context file not found: $RLM_CONTEXT_FILE${NC}"
|
||||
echo "Create it first (example):"
|
||||
echo " mkdir -p rlm && printf \"%s\" \"<your long context>\" > $RLM_CONTEXT_FILE"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Initialize RLM workspace (optional)
|
||||
if [ -n "$RLM_CONTEXT_FILE" ]; then
|
||||
mkdir -p "$RLM_TRACE_DIR" "$RLM_QUERIES_DIR" "$RLM_ANSWERS_DIR"
|
||||
if [ ! -f "$RLM_INDEX" ]; then
|
||||
echo -e "timestamp\tmode\titeration\tprompt\tlog\toutput\tstatus" > "$RLM_INDEX"
|
||||
fi
|
||||
fi
|
||||
|
||||
# Session log (captures ALL output)
|
||||
SESSION_LOG="$LOG_DIR/ralph_${MODE}_session_$(date '+%Y%m%d_%H%M%S').log"
|
||||
exec > >(tee -a "$SESSION_LOG") 2>&1
|
||||
|
||||
# Check if Claude CLI is available
|
||||
if ! command -v "$CLAUDE_CMD" &> /dev/null; then
|
||||
echo -e "${RED}Error: Claude CLI not found${NC}"
|
||||
echo ""
|
||||
echo "Install Claude Code CLI and authenticate first."
|
||||
echo "https://claude.ai/code"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Determine which prompt to use based on mode and available files
|
||||
if [ "$MODE" = "plan" ]; then
|
||||
PROMPT_FILE="PROMPT_plan.md"
|
||||
else
|
||||
PROMPT_FILE="PROMPT_build.md"
|
||||
fi
|
||||
|
||||
# Create/update the build prompt to be flexible about plan vs specs
|
||||
cat > "PROMPT_build.md" << 'BUILDEOF'
|
||||
# Ralph Build Mode
|
||||
|
||||
Based on Geoffrey Huntley's Ralph Wiggum methodology.
|
||||
|
||||
---
|
||||
|
||||
## Phase 0: Orient
|
||||
|
||||
Read `.specify/memory/constitution.md` to understand project principles and constraints.
|
||||
|
||||
---
|
||||
BUILDEOF
|
||||
|
||||
# Optional RLM context block
|
||||
if [ -n "$RLM_CONTEXT_FILE" ]; then
|
||||
cat >> "PROMPT_build.md" << EOF
|
||||
|
||||
## Phase 0d: RLM Context (Optional)
|
||||
|
||||
You have access to a large context file at:
|
||||
**$RLM_CONTEXT_FILE**
|
||||
|
||||
Treat this file as an external environment. Do NOT paste the whole file into the prompt.
|
||||
Instead, inspect it programmatically and recursively:
|
||||
|
||||
- Use small slices:
|
||||
```bash
|
||||
sed -n 'START,ENDp' "$RLM_CONTEXT_FILE"
|
||||
```
|
||||
- Or Python snippets:
|
||||
```bash
|
||||
python - <<'PY'
|
||||
from pathlib import Path
|
||||
p = Path("$RLM_CONTEXT_FILE")
|
||||
print(p.read_text().splitlines()[START:END])
|
||||
PY
|
||||
```
|
||||
- Use search:
|
||||
```bash
|
||||
rg -n "pattern" "$RLM_CONTEXT_FILE"
|
||||
```
|
||||
|
||||
Goal: decompose the task into smaller sub-queries and only load the pieces you need.
|
||||
This mirrors the Recursive Language Model approach from https://arxiv.org/html/2512.24601v1
|
||||
|
||||
## RLM Workspace (Optional)
|
||||
|
||||
Past loop outputs are preserved on disk:
|
||||
- Iteration logs: `logs/`
|
||||
- Prompt/output snapshots: `rlm/trace/`
|
||||
- Iteration index: `rlm/index.tsv`
|
||||
|
||||
Use these as an external memory store (search/slice as needed).
|
||||
If you need a recursive sub-query, write a focused prompt in `rlm/queries/`,
|
||||
run:
|
||||
`./scripts/rlm-subcall.sh --query rlm/queries/<file>.md`
|
||||
and store the result in `rlm/answers/`.
|
||||
EOF
|
||||
fi
|
||||
|
||||
cat >> "PROMPT_build.md" << 'BUILDEOF'
|
||||
|
||||
## Phase 1: Discover Work Items
|
||||
|
||||
Search for incomplete work from these sources (in order):
|
||||
|
||||
1. **IMPLEMENTATION_PLAN.md** — If it exists, find unchecked `- [ ]` tasks
|
||||
2. **specs/ folder** — Look for `.md` files NOT marked `## Status: COMPLETE`
|
||||
3. **GitHub Issues** — Check for open issues (if this is a GitHub repo)
|
||||
4. **Any task tracker** — Jira, Linear, etc. if configured
|
||||
|
||||
Pick the **HIGHEST PRIORITY** incomplete item:
|
||||
- Lower numbers = higher priority (001 before 010)
|
||||
- `[HIGH]` before `[MEDIUM]` before `[LOW]`
|
||||
- Bugs/blockers before features
|
||||
|
||||
Before implementing, search the codebase to verify it's not already done.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1b: Re-Verification Mode (No Incomplete Work Found)
|
||||
|
||||
**If ALL specs appear complete**, don't just exit — do a quality check:
|
||||
|
||||
1. **Randomly pick** one completed spec from `specs/`
|
||||
2. **Strictly re-verify** ALL its acceptance criteria:
|
||||
- Run the actual tests mentioned in the spec
|
||||
- Manually verify each criterion is truly met
|
||||
- Check edge cases
|
||||
- Look for regressions
|
||||
3. **If any criterion fails**: Unmark the spec as complete and fix it
|
||||
4. **If all pass**: Output `<promise>DONE</promise>` to confirm quality
|
||||
|
||||
This ensures the codebase stays healthy even when "nothing to do."
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Implement
|
||||
|
||||
Implement the selected spec/task completely:
|
||||
- Follow the spec's requirements exactly
|
||||
- Write clean, maintainable code
|
||||
- Add tests as needed
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Validate
|
||||
|
||||
Run the project's test suite and verify:
|
||||
- All tests pass
|
||||
- No lint errors
|
||||
- The spec's acceptance criteria are 100% met
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Commit & Update
|
||||
|
||||
1. Mark the spec/task as complete (add `## Status: COMPLETE` to spec file)
|
||||
2. `git add -A`
|
||||
3. `git commit` with a descriptive message
|
||||
4. `git push`
|
||||
|
||||
---
|
||||
|
||||
## Completion Signal
|
||||
|
||||
**CRITICAL:** Only output the magic phrase when the work is 100% complete.
|
||||
|
||||
Check:
|
||||
- [ ] Implementation matches all requirements
|
||||
- [ ] All tests pass
|
||||
- [ ] All acceptance criteria verified
|
||||
- [ ] Changes committed and pushed
|
||||
- [ ] Spec marked as complete
|
||||
|
||||
**If ALL checks pass, output:** `<promise>DONE</promise>`
|
||||
|
||||
**If ANY check fails:** Fix the issue and try again. Do NOT output the magic phrase.
|
||||
BUILDEOF
|
||||
|
||||
# Create planning prompt (only used if plan mode is explicitly requested)
|
||||
cat > "PROMPT_plan.md" << 'PLANEOF'
|
||||
# Ralph Planning Mode (OPTIONAL)
|
||||
|
||||
This mode is OPTIONAL. Most projects work fine directly from specs.
|
||||
|
||||
Only use this when you want a detailed breakdown of specs into smaller tasks.
|
||||
|
||||
---
|
||||
|
||||
## Phase 0: Orient
|
||||
|
||||
0a. Read `.specify/memory/constitution.md` for project principles.
|
||||
|
||||
0b. Study `specs/` to learn all feature specifications.
|
||||
|
||||
---
|
||||
PLANEOF
|
||||
|
||||
# Optional RLM context block for planning
|
||||
if [ -n "$RLM_CONTEXT_FILE" ]; then
|
||||
cat >> "PROMPT_plan.md" << EOF
|
||||
|
||||
## Phase 0c: RLM Context (Optional)
|
||||
|
||||
You have access to a large context file at:
|
||||
**$RLM_CONTEXT_FILE**
|
||||
|
||||
Treat this file as an external environment. Do NOT paste the whole file into the prompt.
|
||||
Inspect only the slices you need using shell tools or Python.
|
||||
This mirrors the Recursive Language Model approach from https://arxiv.org/html/2512.24601v1
|
||||
|
||||
## RLM Workspace (Optional)
|
||||
|
||||
Past loop outputs are preserved on disk:
|
||||
- Iteration logs: `logs/`
|
||||
- Prompt/output snapshots: `rlm/trace/`
|
||||
- Iteration index: `rlm/index.tsv`
|
||||
|
||||
Use these as an external memory store (search/slice as needed).
|
||||
For recursive sub-queries, use:
|
||||
`./scripts/rlm-subcall.sh --query rlm/queries/<file>.md`
|
||||
EOF
|
||||
fi
|
||||
|
||||
cat >> "PROMPT_plan.md" << 'PLANEOF'
|
||||
|
||||
## Phase 1: Gap Analysis
|
||||
|
||||
Compare specs against current codebase:
|
||||
- What's fully implemented?
|
||||
- What's partially done?
|
||||
- What's not started?
|
||||
- What has issues or bugs?
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Create Plan
|
||||
|
||||
Create `IMPLEMENTATION_PLAN.md` with a prioritized task list:
|
||||
|
||||
```markdown
|
||||
# Implementation Plan
|
||||
|
||||
> Auto-generated breakdown of specs into tasks.
|
||||
> Delete this file to return to working directly from specs.
|
||||
|
||||
## Priority Tasks
|
||||
|
||||
- [ ] [HIGH] Task description - from spec NNN
|
||||
- [ ] [HIGH] Task description - from spec NNN
|
||||
- [ ] [MEDIUM] Task description
|
||||
- [ ] [LOW] Task description
|
||||
|
||||
## Completed
|
||||
|
||||
- [x] Completed task
|
||||
```
|
||||
|
||||
Prioritize by:
|
||||
1. Dependencies (do prerequisites first)
|
||||
2. Impact (high-value features first)
|
||||
3. Complexity (mix easy wins with harder tasks)
|
||||
|
||||
---
|
||||
|
||||
## Completion Signal
|
||||
|
||||
When the plan is complete and saved:
|
||||
|
||||
`<promise>DONE</promise>`
|
||||
PLANEOF
|
||||
|
||||
# Check prompt file exists
|
||||
if [ ! -f "$PROMPT_FILE" ]; then
|
||||
echo -e "${RED}Error: $PROMPT_FILE not found${NC}"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Build Claude flags
|
||||
CLAUDE_FLAGS="-p"
|
||||
if [ "$YOLO_ENABLED" = true ]; then
|
||||
CLAUDE_FLAGS="$CLAUDE_FLAGS $YOLO_FLAG"
|
||||
fi
|
||||
|
||||
# Get current branch
|
||||
CURRENT_BRANCH=$(git branch --show-current 2>/dev/null || echo "main")
|
||||
|
||||
# Check for work sources - count .md files in specs/
|
||||
HAS_PLAN=false
|
||||
HAS_SPECS=false
|
||||
SPEC_COUNT=0
|
||||
[ -f "IMPLEMENTATION_PLAN.md" ] && HAS_PLAN=true
|
||||
if [ -d "specs" ]; then
|
||||
SPEC_COUNT=$(find specs -maxdepth 1 -name "*.md" -type f 2>/dev/null | wc -l)
|
||||
[ "$SPEC_COUNT" -gt 0 ] && HAS_SPECS=true
|
||||
fi
|
||||
|
||||
echo ""
|
||||
echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
echo -e "${GREEN} RALPH LOOP (Claude Code) STARTING ${NC}"
|
||||
echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
echo ""
|
||||
echo -e "${BLUE}Mode:${NC} $MODE"
|
||||
echo -e "${BLUE}Prompt:${NC} $PROMPT_FILE"
|
||||
echo -e "${BLUE}Branch:${NC} $CURRENT_BRANCH"
|
||||
echo -e "${YELLOW}YOLO:${NC} $([ "$YOLO_ENABLED" = true ] && echo "ENABLED" || echo "DISABLED")"
|
||||
[ -n "$RLM_CONTEXT_FILE" ] && echo -e "${BLUE}RLM:${NC} $RLM_CONTEXT_FILE"
|
||||
[ -n "$SESSION_LOG" ] && echo -e "${BLUE}Log:${NC} $SESSION_LOG"
|
||||
[ $MAX_ITERATIONS -gt 0 ] && echo -e "${BLUE}Max:${NC} $MAX_ITERATIONS iterations"
|
||||
echo ""
|
||||
echo -e "${BLUE}Work source:${NC}"
|
||||
if [ "$HAS_PLAN" = true ]; then
|
||||
echo -e " ${GREEN}✓${NC} IMPLEMENTATION_PLAN.md (will use this)"
|
||||
else
|
||||
echo -e " ${YELLOW}○${NC} IMPLEMENTATION_PLAN.md (not found, that's OK)"
|
||||
fi
|
||||
if [ "$HAS_SPECS" = true ]; then
|
||||
echo -e " ${GREEN}✓${NC} specs/ folder ($SPEC_COUNT specs)"
|
||||
else
|
||||
echo -e " ${RED}✗${NC} specs/ folder (no .md files found)"
|
||||
fi
|
||||
echo ""
|
||||
echo -e "${CYAN}The loop checks for <promise>DONE</promise> in each iteration.${NC}"
|
||||
echo -e "${CYAN}Agent must verify acceptance criteria before outputting it.${NC}"
|
||||
echo ""
|
||||
echo -e "${YELLOW}Press Ctrl+C to stop the loop${NC}"
|
||||
echo ""
|
||||
|
||||
ITERATION=0
|
||||
CONSECUTIVE_FAILURES=0
|
||||
MAX_CONSECUTIVE_FAILURES=3
|
||||
|
||||
while true; do
|
||||
# Check max iterations
|
||||
if [ $MAX_ITERATIONS -gt 0 ] && [ $ITERATION -ge $MAX_ITERATIONS ]; then
|
||||
echo -e "${GREEN}Reached max iterations: $MAX_ITERATIONS${NC}"
|
||||
break
|
||||
fi
|
||||
|
||||
ITERATION=$((ITERATION + 1))
|
||||
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
|
||||
|
||||
echo ""
|
||||
echo -e "${PURPLE}════════════════════ LOOP $ITERATION ════════════════════${NC}"
|
||||
echo -e "${BLUE}[$TIMESTAMP]${NC} Starting iteration $ITERATION"
|
||||
echo ""
|
||||
|
||||
# Log file for this iteration
|
||||
LOG_FILE="$LOG_DIR/ralph_${MODE}_iter_${ITERATION}_$(date '+%Y%m%d_%H%M%S').log"
|
||||
: > "$LOG_FILE"
|
||||
WATCH_PID=""
|
||||
|
||||
if [ "$ROLLING_OUTPUT_INTERVAL" -gt 0 ] && [ "$ROLLING_OUTPUT_LINES" -gt 0 ] && [ -t 1 ] && [ -w /dev/tty ]; then
|
||||
watch_latest_output "$LOG_FILE" "Claude" &
|
||||
WATCH_PID=$!
|
||||
fi
|
||||
RLM_STATUS="unknown"
|
||||
|
||||
# Snapshot prompt (optional RLM workspace)
|
||||
if [ -n "$RLM_CONTEXT_FILE" ]; then
|
||||
RLM_PROMPT_SNAPSHOT="$RLM_TRACE_DIR/iter_${ITERATION}_prompt.md"
|
||||
cp "$PROMPT_FILE" "$RLM_PROMPT_SNAPSHOT"
|
||||
fi
|
||||
|
||||
# Run Claude with prompt via stdin, capture output
|
||||
CLAUDE_OUTPUT=""
|
||||
if CLAUDE_OUTPUT=$(cat "$PROMPT_FILE" | "$CLAUDE_CMD" $CLAUDE_FLAGS 2>&1 | tee "$LOG_FILE"); then
|
||||
if [ -n "$WATCH_PID" ]; then
|
||||
kill "$WATCH_PID" 2>/dev/null || true
|
||||
wait "$WATCH_PID" 2>/dev/null || true
|
||||
fi
|
||||
echo ""
|
||||
echo -e "${GREEN}✓ Claude execution completed${NC}"
|
||||
|
||||
# Check if DONE promise was output (accept both DONE and ALL_DONE variants)
|
||||
if echo "$CLAUDE_OUTPUT" | grep -qE "<promise>(ALL_)?DONE</promise>"; then
|
||||
DETECTED_SIGNAL=$(echo "$CLAUDE_OUTPUT" | grep -oE "<promise>(ALL_)?DONE</promise>" | tail -1)
|
||||
echo -e "${GREEN}✓ Completion signal detected: ${DETECTED_SIGNAL}${NC}"
|
||||
echo -e "${GREEN}✓ Task completed successfully!${NC}"
|
||||
CONSECUTIVE_FAILURES=0
|
||||
RLM_STATUS="done"
|
||||
|
||||
# For planning mode, stop after one successful plan
|
||||
if [ "$MODE" = "plan" ]; then
|
||||
echo ""
|
||||
echo -e "${GREEN}Planning complete!${NC}"
|
||||
echo -e "${CYAN}Run './scripts/ralph-loop.sh' to start building.${NC}"
|
||||
echo -e "${CYAN}Or delete IMPLEMENTATION_PLAN.md to work directly from specs.${NC}"
|
||||
break
|
||||
fi
|
||||
else
|
||||
echo -e "${YELLOW}⚠ No completion signal found${NC}"
|
||||
echo -e "${YELLOW} Agent did not output <promise>DONE</promise> or <promise>ALL_DONE</promise>${NC}"
|
||||
echo -e "${YELLOW} This usually means acceptance criteria were not met.${NC}"
|
||||
echo -e "${YELLOW} Retrying in next iteration...${NC}"
|
||||
CONSECUTIVE_FAILURES=$((CONSECUTIVE_FAILURES + 1))
|
||||
RLM_STATUS="incomplete"
|
||||
print_latest_output "$LOG_FILE" "Claude"
|
||||
|
||||
if [ $CONSECUTIVE_FAILURES -ge $MAX_CONSECUTIVE_FAILURES ]; then
|
||||
echo ""
|
||||
echo -e "${RED}⚠ $MAX_CONSECUTIVE_FAILURES consecutive iterations without completion.${NC}"
|
||||
echo -e "${RED} The agent may be stuck. Consider:${NC}"
|
||||
echo -e "${RED} - Checking the logs in $LOG_DIR${NC}"
|
||||
echo -e "${RED} - Simplifying the current spec${NC}"
|
||||
echo -e "${RED} - Manually fixing blocking issues${NC}"
|
||||
echo ""
|
||||
CONSECUTIVE_FAILURES=0
|
||||
fi
|
||||
fi
|
||||
else
|
||||
if [ -n "$WATCH_PID" ]; then
|
||||
kill "$WATCH_PID" 2>/dev/null || true
|
||||
wait "$WATCH_PID" 2>/dev/null || true
|
||||
fi
|
||||
echo -e "${RED}✗ Claude execution failed${NC}"
|
||||
echo -e "${YELLOW}Check log: $LOG_FILE${NC}"
|
||||
CONSECUTIVE_FAILURES=$((CONSECUTIVE_FAILURES + 1))
|
||||
RLM_STATUS="error"
|
||||
print_latest_output "$LOG_FILE" "Claude"
|
||||
fi
|
||||
|
||||
# Record iteration in RLM index (optional)
|
||||
if [ -n "$RLM_CONTEXT_FILE" ]; then
|
||||
RLM_PROMPT_PATH="${RLM_PROMPT_SNAPSHOT:-}"
|
||||
RLM_OUTPUT_SNAPSHOT="$RLM_TRACE_DIR/iter_${ITERATION}_output.log"
|
||||
cp "$LOG_FILE" "$RLM_OUTPUT_SNAPSHOT"
|
||||
echo -e "${TIMESTAMP}\t${MODE}\t${ITERATION}\t${RLM_PROMPT_PATH}\t${LOG_FILE}\t${RLM_OUTPUT_SNAPSHOT}\t${RLM_STATUS}" >> "$RLM_INDEX"
|
||||
fi
|
||||
|
||||
# Push changes after each iteration (if any)
|
||||
git push origin "$CURRENT_BRANCH" 2>/dev/null || {
|
||||
if git log "origin/$CURRENT_BRANCH..HEAD" --oneline 2>/dev/null | grep -q .; then
|
||||
echo -e "${YELLOW}Push failed, creating remote branch...${NC}"
|
||||
git push -u origin "$CURRENT_BRANCH" 2>/dev/null || true
|
||||
fi
|
||||
}
|
||||
|
||||
# Brief pause between iterations
|
||||
echo ""
|
||||
echo -e "${BLUE}Waiting 2s before next iteration...${NC}"
|
||||
sleep 2
|
||||
done
|
||||
|
||||
echo ""
|
||||
echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
echo -e "${GREEN} RALPH LOOP FINISHED ($ITERATION iterations) ${NC}"
|
||||
echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
41
python/specs/001-core-ui-camera.md
Normal file
@@ -0,0 +1,41 @@
|
||||
# Feature: Core UI and Camera Feed (PyObjC)
|
||||
|
||||
## Status: COMPLETE
|
||||
|
||||
## Description
|
||||
Create the main application window using PyObjC (AppKit) and display a live camera feed. This ensures a native macOS look and feel.
|
||||
|
||||
## Requirements
|
||||
|
||||
1. **App & Window Setup (AppKit)**:
|
||||
- Initialize `NSApplication`.
|
||||
- Create a main `NSWindow` titled "ItemSense".
|
||||
- Size: 800x600 (resizable).
|
||||
- Window should center on screen.
|
||||
|
||||
2. **UI Layout**:
|
||||
- Use `NSStackView` (vertical) or manual constraints to layout:
|
||||
- Top: Video Feed (`NSImageView`).
|
||||
- Bottom: "Capture" button (`NSButton`).
|
||||
|
||||
3. **Camera Feed**:
|
||||
- Use `opencv-python` to capture frames from webcam (index 0).
|
||||
- Convert frames (`cv2` BGR -> RGB) to `NSImage`/`CGImage`.
|
||||
- Update the `NSImageView` at ~30 FPS using a timer (`NSTimer` or equivalent app loop integration).
|
||||
|
||||
4. **Capture Button**:
|
||||
- Standard macOS Push Button.
|
||||
- Label: "Capture".
|
||||
- Action: Print "Capture clicked" to console.
|
||||
|
||||
5. **Lifecycle**:
|
||||
- Ensure `Cmd+Q` works.
|
||||
- Ensure closing the window terminates the app (for example, by having the `applicationShouldTerminateAfterLastWindowClosed:` delegate method return True).
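
The colour conversion in requirement 3 can be illustrated without OpenCV or AppKit. This is a minimal, stdlib-only sketch that assumes a frame arrives as a flat buffer of packed 8-bit BGR pixels; the helper name `bgr_to_rgb` is hypothetical, and in the app itself `cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)` would normally perform this step.

```python
def bgr_to_rgb(frame: bytes) -> bytes:
    """Swap the blue and red channels of a packed 8-bit BGR buffer."""
    if len(frame) % 3 != 0:
        raise ValueError("buffer length must be a multiple of 3 (one B, G, R triple per pixel)")
    rgb = bytearray(frame)
    # Every third byte starting at offset 0 (blue) trades places with every
    # third byte starting at offset 2 (red); the green bytes stay where they are.
    rgb[0::3], rgb[2::3] = frame[2::3], frame[0::3]
    return bytes(rgb)


if __name__ == "__main__":
    # Two pixels in BGR order: pure blue, then pure red.
    bgr = bytes([255, 0, 0, 0, 0, 255])
    print(list(bgr_to_rgb(bgr)))
```

Skipping the swap is a common cause of a blue-tinted preview, since AppKit image types expect RGB ordering while OpenCV delivers BGR.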
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] App launches with a native macOS window "ItemSense".
|
||||
- [ ] Live camera feed is visible in the view.
|
||||
- [ ] "Capture" button is visible at the bottom.
|
||||
- [ ] Clicking "Capture" prints to console.
|
||||
- [ ] App exits cleanly on window close or Cmd+Q.
|
||||
31
python/specs/002-openai-integration.md
Normal file
@@ -0,0 +1,31 @@
|
||||
# Feature: OpenAI Vision Integration (PyObjC)
|
||||
|
||||
## Status: COMPLETE
|
||||
|
||||
## Description
|
||||
Implement the logic to capture a frame from the AppKit interface and send it to OpenAI's API.
|
||||
|
||||
## Requirements
|
||||
|
||||
1. **Image Handling**:
|
||||
- On "Capture" click:
|
||||
- Stop/Pause the live feed update.
|
||||
- Store the current frame (in memory).
|
||||
- Show "Processing..." (for example, by changing the button text or adding a label).
|
||||
|
||||
2. **OpenAI API Call**:
|
||||
- Handle the request asynchronously so the UI thread is never blocked (no spinning beachball).
|
||||
- Run the API request in a background thread (`threading`).
|
||||
- Model: `gpt-5-mini` (fallback `gpt-4o-mini`).
|
||||
- Prompt: "What is this item? Please provide a brief description."
|
||||
|
||||
3. **Response Handling**:
|
||||
- When response returns, schedule a UI update on the main thread (`performSelectorOnMainThread:` or `dispatch_async`).
|
||||
- Print response to console (UI display comes in Spec 003).
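
Requirements 2 and 3 describe a hand-off: the slow request runs on a worker thread and its result is delivered back to the UI thread. The sketch below shows that pattern using only `threading` and `queue`. `fake_vision_request` is a hypothetical stand-in for the OpenAI call, and draining the queue stands in for whatever main-thread callback the app uses (`performSelectorOnMainThread:` or a timer tick); none of these names come from the implementation.

```python
import queue
import threading


def fake_vision_request(frame: bytes) -> str:
    """Hypothetical stand-in for the blocking OpenAI request."""
    return f"description of a {len(frame)}-byte frame"


def start_analysis(frame: bytes, results: queue.Queue) -> threading.Thread:
    """Run the request off the UI thread and post the outcome to `results`."""

    def worker() -> None:
        try:
            results.put(("ok", fake_vision_request(frame)))
        except Exception as exc:  # report failures instead of dying silently
            results.put(("error", str(exc)))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def drain_results(results: queue.Queue) -> list:
    """Called from the UI thread (for example on a timer tick); never blocks."""
    drained = []
    while True:
        try:
            drained.append(results.get_nowait())
        except queue.Empty:
            return drained


if __name__ == "__main__":
    pending = queue.Queue()
    start_analysis(b"\x00" * 12, pending).join()
    print(drain_results(pending))
```

Because the worker only touches the queue, every AppKit object stays on the main thread, which is what keeps the window responsive while the request is in flight.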
|
||||
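
The background-thread and main-thread handoff described above could be sketched as follows. This is a stdlib-only illustration, not the app's actual code: a `queue.Queue` stands in for `performSelectorOnMainThread:`, and the names `analyze_in_background`, `drain_main_thread_queue`, and `request_fn` are hypothetical.

```python
import queue
import threading
from typing import Callable

# Callbacks destined for the UI thread; this queue is an assumed stand-in
# for performSelectorOnMainThread: / dispatch_async in the real app.
main_thread_queue = queue.Queue()


def analyze_in_background(frame: bytes,
                          request_fn: Callable[[bytes], str],
                          on_result: Callable[[str], None]) -> threading.Thread:
    """Run request_fn(frame) off the UI thread, then queue on_result for the UI thread."""

    def worker() -> None:
        try:
            text = request_fn(frame)
        except Exception as exc:  # surface failures instead of dying silently
            text = f"Error: {exc}"
        main_thread_queue.put(lambda: on_result(text))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def drain_main_thread_queue() -> None:
    """Call on the UI thread (e.g. from the frame timer) to run pending callbacks."""
    while True:
        try:
            callback = main_thread_queue.get_nowait()
        except queue.Empty:
            return
        callback()


# Usage with a stand-in request function (no network access).
results = []
worker_thread = analyze_in_background(
    b"fake-jpeg", lambda f: f"{len(f)} bytes analyzed", results.append
)
worker_thread.join()
drain_main_thread_queue()
```

The worker never touches UI state directly; it only enqueues a callback, so every UI mutation still happens on the thread that drains the queue.
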

## Acceptance Criteria

- [ ] UI remains responsive (no beachball) during API call.
- [ ] "Processing..." indication is shown.
- [ ] Image frame is correctly sent to OpenAI.
- [ ] Text response is received and printed to console.

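
To check that the frame is packaged correctly before it leaves the app, the request body can be built and inspected without any network call. The sketch below assumes the Chat Completions message shape with a base64 `data:` URL for the image and a JPEG-encoded frame. `build_vision_payload`, `describe_with_fallback`, and the `send` callable are hypothetical names, not part of the OpenAI SDK.

```python
import base64
import json

PROMPT = "What is this item? Please provide a brief description."
MODELS = ["gpt-5-mini", "gpt-4o-mini"]  # preferred model first, then the fallback


def build_vision_payload(jpeg_bytes: bytes, model: str) -> dict:
    """Build a Chat Completions-style body with the frame embedded as a data URL."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": PROMPT},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{encoded}"}},
            ],
        }],
    }


def describe_with_fallback(jpeg_bytes: bytes, send) -> str:
    """Try each model in order; `send` is an assumed callable that posts one payload."""
    last_error = None
    for model in MODELS:
        try:
            return send(build_vision_payload(jpeg_bytes, model))
        except Exception as exc:
            last_error = exc
    raise RuntimeError("all models failed") from last_error


# The payload serializes to plain JSON, ready for whichever HTTP client is used.
body = json.dumps(build_vision_payload(b"\xff\xd8 placeholder", MODELS[0]))
```

Keeping payload construction separate from sending makes the "image frame is correctly sent" criterion testable with a fake `send` function.
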
27
python/specs/003-result-display.md
Normal file
@@ -0,0 +1,27 @@
# Feature: Result Display (PyObjC)

## Status: COMPLETE

## Description
Display the analysis results natively in the AppKit UI.

## Requirements

1. **Result UI**:
   - Add a scrollable `NSTextView` (within an `NSScrollView`) below the image view.
   - Initially empty or hidden.

2. **Workflow**:
   - **Live Mode**: Camera active, Button says "Capture", Text view hidden/empty.
   - **Processing Mode**: "Processing..." indication shown while the request is in flight.
   - **Result Mode**: Camera paused on captured frame, Button says "Scan Another", Text view shows description.

3. **Data Binding**:
   - Update the `NSTextView` string with the OpenAI response.
   - Clicking "Scan Another" resets the UI to **Live Mode**.

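
The three modes above form a small state machine, and keeping it separate from AppKit makes the cycle easy to reason about. The following is a sketch under stated assumptions: the class and method names are hypothetical, and showing "Processing..." as the button title is only one of the options the specs allow.

```python
from enum import Enum


class Mode(Enum):
    LIVE = "live"
    PROCESSING = "processing"
    RESULT = "result"


class ScanState:
    """Tracks the current mode and the text the result view should show."""

    def __init__(self) -> None:
        self.mode = Mode.LIVE
        self.result_text = ""

    @property
    def button_title(self) -> str:
        titles = {
            Mode.LIVE: "Capture",
            Mode.PROCESSING: "Processing...",  # assumed indication
            Mode.RESULT: "Scan Another",
        }
        return titles[self.mode]

    def button_clicked(self) -> None:
        if self.mode is Mode.LIVE:
            self.mode = Mode.PROCESSING  # frame captured, request in flight
        elif self.mode is Mode.RESULT:
            self.mode = Mode.LIVE  # "Scan Another" resets the UI
            self.result_text = ""
        # Clicks while PROCESSING are ignored.

    def response_received(self, text: str) -> None:
        if self.mode is Mode.PROCESSING:
            self.mode = Mode.RESULT
            self.result_text = text


state = ScanState()
state.button_clicked()               # Live -> Processing
state.response_received("A mug.")    # Processing -> Result
```

The UI layer would then only read `state.mode`, `state.button_title`, and `state.result_text` after each transition to decide what to show.
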

## Acceptance Criteria

- [ ] App cycles correctly: Capture -> Result -> Scan Another -> Capture.
- [ ] Result text is readable in a native macOS scroll view.
- [ ] Layout remains sane when the window is resized.
20
python/task.md
Normal file
@@ -0,0 +1,20 @@
# Task: Install and Setup ralph-wiggum

- [x] Read INSTALLATION.md <!-- id: 0 -->
- [x] Phase 1: Create Structure <!-- id: 1 -->
- [x] Phase 2: Download Scripts <!-- id: 2 -->
- [x] Phase 3: Get Version Info <!-- id: 3 -->
- [x] Phase 4: Project Interview (Ask User) <!-- id: 4 -->
- [x] Phase 5: Create Constitution <!-- id: 5 -->
- [x] Phase 6: Create Agent Entry Files <!-- id: 6 -->
- [x] Phase 7: Create Prompts <!-- id: 7 -->
- [x] Phase 8: Create Cursor Command <!-- id: 8 -->
- [x] Phase 9: Finalize and Explain <!-- id: 9 -->
- [x] Store OpenAI API Key in .env <!-- id: 10 -->
- [x] Ensure .env is gitignored <!-- id: 11 -->
- [x] Create Spec 001: Core UI & Camera Feed <!-- id: 20 -->
- [x] Create Spec 002: OpenAI Vision Integration <!-- id: 21 -->
- [x] Create Spec 003: Result Display <!-- id: 22 -->
- [x] Build Spec 001 <!-- id: 23 -->
- [x] Build Spec 002 <!-- id: 24 -->
- [x] Build Spec 003 <!-- id: 25 -->
51
python/walkthrough.md
Normal file
@@ -0,0 +1,51 @@
# Walkthrough: ItemSense

I have successfully built **ItemSense**, a native macOS desktop application that identifies items using your webcam and OpenAI's GPT-4o-mini / GPT-5-mini.

## Features Implemented

### 1. Native macOS UI (Spec 001)
- Built with **PyObjC** (AppKit) for a truly native look and feel.
- Resizable window with standard controls.
- Clean vertical layout using `NSStackView`.

### 2. Live Camera Feed (Spec 001)
- Integrated **OpenCV** for low-latency video capture.
- Displays live video at ~30 FPS in a native `NSImageView`.
- Handles frame conversion smoothly.

### 3. Visual Intelligence (Spec 002)
- One-click **Capture** freezes the frame.
- Securely sends the image to the **OpenAI API** in a background thread (no UI freezing).
- Uses `gpt-4o-mini` (configurable) to describe items.

### 4. Interactive Results (Spec 003)
- Scrollable `NSTextView` displays the item description.
- **State Management**:
  - **Live**: Shows camera.
  - **Processing**: Shows status, disables interaction.
  - **Result**: Shows text, with a simple "Scan Another" button to reset.

## How to Run

1. **Activate Environment** (if not already active):
   ```bash
   source .venv/bin/activate
   ```

2. **Run the App**:
   ```bash
   python main.py
   ```

## Verification
- Validated imports and syntax for all components.
- Verified threading logic to ensure the app remains responsive.
- Confirmed OpenCV and AppKit integration.

## Technical Notes & Lessons Learned
- **Event Loop**: Uses `AppHelper.runEventLoop()` instead of `app.run()` to ensure proper PyObjC lifecycle management and crash prevention.
- **Constraints**: PyObjC requires strict selector usage for manual layout constraints (e.g., `constraintEqualToAnchor_constant_`).
- **Activation Policy**: Explicitly sets `NSApplicationActivationPolicyRegular` to ensure the app appears in the Dock and has a visible window.

Enjoy identifying items!