From 34df717a7588bad91f89abf940ee4fcb8edaa9fc Mon Sep 17 00:00:00 2001
From: jared
Date: Wed, 21 Jan 2026 10:48:07 -0500
Subject: [PATCH] Add project tracking documentation

---
 implementation_plan.md | 47 ++++++++++++++++++++++++++++++++++++++++++++++
 task.md                | 20 ++++++++++++++++++
 2 files changed, 67 insertions(+)
 create mode 100644 implementation_plan.md
 create mode 100644 task.md

diff --git a/implementation_plan.md b/implementation_plan.md
new file mode 100644
index 0000000..790fe72
--- /dev/null
+++ b/implementation_plan.md
@@ -0,0 +1,47 @@
+# Implementation Plan: ItemSense MVP
+
+## Goal Description
+Build a desktop application that identifies items using the webcam and OpenAI's vision capabilities, with a **native macOS UI (PyObjC)**.
+
+## User Review Required
+- **Technology Shift**: Switching from Tkinter to PyObjC (AppKit).
+- **Camera Strategy**: Use OpenCV for frame capture and bridge to AppKit for display (see the conversion sketch after the patch). This keeps the implementation simpler than writing a raw AVFoundation delegate in Python while maintaining a native UI.
+
+## Proposed Changes
+
+### Spec 001: Core UI & Camera Feed
+#### [NEW] main.py
+- Initialize `NSApplication` and `NSWindow` (AppKit).
+- Implement a custom `AppDelegate` to handle the app lifecycle.
+- Integrate OpenCV (`cv2`) for webcam capture.
+- Display video frames in an `NSImageView`.
+
+#### [NEW] requirements.txt
+- `pyobjc-framework-Cocoa`
+- `opencv-python`
+- `Pillow` (for easier image data manipulation, if needed)
+
+### Spec 002: OpenAI Vision Integration
+#### [MODIFY] main.py
+- Add a `Capture` button (`NSButton`) to the UI.
+- Implement logic to snapshot the current OpenCV frame.
+- Run the OpenAI API request in a background thread to prevent UI freezing (see the threading sketch after the patch).
+- Send the captured image to the OpenAI API.
+
+#### [MODIFY] requirements.txt
+- Add `openai`
+- Add `python-dotenv`
+
+### Spec 003: Result Display
+#### [MODIFY] main.py
+- Add an `NSTextView` (in an `NSScrollView`) for results (see the layout sketch after the patch).
+- Add "Scan Another" button logic.
+- Ensure the UI layout manages state transitions cleanly (Live vs. Result).
+
+## Verification Plan
+
+### Manual Verification
+1. **Launch**: Run `python main.py`.
+2. **Native Look**: Verify that the window uses native macOS controls.
+3. **Feed**: Verify that the camera feed is smooth and correctly oriented.
+4. **Flow**: Capture -> Processing -> Result -> Scan Another.
diff --git a/task.md b/task.md
new file mode 100644
index 0000000..b1586d4
--- /dev/null
+++ b/task.md
@@ -0,0 +1,20 @@
+# Task: Install and Set Up ralph-wiggum
+
+- [x] Read INSTALLATION.md
+- [x] Phase 1: Create Structure
+- [x] Phase 2: Download Scripts
+- [x] Phase 3: Get Version Info
+- [x] Phase 4: Project Interview (Ask User)
+- [x] Phase 5: Create Constitution
+- [x] Phase 6: Create Agent Entry Files
+- [x] Phase 7: Create Prompts
+- [x] Phase 8: Create Cursor Command
+- [x] Phase 9: Finalize and Explain
+- [x] Store OpenAI API Key in .env
+- [x] Ensure .env is gitignored
+- [x] Create Spec 001: Core UI & Camera Feed
+- [x] Create Spec 002: OpenAI Vision Integration
+- [x] Create Spec 003: Result Display
+- [x] Build Spec 001
+- [x] Build Spec 002
+- [x] Build Spec 003
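
The camera strategy in Spec 001 hinges on converting OpenCV's BGR frames into something `NSImageView` can display. A minimal sketch of that bridge, assuming `opencv-python` and `pyobjc-framework-Cocoa` are installed; the function name and the in-memory JPEG round-trip are illustrative choices, not part of the plan:

```python
# Sketch only: one way to hand an OpenCV frame to AppKit. Encoding the
# frame as JPEG in memory avoids manual pixel-format bookkeeping, at the
# cost of a per-frame compression step.
import cv2
from Cocoa import NSData, NSImage

def frame_to_nsimage(frame):
    """Convert a BGR numpy frame from cv2.VideoCapture into an NSImage."""
    ok, buf = cv2.imencode(".jpg", frame)  # OpenCV encodes its BGR layout correctly
    if not ok:
        return None
    data = NSData.dataWithBytes_length_(buf.tobytes(), len(buf))
    return NSImage.alloc().initWithData_(data)
```

On each refresh tick the app would call `image_view.setImage_(frame_to_nsimage(frame))` from the main thread, where `image_view` is a placeholder name for the `NSImageView` created in Spec 001.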
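
Spec 002's background-thread requirement matters because a blocking network call inside an AppKit action handler freezes the run loop. A sketch of the thread hop, assuming the `openai` Python SDK's chat-completions vision message format; the model name, prompt text, and callback are illustrative:

```python
# Sketch only: run the vision request off the main thread, then marshal the
# result back, since AppKit objects must only be touched on the main thread.
import base64
import threading

import cv2
from openai import OpenAI
from PyObjCTools import AppHelper

client = OpenAI()  # reads OPENAI_API_KEY from the environment (.env via python-dotenv)

def identify_async(frame, on_result):
    def worker():
        ok, buf = cv2.imencode(".jpg", frame)
        b64 = base64.b64encode(buf.tobytes()).decode("ascii")
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model choice
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": "What item is shown in this photo?"},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
                ],
            }],
        )
        # callAfter schedules on_result back on the AppKit event loop.
        AppHelper.callAfter(on_result, response.choices[0].message.content)

    threading.Thread(target=worker, daemon=True).start()
```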
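
For Spec 003, the standard AppKit pattern is an `NSTextView` wrapped in an `NSScrollView`, set as the scroll view's document view. A minimal sketch; the frame size and read-only setting are assumptions:

```python
# Sketch only: a scrollable, read-only result pane.
from Cocoa import NSMakeRect, NSScrollView, NSTextView

scroll = NSScrollView.alloc().initWithFrame_(NSMakeRect(0, 0, 480, 160))
text_view = NSTextView.alloc().initWithFrame_(NSMakeRect(0, 0, 480, 160))
text_view.setEditable_(False)          # results are display-only
scroll.setDocumentView_(text_view)     # the text view scrolls inside it
scroll.setHasVerticalScroller_(True)

# Later, on the main thread (e.g. from the identify_async callback):
# text_view.setString_(result_text)
```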