Add project tracking documentation

2026-01-21 10:48:07 -05:00
parent c432dbf883
commit 34df717a75
2 changed files with 67 additions and 0 deletions

implementation_plan.md (new file)

@@ -0,0 +1,47 @@
# Implementation Plan: ItemSense MVP
## Goal Description
Build a desktop application that identifies items through the webcam using OpenAI's vision capabilities, presented in a **native macOS UI (PyObjC)**.
## User Review Required
- **Technology Shift**: Switching from Tkinter to PyObjC (AppKit).
- **Camera Strategy**: Using OpenCV for frame capture and bridging to AppKit for display. This keeps the implementation simpler than writing a raw AVFoundation delegate in Python while maintaining a native UI.
## Proposed Changes
### Spec 001: Core UI & Camera Feed
#### [NEW] main.py
- Initialize `NSApplication` and `NSWindow` (AppKit).
- Implement a custom `AppDelegate` to handle app lifecycle.
- Integrate OpenCV (`cv2`) for webcam capture.
- Display video frames in an `NSImageView` (see the bridge sketch below).
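
A minimal sketch of this spec, under the assumptions stated above (OpenCV for capture, AppKit for display). Names such as `ItemSenseDelegate` and `frame_to_nsimage` are illustrative, not final; the frame is bridged by encoding it to PNG and wrapping it in an `NSImage` rather than touching AVFoundation directly.

```python
# Sketch only: assumes pyobjc-framework-Cocoa and opencv-python are installed.
import cv2
from Cocoa import (
    NSApplication, NSWindow, NSImageView, NSObject, NSTimer, NSData, NSImage,
    NSMakeRect, NSBackingStoreBuffered, NSWindowStyleMaskTitled,
    NSWindowStyleMaskClosable,
)


def frame_to_nsimage(frame):
    """Encode a BGR OpenCV frame as PNG and wrap it in an NSImage."""
    ok, buf = cv2.imencode(".png", frame)
    if not ok:
        return None
    png = buf.tobytes()
    data = NSData.dataWithBytes_length_(png, len(png))
    return NSImage.alloc().initWithData_(data)


class ItemSenseDelegate(NSObject):
    def applicationDidFinishLaunching_(self, notification):
        # Build the window and the view that will show the live feed.
        style = NSWindowStyleMaskTitled | NSWindowStyleMaskClosable
        self.window = NSWindow.alloc().initWithContentRect_styleMask_backing_defer_(
            NSMakeRect(0, 0, 640, 520), style, NSBackingStoreBuffered, False)
        self.window.setTitle_("ItemSense")
        self.image_view = NSImageView.alloc().initWithFrame_(NSMakeRect(0, 40, 640, 480))
        self.window.contentView().addSubview_(self.image_view)
        self.window.makeKeyAndOrderFront_(None)

        # Poll the webcam roughly 30 times per second on the main run loop.
        self.capture = cv2.VideoCapture(0)
        NSTimer.scheduledTimerWithTimeInterval_target_selector_userInfo_repeats_(
            1 / 30.0, self, "updateFrame:", None, True)

    def updateFrame_(self, timer):
        ok, frame = self.capture.read()
        if ok:
            self.image_view.setImage_(frame_to_nsimage(frame))


if __name__ == "__main__":
    app = NSApplication.sharedApplication()
    delegate = ItemSenseDelegate.alloc().init()
    app.setDelegate_(delegate)
    app.run()
```

Polling with an `NSTimer` keeps all AppKit calls on the main thread; a dedicated capture thread is a possible later optimization if frame rates suffer.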
#### [NEW] requirements.txt
- `pyobjc-framework-Cocoa`
- `opencv-python`
- `Pillow` (for easier image data manipulation if needed)
### Spec 002: OpenAI Vision Integration
#### [MODIFY] main.py
- Add `Capture` button (`NSButton`) to the UI.
- Snapshot the current OpenCV frame when the button is clicked.
- Send the captured image to the OpenAI API.
- Run the API request in a background thread so the UI does not freeze (see the sketch below).
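
A hedged sketch of the capture path. The model name, prompt text, and helper names (`describe_frame`, `captureClicked_`, `showResult_`, `self.last_frame`) are assumptions for illustration; the image is sent as a base64 data URL, which the OpenAI chat completions API accepts for vision input.

```python
# Sketch only: assumes OPENAI_API_KEY is available in the environment.
import base64
import threading

import cv2
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY loaded via python-dotenv


def describe_frame(frame):
    """Send one BGR frame to the OpenAI vision endpoint and return the reply text."""
    ok, buf = cv2.imencode(".jpg", frame)
    b64 = base64.b64encode(buf.tobytes()).decode("ascii")
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "What item is shown in this photo?"},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content


# Inside the delegate, the button action snapshots the latest frame and hands
# the slow network call to a worker thread so the run loop stays responsive:
#
# def captureClicked_(self, sender):
#     frame = self.last_frame.copy()
#     def worker():
#         text = describe_frame(frame)
#         # AppKit objects must only be touched on the main thread.
#         self.performSelectorOnMainThread_withObject_waitUntilDone_(
#             "showResult:", text, False)
#     threading.Thread(target=worker, daemon=True).start()
```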
#### [MODIFY] requirements.txt
- Add `openai`
- Add `python-dotenv`
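
Key loading would then be a couple of lines at startup; this assumes the conventional `OPENAI_API_KEY` variable name in the gitignored `.env` file.

```python
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()      # pulls OPENAI_API_KEY from the gitignored .env file
client = OpenAI()  # the client reads the key from the environment
```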
### Spec 003: Result Display
#### [MODIFY] main.py
- Add `NSTextView` (in `NSScrollView`) for results.
- Add "Scan Another" button logic.
- Ensure the UI layout manages state transitions cleanly (Live vs. Result); see the sketch below.
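
A small sketch of the result pane, assuming it lives alongside the views from Spec 001; the view names and the `showResult_` / `scanAnother_` selectors are illustrative.

```python
# Sketch only: a scrollable, read-only text view for the model's answer.
from Cocoa import NSScrollView, NSTextView, NSMakeRect


def build_result_pane(content_view):
    scroll = NSScrollView.alloc().initWithFrame_(NSMakeRect(0, 40, 640, 480))
    text = NSTextView.alloc().initWithFrame_(NSMakeRect(0, 0, 640, 480))
    text.setEditable_(False)
    scroll.setDocumentView_(text)
    scroll.setHasVerticalScroller_(True)
    scroll.setHidden_(True)  # start in the "Live" state
    content_view.addSubview_(scroll)
    return scroll, text


# State transitions toggle visibility rather than rebuilding the window:
#
# def showResult_(self, text):        # Live -> Result
#     self.result_text.setString_(text)
#     self.image_view.setHidden_(True)
#     self.result_scroll.setHidden_(False)
#
# def scanAnother_(self, sender):     # Result -> Live
#     self.result_scroll.setHidden_(True)
#     self.image_view.setHidden_(False)
```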
## Verification Plan
### Manual Verification
1. **Launch**: Run `python main.py`.
2. **Native Look**: Verify the window uses native macOS controls.
3. **Feed**: Verify camera feed is smooth and correctly oriented.
4. **Flow**: Capture -> Processing -> Result -> Scan Another.

task.md (new file)

@@ -0,0 +1,20 @@
# Task: Install and Set Up ralph-wiggum
- [x] Read INSTALLATION.md <!-- id: 0 -->
- [x] Phase 1: Create Structure <!-- id: 1 -->
- [x] Phase 2: Download Scripts <!-- id: 2 -->
- [x] Phase 3: Get Version Info <!-- id: 3 -->
- [x] Phase 4: Project Interview (Ask User) <!-- id: 4 -->
- [x] Phase 5: Create Constitution <!-- id: 5 -->
- [x] Phase 6: Create Agent Entry Files <!-- id: 6 -->
- [x] Phase 7: Create Prompts <!-- id: 7 -->
- [x] Phase 8: Create Cursor Command <!-- id: 8 -->
- [x] Phase 9: Finalize and Explain <!-- id: 9 -->
- [x] Store OpenAI API Key in .env <!-- id: 10 -->
- [x] Ensure .env is gitignored <!-- id: 11 -->
- [x] Create Spec 001: Core UI & Camera Feed <!-- id: 20 -->
- [x] Create Spec 002: OpenAI Vision Integration <!-- id: 21 -->
- [x] Create Spec 003: Result Display <!-- id: 22 -->
- [x] Build Spec 001 <!-- id: 23 -->
- [x] Build Spec 002 <!-- id: 24 -->
- [x] Build Spec 003 <!-- id: 25 -->