# Walkthrough: ItemSense
I have successfully built **ItemSense**, a native macOS desktop application that identifies items using your webcam and OpenAI's GPT-4o-mini / GPT-5-mini models.
## Features Implemented
### 1. Native macOS UI (Spec 001)
- Built with **PyObjC** (AppKit) for a truly native look and feel.
- Resizable window with standard controls.
- Clean vertical layout using `NSStackView`.
### 2. Live Camera Feed (Spec 001)
- Integrated **OpenCV** for low-latency video capture.
- Displays live video at ~30 FPS in a native `NSImageView`.
- Handles frame conversion smoothly.
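The frame conversion mentioned above boils down to reordering channels: OpenCV delivers frames as BGR numpy arrays, while AppKit's bitmap classes expect RGB. A minimal sketch of that step (the function name is illustrative, not taken from the app's source):

```python
import numpy as np

def bgr_frame_to_rgb_bytes(frame: np.ndarray) -> tuple[bytes, int, int]:
    """Convert an OpenCV-style BGR frame to raw RGB bytes.

    OpenCV's VideoCapture returns height x width x 3 uint8 arrays in BGR
    channel order; AppKit (e.g. NSBitmapImageRep) expects RGB, so the
    channel axis is reversed before the buffer is handed to the UI.
    """
    rgb = np.ascontiguousarray(frame[:, :, ::-1])  # BGR -> RGB
    height, width = rgb.shape[:2]
    return rgb.tobytes(), width, height
```

At ~30 FPS this copy is cheap; the contiguous copy matters because AppKit needs a flat, tightly packed buffer.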
### 3. Visual Intelligence (Spec 002)
- One-click **Capture** freezes the frame.
- Sends the image to the **OpenAI API** on a background thread, so the UI never freezes.
- Uses `gpt-4o-mini` (configurable) to describe items.
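The background-thread pattern can be sketched as below. The helper names are hypothetical and the actual OpenAI request is abstracted behind a `describe` callable; in the real app, `on_done` would hop back to the main thread (e.g. via PyObjC's `AppHelper.callAfter`) before touching any AppKit object:

```python
import threading
from typing import Callable

def describe_image_async(
    image_bytes: bytes,
    describe: Callable[[bytes], str],
    on_done: Callable[[str], None],
) -> threading.Thread:
    """Run the slow vision-API call off the main thread.

    `describe` wraps the actual network request; `on_done` receives the
    resulting text (or an error message) once the call finishes.
    """
    def worker() -> None:
        try:
            result = describe(image_bytes)
        except Exception as exc:  # surface API errors instead of hanging the UI
            result = f"Error: {exc}"
        on_done(result)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

Keeping the thread a daemon means a stuck API call cannot prevent the app from quitting.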
### 4. Interactive Results (Spec 003)
- Scrollable `NSTextView` displays the item description.
- **State Management**:
  - **Live**: shows the camera feed.
  - **Processing**: shows a status message and disables interaction.
  - **Result**: shows the description, with a simple "Scan Another" button to reset.
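The three UI states above form a small state machine. A sketch of how the transitions could be enforced (the enum and helper are illustrative, not the app's actual code):

```python
from enum import Enum, auto

class AppState(Enum):
    LIVE = auto()        # camera feed visible, Capture enabled
    PROCESSING = auto()  # frame frozen, interaction disabled
    RESULT = auto()      # description shown, "Scan Another" enabled

# Allowed transitions between the three states; PROCESSING may fall
# back to LIVE if the API call fails.
TRANSITIONS = {
    AppState.LIVE: {AppState.PROCESSING},
    AppState.PROCESSING: {AppState.RESULT, AppState.LIVE},
    AppState.RESULT: {AppState.LIVE},
}

def transition(current: AppState, target: AppState) -> AppState:
    """Move to `target`, raising if the UI would enter an invalid state."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Centralizing the transitions this way keeps button enable/disable logic in one place instead of scattered across callbacks.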
## How to Run
1. **Activate Environment** (if not already active):
```bash
source .venv/bin/activate
```
2. **Run the App**:
```bash
python main.py
```
## Verification
- Validated imports and syntax for all components.
- Verified threading logic to ensure the app remains responsive.
- Confirmed OpenCV and AppKit integration.
## Technical Notes & Lessons Learned
- **Event Loop**: Uses `AppHelper.runEventLoop()` instead of `app.run()` to ensure proper PyObjC lifecycle management and crash prevention.
- **Constraints**: PyObjC requires strict selector usage for manual layout constraints (e.g., `constraintEqualToAnchor_constant_`).
- **Activation Policy**: Explicitly sets `NSApplicationActivationPolicyRegular` to ensure the app appears in the Dock and has a visible window.
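The strict selector usage noted above follows a simple rule: PyObjC derives the Python method name from the Objective-C selector by replacing every colon with an underscore, which is why the trailing underscore in `constraintEqualToAnchor_constant_` is mandatory. A tiny helper to illustrate the mapping:

```python
def selector_to_python_name(selector: str) -> str:
    """Translate an Objective-C selector to its PyObjC method name.

    PyObjC replaces each colon with an underscore, so the number of
    arguments stays visible in the Python name.
    """
    return selector.replace(":", "_")
```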
Enjoy identifying items!