
Walkthrough: ItemSense

I have successfully built ItemSense, a native macOS desktop application that identifies items using your webcam and OpenAI's gpt-4o-mini or gpt-5-mini vision models.

Features Implemented

1. Native macOS UI (Spec 001)

  • Built with PyObjC (AppKit) for a truly native look and feel.
  • Resizable window with standard controls.
  • Clean vertical layout using NSStackView.

2. Live Camera Feed (Spec 001)

  • Integrated OpenCV for low-latency video capture.
  • Displays live video at ~30 FPS in a native NSImageView.
  • Handles frame conversion smoothly.
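The frame conversion mentioned above boils down to reordering channels: OpenCV delivers BGR numpy arrays, while NSImage expects RGB data. A minimal sketch of that step (numpy-only; the NSBitmapImageRep/NSImage wrapping and the helper name are illustrative, not the app's actual code):

```python
import numpy as np

def bgr_frame_to_rgb_bytes(frame: np.ndarray) -> bytes:
    """Convert an OpenCV BGR frame (H x W x 3, uint8) to raw RGB bytes.

    In the app, these bytes would feed an NSBitmapImageRep / NSImage
    for display in the NSImageView; that AppKit step is omitted here.
    """
    if frame.ndim != 3 or frame.shape[2] != 3:
        raise ValueError("expected an H x W x 3 BGR frame")
    rgb = frame[:, :, ::-1]  # swap the B and R channels
    return np.ascontiguousarray(rgb).tobytes()

# A 1x1 pure-blue BGR pixel comes out as pure-blue RGB:
blue_bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
print(bgr_frame_to_rgb_bytes(blue_bgr))  # b'\x00\x00\xff'
```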

3. Visual Intelligence (Spec 002)

  • One-click Capture freezes the frame.
  • Securely sends the image to the OpenAI API in a background thread (no UI freezing).
  • Uses gpt-4o-mini (configurable) to describe items.
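The background-thread pattern above can be sketched as follows. The `describe` callable and function names are stand-ins, not the app's actual API; in the real app the completion callback would hop back to the main thread (e.g. via `AppHelper.callAfter`) before touching any AppKit views.

```python
import threading

def analyze_in_background(image_bytes, describe, on_done):
    """Run a (potentially slow) vision-API call off the main thread.

    `describe` stands in for the real OpenAI call (a function that
    sends `image_bytes` to gpt-4o-mini and returns a description);
    `on_done` receives the result once the call finishes.
    """
    def worker():
        result = describe(image_bytes)
        on_done(result)

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t

# Usage with a stand-in describe function:
results = []
t = analyze_in_background(b"fake-jpeg", lambda b: f"{len(b)} bytes", results.append)
t.join()
print(results)  # ['9 bytes']
```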

4. Interactive Results (Spec 003)

  • Scrollable NSTextView displays the item description.
  • State Management:
    • Live: Shows camera.
    • Processing: Shows status, disables interaction.
    • Result: Shows text, simple "Scan Another" button to reset.
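The three UI states above form a simple cycle, which can be modeled as a small state machine. This is a plain-Python sketch (the real app drives NSView visibility and button enablement from these transitions, which is omitted here):

```python
from enum import Enum, auto

class AppState(Enum):
    LIVE = auto()        # camera feed visible, Capture enabled
    PROCESSING = auto()  # frame frozen, interaction disabled
    RESULT = auto()      # description shown, "Scan Another" enabled

# Allowed transitions: Live -> Processing -> Result -> Live
TRANSITIONS = {
    AppState.LIVE: AppState.PROCESSING,    # user clicks Capture
    AppState.PROCESSING: AppState.RESULT,  # API reply arrives
    AppState.RESULT: AppState.LIVE,        # user clicks "Scan Another"
}

def advance(state: AppState) -> AppState:
    return TRANSITIONS[state]

state = AppState.LIVE
state = advance(state)  # PROCESSING
state = advance(state)  # RESULT
state = advance(state)  # back to LIVE
print(state)            # AppState.LIVE
```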

How to Run

  1. Activate Environment (if not already active):

    source .venv/bin/activate
    
  2. Run the App:

    python main.py
    

Verification

  • Validated imports and syntax for all components.
  • Verified threading logic to ensure the app remains responsive.
  • Confirmed OpenCV and AppKit integration.
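An import-and-syntax check like the one described can be automated with the standard library's `py_compile`. This sketch compiles a tiny stand-in module rather than the app's real `main.py`:

```python
import os
import py_compile
import tempfile

# A stand-in for main.py; compiling it catches syntax errors up front.
src = "import threading\n\ndef ok():\n    return True\n"
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(src)
    path = f.name
try:
    # doraise=True raises PyCompileError instead of printing to stderr.
    cfile = py_compile.compile(path, doraise=True)
    syntax_ok = cfile is not None
finally:
    os.unlink(path)
print("syntax OK" if syntax_ok else "syntax error")
```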

Technical Notes & Lessons Learned

  • Event Loop: Uses AppHelper.runEventLoop() instead of app.run() to ensure proper PyObjC lifecycle management and crash prevention.
  • Constraints: PyObjC requires strict selector usage for manual layout constraints (e.g., constraintEqualToAnchor_constant_).
  • Activation Policy: Explicitly sets NSApplicationActivationPolicyRegular to ensure the app appears in the Dock and has a visible window.

Enjoy identifying items!