CheapRetouch/project-specifications.txt
jared 1049057d7d Add Ralph Wiggum agent setup and project specifications
- Add project constitution with vision, principles, and autonomy settings
- Add 15 feature specifications covering full app scope
- Configure agent entry points (AGENTS.md, CLAUDE.md)
- Add build prompt and speckit command for spec creation
- Include comprehensive .gitignore for iOS development

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 23:19:41 -05:00

## CheapRetouch — Revised Specification
### Project Overview
**Platform**: iOS 17.0+
**Objective**: On-device photo editor for removing unwanted elements using only Apple's public frameworks and classical image processing. No custom ML models.
**Core Capabilities** (achievable without ML):
- Person removal (Vision handles this well)
- Foreground object removal (user-initiated, Vision-assisted)
- Wire/line removal (geometric contour detection)
**Removed from Scope**:
- Automatic fence/mesh detection (requires semantic understanding)
- Automatic identification of object types (trash cans, stop signs, etc.)
---
### Technical Stack
| Layer | Framework | Purpose |
|-------|-----------|---------|
| UI | SwiftUI + UIKit interop | Canvas, tools, state management |
| Masking | Vision | `VNGenerateForegroundInstanceMaskRequest`, `VNDetectContoursRequest` |
| Subject Interaction | VisionKit | `ImageAnalyzer`, `ImageAnalysis`, `ImageAnalysisInteraction` |
| Inpainting | Metal (custom) | Patch-based synthesis, mask feathering, blending |
| Compositing | Core Image | Color adjustments, preview pipeline |
| Fallback Processing | Accelerate/vImage | Simulator, older devices without Metal |
---
### Features
#### 1. Person Removal
**How it works**:
1. User taps a person in the photo
2. `VNGenerateForegroundInstanceMaskRequest` generates a precise mask
3. Mask is dilated and feathered
4. Custom Metal inpainting fills the region from surrounding context
**Why this works**: Vision's person segmentation is robust and well-documented for iOS 17+.
**User flow**:
```
Tap person → Mask preview shown → Confirm → Inpaint → Done
```
**Edge cases**:
- Multiple people: user taps each individually or uses "select all people" option
- Partial occlusion: Vision still provides usable mask; user can refine with brush
- No person detected: show "No person found at tap location" feedback
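Steps 3–4 (dilate, then feather) can be sketched in plain Swift. This is a minimal CPU sketch for clarity only; in the app this work would run as a Metal kernel (or via vImage) on an R8 texture, and the `Mask` type here is hypothetical:

```swift
// Morphological dilation of a binary mask, then a box blur to feather
// the edges. Box blur stands in for the Gaussian feather in the spec.
struct Mask {
    var pixels: [Float]   // 0 = keep, 1 = remove
    let width: Int
    let height: Int

    /// Grow the masked region by `radius` pixels in every direction.
    func dilated(by radius: Int) -> Mask {
        var out = self
        for y in 0..<height {
            for x in 0..<width where pixels[y * width + x] > 0.5 {
                for dy in -radius...radius {
                    for dx in -radius...radius {
                        let nx = x + dx, ny = y + dy
                        if nx >= 0, nx < width, ny >= 0, ny < height {
                            out.pixels[ny * width + nx] = 1
                        }
                    }
                }
            }
        }
        return out
    }

    /// Soften the mask edge by averaging each pixel with its neighborhood.
    func feathered(radius: Int) -> Mask {
        var out = self
        for y in 0..<height {
            for x in 0..<width {
                var sum: Float = 0
                var count: Float = 0
                for dy in -radius...radius {
                    for dx in -radius...radius {
                        let nx = x + dx, ny = y + dy
                        if nx >= 0, nx < width, ny >= 0, ny < height {
                            sum += pixels[ny * width + nx]
                            count += 1
                        }
                    }
                }
                out.pixels[y * width + x] = sum / count
            }
        }
        return out
    }
}
```

The same two passes apply to every tool's mask, so this belongs in a shared masking layer rather than the person tool itself.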
---
#### 2. Foreground Object Removal
**How it works**:
1. User taps an object
2. `VNGenerateForegroundInstanceMaskRequest` attempts to isolate it
3. If successful, mask is used for inpainting
4. If Vision returns no mask (object not salient), fall back to smart brush
**Smart brush fallback**:
- User paints rough selection over object
- App refines selection to nearest strong edges using gradient magnitude analysis
- User confirms refined mask
**Why this works**: Vision detects visually distinct foreground regions. It doesn't know *what* the object is, but it can separate it from the background if there's sufficient contrast.
**Limitations** (be explicit with users):
- Works best on objects that stand out from their background
- Low-contrast objects require manual brush selection
- App cannot identify object types — it sees shapes, not meanings
**User flow**:
```
Tap object → Vision attempts mask
├─ Success → Mask preview → Confirm → Inpaint
└─ Failure → "Use brush to select" prompt → User paints → Edge refinement → Confirm → Inpaint
```
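The edge-refinement step of the smart brush can be sketched as a flood fill from a seed point that stops at strong edges. This is an illustrative pure-Swift version under simplifying assumptions (grayscale input, central-difference gradients, a single threshold); `refineSelection` and its parameters are hypothetical names, and the shipping path would run on GPU textures:

```swift
// Refine a rough brush selection: flood-fill from the seed, staying
// inside the painted region and stopping where gradient magnitude
// (a simple edge measure) exceeds the threshold.
func refineSelection(gray: [Float], width: Int, height: Int,
                     rough: Set<Int>, seed: (x: Int, y: Int),
                     edgeThreshold: Float) -> Set<Int> {
    // Central-difference gradient magnitude at (x, y), clamped at borders.
    func gradient(_ x: Int, _ y: Int) -> Float {
        let l = gray[y * width + max(x - 1, 0)]
        let r = gray[y * width + min(x + 1, width - 1)]
        let u = gray[max(y - 1, 0) * width + x]
        let d = gray[min(y + 1, height - 1) * width + x]
        return ((r - l) * (r - l) + (d - u) * (d - u)).squareRoot()
    }
    var refined = Set<Int>()
    var stack = [seed.y * width + seed.x]
    while let idx = stack.popLast() {
        guard rough.contains(idx), !refined.contains(idx),
              gradient(idx % width, idx / width) < edgeThreshold else { continue }
        refined.insert(idx)
        let x = idx % width, y = idx / width
        if x > 0 { stack.append(idx - 1) }
        if x < width - 1 { stack.append(idx + 1) }
        if y > 0 { stack.append(idx - width) }
        if y < height - 1 { stack.append(idx + width) }
    }
    return refined
}
```

The effect is that a sloppy brush stroke snaps inward to the object: painted pixels past a strong edge are dropped because the fill cannot cross the edge.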
---
#### 3. Wire & Line Removal
**How it works**:
1. User taps near a wire or line
2. `VNDetectContoursRequest` returns all detected contours
3. App scores contours by:
- Proximity to tap point
- Aspect ratio (thin and elongated)
- Straightness / low curvature
- Length (longer scores higher)
4. Best-scoring contour becomes mask
5. Mask is expanded to a configurable width (default 6px, range 2-20px)
6. Inpaint along the mask
**Line brush fallback**:
When contour detection fails (low contrast, busy background):
- User switches to "Line brush" tool
- User draws along the wire
- App maintains consistent stroke width
- Stroke becomes mask for inpainting
**Why this works**: Power lines against sky have strong edges that `VNDetectContoursRequest` captures reliably. The scoring heuristics select the most "wire-like" contour.
**Limitations**:
- High-contrast lines (sky background): works well
- Low-contrast lines (against buildings, trees): requires manual line brush
- Curved wires: contour detection still works; scoring allows moderate curvature
**User flow**:
```
Tap near wire → Contour analysis
├─ Match found → Highlight line → Confirm → Inpaint
└─ No match → "Use line brush" prompt → User draws → Inpaint
```
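The contour-scoring step above can be sketched as a single scoring function. The weights and combination (a simple product) are illustrative, not tuned values, and the aspect-ratio term is folded into straightness for brevity; `Contour` here is a stand-in for the points extracted from a `VNContour`:

```swift
// Score a candidate contour by how "wire-like" it is: long, straight,
// and close to the user's tap. Higher is better.
struct Contour {
    let points: [(x: Float, y: Float)]
}

func wireScore(_ contour: Contour, tap: (x: Float, y: Float)) -> Float {
    guard contour.points.count >= 2 else { return 0 }
    // Length: sum of segment lengths along the polyline.
    var length: Float = 0
    for i in 1..<contour.points.count {
        let dx = contour.points[i].x - contour.points[i - 1].x
        let dy = contour.points[i].y - contour.points[i - 1].y
        length += (dx * dx + dy * dy).squareRoot()
    }
    // Straightness: endpoint distance / path length (1.0 = perfectly straight).
    let ex = contour.points.last!.x - contour.points.first!.x
    let ey = contour.points.last!.y - contour.points.first!.y
    let straightness = (ex * ex + ey * ey).squareRoot() / max(length, 0.001)
    // Proximity: inverse of the nearest point's distance to the tap.
    let nearest = contour.points.map {
        (($0.x - tap.x) * ($0.x - tap.x) + ($0.y - tap.y) * ($0.y - tap.y)).squareRoot()
    }.min()!
    let proximity = 1 / (1 + nearest)
    return length * straightness * proximity
}
```

The app would evaluate every contour returned by `VNDetectContoursRequest` and take the highest scorer; because straightness divides by path length, a zigzag scores lower than a straight run of the same extent.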
---
### Inpainting Engine (Metal)
Since Apple ships no public API for content-aware fill, the app must implement inpainting itself.
**Algorithm**: Exemplar-based inpainting (Criminisi-style)
**Why this approach**:
- Deterministic (same input → same output)
- Handles textures reasonably well
- No ML required
- Well-documented in academic literature
**Pipeline**:
```
1. Input: source image + binary mask
2. Dilate mask by 2-4px (capture edge pixels)
3. Feather mask edges (gaussian blur on alpha)
4. Build image pyramid (for preview vs export)
5. For each pixel on mask boundary (priority order):
   a. Find best matching patch from known region
   b. Copy patch into unknown region
   c. Update boundary
6. Final edge-aware blend to reduce seams
```
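Step 5a, the heart of the exemplar-based approach, is a best-patch search. A minimal CPU sketch on a grayscale buffer, assuming a brute-force sum-of-squared-differences (SSD) search; the real engine would do this in a Metal kernel with a smarter search (e.g. PatchMatch-style propagation), and all names here are illustrative:

```swift
// Find the fully-known source patch that best matches the known pixels
// around a target location on the mask boundary, by SSD.
func bestSourcePatch(image: [Float], known: [Bool], width: Int, height: Int,
                     target: (x: Int, y: Int), radius: Int) -> (x: Int, y: Int) {
    var best = (x: 0, y: 0)
    var bestSSD = Float.greatestFiniteMagnitude
    for sy in radius..<(height - radius) {
        for sx in radius..<(width - radius) {
            var ssd: Float = 0
            var valid = true
            for dy in -radius...radius {
                for dx in -radius...radius {
                    let si = (sy + dy) * width + (sx + dx)
                    // Source patches must lie entirely in the known region.
                    guard known[si] else { valid = false; break }
                    let tx = target.x + dx, ty = target.y + dy
                    guard tx >= 0, tx < width, ty >= 0, ty < height else { continue }
                    let ti = ty * width + tx
                    // Compare only against target pixels that are known.
                    if known[ti] {
                        let d = image[si] - image[ti]
                        ssd += d * d
                    }
                }
                if !valid { break }
            }
            if valid, ssd < bestSSD {
                bestSSD = ssd
                best = (sx, sy)
            }
        }
    }
    return best
}
```

Once the best source is found, the unknown pixels of the target patch are copied from it and the boundary is updated, exactly as steps 5b-5c describe.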
**Performance targets**:
| Resolution | Target Time | Device Baseline |
|------------|-------------|-----------------|
| Preview (2048px) | < 300ms | iPhone 12 / A14 |
| Export (12MP) | < 4 seconds | iPhone 12 / A14 |
| Export (48MP) | < 12 seconds | iPhone 15 Pro / A17 |
**Memory management**:
- Tile-based processing for images > 12MP
- Peak memory budget: 1.5GB
- Release intermediate textures aggressively
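The tiling scheme can be sketched as a grid of overlapping rectangles; the overlap gives the blend pass room to hide tile seams. Tile size and overlap below are illustrative defaults, not values from the spec:

```swift
// Split a large image into overlapping tiles so each pass touches a
// bounded amount of memory. Tiles overlap by `overlap` pixels so seams
// can be blended away.
func tileRects(imageWidth: Int, imageHeight: Int,
               tileSize: Int = 2048, overlap: Int = 64) -> [(x: Int, y: Int, w: Int, h: Int)] {
    var tiles: [(x: Int, y: Int, w: Int, h: Int)] = []
    let step = tileSize - overlap
    var y = 0
    while y < imageHeight {
        var x = 0
        while x < imageWidth {
            // Clamp the last tile in each row/column to the image bounds.
            tiles.append((x: x, y: y,
                          w: min(tileSize, imageWidth - x),
                          h: min(tileSize, imageHeight - y)))
            x += step
        }
        y += step
    }
    return tiles
}
```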
---
### Data Model (Non-Destructive Editing)
**Principles**:
- Original image is never modified
- All edits stored as an operation stack
- Full undo/redo support
**Operation types**:
```swift
enum EditOperation: Codable {
    case mask(MaskOperation)
    case inpaint(InpaintOperation)
    case adjustment(AdjustmentOperation)
}

struct MaskOperation: Codable {
    let id: UUID
    let toolType: ToolType   // .person, .object, .wire, .brush
    let maskData: Data       // compressed R8 texture
    let timestamp: Date
}

struct InpaintOperation: Codable {
    let id: UUID
    let maskOperationId: UUID
    let patchRadius: Int
    let featherAmount: Float
    let timestamp: Date
}
```
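Undo/redo over the operation stack can be sketched as two arrays. This `EditStack` type is a hypothetical helper (made generic here so the sketch stands alone), not part of the spec's model:

```swift
// Minimal non-destructive edit stack with undo/redo. `applied` is the
// live operation stack; undone operations wait in `redoBuffer` until
// they are redone or a new edit invalidates them.
struct EditStack<Op> {
    private(set) var applied: [Op] = []
    private var redoBuffer: [Op] = []

    init() {}

    mutating func push(_ op: Op) {
        applied.append(op)
        redoBuffer.removeAll()   // a new edit invalidates the redo history
    }

    mutating func undo() -> Op? {
        guard let op = applied.popLast() else { return nil }
        redoBuffer.append(op)
        return op
    }

    mutating func redo() -> Op? {
        guard let op = redoBuffer.popLast() else { return nil }
        applied.append(op)
        return op
    }
}
```

Because operations only describe edits (the original pixels are untouched), undo is just popping the stack and re-rendering from the remaining operations.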
**Persistence**:
- Project saved as JSON (operation stack) + original image reference
- Store PHAsset local identifier when sourced from Photos
- Store embedded image data when imported from Files
- Cached previews marked `isExcludedFromBackup = true`
---
### UI Specification
**Main Canvas**:
- Pinch to zoom, pan to navigate
- Mask overlay toggle (red tint / marching ants / hidden)
- Before/after comparison (long press or toggle)
**Toolbar**:
| Tool | Icon | Behavior |
|------|------|----------|
| Person | 👤 | Tap to select/remove people |
| Object | ⬭ | Tap to select foreground objects |
| Wire | ⚡ | Tap to select lines/wires |
| Brush | 🖌 | Manual selection for fallback |
| Undo | ↩ | Step back in operation stack |
| Redo | ↪ | Step forward in operation stack |
**Inspector Panel** (contextual):
- Brush size slider (when brush active)
- Feather amount slider
- Mask expansion slider (for wire tool)
- "Refine edges" toggle
**Feedback states**:
- Processing: show spinner on affected region
- No detection: toast message with fallback suggestion
- Success: brief checkmark animation
---
### Error Handling
| Scenario | Response |
|----------|----------|
| Vision returns no mask | "Couldn't detect object. Try the brush tool to select manually." |
| Vision returns low-confidence mask | Show mask preview with "Does this look right?" confirmation |
| Contour detection finds no lines | "No lines detected. Use the line brush to draw along the wire." |
| Inpaint produces visible seams | Offer "Refine" button that expands mask and re-runs |
| Memory pressure during export | "Image too large to process. Try cropping first." |
| Metal unavailable | Fall back to Accelerate with "Processing may be slower" warning |
---
### Privacy & Permissions
- All processing on-device
- No network calls for core functionality
- Photo library access via `PHPickerViewController` (limited access supported)
- Request write permission only when user saves
- No analytics or telemetry in core features
---
### Accessibility
- All tools labeled for VoiceOver
- Brush size adjustable via stepper (not just slider)
- High contrast mask visualization option
- Reduce Motion: disable transition animations
- Dynamic Type support in all UI text
---
### Testing Requirements
| Test Type | Coverage |
|-----------|----------|
| Unit | Edit stack operations, mask combination logic, contour scoring |
| Snapshot | Inpaint engine (reference images with known outputs) |
| UI | Full flow: import → edit → export |
| Performance | Render times on A14, A15, A17 devices |
| Memory | Peak usage during 48MP export |
---
### Project Structure
```
CheapRetouch/
├── App/
│   └── CheapRetouchApp.swift
├── Features/
│   ├── Editor/
│   │   ├── PhotoEditorView.swift
│   │   ├── CanvasView.swift
│   │   └── ToolbarView.swift
│   └── Export/
│       └── ExportView.swift
├── Services/
│   ├── MaskingService.swift       // Vision/VisionKit wrappers
│   ├── ContourService.swift       // Line detection + scoring
│   └── InpaintEngine/
│       ├── InpaintEngine.swift    // Public interface
│       ├── Shaders.metal          // Metal kernels
│       └── PatchMatch.swift       // Algorithm implementation
├── Models/
│   ├── EditOperation.swift
│   ├── Project.swift
│   └── MaskData.swift
├── Utilities/
│   ├── ImagePipeline.swift        // Preview/export rendering
│   └── EdgeRefinement.swift       // Smart brush edge detection
└── Resources/
    └── Assets.xcassets
```
---
### What's Explicitly Out of Scope
| Feature | Reason |
|---------|--------|
| Automatic fence/mesh detection | Requires semantic understanding (ML) |
| Object type identification | Requires classification (ML) |
| "Find all X in photo" | Requires semantic search (ML) |
| Blemish/skin retouching | Removed to keep scope focused; could add later |
| Background replacement | Different feature set; out of scope for v1 |
---
### Summary
This spec delivers three solid features using only public APIs:
1. **Person removal** — Vision handles the hard part
2. **Object removal** — Vision-assisted with brush fallback
3. **Wire removal** — Contour detection with line brush fallback
Each feature has a clear primary path and a fallback for when detection fails. The user is never stuck.