diff --git a/.claude/commands/speckit.specify.md b/.claude/commands/speckit.specify.md new file mode 100644 index 0000000..13d65ae --- /dev/null +++ b/.claude/commands/speckit.specify.md @@ -0,0 +1,31 @@ +# Create Specification + +Create a new specification file in `specs/` based on the description provided. + +## Arguments + +$ARGUMENTS - Description of the feature to specify + +## Output + +Create a file `specs/[feature-name].md` with: + +```markdown +# [Feature Name] + +## Description +[What this feature does] + +## Acceptance Criteria +- [ ] Criterion 1 +- [ ] Criterion 2 +- [ ] ... + +## Technical Notes +[Implementation guidance based on project-specifications.txt] + +## Edge Cases +[How to handle failure scenarios] +``` + +Ensure acceptance criteria are specific, testable, and aligned with the project constitution. diff --git a/.claude/settings.local.json b/.claude/settings.local.json new file mode 100644 index 0000000..e0265de --- /dev/null +++ b/.claude/settings.local.json @@ -0,0 +1,10 @@ +{ + "permissions": { + "allow": [ + "WebFetch(domain:github.com)", + "WebFetch(domain:raw.githubusercontent.com)", + "Bash(git ls-remote:*)", + "Bash(git add:*)" + ] + } +} diff --git a/.cursor/commands/speckit.specify.md b/.cursor/commands/speckit.specify.md new file mode 100644 index 0000000..13d65ae --- /dev/null +++ b/.cursor/commands/speckit.specify.md @@ -0,0 +1,31 @@ +# Create Specification + +Create a new specification file in `specs/` based on the description provided. + +## Arguments + +$ARGUMENTS - Description of the feature to specify + +## Output + +Create a file `specs/[feature-name].md` with: + +```markdown +# [Feature Name] + +## Description +[What this feature does] + +## Acceptance Criteria +- [ ] Criterion 1 +- [ ] Criterion 2 +- [ ] ... + +## Technical Notes +[Implementation guidance based on project-specifications.txt] + +## Edge Cases +[How to handle failure scenarios] +``` + +Ensure acceptance criteria are specific, testable, and aligned with the project constitution. diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..da92de8 --- /dev/null +++ b/.gitignore @@ -0,0 +1,56 @@ +# Xcode +build/ +DerivedData/ +*.xcodeproj/project.xcworkspace/ +*.xcodeproj/xcuserdata/ +*.xcworkspace/xcuserdata/ +*.pbxuser +*.mode1v3 +*.mode2v3 +*.perspectivev3 +!default.pbxuser +!default.mode1v3 +!default.mode2v3 +!default.perspectivev3 +xcuserdata/ + +# Swift Package Manager +.build/ +.swiftpm/ +Package.resolved + +# CocoaPods (if ever used) +Pods/ + +# Fastlane +fastlane/report.xml +fastlane/Preview.html +fastlane/screenshots/**/*.png +fastlane/test_output + +# macOS +.DS_Store +.AppleDouble +.LSOverride +._* + +# Thumbnails +Thumbs.db + +# IDE +*.swp +*.swo +*~ +.idea/ +*.iml + +# Logs +logs/*.log + +# Ralph Wiggum +history/ +logs/ + +# Temporary files +*.tmp +*.temp diff --git a/.specify/memory/constitution.md b/.specify/memory/constitution.md new file mode 100644 index 0000000..285c5d2 --- /dev/null +++ b/.specify/memory/constitution.md @@ -0,0 +1,72 @@ +# CheapRetouch Constitution + +**Version**: ralph-wiggum@81231ca4e7466d84e3908841e9ed3d08e8c0803e +**Created**: 2026-01-23 + +## Project Vision + +On-device iOS photo editor for removing unwanted elements using only Apple's public frameworks and classical image processing. No custom ML models. 
+ +**Platform**: iOS 17.0+ + +**Core Capabilities**: +- Person removal (Vision handles this well) +- Foreground object removal (user-initiated, Vision-assisted) +- Wire/line removal (geometric contour detection) + +## Core Principles + +1. **Privacy-first**: All processing on-device, no network calls for core functionality, no analytics or telemetry +2. **Graceful fallbacks**: Always provide manual alternatives (brush, line brush) when auto-detection fails - the user is never stuck +3. **Performance matters**: Fast previews (<300ms), responsive UI, efficient memory management (1.5GB budget) + +## Technical Stack + +| Layer | Framework | Purpose | +|-------|-----------|---------| +| UI | SwiftUI + UIKit interop | Canvas, tools, state management | +| Masking | Vision | `VNGenerateForegroundInstanceMaskRequest`, `VNDetectContoursRequest` | +| Subject Interaction | VisionKit | `ImageAnalyzer`, `ImageAnalysis`, `ImageAnalysisInteraction` | +| Inpainting | Metal (custom) | Patch-based synthesis, mask feathering, blending | +| Compositing | Core Image | Color adjustments, preview pipeline | +| Fallback Processing | Accelerate/vImage | Simulator, older devices without Metal | + +## Autonomy Settings + +- **YOLO Mode**: ENABLED - Make implementation decisions autonomously +- **Git Autonomy**: ENABLED - Commit and push completed work automatically + +## Operational Guidelines + +### When Implementing Features + +1. Read the spec completely before starting +2. Follow the project structure defined in the spec +3. Implement the primary path first, then fallbacks +4. Ensure all edge cases from the spec are handled +5. Write tests as specified (unit, snapshot, UI, performance, memory) + +### Code Quality + +- Non-destructive editing: original image never modified +- Full undo/redo support via operation stack +- All operations must be Codable for persistence +- Memory management: release intermediate textures aggressively +- Tile-based processing for images > 12MP + +### User Experience + +- Clear feedback states (processing, no detection, success) +- Contextual inspector panel based on active tool +- Accessibility: VoiceOver labels, Dynamic Type, Reduce Motion support + +### What NOT To Do + +- Do not add ML models or custom neural networks +- Do not add features explicitly marked as out of scope +- Do not make network calls for core functionality +- Do not over-engineer beyond what the spec requires + +## Completion Signal + +Output `DONE` only when a specification is 100% complete with all acceptance criteria met. Never output this prematurely. diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000..9630c42 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,18 @@ +# Agent Instructions + +Read and follow the constitution at `.specify/memory/constitution.md` for all project guidelines, principles, and technical decisions. + +## Quick Reference + +- **Project**: CheapRetouch - iOS photo editor for removing unwanted elements +- **Platform**: iOS 17.0+ +- **Stack**: SwiftUI, Vision, VisionKit, Metal, Core Image +- **YOLO Mode**: Enabled +- **Git Autonomy**: Enabled + +## Workflow + +1. Check `specs/` for incomplete specifications +2. Implement each spec fully before moving to the next +3. Commit and push when spec is complete +4. 
Output `DONE` only when 100% complete diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000..61fec5d --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,18 @@ +# Claude Instructions + +Read and follow the constitution at `.specify/memory/constitution.md` for all project guidelines, principles, and technical decisions. + +## Quick Reference + +- **Project**: CheapRetouch - iOS photo editor for removing unwanted elements +- **Platform**: iOS 17.0+ +- **Stack**: SwiftUI, Vision, VisionKit, Metal, Core Image +- **YOLO Mode**: Enabled +- **Git Autonomy**: Enabled + +## Workflow + +1. Check `specs/` for incomplete specifications +2. Implement each spec fully before moving to the next +3. Commit and push when spec is complete +4. Output `DONE` only when 100% complete diff --git a/PROMPT_build.md b/PROMPT_build.md new file mode 100644 index 0000000..29b2a0a --- /dev/null +++ b/PROMPT_build.md @@ -0,0 +1,28 @@ +# Build Mode Instructions + +You are implementing CheapRetouch, an iOS photo editor for removing unwanted elements. + +## Your Task + +1. Read `.specify/memory/constitution.md` for project guidelines +2. Find the next incomplete spec in `specs/` +3. Implement it fully, following all acceptance criteria +4. Run tests to verify completion +5. Commit and push (Git Autonomy is enabled) +6. Output `DONE` when the spec is 100% complete + +## Implementation Guidelines + +- Follow the project structure in `project-specifications.txt` +- Implement primary detection path first, then fallbacks +- Handle all edge cases specified +- Write tests as required (unit, snapshot, UI, performance) +- Keep memory under 1.5GB budget +- Ensure preview renders < 300ms on A14 baseline + +## Do Not + +- Add ML models or neural networks +- Add features marked as out of scope +- Make network calls for core functionality +- Output `DONE` until truly complete diff --git a/project-specifications.txt b/project-specifications.txt new file mode 100644 index 0000000..503fb16 --- /dev/null +++ b/project-specifications.txt @@ -0,0 +1,331 @@ +## CheapRetouch — Revised Specification + +### Project Overview + +**Platform**: iOS 17.0+ + +**Objective**: On-device photo editor for removing unwanted elements using only Apple's public frameworks and classical image processing. No custom ML models. + +**Core Capabilities** (achievable without ML): +- Person removal (Vision handles this well) +- Foreground object removal (user-initiated, Vision-assisted) +- Wire/line removal (geometric contour detection) + +**Removed from Scope**: +- Automatic fence/mesh detection (requires semantic understanding) +- Automatic identification of object types (trash cans, stop signs, etc.) + +--- + +### Technical Stack + +| Layer | Framework | Purpose | +|-------|-----------|---------| +| UI | SwiftUI + UIKit interop | Canvas, tools, state management | +| Masking | Vision | `VNGenerateForegroundInstanceMaskRequest`, `VNDetectContoursRequest` | +| Subject Interaction | VisionKit | `ImageAnalyzer`, `ImageAnalysis`, `ImageAnalysisInteraction` | +| Inpainting | Metal (custom) | Patch-based synthesis, mask feathering, blending | +| Compositing | Core Image | Color adjustments, preview pipeline | +| Fallback Processing | Accelerate/vImage | Simulator, older devices without Metal | + +--- + +### Features + +#### 1. Person Removal + +**How it works**: +1. User taps a person in the photo +2. `VNGenerateForegroundInstanceMaskRequest` generates a precise mask +3. Mask is dilated and feathered +4. 
Custom Metal inpainting fills the region from surrounding context + +**Why this works**: Vision's person segmentation is robust and well-documented for iOS 17+. + +**User flow**: +``` +Tap person → Mask preview shown → Confirm → Inpaint → Done +``` + +**Edge cases**: +- Multiple people: user taps each individually or uses "select all people" option +- Partial occlusion: Vision still provides usable mask; user can refine with brush +- No person detected: show "No person found at tap location" feedback + +--- + +#### 2. Foreground Object Removal + +**How it works**: +1. User taps an object +2. `VNGenerateForegroundInstanceMaskRequest` attempts to isolate it +3. If successful, mask is used for inpainting +4. If Vision returns no mask (object not salient), fall back to smart brush + +**Smart brush fallback**: +- User paints rough selection over object +- App refines selection to nearest strong edges using gradient magnitude analysis +- User confirms refined mask + +**Why this works**: Vision detects visually distinct foreground regions. It doesn't know *what* the object is, but it can separate it from the background if there's sufficient contrast. + +**Limitations** (be explicit with users): +- Works best on objects that stand out from their background +- Low-contrast objects require manual brush selection +- App cannot identify object types — it sees shapes, not meanings + +**User flow**: +``` +Tap object → Vision attempts mask + ├─ Success → Mask preview → Confirm → Inpaint + └─ Failure → "Use brush to select" prompt → User paints → Edge refinement → Confirm → Inpaint +``` + +--- + +#### 3. Wire & Line Removal + +**How it works**: +1. User taps near a wire or line +2. `VNDetectContoursRequest` returns all detected contours +3. App scores contours by: + - Proximity to tap point + - Aspect ratio (thin and elongated) + - Straightness / low curvature + - Length (longer scores higher) +4. Best-scoring contour becomes mask +5. Mask is expanded to configurable width (default 6px, range 2–20px) +6. Inpaint along the mask + +**Line brush fallback**: +When contour detection fails (low contrast, busy background): +- User switches to "Line brush" tool +- User draws along the wire +- App maintains consistent stroke width +- Stroke becomes mask for inpainting + +**Why this works**: Power lines against sky have strong edges that `VNDetectContoursRequest` captures reliably. The scoring heuristics select the most "wire-like" contour. + +**Limitations**: +- High-contrast lines (sky background): works well +- Low-contrast lines (against buildings, trees): requires manual line brush +- Curved wires: contour detection still works; scoring allows moderate curvature + +**User flow**: +``` +Tap near wire → Contour analysis + ├─ Match found → Highlight line → Confirm → Inpaint + └─ No match → "Use line brush" prompt → User draws → Inpaint +``` + +--- + +### Inpainting Engine (Metal) + +Since there's no public Apple API for content-aware fill, you must implement this yourself. + +**Algorithm**: Exemplar-based inpainting (Criminisi-style) + +**Why this approach**: +- Deterministic (same input → same output) +- Handles textures reasonably well +- No ML required +- Well-documented in academic literature + +**Pipeline**: +``` +1. Input: source image + binary mask +2. Dilate mask by 2–4px (capture edge pixels) +3. Feather mask edges (gaussian blur on alpha) +4. Build image pyramid (for preview vs export) +5. For each pixel on mask boundary (priority order): + a. Find best matching patch from known region + b. 
Copy patch into unknown region + c. Update boundary +6. Final edge-aware blend to reduce seams +``` + +**Performance targets**: + +| Resolution | Target Time | Device Baseline | +|------------|-------------|-----------------| +| Preview (2048px) | < 300ms | iPhone 12 / A14 | +| Export (12MP) | < 4 seconds | iPhone 12 / A14 | +| Export (48MP) | < 12 seconds | iPhone 15 Pro / A17 | + +**Memory management**: +- Tile-based processing for images > 12MP +- Peak memory budget: 1.5GB +- Release intermediate textures aggressively + +--- + +### Data Model (Non-Destructive Editing) + +**Principles**: +- Original image is never modified +- All edits stored as an operation stack +- Full undo/redo support + +**Operation types**: +```swift +enum EditOperation: Codable { + case mask(MaskOperation) + case inpaint(InpaintOperation) + case adjustment(AdjustmentOperation) +} + +struct MaskOperation: Codable { + let id: UUID + let toolType: ToolType // .person, .object, .wire, .brush + let maskData: Data // compressed R8 texture + let timestamp: Date +} + +struct InpaintOperation: Codable { + let id: UUID + let maskOperationId: UUID + let patchRadius: Int + let featherAmount: Float + let timestamp: Date +} +``` + +**Persistence**: +- Project saved as JSON (operation stack) + original image reference +- Store PHAsset local identifier when sourced from Photos +- Store embedded image data when imported from Files +- Cached previews marked `isExcludedFromBackup = true` + +--- + +### UI Specification + +**Main Canvas**: +- Pinch to zoom, pan to navigate +- Mask overlay toggle (red tint / marching ants / hidden) +- Before/after comparison (long press or toggle) + +**Toolbar**: + +| Tool | Icon | Behavior | +|------|------|----------| +| Person | 👤 | Tap to select/remove people | +| Object | ⬭ | Tap to select foreground objects | +| Wire | ⚡ | Tap to select lines/wires | +| Brush | 🖌 | Manual selection for fallback | +| Undo | ↩ | Step back in operation stack | +| Redo | ↪ | Step forward in operation stack | + +**Inspector Panel** (contextual): +- Brush size slider (when brush active) +- Feather amount slider +- Mask expansion slider (for wire tool) +- "Refine edges" toggle + +**Feedback states**: +- Processing: show spinner on affected region +- No detection: toast message with fallback suggestion +- Success: brief checkmark animation + +--- + +### Error Handling + +| Scenario | Response | +|----------|----------| +| Vision returns no mask | "Couldn't detect object. Try the brush tool to select manually." | +| Vision returns low-confidence mask | Show mask preview with "Does this look right?" confirmation | +| Contour detection finds no lines | "No lines detected. Use the line brush to draw along the wire." | +| Inpaint produces visible seams | Offer "Refine" button that expands mask and re-runs | +| Memory pressure during export | "Image too large to process. Try cropping first." 
| +| Metal unavailable | Fall back to Accelerate with "Processing may be slower" warning | + +--- + +### Privacy & Permissions + +- All processing on-device +- No network calls for core functionality +- Photo library access via `PHPickerViewController` (limited access supported) +- Request write permission only when user saves +- No analytics or telemetry in core features + +--- + +### Accessibility + +- All tools labeled for VoiceOver +- Brush size adjustable via stepper (not just slider) +- High contrast mask visualization option +- Reduce Motion: disable transition animations +- Dynamic Type support in all UI text + +--- + +### Testing Requirements + +| Test Type | Coverage | +|-----------|----------| +| Unit | Edit stack operations, mask combination logic, contour scoring | +| Snapshot | Inpaint engine (reference images with known outputs) | +| UI | Full flow: import → edit → export | +| Performance | Render times on A14, A15, A17 devices | +| Memory | Peak usage during 48MP export | + +--- + +### Project Structure + +``` +CheapRetouch/ +├── App/ +│ └── CheapRetouchApp.swift +├── Features/ +│ ├── Editor/ +│ │ ├── PhotoEditorView.swift +│ │ ├── CanvasView.swift +│ │ └── ToolbarView.swift +│ └── Export/ +│ └── ExportView.swift +├── Services/ +│ ├── MaskingService.swift // Vision/VisionKit wrappers +│ ├── ContourService.swift // Line detection + scoring +│ └── InpaintEngine/ +│ ├── InpaintEngine.swift // Public interface +│ ├── Shaders.metal // Metal kernels +│ └── PatchMatch.swift // Algorithm implementation +├── Models/ +│ ├── EditOperation.swift +│ ├── Project.swift +│ └── MaskData.swift +├── Utilities/ +│ ├── ImagePipeline.swift // Preview/export rendering +│ └── EdgeRefinement.swift // Smart brush edge detection +└── Resources/ + └── Assets.xcassets +``` + +--- + +### What's Explicitly Out of Scope + +| Feature | Reason | +|---------|--------| +| Automatic fence/mesh detection | Requires semantic understanding (ML) | +| Object type identification | Requires classification (ML) | +| "Find all X in photo" | Requires semantic search (ML) | +| Blemish/skin retouching | Removed to keep scope focused; could add later | +| Background replacement | Different feature set; out of scope for v1 | + +--- + +### Summary + +This spec delivers three solid features using only public APIs: + +1. **Person removal** — Vision handles the hard part +2. **Object removal** — Vision-assisted with brush fallback +3. **Wire removal** — Contour detection with line brush fallback + +Each feature has a clear primary path and a fallback for when detection fails. The user is never stuck. diff --git a/specs/01-project-setup.md b/specs/01-project-setup.md new file mode 100644 index 0000000..f46a3d4 --- /dev/null +++ b/specs/01-project-setup.md @@ -0,0 +1,31 @@ +# Project Setup + +## Description +Initialize the Xcode project with the correct structure, targets, and dependencies for CheapRetouch. 
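As a rough illustration of the SwiftUI App lifecycle this setup targets, here is a minimal sketch of the `@main` entry point. The placeholder `ContentView` is only there so the sketch compiles on its own; later specs replace it with the real editor root.

```swift
import SwiftUI

// App/CheapRetouchApp.swift — minimal SwiftUI lifecycle entry point (sketch).
@main
struct CheapRetouchApp: App {
    var body: some Scene {
        WindowGroup {
            // Placeholder root view; the editor screen from a later spec goes here.
            ContentView()
        }
    }
}

struct ContentView: View {
    var body: some View {
        Text("CheapRetouch")
    }
}
```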
+ +## Acceptance Criteria +- [ ] Xcode project created with iOS 17.0 deployment target +- [ ] Project structure matches specification: + ``` + CheapRetouch/ + ├── App/CheapRetouchApp.swift + ├── Features/Editor/ + ├── Features/Export/ + ├── Services/ + ├── Services/InpaintEngine/ + ├── Models/ + ├── Utilities/ + └── Resources/Assets.xcassets + ``` +- [ ] SwiftUI app lifecycle configured +- [ ] Metal capability added to project +- [ ] Photo library usage description added to Info.plist +- [ ] App builds and runs on iOS 17 simulator + +## Technical Notes +- Use SwiftUI App lifecycle (`@main` struct) +- No external dependencies - Apple frameworks only +- Ensure Metal framework is linked + +## Edge Cases +- None for setup phase diff --git a/specs/02-data-model.md b/specs/02-data-model.md new file mode 100644 index 0000000..c0d5a2c --- /dev/null +++ b/specs/02-data-model.md @@ -0,0 +1,31 @@ +# Data Model & Edit Operations + +## Description +Implement the non-destructive editing data model with operation stack for full undo/redo support. + +## Acceptance Criteria +- [ ] `EditOperation` enum implemented with cases: `.mask`, `.inpaint`, `.adjustment` +- [ ] `MaskOperation` struct with: id, toolType, maskData (compressed R8), timestamp +- [ ] `InpaintOperation` struct with: id, maskOperationId, patchRadius, featherAmount, timestamp +- [ ] `ToolType` enum with cases: `.person`, `.object`, `.wire`, `.brush` +- [ ] `Project` model that holds: + - Original image reference (PHAsset identifier or embedded Data) + - Operation stack (array of EditOperation) + - Current stack position for undo/redo +- [ ] All models conform to `Codable` +- [ ] Undo operation decrements stack position +- [ ] Redo operation increments stack position +- [ ] Project can be serialized to/from JSON +- [ ] Unit tests for operation stack logic + +## Technical Notes +- Original image is NEVER modified +- Mask data should be compressed (R8 texture format) +- Store PHAsset `localIdentifier` for Photos-sourced images +- Store embedded image data for Files-imported images +- Cached previews should set `isExcludedFromBackup = true` + +## Edge Cases +- Undo at beginning of stack: no-op, return false +- Redo at end of stack: no-op, return false +- Empty operation stack: valid state, shows original image diff --git a/specs/03-inpaint-engine.md b/specs/03-inpaint-engine.md new file mode 100644 index 0000000..bf9235b --- /dev/null +++ b/specs/03-inpaint-engine.md @@ -0,0 +1,43 @@ +# Inpainting Engine (Metal) + +## Description +Implement exemplar-based inpainting (Criminisi-style) using Metal for content-aware fill of masked regions. 
+ +## Acceptance Criteria +- [ ] `InpaintEngine` class with public interface: + - `func inpaint(image: CGImage, mask: CGImage) async throws -> CGImage` + - `func inpaintPreview(image: CGImage, mask: CGImage) async throws -> CGImage` +- [ ] Metal shaders in `Shaders.metal` for: + - Mask dilation (2-4px configurable) + - Mask feathering (gaussian blur on alpha) + - Patch matching (find best match from known region) + - Patch copying (fill unknown region) + - Edge-aware blending (reduce seams) +- [ ] `PatchMatch.swift` implementing the algorithm: + - Build image pyramid for preview vs export + - Priority-based boundary pixel processing + - Best-matching patch search + - Boundary update after each patch copy +- [ ] Performance targets met: + - Preview (2048px): < 300ms on A14 + - Export (12MP): < 4 seconds on A14 + - Export (48MP): < 12 seconds on A17 Pro +- [ ] Memory management: + - Tile-based processing for images > 12MP + - Peak memory < 1.5GB + - Intermediate textures released aggressively +- [ ] Accelerate/vImage fallback when Metal unavailable +- [ ] Snapshot tests with reference images verifying output quality + +## Technical Notes +- Criminisi algorithm paper: "Region Filling and Object Removal by Exemplar-Based Inpainting" +- Patch size typically 9x9 or 11x11 pixels +- Priority = confidence × data term (edge strength) +- Search region can be limited for performance +- Use MTLHeap for efficient texture allocation + +## Edge Cases +- Metal unavailable: fall back to Accelerate with warning toast +- Memory pressure during export: throw error with "Image too large" message +- Very large mask (>50% of image): may produce poor results, warn user +- Mask touches image edge: handle boundary conditions in patch search diff --git a/specs/04-masking-service.md b/specs/04-masking-service.md new file mode 100644 index 0000000..cd91257 --- /dev/null +++ b/specs/04-masking-service.md @@ -0,0 +1,28 @@ +# Masking Service + +## Description +Wrapper around Vision framework for generating masks from user taps and contour detection. 
+ +## Acceptance Criteria +- [ ] `MaskingService` class with methods: + - `func generatePersonMask(at point: CGPoint, in image: CGImage) async throws -> CGImage?` + - `func generateForegroundMask(at point: CGPoint, in image: CGImage) async throws -> CGImage?` + - `func detectContours(in image: CGImage) async throws -> [VNContour]` +- [ ] Uses `VNGenerateForegroundInstanceMaskRequest` for person/object masks +- [ ] Uses `VNDetectContoursRequest` for wire/line detection +- [ ] Mask dilation method: `func dilate(mask: CGImage, by pixels: Int) -> CGImage` +- [ ] Mask feathering method: `func feather(mask: CGImage, amount: Float) -> CGImage` +- [ ] Returns `nil` when no mask detected at tap location (not an error) +- [ ] Unit tests for mask operations + +## Technical Notes +- `VNGenerateForegroundInstanceMaskRequest` requires iOS 17+ +- Point coordinates must be normalized (0-1) for Vision requests +- Instance masks can identify multiple separate foreground objects +- Use `indexesOfInstancesContainingPoint` to find which instance was tapped + +## Edge Cases +- No person/object at tap location: return nil, caller shows fallback UI +- Multiple overlapping instances: return the one containing the tap point +- Vision request fails: throw descriptive error +- Image orientation: ensure coordinates are transformed correctly diff --git a/specs/05-contour-service.md b/specs/05-contour-service.md new file mode 100644 index 0000000..9a7f9ac --- /dev/null +++ b/specs/05-contour-service.md @@ -0,0 +1,30 @@ +# Contour Service + +## Description +Service for detecting and scoring wire/line contours from Vision results. + +## Acceptance Criteria +- [ ] `ContourService` class with methods: + - `func findBestWireContour(at point: CGPoint, from contours: [VNContour]) -> VNContour?` + - `func scoreContour(_ contour: VNContour, relativeTo point: CGPoint) -> Float` + - `func contourToMask(_ contour: VNContour, width: Int, imageSize: CGSize) -> CGImage` +- [ ] Contour scoring considers: + - Proximity to tap point (closer = higher score) + - Aspect ratio (thin and elongated = higher score) + - Straightness / low curvature (straighter = higher score) + - Length (longer = higher score) +- [ ] Configurable mask width: default 6px, range 2-20px +- [ ] Returns `nil` when no wire-like contour found near tap +- [ ] Unit tests for contour scoring logic + +## Technical Notes +- VNContour provides normalized path points +- Calculate curvature by analyzing angle changes along path +- Aspect ratio = length / average width +- Weight scoring factors: proximity (0.3), aspect (0.3), straightness (0.2), length (0.2) + +## Edge Cases +- No contours detected: return nil +- All contours score below threshold: return nil +- Curved wires: allow moderate curvature, don't require perfectly straight +- Contour very close to tap but not wire-like: score should be low diff --git a/specs/06-canvas-view.md b/specs/06-canvas-view.md new file mode 100644 index 0000000..180cef3 --- /dev/null +++ b/specs/06-canvas-view.md @@ -0,0 +1,34 @@ +# Canvas View + +## Description +Main editing canvas with pinch-to-zoom, pan, mask overlay, and before/after comparison. 
+ +## Acceptance Criteria +- [ ] `CanvasView` SwiftUI view displaying the current edited image +- [ ] Pinch-to-zoom gesture with smooth animation +- [ ] Pan gesture for navigation when zoomed +- [ ] Zoom limits: 1x to 10x +- [ ] Mask overlay modes: + - Red tint (50% opacity red on masked areas) + - Marching ants (animated dashed border) + - Hidden (no overlay) +- [ ] Toggle between overlay modes via UI control +- [ ] Before/after comparison: + - Long press shows original image + - Release returns to edited version + - Optional toggle button for sticky comparison +- [ ] Renders at appropriate resolution for current zoom level +- [ ] Smooth 60fps interaction on A14 devices +- [ ] UI tests for gesture interactions + +## Technical Notes +- Use `MagnificationGesture` and `DragGesture` simultaneously +- Consider using UIKit interop (`UIViewRepresentable`) for smoother gestures if needed +- Mask overlay should be composited efficiently (don't re-render full image) +- Use `drawingGroup()` or Metal for overlay rendering if performance issues + +## Edge Cases +- Zoom at image boundary: clamp pan to keep image visible +- Very large image: use tiled rendering or lower resolution preview +- No edits yet: before/after shows same image (no-op) +- Rapid gesture changes: debounce if needed to prevent jank diff --git a/specs/07-toolbar-view.md b/specs/07-toolbar-view.md new file mode 100644 index 0000000..02d482c --- /dev/null +++ b/specs/07-toolbar-view.md @@ -0,0 +1,37 @@ +# Toolbar View + +## Description +Tool selection toolbar with contextual inspector panel. + +## Acceptance Criteria +- [ ] `ToolbarView` SwiftUI view with tool buttons: + - Person tool (👤 icon or SF Symbol) + - Object tool (circle/square icon) + - Wire tool (line/bolt icon) + - Brush tool (paintbrush icon) + - Undo button (arrow.uturn.backward) + - Redo button (arrow.uturn.forward) +- [ ] Selected tool highlighted visually +- [ ] Undo/redo buttons disabled when not available +- [ ] Contextual inspector panel appears based on active tool: + - Brush: size slider (1-100px) + - All tools: feather amount slider (0-20px) + - Wire tool: mask expansion slider (2-20px, default 6) + - Optional "Refine edges" toggle +- [ ] Inspector animates in/out smoothly +- [ ] All tools labeled for VoiceOver +- [ ] Brush size adjustable via stepper (accessibility) +- [ ] Dynamic Type support for any text labels +- [ ] UI tests for tool selection and inspector + +## Technical Notes +- Use SF Symbols for icons where possible +- Store selected tool in shared state (environment or binding) +- Inspector can be sheet, popover, or inline panel based on device +- Consider compact layout for smaller devices + +## Edge Cases +- No image loaded: tools disabled +- Processing in progress: tools disabled, show activity indicator +- Undo stack empty: undo button disabled +- Redo stack empty: redo button disabled diff --git a/specs/08-photo-editor-view.md b/specs/08-photo-editor-view.md new file mode 100644 index 0000000..7b805b9 --- /dev/null +++ b/specs/08-photo-editor-view.md @@ -0,0 +1,37 @@ +# Photo Editor View + +## Description +Main editor screen composing canvas, toolbar, and coordinating edit operations. 
+ +## Acceptance Criteria +- [ ] `PhotoEditorView` SwiftUI view containing: + - `CanvasView` for image display and interaction + - `ToolbarView` for tool selection + - Status/feedback area for messages +- [ ] Tap handling routed to appropriate service based on selected tool: + - Person tool → MaskingService.generatePersonMask + - Object tool → MaskingService.generateForegroundMask + - Wire tool → ContourService.findBestWireContour + - Brush tool → direct drawing on mask layer +- [ ] Mask preview shown after detection, before inpainting +- [ ] Confirm/cancel buttons for mask preview +- [ ] On confirm: InpaintEngine processes, result added to operation stack +- [ ] Feedback states implemented: + - Processing: spinner overlay on affected region + - No detection: toast with fallback suggestion + - Success: brief checkmark animation +- [ ] Undo/redo triggers re-render from operation stack +- [ ] State persisted when app backgrounds +- [ ] Full flow UI test: import → edit → confirm + +## Technical Notes +- Use `@StateObject` or `@ObservedObject` for editor state +- Coordinate space conversion between view and image coordinates +- Show mask preview as overlay before committing +- Processing should be async to keep UI responsive + +## Edge Cases +- Tap during processing: ignore or queue +- App backgrounded during processing: complete in background if possible +- Memory warning during processing: cancel gracefully, show error +- User cancels mask preview: discard mask, return to ready state diff --git a/specs/09-person-removal.md b/specs/09-person-removal.md new file mode 100644 index 0000000..587caea --- /dev/null +++ b/specs/09-person-removal.md @@ -0,0 +1,33 @@ +# Person Removal Feature + +## Description +Tap-to-remove people from photos using Vision's person segmentation. + +## Acceptance Criteria +- [ ] User taps person in photo with Person tool selected +- [ ] `VNGenerateForegroundInstanceMaskRequest` generates mask for tapped person +- [ ] Mask preview shown with red tint overlay +- [ ] Mask automatically dilated by 2-4px to capture edge pixels +- [ ] User can adjust feather amount before confirming +- [ ] On confirm: mask feathered and passed to InpaintEngine +- [ ] Inpainted result displayed, operation added to stack +- [ ] "Select all people" option available when multiple people detected +- [ ] Multiple people can be removed one at a time +- [ ] Partial occlusion handled (Vision provides usable mask) +- [ ] User can refine mask with brush tool if needed +- [ ] Error handling: + - No person at tap: "No person found at tap location" toast + - Low confidence mask: "Does this look right?" 
confirmation with preview +- [ ] Performance: mask generation < 500ms, inpaint per spec targets + +## Technical Notes +- Vision's person segmentation is robust on iOS 17+ +- Use `indexesOfInstancesContainingPoint` to identify which person was tapped +- For "select all", combine masks from all detected instances +- Allow brush refinement by switching to brush tool with existing mask loaded + +## Edge Cases +- Person partially out of frame: mask what's visible +- Person behind object: Vision may include occluding object, allow brush refinement +- Very small person in photo: may not be detected, suggest zoom or brush +- Multiple overlapping people: tap selects frontmost, allow sequential removal diff --git a/specs/10-object-removal.md b/specs/10-object-removal.md new file mode 100644 index 0000000..1ad9a96 --- /dev/null +++ b/specs/10-object-removal.md @@ -0,0 +1,34 @@ +# Foreground Object Removal Feature + +## Description +Remove foreground objects via tap detection with smart brush fallback. + +## Acceptance Criteria +- [ ] User taps object in photo with Object tool selected +- [ ] `VNGenerateForegroundInstanceMaskRequest` attempts to isolate object +- [ ] If mask found: preview shown, user confirms, inpaint executes +- [ ] If no mask found: "Use brush to select" prompt displayed +- [ ] Smart brush fallback: + - User paints rough selection over object + - App refines selection to nearest strong edges + - Edge refinement uses gradient magnitude analysis + - Refined mask preview shown + - User confirms refined mask +- [ ] Brush tool settings: size slider (1-100px) +- [ ] Edge refinement toggle available +- [ ] Clear messaging about limitations: + - "Works best on objects that stand out from background" + - "Low-contrast objects may require manual selection" +- [ ] Performance: detection < 500ms, edge refinement < 200ms + +## Technical Notes +- Vision detects visually distinct foreground regions +- It separates by contrast, not semantic understanding +- Edge refinement: compute gradient magnitude, snap brush stroke to nearby edges +- `EdgeRefinement.swift` utility for gradient-based snapping + +## Edge Cases +- Object blends with background: Vision returns no mask, prompt brush +- Very large object: may affect inpaint quality, warn if >30% of image +- Object at image edge: handle boundary in mask and inpaint +- User brush stroke misses object: edge refinement helps, but may need retry diff --git a/specs/11-wire-removal.md b/specs/11-wire-removal.md new file mode 100644 index 0000000..5ba2679 --- /dev/null +++ b/specs/11-wire-removal.md @@ -0,0 +1,36 @@ +# Wire & Line Removal Feature + +## Description +Remove power lines and wires using contour detection with line brush fallback. 
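To illustrate the gradient-based refinement behind the smart brush (the `EdgeRefinement.swift` utility in the project structure), here is a simplified sketch that nudges a brush sample toward the strongest edge within a small radius. The radius and threshold values are placeholders, and a production version would work on the GPU or with vImage rather than a plain Swift array.

```swift
import CoreGraphics

// Sketch of "snap the brush stroke to a nearby strong edge" using gradient magnitude.
struct EdgeSnapSketch {
    let width: Int, height: Int
    let luminance: [Float]   // row-major grayscale, 0–1

    private func gradientMagnitude(x: Int, y: Int) -> Float {
        guard x > 0, x < width - 1, y > 0, y < height - 1 else { return 0 }
        let gx = luminance[y * width + x + 1] - luminance[y * width + x - 1]
        let gy = luminance[(y + 1) * width + x] - luminance[(y - 1) * width + x]
        return (gx * gx + gy * gy).squareRoot()
    }

    /// Moves a brush sample toward the strongest edge within `radius` pixels,
    /// or returns it unchanged when no gradient exceeds `threshold`.
    func snapped(_ point: CGPoint, radius: Int = 8, threshold: Float = 0.15) -> CGPoint {
        var best = point
        var bestMagnitude = threshold
        let cx = Int(point.x), cy = Int(point.y)
        for dy in -radius...radius {
            for dx in -radius...radius {
                let x = cx + dx, y = cy + dy
                let magnitude = gradientMagnitude(x: x, y: y)
                if magnitude > bestMagnitude {
                    bestMagnitude = magnitude
                    best = CGPoint(x: x, y: y)
                }
            }
        }
        return best
    }
}
```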
+ +## Acceptance Criteria +- [ ] User taps near wire with Wire tool selected +- [ ] `VNDetectContoursRequest` returns all detected contours +- [ ] ContourService scores contours and selects best wire-like match: + - Proximity to tap point + - Thin/elongated aspect ratio + - Straightness (low curvature) + - Length +- [ ] Best contour highlighted for preview +- [ ] Mask width configurable: default 6px, range 2-20px via slider +- [ ] User confirms, mask expanded to configured width, inpaint executes +- [ ] Line brush fallback when no contour detected: + - User switches to "Line brush" mode + - User draws along wire + - Consistent stroke width maintained + - Stroke becomes mask for inpainting +- [ ] Multiple wires can be removed sequentially +- [ ] Performance: contour detection < 300ms + +## Technical Notes +- Power lines against sky have strong edges - ideal case +- Scoring weights: proximity 0.3, aspect 0.3, straightness 0.2, length 0.2 +- Line brush should use touch velocity for smooth strokes +- Consider Catmull-Rom spline for smooth line brush paths + +## Edge Cases +- No wire-like contours found: "No lines detected. Use line brush" prompt +- Wire against busy background (buildings, trees): likely needs line brush +- Curved wires (drooping power lines): scoring allows moderate curvature +- Wire crosses entire image: may need to process in segments +- Multiple parallel wires: each tap selects one, remove sequentially diff --git a/specs/12-image-import.md b/specs/12-image-import.md new file mode 100644 index 0000000..0270ace --- /dev/null +++ b/specs/12-image-import.md @@ -0,0 +1,29 @@ +# Image Import + +## Description +Import photos from Photo Library or Files app. + +## Acceptance Criteria +- [ ] Import button opens `PHPickerViewController` +- [ ] Limited photo library access supported (no full access required) +- [ ] Selected photo loaded into editor +- [ ] PHAsset `localIdentifier` stored for Photos-sourced images +- [ ] Files app import supported via document picker +- [ ] File-imported images embedded in project data +- [ ] Supported formats: JPEG, PNG, HEIC +- [ ] Large images (>12MP) handled with appropriate warnings +- [ ] Loading indicator shown during import +- [ ] Import errors handled gracefully with user feedback +- [ ] Privacy: no photo library write permission requested at import time + +## Technical Notes +- Use `PHPickerViewController` (not `UIImagePickerController`) for modern API +- Configure picker: `filter: .images`, `selectionLimit: 1` +- For Files: use `UIDocumentPickerViewController` with `UTType.image` +- Store original at full resolution, generate preview separately + +## Edge Cases +- User cancels picker: return to previous state, no error +- Corrupted/unreadable image: show "Unable to load image" error +- Very large image (48MP+): warn about potential memory issues +- Permission denied: show guidance to enable in Settings diff --git a/specs/13-export-view.md b/specs/13-export-view.md new file mode 100644 index 0000000..8c61dcc --- /dev/null +++ b/specs/13-export-view.md @@ -0,0 +1,33 @@ +# Export View + +## Description +Export edited photos at full resolution to Photo Library or Files. + +## Acceptance Criteria +- [ ] Export button in editor triggers export flow +- [ ] Full resolution render from operation stack +- [ ] Progress indicator during export rendering +- [ ] Export options: + - Save to Photo Library (requests write permission) + - Share sheet (AirDrop, Messages, etc.) 
+ - Save to Files +- [ ] Export formats: JPEG (quality slider 0.7-1.0), PNG +- [ ] HEIC output supported on compatible devices +- [ ] Metadata preserved from original where possible +- [ ] Success confirmation after save +- [ ] Export time within spec targets: + - 12MP: < 4 seconds + - 48MP: < 12 seconds +- [ ] Memory management: tile-based for large images + +## Technical Notes +- Request `PHPhotoLibrary.authorizationStatus(for: .addOnly)` for write +- Use `PHAssetChangeRequest.creationRequestForAsset` to save +- For share sheet, create temporary file, clean up after +- Consider background processing for very large exports + +## Edge Cases +- Export during low memory: show "Image too large, try cropping" error +- Permission denied: show guidance to enable in Settings +- Export cancelled mid-process: clean up partial work +- Disk space insufficient: detect and show appropriate error diff --git a/specs/14-brush-tool.md b/specs/14-brush-tool.md new file mode 100644 index 0000000..a501e31 --- /dev/null +++ b/specs/14-brush-tool.md @@ -0,0 +1,32 @@ +# Brush Tool + +## Description +Manual brush selection for fallback when automatic detection fails. + +## Acceptance Criteria +- [ ] Brush tool available in toolbar +- [ ] Brush size adjustable: 1-100px via slider +- [ ] Brush size also adjustable via stepper (accessibility) +- [ ] Touch draws on mask layer in real-time +- [ ] Smooth stroke rendering (interpolate between touch points) +- [ ] Brush preview circle follows finger +- [ ] Erase mode toggle to remove from mask +- [ ] Clear mask button to start over +- [ ] Edge refinement optional: snap brush strokes to nearby edges +- [ ] Mask preview shown in real-time as user paints +- [ ] When done painting, user taps "Done" to proceed to inpaint +- [ ] Pinch-to-zoom still works while brush active +- [ ] Brush works with Apple Pencil (pressure sensitivity optional) +- [ ] Performance: 60fps stroke rendering + +## Technical Notes +- Use Core Graphics or Metal for stroke rendering +- Interpolate touch points with quadratic curves for smoothness +- Edge refinement uses gradient magnitude from `EdgeRefinement.swift` +- Consider separate gesture recognizer for drawing vs zoom/pan + +## Edge Cases +- Very fast strokes: ensure no gaps between points +- Stroke at image edge: clamp to image bounds +- Accidental touch: undo single stroke or clear all +- Zoom while drawing: complete current stroke, then zoom diff --git a/specs/15-accessibility.md b/specs/15-accessibility.md new file mode 100644 index 0000000..1b63506 --- /dev/null +++ b/specs/15-accessibility.md @@ -0,0 +1,29 @@ +# Accessibility + +## Description +Ensure the app is fully accessible per Apple guidelines. 
+ +## Acceptance Criteria +- [ ] All tools labeled for VoiceOver with descriptive labels +- [ ] Tool actions announced: "Person tool selected", "Undo complete" +- [ ] Brush size adjustable via stepper (not just slider) +- [ ] High contrast mask visualization option in settings +- [ ] Mask overlay uses accessible colors (not red-green dependent) +- [ ] Reduce Motion support: disable transition animations when enabled +- [ ] Dynamic Type support in all UI text +- [ ] Minimum touch target size: 44x44 points +- [ ] Focus order logical for VoiceOver navigation +- [ ] Processing states announced: "Processing", "Complete" +- [ ] Error messages announced clearly +- [ ] Accessibility audit passes with no critical issues + +## Technical Notes +- Use `.accessibilityLabel()` and `.accessibilityHint()` modifiers +- Check `UIAccessibility.isReduceMotionEnabled` for animations +- Use `@ScaledMetric` for Dynamic Type-responsive sizing +- Test with VoiceOver, Voice Control, and Switch Control + +## Edge Cases +- Complex canvas gestures: provide alternative VoiceOver actions +- Image descriptions: consider describing detected content for blind users +- Color-only indicators: always pair with shape or text