# Multipeer Connectivity (MPC) Architecture

This document explains how AtTable uses Apple's Multipeer Connectivity framework to create a peer-to-peer mesh network for real-time communication between deaf and hearing users.

---

## Overview

AtTable uses **Multipeer Connectivity (MPC)** to establish direct device-to-device connections without requiring a central server. The app supports connections over:

- **Wi-Fi** (same network)
- **Peer-to-peer Wi-Fi** (AWDL, Apple Wireless Direct Link)
- **Bluetooth**

When devices aren't on the same Wi-Fi network (e.g., on 5G/cellular), MPC automatically falls back to **AWDL** for peer-to-peer discovery and data transfer.

---

## User Onboarding Flow

### 1. Initial Setup (`OnboardingView.swift`)

When a user launches the app, they:

1. Enter their **name**
2. Select their **role** (Deaf or Hearing)
3. Choose an **aura color** (for visual identity in the mesh)
4. Tap **"Start Conversation"** to enter the mesh

```
User launches app → OnboardingView → Enter details → ChatView (mesh starts)
```

### 2. Identity Generation (`NodeIdentity.swift`)

Upon first launch, the app generates a **stable Node Identity**:

- **`nodeID`**: A UUID persisted in UserDefaults (stable per app installation)
- **`instance`**: A monotonic counter that increments each time a session starts

This identity system allows the mesh to:

- Reliably identify users across reconnections
- Detect and filter "ghost" peers (stale connections from previous sessions)
- Handle device reboots gracefully

---

## Network Connection Process

### Discovery & Connection (`MultipeerSession.swift`)

When `ChatView` appears, it calls `multipeerSession.start()`, which:

1. **Sets up the MCSession** with encryption disabled (for faster AWDL connections)
2. **Starts browsing** for nearby peers using `MCNearbyServiceBrowser`
3. **Starts advertising** (after a 0.5s delay) using `MCNearbyServiceAdvertiser`

### Wi-Fi vs Cellular/5G Connections

| Network Type | Connection Method | Handshake Delay | Connection Time |
|--------------|-------------------|-----------------|-----------------|
| **Wi-Fi (same network)** | Infrastructure Wi-Fi | 0.5 seconds | Near-instant |
| **Cellular/5G** | AWDL (peer-to-peer Wi-Fi) | 1.5 seconds | Up to 60 seconds |

The app uses `NetworkMonitor.swift` to detect the current network type and adjusts the handshake delay accordingly:

```swift
let isWiFi = NetworkMonitor.shared.isWiFi
// Delay in seconds; slower on AWDL for stability
let delay: TimeInterval = isWiFi ? 0.5 : 1.5
```

### Deterministic Leader/Follower Protocol

To prevent connection races (both devices trying to invite each other at once), the app uses a **deterministic leader election** based on comparing node IDs:

```swift
if myNodeID > theirNodeID {
    // I am LEADER - I will send the invite
} else {
    // I am FOLLOWER - I wait for their invite
}
```

Because both devices evaluate the same comparison with the roles swapped, exactly one device initiates each connection.

---

## Handshake Protocol

Once connected at the socket level, devices exchange **handshake messages** containing:

```swift
struct MeshMessage {
    var senderNodeID: String     // Stable identity
    var senderInstance: Int      // Session counter (for ghost detection)
    var senderRole: UserRole     // Deaf or Hearing
    var senderColorHex: String   // Aura color
    var isHandshake: Bool        // Identifies this as handshake
}
```

The handshake:

1. Registers the peer in `connectedPeerUsers` for UI display
2. Starts a **15-second stability timer** before clearing failure counters
3. Maps the `MCPeerID` to the stable `nodeID` for reliable identification

---

## User Leaving the Conversation

### Explicit Leave (`ChatView.swift`)

When a user taps **"Leave"**:

```swift
Button(action: {
    speechRecognizer.stopRecording()  // Stop audio transcription
    multipeerSession.stop()           // Disconnect from mesh
    isOnboardingComplete = false      // Return to onboarding
}) {
    Text("Leave")  // Label assumed from the description above
}
```

### Disconnect Cleanup (`MultipeerSession.disconnect()`)

The `disconnect()` function performs complete cleanup:

1. **Cancel pending work**: Recovery tasks, connection timers
2. **Stop services**: Advertising and browsing
3. **Clear delegates**: Prevent zombie callbacks
4. **Disconnect session**: `session?.disconnect()`
5. **Clear all state**:
   - `connectedPeers` / `connectedPeerUsers`
   - `pendingInvites` / `latestByNodeID`
   - `cooldownUntil` / `consecutiveFailures`
6. **Stop keep-alive heartbeats**

### Partial Transcript Preservation

If a peer disconnects mid-speech, their **partial transcript is preserved** as a final message:

```swift
if let partialText = liveTranscripts[peerKey], !partialText.isEmpty {
    let finalMessage = MeshMessage(content: partialText, ...)
    receivedMessages.append(finalMessage)
}
```

---

## Rejoining the Conversation

### Identity Recovery

When a user returns to the conversation:

1. The app resets `isOnboardingComplete = false` on every launch (intentional, to force the Login screen)
2. The user completes onboarding again (name/role/color are preserved in `@AppStorage`)
3. `multipeerSession.start()` is called again

### Instance Increment

The key to reliable rejoining is the **instance counter**:

```swift
myInstance = NodeIdentity.nextInstance()  // Monotonically increasing
```

When other devices see the new instance:

1. **Ghost Detection**: Old connections with lower instances are rejected
2. **Cooldown Clear**: Any cooldowns from previous failures are removed
3. **Fresh Connect**: The leader initiates a new invitation

### Handling Stale Peers

The mesh uses multiple mechanisms to handle rejoins:

| Mechanism | Purpose |
|-----------|---------|
| **Ghost Filtering** | Reject messages/invites from older instances |
| **Cooldown Clear** | Give returning peers a fresh chance |
| **Half-Open Deadlock Fix** | If we think we're connected but they invite us, accept the new invite |
| **Stability Timer** | Only reset failure counts after 15s of stable connection |

---

## Keep-Alive & Mesh Health

### Heartbeat System

When connected, the mesh sends **heartbeats every 10 seconds**:

```swift
let message = MeshMessage(
    content: "💓",
    isKeepAlive: true,
    connectedNodeIDs: connectedPeerUsers.map { $0.nodeID }  // Gossip
)
```

### Gossip Protocol

Heartbeats include a list of connected peers, enabling **clique repair**:

1. Device A receives a heartbeat from Device B
2. If B knows Device C but A doesn't, A can proactively invite C
3. This heals mesh partitions without requiring everyone to be discoverable

---

## Connection Recovery

### Exponential Backoff

Failed connections trigger increasing cooldown periods:

```swift
// 0.5s → 1.0s → 2.0s → 4.0s → ... → max 30s
// `failures` is an Int, so convert it before calling the Double overload of pow
let delay = min(0.5 * pow(2.0, Double(failures - 1)), 30.0)
```

### Smart Retry

Instead of restarting everything, failed connections are retried individually:

1. Only the **leader** initiates retries (prevents race conditions)
2. Retries respect cooldown periods
3. After 5 consecutive failures → **"Poisoned State"** triggers full reset

### Poisoned State Recovery

If a peer has too many consecutive failures:

```swift
if failures >= 5 {
    restartServices(forcePoisonedRecovery: true)
    // Creates new MCPeerID, clears all cooldowns
}
```

---

## Summary

| Event | What Happens |
|-------|--------------|
| **User joins** | NodeID retrieved, instance incremented, advertise + browse started |
| **On Wi-Fi** | Fast handshake (0.5s), near-instant connections |
| **On 5G/Cellular** | AWDL used, slower handshake (1.5s), up to 60s to connect |
| **User leaves** | Full cleanup, partial transcripts preserved |
| **User rejoins** | New instance number, ghosts filtered, cooldowns cleared |
| **Connection fails** | Exponential backoff, smart retry by leader only |

The architecture prioritizes **reliability over speed**, using defensive mechanisms like ghost filtering, stability timers, and gossip-based clique repair to maintain mesh health despite the inherent unreliability of peer-to-peer wireless connections.
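
As a rough illustration of three mechanisms described in this document (leader election, ghost filtering, and exponential backoff), the following self-contained sketch models them as pure Swift logic with no Multipeer Connectivity dependency. The names `backoffDelay`, `isLeader`, and `GhostFilter` are hypothetical and do not come from the AtTable source; `latestByNodeID` is borrowed from the cleanup list above, but its actual type and usage in the app are assumptions.

```swift
import Foundation

/// Hypothetical helper: cooldown before the next retry, following the
/// 0.5s → 1.0s → 2.0s → ... → 30s schedule described under "Exponential Backoff".
func backoffDelay(failures: Int,
                  base: TimeInterval = 0.5,
                  cap: TimeInterval = 30.0) -> TimeInterval {
    guard failures > 0 else { return 0 }  // No failures yet, no cooldown
    return min(base * pow(2.0, Double(failures - 1)), cap)
}

/// Hypothetical helper: the peer whose nodeID compares greater sends the invite.
/// Both sides run the same comparison, so exactly one of them becomes leader.
func isLeader(myNodeID: String, theirNodeID: String) -> Bool {
    myNodeID > theirNodeID
}

/// Hypothetical ghost filter: remembers the highest instance seen per nodeID
/// and rejects anything that arrives with a lower instance.
struct GhostFilter {
    private(set) var latestByNodeID: [String: Int] = [:]

    /// Returns true if the message should be accepted, false if it is a ghost.
    mutating func accept(nodeID: String, instance: Int) -> Bool {
        if let latest = latestByNodeID[nodeID], instance < latest {
            return false  // Stale session from before the peer rejoined
        }
        latestByNodeID[nodeID] = instance
        return true
    }
}
```

Keeping these rules free of `MCSession` state is one possible design choice: it lets the timing and identity logic be unit-tested without real devices, while the delegate callbacks only feed values in and act on the results.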