Multipeer Connectivity (MPC) Architecture
This document explains how AtTable uses Apple's Multipeer Connectivity framework to create a peer-to-peer mesh network for real-time communication between deaf and hearing users.
Overview
AtTable uses Multipeer Connectivity (MPC) to establish direct device-to-device connections without requiring a central server. The app supports connections over:
- Wi-Fi (same network)
- Peer-to-peer Wi-Fi (AWDL - Apple Wireless Direct Link)
- Bluetooth
When devices aren't on the same Wi-Fi network (e.g., on 5G/cellular), MPC automatically falls back to AWDL for peer-to-peer discovery and data transfer.
User Onboarding Flow
1. Initial Setup (OnboardingView.swift)
When a user launches the app:
- They enter their name
- Select their role (Deaf or Hearing)
- Choose an aura color (for visual identity in the mesh)
- Tap "Start Conversation" to enter the mesh
User launches app → OnboardingView → Enter details → ChatView (mesh starts)
2. Identity Generation (NodeIdentity.swift)
Upon first launch, the app generates a stable Node Identity:
- nodeID: A UUID persisted in UserDefaults (stable per app installation)
- instance: A monotonic counter that increments each time a session starts
This identity system allows the mesh to:
- Reliably identify users across reconnections
- Detect and filter "ghost" peers (stale connections from previous sessions)
- Handle device reboots gracefully
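As a minimal sketch of the two identity fields described above, assuming a UserDefaults-backed store: the type name `NodeIdentitySketch` and both key strings are hypothetical and are not taken from NodeIdentity.swift.

```swift
import Foundation

/// Hypothetical sketch of a persisted node identity; key names are illustrative.
struct NodeIdentitySketch {
    private static let nodeIDKey = "sketch.nodeID"
    private static let instanceKey = "sketch.instance"

    let defaults: UserDefaults

    /// Returns the stored nodeID, creating and persisting one on first use.
    func nodeID() -> String {
        if let existing = defaults.string(forKey: Self.nodeIDKey) {
            return existing
        }
        let fresh = UUID().uuidString
        defaults.set(fresh, forKey: Self.nodeIDKey)
        return fresh
    }

    /// Increments and persists the session counter, returning the new value.
    func nextInstance() -> Int {
        // integer(forKey:) returns 0 when nothing is stored, so the first value is 1.
        let next = defaults.integer(forKey: Self.instanceKey) + 1
        defaults.set(next, forKey: Self.instanceKey)
        return next
    }
}
```

Because both values live in UserDefaults, the nodeID survives relaunches while the instance keeps climbing, which is what lets peers tell a fresh session from a stale one.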
Network Connection Process
Discovery & Connection (MultipeerSession.swift)
When ChatView appears, it calls multipeerSession.start(), which:
- Sets up the MCSession with encryption disabled (for faster AWDL connections)
- Starts browsing for nearby peers using MCNearbyServiceBrowser
- Starts advertising (after 0.5s delay) using MCNearbyServiceAdvertiser
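The setup steps above could look roughly like the following. This is a sketch, not the contents of MultipeerSession.swift: the class name `MeshServicesSketch` and the service type `"attable-mesh"` are assumptions, delegates are omitted, and the 0.5s advertising delay is left to the caller.

```swift
#if canImport(MultipeerConnectivity)
import MultipeerConnectivity

/// Hypothetical wrapper around the three MPC objects the text mentions.
final class MeshServicesSketch {
    let peerID: MCPeerID
    let session: MCSession
    let browser: MCNearbyServiceBrowser
    let advertiser: MCNearbyServiceAdvertiser

    /// serviceType is an assumed value (1-15 chars: lowercase letters, digits, hyphens).
    init(displayName: String, serviceType: String = "attable-mesh") {
        peerID = MCPeerID(displayName: displayName)
        // Encryption disabled, matching the description above.
        session = MCSession(peer: peerID,
                            securityIdentity: nil,
                            encryptionPreference: .none)
        browser = MCNearbyServiceBrowser(peer: peerID, serviceType: serviceType)
        advertiser = MCNearbyServiceAdvertiser(peer: peerID,
                                               discoveryInfo: nil,
                                               serviceType: serviceType)
    }

    /// Step 1: look for nearby peers.
    func startBrowsing() { browser.startBrowsingForPeers() }

    /// Step 2: become discoverable; the caller schedules this after its chosen delay.
    func startAdvertising() { advertiser.startAdvertisingPeer() }
}
#endif
```

A real session also needs delegates on the session, browser, and advertiser before any peers can be found or invited.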
Wi-Fi vs Cellular/5G Connections
| Network Type | Connection Method | Handshake Delay | Connection Time |
|---|---|---|---|
| Wi-Fi (same network) | Infrastructure Wi-Fi | 0.5 seconds | Near-instant |
| Cellular/5G | AWDL (peer-to-peer Wi-Fi) | 1.5 seconds | Up to 60 seconds |
The app uses NetworkMonitor.swift to detect the current network type and adjusts timing:
let isWiFi = NetworkMonitor.shared.isWiFi
let delay: TimeInterval = isWiFi ? 0.5 : 1.5 // Seconds; slower for AWDL stability
Deterministic Leader/Follower Protocol
To prevent connection races (both devices trying to invite each other), the app uses a deterministic leader election:
if myNodeID > theirNodeID {
    // I am LEADER - I will send the invite
} else {
    // I am FOLLOWER - I wait for their invite
}
This ensures exactly one device initiates each connection.
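The rule can be wrapped in a small pure function, which makes the symmetry easy to check. The names `ConnectionRole` and `connectionRole(myNodeID:theirNodeID:)` are illustrative, not from the app.

```swift
/// Hypothetical names; the comparison itself is the one shown above.
enum ConnectionRole { case leader, follower }

/// Decides who sends the invite. Both sides evaluate the same comparison
/// on the same pair of IDs, so they reach opposite answers.
func connectionRole(myNodeID: String, theirNodeID: String) -> ConnectionRole {
    myNodeID > theirNodeID ? .leader : .follower
}
```

If two devices ever shared a nodeID, both would become followers and neither would invite. Since the IDs are UUIDs, this sketch treats that case as not worth handling.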
Handshake Protocol
Once connected at the socket level, devices exchange handshake messages containing:
struct MeshMessage {
    var senderNodeID: String   // Stable identity
    var senderInstance: Int    // Session counter (for ghost detection)
    var senderRole: UserRole   // Deaf or Hearing
    var senderColorHex: String // Aura color
    var isHandshake: Bool      // Identifies this as handshake
}
The handshake:
- Registers the peer in connectedPeerUsers for UI display
- Starts a 15-second stability timer before clearing failure counters
- Maps the MCPeerID to the stable nodeID for reliable identification
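One plausible way to put such a handshake on the wire and record the peer mapping is shown below. It assumes JSON encoding via Codable, which the source does not confirm. `HandshakeMessage`, `PeerRegistry`, and the `UserRole` case names are hypothetical, and a plain String stands in for MCPeerID.

```swift
import Foundation

// Assumed case names; the source only says "Deaf or Hearing".
enum UserRole: String, Codable { case deaf, hearing }

/// Trimmed-down stand-in for the handshake fields listed above.
struct HandshakeMessage: Codable, Equatable {
    var senderNodeID: String
    var senderInstance: Int
    var senderRole: UserRole
    var senderColorHex: String
    var isHandshake: Bool
}

/// Maps a transport-level peer name to the stable nodeID from its handshake.
struct PeerRegistry {
    private(set) var nodeIDByPeerName: [String: String] = [:]

    mutating func register(peerName: String, handshake: HandshakeMessage) {
        nodeIDByPeerName[peerName] = handshake.senderNodeID
    }
}

func encodeHandshake(_ message: HandshakeMessage) throws -> Data {
    try JSONEncoder().encode(message)
}

func decodeHandshake(_ data: Data) throws -> HandshakeMessage {
    try JSONDecoder().decode(HandshakeMessage.self, from: data)
}
```

Keying later lookups by nodeID rather than by the transport peer keeps a user recognizable even if the underlying peer object is recreated.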
User Leaving the Conversation
Explicit Leave (ChatView.swift)
When a user taps "Leave":
Button(action: {
    speechRecognizer.stopRecording() // Stop audio transcription
    multipeerSession.stop()          // Disconnect from mesh
    isOnboardingComplete = false     // Return to onboarding
}) {
    Text("Leave")
}
Disconnect Cleanup (MultipeerSession.disconnect())
The disconnect() function performs complete cleanup:
- Cancel pending work: Recovery tasks, connection timers
- Stop services: Advertising and browsing
- Clear delegates: Prevent zombie callbacks
- Disconnect session: session?.disconnect()
- Clear all state:
  - connectedPeers / connectedPeerUsers
  - pendingInvites / latestByNodeID
  - cooldownUntil / consecutiveFailures
- Stop keep-alive heartbeats
Partial Transcript Preservation
If a peer disconnects mid-speech, their partial transcript is preserved as a final message:
if let partialText = liveTranscripts[peerKey], !partialText.isEmpty {
    let finalMessage = MeshMessage(content: partialText, ...)
    receivedMessages.append(finalMessage)
}
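A self-contained version of the same idea, with the elided MeshMessage fields replaced by a simplified stand-in type, might look like this. `ChatLine` and `TranscriptStore` are hypothetical names; only `liveTranscripts` and `receivedMessages` come from the snippet above.

```swift
/// Simplified stand-in for a chat message; not the app's MeshMessage.
struct ChatLine: Equatable {
    let senderKey: String
    let content: String
}

struct TranscriptStore {
    var liveTranscripts: [String: String] = [:]
    var receivedMessages: [ChatLine] = []

    /// Call when a peer drops: promotes any non-empty partial text to a
    /// final message and removes the live entry so it is not shown twice.
    mutating func peerDidDisconnect(peerKey: String) {
        if let partialText = liveTranscripts.removeValue(forKey: peerKey),
           !partialText.isEmpty {
            receivedMessages.append(ChatLine(senderKey: peerKey, content: partialText))
        }
    }
}
```

Removing the live entry in the same step is a choice made in this sketch; the source snippet only shows the append.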
Rejoining the Conversation
Identity Recovery
When a user returns to the conversation:
- App resets isOnboardingComplete = false on every launch (intentional - forces Login screen)
- User completes onboarding again (name/role/color preserved in @AppStorage)
- multipeerSession.start() called again
Instance Increment
The key to reliable rejoining is the instance counter:
myInstance = NodeIdentity.nextInstance() // Monotonically increasing
When other devices see the new instance:
- Ghost Detection: Old connections with lower instances are rejected
- Cooldown Clear: Any cooldowns from previous failures are removed
- Fresh Connect: The leader initiates a new invitation
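Ghost detection as described reduces to remembering the highest instance seen per nodeID. The sketch below assumes that is all the check needs; `GhostFilter` is a hypothetical type, and `latestByNodeID` borrows its name from the state listed in the cleanup section.

```swift
/// Hypothetical helper that tracks the newest session counter per node.
struct GhostFilter {
    private(set) var latestByNodeID: [String: Int] = [:]

    /// Returns true if traffic from this (nodeID, instance) should be accepted.
    /// Anything older than the newest instance seen so far is treated as a ghost.
    mutating func accept(nodeID: String, instance: Int) -> Bool {
        if let latest = latestByNodeID[nodeID], instance < latest {
            return false
        }
        latestByNodeID[nodeID] = instance
        return true
    }
}
```

The moment a strictly higher instance is first accepted is also a natural place to clear that peer's cooldown, though this sketch leaves that out.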
Handling Stale Peers
The mesh uses multiple mechanisms to handle rejoins:
| Mechanism | Purpose |
|---|---|
| Ghost Filtering | Reject messages/invites from older instances |
| Cooldown Clear | Give returning peers a fresh chance |
| Half-Open Deadlock Fix | If we think we're connected but they invite us, accept the new invite |
| Stability Timer | Only reset failure counts after 15s of stable connection |
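The stability timer row can be illustrated with a small state holder that forgives past failures only once a connection has lasted the full 15 seconds. `PeerHealth` and its method names are hypothetical, and polling via `tick(now:)` is an assumption made so the sketch needs no real timer.

```swift
import Foundation

/// Hypothetical per-peer health record.
struct PeerHealth {
    static let stabilityWindow: TimeInterval = 15 // seconds, from the table above

    var consecutiveFailures = 0
    var connectedSince: Date? = nil

    mutating func didConnect(at now: Date) {
        connectedSince = now
    }

    /// Clears the failure count only after the connection has stayed up
    /// for the whole stability window; a short-lived connection changes nothing.
    mutating func tick(now: Date) {
        guard let since = connectedSince,
              now.timeIntervalSince(since) >= Self.stabilityWindow else { return }
        consecutiveFailures = 0
    }
}
```

Delaying the reset keeps a peer that connects and immediately drops from repeatedly wiping its own backoff history.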
Keep-Alive & Mesh Health
Heartbeat System
When connected, the mesh sends heartbeats every 10 seconds:
let message = MeshMessage(
    content: "💓",
    isKeepAlive: true,
    connectedNodeIDs: connectedPeerUsers.map { $0.nodeID } // Gossip
)
Gossip Protocol
Heartbeats include a list of connected peers, enabling clique repair:
- Device A receives heartbeat from Device B
- If B knows Device C but A doesn't, A can proactively invite C
- This heals mesh partitions without requiring everyone to be discoverable
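The repair step amounts to a set difference between the peers a heartbeat lists and the peers already connected locally. A minimal sketch, with a hypothetical function name and parameters:

```swift
/// Returns the nodeIDs mentioned in a heartbeat that this device is not yet
/// connected to. Excludes this device and the heartbeat's sender.
func missingPeers(myNodeID: String,
                  myConnected: Set<String>,
                  heartbeatSender: String,
                  gossipedNodeIDs: [String]) -> [String] {
    let candidates = Set(gossipedNodeIDs)
        .subtracting(myConnected)
        .subtracting([myNodeID, heartbeatSender])
    return candidates.sorted() // sorted only to make the result deterministic
}
```

Whether the device then invites each missing peer directly or still applies the leader/follower rule first is not specified here; applying the rule would keep gossip-triggered invites race-free as well.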
Connection Recovery
Exponential Backoff
Failed connections trigger increasing cooldown periods:
// 0.5s → 1.0s → 2.0s → 4.0s → ... → max 30s
let delay = min(0.5 * pow(2.0, Double(failures - 1)), 30.0)
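Wrapped as a function, the schedule is easy to verify against the sequence in the comment. The function name and the default parameters are illustrative; returning 0 for a non-positive failure count is an assumption of this sketch.

```swift
import Foundation

/// Cooldown after the given number of consecutive failures:
/// doubles from `base` on each failure and never exceeds `cap`.
func backoffDelay(failures: Int,
                  base: TimeInterval = 0.5,
                  cap: TimeInterval = 30.0) -> TimeInterval {
    guard failures > 0 else { return 0 } // assumed: no failures means no cooldown
    return min(base * pow(2.0, Double(failures - 1)), cap)
}
```

With the defaults, the uncapped value first exceeds 30 seconds at the seventh failure (0.5 × 2⁶ = 32), so the cap applies from there on.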
Smart Retry
Instead of restarting everything, failed connections are retried individually:
- Only the leader initiates retries (prevents race conditions)
- Retries respect cooldown periods
- After 5 consecutive failures → "Poisoned State" triggers full reset
Poisoned State Recovery
If a peer has too many consecutive failures:
if failures >= 5 {
    restartServices(forcePoisonedRecovery: true)
    // Creates new MCPeerID, clears all cooldowns
}
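The smart-retry rules and the poisoned-state threshold can be combined into a single decision function. This is a sketch under the rules stated above; `RetryDecision` and `retryDecision(...)` are hypothetical names, and checking the poison threshold before the leader rule is an assumption.

```swift
import Foundation

enum RetryDecision: Equatable {
    case wait      // follower, or cooldown still running
    case retryNow  // leader and cooldown elapsed
    case fullReset // poisoned: too many consecutive failures
}

/// Decides what to do about one failed peer connection.
func retryDecision(isLeader: Bool,
                   failures: Int,
                   cooldownUntil: Date?,
                   now: Date,
                   poisonThreshold: Int = 5) -> RetryDecision {
    if failures >= poisonThreshold { return .fullReset }
    guard isLeader else { return .wait }      // only the leader retries
    if let until = cooldownUntil, now < until {
        return .wait                          // respect the cooldown
    }
    return .retryNow
}
```

Keeping the decision pure (no timers, no session objects) makes each branch testable without a live mesh.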
Summary
| Event | What Happens |
|---|---|
| User joins | NodeID retrieved, instance incremented, advertise + browse started |
| On Wi-Fi | Fast handshake (0.5s), near-instant connections |
| On 5G/Cellular | AWDL used, slower handshake (1.5s), up to 60s to connect |
| User leaves | Full cleanup, partial transcripts preserved |
| User rejoins | New instance number, ghosts filtered, cooldowns cleared |
| Connection fails | Exponential backoff, smart retry by leader only |
The architecture prioritizes reliability over speed, using defensive mechanisms like ghost filtering, stability timers, and gossip-based clique repair to maintain mesh health despite the inherent unreliability of peer-to-peer wireless connections.