Network Telemetry & Mesh Architecture
The Shardy Orchestrator represents the control plane of the decentralized ecosystem. While individual Shardy nodes execute atomic operations across isolated hardware environments, the Orchestrator governs how these thousands of anonymous clients function collectively as a monolithic supercomputer.
This document serves as the technical blueprint for the architectural decisions underpinning Shardy's network routing, job matchmaking, peer-to-peer telemetry, and consensus settlement layers.
1. The Orchestrator Concept
The Orchestrator is written in TypeScript on the Bun runtime and manages tens of thousands of concurrent TCP, WebSocket, and WebRTC sockets.
Its primary function is Trustless Job Management: it must assume that any of its connected worker nodes could disconnect at any moment, spoof hardware metrics, or return mathematically invalid results.
1.1 Multi-Tier Admission Protocol
Before a node can process neural network pipelines, it must be profiled.
- Hardware Profiling (`profile_v2`): When a worker joins the WebSocket stream, it executes a fast benchmark (a WebGPU compute pass or a WASM parallel hashing workload).
- Tier Categorization (`scheduler.ts`): Based on the measured GFLOPS and stable video-memory allocation (`maxStableAllocationMB`), the orchestrator assigns the node a `WorkerTier`:
  - Tier 1: Low-end integrated CPUs/GPUs (assigned `lite_test` and simple validation work)
  - Tier 2: Standard mid-range graphics hardware
  - Tier 3: High-end desktop (HEDT) GPUs such as the RTX 3090/4090 or Apple M3 Max.
- Strict Demotion: If a node fails the `passesAdmission()` criteria, it is relegated to "unknown" status or rejected outright, cordoning it off before it can pollute the network.
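The tiering logic above can be sketched as follows. This is a minimal illustration, not the real `scheduler.ts`: the threshold values and the `HardwareProfile` shape are assumptions; only the names `passesAdmission`, `maxStableAllocationMB`, and `WorkerTier` come from the text.

```typescript
// Hypothetical worker tiers; the numeric cutoffs below are illustrative,
// not the actual scheduler.ts values.
type WorkerTier = 1 | 2 | 3 | "unknown";

interface HardwareProfile {
  gflops: number;                // measured by the profile_v2 benchmark pass
  maxStableAllocationMB: number; // largest stable video-memory allocation
}

// Reject nodes whose benchmark results are implausible or too weak.
function passesAdmission(p: HardwareProfile): boolean {
  return p.gflops > 0 && p.maxStableAllocationMB >= 512;
}

function assignTier(p: HardwareProfile): WorkerTier {
  if (!passesAdmission(p)) return "unknown";  // strict demotion
  if (p.gflops >= 10_000 && p.maxStableAllocationMB >= 16_000) return 3; // HEDT GPUs
  if (p.gflops >= 1_000) return 2;            // mid-range hardware
  return 1;                                   // integrated CPUs/GPUs: lite_test only
}
```

For example, a node reporting 15,000 GFLOPS with 24 GB of stable allocation would land in Tier 3 under these hypothetical cutoffs.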
2. Dispatch Mechanics & Reliability (BFT)
Jobs introduced into the Shardy network are not mapped 1-to-1 onto a single compute node. To enforce Byzantine Fault Tolerance (BFT), identical copies of each job are dispatched to multiple nodes and their results cross-validated.
2.1 The Redundancy Matrix
- `REDUNDANCY_FACTOR`: Standard operations require a minimum redundancy factor (e.g., `2` or `3`).
- Dispatcher Engine: The `Dispatcher` class selects matching `idle` nodes, filtered explicitly by the job's required `WorkerTier`.
- Binary Delivery Protocol: The payload avoids heavy JSON stringification. Custom binary framing handles transit: first, a concise `metaFrame` carries metadata (`taskId`, `seed`), followed immediately by the `binaryData` frame mapped straight onto the WebSocket.
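The two-frame delivery might look like the sketch below. The `metaFrame`/`binaryData` split and the `taskId`/`seed` fields come from the text; the `FrameSocket` interface and `dispatchTask` helper are hypothetical names introduced for illustration.

```typescript
// Illustrative two-frame task delivery. WebSocket itself frames each send(),
// so no explicit length prefix is needed.
interface MetaFrame {
  taskId: string;
  seed: number;
}

// Minimal view of a WebSocket-like connection (assumed interface).
interface FrameSocket {
  send(data: string | Uint8Array): void;
}

function dispatchTask(ws: FrameSocket, meta: MetaFrame, payload: Uint8Array): void {
  ws.send(JSON.stringify(meta)); // frame 1: small JSON metaFrame (taskId, seed)
  ws.send(payload);              // frame 2: raw binaryData, never stringified
}
```

The point of the split is that only the tiny metadata frame pays the JSON cost; the tensor payload travels as opaque bytes.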
2.2 Watchdog Auto-Recovery & Dead-Letter Queues (DLQ)
Network turbulence is actively managed by a localized consensus loop (the “Watchdog”):
- Ack Timeout: If a socket disconnects mid-delivery, or a mobile device enters a suspended state, the worker never sends a `task_ack`.
- Execution Timeout: Tasks flagged as running that blow past their complexity-based computational caps.
- Migrating Offline Loads (`tryReassignOfflineAssignment`): The protocol strips dead assignments from dropped connections and routes the chunk to the next available hardware node; the end user who issued the task never observes the network drop.
- DLQ Mechanics: Payloads that exhaust `MAX_DELIVERY_ATTEMPTS` are moved to a dead-letter state file to protect network availability.
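A single watchdog pass over the assignment table could be sketched like this. The names `MAX_DELIVERY_ATTEMPTS` and `task_ack` are from the text; the `Assignment` shape, the `sweep` function, and the timeout value are assumptions for illustration.

```typescript
// Illustrative watchdog sweep; constants are placeholders, not real config.
const MAX_DELIVERY_ATTEMPTS = 3;
const ACK_TIMEOUT_MS = 10_000;

interface Assignment {
  taskId: string;
  nodeId: string;
  sentAt: number;   // epoch ms when the chunk was dispatched
  acked: boolean;   // has a task_ack been received?
  attempts: number; // delivery attempts so far
}

interface SweepResult {
  reassign: Assignment[];
  deadLetter: Assignment[];
}

// One pass: un-acked assignments past the timeout are either retried on
// another node or, once the attempt cap is exhausted, moved to the DLQ.
function sweep(assignments: Assignment[], now: number): SweepResult {
  const reassign: Assignment[] = [];
  const deadLetter: Assignment[] = [];
  for (const a of assignments) {
    if (a.acked || now - a.sentAt < ACK_TIMEOUT_MS) continue;
    if (a.attempts + 1 >= MAX_DELIVERY_ATTEMPTS) deadLetter.push(a);
    else reassign.push({ ...a, attempts: a.attempts + 1, sentAt: now });
  }
  return { reassign, deadLetter };
}
```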
3. P2P Mesh Implementation & Libp2p
To decentralize connectivity and scale infinitely without single-point bottlenecks, the Orchestrator embeds Libp2p.
3.1 Network Topology (libp2pHost.ts)
- GossipSub v1.1 Pub/Sub: The network utilizes high-speed publish/subscribe message propagation.
- Kademlia DHT: Node discovery and handshakes are handled peer-to-peer via a Kademlia distributed hash table.
- Encrypted Tunnels: Connections are encrypted with the Noise protocol and multiplexed over Yamux streams.
- Transaction Gossiping: Every meaningful state change ("Task Created", mismatch penalties, stake updates) propagates through the `shardy.blockchain.transactions.v1` channel. Any node parsing these transactions executes the same deterministic sequence logic.
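The deterministic-replay property can be shown with a toy model, deliberately omitting the libp2p wiring: the transaction shapes and `applyTx` function are assumptions; only the channel name is from the text. The invariant is that two nodes applying the same gossiped sequence converge to the same state.

```typescript
// Toy model of the transaction channel (not the real libp2p integration).
const TOPIC = "shardy.blockchain.transactions.v1";

// Hypothetical transaction shapes for two of the gossiped state changes.
type Tx =
  | { kind: "task_created"; taskId: string }
  | { kind: "stake_updated"; nodeId: string; delta: number };

interface NodeState {
  tasks: Set<string>;
  stakes: Map<string, number>;
}

// Deterministic state transition: same transaction sequence, same state,
// on every node that parses the gossip stream.
function applyTx(state: NodeState, tx: Tx): void {
  switch (tx.kind) {
    case "task_created":
      state.tasks.add(tx.taskId);
      break;
    case "stake_updated":
      state.stakes.set(tx.nodeId, (state.stakes.get(tx.nodeId) ?? 0) + tx.delta);
      break;
  }
}
```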
4. Consensus & Settlement Protocols
The most critical mechanism in Shardy is verifying untrusted computation through algorithmic consensus in `state_machine.ts` and `consensus.ts`.
4.1 State Machine Deduplication
- Nodes evaluate the ZK polynomials and compress gigantic result tensors into a checksum.
- `applyTaskResult` evaluates these `resultDigest` checksums against SnarkJS validations.
- Once the target `REDUNDANCY_FACTOR` is hit, the network checks for internal congruence.
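The digest-tallying step can be sketched as below. `resultDigest` and `REDUNDANCY_FACTOR` are named in the text; the tallying function and its name `checkCongruence` are assumptions, and the ZK/SnarkJS validation step is omitted.

```typescript
// Illustrative congruence check over deduplicated task results.
const REDUNDANCY_FACTOR = 3;

interface TaskResult {
  nodeId: string;
  resultDigest: string; // checksum of the compressed result tensor
}

// Returns the winning digest once REDUNDANCY_FACTOR matching results have
// arrived, or null while the task is still awaiting congruence.
function checkCongruence(results: TaskResult[]): string | null {
  const tally = new Map<string, number>();
  for (const r of results) {
    const n = (tally.get(r.resultDigest) ?? 0) + 1;
    tally.set(r.resultDigest, n);
    if (n >= REDUNDANCY_FACTOR) return r.resultDigest;
  }
  return null;
}
```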
4.2 Congruency and Slashing
- Match Result: If node outputs match bit-for-bit (thanks to deterministic WASM EMA smoothing in the worker processes), the State Machine fires the `task_verified` condition.
- Reward Distribution: Verified outcomes execute `computeStakeWeightedReward`, crediting the Web3 wallet attached to the worker identity inside the SQLite/RocksDB persistence core.
- Consensus Mismatch: If a bad actor modifies their client script to fabricate a ZK witness and reports corrupted payloads, an internal mismatch is flagged and the orchestrator triggers `applySlashing()`.
- Slashed State: As a penalty for polluting the compute pool, malicious or heavily corrupted operators have their balances sharply penalized and their `isBlocked` timeouts extended via the Reliability engine.
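A plausible shape for the reward split is sketched below. `computeStakeWeightedReward` is named in the text, but its formula is not, so simple stake-proportional weighting is assumed here.

```typescript
// Illustrative stake-weighted payout; the proportional formula is an
// assumption, not the documented implementation.
interface Winner {
  nodeId: string;
  stake: number; // stake held by this worker identity
}

// Splits totalReward across the congruent workers in proportion to stake.
function computeStakeWeightedReward(
  winners: Winner[],
  totalReward: number,
): Map<string, number> {
  const totalStake = winners.reduce((s, w) => s + w.stake, 0);
  const payouts = new Map<string, number>();
  for (const w of winners) {
    // Fall back to an equal split if no winner holds any stake.
    const share = totalStake === 0 ? 1 / winners.length : w.stake / totalStake;
    payouts.set(w.nodeId, totalReward * share);
  }
  return payouts;
}
```

Under this assumption, a worker holding three times the stake of a peer earns three times the share of a verified task's reward.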
Technical Summary
The Orchestrator network layer lets Shardy operate as a cohesive whole. Through rigorous validation loops, multi-tier hardware classification, aggressive watchdog recovery timers, and a deterministic Libp2p transaction ledger, Shardy delivers robust task matchmaking and trustless execution outputs without relying on a monolithic AWS-style architecture.