Network Telemetry & Mesh Architecture


The Shardy Orchestrator is the control plane of the decentralized ecosystem. While individual Shardy nodes execute atomic operations across isolated hardware environments, the Orchestrator governs how these thousands of anonymous clients cooperate as a single logical supercomputer.

This document is the technical blueprint of the architectural decisions underpinning Shardy's network routing, job matchmaking, peer-to-peer telemetry, and consensus settlement layers.


1. The Orchestrator Concept

The Orchestrator is written in TypeScript on the Bun runtime and manages tens of thousands of concurrent raw TCP, WebSocket, and WebRTC connections.

Its primary function is Trustless Job Management: it must operate under the assumption that any of its connected workers could disconnect at any moment, spoof hardware metrics, or return invalid results.

1.1 Multi-Tier Admission Protocol

Before a node can process neural network pipelines, it must be profiled.

  1. Hardware Profiling (profile_v2): When a worker joins the WebSocket stream, it executes a fast benchmark (a WebGPU compute pass or a WASM parallel hash algorithm).
  2. Tier Categorization (scheduler.ts): Based on calculated GFLOPS and stable video memory allocation metrics (maxStableAllocationMB), the orchestrator assigns the node to a WorkerTier:
    • Tier 1: Low-end Integrated CPUs/GPUs (assigned lite_test, simple validation)
    • Tier 2: Standard mid-range graphical hardware
    • Tier 3: High-End Desktop (HEDT) GPUs such as the RTX 3090/4090 or Apple M3 Max
  3. Strict Demotion: If a node fails to meet the passesAdmission() criteria, it is relegated to “unknown” or rejected status, keeping it from polluting the network.
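The admission flow above can be sketched in TypeScript. The thresholds, the `WorkerTier` shape, and the `HardwareProfile` fields are assumptions for illustration; only the names `passesAdmission`, `maxStableAllocationMB`, and the three-tier split come from the text.

```typescript
// Hypothetical sketch of tier categorization. All numeric thresholds are
// illustrative assumptions, not Shardy's real cutoffs.
type WorkerTier = 1 | 2 | 3 | "unknown";

interface HardwareProfile {
  gflops: number;                // measured by the profile_v2 benchmark
  maxStableAllocationMB: number; // largest stable VRAM allocation observed
}

function passesAdmission(p: HardwareProfile): boolean {
  // Reject nodes that could not complete the benchmark meaningfully.
  return p.gflops > 0 && p.maxStableAllocationMB >= 256;
}

function assignTier(p: HardwareProfile): WorkerTier {
  if (!passesAdmission(p)) return "unknown";
  if (p.gflops >= 10_000 && p.maxStableAllocationMB >= 16_384) return 3; // HEDT GPUs
  if (p.gflops >= 2_000 && p.maxStableAllocationMB >= 4_096) return 2;   // mid-range
  return 1; // integrated CPUs/GPUs: lite_test and simple validation only
}
```

The key design point is that tiering is derived purely from the measured benchmark, never from self-reported specs, which is what makes spoofed hardware metrics ineffective.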

2. Dispatch Mechanics & Reliability (BFT)

Jobs introduced into the Shardy network aren’t simply mapped 1-to-1 to a compute node. To enforce Byzantine Fault Tolerance (BFT), identical jobs are duplicated across independent nodes and their outputs cross-validated.

2.1 The Redundancy Matrix

  • REDUNDANCY_FACTOR: Each task is dispatched to a minimum number of independent workers (e.g., 2 or 3).
  • Dispatcher Engine: The Dispatcher class identifies matching idle nodes filtered explicitly by their designated WorkerTier requirements.
  • Binary Delivery Protocol: The payload avoids heavy JSON stringification. Custom binary framing handles transit: first, a concise metaFrame carries the metadata (taskId, seed), followed immediately by the binaryData frame written directly to the WebSocket.
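The two-frame delivery scheme can be sketched as follows. The `TaskMeta` shape and the `send` callback are assumptions; only the metaFrame/binaryData split and the taskId/seed fields come from the description above.

```typescript
// Illustrative sketch of the two-frame delivery: a small JSON metaFrame,
// then the raw payload bytes. This is not Shardy's actual wire format.
interface TaskMeta {
  taskId: string;
  seed: number;
}

// Frame 1: metadata as compact JSON (small, so stringification stays cheap).
function encodeMetaFrame(meta: TaskMeta): Uint8Array {
  return new TextEncoder().encode(JSON.stringify(meta));
}

// Frame 2: payload bytes are forwarded as-is, never JSON-stringified.
function deliver(
  send: (frame: Uint8Array) => void,
  meta: TaskMeta,
  binaryData: Uint8Array,
): void {
  send(encodeMetaFrame(meta)); // worker learns taskId + seed first
  send(binaryData);            // then associates the next binary frame with it
}
```

Ordering is what makes this work: because WebSocket frames arrive in order, the worker can pair each binary frame with the metaFrame that immediately preceded it.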

2.2 Watchdog Auto-Recovery & Dead-Letter Queues (DLQ)

Network turbulence is actively managed by a localized consensus loop (the “Watchdog”):

  • Ack Timeout: If a socket disconnects mid-delivery or a mobile device enters a suspended state, the worker never sends a task_ack and the delivery times out.
  • Execution Timeout: Tasks flagged as running that exceed their complexity-based execution cap are reclaimed.
  • Migrating Offline Loads (tryReassignOfflineAssignment): The protocol strips dead assignments from dropped connections and routes the chunk to the next available hardware node; the end user who issued the task never observes the drop.
  • DLQ Mechanics: Tasks that exceed MAX_DELIVERY_ATTEMPTS are moved to a dead-letter state file so repeated failures cannot degrade network availability.
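A minimal decision function for the Watchdog loop above might look like this. The `Assignment` shape, the 5-second ack timeout default, and a `MAX_DELIVERY_ATTEMPTS` of 3 are assumptions; the ack/execution timeout distinction and the reassign-or-dead-letter outcome come from the text.

```typescript
// Hedged sketch of one Watchdog evaluation pass over a single assignment.
const MAX_DELIVERY_ATTEMPTS = 3; // assumed value

interface Assignment {
  taskId: string;
  state: "delivered" | "running"; // delivered = awaiting task_ack
  sentAt: number;                 // dispatch timestamp (ms)
  execCapMs: number;              // complexity-based execution cap
  attempts: number;               // prior delivery attempts
}

type WatchdogAction = "reassign" | "dead_letter" | "keep";

function watchdogDecision(a: Assignment, now: number, ackTimeoutMs = 5_000): WatchdogAction {
  const timedOut =
    (a.state === "delivered" && now - a.sentAt > ackTimeoutMs) || // no task_ack
    (a.state === "running" && now - a.sentAt > a.execCapMs);      // blown exec cap
  if (!timedOut) return "keep";
  // Too many failed deliveries: quarantine in the dead-letter queue.
  return a.attempts + 1 >= MAX_DELIVERY_ATTEMPTS ? "dead_letter" : "reassign";
}
```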

3. P2P Mesh Implementation & Libp2p

To decentralize connectivity and scale horizontally without single-point bottlenecks, the Orchestrator embeds Libp2p.

3.1 Network Topology (libp2pHost.ts)

  • GossipSub v1.1 Pub/Sub: The network utilizes high-speed publish/subscribe message propagation.
  • Kademlia DHT: Peer discovery and routing are handled peer-to-peer via a distributed hash table.
  • Encrypted Tunnels: Connections are encrypted with the Noise protocol and multiplexed over Yamux streams.
  • Transaction Gossiping: Every meaningful state change (“Task Created”, mismatch penalties, stake updates) is broadcast over the shardy.blockchain.transactions.v1 topic. Any node receiving these messages applies them through the same deterministic sequence logic.
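The transaction-gossiping idea can be sketched without the libp2p wiring. The `Transaction` shape and the `seq` field are assumptions for illustration; the topic name and the requirement that every peer applies messages deterministically come from the text.

```typescript
// Sketch of serializing state changes for the gossip topic and applying them
// in a deterministic order. Actual GossipSub publish/subscribe is omitted.
const TX_TOPIC = "shardy.blockchain.transactions.v1";

interface Transaction {
  seq: number; // assumed deterministic sequence number
  kind: "task_created" | "mismatch_penalty" | "stake_updated";
  payload: Record<string, unknown>;
}

// GossipSub carries raw bytes, so transactions are encoded before publishing.
function encodeTx(tx: Transaction): Uint8Array {
  return new TextEncoder().encode(JSON.stringify(tx));
}

// Every peer applies transactions in seq order; since the apply logic is
// deterministic, all replicas converge on the same state.
function applyInOrder(txs: Transaction[], apply: (tx: Transaction) => void): void {
  [...txs].sort((a, b) => a.seq - b.seq).forEach(apply);
}
```

Deterministic ordering is the crux: gossip delivery order varies per peer, so state must be derived from a canonical sequence rather than arrival order.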

4. Consensus & Settlement Protocols

The most critical mechanism in Shardy is verifying untrusted computation through algorithmic consensus in state_machine.ts and consensus.ts.

4.1 State Machine Deduplication

  1. Nodes evaluate the ZK polynomials and compress large result tensors into a checksum (resultDigest).
  2. applyTaskResult checks each incoming resultDigest against its SnarkJS validation.
  3. Once the target REDUNDANCY_FACTOR is reached, the network checks the collected digests for congruence.
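The deduplication step can be sketched as digest counting. The `Verdict` type and the map-of-voters structure are assumptions; `applyTaskResult`, `resultDigest`, and `REDUNDANCY_FACTOR` are named in the text.

```typescript
// Hedged sketch of result deduplication: collect digests per task until
// REDUNDANCY_FACTOR submissions arrive, then check for congruence.
const REDUNDANCY_FACTOR = 3; // assumed value

type Verdict = "pending" | "task_verified" | "consensus_mismatch";

function applyTaskResult(
  digests: Map<string, string[]>, // resultDigest -> worker IDs that reported it
  workerId: string,
  resultDigest: string,
): Verdict {
  const voters = digests.get(resultDigest) ?? [];
  voters.push(workerId);
  digests.set(resultDigest, voters);

  const total = [...digests.values()].reduce((n, v) => n + v.length, 0);
  if (total < REDUNDANCY_FACTOR) return "pending";
  // A single digest key means all submissions agree bit-for-bit.
  return digests.size === 1 ? "task_verified" : "consensus_mismatch";
}
```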

4.2 Congruency and Slashing

  • Match Result: If node outputs match bit-for-bit (thanks to deterministic WASM EMA smoothing in worker processes), the State Machine fires the task_verified condition.
  • Reward Distribution: Verified outcomes execute computeStakeWeightedReward, crediting the Web3 wallet attached to the worker identity in the SQLite/RocksDB persistence core.
  • Consensus Mismatch: If a bad actor modifies their client script to fabricate a ZK witness and reports corrupted payloads, an internal mismatch is flagged and the orchestrator triggers applySlashing().
  • Slashed State: As a penalty for injecting polluted results into the compute pool, malicious or heavily corrupted operators have their balances sharply reduced and are blocked (isBlocked) for extended timeouts by the Reliability engine.
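The reward and penalty paths can be sketched together. The stake-proportional split, the 20% slash ratio, and the `WorkerAccount` shape are assumptions; `computeStakeWeightedReward`, `applySlashing`, and `isBlocked` are named in the text.

```typescript
// Hedged sketch of settlement: stake-weighted rewards for verified results,
// stake reduction plus blocking for mismatches. Ratios are illustrative.
interface WorkerAccount {
  stake: number;
  balance: number;
  isBlocked: boolean;
}

function computeStakeWeightedReward(
  accounts: Map<string, WorkerAccount>,
  winners: string[],
  jobReward: number,
): void {
  const totalStake = winners.reduce((s, id) => s + (accounts.get(id)?.stake ?? 0), 0);
  for (const id of winners) {
    const acc = accounts.get(id);
    if (!acc || totalStake === 0) continue;
    // Each verified worker is credited in proportion to its stake.
    acc.balance += jobReward * (acc.stake / totalStake);
  }
}

function applySlashing(acc: WorkerAccount, slashRatio = 0.2): void {
  acc.stake *= 1 - slashRatio; // sharply penalize the stake metric
  acc.isBlocked = true;        // Reliability engine blocks the node
}
```

Weighting rewards by stake means a worker risks more than it can earn by cheating once slashing is factored in, which is the economic half of the BFT argument.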

Technical Summary

The Orchestrator network layer keeps Shardy resilient and self-healing. Through rigorous validation loops, multi-tier hardware classification, aggressive watchdog recovery timers, and a deterministic Libp2p transaction ledger, Shardy delivers robust task matchmaking and trustless execution outputs without relying on monolithic AWS architectures.
