Technical Deep Dive: Shardy Network and Node (Current Implementation)

This document reflects the current implementation of the Shardy network and the individual node, based on the actual codebase and project structure. Below is a detailed breakdown of the network state, orchestrator operations, task lifecycle, node architecture, and the detailed ZK-proof scheme.


1. Network Components and Roles

The Shardy network consists of three key planes:

  • Orchestrator (Control Plane): Accepts tasks, manages state, triggers consensus, distributes tasks, and verifies results.
  • Worker Nodes (Compute Plane): Browser-based nodes that perform computations using WebGPU/WASM and generate ZK-proofs of the results.
  • P2P Network (Gossip Plane): A libp2p mesh for presence, transactions, and block synchronization.

The codebase is structured as follows:

  • docs/orchestrator/src — Orchestrator server, consensus, and state management.
  • shardy-monorepo/apps/shardy — Node client and compute worker.
  • shardy-monorepo/apps/shardy/public/snark — Groth16 WASM/zkey/vkey/manifest files.
  • shardy-monorepo/apps/shardy/public/wasm — preprocess_node_engine.wasm for data preparation.

2. Network State: The Source of Truth

The network is state-oriented, focusing on the current state rather than an ever-growing transaction history. The orchestrator stores:

  • Workers: Node status, profiles, keys, tiers (Tier 1–3), and latest activity.
  • Tasks: Job definitions, status, seeds, complexity, and required tier.
  • Deliveries: Assignments of specific tasks to nodes, including delivery attempts.
  • Task Events: A complete event log of task transitions.
  • Dead Letters: Failed tasks with diagnostic reasons.
  • Campaigns: Stress tests and network benchmarks.
  • Stakes/Balances: Ledgers for rewards and penalties.
  • Blocks: Consensus blocks used for state synchronization and verification.

By default, the system uses RocksDB; SQLite is available in dev-only mode.
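
As an illustration, the stored records might be modeled along the lines below; the field names and shapes are assumptions for clarity, not the actual schema.

```typescript
// Illustrative shapes for the orchestrator's state records.
// Field names and types are assumptions for clarity, not the actual schema.
interface WorkerRecord {
  workerId: string;
  publicKey: string;               // ECDSA P-256 key used to verify presence messages
  tier: 1 | 2 | 3;                 // assigned from benchmark results
  status: "online" | "busy" | "offline";
  lastSeenAt: number;              // unix ms of latest activity
}

interface TaskRecord {
  taskId: string;
  status: "pending" | "dispatched" | "verified" | "mismatch" | "dead_letter";
  seed: string;
  complexity: number;
  requiredTier: 1 | 2 | 3;
}

interface DeliveryRecord {
  deliveryId: string;
  taskId: string;
  workerId: string;
  attempt: number;                 // delivery attempt counter for this assignment
}
```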


3. Network Consensus and State Propagation

Transactions (task creation, acks, progress updates, results, verification, mismatches, slashing, etc.) accumulate in a mempool, after which:

  1. Slot Leader proposes a block.
  2. Remaining validators vote on the proposal.
  3. The block is committed, and a state root is calculated.
  4. Nodes exchange blocks via libp2p and utilize snapshots for fast synchronization when necessary.

Key principles:

  • State Root is calculated from a deterministic state. If roots do not match, consensus halts.
  • Synchronization can be done via blocks or snapshots without the need to maintain full history.
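
As a rough illustration of the first principle, a deterministic state root can be obtained by hashing a canonically ordered serialization of the committed state. The sketch below assumes a simple sorted-key SHA-256 scheme; the actual implementation may use a different construction (e.g. a Merkle tree), but the key property is the same.

```typescript
import { createHash } from "node:crypto";

// Illustrative state-root computation: hash a canonically ordered serialization of
// the committed state. Identical state on every validator yields an identical root;
// any divergence produces a different root and consensus halts.
function computeStateRoot(state: Map<string, unknown>): string {
  const hash = createHash("sha256");
  for (const key of [...state.keys()].sort()) { // sort keys for determinism
    hash.update(key);
    hash.update(JSON.stringify(state.get(key)));
  }
  return hash.digest("hex");
}
```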

4. Task Lifecycle (Real Pipeline)

  1. A client sends a request to POST /api/tasks.
  2. The orchestrator creates a task_create transaction and awaits commitment.
  3. The Dispatcher selects available workers and creates deliveries.
  4. The node receives two frames:
    • A protobuf meta frame containing taskId, deliveryId, seed, and verifierVersion.
    • A binary payload for computation.
  5. The node confirms with task_ack and periodically reports task_progress.
  6. After computation, the node submits a task_result:
    • For standard tasks: checksum + ZK-proof.
    • For test tasks: checksum only.
  7. The orchestrator validates the result, checks for quorum, and commits either task_verified or consensus_mismatch.
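
A minimal sketch of the node-side half of this pipeline (steps 4–6) is shown below. The message shapes, field names, and helper functions are assumptions for illustration, not the actual wire protocol.

```typescript
// Illustrative node-side handling of a delivery (steps 4-6 above).
// Message shapes, field names, and helpers are assumptions, not the real protocol.
interface TaskMeta {
  taskId: string;
  deliveryId: string;
  seed: string;
  verifierVersion: string;
}

declare function runComputation(payload: ArrayBuffer, seed: string): Promise<Uint32Array>;
declare function computeChecksum(output: Uint32Array): number;
declare function generateProof(meta: TaskMeta, output: Uint32Array): Promise<object>;

async function handleDelivery(
  ws: WebSocket,
  meta: TaskMeta,          // decoded from the protobuf meta frame
  payload: ArrayBuffer,    // binary payload for computation
  isTestTask: boolean,
): Promise<void> {
  // Step 5: acknowledge the delivery; periodic task_progress reports would follow.
  ws.send(JSON.stringify({ type: "task_ack", deliveryId: meta.deliveryId }));

  const output = await runComputation(payload, meta.seed); // WebGPU/WASM work
  const checksum = computeChecksum(output);

  // Step 6: standard tasks attach a ZK-proof, test tasks submit the checksum only.
  const result: Record<string, unknown> = {
    type: "task_result",
    taskId: meta.taskId,
    deliveryId: meta.deliveryId,
    checksum,
  };
  if (!isTestTask) {
    result.proof = await generateProof(meta, output); // see section 6
  }
  ws.send(JSON.stringify(result));
}
```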

5. Node Architecture (Browser Worker Runtime)

The production node utilizes three execution layers:

  • UI/Orchestrator WebSocket: Handles handshakes, ACKs, telemetry, and task reception.
  • Compute Worker (compute.worker.ts): WebGPU + TypeGPU execution environment.
  • WASM preprocessing: preprocess_node_engine.wasm prepares input buffers for computation.

Additional features:

  • OPFS Checkpointing: Saves input data and metadata so a task can be recovered after a tab restart (see the sketch after this list).
  • Benchmark (nodeBenchmark/gpuBenchmark): Determines the Node Tier and resource allocation.
  • libp2p Presence: Signed with ECDSA P-256 and broadcast periodically.
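
As a sketch of how OPFS checkpointing might look (the file and directory names are assumptions, not the actual implementation):

```typescript
// Illustrative OPFS checkpoint: persist the input payload and task metadata so the
// task can be resumed after a tab restart. File and directory names are assumptions.
async function saveCheckpoint(taskId: string, payload: ArrayBuffer, meta: object): Promise<void> {
  const root = await navigator.storage.getDirectory();
  const dir = await root.getDirectoryHandle(`task-${taskId}`, { create: true });

  const dataFile = await dir.getFileHandle("payload.bin", { create: true });
  const dataWriter = await dataFile.createWritable();
  await dataWriter.write(payload);
  await dataWriter.close();

  const metaFile = await dir.getFileHandle("meta.json", { create: true });
  const metaWriter = await metaFile.createWritable();
  await metaWriter.write(JSON.stringify(meta));
  await metaWriter.close();
}
```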

6. ZK-Proofs: Detailed Breakdown

6.1 Circuit: shardy_task_proof_v2.circom

The circuit design is deliberately compact and enforces a strict algebraic dependency on the result, with all arithmetic over the BN254 scalar field:

witness + taskId + seed + outputLen === resultDigest

Public signals: taskId, seed, outputLen, resultDigest. Private input: witness.

6.2 How the Node Generates Public Signals

The node calculates:

  • outputLen — The length of the result array.
  • resultDigest — A deterministic digest derived from the output array.
    • Implemented as a folding operation over 32-bit words with a domain tag.
    • The result is mapped to the BN254 field.
  • witness = resultDigest - taskId - seed - outputLen (computed in the BN254 scalar field).

The node then uses snarkjs.groth16.fullProve() with the WASM and zkey provided in public/snark/manifest.json.
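
A condensed sketch of this flow is shown below. The digest folding constants, circuit input names, and artifact paths are assumptions; only the overall relation and the snarkjs.groth16.fullProve() call mirror the description above.

```typescript
import { groth16 } from "snarkjs";

// BN254 scalar field order; all circuit signals live in this field.
const P = 21888242871839275222246405745257275088548364400416034343698204186575808495617n;

// Illustrative digest: fold the 32-bit output words into a field element under a
// domain tag. The tag and folding constant are assumptions, not the real values.
function computeResultDigest(output: Uint32Array): bigint {
  const DOMAIN_TAG = 0x53484152n; // assumed domain separator
  let acc = DOMAIN_TAG;
  for (const word of output) {
    acc = (acc * 0x100000001n + BigInt(word)) % P; // fold each 32-bit word
  }
  return acc;
}

async function proveResult(taskId: bigint, seed: bigint, output: Uint32Array) {
  const outputLen = BigInt(output.length);
  const resultDigest = computeResultDigest(output);

  // Circuit relation: witness + taskId + seed + outputLen === resultDigest (mod P).
  const witness = ((resultDigest - taskId - seed - outputLen) % P + P) % P;

  // Artifact paths would normally come from public/snark/manifest.json;
  // the file names below are assumptions.
  const { proof, publicSignals } = await groth16.fullProve(
    { taskId, seed, outputLen, resultDigest, witness },
    "/snark/shardy_task_proof_v2.wasm",
    "/snark/shardy_task_proof_v2.zkey",
  );
  return { proof, publicSignals, resultDigest };
}
```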

6.3 Local Verification Before Submission

The node performs a local verification using groth16.verify() with the verification key. If this local check fails, the result is not submitted (or is reported as an error).
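
A minimal sketch of that local check, assuming the verification key is fetched from the snark artifacts directory (the path is an assumption):

```typescript
import { groth16 } from "snarkjs";

// Illustrative pre-submission check; the vkey path is an assumption.
async function verifyLocally(proof: object, publicSignals: string[]): Promise<boolean> {
  const vkey = await fetch("/snark/shardy_task_proof_v2.vkey.json").then((r) => r.json());
  const ok = await groth16.verify(vkey, publicSignals, proof);
  if (!ok) {
    // Withhold the result (or report it as an error) instead of submitting.
    console.error("Local proof verification failed; task_result not submitted");
  }
  return ok;
}
```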

6.4 Orchestrator-Side Verification

The orchestrator verifies:

  1. The proof schema and format.
  2. Consistency of taskId and seed within the public signals.
  3. Validity of the groth16.verify() call against the verification key.
  4. The mapping of resultDigest to a checksum (its lower 32 bits), which is compared against the submitted checksum.

Thus, the checksum and ZK-proof are cryptographically linked via the same resultDigest.
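
The linkage in step 4 can be illustrated with a small check, assuming the checksum is the unsigned lower 32 bits of resultDigest as described above:

```typescript
// Illustrative linkage check: the submitted checksum must equal the unsigned
// lower 32 bits of the resultDigest carried in the proof's public signals.
function checksumMatchesDigest(resultDigest: bigint, submittedChecksum: number): boolean {
  const expected = Number(resultDigest & 0xffffffffn);
  return expected === (submittedChecksum >>> 0);
}
```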


7. Security and Integrity

The system relies on several independent layers of verification:

  1. ZK-Proof: A node cannot forge a result without knowing the required witness.
  2. Redundancy Factor: A minimum of 2 independent nodes perform the same task.
  3. Consensus + State Root: The network state is committed and validated via the state root.
  4. Watchdog Timers: Tasks are reassigned upon timeout.
  5. ECDSA Signatures: Node identity and presence are cryptographically verified.

8. Why a Full Transaction History is Not Required

Shardy is a state-based network: what matters is the consistency of the current state rather than the historical transaction log.

Consequently:

  • A node can synchronize using a snapshot (current state + recent blocks).
  • Correctness is verified via the State Root.
  • Any discrepancy in the root halts consensus.

This means that for secure operation, the current state and recent blocks are sufficient, eliminating the need to maintain the entire transaction chain indefinitely.


9. Conclusion

The current network implementation is built upon:

  • Consensus and state roots.
  • ZK-verification of results.
  • Signed presence broadcasts.
  • Robust watchdog mechanisms.
  • Reproducible worker architecture.

This enables the system to maintain high security and integrity without the overhead of storing a full historical transaction chain.
