Reed-Solomon Erasure Coding
The protocol applies Reed-Solomon codes for both fault tolerance and adversarial erasure protection, following Compact Proofs of Retrievability work. Files are encoded over GF(2^8) with 231 data symbols and 24 parity symbols per 255-symbol codeword, allowing reconstruction from any 231 symbols. Extractability and public retrievability. One of the PoR security properties is extractability: if a prover passes challenges with high probability, an efficient extractor can recover the entire file. Reed-Solomon’s Maximum Distance Separable (MDS) property ensures that possessing any of codeword symbols enables full reconstruction. The public encoding matrix enable public retrievability: any party with sufficient symbols can reconstruct files without secret keys. Hardness amplification against rational adversaries. Erasure coding provides hardness amplification in the sense that it transforms weak guarantees about random symbol possession into strong guarantees about complete file retrievability. Consider an adversary who stores only a fraction of codeword symbols to minimize storage costs. Under random challenge sampling, succeeds in answering individual challenges with probability , but crucially, cannot reconstruct the missing symbols to respond to challenges on deleted positions. The MDS property establishes a threshold: any code achieving the guarantees that possessing fewer than symbols renders reconstruction computationally infeasible, causing audit failure probability approaching over repeated challenges. This amplifies partial deletion into total audit failure, an adversary must maintain or face inevitable slashing.Merkle Trees and Commitments
Each file is committed through a binary Merkle tree built over its encoded symbols using Poseidon hashing. The tree’s root serves as a compact cryptographic commitment—a single 32-byte value that binds to the entire file contents with collision resistance. Symbol-to-leaf encoding. Each 31-byte symbol encodes directly as a Pallas field element through little-endian byte representation with no intermediate hashing. This direct encoding is critical for proof-of-retrievability: to prove possession of a challenged symbol, a node must possess the actual 31 bytes, encode them as a field element, and provide the Merkle path. An attacker storing only the Merkle tree (field elements) without underlying bytes cannot answer challenges—the SNARK circuit verification requires demonstrating knowledge of the bytes that encode to each challenged leaf. Poseidon hashing. The protocol uses Poseidon for all in-circuit hashing because it’s optimized for arithmetic circuits over prime fields. Poseidon requires hundreds of constraints per invocation versus thousands for SHA-256 or Blake2, reducing IVC proving cost by orders of magnitude. The hash function operates over the Pallas and Vesta curve cycle (254-bit prime fields), providing approximately 128 bits of security under current best-known attacks. Tree structure. The tree is binary—two children per node. Higher-arity trees (quaternary, octal) would reduce depth but increase proof size (more siblings per level) and circuit complexity (multiple hash inputs per verification step). When a tree level has odd node count, the final node is duplicated (hashed with itself) to maintain uniform circuit structure. Domain separation tags prevent cross-layer collision attacks: leaf encoding uses one tag, internal node hashing uses another. Merkle proofs. A proof for leaf index consists of the sibling hashes along the path from leaf to root plus direction bits indicating whether the node is a left or right child at each level. For tree depth , the proof size is field elements (around 32 bytes each). The verifier recomputes the path: starting from the leaf, hash with the sibling according to the direction bit, producing the parent; repeat until reaching the root. If the computed root matches the public commitment, the proof is valid.Bitcoin Layer
The protocol leverages Bitcoin blockchain as a primitive providing critical security properties that enable trustless coordination. Non-repudiation. Once a storage node publishes a proof to the blockchain, it cannot later deny having submitted that proof. The blockchain’s cryptographic chain of signatures and block headers creates an immutable record of all protocol actions. Non-tampering (immutability). After sufficient block confirmations, published data becomes computationally infeasible to alter or remove. No single party - not storage nodes, verifiers, or protocol operators - can retroactively modify historical protocol state. Canonical ordering. Bitcoin establishes a global, deterministic ordering of all protocol events through its block height and transaction ordering within blocks. This temporal ordering is critical for the storage protocol: challenges are issued at specific block heights, proof submission deadlines are defined relative to challenge blocks, and protocol state transitions follow a deterministic sequence. Decentralization. The blockchain operates without central authority or trusted intermediaries. Multiple independent miners or validators process transactions, and consensus rules are enforced through cryptographic and economic mechanisms rather than institutional trust. This decentralization ensures the storage protocol inherits censorship resistance (no single party can block valid proofs), permissionless participation (anyone can join as storage node or verifier), and sovereign verification (anyone can independently validate the complete protocol history).Nova Incremental Verification
The protocol uses Nova, a recursive SNARK system that enables incrementally verifiable computation. Nova allows a prover to demonstrate correct execution of a computation over many steps while producing a proof that remains constant in size regardless of step count. This property is what makes multi-challenge aggregation practical. Folding scheme. Nova works through “folding”—a technique where each step’s proof gets folded into the previous proof, accumulating evidence of correct execution without growing the proof structure. The prover maintains a running witness (the IVC accumulator) that tracks all previous steps. Each new step verifies the previous accumulator, executes the step circuit, and produces a new accumulator. The accumulator grows logarithmically with step count but compresses to a compact form through Spartan at the end. Proof-of-retrievability circuit. The step circuit verifies possession of one challenged symbol. The public inputs are the file root , a state accumulator , the challenge seed , and tree depth . The private witness includes the challenged leaf, its Merkle path siblings, and direction bits. The circuit:- Derives the challenge index deterministically from the seed and current state
- Verifies the Merkle path from the challenged leaf to the root
- Updates the state accumulator by hashing it with the verified leaf
- Outputs the updated state for the next iteration
Multi-File Proof Aggregation
Storage nodes frequently face challenges on multiple files within the proof window. Aggregating these challenges into a single proof dramatically reduces Bitcoin transaction costs. The File Ledger enables this aggregation while maintaining security. File Ledger construction. The File Ledger is a Merkle tree built over the root commitments of all active files. For each file with root and depth , the protocol computes a root commitment: This binds the file’s root and depth together, preventing depth-spoofing attacks where an adversary might try to reuse proofs across files with different tree structures. Files are ordered deterministically (lexicographically by file identifier), and the ledger is constructed from these root commitments: Each file has a canonical ledger index corresponding to its position in sorted order. The ledger root commits to all files in the system at a specific block height. Aggregated proof structure. When a node is challenged on files, it generates a single proof demonstrating possession of challenged symbols across all files. The proof includes:- For each file: Merkle path from its root commitment to the ledger root (size )
- For each challenged symbol in each file: Merkle path from symbol to file root (size )
- Nova IVC accumulator tracking all verifications across all files
- Compressed Spartan SNARK proving correct execution
Verification Process
When indexers receive a proof transaction, they extract the proof structure and challenge identifiers, reconstruct public inputs from challenge parameters, and verify the Spartan SNARK. Verification checks:- Proof timeliness: Submitted before challenge expiration
- Public input consistency: Matches expected challenge parameters (correct roots, seeds, depths)
- SNARK verification: Spartan verification passes (proving correct circuit execution)
- Challenge binding: Proof covers the exact challenges claimed (ordering matches, no substitution)