Why Validators Are the Obvious Answer to the Indexing Problem
The infrastructure for trustless indexing already exists. We've just been ignoring it.
2026-01-25 | Shinzō Team
The Redundancy Nobody Questions
Here's a strange fact about blockchain infrastructure: the same work gets done twice.
Validators process every transaction. They execute every smart contract call. They maintain the complete state of the network. This is their job. It's what they're paid for. It's why they exist.
Then, completely separately, indexers do it all again. They run their own nodes, sync the same blockchain data, process the same transactions, and reconstruct the same state. Different organizations, different infrastructure, different trust assumptions. Same work.
This duplication isn't a minor inefficiency. It's the architectural flaw at the heart of the blockchain data problem. We built an entire parallel infrastructure to read data that validators already have, then wondered why that infrastructure ended up centralized, rent-seeking, and misaligned with ecosystem interests.
The solution has been sitting in front of us the whole time.
What Validators Already Do
To understand why validators are the natural home for indexing, you need to understand what validation actually involves.
When a validator processes a block, they're not just checking signatures and counting votes. They're executing every transaction in that block. Every token transfer, every swap, every contract interaction. The validator's node runs the actual computation, updates the actual state, and produces the actual results.
This means validators have something no external indexer can match: direct access to blockchain state at the moment it's created. They don't reconstruct state from transaction logs. They don't query RPC endpoints and hope the data is accurate. They compute the state themselves, block by block, as part of validation.
For indexing purposes, this is the difference between getting data from the source and getting it secondhand. External indexers are always working with a copy. Validators have the original.
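To make the contrast concrete, here is a minimal sketch of what a validator-side indexing hook could look like. The names (BlockContext, Transfer, IndexSink, onBlockExecuted) are entirely hypothetical, not any real validator API; the point is where the data comes from: the state transition the validator just computed, not a reconstruction from logs.

```ts
// Hypothetical sketch: an indexing hook invoked during block execution.
// BlockContext, Transfer, and IndexSink are illustrative names, not a real API.

interface Transfer {
  block: number;
  token: string;
  from: string;
  to: string;
  amount: bigint;
}

interface BlockContext {
  height: number;
  // State effects the validator just computed while executing the block.
  transfers: Transfer[];
}

interface IndexSink {
  write(record: Transfer): void;
}

// Because this runs inside validation, records are emitted from the original
// state transition, not re-derived from a secondhand copy.
function onBlockExecuted(ctx: BlockContext, sink: IndexSink): void {
  for (const t of ctx.transfers) {
    sink.write(t);
  }
}
```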
Beyond raw data access, validators already maintain the infrastructure that indexing requires. They run high-availability nodes with redundant storage. They operate in geographically distributed locations. They're connected to the network's peer-to-peer layer. They have economic stake in the network's success.
Every piece of infrastructure you'd need to build a reliable indexing service? Validators already have it.
The Economics of Duplication
The current indexing model asks a simple question: who should process blockchain data to make it queryable?
The industry's answer has been: someone other than validators. Separate companies. Separate infrastructure. Separate economic relationships.
This answer created an entire category of businesses that exist to duplicate work validators already do. And because these businesses need to be profitable, they extract value from the ecosystem for providing access to data that was already processed once.
Think about what this means economically. Blockchain networks pay validators to process transactions and maintain state. Then applications pay indexers to process the same transactions and maintain the same state. The ecosystem pays twice for the same work, and the second payment goes to entities whose incentives aren't aligned with the network's health.
Validators are economically secured through staking. They have skin in the game. If the network fails, they lose. If the network succeeds, they benefit. Their incentives are aligned with ecosystem health by design.
External indexers have no such alignment. Their incentive is to maximize extraction while providing minimum viable service. If they can charge more for the same data, they will. If they can cut costs by reducing reliability, they will. The network's success is only relevant to them insofar as it affects demand for their services.
This isn't a criticism of indexer operators as people. It's a description of how economic incentives shape behavior. When you build infrastructure with misaligned incentives, you get misaligned outcomes. The centralization, rent-seeking, and gatekeeping we see in current indexing infrastructure isn't an accident. It's the predictable result of the economic model.
The Trust Advantage
Beyond economics, validators offer something external indexers structurally cannot: a direct path to trustless verification.
When an external indexer processes blockchain data, there's no cryptographic link between their output and the blockchain's state. You're trusting that they ran a node, that the node was synced correctly, that they processed all transactions, that they didn't make errors, that they didn't manipulate results. Trust, trust, trust, all the way down.
Validators operate differently. They're already producing cryptographic proofs as part of validation. Block headers, state roots, transaction receipts. These aren't optional features. They're fundamental to how blockchains work.
Validator-produced indexed data can inherit these cryptographic properties. When a validator indexes a transaction, they can produce a proof linking that index entry to the block's state root. When they compute a token balance, they can prove that computation against the chain's state roots. The same cryptographic machinery that makes blockchain writes trustless can make blockchain reads trustless.
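To illustrate the verification side, here is a sketch of checking a Merkle inclusion proof against a committed root, using a plain binary Merkle tree. Production chains commit state through richer structures (Merkle-Patricia tries, for instance), but verification follows the same recompute-and-compare pattern.

```ts
import { createHash } from "node:crypto";

const sha256 = (data: Buffer): Buffer =>
  createHash("sha256").update(data).digest();

// One step of a Merkle proof: the sibling hash and which side it sits on.
interface ProofStep {
  sibling: Buffer;
  siblingOnLeft: boolean;
}

// Recompute the root from a leaf and its proof path, then compare it to the
// root committed in the block header. If they match, the indexed entry is
// provably part of the state the validators agreed on.
function verifyInclusion(
  leaf: Buffer,
  proof: ProofStep[],
  expectedRoot: Buffer
): boolean {
  let hash = sha256(leaf);
  for (const step of proof) {
    hash = step.siblingOnLeft
      ? sha256(Buffer.concat([step.sibling, hash]))
      : sha256(Buffer.concat([hash, step.sibling]));
  }
  return hash.equals(expectedRoot);
}
```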
External indexers can't do this. They can query a node for state proofs, but they can't produce proofs for their own transformations and aggregations. The moment they compute derived data, like a token balance or an ownership history, they're outside the blockchain's proof system. You're back to trusting them.
This isn't a limitation that can be engineered around. It's a consequence of where indexing happens. Proofs must be generated at the source of truth. For blockchain data, that source is validators.
The Incremental Cost Argument
One objection to validator-based indexing is resource overhead. Validators already have demanding computational requirements. Adding indexing workload seems like it would increase costs and reduce profitability.
This concern misunderstands where the costs actually are.
The expensive part of indexing is syncing and maintaining a full node. Storage for terabytes of blockchain history. Bandwidth for staying current with network state. Compute for processing transactions. Validators already pay these costs. They have to. It's the prerequisite for validation.
The incremental cost of indexing, given that you already have a synced node, is comparatively small. You're subscribing to an event stream you already have access to. You're storing structured data derived from state you already maintain. You're serving queries against data you already computed.
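A rough sketch of how small that marginal workload is: a sidecar process consuming blocks the node already finalizes. LocalNode and IndexedStore are hypothetical stand-ins for whatever interfaces the validator software actually exposes.

```ts
// Minimal sketch of the marginal workload: a sidecar consuming blocks the
// node already produces. LocalNode and IndexedStore are hypothetical.

interface Block {
  height: number;
  txs: string[];
}

interface LocalNode {
  // Already-executed blocks, streamed as they are finalized.
  subscribeBlocks(handler: (b: Block) => Promise<void>): void;
}

interface IndexedStore {
  put(height: number, txs: string[]): Promise<void>;
}

function runIndexerSidecar(node: LocalNode, store: IndexedStore): void {
  node.subscribeBlocks(async (block) => {
    // No syncing, no re-execution: the node has done the heavy lifting already.
    await store.put(block.height, block.txs);
  });
}
```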
For external indexers, these costs are additive. They need their own nodes, their own storage, their own bandwidth. None of that infrastructure is shared with validators. The total ecosystem cost is the validator infrastructure plus the indexer infrastructure.
For validator-based indexing, the costs are marginal. A lightweight indexer client running alongside existing validator software. Some additional storage for indexed data. Query serving capacity. The heavy lifting is already done.
This is why validator-based indexing can be economically sustainable without extractive pricing. The infrastructure costs are already covered. Indexing becomes an additional revenue stream for validators, not a primary business requiring maximum extraction.
The Distribution Model
Validators producing indexed data solves the trust problem. But a single validator serving queries to the entire ecosystem would create availability and scalability problems just as severe as those of centralized indexers.
The model that makes sense is validators producing data for a distributed network of hosts. Validators index and publish. Hosts replicate and serve. Applications query the nearest available host.
This mirrors how blockchains themselves work. Validators produce and attest to blocks. Nodes throughout the network replicate and verify. Light clients query for the data they need. The same architecture that distributes blockchain writes can distribute blockchain reads.
The key insight is that indexed data, like blockchain data, is deterministic. Given the same blockchain state, any validator running the same indexing logic will produce identical results. This means hosts can verify that the data they're replicating matches what other validators produce. Consensus emerges naturally from determinism.
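One way to picture this: if every validator serializes indexed records canonically before hashing, the digests match byte for byte, and a host can cross-check replicas without trusting any single source. A minimal sketch, assuming sorted-key JSON as the canonical encoding (a production system would pin down a stricter one):

```ts
import { createHash } from "node:crypto";

// Canonical serialization: sort keys so every validator serializes the same
// record to the same bytes before hashing.
function canonicalize(record: Record<string, string | number>): string {
  const sortedKeys = Object.keys(record).sort();
  return JSON.stringify(record, sortedKeys);
}

function digest(record: Record<string, string | number>): string {
  return createHash("sha256").update(canonicalize(record)).digest("hex");
}

// Two validators indexing the same block independently arrive at the same
// digest, whatever order their fields happened to be produced in.
const a = digest({ block: 812, token: "0xabc", holder: "0xdef", balance: 42 });
const b = digest({ holder: "0xdef", balance: 42, block: 812, token: "0xabc" });
console.log(a === b); // true
```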
Multiple validators indexing the same data also provides redundancy. If one validator goes offline, others continue producing. Hosts can switch sources seamlessly. There's no single point of failure because there's no single source of truth. The truth is the blockchain state, and any validator with that state can produce the indexed view.
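On the client side, that redundancy can be as simple as walking a list of hosts until one answers. A sketch, with made-up host URLs and query path:

```ts
// Sketch of client-side failover: the same indexed view is available from many
// hosts, so a query walks the list until one answers. URLs are illustrative.

async function queryWithFailover(
  hosts: string[],
  path: string
): Promise<unknown> {
  for (const host of hosts) {
    try {
      const res = await fetch(`${host}${path}`);
      if (res.ok) return await res.json();
    } catch {
      // Host unreachable: fall through to the next replica.
    }
  }
  throw new Error("no host available");
}

// queryWithFailover(["https://host-a.example", "https://host-b.example"], "/balances/0xdef");
```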
Validator sets are already diverse by design. Different operators, different jurisdictions, different infrastructure providers. The censorship resistance that protects block production can protect data access.
Why This Hasn't Happened Yet
If validators are such an obvious solution, why isn't this how indexing already works?
Partly, it's path dependence. Indexing services emerged to solve an immediate problem: applications needed queryable data, and blockchains didn't provide it. The fastest solution was to build separate infrastructure. That infrastructure grew, raised funding, hired teams, and became entrenched. By the time anyone questioned the architecture, there were billions of dollars invested in the status quo.
Partly, it's a data infrastructure problem. Traditional databases weren't built for this. They assume centralized control, trusted operators, and stable network connections. They don't have native support for content-addressable data, cryptographic verification, or conflict-free replication across untrusted peers. To extend blockchain's trustless properties to the read layer, you need a data layer that speaks the same language: Merkle DAGs, CRDTs, cryptographic proofs baked into the storage engine itself.
This is where edge-first data infrastructure becomes relevant. The requirements for distributed, mission-critical systems operating across untrusted environments (think edge devices, peer-to-peer networks, offline-first applications) are remarkably similar to what blockchain indexing demands. Data that's self-verifying. Replication that works without central coordination. Conflict resolution that's deterministic. The database layer needed to be rebuilt for these requirements before validator-based indexing could work. That evolution happened, just not inside the blockchain ecosystem.
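For a flavor of what deterministic conflict resolution means in practice, here is a last-writer-wins register, one of the simplest CRDTs. The merge function is commutative and deterministic, so replicas that exchange updates in any order converge to the same value with no central coordinator.

```ts
// Illustrative last-writer-wins register, one of the simplest CRDTs.

interface LWW<T> {
  value: T;
  timestamp: number; // logical clock
  writer: string;    // tie-breaker so the merge is total and deterministic
}

function merge<T>(a: LWW<T>, b: LWW<T>): LWW<T> {
  if (a.timestamp !== b.timestamp) {
    return a.timestamp > b.timestamp ? a : b;
  }
  // Equal timestamps: break ties by writer id so every replica picks the same side.
  return a.writer > b.writer ? a : b;
}

// merge(merge(x, y), z) yields the same result as merge(x, merge(y, z)),
// whatever order the updates arrive in.
```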
But mostly, it's a failure of imagination. The industry accepted that indexing was a separate problem requiring separate infrastructure. Nobody stopped to ask whether the separation made sense. The assumption became invisible, and invisible assumptions don't get questioned.
What Changes When Validators Index
The shift from external indexers to validator-based indexing isn't incremental. It changes the fundamental trust model of blockchain data access.
Applications stop relying on intermediaries whose incentives don't align with theirs. They query a distributed network of hosts serving validator-produced data. They verify that data cryptographically instead of trusting provider reputation.
Developers stop being constrained by what indexers choose to support. New data views can be defined and deployed permissionlessly. The transformation layer becomes open infrastructure, not a product controlled by specific vendors.
Ecosystems stop bleeding value to external extractors. Indexing revenue flows to validators who are already aligned with network health. The economic loop closes.
Users stop being one API outage away from broken applications. Data availability becomes a property of the validator set, not a dependency on specific companies. The same distribution that makes blockchain writes resilient makes blockchain reads resilient.
None of this requires new technology. It requires reorganizing existing technology around a more sensible architecture. Validators can index. Hosts can distribute. Applications can verify. The pieces exist. They just need to be assembled differently.
The Path Forward
The transition from centralized indexing to validator-based indexing won't happen overnight. But the direction is clear. Every argument for decentralized blockchain writes applies equally to decentralized blockchain reads. The same principles that make blockchains valuable (trustlessness, verifiability, permissionlessness, sovereignty) make validator-based indexing necessary.
The infrastructure for trustless indexing already exists. It's running in every validator node on every blockchain network. We've been ignoring it while building parallel infrastructure that reintroduces the problems blockchains were designed to solve.
That can change. That should change. And increasingly, that will change.
Shinzo is building validator-native indexing infrastructure. Cryptographic proofs from source. Peer-to-peer distribution. No middlemen.
Join the community.
X · Telegram · Discord · GitHub · shinzo.network