Bitcoin's transition to a trusted chain.
"Don't trust, verify" is the oldest instruction in Bitcoin, and the whole of it. You do not trust a bank, a developer, a miner, or a majority. You run a node and check every rule yourself, from the genesis block forward. Trustlessness is not a feature of Bitcoin among others. It is Bitcoin; everything else is plumbing.
A full node earns that independence by recomputing. It re-derives every transaction id, rebuilds every block's merkle root, re-checks every signature against every spend. It accepts nothing it cannot reproduce from the raw bytes of the chain. That is what "verify" means: not "read the receipt," but "redo the arithmetic."
This article is about a quiet change that makes the arithmetic impossible. Not by attacking the code, and not through a bad actor, but through a serialization choice. The relay filter that capped OP_RETURN at 80 bytes was removed by default in Core v30, and the reduced-data camp is still fighting to restore it. What that fight is really about is whether everything a node must verify stays something a node can afford, and is allowed, to keep.
The load-bearing argument for lifting the limit is that the data is harmless because it can be discarded:
"OP_RETURN is the lesser evil. It's prunable. Unlike fake UTXOs, it never enters the UTXO set, so nodes can drop it. Bigger payloads are therefore safe."
The first half is true, and it is doing all the work. OP_RETURN outputs are provably unspendable, so they are never added to the chainstate. A node's permanent working set, the UTXO set it must hold forever, is genuinely untouched by them.
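As a rough illustration of why these outputs never reach the chainstate, the sketch below models a toy UTXO set. The names `is_unspendable` and `apply_outputs` are hypothetical, chosen for this example. The check is modeled on the rule that a script beginning with OP_RETURN (or exceeding the maximum script size) can never be spent; it is a simplification, not Bitcoin Core's actual code.

```python
OP_RETURN = 0x6a
MAX_SCRIPT_SIZE = 10_000  # consensus limit on script size, in bytes


def is_unspendable(script_pubkey: bytes) -> bool:
    """Return True if no input could ever satisfy this output script."""
    starts_with_op_return = len(script_pubkey) > 0 and script_pubkey[0] == OP_RETURN
    return starts_with_op_return or len(script_pubkey) > MAX_SCRIPT_SIZE


def apply_outputs(utxo_set: dict, txid: str, outputs: list) -> None:
    """Add a transaction's spendable outputs to a toy UTXO set."""
    for index, (value, script_pubkey) in enumerate(outputs):
        if is_unspendable(script_pubkey):
            continue  # provably unspendable: never enters the set
        utxo_set[(txid, index)] = (value, script_pubkey)


utxos = {}
example_txid = "aa" * 32  # placeholder id, not a real transaction
outputs = [
    (50_000, bytes([0x00, 0x14]) + bytes(20)),  # P2WPKH-shaped payment output
    (0, bytes([OP_RETURN, 4]) + b"data"),       # data-carrier output
]
apply_outputs(utxos, example_txid, outputs)
print(len(utxos))  # 1
```

Only the payment output is tracked. The data-carrier output still had to be downloaded and hashed with the rest of the transaction; it simply leaves no entry behind in the working set.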
So the defense is correct about the UTXO set. The trouble is that the UTXO set is not the only thing a node verifies, and "prune" is being asked to mean two different things at once.
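Two operations wear the same word:
Disk pruning: deleting old block data after the node has downloaded and validated it.
Trustless pruning: discarding bytes and still being able to validate the chain without them.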
The prunability defense quietly slides from the first to the second. "Nodes can drop it" is true in the disk-pruning sense: once you have downloaded a block and checked it, you may delete old data and keep only the UTXO set and headers. But you had to possess and process every byte first, including the OP_RETURN payload, to get there.
Trustless pruning, throwing the bytes away and still being able to validate the chain without them, is the property that actually matters for a node that did not personally witness the block. And for raw OP_RETURN data, Bitcoin does not have it.
Here is the wall, and it is made of arithmetic. A transaction id is the double-SHA256 of the entire serialized transaction, over version, inputs, all outputs, and locktime:
txid = SHA256( SHA256( version ‖ vin ‖ vout ‖ locktime ) )
                                       └── includes the OP_RETURN output's raw bytes
A block's merkle root commits to the txids of its transactions; the block header commits to the merkle root; proof-of-work commits to the header. So to verify a block, a node must reproduce each txid, and reproducing a txid requires the full transaction, the OP_RETURN payload included.
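The commitment chain described above can be sketched in a few lines. This is a minimal illustration, assuming Bitcoin's usual construction (double-SHA256 over concatenated pairs, with the last hash duplicated when a level has an odd count). The byte strings standing in for transactions are placeholders, not valid serialized transactions.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash used for txids and merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids: list) -> bytes:
    """Fold a list of 32-byte txids into a single merkle root."""
    if not txids:
        raise ValueError("a block has at least one transaction")
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Placeholder "transactions"; a real node would hash full serialized transactions.
transactions = [b"tx-one", b"tx-two", b"tx-three carrying some payload"]
txids = [sha256d(raw) for raw in transactions]
root = merkle_root(txids)

# Change one byte of one transaction: its txid changes, and so does the root.
altered = [b"tx-one", b"tx-two", b"tx-three carrying some PAYLOAD"]
altered_root = merkle_root([sha256d(raw) for raw in altered])
print(root != altered_root)
```

The root is only as checkable as the txids beneath it, and each txid is only as checkable as the bytes it was hashed from.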
Delete that payload and the txid is no longer something you can compute. It becomes something you can only be told. The node can store the 32 bytes of the id and check that they sit under the merkle root, but it can no longer confirm that those bytes are the honest hash of a real transaction. It has stopped verifying and started accepting.
A pruned data carrier is not a verified transaction. It is a claimed one. The id is held on faith, because the only thing that could check it, the content, is gone.
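To make the dependency concrete, here is a minimal sketch that serializes a legacy-format transaction and hashes it. The helper names (`serialize_tx`, `op_return_script`, `build`) are hypothetical, the input is a placeholder rather than a real outpoint, and the payload push is limited to 75 bytes to keep the script encoding simple.

```python
import hashlib
import struct


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def varint(n: int) -> bytes:
    """Bitcoin's variable-length integer encoding."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def op_return_script(payload: bytes) -> bytes:
    """OP_RETURN followed by a single direct push of the payload."""
    if len(payload) > 75:
        raise ValueError("this sketch only handles direct pushes up to 75 bytes")
    return bytes([0x6A, len(payload)]) + payload


def serialize_tx(version: int, inputs: list, outputs: list, locktime: int) -> bytes:
    """Legacy serialization: version | vin | vout | locktime."""
    raw = struct.pack("<i", version)
    raw += varint(len(inputs))
    for prev_txid, prev_index, script_sig, sequence in inputs:
        raw += prev_txid + struct.pack("<I", prev_index)
        raw += varint(len(script_sig)) + script_sig
        raw += struct.pack("<I", sequence)
    raw += varint(len(outputs))
    for value, script_pubkey in outputs:
        raw += struct.pack("<q", value)
        raw += varint(len(script_pubkey)) + script_pubkey
    raw += struct.pack("<I", locktime)
    return raw


def txid(raw_tx: bytes) -> str:
    """Double-SHA256 of the serialization, shown in the usual reversed hex."""
    return sha256d(raw_tx)[::-1].hex()


placeholder_input = (bytes(32), 0, b"", 0xFFFFFFFF)  # not a real outpoint


def build(payload: bytes) -> bytes:
    return serialize_tx(1, [placeholder_input], [(0, op_return_script(payload))], 0)


recorded_txid = txid(build(b"hello, chain"))
one_byte_changed = txid(build(b"hello, chaiN"))
payload_removed = txid(build(b""))

print(recorded_txid != one_byte_changed)  # a one-byte change gives a different id
print(recorded_txid != payload_removed)   # without the payload, the id cannot be reproduced
```

A node that kept only `recorded_txid` has nothing to feed back into `sha256d`. It can compare the stored id against what others tell it, but it cannot regenerate it.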
The obvious rebuttal:
"Nodes already skip work below assumevalid. Bitcoin Core ships with a hardcoded block hash and doesn't re-check signatures before it. Trust is already in the system. This is no different."
It is entirely different, in three ways that matter.
Bounded. assumevalid points at a recent, public, peer-reviewed block hash you can read and audit. Post-takedown trust has no boundary; it attaches to any transaction whose content was ever removed, anywhere in history.
Optional. You can run -assumevalid=0 and validate every signature from genesis yourself. The data is all still there; the skipping is a convenience you may decline. There is no flag that brings deleted OP_RETURN bytes back. You cannot choose to verify, because the thing to verify no longer exists.
Different object. assumevalid skips signature checking, an expensive computation over data you still hold. Pruned data skips the txid itself, the commitment, over data you have lost.
That is the line between a convenience and a broken promise. One you can switch off and redo. The other is permanent, because the inputs to the check are gone.
"I very much wanted to find some way to include a short message, but the problem is, the whole world would be able to see the message. As much as you may keep reminding people that the message is completely non-private, it would be an accident waiting to happen."
Satoshi Nakamoto, BitcoinTalk, 28 January 2010
Satoshi was describing a single short message. The unbounded data carrier is that same accident, industrialized.
Lift the limit and a single transaction can carry on the order of a hundred kilobytes of arbitrary bytes. On any open data carrier, some bytes may eventually become unlawful, or simply unacceptable for some operators to store. This is not a moral panic; it is the observed endgame of every permissionless storage layer, and it is precisely the fear the inscription debate has been circling.
So one day, by court order, by a new statute criminalizing possession, or by an operator who simply will not relay a particular blob, a node will be required to not hold some content. At that moment its operator has two doors, and only two:
Keep the content, and the legal exposure that comes with it.
Delete the content, and hold the txid on trust forever.
There is no third door. There is no trustless takedown, because the txid was computed over the content. Removability and verifiability were welded together the moment the data went in as raw bytes inside the transaction. You cannot cut one without the other.
When the takedown reckoning is raised, the reply is always the same:
"There is already bad content on the chain. Illegal bytes were smuggled into an address or a scriptSig years ago. The line was crossed long ago, so a data carrier changes nothing."
This treats a question of intent as a question of presence, and presence as if it were binary. It is neither.
Bytes smuggled into a spend are an artifact: tiny, incidental from the protocol's view, never sanctioned as a place to put data. A node stores them as a side effect of validating a payment, not as a service. The operator's posture stays honest: "I run a transaction validator; any content is unintended."
OP_RETURN as an unbounded carrier is the opposite. It is the designated channel, built and blessed for arbitrary bytes, and lifting its limit is a deliberate decision to make Bitcoin a data store at scale. That decision deletes the two things the incidental case still has: plausible deniability, because the network now knowingly, by design, hosts whatever is posted; and the absence of intent, because it is no longer an artifact but a sanctioned feature. "Some bad content already exists" was never the question. "Is Bitcoin a sanctioned, intentional, scalable carrier of it" is, and a data carrier is precisely the line that answers yes.
This is not only a rhetorical distinction; it is one the law tends to care about. Culpability usually turns on knowledge and intent, not mere presence. Mens rea, the guilty mind, runs through most serious offences. How that lands on any given node operator depends on jurisdiction, on possession versus distribution, on what safe-harbour rules apply, but the direction is plain enough: a node that once relayed a few tainted bytes while validating a payment, and a network that votes to become a sanctioned store for them, are not in the same posture, for the same reason they are not in the same moral one. Lifting the limit does not only change what Bitcoin carries; it changes what its operators knew, and chose.
An escape exists. It is structural, not operational: you cannot prune your way out, you have to commit your way out. The data must enter the txid as its hash, with the payload carried separately:
OP_RETURN <hash(data)> // 32 bytes, inside the txid
                       // the data itself lives off-chain / segregated
Now every property the defense wanted is actually true:
A node can recompute the txid from the 32-byte hash. It never needed the content.
Commit, don't carry. The chain commits; the data is served; the two are decoupled. The proof lives in the commitment, not the content. It is the only arrangement that is at once trustless and take-down-able.
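A minimal sketch of the commit-don't-carry shape follows. It assumes SHA-256 as the commitment hash, which the text leaves unspecified, and the function names are hypothetical. The point it shows is that the on-chain script is the same size whatever the payload, and that the payload, served from elsewhere, can still be checked against it.

```python
import hashlib

OP_RETURN = 0x6A
PUSH_32 = 0x20  # opcode that pushes the next 32 bytes


def commitment_script(data: bytes) -> bytes:
    """OP_RETURN <sha256(data)>: the only part that goes inside the txid."""
    return bytes([OP_RETURN, PUSH_32]) + hashlib.sha256(data).digest()


def matches_commitment(script: bytes, data: bytes) -> bool:
    """Check a payload obtained off-chain against the on-chain commitment."""
    return script == commitment_script(data)


small = commitment_script(b"a short note")
large = commitment_script(b"x" * 100_000)
print(len(small), len(large))  # 34 34

# Anyone who still holds the payload can prove it is the committed one.
print(matches_commitment(large, b"x" * 100_000))
# A substituted payload fails the check.
print(matches_commitment(large, b"y" * 100_000))
```

Deleting the payload changes nothing the txid depends on: the 34-byte script stays in the transaction, so the id remains recomputable by every node.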
Bitcoin already built this, for the witness. Segregated Witness moved signature data out of the txid and committed it separately, by hash, in the coinbase. The txid does not depend on the witness at all.
Which means witness data can be pruned without breaking what OP_RETURN breaks. Delete it and a node can still recompute the txid, because the txid never committed to the witness in the first place. (Re-checking the signatures from scratch is a separate matter and still needs the witness, or assumevalid; reproducing the transaction's identity does not.) It is the one large carrier whose removal leaves the chain's structure independently verifiable, and that is not an accident of culture but of structure: the witness sits outside the commitment the chain is built from.
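The same property can be shown for the witness in a short sketch. It assumes the SegWit serialization in which the txid hashes the transaction without marker, flag, or witness, while the wtxid hashes the full form. The transaction contents are placeholders, and the helper names are hypothetical.

```python
import hashlib
import struct


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def varint(n: int) -> bytes:
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def serialize(tx: dict, include_witness: bool) -> bytes:
    raw = struct.pack("<i", tx["version"])
    if include_witness:
        raw += b"\x00\x01"  # SegWit marker and flag bytes
    raw += varint(len(tx["inputs"]))
    for prev_txid, prev_index, script_sig, sequence in tx["inputs"]:
        raw += prev_txid + struct.pack("<I", prev_index)
        raw += varint(len(script_sig)) + script_sig + struct.pack("<I", sequence)
    raw += varint(len(tx["outputs"]))
    for value, script_pubkey in tx["outputs"]:
        raw += struct.pack("<q", value) + varint(len(script_pubkey)) + script_pubkey
    if include_witness:
        for stack in tx["witness"]:  # one witness stack per input
            raw += varint(len(stack))
            for item in stack:
                raw += varint(len(item)) + item
    raw += struct.pack("<I", tx["locktime"])
    return raw


def txid(tx: dict) -> str:
    """Hash of the serialization without any witness data."""
    return sha256d(serialize(tx, include_witness=False))[::-1].hex()


def wtxid(tx: dict) -> str:
    """Hash of the full serialization, witness included."""
    return sha256d(serialize(tx, include_witness=True))[::-1].hex()


def make_tx(witness_stack: list) -> dict:
    return {
        "version": 2,
        "inputs": [(bytes(32), 0, b"", 0xFFFFFFFF)],          # placeholder outpoint
        "outputs": [(1000, bytes([0x00, 0x14]) + bytes(20))],  # P2WPKH-shaped output
        "locktime": 0,
        "witness": [witness_stack],
    }


signed = make_tx([b"\x01" * 71, b"\x02" * 33])      # stand-in signature and pubkey
resigned = make_tx([b"\x03" * 71, b"\x02" * 33])    # same spend, different witness
witness_dropped = make_tx([])                        # witness bytes discarded

print(txid(signed) == txid(resigned) == txid(witness_dropped))
print(wtxid(signed) != wtxid(resigned))
```

Whatever happens to the witness bytes, the txid is unaffected; only the wtxid, which is committed separately, moves with them.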
And that is the irony. OP_RETURN is the one carrier that sits inside the txid. Of every place to put removable data, it is the worst, and lifting the limit would make it the biggest. We took the data we are most likely to be forced to delete and welded it to the hash we are least able to delete it from.
The datacarrier fight has been waged as a culture war: spam versus freedom, filters versus neutrality, censorship versus openness. That framing misses the point entirely, because trustlessness is not a policy. It is a property of the data structure.
Raw bytes inside the txid are incompatible with trustless removal. Full stop. No relay policy changes that. No filter changes it. No amount of social consensus about what "should" go on chain changes it. The incompatibility is arithmetic, and arithmetic does not take sides.
Which also means the usual objection, "you just want to censor what goes on chain," is a category error. You can be maximally permissive about what is anchored and still insist it is anchored as a commitment, not as content. That is not censorship. It is the difference between a notary and a hard drive. A notary stamps that a thing existed; it does not become the warehouse for the thing.
The takedown does not have to end in trust. There are ways out, and they line up by how much trust each one quietly reintroduces. Only the first removes the dilemma; the rest only manage it.
The real fix, and the subject of §8. Move the payload out of the txid and commit it by its hash, exactly where Segregated Witness already put signatures. A node verifies from the 32-byte commitment, prunes the bytes at will, and trusts no one about the content. The work is not to invent anything; it is to give data carriers the shape the witness already has.
assumevalid for data (bounded trust)
Extend the checkpoint Bitcoin Core already ships. A public, reviewed assertion that the data below some height was valid, so nodes may drop it and trust the marker. Auditable and bounded, but trust nonetheless, and weaker than today's assumevalid in the way that matters: you cannot switch it off and re-derive, because the bytes are gone.
A succinct zero-knowledge proof, a zk proof, that a pruned transaction was well-formed and its commitment honest, checkable without the data itself. This is not speculative. It is the load-bearing idea behind Mina, whose entire chain stays a few kilobytes by proving its own history recursively, and behind the ZK-rollups that prove the validity of data they never ask every node to store. The primitive is deployed; what is missing is anyone applying it to Bitcoin's own data carriers. And it proves only half the problem: that a pruned transaction was valid, not where its payload now lives. Validity without availability is real progress, but it still leans on one of the options above to hold the bytes. The most ambitious door, and the one already real elsewhere.
Let a few archive nodes hold the flagged content while validators discard it. It works, but the archivers absorb the distribution and legal risk the rest of the network sheds, and everyone inherits a data-availability dependency on them. The content is removable from most nodes only because someone agreed to keep being exposed.
The capitulation. Periodic checkpoints, prune everything beneath, lean on the chain's own accumulated work. Maximum trust, minimum effort, the path of least resistance, and the one the network slides into by default if it builds none of the others.
Four of the five reintroduce trust in some measure; only segregated data removes the dilemma instead of administering it. That is the choice in front of Bitcoin, and it is not a choice about content. It is a choice about whether "verify" is still a thing a node does.
Bitcoin's promise is three words: don't trust, verify. A full node can verify because it can recompute everything from the raw chain. Permit un-committed bulk data, and you manufacture content that one day must be removed, and the day it is removed, recomputation ends and verification quietly degrades into assertion. The node stops checking and starts believing.
Nothing announces this. There is no flag day, no failed block, no error in the logs. It happens one pruned transaction at a time, until "verify" has come to mean "trust whoever still has the bytes." The longer the chain runs and the more it is forced to forget, the more of its own history it can no longer prove, only repeat.
And "whoever still has the bytes" is a shrinking set, which is the part that is easy to miss. Some nodes will always try to keep the entire chain, illegal content and all, but doing so grows more expensive and more legally exposed every year, so that pool thins. This is not the reassurance it sounds like. A smaller pool is a more concentrated one, the rest of the network leaning on fewer and fewer keepers, and a more targetable one: a diffuse network cannot be censored, but a handful of named archives can be taken down one mole at a time. Trustlessness does not switch off; it erodes into a few chokepoints, and the fewer they are, the easier they are both to lean on and to shut down. The binary, keep or drop, is only how it looks from a single node. Across the network it is a slope.
That is the transition: not a coup, not a bug, but a serialization choice slowly cashing itself out. A censorship-resistant chain of proofs becoming a fragile archive of claims. The fix is as quiet as the failure (commit, don't carry), and it is still available, because most of the damage is ahead of us, not behind.
And trustlessness was never a binary. No chain is perfectly trustless; assumevalid already trades a sliver of it for speed. The goal is not a hundred per cent, which no one can promise, but to stay as close as we can and refuse the changes that walk us away for nothing. That is what the reduced-data fight is, and it is still live: the limit is contested, not buried; the filter still has defenders; the camp is still standing. Bitcoin is near the top of this slope, not the bottom, and which way it slides is a matter of will, not cryptography. Every byte the network declines to carry, every node that holds the line, keeps it nearer the trustless end than it would otherwise be, and that margin is still ours to fight for.
It comes down to two mantras, and the fight for reduced data decides which one Bitcoin keeps.
Trust, don't verify.
Don't trust, verify.