Assuming each transaction has two properties:
- has a limited expiry time(on flow through referencing a block, bounded by a max number of blocks in the past)
- has a unique way of reference (i.e. flow transaction ID is protected against malleability)
As part of the execution state on execution nodes, we could add a secondary on-chain Merkle trie (beside account states) to capture transaction inclusion. Every time a transaction is about to be executed EN nodes add the transaction to the merkle tree and if it already exist it rejects execution of the transaction
Just like the execution state, we would to only keep this state on the execution nodes and provides a secondary level of proofs to verification nodes to verify inclusion or non-inclusion of a transaction. (inclusion proof of transaction before execution shows why execution was rejected).
Clearly, this trie would keep growing and we need pruning and thats might look challenging initially. One way to think about this is we only need to keep executed but not expired transaction in the trie, if an expired transaction is included, then it won’t be executed and would make the collection invalid which already Flow protocol handles if it is not expired but executed in the past we need to check the trie and make sure its the first time we are executing this transaction.
But how do we prune this trie and remove transactions that have already expired without having this tracking on verification nodes?
Our proposal
We have designed a method that could work using two tricks
- special consideration for path construction of a transactions
- branch pruning with proofs
Let’s consider for inserting a transaction into the trie, we don’t use the transaction ID and we use a longer variation of path, concatenation of block reference and transaction ID. Note the block reference here is different from the block that the transaction is getting executed, it just simply the value that is set on the transaction for expiry. This way when we are inserting transactions into the trie, it would be automatically update sub-tries that has to be pruned after a block is not valid for referencing anymore (or all transaction in that subtrie gets expired).
This way we don’t need to track transactions in a secondary location and trie it’s encapsulating that information within. This pruning by subpath or path prefix (block hash) could be done on system chunk and be verified by verification nodes. the proof here would be simply replacing an interim node with a default node or an empty node, thus we don’t need to provide a batch of proof anymore.
How about proof sizes?
Our calculation shows the proof size even on fully matured Flow (millions of transactions) is still manageable and also since this trie is on a separate subtrie than the execution node, these proofs could be constructed in parallel to collection execution (pipelined) and could be send as a secondary package for data beside chunk data packs for verification.
About the trie?
We use a special variation of a Merkle-trie optimized for this problem and provides set commitment over the list of non-expired transaction executed recently and going to be executed in the current collection.
This special data structure
- supports variable length paths and interim node compactions
- is optimized for batch insertion of non-existing values and returning errors if any value already existed
- is optimized for easy dropping the values on a sub-trie given a sub-path (with prune-proof)
- is optimized for batch proof of non-inclusion of values (in practice batch proof of inclusion of empty values helps the verification nodes to have the full structure of the partial trie)
What about transaction ordering?
I believe transaction ordering is still possible but with explicit hints on transaction, a transaction could also reference another unexpired transaction to tell collection nodes and execution nodes to only include if the previous one has been executed. This only requires the trie to hold on to transactions double the max size of expiry in the trie.