Description
Parameters

MAX_THREADS: 8
Specification
Introduce a new type of transaction (following the same general format as #232):

[2, network_id, startgas, to, data, ranges]

Where `ranges` is an RLP list, where each value in the list is an address prefix, representing the range of all addresses with that prefix. For example:

- \x01 refers to the set of addresses from 0x0100000000000000000000000000000000000000 to 0x01ffffffffffffffffffffffffffffffffffffff
- \x02\x48\xab refers to the set of addresses from 0x0248ab0000000000000000000000000000000000 to 0x0248abffffffffffffffffffffffffffffffffff
- 'l#\xfa\xce\x01O \xb3\xeb\xb6Z\xe9m\r\x7f\xf3*\xb9L\x17' refers to the one-element set containing only the address 0x6c23face014f20b3ebb65ae96d0d7ff32ab94c17
- The empty string refers to the set of all addresses

Prefixes longer than 20 bytes are illegal (i.e. transactions containing such prefixes are invalid). We call the "range of a transaction" the union of all ranges specified in that transaction if the transaction is of this new type, and the set of all addresses otherwise. For example, if a transaction's `ranges` field is of the form [\x01, \x02\x03, \xff\x8c\x45], its range is the set of all addresses that start with either 0x01, 0x0203, or 0xff8c45.
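For illustration, the prefix-membership rule can be sketched as follows (a hypothetical helper, not part of the spec):

```python
# Sketch (not normative): an address falls inside a prefix's range iff
# it starts with that prefix; the empty prefix matches every address.

def in_range(address: bytes, prefix: bytes) -> bool:
    assert len(address) == 20
    assert len(prefix) <= 20, "prefixes longer than 20 bytes are invalid"
    return address.startswith(prefix)

addr = bytes.fromhex("0248ab" + "00" * 17)
assert in_range(addr, b"\x02\x48\xab")  # inside the 0x0248ab... range
assert not in_range(addr, b"\x01")      # outside the 0x01... range
assert in_range(addr, b"")              # empty prefix = all addresses
```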
We keep track of a `starting_gas_height` and `finishing_gas_height` for each transaction. We define the `starting_gas_height` of a transaction T to be the maximum of the `finishing_gas_height`s of all transactions T' before T such that either (i) T' is more than MAX_THREADS behind T (i.e. txindex_of(T) - txindex_of(T') > MAX_THREADS) or (ii) the ranges of T and T' intersect. We define the `finishing_gas_height` of T to be the `starting_gas_height` of T plus the amount of gas consumed while executing T.
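A sketch of this bookkeeping, with ranges modeled as lists of byte prefixes ([b""] stands for "all addresses", i.e. an old-style transaction); the helper names are illustrative, not normative:

```python
MAX_THREADS = 8

def ranges_intersect(ranges_a, ranges_b):
    # Two prefix ranges overlap iff some prefix in one is a prefix of
    # (or equal to) some prefix in the other.
    return any(a.startswith(b) or b.startswith(a)
               for a in ranges_a for b in ranges_b)

def gas_heights(txs):
    """txs: list of (ranges, gas_used) pairs, in block order.
    Returns a list of (starting_gas_height, finishing_gas_height)."""
    heights = []
    for i, (ranges, gas_used) in enumerate(txs):
        start = 0
        for j, (prev_ranges, _) in enumerate(txs[:i]):
            if i - j > MAX_THREADS or ranges_intersect(ranges, prev_ranges):
                start = max(start, heights[j][1])
        heights.append((start, start + gas_used))
    return heights

# Disjoint ranges sit side by side; a conflicting tx stacks on top.
hs = gas_heights([([b"\x01"], 25_000), ([b"\x02"], 50_000), ([b"\x01"], 10_000)])
assert hs == [(0, 25_000), (0, 50_000), (25_000, 35_000)]
```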
The current rule that a transaction is invalid if its `start_gas` plus the current `total_gas_used` exceeds the block `gas_limit` is removed, and replaced with a rule that a transaction T is invalid if T.starting_gas_height + T.start_gas > gas_limit. Notice that in the case where all transactions use the entire address space as their range, this is exactly equivalent to the current status quo, but in the case where transactions use disjoint ranges, this increases capacity.
Any CALL, CREATE, CALLCODE, DELEGATECALL or STATICCALL that attempts to access an account outside a transaction's range fails; any EXTCODESIZE, EXTCODECOPY, BALANCE or SELFDESTRUCT that attempts to access an account outside a transaction's range immediately throws an exception. Addresses 0...127 have one exception: any account can STATICCALL them. We add a binding norm on protocol development that no address may be introduced into that range which has logic such that it may return a differing value depending on transactions that take place within a block (though changes between blocks are acceptable).
Rationale
This allows transactions in the EVM to be processed in parallel much more easily, by specifying statically what addresses they can access; it also appropriately incentivizes making transactions that are easy to parallelize. It does this in a maximally backwards-compatible and non-intrusive way where old-style transactions can continue to work.
The exception for addresses 0...127 is introduced to allow access to precompiles and to system contracts like BLOCKHASH.
Note that this EIP does change the role of the block gas limit from being a limit on total gas consumed to being a limit on gas height. This means that the theoretical max capacity of the blockchain would increase by a factor of MAX_THREADS, though if we want to avoid increasing uncle rates it does require the assumption that miners are running on hardware with MAX_THREADS CPU cores. For clarity, it may be a good idea to rename the block `gas_limit` to `gas_height_limit`.
Client implementation
A client can follow this algorithm:

1. Let TXINDEX = 0.
2. Initialize `txgrab` as an empty list.
3. WHILE (i) the transaction in the block at index TXINDEX does not intersect with any transaction in `txgrab`, (ii) len(txgrab) < MAX_THREADS and (iii) TXINDEX < len(block.transactions), add the transaction in the block at index TXINDEX to `txgrab` and set TXINDEX += 1.
4. Execute all transactions in `txgrab` in parallel.
5. If TXINDEX == len(block.transactions), exit. Otherwise, go back to step 2.
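The loop above can be transcribed directly (a sketch; intersects() and execute() are caller-supplied stand-ins for real range checks and EVM execution, which this algorithm leaves abstract):

```python
from concurrent.futures import ThreadPoolExecutor

def process_block(transactions, intersects, execute, max_threads=8):
    txindex = 0
    while txindex < len(transactions):
        txgrab = []
        # Steps 2-3: greedily grab a batch of mutually non-intersecting txs.
        while (txindex < len(transactions)
               and len(txgrab) < max_threads
               and not any(intersects(transactions[txindex], t) for t in txgrab)):
            txgrab.append(transactions[txindex])
            txindex += 1
        # Step 4: execute the batch in parallel; the with-block waits for
        # all workers before the next batch begins.
        with ThreadPoolExecutor(max_workers=max_threads) as pool:
            list(pool.map(execute, txgrab))

# Toy demo: tx3 shares a range with tx1, so it lands in a second batch.
executed = []
txs = [{"id": 1, "r": {1}}, {"id": 2, "r": {2}}, {"id": 3, "r": {1}}]
process_block(txs, lambda a, b: bool(a["r"] & b["r"]), executed.append,
              max_threads=2)
assert executed[-1]["id"] == 3
```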
Specification, v2
Modify the new transaction type to be as follows:

[2, network_id, startgas, to, data, read_ranges, write_ranges]

The read range of a transaction is the union of all `read_ranges`, and the write range of a transaction is the union of all `write_ranges`. Replace "the ranges of T and T' intersect" above with "the read or write range of T intersects with the write range of T'" (where T is the current transaction and T' is a previous transaction). Read operations are allowed to access the read or write ranges; write operations are allowed to access the write range only.
This adds a further optimization where if two transactions' read ranges intersect at data that is not written to, then they can still be processed in parallel.
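The v2 dependency rule can be sketched with hypothetical helpers (ranges modeled as lists of byte prefixes, as in v1):

```python
# T depends on an earlier T' iff T's read or write range intersects
# T''s write range.
def ranges_intersect(a, b):
    # Prefix ranges overlap iff one prefix is a prefix of the other.
    return any(x.startswith(y) or y.startswith(x) for x in a for y in b)

def depends_on(t, t_prev):
    """t, t_prev: dicts with 'read' and 'write' lists of byte prefixes."""
    return (ranges_intersect(t["read"], t_prev["write"])
            or ranges_intersect(t["write"], t_prev["write"]))

# Transfers on two distinct token contracts: fully independent.
t1 = {"read": [b"\x01"], "write": [b"\x01"]}
t2 = {"read": [b"\x02"], "write": [b"\x02"]}
assert not depends_on(t2, t1)
# A shared read-only library at 0x03...: still parallelizable under v2.
t3 = {"read": [b"\x01", b"\x03"], "write": [b"\x01"]}
t4 = {"read": [b"\x02", b"\x03"], "write": [b"\x02"]}
assert not depends_on(t4, t3)
```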
The exception for addresses 0...127 can be removed, as any contracts that wish to use the precompiles can simply include the addresses in their read range, and even if all transactions include those addresses in their read range parallelizability will not be significantly impacted as almost no one wants to write to those addresses.
Proposed amendment 1 (thanks @Arachnid for the suggestion)

Instead of each range being an RLP list, it is simply a byte array, where each byte represents a one-byte prefix (e.g. the byte array \x03\x35\xfe means "the set of all addresses starting with 0x03, 0x35 or 0xfe"). An empty byte array represents the set of all addresses. This simplifies implementation at some cost to granularity, though the value of that level of granularity is arguably low.
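Under this amendment, membership checking reduces to a byte lookup (sketch; helper name is made up):

```python
# Each byte of range_bytes is a one-byte prefix; empty = all addresses.
def in_amended_range(address: bytes, range_bytes: bytes) -> bool:
    if range_bytes == b"":
        return True
    return address[0] in range_bytes  # first byte of the address

addr = bytes.fromhex("35" + "00" * 19)
assert in_amended_range(addr, b"\x03\x35\xfe")
assert not in_amended_range(addr, b"\x03\xfe")
assert in_amended_range(addr, b"")
```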
Activity
mattdf commented on Jun 18, 2017

I don't entirely understand the calculation for `starting_gas_height` and `finishing_gas_height`; what's the point of that above just computing startgas*len(tx_set) for each tx_set in rangeset R?

chrisfranko commented on Jun 18, 2017
interesting. Can you give an example of when this would be best taken advantage of?
VictorTaelin commented on Jun 19, 2017
Wondering why this is at transaction level, not contract level. If a contract could specify what other contracts it can communicate with, then all transactions sent to it would be computed in parallel, and as a bonus the blockchain itself could eventually be.
Arachnid commented on Jun 19, 2017
Given addresses are randomly distributed throughout the address space, I'm not sure how useful a prefix-based scheme is likely to be - how often will contracts you wish to interact with be conveniently clustered under a common prefix?
Arachnid commented on Jun 19, 2017
@MaiaVictor What if the contract can communicate with other contracts specified at runtime?
heikoheiko commented on Jun 20, 2017

Once we have #98, clients can also implement optimistic concurrent processing of txs. The assumption here would be that most txs in a block don't share any state, except for `block.coinbase`. So we'd need a different mechanism to collect fees for the miner, but otherwise this can be done without any further protocol updates (a hint on parallelizable groups of txs in the transaction list might be helpful though).

PhABC commented on Jun 21, 2017
@Arachnid From my understanding, you could maybe have 10 contracts for a given token (e.g. SNT) where each SNT contract only has access to 1/10 of the address space, with the ability to do cross-token-contract swaps as well. If this is possible, then you could make SNT transactions parallelizable to some extent.
That's the best example I could come up with, but I am sure there are more sensible ones.
Smithgift commented on Jun 21, 2017
I would rather that DELEGATECALL (and for the sake of symmetry, CALLCODE) worked out-of-range. Blindly deploying a contract with libraries is unlikely to group them in the same range. Technically, one could brute-force them into doing so by grinding away at addresses and nonces, but this would be incredibly user-unfriendly. It is also backwards-incompatible.
I do realize that a DELEGATECALL could fail if the targeted contract selfdestructed. One possible solution would be that a DELEGATECALL uses the code of a contract at the beginning of a block, and ignore selfdestructs for this one purpose. That adds a wart, but it also improves the life of the average user.
All that said, I agree with @Arachnid. In a way, this penalizes non-address-grinded multi-contract dapps. For that matter, if address grinding ever became a significant factor in dapp design, users would inevitably want to cluster themselves around popular ranges, leading to pseudo-shards. Elegant and emergent in its own bizarre way, but I don't think it would be desirable compared to an actual first-class sharding system.
@heikoheiko: I'm curious why sharing `block.coinbase` is a problem. Unless that one address is involved with the transaction, there should be no difference whenever it receives rewards.

vbuterin commented on Jun 23, 2017
Agree. However, there is currently no gas incentive to make nicely parallelizable txs, and furthermore the worst case still has zero parallelization, and client devs would still have to design for the worst case. This EIP increases the gas limit for "normal usage" by up to 8x but does NOT worsen the worst case at all, and it also introduces an up to 8x subsidy for parallelizable txs.
Most transactions only need to interact with one contract; with EIP86 maybe two. To give one trivial example, any transactions that use two distinct ERC20 tokens will usually have independent read/write ranges.
Consider things this way:
Ideally, we want to retain support for 1b and 2b, but recognize and subsidize 1a and 2a. If we recognize 1a in protocol, then we don't recognize all of 2a, because there are cases where a contract could theoretically interact with anything (quick example: a decentralized exchange), but the transaction can be more static (one particular order in the DEX with a known counterparty). However, any instance of 1a is also an instance of 2a, as if a given contract C only affects D1, D2 and D3, then any transaction sending to C can only affect C, D1, D2 and D3. Hence, 1a is a subset of 2a, and so 2a is the more general thing to recognize and subsidize in protocol.
Some quick examples:
I would even go so far as to say that nearly all activity on the current ETH chain would be parallelizable.
This is what specification v2 is for. Libraries are generally read-only, and v2 allows for intersecting read ranges, so it should cover all of these use cases.
vbuterin commented on Jun 23, 2017
I think you might have misunderstood. My scheme does not require you to choose a single prefix. You can set the read or write range for a transaction as the union of multiple prefixes. So address grinding should not be necessary or particularly helpful.
vbuterin commented on Jun 23, 2017
Not sure I understand your proposal. The intuition behind my proposal is that the finishing_gas_height of a transaction can be thought of as an upper bound on the computational "depth" of the transaction - that is, the number of clock cycles of time needed to execute that transaction and all of its dependencies that can't be parallelized.
I'll give an example. Suppose you have four transactions, and suppose that the full address range is the letters ABCDEFGHIJ for simplicity. The first tx has 25000 gas and read/write ranges ABCD. The second tx has 50000 gas and read ranges EFGHIJ and write ranges EFG. The third tx has 65000 gas and read ranges EFGHIJ and write ranges HIJ. All three of those txs can be executed in parallel. Now, suppose the fourth tx has 20000 gas and has read/write ranges CDEF. Executing this tx requires having already executed the first and the second, though not the third (as the third does not write anything that the fourth reads or writes); hence its finishing_gas_height is max(25000, 50000) + 20000 = 70000. A computer with 8 threads would be able to finish processing all four transactions in 70000 cycles (assuming 1 cycle = 1 gas); hence these 4 transactions could fit into a block with a 70000 gas limit.
Hope that makes some sense.
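The arithmetic of the example can be checked directly (the fourth tx must wait for the first and second, whose write ranges it touches, but not the third):

```python
finishing = {"tx1": 25_000, "tx2": 50_000, "tx3": 65_000}
tx4_start = max(finishing["tx1"], finishing["tx2"])  # its dependencies
tx4_finish = tx4_start + 20_000
assert tx4_finish == 70_000
# The other three finish at or below 65k, so with enough threads the
# whole block fits under a 70000 gas-height limit.
assert max(finishing.values()) <= tx4_finish
```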
aakilfernandes commented on Jun 23, 2017
I like this proposal a lot. Given that developers usually opt towards "pull" rather than "push" transactions, most transactions should be able to take advantage of this.
If the subsidy is removed, could this be implemented as a soft fork? Client devs may struggle with multi-threaded implementations and I don't see a pressing need for non-miners to implement this.
LefterisJP commented on Jun 23, 2017

@vbuterin I have a question on `finishing_gas_height`, which you described as the computational depth of the transaction. In your example with 4 transactions you calculate:

finishing_gas_height(T3) = max(25000, 50000) + 20000 = 70000

The number 70000 is the maximum gas needed to execute that transaction and all of its dependencies that can't be parallelized. Correct?

But then you mention that:

Unless I am missing something, the block's max gas limit required shouldn't change due to the way you process the transactions. Shouldn't it be 160000, basically the sum of all transactions of the block?

LefterisJP commented on Jun 23, 2017
Some additional comments on the actual spec:
I really like the idea and believe it will really help relieve the network pressure in Status-like ICO situations.
Even though I suppose it would be a bit more work, I think that spec V2, where you explicitly specify both read and write ranges of the transaction, is far superior since it will allow for more efficient parallelization.

vbuterin commented on Jun 23, 2017
This EIP will change the role of the block gas limit. So a 4.7m gas limit would be not just 4.7m gas worth of computation, but rather 4.7m gas worth of time on an 8-core machine. Perhaps renaming it to `gas_height_limit` would make that clearer?

Agree. I definitely believe v2 is superior.
jannikluhn commented on Mar 29, 2018

As discussed at the sharding workshop last week, the above definitions of starting and finishing gas heights can lead to suboptimal results if transactions are not ordered properly (or I have misunderstood the proposal). Example:
MAX_THREADS = 2
gas_limit = 150k
tx1.gas_used = 50k
tx2.gas_used = 50k
tx3.gas_used = 100k
tx4.gas_used = 100k
If the transactions are ordered [tx1, tx2, tx3, tx4], the "gas ranges" (`starting_gas_height` and `finishing_gas_height`) are as follows:

tx1: 0 - 50k
tx2: 0 - 50k
tx3: 50k - 150k
tx4: 50k - 150k
However, if the order is [tx3, tx1, tx2, tx4]:

tx3: 0 - 100k
tx1: 0 - 50k
tx2: 100k - 150k
tx4: 100k - 200k
So in this order the global gas limit would be exceeded, although a sensible miner strategy can execute the transactions in the same amount of time ("schedule the next transaction if there's a free thread and it doesn't overlap with any non-executed earlier transaction").
I think this could be fixed by keeping track of MAX_THREADS separate gas heights and, for each transaction, choosing the lowest possible one under the constraint of no conflicting parallel transactions.

I wanted to point this out, but in my opinion it is not worth implementing this more complicated algorithm. Both methods appear to be equivalent if the miner (or, more generally, the proposer) chooses an optimal transaction order in the first place. This needs to be done anyway in order to maximize gas usage if transactions do have overlapping access lists.
joeykrug commented on May 18, 2018
Whatever happened to this? Long ago it was slated for metropolis
cdetrio commented on Jun 8, 2018
@joeykrug, cross-posting my reply from reddit:
It was discussed on an AllCoreDevs call around the time (it was never "slated for Metropolis", as in it was never officially Accepted for inclusion in the next HF on an All Core Devs call). As I recall, the feedback was that increasing the number of CPU threads doesn't address the actual bottleneck in scaling tx throughput, which is disk I/O (not CPU processing). You can see this feedback in the github issue itself and in vbuterin's reply:

Once it became widely realized that disk I/O is the bottleneck, research began on optimizing the state trie's data structure, and how it might be possible to parallelize SSD reads: Roadmap for Turbo-Geth.
After EIP 648, the next research thrust was on stateless clients: "this is in many ways a supercharged version of EIP 648." Stateless clients would enable a sort of 'parallelized state i/o' because all accessed state is pre-specified in the witness list (i.e. all accessed state is pre-loaded from disk into memory before the transaction is processed; in theory disk I/O of the validating node is no longer the bottleneck).
In subsequent months, discussions around sharding and stateless clients have mainly revolved around the question of "stateless clients vs storage rent". And that brings us to the present day, imo.
AlexeyAkhunov commented on Dec 11, 2018
Would this not allow the DAO-style soft forks, and with it, censorship? http://hackingdistributed.com/2016/07/05/eth-is-more-resilient-to-censorship/
If the ranges are tight, then miners can cheaply figure out what the effects of a transaction are, and censor them. If the ranges are broad, then there is less benefit from parallelisation. The way I see it, this trades off censorship resistance for more parallel execution. Not sure this tradeoff is worth making.
github-actions commented on Jan 2, 2022
There has been no activity on this issue for two months. It will be closed in a week if no further activity occurs. If you would like to move this EIP forward, please respond to any outstanding feedback or add a comment indicating that you have addressed all required feedback and are ready for a review.
github-actions commented on Jan 16, 2022
This issue was closed due to inactivity. If you are still pursuing it, feel free to reopen it and respond to any feedback or request a review in a comment.
borispovod commented on Mar 29, 2024
Why not use a similar approach to BlockSTM? I think Polygon has already implemented it for EVM or at least an MVP.
laurentyzhang commented on Jul 19, 2024
This belongs to pessimistic concurrency control, but BlockSTM is based on optimistic concurrency control. They are not very similar.