
Easy parallelizability #648

vbuterin (Contributor)

Description
Parameters

  • MAX_THREADS: 8

Specification

Introduce a new type of transaction (following the same general format as #232):

[2, network_id, startgas, to, data, ranges]

Where ranges is an RLP list in which each value is an address prefix, representing the range of all addresses with that prefix. For example:

  • \x01 refers to the set of addresses from 0x0100000000000000000000000000000000000000 to 0x01ffffffffffffffffffffffffffffffffffffff
  • \x02\x48\xab refers to the set of addresses from 0x0248ab0000000000000000000000000000000000 to 0x0248abffffffffffffffffffffffffffffffffff
  • 'l#\xfa\xce\x01O \xb3\xeb\xb6Z\xe9m\r\x7f\xf3*\xb9L\x17' refers to the one-element set containing only the address 0x6c23face014f20b3ebb65ae96d0d7ff32ab94c17
  • The empty string refers to the set of all addresses

Prefixes longer than 20 bytes are illegal (ie. transactions containing such prefixes are invalid). We call the "range of a transaction" the union of all ranges specified in that transaction if the transaction is of this new type, and the set of all addresses otherwise. For example, if a transaction's ranges field is of the form [\x01, \x02\x03, \xff\x8c\x45], its range is the set of all addresses that start with either 0x01, or 0x0203, or 0xff8c45.
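For concreteness, here is a minimal Python sketch of how a client might interpret prefixes as address ranges. The helper names (prefix_to_bounds, address_in_range) are illustrative only and not part of the specification:

```python
def prefix_to_bounds(prefix: bytes) -> tuple:
    """A prefix of up to 20 bytes covers every address that starts with it."""
    assert len(prefix) <= 20, "prefixes longer than 20 bytes are illegal"
    low = prefix + b'\x00' * (20 - len(prefix))
    high = prefix + b'\xff' * (20 - len(prefix))
    return low, high

def address_in_range(address: bytes, ranges: list) -> bool:
    """A transaction's range is the union of its prefix ranges; note that a
    range list containing the empty string covers every address."""
    return any(address.startswith(prefix) for prefix in ranges)

# The second example from the text:
low, high = prefix_to_bounds(b'\x02\x48\xab')
assert low.hex() == '0248ab' + '00' * 17
assert high.hex() == '0248ab' + 'ff' * 17
```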

We keep track of a starting_gas_height and finishing_gas_height for each transaction. We define the starting_gas_height of a transaction T to be the maximum of all finishing_gas_heights of all transactions T' before T such that either (i) T' is more than MAX_THREADS behind T (ie. txindex_of(T) - txindex_of(T') > MAX_THREADS) or (ii) the ranges of T and T' intersect. We define the finishing_gas_height of T to be the starting_gas_height of T plus the amount of gas consumed while executing T.

The current rule that a transaction is invalid if its startgas plus the current total_gas_used exceeds the block gas_limit is removed, and replaced with a rule that a transaction T is invalid if: T.starting_gas_height + T.startgas > gas_limit. Notice that in the case where all transactions use the entire address space as their range, this is exactly equivalent to the current status quo, but in the case where transactions use disjoint ranges, this increases capacity.
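A hedged sketch of this bookkeeping, assuming a ranges_intersect(tx_a, tx_b) predicate, a tx.startgas field, and a gas_used(tx) accessor that are not defined by this EIP:

```python
MAX_THREADS = 8

def compute_gas_heights(txs, gas_limit, ranges_intersect, gas_used):
    """Return (starting, finishing) gas heights for each tx in block order,
    rejecting any tx whose starting gas height plus startgas exceeds the limit."""
    starting, finishing = [], []
    for i, tx in enumerate(txs):
        deps = [finishing[j] for j, prev in enumerate(txs[:i])
                if i - j > MAX_THREADS or ranges_intersect(tx, prev)]
        start_h = max(deps, default=0)
        if start_h + tx.startgas > gas_limit:
            raise ValueError("transaction %d is invalid under the new rule" % i)
        starting.append(start_h)
        finishing.append(start_h + gas_used(tx))
    return starting, finishing
```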

Any CALL, CREATE, CALLCODE, DELEGATECALL or STATICCALL that attempts to access an account outside a transaction's range fails; any EXTCODESIZE, EXTCODECOPY, BALANCE or SELFDESTRUCT that attempts to access an account outside a transaction's range immediately throws an exception. Addresses 0...127 are subject to one exception: any account may STATICCALL them. We add a binding norm on protocol development that no address may be introduced into that range whose logic can return a different value depending on transactions that take place within the same block (changes between blocks are acceptable).
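A brief sketch of how an EVM implementation might gate account access per opcode; address_in_range is the assumed helper from the sketch above, and the split between "the call fails" and "immediately throws" mirrors the rule just described:

```python
CALL_LIKE = {'CALL', 'CREATE', 'CALLCODE', 'DELEGATECALL', 'STATICCALL'}
STATE_LIKE = {'EXTCODESIZE', 'EXTCODECOPY', 'BALANCE', 'SELFDESTRUCT'}

def account_access_allowed(opcode, target: bytes, tx_ranges, address_in_range):
    # Addresses 0...127 may always be STATICCALLed (precompiles and
    # BLOCKHASH-style system contracts).
    if opcode == 'STATICCALL' and int.from_bytes(target, 'big') < 128:
        return True
    if address_in_range(target, tx_ranges):
        return True
    if opcode in STATE_LIKE:
        raise Exception("out-of-range access")   # immediately throws
    return False                                 # CALL family: the call just fails
```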

Rationale

This allows transactions in the EVM to be processed in parallel much more easily, by specifying statically what addresses they can access; it also appropriately incentivizes making transactions that are easy to parallelize. It does this in a maximally backwards-compatible and non-intrusive way where old-style transactions can continue to work.

The exception for addresses 0...127 is introduced to allow access to precompiles and to system contracts like BLOCKHASH.

Note that this EIP does change the role of the block gas limit from being a limit on total gas consumed to being a limit on gas height. This means that the theoretical max capacity of the blockchain would increase by a factor of MAX_THREADS, though if we want to avoid increasing uncle rates it does require the assumption that miners are running on hardware with MAX_THREADS CPU cores. For clarity, it may be a good idea to rename the block gas_limit to gas_height_limit.

Client implementation

A client can use the following algorithm (a Python sketch follows the list):

  1. Let TXINDEX = 0
  2. Initialize txgrab as an empty list.
  3. WHILE (i) the transaction in the block at index TXINDEX does not intersect with any transaction in txgrab, (ii) len(txgrab) < MAX_THREADS and (iii) TXINDEX < len(block.transactions), add the transaction in the block at index TXINDEX to txgrab and set TXINDEX += 1.
  4. Execute all transactions in txgrab in parallel.
  5. If TXINDEX == len(block.transactions), exit. Otherwise, go back to step 2.
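In Python, the loop might look roughly like this; ranges_intersect and execute_parallel are assumed helpers, not defined by this EIP:

```python
def process_block(block, ranges_intersect, execute_parallel, MAX_THREADS=8):
    txindex = 0                                        # step 1
    while txindex < len(block.transactions):
        txgrab = []                                    # step 2
        while (txindex < len(block.transactions)       # step 3
               and len(txgrab) < MAX_THREADS
               and not any(ranges_intersect(block.transactions[txindex], t)
                           for t in txgrab)):
            txgrab.append(block.transactions[txindex])
            txindex += 1
        execute_parallel(txgrab)                       # step 4
    # step 5: the outer loop exits once TXINDEX == len(block.transactions)
```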

Specification, v2

Modify the new transaction type to be as follows:

[2, network_id, startgas, to, data, read_ranges, write_ranges]

The read range of a transaction is the union of all read_ranges, and the write range of a transaction is the union of all write_ranges. Replace "the ranges of T and T' intersect" above with "the read or write range of T intersects with the write range of T'" (where T is the current transaction and T' is a previous transaction). Read operations may access anything in the read or write range; write operations may access the write range only.

This adds a further optimization: if two transactions' read ranges intersect only at data that neither writes, they can still be processed in parallel.
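A sketch of the v2 conflict test, assuming read_ranges and write_ranges are lists of address prefixes as above:

```python
def prefixes_overlap(a: bytes, b: bytes) -> bool:
    # Two prefix ranges intersect exactly when one prefix extends the other.
    return a.startswith(b) or b.startswith(a)

def ranges_overlap(ranges_a, ranges_b) -> bool:
    return any(prefixes_overlap(a, b) for a in ranges_a for b in ranges_b)

def conflicts(tx, prev_tx) -> bool:
    """T (tx) must wait for an earlier T' (prev_tx) only if T's read or write
    range intersects T''s write range."""
    return (ranges_overlap(tx.read_ranges, prev_tx.write_ranges) or
            ranges_overlap(tx.write_ranges, prev_tx.write_ranges))
```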

The exception for addresses 0...127 can be removed: any contract that wishes to use the precompiles can simply include those addresses in its read range, and even if all transactions include those addresses in their read ranges, parallelizability will not be significantly impacted, since almost no one wants to write to those addresses.

Proposed amendment 1 (thanks @Arachnid for the suggestion)

Instead of each range being an RLP list, it is simply a byte array, where each byte represents a one-byte prefix (eg. the byte array \x03\x35\xfe means "the set of all addresses starting with 0x03, 0x35 or 0xfe"). An empty byte array represents the set of all addresses. This simplifies implementation at some cost to granularity, though the value of that level of granularity is arguably low.
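A one-function sketch of decoding the amendment's byte-array form (illustrative only):

```python
def decode_prefixes(range_bytes: bytes) -> list:
    """Each byte is a one-byte prefix; an empty array means all addresses."""
    if range_bytes == b'':
        return [b'']  # the empty prefix covers every address
    return [bytes([b]) for b in range_bytes]

assert decode_prefixes(b'\x03\x35\xfe') == [b'\x03', b'\x35', b'\xfe']
```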

Activity

mattdf (Member) commented on Jun 18, 2017

I don't entirely understand the calculation for starting_gas_height and finishing_gas_height. What's the point of that over just computing startgas*len(tx_set) for each tx_set in rangeset R?

chrisfranko commented on Jun 18, 2017

Interesting. Can you give an example of when this would be best taken advantage of?

VictorTaelin commented on Jun 19, 2017

Wondering why this is at the transaction level, not the contract level. If a contract could specify what other contracts it can communicate with, then all transactions sent to it would be computed in parallel, and as a bonus the blockchain itself could eventually be parallelized as well.

Arachnid (Contributor) commented on Jun 19, 2017

Given addresses are randomly distributed throughout the address space, I'm not sure how useful a prefix-based scheme is likely to be - how often will contracts you wish to interact with be conveniently clustered under a common prefix?

Arachnid (Contributor) commented on Jun 19, 2017

@MaiaVictor What if the contract can communicate with other contracts specified at runtime?

heikoheiko (Member) commented on Jun 20, 2017

Once we have #98, clients can also implement optimistic concurrent processing of txs. The assumption here would be that most txs in a block don't share any state, except for block.coinbase. So we'd need a different mechanism to collect fees for the miner, but otherwise this can be done without any further protocol updates (a hint on parallelizable groups of txs in the transaction list might be helpful though).

PhABC (Contributor) commented on Jun 21, 2017

@Arachnid From my understanding, you could maybe have 10 contracts for a given token (e.g. SNT) where each SNT contract only has access to 1/10 of the address space, with the ability to do cross-token-contract swaps as well. If this is possible, then you could make SNT transactions parallelizable to some extent.

That's the best example I could come up with, but I am sure there are more sensible ones.

Smithgift commented on Jun 21, 2017

I would rather that DELEGATECALL (and for the sake of symmetry, CALLCODE) worked out-of-range. Blindly deploying a contract with libraries is unlikely to group them in the same range. Technically, one could brute-force them into doing so by grinding away at addresses and nonces, but this would be incredibly user-unfriendly. It is also backwards-incompatible.

I do realize that a DELEGATECALL could fail if the targeted contract selfdestructed. One possible solution would be for a DELEGATECALL to use the code of a contract as of the beginning of the block, ignoring selfdestructs for this one purpose. That adds a wart, but it also improves the life of the average user.

All that said, I agree with @Arachnid. In a way, this penalizes non-address-grinded multi-contract dapps. For that matter, if address grinding ever became a significant factor in dapp design, users would inevitably want to cluster themselves around popular ranges, leading to pseudo-shards. Elegant and emergent in its own bizarre way, but I don't think it would be desirable compared to an actual first-class sharding system.

@heikoheiko: I'm curious why sharing block.coinbase is a problem. Unless that one address is involved with the transaction, there should be no difference whenever it receives rewards.

vbuterin (Contributor, Author) commented on Jun 23, 2017

> Once we have #98, clients can also implement optimistic concurrent processing of txs. The assumption here would be that most txs in a block don't share any state, except for block.coinbase. So we'd need a different mechanism to collect fees for the miner, but otherwise this can be done without any further protocol updates (a hint on parallelizable groups of txs in the transaction list might be helpful though).

Agree. However, there is currently no gas incentive to make nicely parallelizable txs, and furthermore the worst case still has zero parallelization, and client devs would still have to design for the worst case. This EIP increases the gas limit for "normal usage" by up to 8x but does NOT worsen the worst case at all, and it also introduces an up to 8x subsidy for parallelizable txs.

> Given addresses are randomly distributed throughout the address space, I'm not sure how useful a prefix-based scheme is likely to be - how often will contracts you wish to interact with be conveniently clustered under a common prefix?

Most transactions only need to interact with one contract; with EIP86 maybe two. To give one trivial example, any two transactions that use distinct ERC20 tokens will usually have independent read/write ranges.

> If a contract could specify what other contracts it can communicate with, then all transactions sent to it would be computed in parallel, and as a bonus the blockchain itself could eventually be.

Consider things this way:

  1. (a) There exist contracts that only need to interact with a few other contracts. (b) There also exist contracts that could theoretically interact with anything.
  2. (a) There exist transactions that only need to interact with a few known contracts. (b) There also exist transactions that could theoretically interact with any address.

Ideally, we want to retain support for 1b and 2b, but recognize and subsidize 1a and 2a. If we recognize 1a in protocol, then we don't recognize all of 2a, because there are cases where a contract could theoretically interact with anything (quick example: decentralized exchange), but the transaction can be more static (one particular order in the DEX with a known counterparty). However, any instance of 1a is also an instance of 2a, since if a given contract C only affects D1, D2 and D3, then any transaction sending to C can only affect C, D1, D2 and D3. Hence, 1a is a subset of 2a, and so 2a is the more general thing to recognize and subsidize in protocol.

> Interesting. Can you give an example of when this would be best taken advantage of?

Some quick examples:

  1. Transactions going to an ICO can happen in parallel to almost everything else, so the ICO would not interfere with normal traffic the same way. In fact, Status-style network congestion would not be possible unless there were 8 ICOs happening at the same time.
  2. ERC20 tokens can be parallelized (ie. transactions sending MKR and transactions sending REP can be totally separated)

I would even go so far as to say that nearly all activity on the current ETH chain would be parallelizable.

> I would rather that DELEGATECALL (and for the sake of symmetry, CALLCODE) worked out-of-range.

This is what specification v2 is for. Libraries are generally read-only, and v2 allows for intersecting read ranges, so it should cover all of these use cases.

vbuterin (Contributor, Author) commented on Jun 23, 2017

> how often will contracts you wish to interact with be conveniently clustered under a common prefix?

I think you might have misunderstood. My scheme does not require you to choose a single prefix. You can set the read or write range for a transaction as the union of multiple prefixes. So address grinding should not be necessary or particularly helpful.

vbuterin (Contributor, Author) commented on Jun 23, 2017

> I don't entirely understand the calculation for starting_gas_height and finishing_gas_height. What's the point of that over just computing startgas*len(tx_set) for each tx_set in rangeset R?

Not sure I understand your proposal. The intuition behind my proposal is that the finishing_gas_height of a transaction can be thought of as an upper bound on the computational "depth" of the transaction - that is, the number of clock cycles needed to execute that transaction and all of its dependencies that can't be parallelized.

I'll give an example. Suppose you have four transactions, and suppose that the full address range is the letters ABCDEFGHIJ for simplicity. The first tx has 25000 gas and read/write ranges ABCD. The second tx has 50000 gas and read ranges EFGHIJ and write ranges EFG. The third tx has 65000 gas and read ranges EFGHIJ and write ranges HIJ. All three of those txs can be executed in parallel. Now, suppose the fourth tx has 20000 gas and has read/write ranges CDEF. Executing this tx requires having already executed the first and the second, though not the third (as the third does not write anything that the fourth reads or writes); hence its finishing_gas_height is max(25000, 50000) + 20000 = 70000. A computer with 8 threads would be able to finish processing all four transactions in 70000 cycles (assuming 1 cycle = 1 gas); hence these 4 transactions could fit into a block with a 70000 gas limit.
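For clarity, here is the last step of that arithmetic in a few lines of Python, taking the finishing gas heights of the first three txs as stated above:

```python
# Dependencies of the fourth tx: every earlier tx whose write range
# touches the fourth tx's read/write range (CDEF).
earlier = [
    {"finishing": 25000, "write": set("ABCD")},  # first tx
    {"finishing": 50000, "write": set("EFG")},   # second tx
    {"finishing": 65000, "write": set("HIJ")},   # third tx, disjoint from CDEF
]
fourth_touches = set("CDEF")
deps = [t["finishing"] for t in earlier if t["write"] & fourth_touches]
print(max(deps) + 20000)  # max(25000, 50000) + 20000 = 70000
```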

Hope that makes some sense.

aakilfernandes commented on Jun 23, 2017

I like this proposal a lot. Given that developers usually opt for "pull" rather than "push" transactions, most transactions should be able to take advantage of this.

If the subsidy is removed, could this be implemented as a soft fork? Client devs may struggle with multi-threaded implementations and I don't see a pressing need for non-miners to implement this.

LefterisJP (Contributor) commented on Jun 23, 2017

@vbuterin I have a question on finishing_gas_height which you described as the computational depth of the transaction.

In your example with 4 transactions you calculate: finishing_gas_height(T3) = max(25000, 50000) + 20000 = 70000

The number 70000 is the maximum gas needed to execute that transaction and all of its dependencies that can't be parallelized. Correct?

But then you mention that:

> these 4 transactions could fit into a block with a 70000 gas limit

Unless I am missing something, the block's required max gas limit shouldn't change due to the way you process the transactions. Shouldn't it be 160000, basically the sum of all the transactions in the block?

LefterisJP (Contributor) commented on Jun 23, 2017

Some additional comments on the actual spec:

I really like the idea and believe it will really help relieve the network pressure in Status-like ICO situations.

Even though I suppose it would be a bit more work, I think that spec v2, where you explicitly specify both the read and write ranges of the transaction, is far superior since it will allow for more efficient parallelization.

vbuterin (Contributor, Author) commented on Jun 23, 2017

> Unless I am missing something, the block's required max gas limit shouldn't change due to the way you process the transactions. Shouldn't it be 160000, basically the sum of all the transactions in the block?

This EIP will change the role of the block gas limit. So a 4.7m gas limit would no longer mean 4.7m gas worth of computation, but rather 4.7m gas worth of time on an 8-core machine. Perhaps renaming it to gas_height_limit would make that clearer?

> Even though I suppose it would be a bit more work, I think that spec v2, where you explicitly specify both the read and write ranges of the transaction, is far superior since it will allow for more efficient parallelization.

Agree. I definitely believe v2 is superior.

[29 remaining items not shown]

jannikluhn commented on Mar 29, 2018

As discussed at the sharding workshop last week, the above definitions of starting and finishing gas heights can lead to suboptimal results if transactions are not ordered properly (or I have misunderstood the proposal). Example:

  • MAX_THREADS = 2
  • gas_limit = 150k
  • tx1.gas_used = 50k
  • tx2.gas_used = 50k
  • tx3.gas_used = 100k
  • tx4.gas_used = 100k
  • All access lists are non-overlapping

If the transactions are ordered [tx1, tx2, tx3, tx4], the "gas ranges" (starting_gas_height to finishing_gas_height) are as follows:

  • tx1: 0 - 50k
  • tx2: 0 - 50k
  • tx3: 50k - 150k
  • tx4: 50k - 150k

However, if the order is [tx3, tx1, tx2, tx4]:

  • tx3: 0 - 100k
  • tx1: 0 - 50k
  • tx2: 100k - 150k
  • tx4: 100k - 200k

So in this order the global gas limit would be exceeded, although a sensible miner strategy can execute the transactions in the same amount of time ("schedule the next transaction if there's a free thread and it doesn't overlap with any non-executed earlier transaction").

I think this could be fixed by keeping track of MAX_THREADS separate gas heights and, for each transaction, choosing the lowest possible one under the constraint of no conflicting parallel transactions.
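A rough sketch of that bookkeeping, assuming a conflicts(tx, prev) predicate and txs represented as dicts with a gas_used field:

```python
def schedule(txs, conflicts, max_threads=2):
    thread_heights = [0] * max_threads  # one gas height per thread
    finishing = []
    for i, tx in enumerate(txs):
        # a tx may not start before every conflicting earlier tx has finished
        earliest = max((finishing[j] for j in range(i) if conflicts(tx, txs[j])),
                       default=0)
        # place the tx on the thread where it can start at the lowest gas height
        t = min(range(max_threads),
                key=lambda k: max(thread_heights[k], earliest))
        start = max(thread_heights[t], earliest)
        finishing.append(start + tx["gas_used"])
        thread_heights[t] = finishing[-1]
    return finishing
```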

I wanted to point this out, but in my opinion it is not worth implementing this more complicated algorithm. Both methods appear to be equivalent if the miner (or, more generally, the proposer) chooses an optimal transaction order in the first place. This needs to be done anyway in order to maximize gas usage if transactions do have overlapping access lists.

joeykrug commented on May 18, 2018

Whatever happened to this? Long ago it was slated for Metropolis.

cdetrio (Member) commented on Jun 8, 2018

@joeykrug, cross-posting my reply from reddit:

It was discussed on an AllCoreDevs call around the time (it was never "slated for Metropolis", as in it was never officially Accepted for inclusion in the next HF on an All Core Devs call). As I recall, the feedback was that increasing the number of CPU threads doesn't address the actual bottleneck in scaling tx throughput, which is disk I/O (not CPU processing). You can see this feedback in the github issue itself and in vbuterin's reply:

> This assumes that disk/memory operations are parallelizable, when in fact we usually find these to be some of the largest bottlenecks while optimizing.

> I'd definitely be very interested in getting more data on parallelization and SSD reads.

Once it became widely realized that disk I/O is the bottleneck, research began on optimizing the state trie's data structure, and how it might be possible to parallelize SSD reads: Roadmap for Turbo-Geth.

After EIP 648, the next research thrust was on stateless clients: "this is in many ways a supercharged version of EIP 648." Stateless clients would enable a sort of 'parallelized state i/o' because all accessed state is pre-specified in the witness list (i.e. all accessed state is pre-loaded from disk into memory before the transaction is processed; in theory disk I/O of the validating node is no longer the bottleneck).

In subsequent months, discussions around sharding and stateless clients mainly revolve around the question of "stateless clients vs storage rent". And that brings us to the present day, ime.

AlexeyAkhunov (Contributor) commented on Dec 11, 2018

Would this not allow DAO-style soft forks, and with them, censorship? http://hackingdistributed.com/2016/07/05/eth-is-more-resilient-to-censorship/

If the ranges are tight, then miners can cheaply figure out what the effects of a transaction are, and censor it. If the ranges are broad, then there is less benefit from the parallelisation. The way I see it, this trades off censorship resistance for more parallel execution. Not sure this tradeoff is worth making.

github-actions commented on Jan 2, 2022

There has been no activity on this issue for two months. It will be closed in a week if no further activity occurs. If you would like to move this EIP forward, please respond to any outstanding feedback or add a comment indicating that you have addressed all required feedback and are ready for a review.

github-actions commented on Jan 16, 2022

This issue was closed due to inactivity. If you are still pursuing it, feel free to reopen it and respond to any feedback or request a review in a comment.

borispovod commented on Mar 29, 2024

Why not use a similar approach to BlockSTM? I think Polygon has already implemented it for EVM or at least an MVP.

laurentyzhang commented on Jul 19, 2024

> Why not use a similar approach to BlockSTM? I think Polygon has already implemented it for EVM or at least an MVP.

This proposal is a form of pessimistic concurrency control, whereas BlockSTM is based on optimistic concurrency control. They are not very similar.

