Skip to content

EIP 105 (Serenity): Binary sharding plus contract calling semantics #53

Closed
@vbuterin

Description

@vbuterin
Contributor

This EIP is a solidification of a proposal to implement a simple binary sharding scheme in Ethereum, in the spirit of what was shown at Devcon 1 here: https://docs.google.com/presentation/d/1CjD0W4l4-CwHKUvfF5Vlps76fKLEC6pIwu1a_kC_YRQ

Design goals include:

  • Minimal initial implementation complexity, while at the same time serving as an effective "on-ramp" easing the transition toward a fully scalable Ethereum 2.0
  • Backward-compatibility with the existing Ethereum 1.0 contracts/state/applications (though the economics of ethereum 2.0 may eventually render some of the contract interactions that are currently happening prohibitively expensive even if they remain theoretically possible)
  • Flexibility, including the ability to expand and decrease the de-facto number of "shards" as needed
  • The ability to offer something equivalent to sharding at a "sub-contract" level, or some similar mechanism in order to allow individual contract developers to easily create applications that are themselves sharded and scalable.
  • The ability to offer a flexible tradeoff to developers between transaction cost and being able to make synchronous calls across a large state

We can describe the scheme as follows:

  1. There is now a new type of object, called a transaction group. A transaction group specifies a list of transactions, a gas limit, and an activity range of size 2**k between 2**k * m and 2**k * (m+1), for some values k, m where 144 <= k <= 160,m >= 0 and 2**k * (m+1) <= 2**160. The intuition is that any valid activity range constitutes a "shard", with each block of 2**144 being a "bottom level shard". Every shard that is not bottom level can also be decomposed into two "child shards", one for the left half of its range and one for the right half.
  2. Transactions now also specify activity ranges. Transaction groups cannot include transactions whose activity range is outside their own.
  3. Instead of containing a tree of transactions, a block now contains a list of transaction groups, which MUST have disjoint activity ranges.
  4. When executing a transaction, any CALL or other operation that attempts to access an address which is outside of the transaction's containing group's activity range immediately triggers an out-of-range exception. New RANGEMIN and RANGEMAX opcodes are added to make it easier for developers to deal with this.
  5. When creating a new account, a transaction will now automatically set the first 160-k bits of the address so that the account that it creates will fit into the containing transaction group's range. CREATE operations work similarly, except that they always set the first 16 bits of the target address to fit into the same bottom-level shard.
  6. New specialized SSTORE and SLOAD opcodes are introduced along the existing ones, such that these opcodes take an additional "shard ID" as an argument, where the shard ID should be in the range 0 < k <= 65535. A contract at address A can use these opcodes to read and write the storage of addressk * 2**144 + A % 2**144, provided that this address is within the current activity range. Use of these opcodes will also fill in the target address's code to equal the sender's, if it is not yet set.
  7. Contracts now store a code hash at key '' in the tree, not the actual code (this reduces data duplication arising from (5)).
  8. The receipts for a transaction are now saved in the leftmost shard allowed by the transaction group (ie. at MINRANGE + (d % 2**144) where d is the current address of the receipt storing contract).
  9. If GL is the global gas limit, the transaction group gas limits must satisfy the formula sum([max(L[i]**1.5, L[i] *GL / 4) for L in transaction_group_gas_limits]) < GL**1.5. This allows blocks to include more gas, up to a maximum 4x increase, if they include transactions that can be executed in parallel (this limit can later on be removed or relaxed once proper sharding is implemented).

Philosophically speaking, this introduces the following considerations:

  • There are now three types of exceptional conditions, rather than two as before: (i) out-of-gas, (ii) preventable errors arising from badly written code, and (iii, newly) out-of-range. Out of range should be viewed as being philosophically similar to out of gas, and similar kinds of guards in the code should be used to prevent attacks in both cases.
  • The preferred paradigm for making scalable applications will now be to create a receipt in one shard representing half of a completed operation, and then consume the receipt in another shard, verifying it using a merkle branch plus the STATEROOT opcode. A precompile contract for doing this will likely be added. Applications designed to be asynchronous will thus always benefit from the highest gas discounts as each transaction will only need to touch one contract.
  • High-level languages will likely include a shardedMap primitive, perhaps allowing a user-specified sharding schema (eg. shard = address // 2**144, shard = sha3(name) % 65536, etc), allowing contracts to store state across multiple shards.
  • Contract code can now be safely processed in parallel, introducing an immediate ~2-8x scalability benefit for the public ethereum blockchain assuming that miners have multicore processors, and a much larger scalability benefit for ethereum private chains; in a private chain context, the problems in http://www.multichain.com/blog/2015/11/smart-contracts-slow-blockchains/ would be completely solved.
  • It may be possible to make Ethereum massively scalable (defined either as "can safely process 10000+ tx/sec" or "transaction processing capacity quadratic in the processing power of a single node") with no further changes, if almost all validators under this scheme become comfortable mining/staking without running full validating nodes, instead employing collaborative validation strategies where they randomly poll other nodes on the validity of blocks on some shards and individually validate other shards. Hence, in the event that the Ethereum developers get blown up by [insert crazies here] or the foundation goes bankrupt, Ethereum will be much more well-suited to scale over time with only minimal further work on the core protocol if need be.

From an economics perspective, the following observations can be made:

  • Contract creators now have the ability to make a choice between being in shards where the contracts that they care to interact with are highly concentrated (and thereby benefit from network effects) and being in shards where contracts are rarefied (and thereby benefit from cost savings). This is very similar to the tradeoff faced by humans deciding whether to live in a big city or in a smaller city or the countryside, and so many insights from urban economics may be transferable.
  • Existing contracts do NOT have the ability to move between shards if they are unhappy with a change in the congestion/cost tradeoff of a given shard that arises from changes in the ecosystem; addresses are static. However, they do have an alternative escape valve: de-facto splitting and merging the shards that they are in. If a given shard becomes too congested, its gas price will go up, and so users will increasingly prefer to make operations that are limited to one of its two sub-shards; if enough people do this, then transaction groups at the higher shard level will be rare and transaction groups at the sub-shard level will be more frequent, thereby increasing the "distance" between the shards. Instead of (or rather, alongside) the citizens moving, the city itself shrinks or grows.

In order to process all transaction execution in parallel between transaction groups in a completely disjoint fashion, there are two possible routes from a data structure standpoint. The first is to modify the tree structure so as to have a depth-16 binary tree at the top level. The second is to implement Patricia tree splitting and merging (see python implemented code here: ethereum/pyethereum@81d37fc ), and do it all with the existing hexary Patricia tree: when processing every block, the process becomes split -> process in parallel -> merge. The former route may be slightly more efficient in the long term, the latter seems more efficient in the short term and may require writing slightly less code (though the code that does need to be written for split/merge is somewhat involved).

Note that if the 144-bit address plus 16-bit sharding scheme is undesirable because it unreasonably splits up existing contracts, we can also take the route of 160-bit addresses plus 16-bit shards for a total of 176 bit addresses.

Activity

janx

janx commented on Jan 13, 2016

@janx
Member

Very cool.

New specialized SSTORE and SLOAD opcodes are introduced along the existing ones, such that these opcodes take an additional "shard ID" as an argument, where the shard ID should be in the range 0 < k <= 65535.

Shard id can't be zero or typo?

176-bits address sounds more compatible with existing tools.

chriseth

chriseth commented on Jan 13, 2016

@chriseth
Contributor

Thinking about the "prime use case" of a sharded currency contract, a transfer would always touch at most two shards, but it could be hard to get the activity ranges of multiple transactions disjoint even though each shard is only touched by at most one transaction. What would be the drawbacks of allowing a list of activity ranges for each transaction group?

janx

janx commented on Jan 13, 2016

@janx
Member

How does this work with account abstraction? Does transactions from address 0 require special treatment? Will this make the shard 0 a special one because of pre-deployed contracts?

Smithgift

Smithgift commented on Jan 13, 2016

@Smithgift

I see a issue with Out-Of-Range being philosophically treated like Out-of-Gas. If I run out out of gas, well, the miner did the work, so I still should pay them. If my code is bad, that's my fault, and the miner should still get paid.

But activity groups, and therefore whether a transaction goes out of range, is totally under the miner's control. Suppose they use introspection to find if my high-gas high-price transaction crosses at least one shard boundary, and then manipulate the transaction group to glean as much possible gas from me without actually executing the transaction. Or perhaps the miner made a legitimate bad guess about the activity range, and only after spending over a million gas in computation did they discover the problem. Should the miner still get paid?

Something makes me uneasy about having two types of errors, one of which is the user's fault and the miner gets paid, and another which is the miner's fault, and there's not an obvious answer whether the miner should be paid or not. Sure, the programmer via the silent assistance of the compiler can deal with it, but there's a significant philosophical difference.

On a similar topic, what algorithm can miners use to determine transaction groups? Seems to be a kind of packing problem, except the bins themselves have to be packed into bigger bins and objects can grow and shrink depending on arbitrary bounded computation. There's probably a reasonably good algorithm to get an answer, in the same way that current miners deal with packing transactions into blocks.

Off the top of my head, a naive system might be:

  1. Sort the transactions by effective gas price and divide them into queues based on the shard of the contract they'll start with (or hit directly after the origin address.)
  2. Start processing the queues into their own proto-transaction groups, and when a transaction crosses an activity range delete the queue that pays less and dump those transactions on the "winning" queue.
  3. When all transactions have been processed, pick the set of proto-transaction groups that pays the most and make those the final transaction groups.

There's clearly issues, though (How much gas should each proto-transaction group be allowed?)

jpritikin

jpritikin commented on Jan 13, 2016

@jpritikin

"Applications designed to be asynchronous will this always benefit from the highest gas" -- Typo here but the meaning seems clear enough

vbuterin

vbuterin commented on Jan 13, 2016

@vbuterin
ContributorAuthor

Ok, I added an important point: transactions now also specify their own activity ranges, and a transaction group's activity range must be a superset of that of all transactions that are contained in it. This should resolve most of the issues around OOR being something that miners can "force" to happen to any transaction.

vbuterin

vbuterin commented on Jan 14, 2016

@vbuterin
ContributorAuthor

Regarding special addresses, the zero address is just a virtual sender account, so I don't see any problems there wrt sharding; if really desired I suppose we can make the sender 0 + m * 2**k to be in the same shard, but not sure that's necessary. Precompiles can probably be copied to every shard. Casper, we'll see; there should be some way to architect it so that it benefits from parallelization but that may have to wait for v2.

simondlr

simondlr commented on Jan 14, 2016

@simondlr

Very awesome. 2 questions:

I'm unsure precisely how this choice works/looks?

Contract creators now have the ability to make a choice between being in shards where the contracts that they care to interact with are highly concentrated (and thereby benefit from network effects) and being in shards where contracts are rarefied (and thereby benefit from cost savings).

Would you at a contract level choose which shard to store your contract's state?

What if you don't have knowledge on how far the activity range could be? It could be a tx with long cascading hops between contracts, but you are not sure what will be exeuted as a result of it. I assume, one would be able to check, like with current web3 tools to see what happens with a potential state change before submitting it. ie, run this transaction to find the activity range, then submit this along with the transaction? Is this a correct assumption?

vbuterin

vbuterin commented on Jan 14, 2016

@vbuterin
ContributorAuthor

Would you at a contract level choose which shard to store your contract's state?

This happens at two levels. First, when creating a contract, you can choose the prefix at which you create it. Second, the contract itself can choose which shard IDs it tries to SSTORE data at.

What if you don't have knowledge on how far the activity range could be? It could be a tx with long cascading hops between contracts, but you are not sure what will be exeuted as a result of it.

Then, you would have to set the activity range to be very large, and thus wait a long time and pay high gas costs for it to be accepted (or else set a smaller activity range and risk some of the child messages OOR'ing). Creating long and unpredictable cascading hops between contracts is precisely the thing that is deliberately economically discouraged by this scheme.

sillytuna

sillytuna commented on Jan 14, 2016

@sillytuna

Taking the Alarm contract as an example, does that mean it's best to put my 'callee' contract on the same shard as the alarm contract, and also that (in principle) the alarm contract could live on multiple shard (via multiple contracts) if that provided a good cost saving?

simondlr

simondlr commented on Jan 14, 2016

@simondlr

Creating long and unpredictable cascading hops between contracts is precisely the thing that is deliberately economically discouraged by this scheme.

It does to some extent feel like one is disincentivizing potentially emergent behaviour due to stifling costs? ie.

A thought could occur: "We could automatically build a prediction market on information being produced automatically from contract A. But, argh. The prediction market platform is on a shard too far away. If only the 2 developers saw there could be potential synergy! Contract B should perhaps have chosen to deploy the contract in a closer group to the prediction market platform. Now we won't know what the interesting outcomes would be. Too expensive and will take too long to confirm. :("

It's a concern, but a trade-off for scalability's sake. Scalability > emergence at this point in time. I guess it should just be mitigated as much as possible to ensure it keeps most of the benefits. I guess, now your mention of urban economics makes sense.

Keep up the good work!

vbuterin

vbuterin commented on Jan 14, 2016

@vbuterin
ContributorAuthor

And then a lightbulb goes off in your head and you realize that you could implement your application and get the best of both worlds by making the information grabbing from contract A asynchronous :)

vbuterin

vbuterin commented on Jan 14, 2016

@vbuterin
ContributorAuthor

You probably want an instance of the alarm contract in every shard.

simondlr

simondlr commented on Jan 14, 2016

@simondlr

And then a lightbulb goes off in your head and you realize that you could implement your application and get the best of both worlds by making the information grabbing from contract A asynchronous :)

Ah, yes! Using log proofs, right?

Smithgift

Smithgift commented on Jan 14, 2016

@Smithgift

Suggestion: define CALLCODE and DELEGATECALL to use the code of the target contract as of last block and make those two opcodes always synchronous no matter the activity range (and work regardless of it).

Reasoning: Libraries can work cross-shard without further effort. Keeping the static code of popular libraries (possibly JIT compiled as well) in memory is little burden to miners, and libraries can't alter the state of their own shard anyway. It also helps forward-compatibility: no need to make sure your libraries will be on the same shard as your contract in the current blockchain so your dapp won't become expensive in the future.

Libraries can be created and selfdestruct, true. But if we just use the last block's code I can't conceive of any normal situation where it would matter.

21 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @janx@Arachnid@simondlr@sillytuna@mcelrath

        Issue actions

          EIP 105 (Serenity): Binary sharding plus contract calling semantics · Issue #53 · ethereum/EIPs