Superoptimizer #900

Closed
wants to merge 7 commits into from

kripken
Member

@kripken kripken commented Feb 8, 2017

This adds a wasm-analyze tool that looks at a set of input files and runs a superoptimizer to find optimizations that we could suggest adding to the optimizer. The general idea is based on Bansal & Aiken (2006), "Automatic Generation of Peephole Superoptimizers". In more detail:

  • Each input expression is normalized.
  • We compute a hash for each expression, based on running it (in our interpreter) on multiple inputs. As a result, functionally identical expressions get identical hashes, while different expressions hopefully do not.
  • Find hash matches between different expressions.
  • Using additional brute force, try to rule out false matches (this spends more effort per expression than hashing does, since hashing is kept cheap to save time).
  • Check if matches are interesting: run our optimizer on the input, and see that we don't already do better. What remains are possible suggested optimizations.
  • Group them by general patterns, ignoring things like specific constants (which would make the input noisy and long-tailed).
  • Sort them by the impact they would make on the inputs (i.e. improvement per instance * number of instances seen).
  • Present the sorted list. Hopefully, at the top of the list are optimizations we should add.
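
The hashing and verification steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the code in this PR: the candidate table, `make_inputs`, `execution_hash`, and `find_matches` are all hypothetical names, and Python lambdas stand in for running expressions in Binaryen's interpreter.

```python
import random
from collections import defaultdict

MASK32 = 0xFFFFFFFF

def shl(a, b):
    """i32.shl: the shift count is taken modulo 32, the result wraps to 32 bits."""
    return (a << (b & 31)) & MASK32

# Hypothetical candidate expressions over a single i32 input x.
CANDIDATES = {
    "(x << 1) << 1": lambda x: shl(shl(x, 1), 1),
    "x << 2": lambda x: shl(x, 2),
    "x * 4": lambda x: (x * 4) & MASK32,
    "x + x": lambda x: (x + x) & MASK32,
    "x << 1": lambda x: shl(x, 1),
    "x": lambda x: x,
}

def make_inputs(count, seed):
    """A few edge cases plus `count` pseudo-random 32-bit values."""
    rng = random.Random(seed)
    edge_cases = [0, 1, 2, 0x7FFFFFFF, 0x80000000, MASK32]
    return edge_cases + [rng.getrandbits(32) for _ in range(count)]

def execution_hash(fn, inputs):
    """Hash an expression by the outputs it produces on a fixed input set."""
    return hash(tuple(fn(x) for x in inputs))

def find_matches(candidates, hash_inputs, verify_inputs):
    # Cheap pass: bucket expressions by their execution hash.
    buckets = defaultdict(list)
    for text, fn in candidates.items():
        buckets[execution_hash(fn, hash_inputs)].append(text)

    # More expensive pass: split each bucket by behavior on many more
    # inputs, which rules out (most) false matches from the cheap pass.
    groups = []
    for names in buckets.values():
        refined = defaultdict(list)
        for name in names:
            outputs = tuple(candidates[name](x) for x in verify_inputs)
            refined[outputs].append(name)
        groups.extend(sorted(g) for g in refined.values() if len(g) > 1)
    return sorted(groups)

groups = find_matches(CANDIDATES,
                      hash_inputs=make_inputs(8, seed=1),
                      verify_inputs=make_inputs(2000, seed=2))
for group in groups:
    print(" == ".join(group))
```

Testing on more inputs only makes a false match less likely; it does not prove equivalence, so anything this kind of search reports still needs to be checked before it becomes an optimizer rule.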

So far this is not much tuned, but already suggests a few new optimizations worth adding. I haven't gotten to that yet.
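
The grouping and ranking steps in the list above might look something like this sketch. It is an assumption-laden illustration rather than the tool's actual logic: `generalize` and `rank` are hypothetical helpers, expressions are plain strings, and the byte savings in `findings` are invented sample numbers.

```python
import re
from collections import defaultdict

def generalize(expr):
    """Replace each integer literal with a placeholder C, so that
    '(x << 1) << 1' and '(x << 3) << 5' fall under the same pattern."""
    return re.sub(r"\b\d+\b", "C", expr)

def rank(findings):
    """findings: (before, after, bytes_saved) triples, one per instance seen.

    Returns (pattern, total_bytes_saved, instance_count) rows, sorted by
    total saving (largest first), with ties broken by pattern text.
    """
    totals = defaultdict(lambda: [0, 0])
    for before, after, saved in findings:
        pattern = f"{generalize(before)} => {generalize(after)}"
        totals[pattern][0] += saved
        totals[pattern][1] += 1
    rows = [(pattern, t[0], t[1]) for pattern, t in totals.items()]
    return sorted(rows, key=lambda row: (-row[1], row[0]))

# Invented sample data: each triple is one occurrence in the input files.
findings = [
    ("(x << 1) << 1", "x << 2", 3),
    ("(x << 3) << 5", "x << 8", 3),
    ("(x & 255) & 255", "x & 255", 4),
    ("(x << 2) << 2", "x << 4", 3),
    ("(x & 15) & 15", "x & 15", 4),
]

ranking = rank(findings)
for pattern, total, count in ranking:
    print(f"{total:4d} bytes over {count} instances: {pattern}")
```

Dropping the constants keeps the report short, but it also discards any relationship between them, so a pattern in this form is only a pointer to where a rule might exist, not the rule itself.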

…code it is shown, finding patterns that it believes are valid optimizations, and suggesting them based on their importance/predicted impact
@kripken
Member Author

kripken commented Feb 8, 2017

See test.wast and test.txt for an example of this in use. The superoptimizer suggests the optimization rule (x << 1) << 1 => x << 2.
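
To illustrate the suggested rule, here is a small Python sketch (not taken from test.wast or test.txt) that models `i32.shl` and checks the rule on sampled inputs. The helper names `i32_shl` and `equivalent` are hypothetical. It also shows why generalizing to "add the shift counts" needs care: wasm takes the shift count modulo 32.

```python
import random

MASK32 = 0xFFFFFFFF

def i32_shl(a, b):
    """Model of wasm i32.shl: count is taken modulo 32, result wraps to 32 bits."""
    return (a << (b & 31)) & MASK32

def equivalent(f, g, trials=10_000, seed=0):
    """Compare two one-argument i32 functions on edge cases and random inputs."""
    rng = random.Random(seed)
    samples = [0, 1, 0x7FFFFFFF, 0x80000000, MASK32]
    samples += [rng.getrandbits(32) for _ in range(trials)]
    return all(f(x) == g(x) for x in samples)

# The rule from this PR: (x << 1) << 1  =>  x << 2
rule_holds = equivalent(lambda x: i32_shl(i32_shl(x, 1), 1),
                        lambda x: i32_shl(x, 2))

# Naively summing counts fails once the sum reaches 32:
# (x << 16) << 16 is always 0, but x << 32 is x, since 32 % 32 == 0.
naive_sum_holds = equivalent(lambda x: i32_shl(i32_shl(x, 16), 16),
                             lambda x: i32_shl(x, 32))

print(rule_holds, naive_sum_holds)
```

A sampled check like this can refute a rule but cannot prove one; for this particular rule the equivalence also follows directly from shifts composing when the summed count stays below 32.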

@baptistemanson

baptistemanson commented Feb 8, 2017

I'm curious and would like to ask you a question, if you'll excuse my ignorance.
Why did you decide to have the superoptimizer running on the wast? And not on the LLVM representation?

@kripken
Member Author

kripken commented Feb 8, 2017

Well, LLVM is just one thing that can produce wasm. Binaryen's optimizer should be able to optimize any wasm from any compiler, so having a superoptimizer here would benefit all those. Not many exist yet, of course, but hopefully they will.

I also think it's convenient to superoptimize in Binaryen - the technique used here requires the ability to execute code, which is trivial in Binaryen (using the built-in interpreter), but not practical in LLVM.

@kripken
Member Author

kripken commented Feb 12, 2017

Ok, running the superoptimizer, it found a bunch of things missing in our optimizer, which I implemented in the 4 linked PRs (details in each one).

The total benefits of those PRs:

  • Code size is mostly reduced by around 0.5%, but some codebases benefit more significantly, such as Lua by 1.0%. Unity shrinks by 0.7%.
  • Most benchmarks don't benefit, but there are a few with 0.5%-1.0% speedups, and a few benefit more significantly, notably lua-scimark is 3% faster and bullet is 2% faster.

Those PRs seem to cover most of what the superoptimizer finds for now. We'll need to improve it to find more: there are lots of TODOs in the superoptimizer source, and it's really very naive so far. It's nice that even with such a simple superoptimizer we can find useful improvements.

As for this PR itself, for me personally it would be convenient to merge it, but it's possibly not worth increasing build times for everyone; it could stay on this side branch. Thoughts?

@kripken
Member Author

kripken commented Feb 17, 2017

Let's close this, the optimizations from this are helpful, but probably not much sense in getting the superoptimizer itself in-tree. It can stay in a side branch.

@kripken kripken closed this Feb 17, 2017
@sbc100 sbc100 deleted the superoptimizer branch November 29, 2018 20:24
@kripken kripken restored the superoptimizer branch August 29, 2022 20:03
@kripken
Member Author

kripken commented Aug 29, 2022

I'm doing more experiments here, and restoring/updating this branch.

@tlively
Member

tlively commented Aug 29, 2022

Amazing! I expect it would be very helpful to run this on SIMD programs. In particular, if we could randomly generate vectorized LLVM programs, run them through the Wasm backend, then run them through the Binaryen superoptimizer to see what we missed in ISel, I think that would help us make significant improvements in LLVM.

@kripken
Member Author

kripken commented Aug 29, 2022

Ah, interesting, I hadn't thought about SIMD... makes sense it could help. I think the code will need some changes for v128 but hopefully not many.

My motivation was mainly wasm GC, where more toolchains are not using LLVM. The old superoptimizer results were only on LLVM output, so there might be more or different opportunities now.

Side note, I reopened this branch, but I do not have the power to reopen this PR... and new commits I've pushed are not showing up.. 🤷 I guess I'll open a new PR with the updated code.

@kripken kripken mentioned this pull request Aug 30, 2022
3 participants