
[SPARK-11389][CORE] Add support for off-heap memory to MemoryManager #9344


Closed
wants to merge 36 commits

Conversation

JoshRosen
Contributor

In order to lay the groundwork for proper off-heap memory support in SQL / Tungsten, we need to extend our MemoryManager to perform bookkeeping for off-heap memory.

User-facing changes

This PR introduces a new configuration, spark.memory.offHeapSize (name subject to change), which specifies the absolute amount of off-heap memory that Spark and Spark SQL can use. If Tungsten is configured to use off-heap execution memory for allocating data pages, then all data page allocations must fit within this size limit.

Internals changes

This PR contains a lot of internal refactoring of the MemoryManager. The key change at the heart of this patch is the introduction of a MemoryPool class (name subject to change) to manage the bookkeeping for a particular category of memory (storage, on-heap execution, and off-heap execution). These MemoryPools are not fixed-size; they can be dynamically grown and shrunk according to the MemoryManager's policies. In StaticMemoryManager, these pools have fixed sizes, proportional to the legacy [storage|shuffle].memoryFraction. In the new UnifiedMemoryManager, the sizes of these pools are dynamically adjusted according to its policies.

There are two subclasses of MemoryPool: StorageMemoryPool manages storage memory and ExecutionMemoryPool manages execution memory. The MemoryManager creates two execution pools, one for on-heap memory and one for off-heap. Instances of ExecutionMemoryPool manage the logic for fair sharing of their pooled memory across running tasks (in other words, the ShuffleMemoryManager-like logic has been moved out of MemoryManager and pushed into these ExecutionMemoryPool instances).
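The hierarchy described above can be sketched roughly as follows. This is a hypothetical, heavily simplified illustration: the class and method names mirror the PR, but the bodies are illustrative and are not Spark's actual code.

```scala
// Simplified sketch of the MemoryPool hierarchy (illustrative, not Spark's code).
abstract class MemoryPool {
  private var _poolSize: Long = 0L
  def poolSize: Long = _poolSize
  def memoryUsed: Long
  def memoryFree: Long = _poolSize - memoryUsed
  // Pools are not fixed-size: the MemoryManager can grow or shrink them
  // according to its policies (e.g. UnifiedMemoryManager shifting memory
  // between storage and execution).
  def incrementPoolSize(delta: Long): Unit = { _poolSize += delta }
  def decrementPoolSize(delta: Long): Unit = { _poolSize -= delta }
}

class ExecutionMemoryPool(val poolName: String) extends MemoryPool {
  // taskAttemptId -> bytes granted; the fair-sharing logic moved out of
  // ShuffleMemoryManager would live here.
  private val memoryForTask = scala.collection.mutable.Map[Long, Long]()
  override def memoryUsed: Long = memoryForTask.values.sum
  def acquireMemory(numBytes: Long, taskAttemptId: Long): Long = {
    val granted = math.min(numBytes, memoryFree)
    memoryForTask(taskAttemptId) =
      memoryForTask.getOrElse(taskAttemptId, 0L) + granted
    granted
  }
}

// The MemoryManager would create one execution pool per memory mode:
val onHeapExecution = new ExecutionMemoryPool("on-heap execution")
onHeapExecution.incrementPoolSize(1000L)
val granted = onHeapExecution.acquireMemory(600L, taskAttemptId = 1L)
```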

I think that this design is substantially easier to understand and reason about than the previous design, where most of these responsibilities were handled by MemoryManager and its subclasses. To see this, take a look at how simple the logic in UnifiedMemoryManager has become: it's now very easy to see when memory is dynamically shifted between storage and execution.

TODOs

  • Fix a handful of test failures in the MemoryManagerSuites.
  • Fix remaining TODO comments in code.
  • Document new configuration.
  • Fix commented-out tests / asserts:
    • UnifiedMemoryManagerSuite.
  • Write tests that exercise the new off-heap memory management policies.


@SparkQA

SparkQA commented Oct 29, 2015

Test build #44556 has finished for PR 9344 at commit f3f44fe.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(poolName: String) extends MemoryPool with Logging
    • abstract class MemoryPool
    • class StorageMemoryPool extends MemoryPool with Logging

@SparkQA

SparkQA commented Oct 29, 2015

Test build #44570 has finished for PR 9344 at commit eb3180a.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

/cc @davies @andrewor14 for review. This is still WIP pending some tests, but the high-level design is ready for comments. I've posted a PR description to help guide you through the key changes.

@SparkQA

SparkQA commented Oct 29, 2015

Test build #44633 has finished for PR 9344 at commit 82fffab.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(memoryManager: Object, poolName: String) extends MemoryPool with Logging
    • abstract class MemoryPool
    • class StorageMemoryPool extends MemoryPool with Logging
    • abstract class CentralMomentAgg(child: Expression) extends ImperativeAggregate with Serializable
    • case class Variance(child: Expression,
    • case class VarianceSamp(child: Expression,
    • case class VariancePop(child: Expression,
    • case class Skewness(child: Expression,
    • case class Kurtosis(child: Expression,
    • case class Kurtosis(child: Expression) extends UnaryExpression with AggregateExpression1
    • case class Skewness(child: Expression) extends UnaryExpression with AggregateExpression1
    • case class Variance(child: Expression) extends UnaryExpression with AggregateExpression1
    • case class VariancePop(child: Expression) extends UnaryExpression with AggregateExpression1
    • case class VarianceSamp(child: Expression) extends UnaryExpression with AggregateExpression1

@JoshRosen
Contributor Author

One bad complication: until we can completely support off-heap memory for execution, we need to perform separate accounting for on-heap and off-heap memory, so spill() will need to report accurate information on the amount of on-heap and off-heap memory that's freed.

@JoshRosen
Contributor Author

Looking a bit more closely, it looks like all existing implementations of MemoryConsumer.spill() will only end up reporting the size of Tungsten pages that are spilled. If Tungsten pages are allocated on-heap, then it makes sense to try to spill in response to requests for on-heap memory. If we're in off-heap mode, though, then it doesn't make sense to try to spill when we're running short on on-heap memory; in that mode, we should only spill in response to failed off-heap memory requests.
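The distinction can be sketched as follows; this is a hypothetical API (the trait, class, and method names here are illustrative, not the real MemoryConsumer interface):

```scala
// Illustrative sketch: a consumer that allocates its pages in one memory
// mode should only spill in response to pressure in that same mode.
sealed trait Mode
case object OnHeap extends Mode
case object OffHeap extends Mode

class PageConsumer(val allocationMode: Mode, var pageBytes: Long) {
  // Returns the number of bytes actually freed in `requestedMode`.
  def spill(requestedMode: Mode): Long = {
    if (requestedMode != allocationMode) {
      0L // spilling off-heap pages frees no on-heap memory, and vice versa
    } else {
      val freed = pageBytes
      pageBytes = 0L
      freed
    }
  }
}

// In off-heap mode, an on-heap request should not trigger a spill:
val consumer = new PageConsumer(OffHeap, pageBytes = 4096L)
```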

@JoshRosen
Contributor Author

The current approach of having separate methods named *OnHeap and *OffHeap seems error-prone in the long run. Instead, it might be better to create an enumeration for tracking which category a given piece of memory belongs to, then have methods that take an extra parameter specifying whether the memory is on- or off-heap. We could use a boolean parameter, but I'd rather use explicit enumerations to keep the types clear in Java code, where we can't use Scala-style named arguments at method call sites.
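A minimal sketch of the enumeration-based API this comment proposes (the manager class and method names here are hypothetical; Spark ultimately adopted a similar MemoryMode enum):

```scala
// Hypothetical sketch of an enum-parameterized memory API replacing
// paired acquire*OnHeap / acquire*OffHeap methods.
object MemoryMode extends Enumeration {
  val ON_HEAP, OFF_HEAP = Value
}

class TaskMemoryManagerSketch {
  private val used = scala.collection.mutable.Map(
    MemoryMode.ON_HEAP -> 0L, MemoryMode.OFF_HEAP -> 0L)

  // One method with an explicit mode parameter; the mode is visible at
  // every call site, even from Java where named arguments are unavailable.
  def acquireExecutionMemory(numBytes: Long, mode: MemoryMode.Value): Long = {
    used(mode) += numBytes
    numBytes
  }

  def memoryUsed(mode: MemoryMode.Value): Long = used(mode)
}

val tmm = new TaskMemoryManagerSketch
tmm.acquireExecutionMemory(512L, MemoryMode.OFF_HEAP)
```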

@SparkQA

SparkQA commented Nov 1, 2015

Test build #44755 has finished for PR 9344 at commit b59dab9.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(memoryManager: Object, poolName: String) extends MemoryPool with Logging
    • abstract class MemoryPool
    • class StorageMemoryPool extends MemoryPool with Logging

@SparkQA

SparkQA commented Nov 2, 2015

Test build #44770 has finished for PR 9344 at commit 144e680.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(memoryManager: Object, poolName: String) extends MemoryPool with Logging
    • abstract class MemoryPool
    • class StorageMemoryPool extends MemoryPool with Logging

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Nov 2, 2015

Test build #44793 has finished for PR 9344 at commit 1356cdb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(memoryManager: Object, poolName: String) extends MemoryPool with Logging
    • abstract class MemoryPool
    • class StorageMemoryPool extends MemoryPool with Logging

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Nov 2, 2015

Test build #44843 has finished for PR 9344 at commit 8e12eb4.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(
    • abstract class MemoryPool(memoryManager: Object)
    • class StorageMemoryPool(memoryManager: Object) extends MemoryPool(memoryManager) with Logging

@davies
Contributor

davies commented Nov 6, 2015

@JoshRosen I've done a round of review; once you address these comments, I think it's good to go.

@JoshRosen
Contributor Author

@davies, I've updated this PR to incorporate your feedback. In the process, I found and fixed a minor bug.

@JoshRosen
Contributor Author

One more minor thing that I might want to address: adding documentation for the new configuration. I'll do that now.

@JoshRosen
Contributor Author

Actually, I may want to defer the user-facing configuration to a followup since I still might want to rename it. Will add a small followup task so I don't forget.

*/
abstract class MemoryPool(lock: Object) {

@GuardedBy("lcok")
Contributor


typo

@davies
Contributor

davies commented Nov 6, 2015

LGTM, pending on tests.

@SparkQA

SparkQA commented Nov 7, 2015

Test build #45260 has finished for PR 9344 at commit 32398bb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • final class ShuffleSortDataFormat extends SortDataFormat<PackedRecordPointer, LongArray>
    • final class UnsafeSortDataFormat extends SortDataFormat<RecordPointerAndKeyPrefix, LongArray>
    • class ExecutionMemoryPool(
    • abstract class MemoryPool(lock: Object)
    • class StorageMemoryPool(lock: Object) extends MemoryPool(lock) with Logging
    • final class DecisionTreeRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • final class GBTRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • class IsotonicRegression @Since("1.5.0") (@Since("1.5.0") override val uid: String)
    • class LinearRegression @Since("1.3.0") (@Since("1.3.0") override val uid: String)
    • final class RandomForestRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • class PrefixSpanModel(JavaModelWrapper):
    • class PrefixSpan(object):
    • class FreqSequence(namedtuple("FreqSequence", ["sequence", "freq"])):

@SparkQA

SparkQA commented Nov 7, 2015

Test build #45262 has finished for PR 9344 at commit eac53f1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(
    • abstract class MemoryPool(lock: Object)
    • class StorageMemoryPool(lock: Object) extends MemoryPool(lock) with Logging

@SparkQA

SparkQA commented Nov 7, 2015

Test build #45265 has finished for PR 9344 at commit 55feee0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(
    • abstract class MemoryPool(lock: Object)
    • class StorageMemoryPool(lock: Object) extends MemoryPool(lock) with Logging

@SparkQA

SparkQA commented Nov 7, 2015

Test build #2000 has finished for PR 9344 at commit 1e5eefa.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(
    • abstract class MemoryPool(lock: Object)
    • class StorageMemoryPool(lock: Object) extends MemoryPool(lock) with Logging

@rxin
Contributor

rxin commented Nov 7, 2015

You can ignore the Python failure, since I broke the build. The tests should be OK for this PR.

@JoshRosen
Contributor Author

Alright, going to merge this now.

asfgit pushed a commit that referenced this pull request Nov 7, 2015
In order to lay the groundwork for proper off-heap memory support in SQL / Tungsten, we need to extend our MemoryManager to perform bookkeeping for off-heap memory.

## User-facing changes

This PR introduces a new configuration, `spark.memory.offHeapSize` (name subject to change), which specifies the absolute amount of off-heap memory that Spark and Spark SQL can use. If Tungsten is configured to use off-heap execution memory for allocating data pages, then all data page allocations must fit within this size limit.

## Internals changes

This PR contains a lot of internal refactoring of the MemoryManager. The key change at the heart of this patch is the introduction of a `MemoryPool` class (name subject to change) to manage the bookkeeping for a particular category of memory (storage, on-heap execution, and off-heap execution). These MemoryPools are not fixed-size; they can be dynamically grown and shrunk according to the MemoryManager's policies. In StaticMemoryManager, these pools have fixed sizes, proportional to the legacy `[storage|shuffle].memoryFraction`. In the new UnifiedMemoryManager, the sizes of these pools are dynamically adjusted according to its policies.

There are two subclasses of `MemoryPool`: `StorageMemoryPool` manages storage memory and `ExecutionMemoryPool` manages execution memory. The MemoryManager creates two execution pools, one for on-heap memory and one for off-heap. Instances of `ExecutionMemoryPool` manage the logic for fair sharing of their pooled memory across running tasks (in other words, the ShuffleMemoryManager-like logic has been moved out of MemoryManager and pushed into these ExecutionMemoryPool instances).

I think that this design is substantially easier to understand and reason about than the previous design, where most of these responsibilities were handled by MemoryManager and its subclasses. To see this, take a look at how simple the logic in `UnifiedMemoryManager` has become: it's now very easy to see when memory is dynamically shifted between storage and execution.

## TODOs

- [x] Fix a handful of test failures in the MemoryManagerSuites.
- [x] Fix remaining TODO comments in code.
- [ ] Document new configuration.
- [x] Fix commented-out tests / asserts:
  - [x] UnifiedMemoryManagerSuite.
- [x] Write tests that exercise the new off-heap memory management policies.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #9344 from JoshRosen/offheap-memory-accounting.

(cherry picked from commit 30b706b)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
@asfgit closed this in 30b706b Nov 7, 2015
*
* @return the number of bytes granted to the task.
*/
def acquireMemory(numBytes: Long, taskAttemptId: Long): Long = lock.synchronized {
Contributor


Have you considered using ReadWriteLock (for lock) to improve performance ?

asfgit pushed a commit that referenced this pull request Dec 10, 2015
This patch adds documentation for Spark configurations that affect off-heap memory and makes some naming and validation improvements for those configs.

- Change `spark.memory.offHeapSize` to `spark.memory.offHeap.size`. This is fine because this configuration has not shipped in any Spark release yet (it's new in Spark 1.6).
- Deprecated `spark.unsafe.offHeap` in favor of a new `spark.memory.offHeap.enabled` configuration. The motivation behind this change is to gather all memory-related configurations under the same prefix.
- Add a check which prevents users from setting `spark.memory.offHeap.enabled=true` when `spark.memory.offHeap.size == 0`. After SPARK-11389 (#9344), which was committed in Spark 1.6, Spark enforces a hard limit on the amount of off-heap memory that it will allocate to tasks. As a result, enabling off-heap execution memory without setting `spark.memory.offHeap.size` will lead to immediate OOMs. The new configuration validation makes this scenario easier to diagnose, helping to avoid user confusion.
- Document these configurations on the configuration page.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #10237 from JoshRosen/SPARK-12251.

(cherry picked from commit 23a9e62)
Signed-off-by: Andrew Or <andrew@databricks.com>
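The new configuration check described in the commit above can be sketched as follows. This is a hypothetical, self-contained illustration of the validation logic, not Spark's actual SparkConf code (the helper name is made up):

```scala
// Illustrative sketch: fail fast when off-heap memory is enabled but
// its size limit is zero, instead of hitting OOMs at task runtime.
def validateOffHeapConf(conf: Map[String, String]): Unit = {
  val enabled = conf.getOrElse("spark.memory.offHeap.enabled", "false").toBoolean
  val size = conf.getOrElse("spark.memory.offHeap.size", "0").toLong
  require(!enabled || size > 0,
    "spark.memory.offHeap.size must be > 0 when spark.memory.offHeap.enabled == true")
}

// Enabled with size == 0 should be rejected at startup:
val invalid = scala.util.Try(validateOffHeapConf(Map(
  "spark.memory.offHeap.enabled" -> "true",
  "spark.memory.offHeap.size" -> "0")))

// Enabled with a positive size should pass:
val valid = scala.util.Try(validateOffHeapConf(Map(
  "spark.memory.offHeap.enabled" -> "true",
  "spark.memory.offHeap.size" -> "1048576")))
```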
ghost pushed a commit to dbtsai/spark that referenced this pull request Dec 10, 2015
@JoshRosen deleted the offheap-memory-accounting branch August 29, 2016 19:27