
[SPARK-11389][CORE] Add support for off-heap memory to MemoryManager #9344


Closed
wants to merge 36 commits

Conversation

JoshRosen
Contributor

In order to lay the groundwork for proper off-heap memory support in SQL / Tungsten, we need to extend our MemoryManager to perform bookkeeping for off-heap memory.

User-facing changes

This PR introduces a new configuration, spark.memory.offHeapSize (name subject to change), which specifies the absolute amount of off-heap memory that Spark and Spark SQL can use. If Tungsten is configured to use off-heap execution memory for allocating data pages, then all data page allocations must fit within this size limit.

Internals changes

This PR contains a lot of internal refactoring of the MemoryManager. The key change at the heart of this patch is the introduction of a MemoryPool class (name subject to change) to manage the bookkeeping for a particular category of memory (storage, on-heap execution, and off-heap execution). These MemoryPools are not fixed-size; they can be dynamically grown and shrunk according to the MemoryManager's policies. In StaticMemoryManager, these pools have fixed sizes, proportional to the legacy [storage|shuffle].memoryFraction. In the new UnifiedMemoryManager, the sizes of these pools are dynamically adjusted according to its policies.

There are two subclasses of MemoryPool: StorageMemoryPool manages storage memory and ExecutionMemoryPool manages execution memory. The MemoryManager creates two execution pools, one for on-heap memory and one for off-heap. Instances of ExecutionMemoryPool manage the logic for fair sharing of their pooled memory across running tasks (in other words, the ShuffleMemoryManager-like logic has been moved out of MemoryManager and pushed into these ExecutionMemoryPool instances).
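The hierarchy described above can be sketched roughly as follows. This is a hypothetical, heavily simplified illustration: the class and method names mirror the PR, but the bodies are illustrative and are not Spark's actual code.

```scala
// Simplified sketch of the MemoryPool hierarchy (illustrative, not Spark's code).
abstract class MemoryPool {
  private var _poolSize: Long = 0L
  def poolSize: Long = _poolSize
  def memoryUsed: Long
  def memoryFree: Long = _poolSize - memoryUsed
  // Pools are not fixed-size: the MemoryManager can grow or shrink them
  // according to its policies (e.g. UnifiedMemoryManager shifting memory
  // between storage and execution).
  def incrementPoolSize(delta: Long): Unit = { _poolSize += delta }
  def decrementPoolSize(delta: Long): Unit = { _poolSize -= delta }
}

class ExecutionMemoryPool(val poolName: String) extends MemoryPool {
  // taskAttemptId -> bytes granted; the fair-sharing logic moved out of
  // ShuffleMemoryManager would live here.
  private val memoryForTask = scala.collection.mutable.Map[Long, Long]()
  override def memoryUsed: Long = memoryForTask.values.sum
  def acquireMemory(numBytes: Long, taskAttemptId: Long): Long = {
    val granted = math.min(numBytes, memoryFree)
    memoryForTask(taskAttemptId) =
      memoryForTask.getOrElse(taskAttemptId, 0L) + granted
    granted
  }
}

// The MemoryManager would create one execution pool per memory mode:
val onHeapExecution = new ExecutionMemoryPool("on-heap execution")
onHeapExecution.incrementPoolSize(1000L)
val granted = onHeapExecution.acquireMemory(600L, taskAttemptId = 1L)
```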

I think that this design is substantially easier to understand and reason about than the previous design, where most of these responsibilities were handled by MemoryManager and its subclasses. To see this, take a look at how simple the logic in UnifiedMemoryManager has become: it's now very easy to see when memory is dynamically shifted between storage and execution.

TODOs

  • Fix a handful of test failures in the MemoryManagerSuites.
  • Fix remaining TODO comments in code.
  • Document new configuration.
  • Fix commented-out tests / asserts:
    • UnifiedMemoryManagerSuite.
  • Write tests that exercise the new off-heap memory management policies.


@SparkQA

SparkQA commented Oct 29, 2015

Test build #44556 has finished for PR 9344 at commit f3f44fe.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(poolName: String) extends MemoryPool with Logging
    • abstract class MemoryPool
    • class StorageMemoryPool extends MemoryPool with Logging

@SparkQA

SparkQA commented Oct 29, 2015

Test build #44570 has finished for PR 9344 at commit eb3180a.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

/cc @davies @andrewor14 for review. This is still WIP pending some tests, but the high-level design is ready for comments. I've posted a PR description to help guide you through the key changes.

@SparkQA

SparkQA commented Oct 29, 2015

Test build #44633 has finished for PR 9344 at commit 82fffab.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(memoryManager: Object, poolName: String) extends MemoryPool with Logging
    • abstract class MemoryPool
    • class StorageMemoryPool extends MemoryPool with Logging
    • abstract class CentralMomentAgg(child: Expression) extends ImperativeAggregate with Serializable
    • case class Variance(child: Expression,
    • case class VarianceSamp(child: Expression,
    • case class VariancePop(child: Expression,
    • case class Skewness(child: Expression,
    • case class Kurtosis(child: Expression,
    • case class Kurtosis(child: Expression) extends UnaryExpression with AggregateExpression1
    • case class Skewness(child: Expression) extends UnaryExpression with AggregateExpression1
    • case class Variance(child: Expression) extends UnaryExpression with AggregateExpression1
    • case class VariancePop(child: Expression) extends UnaryExpression with AggregateExpression1
    • case class VarianceSamp(child: Expression) extends UnaryExpression with AggregateExpression1

@JoshRosen
Contributor Author

One bad complication: until we can completely support off-heap memory for execution, we need to perform separate accounting for on-heap and off-heap memory, so spill() will need to report accurate information on the amount of on-heap and off-heap memory that's freed.

@JoshRosen
Contributor Author

Looking a bit more closely, it looks like all existing implementations of MemoryConsumer.spill() will only end up reporting the size of Tungsten pages that are spilled. If Tungsten pages are allocated on-heap, then it makes sense to try to spill in response to requests for on-heap memory. If we're in off-heap mode, though, then it doesn't make sense to try to spill when we're running short on on-heap memory; in that mode, we should only spill in response to failed off-heap memory requests.
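The distinction can be sketched as follows; this is a hypothetical API (the trait, class, and method names here are illustrative, not the real MemoryConsumer interface):

```scala
// Illustrative sketch: a consumer that allocates its pages in one memory
// mode should only spill in response to pressure in that same mode.
sealed trait Mode
case object OnHeap extends Mode
case object OffHeap extends Mode

class PageConsumer(val allocationMode: Mode, var pageBytes: Long) {
  // Returns the number of bytes actually freed in `requestedMode`.
  def spill(requestedMode: Mode): Long = {
    if (requestedMode != allocationMode) {
      0L // spilling off-heap pages frees no on-heap memory, and vice versa
    } else {
      val freed = pageBytes
      pageBytes = 0L
      freed
    }
  }
}

// In off-heap mode, an on-heap request should not trigger a spill:
val consumer = new PageConsumer(OffHeap, pageBytes = 4096L)
```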

@JoshRosen
Contributor Author

The current approach of having separate methods named *OnHeap and *OffHeap seems error-prone in the long run. Instead, it might be better to create an enumeration for tracking which category a given piece of memory belongs to, then have methods that take an extra parameter specifying whether the memory is on- or off-heap. We could use a boolean parameter, but I'd rather use explicit enumerations to keep the types clear in Java code, where we can't use Scala-style named arguments at method call sites.
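A minimal sketch of the enumeration-based API this comment proposes (the manager class and method names here are hypothetical; Spark ultimately adopted a similar MemoryMode enum):

```scala
// Hypothetical sketch of an enum-parameterized memory API replacing
// paired acquire*OnHeap / acquire*OffHeap methods.
object MemoryMode extends Enumeration {
  val ON_HEAP, OFF_HEAP = Value
}

class TaskMemoryManagerSketch {
  private val used = scala.collection.mutable.Map(
    MemoryMode.ON_HEAP -> 0L, MemoryMode.OFF_HEAP -> 0L)

  // One method with an explicit mode parameter; the mode is visible at
  // every call site, even from Java where named arguments are unavailable.
  def acquireExecutionMemory(numBytes: Long, mode: MemoryMode.Value): Long = {
    used(mode) += numBytes
    numBytes
  }

  def memoryUsed(mode: MemoryMode.Value): Long = used(mode)
}

val tmm = new TaskMemoryManagerSketch
tmm.acquireExecutionMemory(512L, MemoryMode.OFF_HEAP)
```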

@SparkQA

SparkQA commented Nov 1, 2015

Test build #44755 has finished for PR 9344 at commit b59dab9.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(memoryManager: Object, poolName: String) extends MemoryPool with Logging
    • abstract class MemoryPool
    • class StorageMemoryPool extends MemoryPool with Logging

@SparkQA

SparkQA commented Nov 2, 2015

Test build #44770 has finished for PR 9344 at commit 144e680.

  • This patch fails from timeout after a configured wait of 250m.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(memoryManager: Object, poolName: String) extends MemoryPool with Logging
    • abstract class MemoryPool
    • class StorageMemoryPool extends MemoryPool with Logging

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Nov 2, 2015

Test build #44793 has finished for PR 9344 at commit 1356cdb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(memoryManager: Object, poolName: String) extends MemoryPool with Logging
    • abstract class MemoryPool
    • class StorageMemoryPool extends MemoryPool with Logging

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Nov 2, 2015

Test build #44843 has finished for PR 9344 at commit 8e12eb4.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(
    • abstract class MemoryPool(memoryManager: Object)
    • class StorageMemoryPool(memoryManager: Object) extends MemoryPool(memoryManager) with Logging

@davies
Contributor

davies commented Nov 6, 2015

@JoshRosen I've done a round of review; once you address these comments, I think it's good to go.

@JoshRosen
Contributor Author

@davies, I've updated this PR to incorporate your feedback. In the process, I found and fixed a minor bug.

@JoshRosen
Contributor Author

One more minor thing that I might want to address: adding documentation for the new configuration. I'll do that now.

@JoshRosen
Contributor Author

Actually, I may want to defer the user-facing configuration to a followup since I still might want to rename it. Will add a small followup task so I don't forget.

*/
abstract class MemoryPool(lock: Object) {

@GuardedBy("lcok")
Contributor


typo

@davies
Contributor

davies commented Nov 6, 2015

LGTM, pending on tests.

@SparkQA

SparkQA commented Nov 7, 2015

Test build #45260 has finished for PR 9344 at commit 32398bb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • final class ShuffleSortDataFormat extends SortDataFormat<PackedRecordPointer, LongArray>
    • final class UnsafeSortDataFormat extends SortDataFormat<RecordPointerAndKeyPrefix, LongArray>
    • class ExecutionMemoryPool(
    • abstract class MemoryPool(lock: Object)
    • class StorageMemoryPool(lock: Object) extends MemoryPool(lock) with Logging
    • final class DecisionTreeRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • final class GBTRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • class IsotonicRegression @Since("1.5.0") (@Since("1.5.0") override val uid: String)
    • class LinearRegression @Since("1.3.0") (@Since("1.3.0") override val uid: String)
    • final class RandomForestRegressor @Since("1.4.0") (@Since("1.4.0") override val uid: String)
    • class PrefixSpanModel(JavaModelWrapper):
    • class PrefixSpan(object):
    • class FreqSequence(namedtuple("FreqSequence", ["sequence", "freq"])):

@SparkQA

SparkQA commented Nov 7, 2015

Test build #45262 has finished for PR 9344 at commit eac53f1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(
    • abstract class MemoryPool(lock: Object)
    • class StorageMemoryPool(lock: Object) extends MemoryPool(lock) with Logging

@SparkQA

SparkQA commented Nov 7, 2015

Test build #45265 has finished for PR 9344 at commit 55feee0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(
    • abstract class MemoryPool(lock: Object)
    • class StorageMemoryPool(lock: Object) extends MemoryPool(lock) with Logging

@SparkQA

SparkQA commented Nov 7, 2015

Test build #2000 has finished for PR 9344 at commit 1e5eefa.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ExecutionMemoryPool(
    • abstract class MemoryPool(lock: Object)
    • class StorageMemoryPool(lock: Object) extends MemoryPool(lock) with Logging

@rxin
Contributor

rxin commented Nov 7, 2015

You can ignore the Python failure, since I broke the build. The tests should be OK for this PR.

@JoshRosen
Contributor Author

Alright, going to merge this now.

asfgit pushed a commit that referenced this pull request Nov 7, 2015
In order to lay the groundwork for proper off-heap memory support in SQL / Tungsten, we need to extend our MemoryManager to perform bookkeeping for off-heap memory.

## User-facing changes

This PR introduces a new configuration, `spark.memory.offHeapSize` (name subject to change), which specifies the absolute amount of off-heap memory that Spark and Spark SQL can use. If Tungsten is configured to use off-heap execution memory for allocating data pages, then all data page allocations must fit within this size limit.

## Internals changes

This PR contains a lot of internal refactoring of the MemoryManager. The key change at the heart of this patch is the introduction of a `MemoryPool` class (name subject to change) to manage the bookkeeping for a particular category of memory (storage, on-heap execution, and off-heap execution). These MemoryPools are not fixed-size; they can be dynamically grown and shrunk according to the MemoryManager's policies. In StaticMemoryManager, these pools have fixed sizes, proportional to the legacy `[storage|shuffle].memoryFraction`. In the new UnifiedMemoryManager, the sizes of these pools are dynamically adjusted according to its policies.

There are two subclasses of `MemoryPool`: `StorageMemoryPool` manages storage memory and `ExecutionMemoryPool` manages execution memory. The MemoryManager creates two execution pools, one for on-heap memory and one for off-heap. Instances of `ExecutionMemoryPool` manage the logic for fair sharing of their pooled memory across running tasks (in other words, the ShuffleMemoryManager-like logic has been moved out of MemoryManager and pushed into these ExecutionMemoryPool instances).

I think that this design is substantially easier to understand and reason about than the previous design, where most of these responsibilities were handled by MemoryManager and its subclasses. To see this, take a look at how simple the logic in `UnifiedMemoryManager` has become: it's now very easy to see when memory is dynamically shifted between storage and execution.

## TODOs

- [x] Fix a handful of test failures in the MemoryManagerSuites.
- [x] Fix remaining TODO comments in code.
- [ ] Document new configuration.
- [x] Fix commented-out tests / asserts:
  - [x] UnifiedMemoryManagerSuite.
- [x] Write tests that exercise the new off-heap memory management policies.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #9344 from JoshRosen/offheap-memory-accounting.

(cherry picked from commit 30b706b)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
@asfgit closed this in 30b706b Nov 7, 2015
*
* @return the number of bytes granted to the task.
*/
def acquireMemory(numBytes: Long, taskAttemptId: Long): Long = lock.synchronized {
Contributor


Have you considered using ReadWriteLock (for lock) to improve performance ?

asfgit pushed a commit that referenced this pull request Dec 10, 2015
This patch adds documentation for Spark configurations that affect off-heap memory and makes some naming and validation improvements for those configs.

- Change `spark.memory.offHeapSize` to `spark.memory.offHeap.size`. This is fine because this configuration has not shipped in any Spark release yet (it's new in Spark 1.6).
- Deprecated `spark.unsafe.offHeap` in favor of a new `spark.memory.offHeap.enabled` configuration. The motivation behind this change is to gather all memory-related configurations under the same prefix.
- Add a check which prevents users from setting `spark.memory.offHeap.enabled=true` when `spark.memory.offHeap.size == 0`. After SPARK-11389 (#9344), which was committed in Spark 1.6, Spark enforces a hard limit on the amount of off-heap memory that it will allocate to tasks. As a result, enabling off-heap execution memory without setting `spark.memory.offHeap.size` will lead to immediate OOMs. The new configuration validation makes this scenario easier to diagnose, helping to avoid user confusion.
- Document these configurations on the configuration page.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #10237 from JoshRosen/SPARK-12251.

(cherry picked from commit 23a9e62)
Signed-off-by: Andrew Or <andrew@databricks.com>
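The new configuration check described in the commit above can be sketched as follows. This is a hypothetical, self-contained illustration of the validation logic, not Spark's actual SparkConf code (the helper name is made up):

```scala
// Illustrative sketch: fail fast when off-heap memory is enabled but
// its size limit is zero, instead of hitting OOMs at task runtime.
def validateOffHeapConf(conf: Map[String, String]): Unit = {
  val enabled = conf.getOrElse("spark.memory.offHeap.enabled", "false").toBoolean
  val size = conf.getOrElse("spark.memory.offHeap.size", "0").toLong
  require(!enabled || size > 0,
    "spark.memory.offHeap.size must be > 0 when spark.memory.offHeap.enabled == true")
}

// Enabled with size == 0 should be rejected at startup:
val invalid = scala.util.Try(validateOffHeapConf(Map(
  "spark.memory.offHeap.enabled" -> "true",
  "spark.memory.offHeap.size" -> "0")))

// Enabled with a positive size should pass:
val valid = scala.util.Try(validateOffHeapConf(Map(
  "spark.memory.offHeap.enabled" -> "true",
  "spark.memory.offHeap.size" -> "1048576")))
```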
ghost pushed a commit to dbtsai/spark that referenced this pull request Dec 10, 2015
@JoshRosen deleted the offheap-memory-accounting branch August 29, 2016 19:27