32

I was curious about how the Node.js pattern of nested functions works with V8's garbage collector. Here's a simple example:

readfile("blah", function(str) {
   var val = getvaluefromstr(str);
   (function restofprogram(val2) { ... })(val);
});

If restofprogram is long-running, doesn't that mean that str will never get garbage collected? My understanding is that with Node you end up with nested functions a lot. Would str get garbage collected if restofprogram was declared outside, so that str could not be in its scope? Is this a recommended practice?

EDIT: I didn't intend to make the problem complicated. That was just carelessness, so I've modified it.

3

3 Answers

73

Simple answer: if the value of str is not referenced from anywhere else (and str itself is not referenced from restofprogram), it will become unreachable as soon as function (str) { ... } returns.

Details: the V8 compiler distinguishes real local variables from so-called context variables, which are variables captured by a closure or shadowed by a with-statement or an eval invocation.

Local variables live on the stack and disappear as soon as function execution completes.

Context variables live in a heap-allocated context structure. They disappear when the context structure dies. The important thing to note here is that context variables from the same scope live in the same structure. Let me illustrate it with some example code:

function outer () {
  var x; // real local variable
  var y; // context variable, referenced by inner1
  var z; // context variable, referenced by inner2

  function inner1 () {
    // references context 
    use(y);
  }

  function inner2 () {
    // references context 
    use(z);
  }

  function inner3 () { /* I am empty but I still capture context implicitly */ } 

  return [inner1, inner2, inner3];
}

In this example, variable x will disappear as soon as outer returns, but variables y and z will disappear only when all of inner1, inner2 and inner3 die. This happens because y and z are allocated in the same context structure, and all three closures implicitly reference this context structure (even inner3, which does not use it explicitly).
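One way to keep unrelated values out of a shared context, going by the description above, is to create each closure in its own small scope. This is only a sketch: makeReader is a hypothetical helper, and the comments describe the behaviour as explained in this answer rather than anything measured.

```javascript
// Hypothetical factory: every call creates a fresh scope, so the
// returned closure captures only the one value it was given.
function makeReader(value) {
  return function read() {
    return value;
  };
}

function outer() {
  var y = { name: "y" };
  var z = { name: "z" };

  // No closure is created directly inside outer. Each value is
  // captured separately, inside its own makeReader call, so keeping
  // the first reader alive should not keep z alive.
  return [makeReader(y), makeReader(z)];
}

var readers = outer();
```

Dropping readers[1] while keeping readers[0] then leaves nothing that references z.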

The situation gets even more complicated when you start using a with-statement, a try/catch-statement (which on V8 contains an implicit with-statement inside the catch clause) or global eval.

function complication () {
  var x; // context variable

  function inner () { /* I am empty but I still capture context implicitly */ }

  try { } catch (e) { /* contains implicit with-statement */ }

  return inner;
}

In this example x will disappear only when inner dies, because:

  • try/catch contains an implicit with-statement in the catch clause
  • V8 assumes that any with-statement shadows all the locals

This forces x to become a context variable, and inner captures the context, so x exists until inner dies.

In general, if you want to be sure that a given variable does not retain some object for longer than really needed, you can easily break this link by assigning null to that variable.
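A minimal sketch of that last suggestion. The names loadBigString and makeLengthReporter are hypothetical and only stand in for any code that needs a large value briefly and a small derived value for longer.

```javascript
// Hypothetical source of a large value.
function loadBigString() {
  return new Array(1024 * 1024).join("x");
}

function makeLengthReporter() {
  var big = loadBigString(); // large value, needed only briefly
  var length = big.length;   // small derived value we keep

  // Break the link explicitly. Whether big ended up on the stack or
  // in the context structure, it no longer points at the large string.
  big = null;

  return function report() {
    return length;
  };
}

var report = makeLengthReporter();
```

The returned closure keeps working because it only depends on length, not on the string itself.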

1
5

Actually your example is somewhat tricky. Was it on purpose? You seem to be masking the outer val variable with the val argument of the lexically scoped inner restofprogram(), instead of actually using it. But since you're asking about str, let me ignore the trickiness of val in your example for the sake of simplicity.

My guess would be that the str variable won't get collected before the restofprogram() function finishes, even if it doesn't use it. If restofprogram() doesn't use str and doesn't use eval() or new Function(), then it could be safely collected, but I doubt it would be. This would be a tricky optimization for V8, probably not worth the trouble. If there were no eval and new Function() in the language, it would be much easier.

That doesn't have to mean it would never get collected, though, because any event handler in a single-threaded event loop should finish almost instantly. Otherwise your whole process would be blocked and you'd have bigger problems than one useless variable in memory.

Now I wonder whether you meant something other than what you actually wrote in your example. The whole program in Node is just like in the browser: it just registers event callbacks that are fired asynchronously later, after the main program body has already finished. Also, none of the handlers are blocking, so no function actually takes any noticeable time to finish. I'm not sure I understood what you actually meant in your question, but I hope that what I've written helps you understand how it all works.

Update:

After reading more info in the comments about what your program looks like, I can say more.

If your program is something like:

readfile("blah", function (str) {
  var val = getvaluefromstr(str);
  // do something with val
  Server.start(function (request) {
    // do something
  });
});

Then you can also write it like this:

readfile("blah", function (str) {
  var val = getvaluefromstr(str);
  // do something with val
  Server.start(serverCallback);
});
function serverCallback(request) {
  // do something
}

This way str is not in the scope of serverCallback. It goes out of scope once the readfile callback returns after calling Server.start(), and it will eventually get collected. It also makes your indentation more manageable, which is not to be underestimated for more complex programs.

As for val, you might make it a global variable in this case, which would greatly simplify your code. Of course you don't have to, you can wrestle with closures, but in this case making val global, or making it live in an outer scope common to both the readfile callback and the serverCallback function, seems like the most straightforward solution.

Remember that everywhere you can use an anonymous function you can also use a named function, and with named functions you can choose which scope they live in.
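Here is a rough, self-contained sketch of the "outer scope common to both" variant. The readfile, getvaluefromstr and Server definitions below are hypothetical stand-ins (the stub readfile even calls back synchronously), there only so the snippet runs on its own.

```javascript
// Hypothetical stand-ins for the names used in the question.
function readfile(name, callback) {
  callback("value=42;padding=" + new Array(1000).join("x"));
}
function getvaluefromstr(str) {
  return str.split(";")[0].split("=")[1];
}
var Server = {
  handler: null,
  start: function (callback) { this.handler = callback; }
};

// val lives in a scope shared by both functions; str does not.
var val;

readfile("blah", function (str) {
  val = getvaluefromstr(str);
  Server.start(serverCallback);
});

// Declared outside the readfile callback, so str is not in its scope.
function serverCallback(request) {
  return request + ":" + val;
}
```

The long-lived serverCallback can still read val, but nothing it can reach refers to str.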

5
  • yes, but if restofprogram is something like Server.start(function(request) {do something}), even though restofprogram exits instantly, the function passed to Server.start will live forever, and has str in scope.
    – Vishnu
    Mar 16, 2011 at 16:07
  • Actually, the event handler could create an anonymous function which is added as an event listener to some other event and it could do this every time it is called, thus ensuring that all the scope variables (for all calls of this handler) are never collected.
    – dhruvbird
    Mar 16, 2011 at 16:13
  • @dhruvbird: True. For those cases I recommend using named functions for which you can choose the scope in which they live.
    – rsp
    Mar 16, 2011 at 16:43
  • @Vishnu: See the update to my answer for some ideas on how to make the cases like this more manageable.
    – rsp
    Mar 16, 2011 at 16:44
  • thank you, that was the intent of my question. So unintended memory leaks are possible and using named functions when possible should alleviate the problem.
    – Vishnu
    Mar 17, 2011 at 6:29
1

My guess is that str will NOT be garbage collected, because it can be used by restofprogram(). And yes, str should get GCed if restofprogram was declared outside, unless you do something like this:

function restofprogram(val) { ... }

readfile("blah", function(str) {
  var val = getvaluefromstr(str);
  restofprogram(val, str);
});

Or if getvaluefromstr is declared as something like this:

function getvaluefromstr(str) {
  return {
    orig: str, 
    some_funky_stuff: 23
  };
}

Follow-up question: does V8 do just plain ol' GC, or does it do a combination of GC and ref. counting (like Python)?

8
  • Technically if the v8 GC is smart enough, it should determine if str is actually used (or could conceivably be used with an eval statement) in the body of restofprogram. Whether it does this or not is a question that should be asked to someone who is knowledgeable of the details of v8.
    – MooGoo
    Mar 16, 2011 at 14:42
  • V8 uses a generational garbage collector.
    – rsp
    Mar 16, 2011 at 14:49
  • @MooGoo I doubt any GC would be smart enough to detect "str" being used in an eval (since the string to be eval'ed could be obtained from user input)
    – dhruvbird
    Mar 16, 2011 at 16:05
  • @dhruvbird if there was an eval or new Function statement in the function body, then str could conceivably be used, and would thus not be GC'd. If not, and there existed no direct references to str in the function body, then it could be GC'd. Pretty simple actually, but whether it is an efficient use of processor time is another question...
    – MooGoo
    Mar 16, 2011 at 19:56
  • @MooGoo The rules for eval are quite complicated, so I don't recall them exactly, but I guess it would be possible for an outside entity to pass a handle to eval inside the function either via scope or parameters and then you could eval within the function. (Am not sure about the scoping rules of eval called with some other alias, so don't quote me on this).
    – dhruvbird
    Mar 16, 2011 at 20:18
