Out of memory while adding elements in immutable set in Scala
I'm getting an out-of-memory error in a loop when I'm adding elements to an immutable set. There are already a lot of objects in the set, and I guess it's consuming a lot of memory. I know that when adding elements to an immutable collection, Scala will first copy the existing collection into a new set, add the element to the new set, and return this new set.
So suppose my JVM memory is 500 MB and the set is consuming 400 MB. Now, before adding a new element, Scala tries to copy the old set into a new set (which I think would consume another 400 MB). At this step it has already exceeded the JVM memory (total consumed: 800 MB), and hence it throws an out-of-memory error.
The code looks a little like the following:
private def getNewCollection(myMuttableSet: Set[MyType]): Set[MyType] = {
  myMuttableSet.flatMap { c =>
    // this method returns a large collection, so during the loop the collection grows exponentially
    val returnedSet = doSomeCalculationsAndreturnASet
    if (returnedSet.isEmpty) Set.empty[MyType]
    else doSomeCalculationsAndreturnASet + MyType(constArg1, constArg2) // MyType is a case class
  }
}
Kindly advise if my understanding is correct.
scala collections garbage-collection out-of-memory
Could you include your loop code in your question? – hasumedic, Nov 19 '18 at 14:56
Your understanding is correct. – Dima, Nov 19 '18 at 15:04
Yes, you're correct, but it would be nice to see your code, because if you're doing something like var set = Set.empty[Int]; for (i <- 1 to 1000) { set = set + i }, then, IMHO, it would be better to just use a mutable set. – Luis Miguel Mejía Suárez, Nov 19 '18 at 15:18
Preference between mutable val vs immutable var should depend in part on visibility. See stackoverflow.com/a/11386867/3226045 for an explanation of referential transparency. – Shane Perry, Nov 19 '18 at 20:33
added the algo that is used in the program – whysoseriousson, Nov 20 '18 at 6:57
asked Nov 19 '18 at 14:51 by whysoseriousson, edited Nov 20 '18 at 6:57
2 Answers
It is not quite as simple as that, because it depends on the size of the elements in the Set.
Creating a new Set is a shallow operation: it does not copy the elements in the set, it just creates a new wrapper (typically a hash table of some sort) pointing to the same objects.
If you have a small set of large objects, then duplicating that set might not take much storage, because the objects will be shared between the two sets. Most of the memory is used by the objects in the set, and these do not need to be copied to create a new set. So your 400 MB might become 450 MB and fit within the memory limit.
If you have a large set of small objects, then duplicating that set may double the storage. Most of the memory is used in the Set itself and can't be shared between the original set and the copy. In this case your 400 MB could easily become close to 800 MB.
Since you are running out of memory and you say there are a lot of objects, it sounds like this is the problem, but we would need to see the code to tell for sure.
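The shallow-copy point can be demonstrated directly (a small sketch; the Payload class is hypothetical, not from the question): adding to an immutable Set builds a new set structure, but the element objects themselves are shared by reference between the old and new sets.

```scala
case class Payload(data: Vector[Int])

val big = Payload(Vector.fill(1000)(0)) // a "large" element
val s1 = Set(big)
val s2 = s1 + Payload(Vector.fill(1000)(1))

// s2 contains the very same Payload instance that s1 holds, so
// duplicating a small set of large objects costs little extra memory
val shared = s2.exists(e => e eq big)
```

Here `shared` is true: the 1000-element vector inside `big` exists exactly once on the heap, no matter how many set versions point at it.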
answered Nov 19 '18 at 19:17 by Tim
Thanks Tim for the answer. I am checking the code and will come back. – whysoseriousson, Nov 21 '18 at 3:14
"Now before adding a new element, Scala tries to copy the old set into a new set (which I think would again consume 400 MB)…"
This is not correct.
Immutable collections in Scala (including Sets) are implemented as persistent data structures, which usually have a property called "structural sharing". That means that when the structure is updated, it is not fully copied; instead, most of it is reused, with only a relatively small part actually being re-created from scratch.
The easiest example to illustrate this is List, which is implemented as a singly linked list, with the root pointing to the head.
For example, you have the following code:
val a = List(3, 2, 1)
val b = 4 :: a
val c = 5 :: b
Although the three lists combined have 3 + 4 + 5 = 12 elements in total, they physically share the nodes, and there are only 5 List nodes:
5 → 4 → 3 → 2 → 1
↑   ↑   ↑
c   b   a
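This sharing is directly observable with reference equality (a quick sketch using the lists above): the tail of b is the very same object as a, not a copy of it.

```scala
val a = List(3, 2, 1)
val b = 4 :: a
val c = 5 :: b

// reference equality: prepending reuses the existing nodes
val bSharesA = b.tail eq a
val cSharesB = c.tail eq b
```

Both checks hold, which is exactly why prepending to an immutable List is O(1) in both time and memory.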
A similar principle applies to Set. Set in Scala is implemented as a hash trie (HashTrie). I won't go into the specifics of a trie; just think of it as a tree with a high branching factor. When that tree is updated, it is not copied completely: only the nodes on the path from the tree root to the new/updated node are copied.
For the HashTrie, the depth of the tree cannot be more than 7 levels. So when updating a Set in Scala, you're looking at memory allocation proportional to O(7 * 32) in the worst case (7 levels max, with each node roughly speaking being an array of 32), regardless of the set's size.
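One consequence of this persistence, sketched below: updating an immutable Set leaves every earlier version fully intact while sharing almost all of the underlying trie, so the "copy" never costs anything close to the full 400 MB the question assumes.

```scala
val base = (1 to 10000).toSet
val updated = base + 10001

// the old version is untouched — only the short path from the trie root
// to the new element's slot was rebuilt; everything else is shared
val oldIntact = !base.contains(10001) && base.size == 10000
```

The old and new sets coexist cheaply because they are mostly the same tree.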
Looking at your code, you have the following things in memory:
- myMuttableSet is present until getNewCollection returns
- myMuttableSet.flatMap creates a mutable buffer underneath. Also, after flatMap is done, buffer.result will copy the contents of the mutable buffer over to an immutable set, so there is actually a brief moment when two sets exist
- on every step of flatMap, returnedSet also retains memory
Side note: why are you calling doSomeCalculationsAndreturnASet again if you already have its result cached in returnedSet? Could that be the root of the problem?
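If that second call really is redundant, reusing returnedSet drops one full result set from memory at every step. A hedged sketch of the reuse (the stub types and the helper below are hypothetical stand-ins for the question's code, and it assumes doSomeCalculationsAndreturnASet depends on the current element):

```scala
// hypothetical stand-ins for the question's case class and constants
case class MyType(a: Int, b: Int)
val constArg1 = 0
val constArg2 = 0

// hypothetical stand-in for the question's expensive computation
def doSomeCalculationsAndreturnASet(c: MyType): Set[MyType] =
  if (c.a % 2 == 0) Set(MyType(c.a + 1, c.b)) else Set.empty

def getNewCollection(myMuttableSet: Set[MyType]): Set[MyType] =
  myMuttableSet.flatMap { c =>
    val returnedSet = doSomeCalculationsAndreturnASet(c) // computed once, reused below
    if (returnedSet.isEmpty) returnedSet
    else returnedSet + MyType(constArg1, constArg2)
  }
```

The shape of the loop is unchanged; the only difference is that the result of the expensive call is computed once per element instead of twice.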
So, at any given point in time you have in memory (whichever is larger):
- myMuttableSet + mutable result set buffer + returnedSet + (another?) result of doSomeCalculationsAndreturnASet
- myMuttableSet + mutable result set buffer + immutable result set
To conclude: whatever your problems with memory are, adding the element to the Set is most probably not the culprit. My suggestion would be to suspend your program in a debugger and use a profiler (such as VisualVM) to take heap dumps at different stages.
answered Nov 20 '18 at 8:03 by Aivean, edited Nov 20 '18 at 8:17
Thanks for the detailed analysis. I will check and come back on this. – whysoseriousson, Nov 21 '18 at 3:14