Shortest paths between two sets of nodes












0















I created a graph in Neo4j with 10 million nodes and 30 million relationships.
Each node is labeled as A (4 million nodes) , B (6 million nodes) or C (20 nodes).

Nodes in A lead to nodes in B. Nodes in B lead to other nodes in B, and to nodes in C.

For each node in A, I need to find the closest node (or nodes, if they are the same distance) in C, and add the ID of the C node as a value of a property in the A node.

Any help would be much appreciated.










share|improve this question

























  • Welcome to SO! Please remember to include a Minimal, Complete, and Verifiable example. Such as what you've tried so far, what failed, what research you did.

    – Johan Rin
    Jan 1 at 12:45
















0















I created a graph in Neo4j with 10 million nodes and 30 million relationships.
Each node is labeled as A (4 million nodes) , B (6 million nodes) or C (20 nodes).

Nodes in A lead to nodes in B. Nodes in B lead to other nodes in B, and to nodes in C.

For each node in A, I need to find the closest node (or nodes, if they are the same distance) in C, and add the ID of the C node as a value of a property in the A node.

Any help would be much appreciated.










share|improve this question

























  • Welcome to SO! Please remember to include a Minimal, Complete, and Verifiable example. Such as what you've tried so far, what failed, what research you did.

    – Johan Rin
    Jan 1 at 12:45














0












0








0








I created a graph in Neo4j with 10 million nodes and 30 million relationships.
Each node is labeled as A (4 million nodes) , B (6 million nodes) or C (20 nodes).

Nodes in A lead to nodes in B. Nodes in B lead to other nodes in B, and to nodes in C.

For each node in A, I need to find the closest node (or nodes, if they are the same distance) in C, and add the ID of the C node as a value of a property in the A node.

Any help would be much appreciated.










share|improve this question
















I created a graph in Neo4j with 10 million nodes and 30 million relationships.
Each node is labeled as A (4 million nodes) , B (6 million nodes) or C (20 nodes).

Nodes in A lead to nodes in B. Nodes in B lead to other nodes in B, and to nodes in C.

For each node in A, I need to find the closest node (or nodes, if they are the same distance) in C, and add the ID of the C node as a value of a property in the A node.

Any help would be much appreciated.







neo4j shortest-path subgraph






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Jan 1 at 12:49







dbank04

















asked Jan 1 at 12:36









dbank04dbank04

11




11













  • Welcome to SO! Please remember to include a Minimal, Complete, and Verifiable example. Such as what you've tried so far, what failed, what research you did.

    – Johan Rin
    Jan 1 at 12:45



















  • Welcome to SO! Please remember to include a Minimal, Complete, and Verifiable example. Such as what you've tried so far, what failed, what research you did.

    – Johan Rin
    Jan 1 at 12:45

















Welcome to SO! Please remember to include a Minimal, Complete, and Verifiable example. Such as what you've tried so far, what failed, what research you did.

– Johan Rin
Jan 1 at 12:45





Welcome to SO! Please remember to include a Minimal, Complete, and Verifiable example. Such as what you've tried so far, what failed, what research you did.

– Johan Rin
Jan 1 at 12:45












1 Answer
1






active

oldest

votes


















0














So we're looking at a model like this (using :LEAD since you didn't specify a relationship type):



(:A)-[:LEAD]->(:B)
(:B)-[:LEAD]->(:B)
(:B)-[:LEAD]->(:C)


APOC Procedures offers the best solution for this one, but it's a two-parter since we first find the closest :C node using the path expander procedures, then rematch using that distance to get the full collection of :C nodes reachable at that distance.



You'll also want to make use of apoc.periodic.iterate() so you can batch this, though you may want to play around with the batchSize.



I'm making some assumptions in this query since you didn't provide much in the way of properties to use in the graph.



CALL apoc.periodic.iterate("MATCH (a:A) RETURN a",
"CALL apoc.path.spanningTree(a, {relationshipFilter:'LEAD>', labelFilter:'/C', limit:1}) YIELD path
WITH a, length(path) as length
CALL apoc.path.subgraphNodes(a, {relationshipFilter:'LEAD>', labelFilter:'/C', maxLevel:length}) YIELD node
WITH a, collect(node.id) as ids
SET a.cIDs = ids",
{batchSize:1000}) YIELD batches, total, errorMessages
RETURN batches, total, errorMessages





share|improve this answer
























  • Thank you so much! I refined it a little bit by specifying the relationship types, and used {batchSize:10000, iterateList:true, parallel:true}. Worked like a charm and completed within a few minutes.

    – dbank04
    Jan 2 at 10:43













Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53995476%2fshortest-paths-between-two-sets-of-nodes%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









0














So we're looking at a model like this (using :LEAD since you didn't specify a relationship type):



(:A)-[:LEAD]->(:B)
(:B)-[:LEAD]->(:B)
(:B)-[:LEAD]->(:C)


APOC Procedures offers the best solution for this one, but it's a two-parter since we first find the closest :C node using the path expander procedures, then rematch using that distance to get the full collection of :C nodes reachable at that distance.



You'll also want to make use of apoc.periodic.iterate() so you can batch this, though you may want to play around with the batchSize.



I'm making some assumptions in this query since you didn't provide much in the way of properties to use in the graph.



CALL apoc.periodic.iterate("MATCH (a:A) RETURN a",
"CALL apoc.path.spanningTree(a, {relationshipFilter:'LEAD>', labelFilter:'/C', limit:1}) YIELD path
WITH a, length(path) as length
CALL apoc.path.subgraphNodes(a, {relationshipFilter:'LEAD>', labelFilter:'/C', maxLevel:length}) YIELD node
WITH a, collect(node.id) as ids
SET a.cIDs = ids",
{batchSize:1000}) YIELD batches, total, errorMessages
RETURN batches, total, errorMessages





share|improve this answer
























  • Thank you so much! I refined it a little bit by specifying the relationship types, and used {batchSize:10000, iterateList:true, parallel:true}. Worked like a charm and completed within a few minutes.

    – dbank04
    Jan 2 at 10:43


















0














So we're looking at a model like this (using :LEAD since you didn't specify a relationship type):



(:A)-[:LEAD]->(:B)
(:B)-[:LEAD]->(:B)
(:B)-[:LEAD]->(:C)


APOC Procedures offers the best solution for this one, but it's a two-parter since we first find the closest :C node using the path expander procedures, then rematch using that distance to get the full collection of :C nodes reachable at that distance.



You'll also want to make use of apoc.periodic.iterate() so you can batch this, though you may want to play around with the batchSize.



I'm making some assumptions in this query since you didn't provide much in the way of properties to use in the graph.



CALL apoc.periodic.iterate("MATCH (a:A) RETURN a",
"CALL apoc.path.spanningTree(a, {relationshipFilter:'LEAD>', labelFilter:'/C', limit:1}) YIELD path
WITH a, length(path) as length
CALL apoc.path.subgraphNodes(a, {relationshipFilter:'LEAD>', labelFilter:'/C', maxLevel:length}) YIELD node
WITH a, collect(node.id) as ids
SET a.cIDs = ids",
{batchSize:1000}) YIELD batches, total, errorMessages
RETURN batches, total, errorMessages





share|improve this answer
























  • Thank you so much! I refined it a little bit by specifying the relationship types, and used {batchSize:10000, iterateList:true, parallel:true}. Worked like a charm and completed within a few minutes.

    – dbank04
    Jan 2 at 10:43
















0












0








0







So we're looking at a model like this (using :LEAD since you didn't specify a relationship type):



(:A)-[:LEAD]->(:B)
(:B)-[:LEAD]->(:B)
(:B)-[:LEAD]->(:C)


APOC Procedures offers the best solution for this one, but it's a two-parter since we first find the closest :C node using the path expander procedures, then rematch using that distance to get the full collection of :C nodes reachable at that distance.



You'll also want to make use of apoc.periodic.iterate() so you can batch this, though you may want to play around with the batchSize.



I'm making some assumptions in this query since you didn't provide much in the way of properties to use in the graph.



CALL apoc.periodic.iterate("MATCH (a:A) RETURN a",
"CALL apoc.path.spanningTree(a, {relationshipFilter:'LEAD>', labelFilter:'/C', limit:1}) YIELD path
WITH a, length(path) as length
CALL apoc.path.subgraphNodes(a, {relationshipFilter:'LEAD>', labelFilter:'/C', maxLevel:length}) YIELD node
WITH a, collect(node.id) as ids
SET a.cIDs = ids",
{batchSize:1000}) YIELD batches, total, errorMessages
RETURN batches, total, errorMessages





share|improve this answer













So we're looking at a model like this (using :LEAD since you didn't specify a relationship type):



(:A)-[:LEAD]->(:B)
(:B)-[:LEAD]->(:B)
(:B)-[:LEAD]->(:C)


APOC Procedures offers the best solution for this one, but it's a two-parter since we first find the closest :C node using the path expander procedures, then rematch using that distance to get the full collection of :C nodes reachable at that distance.



You'll also want to make use of apoc.periodic.iterate() so you can batch this, though you may want to play around with the batchSize.



I'm making some assumptions in this query since you didn't provide much in the way of properties to use in the graph.



CALL apoc.periodic.iterate("MATCH (a:A) RETURN a",
"CALL apoc.path.spanningTree(a, {relationshipFilter:'LEAD>', labelFilter:'/C', limit:1}) YIELD path
WITH a, length(path) as length
CALL apoc.path.subgraphNodes(a, {relationshipFilter:'LEAD>', labelFilter:'/C', maxLevel:length}) YIELD node
WITH a, collect(node.id) as ids
SET a.cIDs = ids",
{batchSize:1000}) YIELD batches, total, errorMessages
RETURN batches, total, errorMessages






share|improve this answer












share|improve this answer



share|improve this answer










answered Jan 2 at 1:54









InverseFalconInverseFalcon

19.5k31831




19.5k31831













  • Thank you so much! I refined it a little bit by specifying the relationship types, and used {batchSize:10000, iterateList:true, parallel:true}. Worked like a charm and completed within a few minutes.

    – dbank04
    Jan 2 at 10:43





















  • Thank you so much! I refined it a little bit by specifying the relationship types, and used {batchSize:10000, iterateList:true, parallel:true}. Worked like a charm and completed within a few minutes.

    – dbank04
    Jan 2 at 10:43



















Thank you so much! I refined it a little bit by specifying the relationship types, and used {batchSize:10000, iterateList:true, parallel:true}. Worked like a charm and completed within a few minutes.

– dbank04
Jan 2 at 10:43







Thank you so much! I refined it a little bit by specifying the relationship types, and used {batchSize:10000, iterateList:true, parallel:true}. Worked like a charm and completed within a few minutes.

– dbank04
Jan 2 at 10:43






















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53995476%2fshortest-paths-between-two-sets-of-nodes%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

MongoDB - Not Authorized To Execute Command

in spring boot 2.1 many test slices are not allowed anymore due to multiple @BootstrapWith

How to fix TextFormField cause rebuild widget in Flutter