Let’s assume I have a query that finds friends of a given person, and their friends, and their friends.
Something like:
S = { personId };
VertexSet_1 =
SELECT s
FROM S:s -(KNOWS*1..3)- Person:friends
WHERE (friends.firstName == firstName);
Is there a way to return the value of how many hops away is a given friend from the person?
If we have a graph like this:
(John)-KNOWS-(Anna)
(Anna:Person)-KNOWS-(Tom)
The expected result would be (starting from John)
John, Anna, 1
John, Tom, 2
Any thoughts? Is there a construct in the GSQL language for that? Or do I need to have multiple one-hop queries to keep the track of the distance?
This is possible in GSQL, but not with the Kleene star syntax as you have specified.
Essentially, the way to do this is through brute force. We start with the source node, and we inspect all other vertices that are 1 hop away. Then, we inspect all vertices (that have not been visited yet) that are 1 hop away from these new nodes (these nodes will be 2 hops away from the source). This process can repeat indefinitely, as we keep track of visited nodes which prevents us from travelling backwards.
The process eventually stops once it inspects the node that you have specified as the target. At that point, it will return the number of hops that it takes to get from source to target. If the specified maximum number of hops has been inspected and the target has not been found, the query will return -1.
CREATE QUERY test(INT maxHops, STRING startID, STRING endID) FOR GRAPH MyGraph RETURNS (INT) {
INT ans=0;
OrAccum @visited = false;
SumAccum<int> @@loop=0;
Start = {Person.*};
Start = SELECT s from Start:s where s.firstName == startID;
Start = SELECT v
FROM Start:v
ACCUM v.@visited = true;
WHILE (@@loop < maxHops) DO
@@loop += 1;
Start = SELECT v
FROM Start:u - (KNOWS:e)- :v
WHERE v.@visited == false
ACCUM v.@visited = true
POST-ACCUM IF v.firstName == endID THEN
ans = @@loop END;
if ans != 0 then
print ans;
return ans;
end;
END;
// ONLY IF TARGET NOT FOUND
print -1 as ans;
return -1;
}
Note that if source and target are n hops apart, this algorithm will have to inspect all other nodes that are n hops or less away from the source. This may be inefficient for larger graphs.
1 Like
Thanks. I’ll give it a try.
1 Like