Perform additional processing on vertex sets

The query at the bottom of this post returns a collection of what I call paths, though you might call them vertex sets, as shown in the screenshot below. What is the correct term for these?

Based on the logic we came up with thus far I can return vertex sets that make up phrases like this:

“during any period of construction to a width”

This was the easy part. Now I need to do additional processing on each phrase. The last vertex is “width”, and I have its vertex id. I need to look n hops past this vertex for a word such as “100” or “one hundred”, followed within 1 or 2 more hops by another vertex containing text that matches:

(feet|foot|meter|inches|inch|’|”)

Note: In the example above the text “one hundred” will actually be a combination of 2 vertices.

If I can find the correct words (match the criteria correctly), then I need to add the word from each hop to the end of the phrase. This should give us a finished phrase like:

“during any period of construction to a width of (100) feet”
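A minimal sketch of the look-ahead described above, written in plain C++ over a list of words rather than in GSQL. Everything here is an assumption made for illustration: the name `find_measurement_tail`, the idea that the words are available in document order, and the two predicates passed in by the caller to decide what counts as a number word or a unit.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Words = std::vector<std::string>;
using Pred = std::function<bool(const std::string&)>;

// Hypothetical helper. `words` is the word sequence in document order and
// `end_idx` is the index of the phrase's last word (e.g. "width").
// Looks up to `max_hops` words past `end_idx` for a number word, then up to
// 2 more words for a unit. On success, returns every word from end_idx + 1
// through the unit, ready to append to the phrase. Otherwise returns empty.
inline Words find_measurement_tail(const Words& words,
                                   std::size_t end_idx,
                                   std::size_t max_hops,
                                   const Pred& is_number,
                                   const Pred& is_unit) {
    for (std::size_t n = end_idx + 1;
         n < words.size() && n <= end_idx + max_hops; ++n) {
        if (!is_number(words[n])) continue;
        // A number word was found; the unit must follow within 2 hops.
        for (std::size_t u = n + 1; u < words.size() && u <= n + 2; ++u) {
            if (is_unit(words[u])) {
                auto first = words.begin() + static_cast<std::ptrdiff_t>(end_idx + 1);
                auto last = words.begin() + static_cast<std::ptrdiff_t>(u + 1);
                return Words(first, last);
            }
        }
    }
    return {};
}
```

Because “one hundred” spans two vertices, the unit search allows up to 2 hops after the first number word, so “one”, “hundred”, “feet” still matches.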

I’m thinking we can use UDFs with regex for parts of this. The first question in this puzzle is how we break the accumulator values down into these subsets and begin the additional processing.
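A rough sketch of the kind of regex check such a UDF could wrap, in C++ since that is the language TigerGraph UDFs are written in. The function names `is_unit_word` and `is_number_word` are hypothetical, the list of spelled-out number words is an assumed and incomplete sample, and registering the functions as UDFs is not shown.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `text` is one of the unit words from the
// pattern above. Straight and curly quote marks are both accepted, on the
// assumption that the curly ones may come from typographic conversion.
inline bool is_unit_word(const std::string& text) {
    // Curly quotes are compared as raw UTF-8 bytes so the regex stays ASCII.
    if (text == "\xE2\x80\x99" || text == "\xE2\x80\x9D") return true;
    static const std::regex unit_re("(feet|foot|meter|inches|inch|'|\")",
                                    std::regex::icase);
    return std::regex_match(text, unit_re);
}

// Hypothetical helper: true if `text` is all digits or one of a small,
// assumed sample of spelled-out number words.
inline bool is_number_word(const std::string& text) {
    static const std::regex number_re(
        R"(\d+|one|two|three|four|five|six|seven|eight|nine|ten|hundred|thousand)",
        std::regex::icase);
    return std::regex_match(text, number_re);
}
```

`std::regex_match` requires the whole word to match, so “inches” is accepted but “inchworm” is not.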

Thank you.

CREATE QUERY aFindPathContainingMulitCriteria(STRING criteria, INT maxDist) FOR GRAPH MyGraph {
  /* exclusive,permanent,Right,Way */
  ListAccum<string> @@criteriaWords;
  SetAccum<edge> @@edgeSet;
  SetAccum<vertex> @@vSet;
  ListAccum<STRING> @@words;

  STRING startWord;
  STRING endWord;
  INT wordCount;

  @@criteriaWords += string_split(criteria, ",");
  startWord = @@criteriaWords.get(0);
  endWord = @@criteriaWords.get(@@criteriaWords.size() - 1);

  Start (ANY) = {word.*};

  Start = select s from Start:s where s.Text == startWord;

  while Start.size() > 0 limit 20 do
    Start = select t from Start:s-(nextword:e)-word:t
            accum @@edgeSet += e,
                  @@vSet += t,
                  @@words += t.Text
            having t.Text != endWord;
  end;

  vSet = @@vSet;
  print @@vSet, @@edgeSet, vSet;
  print @@words;
}

Hi again George,

For your first question, about going n hops past the “end word”: you can store the vertices of these words within the ACCUM clause of the select statement by using an IF statement:

SetAccum<vertex> @@matchSet;

while Start.size() > 0 limit 20 do
  Start = select t from Start:s-(nextword:e)-word:t
          accum @@edgeSet += e,
                @@vSet += t,
                @@words += t.Text,
                if t.Text == endWord then
                  @@matchSet += t
                end
          having t.Text != endWord;
end;

As for traversing to the “100” or “one hundred”, this doesn’t seem like something we can support on our end, so there would probably need to be some data alteration on your end.

Regarding your second question, about subsets, you can use a local accumulator to save the list of words up to a given point. Local accumulators differ from global ones in that each vertex has its own instance, whereas a global accumulator holds a single value shared across the whole query.

SetAccum<edge> @@edgeSet;
SetAccum<vertex> @@vSet;
ListAccum<string> @subset; // local accumulator
ListAccum<ListAccum<string>> @@superset; // global accumulator

Start (ANY) = {customer.*};

Start = select s from Start:s where s.test == startWord
        accum s.@subset += s.test
        post-accum @@superset += s.@subset;

while Start.size() > 0 limit 20 do
  Start = select t from Start:s-(ctoc:e)-customer:t
          accum @@edgeSet += e,
                @@vSet += t,
                t.@subset += s.@subset,
                t.@subset += t.test
          post-accum @@superset += t.@subset
          having t.test != endWord;
end;

print @@superset;

Here’s what’s going on:

In the first select statement, we add the first word to its own list and then add that list to the global list.

In the while loop, our traversal logic stays the same.

After adding the target vertex to the vertex set, we append the source’s list, followed by the target’s word, to the target’s list.

During the POST-ACCUM phase, we add the target’s updated list to the global list.
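To make the bookkeeping concrete, here is a small C++ model of the same steps (not GSQL), assuming the simplest case of a chain of words with one outgoing edge per word. The names `collect_prefixes`, `subset`, and `superset` are illustrative stand-ins for the traversal, `@subset`, and `@@superset`; real graphs with branching would need per-path handling that this sketch does not cover.

```cpp
#include <cstddef>
#include <string>
#include <vector>

using WordList = std::vector<std::string>;

// Models the traversal on a simple chain: chain[i] has one edge to chain[i+1].
// Returns the equivalent of @@superset: one growing word list per visited word.
inline std::vector<WordList> collect_prefixes(const WordList& chain,
                                              const std::string& end_word,
                                              std::size_t max_hops) {
    std::vector<WordList> superset;              // stands in for @@superset
    if (chain.empty()) return superset;
    std::vector<WordList> subset(chain.size());  // stands in for per-vertex @subset

    // First select: the start word gets its own list, which goes to the global list.
    subset[0].push_back(chain[0]);
    superset.push_back(subset[0]);

    std::size_t s = 0;
    for (std::size_t hop = 0; hop < max_hops && s + 1 < chain.size(); ++hop) {
        std::size_t t = s + 1;
        // ACCUM: copy the source's list to the target, then add the target's word.
        subset[t].insert(subset[t].end(), subset[s].begin(), subset[s].end());
        subset[t].push_back(chain[t]);
        // POST-ACCUM: record the target's updated list globally.
        superset.push_back(subset[t]);
        // HAVING: stop expanding once the end word is reached.
        if (chain[t] == end_word) break;
        s = t;
    }
    return superset;
}
```

With the chain “during any period of construction to a width” and “width” as the end word, the last list in the result is the full phrase, and each earlier list is a shorter prefix of it.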

Thanks,

Kevin