Hi Ishestakov,
Thanks for the reply. It turns out that my actual use case is a little more complicated than my previous simplified example. The actual code looks something like this (note that it throws the same error):
SetAccum<VERTEX<Claim>> @claims;
SetAccum<STRING> @vins, @plates, @ids, @names;
INT n_secs;
n_secs = n_days*24*60*60+1;
_t0 = SELECT s FROM claim:s - (REV_CAR_ASSOCIATE_WITH>) - Car:t
ACCUM s.@vins += t.vin,
s.@plates += t.plate_no;
_t1 = SELECT c1 FROM claim:c1 - (<CAR_ASSOCIATE_WITH.CAR_ASSOCIATE_WITH>) - claim:c2
ACCUM
IF (count(c1.@plates INTERSECT c2.@plates) > 1 OR count(c1.@vins INTERSECT c2.@vins) > 1) AND
c1 != c2 THEN
c1.@claims += c2 END
POST-ACCUM c1.@claims += c1;
I then rewrote it as:
SetAccum<VERTEX<Claim>> @claims;
SetAccum<STRING> @vins, @plates, @ids, @names;
SumAccum<INT> @vin_cnt, @plate_cnt;
INT n_secs;
n_secs = n_days*24*60*60+1;
_t0 = SELECT s FROM claim:s - (REV_CAR_ASSOCIATE_WITH>) - Car:t
ACCUM s.@vins += t.vin,
s.@plates += t.plate_no;
_t1 = SELECT c2 FROM claim:c1 - (<CAR_ASSOCIATE_WITH.CAR_ASSOCIATE_WITH>) - claim:c2
ACCUM IF c1 != c2 AND abs(datetime_diff(c1.accident_time, c2.accident_time)) < n_secs THEN
FOREACH vin IN c2.@vins DO
FOREACH plate IN c2.@plates DO
IF c1.@vins.contains(vin) THEN c1.@vin_cnt += 1 END,
IF c1.@plates.contains(plate) THEN c1.@plate_cnt += 1 END,
IF c1.@vin_cnt > 1 OR c1.@plate_cnt > 1 THEN c1.@claims += c2, BREAK END
END
END
END, c1.@vin_cnt = 0, c1.@plate_cnt = 0
POST-ACCUM c1.@claims += c1;
(P.S. I added a BREAK inside the FOREACH loop to stop iterating once the condition is satisfied.)
But now it throws the error:
an undefined variable plate in the current scope
which seems to be another incompatibility with syntax V2.
So eventually I gave up on the multi-hop syntax and switched to single-hop patterns, although doing pairwise comparison with single hops has always been less intuitive to me:
SetAccum<VERTEX<Claim>> @claims;
SetAccum<STRING> @vins, @plates;
GroupByAccum<VERTEX<Claim> claim, DATETIME accident_time, SetAccum<STRING> vin, SetAccum<STRING> plate> @claim_info;
INT n_secs;
n_secs = n_days*24*60*60+1;
_t0 = SELECT t FROM claim:s - (REV_CAR_ASSOCIATE_WITH>) - Car:t
ACCUM
s.@vins += t.vin,
s.@plates += t.plate_no,
t.@claim_info += (s, s.accident_time -> t.vin, t.plate_no);
_t1 = SELECT t FROM claim:s - (REV_CAR_ASSOCIATE_WITH>) - Car:t
ACCUM
FOREACH info IN t.@claim_info DO
IF s != info.claim AND
abs(datetime_diff(s.accident_time, info.accident_time)) < n_secs AND
(count(info.vin INTERSECT s.@vins) > 1 OR count(info.plate INTERSECT s.@plates) > 1) THEN
s.@claims += info.claim
END
END
POST-ACCUM s.@claims += s;
This version has to store each Claim's info on the Car vertices it connects to, and then traverse the Claim-Car edges a second time to do the pairwise comparison. However, the pairwise comparison seems to involve plenty of redundancy, since the same pair of claims is compared once for every car they share, much like in the jaccard_batch implementation.
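To make the redundancy concrete, here is a rough Python model of the two single-hop passes. It is plain Python rather than GSQL so it can run without a TigerGraph instance. The claim and car names are made up, and `count_pair_comparisons` is a hypothetical helper, not part of my query. It only counts how often each ordered pair of claims would be compared.

```python
from collections import defaultdict

# Hypothetical toy data: claim id -> set of car ids it is linked to.
claim_cars = {
    "claim_A": {"car_1", "car_2", "car_3"},
    "claim_B": {"car_1", "car_2"},
    "claim_C": {"car_3"},
}


def count_pair_comparisons(claim_cars):
    """Mimic the two single-hop passes and count comparisons per ordered claim pair."""
    # Pass 1: each car remembers the claims attached to it
    # (roughly what t.@claim_info stores on the Car vertex).
    car_claims = defaultdict(set)
    for claim, cars in claim_cars.items():
        for car in cars:
            car_claims[car].add(claim)

    # Pass 2: for every claim-car edge, loop over the claims stored on that car
    # (roughly the FOREACH over t.@claim_info inside ACCUM).
    comparisons = defaultdict(int)
    for claim, cars in claim_cars.items():
        for car in cars:
            for other in car_claims[car]:
                if other != claim:
                    comparisons[(claim, other)] += 1
    return dict(comparisons)


counts = count_pair_comparisons(claim_cars)
```

In this toy data claim_A and claim_B share two cars, so that pair is compared twice in each direction, although a single comparison would be enough.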