Batch processing with large data

igenta · February 21, 2023, 7:57am

Hi everyone,

I’m new to TigerGraph. I’ve been working through some examples on finding similar customers, and I’m wondering whether there is another way to process millions of people at the same time.
As far as I can tell, accumulators in GSQL don’t support a DICT type, so the query can only save the similarity scores for the single id passed in as a parameter. That means the query cannot return similar persons for a whole batch of customers.

markmegerian · February 21, 2023, 2:49pm

Can you provide some more details, such as the example you are referring to?

Accumulators allow for many options, such as nested accumulators and SetAccum. Using a SetAccum, you could generate a set of similar persons and store it in an accumulator on each customer vertex.

igenta · February 22, 2023, 7:32am

I figured out that I can use a MapAccum to save the similarity score between each pair of customers. Thanks @markmegerian
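
For readers following along, the map-per-customer idea can be modeled outside GSQL. This is only a rough Python sketch, not GSQL and not the example from this thread: the function name `batch_similarity`, the `purchases` input, and the choice of Jaccard similarity are all assumptions made for illustration. Each customer in the batch gets its own map from other customer to score, much like a vertex-attached MapAccum would hold, and the whole batch is handled in one pass over a shared index rather than one call per customer.

```python
from collections import defaultdict


def batch_similarity(purchases, batch):
    """Return {customer: {other: jaccard_score}} for every customer in `batch`.

    `purchases` maps a customer id to the set of item ids linked to it.
    Both names are hypothetical and exist only for this illustration.
    """
    # Inverted index built once: item -> customers linked to that item.
    item_to_customers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            item_to_customers[item].add(customer)

    # One map per batch customer: other customer -> number of shared items.
    # This plays the role a per-vertex map accumulator would play.
    shared = {c: defaultdict(int) for c in batch}
    for c in batch:
        for item in purchases.get(c, ()):
            for other in item_to_customers[item]:
                if other != c:
                    shared[c][other] += 1

    # Turn shared-item counts into Jaccard scores: |A & B| / |A | B|.
    scores = {}
    for c, counts in shared.items():
        scores[c] = {
            other: n / len(purchases[c] | purchases[other])
            for other, n in counts.items()
        }
    return scores


purchases = {
    "alice": {"a", "b", "c"},
    "bob": {"b", "c", "d"},
    "carol": {"x"},
}
result = batch_similarity(purchases, ["alice", "bob", "carol"])
# alice and bob share 2 of 4 distinct items, so each scores 0.5 for the other;
# carol shares nothing with anyone, so her map stays empty.
```

The inverted index is built a single time and reused for every customer in the batch, which is the main difference from calling a single-customer routine in a loop.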

igenta · March 1, 2023, 4:19am

Hi @markmegerian ,
With this example, we recommend messages only to the person who is passed in as the query parameter.

Can you point me to another example that makes the same recommendation for a batch of customers?
I don’t think running a loop that calls the query once for every single customer is a good idea.

markmegerian · March 2, 2023, 5:44pm

That would be very easy, but I want to ask you: what other criteria would you use to select the batch? This example uses first and last name. Are you suggesting some other criteria that would yield a batch of customers? I agree that would work much better for certain use cases, where you don’t want to call the query for one user at a time. But also keep in mind that for a real-time recommender, you actually want to run it for the specific customer who is logged in and using the system at that moment.