Flattening the output of a GSQL query

Hello,

I am working with a nested group-by accumulator in a query, and I need to flatten it. Specifically, I want each entry of the inner group-by accumulator to become its own row, with the outer keys repeated in separate columns so the entries are correctly mapped.

For example:

GroupByAccum<
     k1,
     k2,
     k3 (i.e., another group-by accumulator)<
          m1,
          m2,
          m3>>

Desired output:

row 1 k1 k2 m1
row 2 k1 k2 m2
row 3 k1 k2 m3

It is important to note that this output is too large to be generated directly in TigerGraph; we are receiving the response in JSON format via the REST API.

Is there any way to achieve the desired output response using GSQL, given these constraints?

Hi @sai_7419,

We don’t know the current structure of your group-by accumulator. However, if you want to flatten it, there are two approaches you can take: either use a Tuple type to hold each “row” of the result, or print each row to a CSV file. In both approaches, you’ll probably need a nested FOREACH loop over the GroupBy accumulator to collect everything you want into one row.

Approach 1 - define a Tuple to hold the column names and their types, then accumulate the rows into a ListAccum:

TYPEDEF TUPLE<STRING col_01, STRING col_02, ...> Row_Tuple;
ListAccum<Row_Tuple> @@all_result_rows;

...

FOREACH (g1, g2, g3, g4) IN @@group_by_accum_01 DO
    // suppose g4 is another group-by accumulator
    FOREACH (g4_1, g4_2, g4_3) IN g4 DO
        @@all_result_rows += Row_Tuple(g1, g2, g3, g4_1, g4_2, g4_3);
    END;
END;

PRINT @@all_result_rows;

Approach 2 - print each row to a CSV file. Note that FILE objects only support CSV format for now.

FILE f1 ("<some_file_name>.csv");

...

FOREACH (g1, g2, g3, g4) IN @@group_by_accum_01 DO
    // suppose g4 is another group-by accumulator
    FOREACH (g4_1, g4_2, g4_3) IN g4 DO
        f1.println(g1, g2, g3, g4_1, g4_2, g4_3);
    END;
END;

TigerGraph has a 2 GB limit on response size (see Output Statements and FILE Objects :: GSQL Language Reference). You might want to consider doing some “batching” to limit the data size of each request.
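One common batching pattern (a sketch of my own, not something from your query) is to pass a batch index as a query parameter and only process the vertices whose internal ID hashes into that batch, so each call returns a slice of the flattened rows. The query name, parameter names, graph name, and vertex type below are all illustrative:

```gsql
// Hypothetical sketch: flatten_batch, batch_id, num_batches, MyGraph,
// and MyVertexType are placeholder names, not part of the original answer.
CREATE QUERY flatten_batch(INT batch_id, INT num_batches) FOR GRAPH MyGraph {
  TYPEDEF TUPLE<STRING col_01, STRING col_02, ...> Row_Tuple;
  ListAccum<Row_Tuple> @@all_result_rows;

  // Process only the vertices whose internal ID falls into this batch,
  // so each call's JSON response stays well under the 2 GB limit.
  seed = SELECT v
         FROM MyVertexType:v
         WHERE getvid(v) % num_batches == batch_id;

  // ... build @@group_by_accum_01 over `seed`, then flatten it with the
  // nested FOREACH shown above ...

  PRINT @@all_result_rows;
}
```

You would then call the query once per batch (e.g. `batch_id` from 0 to `num_batches - 1`) and concatenate the JSON responses on the client side.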

Best,
Supawish Limprasert (Jim)
Solution Engineer, TigerGraph