How to extract a graph in adjacency list or matrix format from TigerGraph Cloud or UI

I couldn’t find any steps or guide for this. I need to extract a relational graph and use it as input to a GCNN model. The graph is heterogeneous, consisting of different vertex and edge types.

Unfortunately, we do not have a built-in method for extracting a heterogeneous graph as input to a GNN. How big is your graph? Depending on the size, you may need to batch it into neighborhood samples for each vertex so that it fits into the memory of your training machine. You can extract an edge list with a query along these lines (assuming the entire graph fits in memory):

CREATE QUERY companyLinks(/* Parameters here */) FOR GRAPH @graphname@ SYNTAX v2{ 
	TYPEDEF TUPLE <STRING src, STRING srcType, STRING dest, STRING destType> TUPLE_RECORD;
	ListAccum<TUPLE_RECORD> @@tupleRecords;
	start = {*};  
	result = SELECT tgt FROM start:s-(:e1)-:tgt WHERE s != tgt 
	         ACCUM @@tupleRecords += TUPLE_RECORD (s.id, s.type, tgt.id, tgt.type);
	PRINT @@tupleRecords;
}

Once this query is installed, you can call it and get a JSON response with pyTigerGraph. Let me know if you have any other questions!
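
As a rough sketch of the client side, the JSON output can be folded into an adjacency list in Python. The response shape used here is an assumption: pyTigerGraph's `runInstalledQuery` is expected to return a list with one dict per PRINT statement, keyed by the accumulator name. The sample records are made up so that the snippet runs without a server, and `build_adjacency_list` is a hypothetical helper, not part of pyTigerGraph.

```python
from collections import defaultdict


def build_adjacency_list(response, key="@@tupleRecords"):
    """Map (srcType, src) -> list of (destType, dest) neighbors."""
    adjacency = defaultdict(list)
    for printed in response:  # one dict per PRINT statement (assumed shape)
        for rec in printed.get(key, []):
            node = (rec["srcType"], rec["src"])
            adjacency[node].append((rec["destType"], rec["dest"]))
    return dict(adjacency)


# With a live server the response would come from something like:
#   import pyTigerGraph as tg
#   conn = tg.TigerGraphConnection(host="<your host>", graphname="<your graph>",
#                                  username="<user>", password="<password>")
#   response = conn.runInstalledQuery("companyLinks")
# Hypothetical sample standing in for that response:
sample_response = [
    {
        "@@tupleRecords": [
            {"src": "u1", "srcType": "User_id", "dest": "t1", "destType": "Transaction"},
            {"src": "u1", "srcType": "User_id", "dest": "d1", "destType": "device"},
            {"src": "u2", "srcType": "User_id", "dest": "t1", "destType": "Transaction"},
        ]
    }
]

adjacency = build_adjacency_list(sample_response)
for node, neighbors in adjacency.items():
    print(node, "->", neighbors)
```

Keying each node by (type, id) keeps vertices of different types apart even if their ids collide, which matters for a heterogeneous graph.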

Thanks @Parker_Erickson. My graph is heterogeneous with ~12 vertex types, and the tabular data has millions of records. Currently I’m working with sample data of around 2 lakh (200,000) records and a small heterogeneous graph, to understand the TigerGraph platform before moving on to the bigger problem. I tried extracting the edge list with your query. The issues I am facing are listed below; I have also attached the sample graph being used.

  1. I am getting an error on the line start = {*}
    Error message : (4, 10) Error: extraneous input '
    ’ expecting {ABORT, ANY, BY, COMMIT, DISTINCT, FILE, GROUP, INSERT, LASTHOP, LIST, LOG, MAP, MATCH, NOW, PATH, PER, REPLACE, SELECT_VERTEX, SRC, TGT, TO_DATETIME, UPDATE, ‘}’, ‘_’, NAME, GACCNAME}

  2. I changed start = {*} to start = {User_id.*}
    Error message : (5, 0) Error: ‘s.id’ indicates vertex types [Merchant_Category], which does not conform to any source vertex type of [User_id-(User_makes_Txn)->Transaction, User_id-(user_has_device)->device, User_id-(has_address)->address, User_id-(reverse_User_Receives_Txn)->Transaction, User_id-(Is_Fraud)->Fraud] supported here.

@pink03 Is it possible to share the full query? Are you starting with a single UserID or all UserIDs?

@Jon_Herke I want to extract the entire heterogeneous relational graph in adjacency list or adjacency matrix format. I tried the query shared by @Parker_Erickson but was getting errors, so starting from all User_id vertices to get the edge list was just an experiment.

Yeah, I wrote that query without knowing your schema. You will have to load your data into the graph before running it. The query will also have to be edited to use whatever the primary id of each vertex type is instead of “s.id”. Hope this helps.
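
For the adjacency matrix side of the original question, here is a minimal sketch that builds one matrix per (source type, destination type) pair from an edge list. It assumes the edges are already available as plain tuples. Since the extracted records carry no edge type, using the type pair as the relation key is a simplification. `build_relation_matrices` and the sample edges are hypothetical.

```python
def build_relation_matrices(edges):
    """edges: iterable of (src_type, src_id, dst_type, dst_id) tuples.

    Returns (index, matrices): index[vtype][vid] is the row/column position
    of a vertex within its type, and matrices[(src_type, dst_type)] is a
    dense 0/1 matrix stored as nested lists.
    """
    edges = list(edges)

    # Give every vertex an integer position within its own vertex type.
    index = {}
    for src_type, src_id, dst_type, dst_id in edges:
        for vtype, vid in ((src_type, src_id), (dst_type, dst_id)):
            ids = index.setdefault(vtype, {})
            if vid not in ids:
                ids[vid] = len(ids)

    # One matrix per (source type, destination type) pair.
    matrices = {}
    for src_type, src_id, dst_type, dst_id in edges:
        rel = (src_type, dst_type)
        if rel not in matrices:
            rows, cols = len(index[src_type]), len(index[dst_type])
            matrices[rel] = [[0] * cols for _ in range(rows)]
        matrices[rel][index[src_type][src_id]][index[dst_type][dst_id]] = 1
    return index, matrices


# Hypothetical sample edges using vertex type names from this thread.
sample_edges = [
    ("User_id", "u1", "Transaction", "t1"),
    ("User_id", "u1", "device", "d1"),
    ("User_id", "u2", "Transaction", "t2"),
    ("User_id", "u2", "Transaction", "t1"),
]

index, matrices = build_relation_matrices(sample_edges)
for rel, matrix in matrices.items():
    print(rel, matrix)
```

Dense matrices like these are only practical for small graphs; for millions of records a sparse representation or the edge list itself is likely the better input.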


Thanks @Parker_Erickson. Extracting the edge list worked, but I want only those results where dest.id is not null. How do I add that condition to the query?

CREATE QUERY userTomerchant() FOR GRAPH Fraud_sample SYNTAX v2{
	TYPEDEF TUPLE <STRING src, STRING srcType, STRING dest, STRING destType> TUPLE_RECORD;
	ListAccum<TUPLE_RECORD> @@tupleRecords;

	Seed = {User.*};
	acctSend = SELECT tgt
	           FROM Seed:s - (User_makes_txn>:e1) - Transaction:t - (Merchant_Receives_txn>:e2) - :tgt
	           WHERE s != tgt
	           ACCUM @@tupleRecords += TUPLE_RECORD (s.id, s.type, tgt.id, tgt.type);
	PRINT @@tupleRecords;
}

CREATE QUERY userTomerchant() FOR GRAPH Fraud_sample SYNTAX v2{
	TYPEDEF TUPLE <STRING src, STRING srcType, STRING dest, STRING destType> TUPLE_RECORD;
	ListAccum<TUPLE_RECORD> @@tupleRecords;

	Seed = {User.*};
	acctSend = SELECT tgt
	           FROM Seed:s - (User_makes_txn>:e1) - Transaction:t - (Merchant_Receives_txn>:e2) - :tgt
	           WHERE tgt.id != NULL
	           ACCUM @@tupleRecords += TUPLE_RECORD (s.id, s.type, tgt.id, tgt.type);
	PRINT @@tupleRecords;
}

This should do it, but I would double-check how the data was loaded if null values are showing up as your primary id.
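
One way to cross-check the loading is to scan the extracted records on the client for missing ids. This is only a sketch: it assumes the records have the src/srcType/dest/destType fields from TUPLE_RECORD, and that a missing id shows up in the JSON as either null or an empty string, which may not match what your instance returns. `split_records` and the sample data are hypothetical.

```python
def split_records(records):
    """Separate records with usable ids from those with a missing src or dest."""
    valid, suspect = [], []
    for rec in records:
        # Assumption: a missing id appears as None or "" in the parsed JSON.
        if rec.get("src") in (None, "") or rec.get("dest") in (None, ""):
            suspect.append(rec)
        else:
            valid.append(rec)
    return valid, suspect


# Hypothetical sample records; the "Merchant" type name is made up.
sample_records = [
    {"src": "u1", "srcType": "User", "dest": "m1", "destType": "Merchant"},
    {"src": "u2", "srcType": "User", "dest": "", "destType": "Merchant"},
    {"src": "u3", "srcType": "User", "dest": None, "destType": "Merchant"},
]

valid, suspect = split_records(sample_records)
print(len(valid), "valid records,", len(suspect), "with a missing id")
```

If the suspect list is non-empty, the source rows behind those records are a good place to start looking at the loading job.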
