Installing TigerGraph algorithms on Windows to use in GraphStudio - WCC, Jaccard, Louvain, etc.

@Parker_Erickson @Vladimir_Slesarev @Szilard_Barany
Strange error, as shared earlier:

GSQL > @tg_fastRP.gsql

Semantic Check Error in query tg_fastRP (SEM-45): line 34, col 13
The tuple name or the function tg_extract_list is not defined.
Failed to create queries: [tg_fastRP].

@Szilard_Barany are you joining the Zoom call you scheduled?

Regards
Anuroop

tg_jaccard_nbor_ap_batch is not giving me any result, whereas tg_jaccard_nbor_ss is working fine. However, I need to run it for all the nodes simultaneously.
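For checking results on a small sample, here is a minimal Python sketch of what an all-pairs Jaccard neighborhood similarity computes. It is not the tg_jaccard_nbor_ap_batch code; the function name and the sample data are made up for illustration.

```python
from itertools import combinations


def jaccard_all_pairs(neighbors):
    """neighbors: {vertex: set of neighbor ids}.

    Returns {(u, v): similarity} for every unordered pair of vertices.
    """
    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        union = neighbors[u] | neighbors[v]
        if union:  # skip pairs where both vertices have no neighbors
            scores[(u, v)] = len(neighbors[u] & neighbors[v]) / len(union)
    return scores


# Hypothetical neighborhoods for three vertices.
neighbors = {"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {5}}
scores = jaccard_all_pairs(neighbors)
```

Comparing a handful of such hand-computed pairs with the single-source output of tg_jaccard_nbor_ss may help confirm whether the batch version is returning nothing or just filtering everything out.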

I am not able to find the code for tg_jaccard_nbor_ap.gsql. The algorithm page is broken and returns a 404 error.

Could someone please help?

@Mohamed_Zrouga @Dan_Barkus @Parker_Erickson @Szilard_Barany @Vladimir_Slesarev @Jon_Herke

Just verifying that this is the link you’re referring to: https://github.com/tigergraph/gsql-graph-algorithms/tree/master/algorithms/Similarity/jaccard and that, when clicking on tg_jaccard_nbor_ap.gsql there, you got a 404 (page doesn’t exist). I’ve confirmed that I receive the same error.

Forwarding this on to the Graph Data Science team to get their input on this thread.

Yes, this is the link

The tg_wcc algorithm is not working in my case. Nodes in the same weakly connected component have been assigned different community IDs (result attribute values). There could be something wrong at my end, but I am not able to figure it out.

Could anyone please help?

I am available for a Zoom call today as well as tomorrow, but please treat this as top priority.

My email ID: anuroop.ajmera@gmail.com

@Parker_Erickson @Mohamed_Zrouga @Jon_Herke @Szilard_Barany @Dan_Barkus @Vladimir_Slesarev @Bruno

@AnuroopAjmera “tg_wcc algorithm is not working in my case”

Can you elaborate on what isn’t working as expected? What steps did you take? What errors are you receiving?

Hi Jon,
It’s assigning different community IDs to nodes that should be in the same community as per the source data. I checked this using a pivot table in Excel.

Looks like something is wrong at my end.

Could someone please help on priority? I am available to connect over a meeting (zoom call?) - anuroop.ajmera@gmail.com

Regards
Anuroop Ajmera

@Jon_Herke @Parker_Erickson @Mohamed_Zrouga @Szilard_Barany @Dan_Barkus @Vladimir_Slesarev @Bruno

The WCC algorithm is not working in my case because, unlike in Neo4j, this algorithm in TigerGraph expects the nodes to be similar, whereas I want to find communities of nodes based on other nodes (properties).
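To make the expectation concrete, here is a minimal Python sketch of weakly connected components on a graph where entity nodes are linked only through "property" nodes. It is a generic union-find illustration, not TigerGraph's tg_wcc implementation, and the vertex names are hypothetical.

```python
def weakly_connected_components(edges):
    """Return {node: component_root}, treating every edge as undirected."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_v] = root_u  # merge the two components

    return {node: find(node) for node in list(parent)}


# Hypothetical bipartite data: IP vertices connected only via event vertices.
edges = [
    ("ip:10.0.0.1", "event:A"),
    ("ip:10.0.0.2", "event:A"),
    ("ip:10.0.0.3", "event:B"),
]
components = weakly_connected_components(edges)
```

In this sketch the two IPs that share event:A land in the same component only because the edges to the event vertices are part of the traversal. If the algorithm is run over a single vertex type, or without the connecting edge types, such groups would be split. Whether that is what happened here is an assumption worth checking, not a confirmed diagnosis.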

@Jon_Herke @Parker_Erickson @Mohamed_Zrouga @Szilard_Barany @Dan_Barkus @Vladimir_Slesarev @Bruno

The current conclusions on my TigerGraph DS project are given below. Work is still in progress, so the final conclusions will be shared later.

  1. tg_wcc is not working
    The WCC algorithm is not working in my case because, unlike in Neo4j, this algorithm in TigerGraph expects the nodes to be similar, whereas I want to find communities of nodes based on other nodes (properties).

  2. tg_jaccard_nbor_ap_batch has a code issue
    The code has a bug, which is now resolved locally with the help of @Szilard_Barany

  3. tg_louvain is not showing visualizations
    Louvain is also not showing any visualizations of communities like those shown in TigerGraph videos. The “Explore Graph” section is showing some nodes having a common cid, but Louvain’s output is not showing the same.

  4. Absence of a weighted degree centrality algorithm in TigerGraph:
    Details below -
    Weighted degree centrality algorithm

  5. Not able to load a 30 MB file:
    Details below:
    Data load getting stuck for 16 MB file

  6. Changes done in GSQL are not reflected in TigerGraph GraphStudio:
    Details below:
    How to get GSQL changes reflected in GraphStudio

@Jon_Herke @Parker_Erickson @Mohamed_Zrouga @Szilard_Barany @Dan_Barkus @Vladimir_Slesarev @Bruno @Pawan_Mall

I want to apply min-max normalization but am getting an error when computing max() - min() below. Can someone please suggest what the issue is here?

GSQL > BEGIN
GSQL > CREATE QUERY degree_cent_res() FOR GRAPH NetworkAnomalyDetection SYNTAX v2 {
GSQL > SELECT s.SourceAddress, (max(s.degree) - min(s.degree)) as risk_score INTO T
GSQL > FROM SourceIP:s -(HAS_SESSION_EVENT>:e)- SessionEvent:se
GSQL > GROUP BY s.SourceAddress
GSQL > ORDER BY risk_score DESC
GSQL > LIMIT 100;
GSQL > PRINT T;
GSQL > }
GSQL > END
Index -1 out of bounds for length 2
Failed to create queries: [degree_cent_res].

@Szilard_Barany @Dan_Barkus @Parker_Erickson @Jon_Herke @Renchu_Song @Mohamed_Zrouga @Bruno @Pawan_Mall @Vladimir_Slesarev @markmegerian

can anyone please suggest?

@Szilard_Barany @Dan_Barkus @Parker_Erickson @Jon_Herke @Renchu_Song @Mohamed_Zrouga @Bruno @Pawan_Mall @Vladimir_Slesarev @markmegerian

I have a few observations and questions for you

  1. The use of the SQL-like syntax is not appropriate in this case; you should use the regular graph traversal syntax. The main reason is that you can only include base attributes in the SELECT list, and you are trying to use the degree function.
  2. The degree function has parentheses after it: degree()
  3. Will you have multiple SourceIP vertices with the same SourceAddress value? If so, then you can summarize using a MapAccum, like this:

  MapAccum<STRING, MinAccum<INT>> @@minDegree;
  MapAccum<STRING, MaxAccum<INT>> @@maxDegree;

  S = SELECT s FROM SourceIP:s -(HAS_SESSION_EVENT>:e)- SessionEvent:se
      ACCUM @@minDegree += (s.SourceAddress -> s.degree()),
            @@maxDegree += (s.SourceAddress -> s.degree());

If not (i.e. each SourceIP has a unique SourceAddress), then it’s even simpler: you can just use vertex-attached accumulators, like this:

  MinAccum<INT> @minDegree;
  MaxAccum<INT> @maxDegree, @riskScore;

  S = SELECT s FROM SourceIP:s -(HAS_SESSION_EVENT>:e)- SessionEvent:se
      ACCUM s.@minDegree += s.degree(),
            s.@maxDegree += s.degree()
      POST-ACCUM s.@riskScore += s.@maxDegree - s.@minDegree;
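As a rough mental model, the grouped min/max accumulation behaves like a pair of dictionaries keyed by SourceAddress. The Python sketch below only imitates that aggregation under this assumption; the function name and sample rows are made up, and it is not GSQL.

```python
def min_max_by_key(rows):
    """rows: iterable of (source_address, degree) pairs.

    Returns two dicts holding the minimum and maximum degree per address.
    """
    min_by_key, max_by_key = {}, {}
    for address, degree in rows:
        min_by_key[address] = min(degree, min_by_key.get(address, degree))
        max_by_key[address] = max(degree, max_by_key.get(address, degree))
    return min_by_key, max_by_key


# Hypothetical rows: two vertices share the address 10.0.0.1.
rows = [("10.0.0.1", 3), ("10.0.0.1", 7), ("10.0.0.2", 5)]
mins, maxs = min_max_by_key(rows)
spread = {address: maxs[address] - mins[address] for address in mins}
```

Note that when a key has only one value, its minimum and maximum coincide, so the max minus min difference for that key is 0.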

@markmegerian
Here I do not intend to use the degree function. I have added an attribute named “degree” to the vertex and stored the output of the degree centrality algorithm in that attribute.
Now I want to apply min-max normalisation and am facing the index-out-of-bounds issue mentioned above.

In this case, can’t I use the simple SQL syntax? Kindly suggest.

@Szilard_Barany @Dan_Barkus @Parker_Erickson @Jon_Herke @Renchu_Song @Mohamed_Zrouga @Bruno @Pawan_Mall @Vladimir_Slesarev @markmegerian

@markmegerian
Getting the error "no viable alternative at input ‘s.degree()\n s’" at the line s.@maxDegree += s.degree() in the code below:

CREATE QUERY degree_centrality_res() FOR GRAPH NetworkAnomalyDetection {
  MinAccum<INT> @minDegree;
  MaxAccum<INT> @maxDegree, @riskScore;

  S = SELECT s FROM SourceIP:s -(HAS_SESSION_EVENT>:e)- SessionEvent:se
      ACCUM s.@minDegree += s.degree(),
            s.@maxDegree += s.degree()
      POST-ACCUM s.@riskScore += s.@maxDegree - s.@minDegree;
}

@Szilard_Barany @Parker_Erickson @Renchu_Song @Mohamed_Zrouga @Pawan_Mall @Vladimir_Slesarev @markmegerian

Try specifying SYNTAX V2:

CREATE QUERY degree_centrality_res() FOR GRAPH NetworkAnomalyDetection SYNTAX V2 {

@markmegerian

This works fine without the parentheses after degree, since degree is an attribute as mentioned earlier, but it is not giving the desired results.

Actually, I first have to calculate the max and min of degree across the graph, and then
I have to calculate the risk score using min-max normalisation as below:

riskScore += (1 - (s.degree - s.@minDegree) / (s.@maxDegree - s.@minDegree));

Could you kindly suggest?

@Szilard_Barany @Parker_Erickson @Renchu_Song @Mohamed_Zrouga @Pawan_Mall @Vladimir_Slesarev

Could someone please help?

  1. I first have to calculate the max and min of degree (a node attribute) across the graph, and then
  2. calculate the risk score per node using min-max normalisation as below:

riskScore += (1 - (s.degree - s.@minDegree) / (s.@maxDegree - s.@minDegree));
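The two steps can be sketched in plain Python to pin down the intended arithmetic. This is only an illustration of the formula, not GSQL; the function name, the sample degrees, and the choice to return 0.0 when all degrees are equal are assumptions.

```python
def risk_scores(degrees):
    """Map {node: degree} to {node: 1 - (degree - min) / (max - min)}."""
    # Step 1: graph-level minimum and maximum of the stored degree values.
    lowest, highest = min(degrees.values()), max(degrees.values())
    span = highest - lowest
    if span == 0:
        # All degrees equal: the normalization is undefined, so this sketch
        # arbitrarily returns 0.0 for every node instead of dividing by zero.
        return {node: 0.0 for node in degrees}
    # Step 2: per-node score, using floating-point division.
    return {node: 1 - (d - lowest) / span for node, d in degrees.items()}


# Hypothetical degree values for three nodes.
degrees = {"a": 2, "b": 6, "c": 10}
scores = risk_scores(degrees)
```

With this formula the node with the lowest degree gets a score of 1 and the node with the highest degree gets 0.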

@markmegerian @Szilard_Barany @Parker_Erickson @Renchu_Song @Mohamed_Zrouga @Pawan_Mall @Vladimir_Slesarev

This has to be done on the output of the degree centrality algorithm, which I have saved in the “degree” attribute on each SourceIP node.

@markmegerian @Szilard_Barany @Parker_Erickson @Renchu_Song @Mohamed_Zrouga @Pawan_Mall @Vladimir_Slesarev

@markmegerian

I am able to calculate the min and max correctly at graph level, but the risk score calculation at node level is not working correctly. I am not sure what type of accumulator @riskScore should be declared as, or what the issue with the calculation is.

CREATE QUERY degree_centrality_res() FOR GRAPH NetworkAnomalyDetection SYNTAX V2 {
  MinAccum @@minDegree;
  MaxAccum @@maxDegree;
  SumAccum @riskScore;

  S = SELECT s FROM SourceIP:s -(HAS_SESSION_EVENT>:e)- SessionEvent:se
      ACCUM @@minDegree += s.degree,
            @@maxDegree += s.degree;
  T = SELECT s FROM SourceIP:s -(HAS_SESSION_EVENT>:e)- SessionEvent:se
      ACCUM
        s.@riskScore += (1 - (s.degree - @@minDegree) / (@@maxDegree - @@minDegree));
  PRINT @@minDegree;
  PRINT @@maxDegree;
  PRINT T[T.@riskScore];
}
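Two things may explain the unexpected scores, assuming the degree attribute and the accumulators are integer typed (the declarations above do not show the types). First, GSQL divides two INT operands as integers, so the fraction truncates to 0 or 1. Second, the ACCUM clause runs once per matched edge, so a SumAccum receives one addition per edge rather than one per vertex. The Python sketch below only imitates that arithmetic on made-up data; it is not GSQL and not a confirmed diagnosis.

```python
# Hypothetical edges (source_ip, session_event): "a" has 2, "b" has 1, "c" has 3.
edges = [("a", "e1"), ("a", "e2"), ("b", "e3"),
         ("c", "e4"), ("c", "e5"), ("c", "e6")]
degree = {"a": 2, "b": 1, "c": 3}  # stored "degree" attribute, assumed INT

lowest, highest = min(degree.values()), max(degree.values())

# Imitates the posted query: one addition per matched edge, integer division.
per_edge_int = {}
for source, _event in edges:
    term = 1 - (degree[source] - lowest) // (highest - lowest)
    per_edge_int[source] = per_edge_int.get(source, 0) + term

# Alternative: one assignment per vertex, floating-point division.
per_vertex_float = {
    source: 1 - (d - lowest) / (highest - lowest)
    for source, d in degree.items()
}
```

If this is indeed the cause, the usual GSQL remedies would be to move the score assignment into a POST-ACCUM clause (which runs once per vertex), to declare @riskScore with a floating-point element type, and to multiply the numerator by 1.0 before dividing. A guard for the case where the maximum equals the minimum would also avoid a division by zero.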

@Szilard_Barany @Parker_Erickson @Renchu_Song @Mohamed_Zrouga @Pawan_Mall @Vladimir_Slesarev