Installing TigerGraph algorithms on Windows to use in GraphStudio - WCC, Jaccard, Louvain, etc.

Mohamed_Zrouga · February 16, 2022, 11:01pm

Hi @AnuroopAjmera,
In the following error I can see a libudf reload failure.

This means that at some point while installing the UDF function you broke your Docker instance. To confirm this, we’d like to see the latest logs from your Docker container (please send them to devrel@tigergraph.com).

Logs are available under /home/tigergraph/logs.

Szilard_Barany · February 16, 2022, 11:38pm

Hi Anuroop,

I am Szilard Barany, a sales engineer/solution architect at TigerGraph EMEA. Since you work for Barclays, and since it’s a British company, I am the “closest” tech help for you.
If you want, you can contact me at szilard.barany@tigergraph.com directly. I am at a company event in the US, but I can possibly set up a Zoom call with you at 12pm (noon) your time. Let me know if that would work for you.

Regards,
Szilard

AnuroopAjmera · February 17, 2022, 2:24am

#ifndef EXPRFUNCTIONS_HPP_
#define EXPRFUNCTIONS_HPP_

#include <stdio.h>
#include <stdlib.h>
#include
#include <gle/engine/cpplib/headers.hpp>
#include
#include
#include
#include
#include <math.h>
#include

/** XXX Warning!! Put self-defined struct in ExprUtil.hpp **

No user defined struct, helper functions (that will not be directly called
in the GQuery scripts) etc. are allowed in this file. This file only
contains user-defined expression function’s signature and body.
Please put user defined structs, helper functions etc. in ExprUtil.hpp
*/
#include "ExprUtil.hpp"

namespace UDIMPL {
  typedef std::string string; //XXX DON'T REMOVE

  /****** BUILT-IN FUNCTIONS **************/
  /****** XXX DON'T REMOVE ****************/
  inline int64_t str_to_int (string str) {
    return atoll(str.c_str());
  }

  inline int64_t float_to_int (float val) {
    return (int64_t) val;
  }

  inline string to_string (double val) {
    char result[200];
    sprintf(result, "%g", val);
    return string(result);
  }

  inline ListAccum<float> tg_extract_list(string weights){
    ListAccum<float> wghts;
    string current_weight;
    std::stringstream s_stream(weights);
    while (s_stream.good()) {
      std::getline(s_stream, current_weight, ',');
      wghts.data_.push_back(std::stof(current_weight));
    }
    return wghts;
  }

  inline float tg_fastrp_rand_func(int64_t v_id, int64_t emb_idx, int64_t seed, int64_t s){
    std::hash<std::string> hasher;
    auto hash = hasher(std::to_string(v_id) + "," + std::to_string(emb_idx) + "," + std::to_string(seed));

    std::mt19937 gen(hash);
    std::uniform_real_distribution<float> distribution(0.0, 1.0);
    float p1 = 0.5 / s, p2 = p1, p3 = 1 - 1.0 / s;
    float v1 = sqrt(s), v2 = -v1, v3 = 0.0;

    float random_value = distribution(gen);
    if (random_value <= p1)
      return v1;
    else if (random_value <= p1 + p2)
      return v2;
    else
      return v3;
  }
}
/****************************************/

#endif /* EXPRFUNCTIONS_HPP_ */
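
For anyone who wants to check the parsing logic of tg_extract_list outside TigerGraph, here is a minimal standalone sketch. It assumes std::vector<float> as a stand-in for ListAccum so that it compiles without the engine headers, and the name extract_list_sketch is hypothetical. It loops on the result of std::getline instead of s_stream.good(), so an empty input yields an empty list rather than a call to std::stof on an empty string.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for tg_extract_list: std::vector<float> replaces
// TigerGraph's ListAccum so the sketch builds without the engine headers.
inline std::vector<float> extract_list_sketch(const std::string& weights) {
    std::vector<float> out;
    std::string token;
    std::stringstream stream(weights);
    // Each successful getline call yields one comma-separated token.
    while (std::getline(stream, token, ',')) {
        out.push_back(std::stof(token));
    }
    return out;
}
```

Calling extract_list_sketch("1,2,0.5") returns a vector with three floats.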

AnuroopAjmera · February 17, 2022, 2:27am

The copy and paste dropped the #include statements for headers without a .h extension; otherwise the list is the same as the one you provided. It was correct earlier as well; there were only some duplicate #include statements, which should not have caused any issues.
The copy and paste also dropped the “<” and “>” symbols in a few places.

I can email you this complete file if you want. Your email ID please?

Thanks

Parker_Erickson · February 17, 2022, 2:58am

parker.erickson@tigergraph.com would be good.

AnuroopAjmera · February 17, 2022, 3:02am

@Mohamed_Zrouga @Parker_Erickson @Vladimir_Slesarev @Dan_Barkus

Great news: I was able to install WCC, Jaccard and Louvain successfully this morning (with the warning below), so the error is now only with fastRP. Thanks to you and the TigerGraph team for the suggestions. Now I need to apply these to my business problem.

GSQL > INSTALL QUERY tg_wcc
Start installing queries, about 1 minute …
tg_wcc query: curl -X GET 'http://127.0.0.1:9000/query/NetworkAnomalyDetection/tg_wcc?v_type=VALUE&e_type=VALUE&[output_limit=VALUE]&[print_accum=VALUE]&[result_attr=VALUE]&[file_path=VALUE]'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.
Select ‘m1’ as compile server, now connecting …
Node ‘m1’ is prepared as compile server.

[========================================================================================================] 100% (1/1)
Query installation finished.
GSQL > INSTALL QUERY tg_jaccard_nbor_ap_batch
Start installing queries, about 1 minute …
tg_jaccard_nbor_ap_batch query: curl -X GET 'http://127.0.0.1:9000/query/NetworkAnomalyDetection/tg_jaccard_nbor_ap_batch?[top_k=VALUE]&v_type=VALUE&feat_v_type=VALUE&e_type=VALUE&re_type=VALUE&similarity_edge=VALUE&[src_batch_num=VALUE]&[nbor_batch_num=VALUE]&[print_accum=VALUE]&[print_limit=VALUE]&[file_path=VALUE]'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.
Select ‘m1’ as compile server, now connecting …
Node ‘m1’ is prepared as compile server.

[========================================================================================================] 100% (1/1)
Query installation finished.
GSQL > INSTALL QUERY tg_louvain
Start installing queries, about 1 minute …
tg_louvain query: curl -X GET 'http://127.0.0.1:9000/query/NetworkAnomalyDetection/tg_louvain?v_type=VALUE&e_type=VALUE&[wt_attr=VALUE]&[max_iter=VALUE]&[result_attr=VALUE]&[file_path=VALUE]&[print_info=VALUE]'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.
Select ‘m1’ as compile server, now connecting …
Node ‘m1’ is prepared as compile server.

[========================================================================================================] 100% (1/1)
Query installation finished.

Louvain warning from GraphStudio:

(72, 23) Warning: The comparison '-t.@max_best_move.weight==t.@sum_cc_weight' may lead to unexpected behavior because it involves equality test between float/double numeric values. We suggest to do such comparison with an error margin, e.g. 'abs((-t.@max_best_move.weight) - (t.@sum_cc_weight)) < epsilon', where epsilon is a very small positive value of your choice, such as 0.0001.
(148, 25) Warning: The comparison '-s.@max_best_move.weight==s.@sum_cc_weight' may lead to unexpected behavior because it involves equality test between float/double numeric values. We suggest to do such comparison with an error margin, e.g. 'abs((-s.@max_best_move.weight) - (s.@sum_cc_weight)) < epsilon', where epsilon is a very small positive value of your choice, such as 0.0001.
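
The error-margin comparison that the warning suggests can be sketched in C++ as below. This is only an illustration of the idea, not code from the tg_louvain query; the helper name nearly_equal is hypothetical, and the default epsilon of 0.0001 is simply the example value from the warning text.

```cpp
#include <cmath>

// Hypothetical helper: true when a and b differ by less than epsilon.
// Exact == on floating-point values can fail after rounding, for example
// 0.1 + 0.2 == 0.3 is false in IEEE 754 double arithmetic.
inline bool nearly_equal(double a, double b, double epsilon = 0.0001) {
    return std::fabs(a - b) < epsilon;
}
```

With this helper, nearly_equal(0.1 + 0.2, 0.3) is true even though the exact comparison is not. A suitable epsilon depends on the scale of the weights being compared.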

AnuroopAjmera · February 17, 2022, 3:04am

@Parker_Erickson @Vladimir_Slesarev @Szilard_Barany
Strange Error as shared earlier -

GSQL > @tg_fastRP.gsql

Semantic Check Error in query tg_fastRP (SEM-45): line 34, col 13
The tuple name or the function tg_extract_list is not defined.
Failed to create queries: [tg_fastRP].

AnuroopAjmera · February 17, 2022, 3:03pm

@Szilard_Barany are you joining the Zoom call you scheduled?

Regards
Anuroop

AnuroopAjmera · February 22, 2022, 8:20am

tg_jaccard_nbor_ap_batch is not returning any results, whereas tg_jaccard_nbor_ss works fine. However, I need to run it for all nodes at once.

I cannot find the code for tg_jaccard_nbor_ap.gsql. The algorithm page is broken and returns a 404 error.

Could someone please help?

@Mohamed_Zrouga @Dan_Barkus @Parker_Erickson @Szilard_Barany @Vladimir_Slesarev @Jon_Herke

Jon_Herke · February 22, 2022, 2:07pm

Just verifying: is this the link you’re referring to, https://github.com/tigergraph/gsql-graph-algorithms/tree/master/algorithms/Similarity/jaccard, and did you get a 404 (page doesn’t exist) when clicking on tg_jaccard_nbor_ap.gsql? I can confirm that I get the same error.

Forwarding this on to the Graph Data Science team to get their input on this thread.

AnuroopAjmera · February 23, 2022, 2:41am

Yes, this is the link

AnuroopAjmera · February 28, 2022, 3:57pm

The tg_wcc algorithm is not working in my case. The weakly connected components have been assigned different community/result attribute values. There could be something wrong at my end, but I am not able to figure it out.

Could anyone please help?

I am available for a Zoom call today as well as tomorrow, but please treat this as top priority.

My email ID: anuroop.ajmera@gmail.com

@Parker_Erickson @Mohamed_Zrouga @Jon_Herke @Szilard_Barany @Dan_Barkus @Vladimir_Slesarev @Bruno

Jon_Herke · February 28, 2022, 7:55pm

@AnuroopAjmera “tg_wcc algorithm is not working in my case”

Can you elaborate on what isn’t working as expected? What steps did you take? What errors are you receiving?

AnuroopAjmera · March 1, 2022, 9:17am

Hi Jon,
It’s assigning different community IDs to nodes that should be in the same community according to the source data. I checked this using a pivot table in Excel.

It looks like something is wrong at my end.

Could someone please help as a priority? I am available to connect over a meeting (Zoom call?) - anuroop.ajmera@gmail.com

Regards
Anuroop Ajmera

@Jon_Herke @Parker_Erickson @Mohamed_Zrouga @Szilard_Barany @Dan_Barkus @Vladimir_Slesarev @Bruno

AnuroopAjmera · March 2, 2022, 4:38pm

The WCC algorithm is not working in my case because, unlike Neo4j, this algorithm in TigerGraph expects nodes to be similar, whereas in my case I want to find communities of nodes based on other nodes (properties).
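
As a point of reference, weakly connected components in the textbook sense can be sketched with a union-find structure, as below. This is a generic C++ illustration, not the tg_wcc implementation, and every name in it is hypothetical. Under this definition, two vertices end up in the same component whenever any path joins them with edge direction ignored, including a path through a vertex of another kind, provided those intermediate vertices and edges are part of the input.

```cpp
#include <algorithm>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical union-find helper: returns the representative of v's set.
inline int find_root(std::vector<int>& parent, int v) {
    while (parent[v] != v) {
        parent[v] = parent[parent[v]];  // path halving keeps trees shallow
        v = parent[v];
    }
    return v;
}

// Hypothetical sketch: labels vertices 0..n-1 with a component id.
// Edge direction is ignored, which is what "weakly" connected means.
inline std::vector<int> weakly_connected_components(
        int n, const std::vector<std::pair<int, int>>& edges) {
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);  // each vertex starts alone
    for (const auto& e : edges) {
        int a = find_root(parent, e.first);
        int b = find_root(parent, e.second);
        if (a != b) parent[std::max(a, b)] = std::min(a, b);  // smaller id becomes root
    }
    std::vector<int> label(n);
    for (int v = 0; v < n; ++v) label[v] = find_root(parent, v);
    return label;
}
```

For example, with vertices 0 and 1 standing for two source addresses, vertex 2 for a session event they share, and vertex 3 unconnected, the call weakly_connected_components(4, {{0, 2}, {1, 2}}) puts 0, 1 and 2 in one component and 3 in its own.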

@Jon_Herke @Parker_Erickson @Mohamed_Zrouga @Szilard_Barany @Dan_Barkus @Vladimir_Slesarev @Bruno

AnuroopAjmera · March 6, 2022, 10:28am

The current conclusions on my TigerGraph DS project are given below. Work is still in progress, so the final conclusions will be shared later.

1. tg_wcc is not working:
   The WCC algorithm is not working in my case because, unlike Neo4j, this algorithm in TigerGraph expects nodes to be similar, whereas in my case I want to find communities of nodes based on other nodes (properties).
2. tg_jaccard_nbor_ap_batch has a code issue:
   The code has a bug, which is now resolved locally with the help of @Szilard_Barany.
3. tg_louvain is not showing visualizations:
   Louvain is also not showing any visualizations of communities like those shown in the TigerGraph videos. The “Explore Graph” section shows some nodes having a common cid, but Louvain’s output is not showing the same.
4. Absence of a weighted degree centrality algorithm in TigerGraph:
   Details below -
   Weighted degree centrality algorithm
5. Not able to load a 30 MB file:
   Details below:
   Data load getting stuck for 16 MB file
6. Changes made in GSQL are not reflected in TigerGraph GraphStudio:
   Details below:
   How to get GSQL changes reflected in GraphStudio

@Jon_Herke @Parker_Erickson @Mohamed_Zrouga @Szilard_Barany @Dan_Barkus @Vladimir_Slesarev @Bruno @Pawan_Mall

AnuroopAjmera · March 10, 2022, 9:47am

I want to apply min-max normalization, but I get an error when computing max() - min() in the query below. Can someone please tell me what the issue is here?

GSQL > BEGIN
GSQL > CREATE QUERY degree_cent_res() FOR GRAPH NetworkAnomalyDetection SYNTAX v2 {
GSQL > SELECT s.SourceAddress, (max(s.degree) - min(s.degree)) as risk_score INTO T
GSQL > FROM SourceIP:s -(HAS_SESSION_EVENT>:e)- SessionEvent:se
GSQL > GROUP BY s.SourceAddress
GSQL > ORDER BY risk_score DESC
GSQL > LIMIT 100;
GSQL > PRINT T;
GSQL > }
GSQL > END
Index -1 out of bounds for length 2
Failed to create queries: [degree_cent_res].

@Szilard_Barany @Dan_Barkus @Parker_Erickson @Jon_Herke @Renchu_Song @Mohamed_Zrouga @Bruno @Pawan_Mall @Vladimir_Slesarev @markmegerian

AnuroopAjmera · March 10, 2022, 5:37pm

Can anyone please advise?

@Szilard_Barany @Dan_Barkus @Parker_Erickson @Jon_Herke @Renchu_Song @Mohamed_Zrouga @Bruno @Pawan_Mall @Vladimir_Slesarev @markmegerian

markmegerian · March 10, 2022, 7:57pm

I have a few observations and questions for you:

1. The SQL-like syntax is not appropriate in this case; you should use the regular graph traversal syntax. The main reason is that you can only include base attributes in the SELECT list, and you are trying to use the degree function.
2. The degree function has parentheses after it: degree().
3. Will you have multiple SourceIP vertices with the same SourceAddress value? If so, you can summarize using a MapAccum, like this:


  MapAccum<STRING, MinAccum<INT>> @@minDegree;
  MapAccum<STRING, MaxAccum<INT>> @@maxDegree;

  S = SELECT s FROM SourceIP:s -(HAS_SESSION_EVENT>:e)- SessionEvent:se
      ACCUM @@minDegree += (s.SourceAddress -> s.degree()),
            @@maxDegree += (s.SourceAddress -> s.degree());

If not (i.e. each SourceIP has a unique SourceAddress), then it’s even simpler: you can just use vertex-attached accumulators, like this:

  MinAccum<INT> @minDegree;
  MaxAccum<INT> @maxDegree, @riskScore;
  
  S = SELECT s FROM SourceIP:s -(HAS_SESSION_EVENT>:e)-  SessionEvent:se 
            ACCUM s.@minDegree += s.degree(),
                          s.@maxDegree += s.degree()
           POST-ACCUM  s.@riskScore += s.@maxDegree - s.@minDegree;

AnuroopAjmera · March 11, 2022, 9:39am

@markmegerian
Here I do not intend to use the degree function. I have added an attribute named “degree” to the vertex and stored the output of the degree centrality algorithm in that attribute.
Now I want to apply min-max normalisation, and I am facing the index-out-of-bounds issue mentioned above.

In this case, can’t I use the simple SQL syntax? Kindly advise.
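
For clarity on the target calculation, min-max normalisation as it is usually defined rescales each value x to (x - min) / (max - min), using the minimum and maximum over the whole set of values. A minimal C++ sketch of that arithmetic follows. It is not GSQL and does not address the query error; the function name and the plain vector of degree values are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: rescales values to [0, 1] via (x - min) / (max - min).
inline std::vector<double> min_max_normalize(const std::vector<double>& values) {
    std::vector<double> out;
    if (values.empty()) return out;
    const auto mm = std::minmax_element(values.begin(), values.end());
    const double min_v = *mm.first;
    const double range = *mm.second - min_v;
    out.reserve(values.size());
    for (double v : values) {
        // If every value is equal the range is zero, so emit 0.0
        // rather than dividing by zero.
        out.push_back(range == 0.0 ? 0.0 : (v - min_v) / range);
    }
    return out;
}
```

For degree values 2, 4 and 6, this yields 0, 0.5 and 1.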

@Szilard_Barany @Dan_Barkus @Parker_Erickson @Jon_Herke @Renchu_Song @Mohamed_Zrouga @Bruno @Pawan_Mall @Vladimir_Slesarev @markmegerian