According to the docs (https://docs.tigergraph.com/gsql-ref/3.3/ddl-and-loading/creating-a-loading-job#_vertex_must_exist_parameter), VERTEX_MUST_EXIST disables on-the-fly creation of missing endpoint vertices while loading edges.
I am seeing strange behavior in TigerGraph (I am using version 3.1.6).
When I try to create the following loading job, I get some errors:
CREATE LOADING JOB test {
  DEFINE FILENAME file1;
  DEFINE FILENAME file2;
  LOAD file1
    TO VERTEX Level1 VALUES ($0, $1) WHERE $2 == "1",
    TO VERTEX Level2 VALUES ($0, $1) WHERE $2 == "2",
    TO VERTEX Level3 VALUES ($0, $1) WHERE $2 == "3"
    USING SEPARATOR="|", HEADER="false", EOL="\n";
  LOAD file2
    TO EDGE PART_OF VALUES($0 Level2, $1 Level1),
    TO EDGE PART_OF VALUES($0 Level3, $1 Level2)
    USING SEPARATOR="|", HEADER="false", EOL="\n", VERTEX_MUST_EXIST="true";
}
Here is the error message:
tigergraph@m1:~/kb/loading-test$ gsql --graph LoadingTest test.gsql
Semantic Check Fails: The USING clause for the same file path "null" should be the same. However in Job 'test' one block has USING clause as "{EOL=\n, SEPARATOR=|, HEADER=false, VERTEX_MUST_EXIST=true}", while another block has USING clause as "{EOL=\n, SEPARATOR=|, HEADER=false}".
Semantic Check Fails: The file file2 has different configs!
If you need different configs for one file, please use symbolic link.
The job test could not be created!
If I move each LOAD statement into its own file and job definition, both loading jobs are created without errors.
Please let me know if this is a bug. I couldn't find any mention of this restriction in the docs.
Maybe the GSQL interpreter is confused (note this fragment of the error message: the same file path "null").
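A minimal sketch of that split, assuming the two jobs are named test_vertices and test_edges (hypothetical names chosen for illustration; the LOAD statements themselves are unchanged from the job above):

```gsql
# Job 1: vertices only. The job name is a placeholder.
CREATE LOADING JOB test_vertices {
  DEFINE FILENAME file1;
  LOAD file1
    TO VERTEX Level1 VALUES ($0, $1) WHERE $2 == "1",
    TO VERTEX Level2 VALUES ($0, $1) WHERE $2 == "2",
    TO VERTEX Level3 VALUES ($0, $1) WHERE $2 == "3"
    USING SEPARATOR="|", HEADER="false", EOL="\n";
}

# Job 2: edges only, with VERTEX_MUST_EXIST enabled. The job name is a placeholder.
CREATE LOADING JOB test_edges {
  DEFINE FILENAME file2;
  LOAD file2
    TO EDGE PART_OF VALUES($0 Level2, $1 Level1),
    TO EDGE PART_OF VALUES($0 Level3, $1 Level2)
    USING SEPARATOR="|", HEADER="false", EOL="\n", VERTEX_MUST_EXIST="true";
}
```

With this split, the vertex job presumably has to run before the edge job, since VERTEX_MUST_EXIST="true" should skip any edge whose endpoint vertices have not been loaded yet.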
Regards,
Karol
If someone wants to reproduce the problem, here is the simple schema I am using:
CREATE VERTEX Level1(PRIMARY_ID id STRING, name STRING) WITH STATS="OUTDEGREE_BY_EDGETYPE"
CREATE VERTEX Level2(PRIMARY_ID id STRING, name STRING) WITH STATS="OUTDEGREE_BY_EDGETYPE"
CREATE VERTEX Level3(PRIMARY_ID id STRING, name STRING) WITH STATS="OUTDEGREE_BY_EDGETYPE"
CREATE DIRECTED EDGE PART_OF(FROM Level2, TO Level1|FROM Level3, TO Level2)
CREATE GRAPH LoadingTest(Level1, Level2, Level3, PART_OF)
USE GRAPH LoadingTest