Hello -
I have a working script that loads data from S3 into our graph using GSQL.
I want to be able to run it from Databricks using the connection method conn.uploadFile, but it doesn’t seem to be working.
Here are my steps:
Step 1: create the loading job string
load_job = f'''
use graph {graph}
drop job {test_load_vertices}
drop data_source {data_source}
create data_source S3 {data_source} for graph {graph}
set {data_source} = "/home/ubuntu/s3.config"
CREATE LOADING JOB {test_load_vertices} FOR GRAPH {graph} {{
DEFINE FILENAME MyDataSource;
LOAD MyDataSource to VERTEX identity VALUES($"id", $"name", $"countries") USING JSON_FILE = "true";}}
'''
Step 2: create the job by running the string through conn.gsql
conn.gsql(load_job)
The above runs successfully.
Step 3: use the conn.uploadFile() function.
conn.uploadFile(filePath="s3://path/to/test_file.parquet", fileTag='MyDataSource', jobName=test_load_vertices, timeout=600000)
This doesn’t produce any output, and nothing happens on the Load Data page in GraphStudio.
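One check that might narrow this down, as a rough sketch only: I am assuming (not confirmed) that conn.uploadFile reads filePath from the local filesystem of the machine running the notebook and does not resolve s3:// URIs. If that is the case, the call could fail quietly for an S3 path. The helper names is_local_file and upload_if_local below are hypothetical and are not part of pyTigerGraph; conn is the existing connection object from the steps above.

```python
import os


def is_local_file(path):
    """Return True only if `path` names an existing file on this machine."""
    # Anything written as a URI (s3://, https://, dbfs:// ...) is not a plain local path.
    if "://" in path:
        return False
    return os.path.isfile(path)


def upload_if_local(conn, file_path, file_tag, job_name, timeout=600000):
    """Call conn.uploadFile only for a readable local file and show what it returns.

    Assumption: uploadFile opens `file_path` locally; this is a guess to be verified.
    """
    if not is_local_file(file_path):
        raise ValueError(f"{file_path!r} is not a local file on this machine")
    result = conn.uploadFile(
        filePath=file_path,
        fileTag=file_tag,
        jobName=job_name,
        timeout=timeout,
    )
    # Printing the return value makes a silent None visible in the notebook.
    print(result)
    return result
```

With the path from Step 3, upload_if_local(conn, "s3://path/to/test_file.parquet", "MyDataSource", test_load_vertices) would raise ValueError before reaching the server, which would at least show whether the S3 path is the reason for the silent result.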