Data and processing distribution across nodes in Enterprise edition

jimwu · November 20, 2020, 6:02pm

I installed the Enterprise version on a cluster with 4 machines with high availability set to off. I have a couple of questions regarding how data and processing are distributed across the machines in the cluster:

When I create a graph, is the graph data stored only on the leader node, or is it stored on every machine?
When I run a query, will it automatically be distributed to all machines for processing?

Cheers,
Jim

Mingxi_Wu · November 23, 2020, 6:55am

No. The graph data is partitioned across the cluster of nodes (every machine).
If you use the "distributed" keyword in your query, such as "create distributed query test()", all the nodes of the cluster will participate in the query processing.

I recommend this video, which contains an architecture overview.
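
To make the idea of partitioned data and distributed processing concrete, here is a minimal conceptual sketch in Python. It is not TigerGraph code and does not reflect how TigerGraph actually assigns vertices to machines; the hash-based scheme and the names machine_for, partition, and distributed_count are assumptions made purely for illustration. It shows each vertex living on exactly one of 4 machines, and a query in which every machine works on its own partition before the partial results are combined.

```python
from zlib import crc32

NUM_MACHINES = 4  # matches the 4-machine cluster in the question


def machine_for(vertex_id: str) -> int:
    """Pick a machine by hashing the vertex ID (hypothetical scheme)."""
    return crc32(vertex_id.encode("utf-8")) % NUM_MACHINES


def partition(vertex_ids):
    """Split vertex IDs into one list per machine; no vertex is duplicated."""
    parts = [[] for _ in range(NUM_MACHINES)]
    for vid in vertex_ids:
        parts[machine_for(vid)].append(vid)
    return parts


def distributed_count(parts):
    """Each machine counts its own partition, then the partial counts are summed."""
    partial_counts = [len(p) for p in parts]  # work done locally per machine
    return sum(partial_counts)                # results gathered and combined


vertices = [f"v{i}" for i in range(1000)]
parts = partition(vertices)
total = distributed_count(parts)
print(total)
```

In this toy model no single machine holds the whole graph, and the count only comes out right because every partition takes part in the query.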

jimwu · November 26, 2020, 7:07pm

Thanks @Mingxi_Wu!

Some related questions on the distributed query: Is it possible to remove a node from the pool? Also, is it possible to change the priority of certain nodes so that nodes with higher priority get assigned to a query first?

Mingxi_Wu · February 22, 2021, 6:41am

Not supported yet, but it will be soon.