High Availability questions for TigerGraph

I’ve taken a look at the documentation about HA and scalability as outlined here: HA Cluster Configuration :: Docs.

  1. Does TigerGraph manage all writes on the Master like Neo4j? If not, how do writes occur, and is consistency strong or eventual?

  2. If a Master goes down, does this mean a local group is down?

  3. How long does a failure take to recover?

  4. Is the client library able to recover from a failure / re-route to a working master?

  5. How does it fare against something like YugabyteDB or Spanner, especially with respect to scale on a global basis?

  6. If there is geo-scalability, does the election process occur across regions? E.g., if there are 3 regions (UK, US, Asia), is there only 1 master globally, with elections happening between nodes across those regions?

Many thanks

Hi Paul, welcome to the TigerGraph Community! I’ll do my best to answer your questions; please let me know if I’ve missed anything or if you’d like clarification on any point.

  1. Reads and writes can be sent to any node in a TigerGraph cluster; there is no Leader/Follower concept in the way Neo4j has one (disclaimer: not a Neo4j expert, just did a few Google searches). Writes are sent to RESTPP, which forwards the request to Kafka; TigerGraph uses Kafka as a write-ahead log, and the GPE consumes the writes from it. TigerGraph guarantees strong consistency (see the first sketch after this list for what a write looks like over HTTP):
    Transaction Processing and ACID Support :: Docs

  2. If any node goes down in an HA cluster, the cluster will continue to operate.

  3. TigerGraph uses ZooKeeper to track which nodes are alive. There is a 30-second heartbeat, so it may take up to 30 seconds for the system to recognize that a node is dead.

  4. N/A - though I’d note that we recommend putting a load balancer in front of the cluster to distribute requests across each TigerGraph node’s RESTPP (a client-side failover sketch appears after this list, second block).

  5. With Cross-Region Replication (CRR) you’ll be able to replicate a TigerGraph cluster to other regions globally. This does come with a caveat that these replicas are read-only. Between support for ACID transactions, HA, transactional updates, and automatic partitioning of data, TigerGraph offers many of the features that GCP Spanner does (though of course Spanner is a distributed SQL database and TigerGraph is a graph database). A read/write routing sketch appears after this list, third block.

  6. Additional CRR clusters can be provisioned regionally. Failover to another cluster is a manual operation: the primary cluster is determined by the setup, rather than through an election process.
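
To make answer 1 concrete, here’s a minimal sketch of an upsert sent through RESTPP over HTTP (Python with `requests`; the hostname, graph name `Social`, vertex type `Person`, and attribute `name` are assumptions for the example, not part of your setup):

```python
import requests

# Any node in the cluster can accept this request: RESTPP forwards it
# to Kafka (the write-ahead log), and the GPE then applies the write.
# Assumed schema: graph "Social", vertex type "Person" with a "name" attribute.
payload = {
    "vertices": {
        "Person": {
            "paul": {"name": {"value": "Paul"}}
        }
    }
}

resp = requests.post("http://tg-node-1:9000/graph/Social", json=payload)
resp.raise_for_status()
print(resp.json())  # RESTPP reports how many vertices/edges were upserted
```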
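
For answer 4, if a load balancer isn’t an option, a client can approximate the same behavior by retrying across nodes itself. A rough sketch (the hostnames are placeholders; `/echo` is RESTPP’s simple health-check endpoint):

```python
import requests

# Placeholder hostnames; in practice these would come from your
# deployment's configuration or service discovery.
NODES = ["tg-node-1", "tg-node-2", "tg-node-3"]

def get_with_failover(path, timeout=5):
    """Try each node's RESTPP (port 9000) in turn until one responds."""
    last_err = None
    for host in NODES:
        try:
            resp = requests.get(f"http://{host}:9000{path}", timeout=timeout)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException as err:
            last_err = err  # node may be down or unreachable; try the next
    raise RuntimeError(f"all nodes unreachable: {last_err}")

print(get_with_failover("/echo"))
```

A load balancer with health checks does the same thing more robustly, which is why it’s the recommended setup.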
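
And for answer 5, since CRR replicas are read-only, application code typically routes writes to the primary cluster and reads to the nearest replica. A sketch under that assumption (all region endpoints here are hypothetical):

```python
import requests

# Hypothetical endpoints: one writable primary cluster plus read-only
# CRR replicas in other regions.
PRIMARY = "http://tg-primary.example.com:9000"
REPLICAS = {
    "uk":   "http://tg-uk.example.com:9000",
    "asia": "http://tg-asia.example.com:9000",
}

def write(path, payload):
    # All mutations go to the primary cluster; CRR replicas are read-only.
    resp = requests.post(f"{PRIMARY}{path}", json=payload)
    resp.raise_for_status()
    return resp.json()

def read(path, region=None):
    # Reads can be served from the regional replica for lower latency.
    base = REPLICAS.get(region, PRIMARY)
    resp = requests.get(f"{base}{path}")
    resp.raise_for_status()
    return resp.json()
```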

Let me know if I can help clarify anywhere.
