I’m running a TigerGraph DB instance on a single VM, and it started acting up a few days ago. Most of the services (everything except ADMIN, CTRL, ETCD, EXE, IMF, KAFKA, and ZK) go down at the same time every day. I’ve copied a portion of the GPE service log below. To the best of my knowledge, we are not actively running any queries when these services crash. I’ve set up a cron job to restart the DB after each crash, but I’d like to know if anyone has insight into why this might be happening.
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0610 22:32:47.073527 27432 zookeeper_context.cpp:254] ZooKeeper Error code connection loss. Cannot read path /tigergraph/dict/objects/__services/DICT/addresses/DICT
E0610 22:32:47.073699 27432 address_resolver.cpp:88] AddressResolver cannot resolve path:/tigergraph/dict/objects/__services/DICT/addresses/DICT, rc:kZkError
E0610 22:32:47.601683 27962 gbrain_active_address_resolver.cpp:124] [RefreshGSE] QueryLeader failed. error code: 14, error msg: Socket closed
E0610 22:32:47.602030 27962 gbrain_active_address_resolver.cpp:151] [RefreshGSE] failed to get leaders for all GSE partitions.
E0610 22:32:48.573808 27432 heartbeat_client.cpp:451] CLIENT: Cannot set up client session: can not resolve server , rc: kZkError, retried: 0
E0610 22:32:48.574110 27432 zookeeper_context.cpp:254] ZooKeeper Error code connection loss. Cannot read path /tigergraph/dict/objects/__services/DICT/addresses/DICT
E0610 22:32:48.574146 27432 address_resolver.cpp:88] AddressResolver cannot resolve path:/tigergraph/dict/objects/__services/DICT/addresses/DICT, rc:kZkError
E0610 22:32:50.074249 27432 heartbeat_client.cpp:451] CLIENT: Cannot set up client session: can not resolve server , rc: kZkError, retried: 1
E0610 22:32:50.074615 27432 zookeeper_context.cpp:254] ZooKeeper Error code connection loss. Cannot read path /tigergraph/dict/objects/__services/DICT/addresses/DICT
E0610 22:32:50.074651 27432 address_resolver.cpp:88] AddressResolver cannot resolve path:/tigergraph/dict/objects/__services/DICT/addresses/DICT, rc:kZkError
E0610 22:32:51.574764 27432 heartbeat_client.cpp:451] CLIENT: Cannot set up client session: can not resolve server , rc: kZkError, retried: 2
E0610 22:32:51.575121 27432 zookeeper_context.cpp:254] ZooKeeper Error code connection loss. Cannot read path /tigergraph/dict/objects/__services/DICT/addresses/DICT
E0610 22:32:51.575156 27432 address_resolver.cpp:88] AddressResolver cannot resolve path:/tigergraph/dict/objects/__services/DICT/addresses/DICT, rc:kZkError
E0610 22:32:52.601850 27962 gbrain_active_address_resolver.cpp:124] [RefreshGSE] QueryLeader failed. error code: 14, error msg: failed to connect to all addresses
E0610 22:32:52.601956 27962 gbrain_active_address_resolver.cpp:151] [RefreshGSE] failed to get leaders for all GSE partitions.
...
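
In case it helps anyone line the crash time up against scheduled jobs on the VM, here is a rough Python sketch for pulling the first error and a per-file error count out of a log like the one above. The regex is built only from the "Log line format" header shown in the log; the function names `parse_line` and `summarize` are my own, not anything TigerGraph ships, and I'm assuming every line of interest follows that header format.

```python
import re
from collections import Counter

# Pattern derived from the log header:
#   [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
LINE_RE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def summarize(lines):
    """Return (first error record or None, Counter of error lines per source file)."""
    records = [r for r in map(parse_line, lines) if r]
    # E = error, F = fatal; I and W are ignored here.
    errors = [r for r in records if r["level"] in "EF"]
    first = errors[0] if errors else None
    by_file = Counter(r["file"] for r in errors)
    return first, by_file
```

Calling `summarize(open(path))` on the GPE log (with `path` pointing at wherever that log lives on your install) gives the timestamp of the first error, which can then be compared with crontab entries, backup windows, or anything else that runs daily around that time.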