Partition policy

Hello, I would like to ask how TigerGraph partitions data. Below is the current partition information reported by gstatusgraph. I expected the data to be uniformly distributed, but it does not seem to be. I also wonder whether I can change TigerGraph’s partition policy. Does TigerGraph dynamically re-partition as the graph size increases?

Thanks!

[GRAPH  ] Graph was loaded (/home/tigergraph/tigergraph/data/gstore):
[m1     ] Partition size: 136MiB, IDS size: 7.7MiB, Vertex count: 894077, Edge count: 9980201, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m2     ] Partition size: 135MiB, IDS size: 7.7MiB, Vertex count: 895069, Edge count: 9757749, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m3     ] Partition size: 136MiB, IDS size: 7.7MiB, Vertex count: 894669, Edge count: 10059032, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m4     ] Partition size: 136MiB, IDS size: 7.7MiB, Vertex count: 895174, Edge count: 10037188, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m5     ] Partition size: 135MiB, IDS size: 7.8MiB, Vertex count: 895804, Edge count: 9946796, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m6     ] Partition size: 136MiB, IDS size: 7.8MiB, Vertex count: 894804, Edge count: 10025650, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m7     ] Partition size: 136MiB, IDS size: 6.7MiB, Vertex count: 896048, Edge count: 9972395, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m8     ] Partition size: 137MiB, IDS size: 6.7MiB, Vertex count: 895387, Edge count: 10109897, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m9     ] Partition size: 136MiB, IDS size: 7.7MiB, Vertex count: 896311, Edge count: 10046508, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m10    ] Partition size: 136MiB, IDS size: 6.7MiB, Vertex count: 895358, Edge count: 9917336, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m11    ] Partition size: 136MiB, IDS size: 6.7MiB, Vertex count: 894091, Edge count: 9954003, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m12    ] Partition size: 135MiB, IDS size: 7.8MiB, Vertex count: 895379, Edge count: 9733248, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m13    ] Partition size: 136MiB, IDS size: 6.7MiB, Vertex count: 896432, Edge count: 10004332, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m14    ] Partition size: 135MiB, IDS size: 7.8MiB, Vertex count: 895148, Edge count: 9898212, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m15    ] Partition size: 68MiB, IDS size: 3.9MiB, Vertex count: 446459, Edge count: 5014919, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m16    ] Partition size: 68MiB, IDS size: 3.4MiB, Vertex count: 447499, Edge count: 4950016, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m17    ] Partition size: 68MiB, IDS size: 3.9MiB, Vertex count: 446910, Edge count: 4913556, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m18    ] Partition size: 68MiB, IDS size: 3.9MiB, Vertex count: 447486, Edge count: 4864845, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[WARN   ] Above vertex and edge counts are for internal use which show approximate topology size of the local graph partition. Use DML to get the correct graph topology information

@hdy TigerGraph partitions data based on segments and vertex type. Your partition information shows that you have 18 nodes. This configuration can lead to data skew because TigerGraph performs best when the number of partitions is a power of 2 (4, 8, 16, and so on). Clusters that do not follow this pattern can end up with skewed data.
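To make the skew concrete, here is a rough Python sketch that parses a few lines of the gstatusgraph output posted above and compares the counts on m1 and m15. The second half is only a toy model, not documented TigerGraph behaviour: it assumes, purely for illustration, that data is hashed into a fixed power-of-2 number of buckets (32 here) that are dealt out to the 18 nodes round-robin. The names `parse_counts` and `buckets_per_node` are made up for this example.

```python
import re
from collections import Counter

# A few lines copied from the gstatusgraph output in the question.
SAMPLE = """\
[m1     ] Partition size: 136MiB, IDS size: 7.7MiB, Vertex count: 894077, Edge count: 9980201, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m14    ] Partition size: 135MiB, IDS size: 7.8MiB, Vertex count: 895148, Edge count: 9898212, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m15    ] Partition size: 68MiB, IDS size: 3.9MiB, Vertex count: 446459, Edge count: 5014919, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
[m18    ] Partition size: 68MiB, IDS size: 3.9MiB, Vertex count: 447486, Edge count: 4864845, NumOfDeletedVertices: 0 NumOfSkippedVertices: 0
"""

# Matches the node label plus the vertex and edge counts on one line.
LINE_RE = re.compile(r"\[(m\d+)\s*\].*?Vertex count: (\d+), Edge count: (\d+)")


def parse_counts(text):
    """Return {node: (vertex_count, edge_count)} for each per-node line."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)))
        for m in LINE_RE.finditer(text)
    }


def buckets_per_node(num_buckets, num_nodes):
    """Toy model (an assumption): deal fixed buckets to nodes round-robin."""
    return Counter(bucket % num_nodes for bucket in range(num_buckets))


counts = parse_counts(SAMPLE)
ratio = counts["m1"][0] / counts["m15"][0]
print(f"m1 / m15 vertex ratio: {ratio:.2f}")

# How many nodes end up holding 1 bucket versus 2 buckets in the toy model.
model = buckets_per_node(32, 18)
print(sorted(Counter(model.values()).items()))
```

In this toy model 14 nodes hold two buckets and 4 nodes hold one, which has the same shape as the output above, where m1 to m14 carry roughly twice the data of m15 to m18. That is only a plausible reading of the numbers; the actual placement logic may differ.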

@Jon_Herke Thank you for your response! It’s good to know. Could you elaborate on segments and vertex type?

Hi @hdy ,

You can watch the following video from 21:57 to 24:31 to learn more about what segments are:

Within each partition, TigerGraph splits the data into multiple segments.

I hope this helps!

Best,
Supawish Limprasert (Jim)
Solution Engineer, TigerGraph

@Jim_Limprasert Thank you!
