Unable to finish loading, test case: https://snap.stanford.edu/data/bigdata/communities/com-friendster.ungraph.txt.gz

Hello,

I’m trying to evaluate TigerGraph 2.4.0 Developer Edition. I tried both the VirtualBox and Docker images (Windows 10 host, 16 GB RAM, Core i7-4800MQ, SSD), using the schema “person->friend->person” and loading the file https://snap.stanford.edu/data/bigdata/communities/com-friendster.ungraph.txt.gz (unzipped, with the first 3 header lines removed).
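
For reference, the preprocessing was roughly the following (just a sketch; I drop every line starting with ‘#’ rather than counting header lines, since SNAP files mark their headers that way):

wget https://snap.stanford.edu/data/bigdata/communities/com-friendster.ungraph.txt.gz
gunzip com-friendster.ungraph.txt.gz
# drop the '#' comment/header lines so only the tab-separated edge pairs remain
sed -i '/^#/d' com-friendster.ungraph.txt
# sanity check: roughly 1.8 billion edge lines are expected
wc -l com-friendster.ungraph.txt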

I’m unable to finish the loading in either case; the GPE keeps stopping. Do you have any hints?

The logs are attached; the oldest is from the TigerGraph 2.4.0 VirtualBox image and the most recent is from the Docker version.

Thanks,

Thomas

Hi Thomas,

How big is your system memory? (free -g)

Could you please see if the GPE is down due to OOM? dmesg | grep OOM
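
For example (the exact kernel message wording varies, so a case-insensitive match over both spellings is safer; just a sketch):

# search the kernel log for OOM-killer activity (case-insensitive)
dmesg | grep -iE 'out of memory|oom'
# the killed process and its memory usage are usually listed right after
# the "Out of memory: Kill process ..." line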

Thanks.

Hello,

That seems to be the right thing to investigate!

I’m at about 35% of the data file (653,427,365 edges so far) and memory is starting to run low (about 1 GB available out of the 11 GB allocated to the container):

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 9.5G 169M 436K 2.1G 1.9G
Swap: 1.0G 157M 866M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 10G 274M 428K 1.2G 1.1G
Swap: 1.0G 165M 858M

I will post the dmesg output once the container is running with --privileged.
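
For reference, I start the container roughly like this (the image name is a placeholder for whichever TigerGraph dev image you pulled; the memory cap and ports are just my local setup):

# placeholder image name; memory cap and ports are my local setup
# --privileged so dmesg can read the kernel log from inside the container
# -m caps the container at the ~11 GB shown by free above
docker run -d --privileged -m 11g \
    -p 14240:14240 -p 9000:9000 \
    --name tigergraph YOUR_TIGERGRAPH_DEV_IMAGE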

OK, it finally hit OOM at around 42% of the data file (787,430,095 edges).

The OOM output from dmesg is attached, if useful.

Thanks,

Thomas
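
Below are the free -gh snapshots showing memory draining as the load progressed; a simple loop like this would capture the same trend automatically (just a convenience, nothing TigerGraph-specific):

# append a timestamped memory snapshot to a log file every 60 seconds
while true; do date >> mem.log; free -gh >> mem.log; sleep 60; done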

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 10G 281M 412K 588M 581M
Swap: 1.0G 349M 674M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 10G 221M 412K 655M 589M
Swap: 1.0G 400M 623M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 10G 132M 412K 737M 579M
Swap: 1.0G 505M 518M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 10G 223M 412K 573M 506M
Swap: 1.0G 505M 518M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 11G 139M 336K 469M 323M
Swap: 1.0G 518M 505M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 10G 131M 308K 607M 451M
Swap: 1.0G 530M 493M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 10G 265M 240K 459M 439M
Swap: 1.0G 611M 412M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 11G 136M 220K 409M 260M
Swap: 1.0G 651M 372M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 11G 166M 156K 414M 295M
Swap: 1.0G 683M 340M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 11G 299M 108K 222M 244M
Swap: 1.0G 716M 307M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 11G 243M 96K 150M 151M
Swap: 1.0G 722M 301M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 11G 129M 40K 215M 70M
Swap: 1.0G 793M 230M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 11G 272M 40K 127M 169M
Swap: 1.0G 805M 218M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 11G 132M 40K 306M 157M
Swap: 1.0G 963M 60M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 11G 135M 40K 213M 75M
Swap: 1.0G 964M 59M

tigergraph@ac4f436c6d0b:~$ free -gh
total used free shared buff/cache available
Mem: 11G 2.6G 8.9G 4K 213M 8.9G
Swap: 1.0G 960M 63M

Yes, then it must be an OOM problem.

Could you consider increasing your memory size?

Do you know the overall size of the data you are going to ingest? We can make a memory estimate based on that.
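
If you are not sure, something like this reports the uncompressed size without keeping an extracted copy (gzip’s own header only stores the size modulo 4 GiB, so piping through zcat is more reliable for a file this big):

# uncompressed size in bytes; divide by 1024^3 for GiB
zcat com-friendster.ungraph.txt.gz | wc -c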

Thanks.

Hello,

The file has 1.8 billion lines (undirected edges).

The use case is described here: https://towardsdatascience.com/performance-exploding-introduction-to-tigergraph-developer-edition-486d6e6a409

The original file is https://snap.stanford.edu/data/bigdata/communities/com-friendster.ungraph.txt.gz (you need to unzip it and remove the header lines).

Thanks for your help. I will ask for more memory for my next try.

Thomas

Hi Thomas,

Looking at https://snap.stanford.edu/data/bigdata/communities/, I see that the raw data size is about 30 GB.
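
As a rough back-of-envelope check on that figure (my own assumption: about 8 digits per vertex ID, so two IDs plus a tab and a newline is roughly 18 bytes per edge line, times the ~1.8 billion lines you mentioned):

# very rough: 1.8e9 lines * ~18 bytes per line, reported in GB
echo "$(( 1800000000 * 18 / 1000000000 )) GB (approx.)"

That comes out to about 32 GB, in the same ballpark as the listed size.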

So it is recommended to use a server with at least 32 GB of memory.

Thanks.