Having duplicate vertices in the graph database

Hello TG,

I have found duplicate vertices in the TG database. I have double checked their ids, which are exactly the same.

I suspect the cause is that the vertex which ended up duplicated did not exist when its edges were loaded. Since the loader automatically creates a missing endpoint vertex with its id and default attributes, will it produce multiple vertices with the same id if several edges point to that missing endpoint?

Thanks

Hey,

Could you provide some more context about the situation in which you are seeing duplicate vertices?

After testing the loading scenario which you described with multiple edges (and no explicit vertex creation), duplicate vertices did not form.

Here is my query, which creates a single video with PRIMARY ID = 100 based on only edge insertions:

CREATE QUERY test() FOR GRAPH name {
  // Video 100 does not exist, audio 56789 already exists
  INSERT INTO VIDEO_HAS_AUDIO VALUES (100, 56789);
  // Video 100 does not exist, tag "zoo" already exists
  INSERT INTO VIDEO_HAS_TAG VALUES (100, "zoo");
}

Here is the resulting graph:

Duplicate vertices definitely shouldn’t be possible, for a number of reasons. If you find a way of creating them, let us know!
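
To illustrate why two edge insertions that share a missing endpoint still give a single vertex, here is a toy model in Python. It is only a sketch of the behaviour seen in the test above, not TigerGraph's actual implementation: it assumes vertices are stored under a (vertex type, primary id) key, so creating a missing endpoint is an upsert. `ToyGraph` and its methods are made-up names, and the vertex type names VIDEO, AUDIO and TAG are guessed from the edge names in the query.

```python
class ToyGraph:
    """Toy model (an assumption, not TigerGraph internals)."""

    def __init__(self):
        self.vertices = {}   # (vertex_type, primary_id) -> attribute dict
        self.edges = set()   # (edge_type, from_key, to_key)

    def upsert_vertex(self, vtype, vid):
        # Create the vertex with default (empty) attributes only if the key is absent.
        return self.vertices.setdefault((vtype, vid), {})

    def insert_edge(self, etype, src, dst):
        # src and dst are (vertex_type, primary_id) pairs; missing endpoints are created.
        self.upsert_vertex(*src)
        self.upsert_vertex(*dst)
        self.edges.add((etype, src, dst))


g = ToyGraph()
g.upsert_vertex("AUDIO", 56789)   # audio 56789 already exists
g.upsert_vertex("TAG", "zoo")     # tag "zoo" already exists

# Video 100 does not exist before either insertion.
g.insert_edge("VIDEO_HAS_AUDIO", ("VIDEO", 100), ("AUDIO", 56789))
g.insert_edge("VIDEO_HAS_TAG", ("VIDEO", 100), ("TAG", "zoo"))

video_keys = [key for key in g.vertices if key[0] == "VIDEO"]
print(video_keys)  # [('VIDEO', 100)]
```

Because the second insertion finds the key ("VIDEO", 100) already present, it reuses the vertex instead of creating another one.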

@Leo_Shestakov @Richard_Henderson
You are right. After testing it, I am sure that the database does not have duplicate vertices. But I still see duplicate vertices and edges in my query output. Maybe I made a mistake in my query?

Here is my simple query to display all edges in GraphStudio:
// Collect every edge once; a set accumulator drops exact repeats
SetAccum<EDGE> @@edges;
start = {ANY};
out = SELECT s FROM start:s -(:e)- :t
      ACCUM @@edges += e;
PRINT @@edges;

The output contains duplicate vertices and edges, in both the graphical and the JSON view.

Sorry that I am unable to show you the screenshot of my actual outputs, since the data contains sensitive information.

Thanks !

So, I ran your query on one of my graphs and it does not seem to return duplicate edges. Furthermore, you are accumulating a set, so (in theory) duplicates should not even be possible.
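
One way to check this without sharing the data is to count identical edges in the JSON output yourself. The Python sketch below assumes each printed edge is an object with `e_type`, `from_type`, `from_id`, `to_type` and `to_id` fields; adjust the keys if your output looks different. The helper names and the sample `edges` list are made up for illustration.

```python
from collections import Counter


def edge_key(edge):
    # Identity of an edge in this sketch: edge type plus ordered endpoints.
    return (edge["e_type"], edge["from_type"], edge["from_id"],
            edge["to_type"], edge["to_id"])


def find_repeats(edges):
    # Keep only the keys that occur more than once.
    counts = Counter(edge_key(e) for e in edges)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample standing in for the edges printed by the query.
edges = [
    {"e_type": "VIDEO_HAS_AUDIO", "from_type": "VIDEO", "from_id": "100",
     "to_type": "AUDIO", "to_id": "56789"},
    {"e_type": "VIDEO_HAS_TAG", "from_type": "VIDEO", "from_id": "100",
     "to_type": "TAG", "to_id": "zoo"},
]

repeats = find_repeats(edges)
print(repeats)  # {}
```

An empty result means every edge is distinct, even though video 100 shows up in both of them.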

Off the top of my head, here are some things that you may be running into:

  • Two edges between the same pair of vertices can coexist if they are of different edge types
  • Reverse edges are accumulated too, for every directed edge type that has them enabled
  • Two directed edges of the same type between the same vertices are both unique if they point in opposite directions

You also mentioned seeing duplicate vertices. That is expected and not a problem: an edge is uniquely identified by its two endpoints and its edge type, so the same vertex can appear as an endpoint of many edges as long as each pairing is unique. With directed edges, the same pair can even repeat as long as the direction differs.
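
The three cases above can be sketched with a plain Python set. This is a conceptual model only: each edge is written as (edge type, from id, to id), the edge types and ids are made up, and the `reverse_` prefix is just a label chosen for this example.

```python
# Each edge is (edge_type, from_id, to_id).
edges = {
    ("FOLLOWS", "a", "b"),
    ("BLOCKS", "a", "b"),            # same endpoints, different edge type
    ("FOLLOWS", "b", "a"),           # same type, opposite direction
    ("reverse_FOLLOWS", "b", "a"),   # reverse edge of the first one, if enabled
}
edges.add(("FOLLOWS", "a", "b"))     # exact repeat: the set ignores it

# Every edge mentions its two endpoints, so vertex ids repeat across edges.
endpoint_mentions = [v for (_, src, dst) in edges for v in (src, dst)]

print(len(edges))                      # 4
print(len(endpoint_mentions))          # 8
print(sorted(set(endpoint_mentions)))  # ['a', 'b']
```

Four unique edges mention a vertex eight times, yet only two distinct vertices are involved, which is the pattern that can look like duplication in the output.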
