If you choose to continue with your original schema, you will not be able to use the ORDER BY clause of SELECT statements efficiently, because you would be ordering a set of one vertex type by the attributes of another. This leads to clunkier, less intuitive queries.
You will also need to declare user_id as a standard attribute for each vertex type, because a PRIMARY_ID cannot be used directly in a query body.
Example:
CREATE VERTEX user (
    PRIMARY_ID user_id INT,
    user_id INT,
    age INT,
    gender STRING,
    zip_code INT
)
Next, you will need to create reverse edges for all of your directed edge types, so that a query can traverse a user_to_email or user_to_SMS edge backwards, from the email or sms vertex to the originating user.
CREATE DIRECTED EDGE user_to_email (FROM user, TO email)
    WITH REVERSE_EDGE="reverse_email"
CREATE DIRECTED EDGE user_to_SMS (FROM user, TO sms)
    WITH REVERSE_EDGE="reverse_sms"
Now, here is the resulting style of query. It is much more complicated, and it cannot order the resulting user set by BOTH email and sms counts (in this case, I chose to order by email counts).
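CREATE QUERY segment_by_age(INT target_age) {
    ListAccum<INT> @@id_list;

    // 1) Collect the IDs of all users in the target age group.
    all_users = {user.*};
    age_group =
        SELECT s
        FROM all_users:s
        WHERE s.age == target_age
        ACCUM @@id_list += s.user_id;

    // 2) Keep the email vertices belonging to those users, ordered by email counts.
    all_emails = {email.*};
    email_ordering =
        SELECT s
        FROM all_emails:s
        WHERE @@id_list.contains(s.user_id)
        ORDER BY s.email_received_monthly DESC, s.email_received_weekly DESC;

    // 3) Follow the reverse edges from each email vertex back to its user.
    user_list =
        SELECT t
        FROM email_ordering:s - (reverse_email:e) -> user:t;

    PRINT user_list;
}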
Effectively, with the less optimized schema, you need to first identify a list of user IDs that meet the requirement for age (first SELECT statement). Next, you construct a list of email vertices that match these user IDs (second SELECT) and order them as required. Then, you map every one of these email vertices to its corresponding user source vertex with the help of reverse edges (third SELECT).
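To make the data flow of those three steps easier to follow, here is a rough plain-Python model of it. This is not GSQL and does not touch TigerGraph at all. The sample records and the dictionary lookup that stands in for the reverse edge are assumptions made up purely for illustration.

```python
# Plain-Python model of the three SELECT steps (NOT GSQL).
# All records below are invented sample data.
users = [
    {"user_id": 1, "age": 30},
    {"user_id": 2, "age": 30},
    {"user_id": 3, "age": 45},
]
emails = [
    {"user_id": 1, "email_received_monthly": 40, "email_received_weekly": 9},
    {"user_id": 2, "email_received_monthly": 75, "email_received_weekly": 20},
    {"user_id": 3, "email_received_monthly": 90, "email_received_weekly": 25},
]


def segment_by_age(target_age):
    # Step 1: collect the IDs of users in the target age group.
    id_list = [u["user_id"] for u in users if u["age"] == target_age]

    # Step 2: keep the matching email records, ordered by monthly and then
    # weekly counts, both descending.
    email_ordering = sorted(
        (e for e in emails if e["user_id"] in id_list),
        key=lambda e: (e["email_received_monthly"], e["email_received_weekly"]),
        reverse=True,
    )

    # Step 3: map each email record back to its user. In the graph this is
    # the job of the reverse edge; here a dict lookup stands in for it.
    by_id = {u["user_id"]: u for u in users}
    return [by_id[e["user_id"]] for e in email_ordering]


result = segment_by_age(30)
```

Even in this toy form, the user set is walked twice and the email set once just to sort users by an attribute that lives on another record type, which is the redundancy described above.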
These steps are extremely redundant and inefficient, so I would highly recommend considering my optimization from the first reply.