Blog

GQL Validator
Albert Verdeny has built a GQL Validator. Try it out:
https://graphglot.com/playground

It is pretty awesome. You can write GQL queries and see which of the 228 optional features your query requires.

Key features:
- Write a GQL query and instantly see if it’s valid
- See which of the 228 optional features your query requires
- Toggle features on/off to simulate different database engines
February 25, 2026
Support for GQL is coming to Google BigQuery!

You can sign-up for a Preview here:

https://docs.google.com/forms/d/e/1FAIpQLSeLnFPa4Gd3BY5DMRBAfr1hGwNEVFQY2KRckQQBPjShOVo5-Q/viewform

February 25, 2026

Recursive Queries in SQL vs. GQL

SQL Recursive CTE queries in are commonly used when working with hierarchical data in Relational Databases. For e.g. the following Highway Network:

Let’s say we are interested in getting all the path from each Highway to other Highways. We would use a Recursive CTE for that. Let’s create some sample data and write a Recursive CTE Queries to extract all the PATHs.

create table flows_to (
    highway_from varchar(100)
    , highway_to varchar(100)
);

insert INTO FLOWS_TO (highway_from, highway_to) VALUES
(null, 'highway_1'), -- starting node
(null, 'highway_2'), -- starting node
(null, 'highway_9'), -- starting node
(null, 'highway_11'), -- starting node
('highway_1', 'highway_3'),
('highway_2', 'highway_3'),
('highway_3', 'highway_4'),
('highway_3', 'highway_6'),
('highway_4', 'highway_5'),
('highway_6', 'highway_7'),
('highway_7', 'highway_8'),
('highway_5', 'highway_8'),
('highway_9', 'highway_10'),
('highway_10', 'highway_12'),
('highway_8', 'highway_12'),
('highway_11', 'highway_12'),
('highway_12', 'highway_13');

create table flows_to (
    highway_from varchar(100)
    , highway_to varchar(100)
);

insert INTO FLOWS_TO (highway_from, highway_to) VALUES
(null, 'highway_1'), -- starting node
(null, 'highway_2'), -- starting node
(null, 'highway_9'), -- starting node
(null, 'highway_11'), -- starting node
('highway_1', 'highway_3'),
('highway_2', 'highway_3'),
('highway_3', 'highway_4'),
('highway_3', 'highway_6'),
('highway_4', 'highway_5'),
('highway_6', 'highway_7'),
('highway_7', 'highway_8'),
('highway_5', 'highway_8'),
('highway_9', 'highway_10'),
('highway_10', 'highway_12'),
('highway_8', 'highway_12'),
('highway_11', 'highway_12'),
('highway_12', 'highway_13');

Now let’s construct a Recursive CTE for this:

WITH  outcome(starting, highway_to, highway_from, path) AS (

  -- ANCHOR: Initial query to establish the base result set. 
  SELECT -- Anchor 
    highway_to as starting
    , highway_to
    , highway_from
    , highway_to as path
  FROM flows_to
  WHERE highway_from is null
  
  UNION ALL
  
  -- RECURSIVE: References outcome CTE to build on previous results
  SELECT starting, flows_to.highway_to, flows_to.highway_from, outcome.path || '->' || flows_to.highway_to
  FROM outcome
  JOIN flows_to ON flows_to.highway_from = outcome.highway_to
)
SELECT DISTINCT starting,  highway_to, path FROM outcome;

WITH  outcome(starting, highway_to, highway_from, path) AS (

  -- ANCHOR: Initial query to establish the base result set. 
  SELECT -- Anchor 
    highway_to as starting
    , highway_to
    , highway_from
    , highway_to as path
  FROM flows_to
  WHERE highway_from is null
  
  UNION ALL
  
  -- RECURSIVE: References outcome CTE to build on previous results
  SELECT starting, flows_to.highway_to, flows_to.highway_from, outcome.path || '->' || flows_to.highway_to
  FROM outcome
  JOIN flows_to ON flows_to.highway_from = outcome.highway_to
)
SELECT DISTINCT starting,  highway_to, path FROM outcome;

The above query will output:

Sample of the PATHs returned by Rercursive CTE

While the recursive CTE is not complex, it is not trivial to write.

Enter GQL.

Hierarchical data is a Graph. As such a Graph Query Language (GQL) may be a more elegant way to extract the above Paths. GQL, like SQL, is a high-level, declarative, programming language. It is specifically designed for graph structures. It uses nodes and relations as first-class citizens, although the output to a query can be a graph or a set of tuples.

Let’s explore using the same dataset and query using GQL. We will start with generating sample data

drop graph freeways;
create graph freeways{
  node freeway ({name string})
  , edge flows_to (freeway)-[]->(freeway)
};


use freeways;
INSERT (n1:freeway {_id:'highway_1', name:"Highway 1"})
  , (n2:freeway {_id:'highway_2', name:"Highway 2"})
  , (n3:freeway {_id:'highway_3', name:"Highway 3"})
  , (n4:freeway {_id:'highway_4', name:"Highway 4"})
  , (n5:freeway {_id:'highway_5', name:"Highway 5"})
  , (n6:freeway {_id:'highway_6', name:"Highway 6"})
  , (n7:freeway {_id:'highway_7', name:"Highway 7"})
  , (n8:freeway {_id:'highway_8', name:"Highway 8"})
  , (n9:freeway {_id:'highway_9', name:"Highway 9"})
  , (n10:freeway {_id:'highway_10', name:"Highway 10"})
  , (n11:freeway {_id:'highway_11', name:"Highway 11"})
  , (n12:freeway {_id:'highway_12', name:"Highway 12"})
  , (n13:freeway {_id:'highway_13', name:"Highway 13"})
  , (n1)-[:flows_to]->(n3) 
  , (n2)-[:flows_to]->(n3) 
  , (n3)-[:flows_to]->(n4) 
  , (n3)-[:flows_to]->(n6) 
  , (n4)-[:flows_to]->(n5) 
  , (n6)-[:flows_to]->(n7) 
  , (n5)-[:flows_to]->(n8) 
  , (n7)-[:flows_to]->(n8) 
  , (n9)-[:flows_to]->(n10) 
  , (n11)-[:flows_to]->(n12) 
  , (n8)-[:flows_to]->(n12) 
  , (n10)-[:flows_to]->(n12) 
  , (n12)-[:flows_to]->(n13);

drop graph freeways;
create graph freeways{
  node freeway ({name string})
  , edge flows_to (freeway)-[]->(freeway)
};


use freeways;
INSERT (n1:freeway {_id:'highway_1', name:"Highway 1"})
  , (n2:freeway {_id:'highway_2', name:"Highway 2"})
  , (n3:freeway {_id:'highway_3', name:"Highway 3"})
  , (n4:freeway {_id:'highway_4', name:"Highway 4"})
  , (n5:freeway {_id:'highway_5', name:"Highway 5"})
  , (n6:freeway {_id:'highway_6', name:"Highway 6"})
  , (n7:freeway {_id:'highway_7', name:"Highway 7"})
  , (n8:freeway {_id:'highway_8', name:"Highway 8"})
  , (n9:freeway {_id:'highway_9', name:"Highway 9"})
  , (n10:freeway {_id:'highway_10', name:"Highway 10"})
  , (n11:freeway {_id:'highway_11', name:"Highway 11"})
  , (n12:freeway {_id:'highway_12', name:"Highway 12"})
  , (n13:freeway {_id:'highway_13', name:"Highway 13"})
  , (n1)-[:flows_to]->(n3) 
  , (n2)-[:flows_to]->(n3) 
  , (n3)-[:flows_to]->(n4) 
  , (n3)-[:flows_to]->(n6) 
  , (n4)-[:flows_to]->(n5) 
  , (n6)-[:flows_to]->(n7) 
  , (n5)-[:flows_to]->(n8) 
  , (n7)-[:flows_to]->(n8) 
  , (n9)-[:flows_to]->(n10) 
  , (n11)-[:flows_to]->(n12) 
  , (n8)-[:flows_to]->(n12) 
  , (n10)-[:flows_to]->(n12) 
  , (n12)-[:flows_to]->(n13);

Next let’s write a GQL query to extract all the PATHs. Note that in the case of GQL, to compute and list the paths, it suffices to write:

MATCH nodes = (a)
CALL {
  MATCH path = (a)-[edges:flows_to]->{0,}(b)
  return reduce(path = a._id, edge in edges | path || '->' || edge._to) as route
  , b
}
return table(a.name, b.name, route);

MATCH nodes = (a)
CALL {
  MATCH path = (a)-[edges:flows_to]->{0,}(b)
  return reduce(path = a._id, edge in edges | path || '->' || edge._to) as route
  , b
}
return table(a.name, b.name, route);

This will output

Sample of the PATHs returned by the GQL.

It can be seen that the structure of the GQL query is far less complicated and more intuitive than its SQL counterpart, since it takes advantage of the graph structure. In this case, a basic MATCH.. RETURN structure suffices to express a recursive query. This is mainly because GQL is developed as a query language for
graphs and recursion is typical in these cases. The(a:highway)-[:flows_to]->{0,}(b:highway) path pattern selects all nodes b that are reachable by following one or more edges in the graph, traversing the graph using the :flows_to relation.

Path patterns describe multi-hop relationships in the graph:
{2,4}: Exactly 2 to 4 hops
{0,}: 0 or more hops (unbounded)
{,3}: Up to 3 hops
{5}: Exactly 5 hops

SQL CONNECT BY clause.

Some SQL based databases support CONNECT BY Clause. This clause allows queries on hierarchical data without the need for composing complex recursive CTEs. For e.g. the following CONNECT BY clause query will find all the paths from each to all others nodes

SELECT
    level,
    highway_from,
    SYS_CONNECT_BY_PATH(highway_from, '->') || '->' || highway_to  AS path,
    highway_to
FROM
    FLOWS_TO
START WITH
    highway_from IS NULL
CONNECT BY
    PRIOR highway_to = highway_from;

SELECT
    level,
    highway_from,
    SYS_CONNECT_BY_PATH(highway_from, '->') || '->' || highway_to  AS path,
    highway_to
FROM
    FLOWS_TO
START WITH
    highway_from IS NULL
CONNECT BY
    PRIOR highway_to = highway_from;

While this query is easier to read than its recursive CTE counterpart, it is not intuitive as GQL…

January 20, 2026

Association Rule Mining using GQL

What is the most frequently bought item with Infant Formula and Infant Diapers? Association Rule Mining is a key to Market Basket Analysis. While Association Rule Mining is a complex topic, we can use SQL to figure out what items most frequently bought together.

An association rule is a implication expression of the form X → Y, where X and Y are itemset. X is the Antecedent and Y is the Consequent

Example of Association Rules

{Infant Diaper} → {Coke},
{Milk, Bread} → {Eggs, Coke},
{Coke, Bread} → {Infant Milk}
Antecedent → Consequent

Now let’s assume we take {Infant Diaper, Infant Milk} as the Antecedent, and we want to figure out the Consequent i.e. what ITEM is most often bought with Infant Diaper and Infant Milk.

Association Rule are, by definition, extracted from the data i.e. they are core properties of the Data. We are not looking for correlation in the Item sets in a Rule. This makes Graph Databases a good platform for Association Rule Mining. While there are lot of Python and R packages that are designed for Association Rule Mining, we want to start exploring the use of GQL to perform Exploratory Data Mining exercise.

We will start with the following sample Orders and the Items contained in those Orders

ORDER_NUMBER	ITEMS
3	infant formula, infant diaper, coffee, coke
7	infant formula, infant diaper, mountain dew, coke, coffee
1	bread, infant formula
4	bread, infant formula, infant diaper, coffee
6	infant formula, infant diaper, mountain dew
5	infant formula, infant diaper, coke
2	bread, infant diaper, coffee, eggs

We are interested in Items that are bought together with Infant Diaper and Infant Milk.

Let’s start by creating the dataset

create graph order_items {
  node orders ({order_number string})
  , node item ({description string})
  , edge includes (orders)-[]->(items)
};

use graph order_items;
insert (n_o1:orders {_id: "order_1", order_number:'1'})
  , (n_o2:orders {_id: "order_2", order_number:'2'})
  , (n_o3:orders {_id: "order_3", order_number:'3'})
  , (n_o4:orders {_id: "order_4", order_number:'4'})
  , (n_o5:orders {_id: "order_5", order_number:'5'})
  , (n_o6:orders {_id: "order_6", order_number:'6'})
  , (n_o7:orders {_id: "order_7", order_number:'7'})
  , (n_infant_formula:item {_id: 'infant_formula', description:'infant formula'})
  , (n_infant_diaper:item {_id: 'infant_diaper', description:'infant diaper'})
  , (n_coffee:item {_id: 'coffee', description:'coffee'})
  , (n_coke:item {_id: 'coke', description:'coke'})
  , (n_mountain_dew:item {_id: 'mountain_dew', description:'mountain dew'})
  , (n_bread:item {_id: 'bread', description:'bread'})
  , (n_eggs:item {_id: 'eggs', description:'eggs'})
  , (n_o1)-[:includes]->(n_infant_formula)
  , (n_o1)-[:includes]->(n_bread)
  , (n_o2)-[:includes]->(n_bread)
  , (n_o2)-[:includes]->(n_infant_diaper)
  , (n_o2)-[:includes]->(n_coffee)
  , (n_o2)-[:includes]->(n_eggs)
  , (n_o3)-[:includes]->(n_infant_formula)
  , (n_o3)-[:includes]->(n_infant_diaper)
  , (n_o3)-[:includes]->(n_coffee)
  , (n_o3)-[:includes]->(n_coke)
  , (n_o4)-[:includes]->(n_bread)
  , (n_o4)-[:includes]->(n_infant_formula)
  , (n_o4)-[:includes]->(n_infant_diaper)
  , (n_o4)-[:includes]->(n_coffee)
  , (n_o5)-[:includes]->(n_infant_formula)
  , (n_o5)-[:includes]->(n_infant_diaper)
  , (n_o5)-[:includes]->(n_coke)
  , (n_o6)-[:includes]->(n_infant_formula)
  , (n_o6)-[:includes]->(n_infant_diaper)
  , (n_o6)-[:includes]->(n_mountain_dew)
  , (n_o7)-[:includes]->(n_infant_formula)
  , (n_o7)-[:includes]->(n_infant_diaper)
  , (n_o7)-[:includes]->(n_mountain_dew)
  , (n_o7)-[:includes]->(n_coke)
  , (n_o7)-[:includes]->(n_coffee)
;

create graph order_items {
  node orders ({order_number string})
  , node item ({description string})
  , edge includes (orders)-[]->(items)
};

use graph order_items;
insert (n_o1:orders {_id: "order_1", order_number:'1'})
  , (n_o2:orders {_id: "order_2", order_number:'2'})
  , (n_o3:orders {_id: "order_3", order_number:'3'})
  , (n_o4:orders {_id: "order_4", order_number:'4'})
  , (n_o5:orders {_id: "order_5", order_number:'5'})
  , (n_o6:orders {_id: "order_6", order_number:'6'})
  , (n_o7:orders {_id: "order_7", order_number:'7'})
  , (n_infant_formula:item {_id: 'infant_formula', description:'infant formula'})
  , (n_infant_diaper:item {_id: 'infant_diaper', description:'infant diaper'})
  , (n_coffee:item {_id: 'coffee', description:'coffee'})
  , (n_coke:item {_id: 'coke', description:'coke'})
  , (n_mountain_dew:item {_id: 'mountain_dew', description:'mountain dew'})
  , (n_bread:item {_id: 'bread', description:'bread'})
  , (n_eggs:item {_id: 'eggs', description:'eggs'})
  , (n_o1)-[:includes]->(n_infant_formula)
  , (n_o1)-[:includes]->(n_bread)
  , (n_o2)-[:includes]->(n_bread)
  , (n_o2)-[:includes]->(n_infant_diaper)
  , (n_o2)-[:includes]->(n_coffee)
  , (n_o2)-[:includes]->(n_eggs)
  , (n_o3)-[:includes]->(n_infant_formula)
  , (n_o3)-[:includes]->(n_infant_diaper)
  , (n_o3)-[:includes]->(n_coffee)
  , (n_o3)-[:includes]->(n_coke)
  , (n_o4)-[:includes]->(n_bread)
  , (n_o4)-[:includes]->(n_infant_formula)
  , (n_o4)-[:includes]->(n_infant_diaper)
  , (n_o4)-[:includes]->(n_coffee)
  , (n_o5)-[:includes]->(n_infant_formula)
  , (n_o5)-[:includes]->(n_infant_diaper)
  , (n_o5)-[:includes]->(n_coke)
  , (n_o6)-[:includes]->(n_infant_formula)
  , (n_o6)-[:includes]->(n_infant_diaper)
  , (n_o6)-[:includes]->(n_mountain_dew)
  , (n_o7)-[:includes]->(n_infant_formula)
  , (n_o7)-[:includes]->(n_infant_diaper)
  , (n_o7)-[:includes]->(n_mountain_dew)
  , (n_o7)-[:includes]->(n_coke)
  , (n_o7)-[:includes]->(n_coffee)
;

Let’s use GQL to query the Items that are bought with Infant Diaper and Infant Formula the most.

match (o:orders)-[:includes]->(third_item:item)
CALL (third_item) {
  match (o:orders)-[:includes]->(i1:item {_id:'infant_formula'})
    , (o)-[:includes]->(i2:item {_id:'infant_diaper'})
    , (o)-[:includes]->(third_item)
  return third_item, count(o) as counter  
}
filter counter > 0
return distinct table(third_item.description, counter)
order by counter desc;

match (o:orders)-[:includes]->(third_item:item)
CALL (third_item) {
  match (o:orders)-[:includes]->(i1:item {_id:'infant_formula'})
    , (o)-[:includes]->(i2:item {_id:'infant_diaper'})
    , (o)-[:includes]->(third_item)
  return third_item, count(o) as counter  
}
filter counter > 0
return distinct table(third_item.description, counter)
order by counter desc;

Notice the use of CALL statement. The CALL statement is used to invoke an inline procedure or subquery. An inline procedure is a user-defined procedure embedded within a query, commonly used to execute subqueries or perform data modifications. It enables complex logic such as looping. In short, CALL statement enables modular query design and the invocation of complex logic in a graph query.

In the above GQL query we first get a list of all items and then loop through each of them using the CALL statement to count the number of Orders where that item is included along with Infant Formula and Infant Diaper.

This query will output:

Based on the above query, we can extract the following Association Rule:

{Infant Milk, Infant Diaper} → {Coke}
{Infant Milk, Infant Diaper} → {Coffee}

This doesn’t mean that there is correlation between purchase of Infant Milk, Infant Diaper and Coffee. This is just a property if of the Data.

Alternate way is to the GROUP BY clause in GQL:

match ALL (o:orders)-[:includes]->(i1:item {_id:'infant_formula'})
  , (o)-[:includes]->(i2:item {_id:'infant_diaper'})
  , (o)-[:includes]->(i3:item)
return table(i1.description, i2.description, i3.description, count(distinct o))
group by i3.description;

match ALL (o:orders)-[:includes]->(i1:item {_id:'infant_formula'})
  , (o)-[:includes]->(i2:item {_id:'infant_diaper'})
  , (o)-[:includes]->(i3:item)
return table(i1.description, i2.description, i3.description, count(distinct o))
group by i3.description;

Both the CALL function and the GROUP BY will produce the same results in this case. However CALL function is more versatile allows for complex MATCH statement as part of the sub-query.

December 25, 2025

Aggregating Edge Weights in GQL

Often times we have to aggregate the weights along paths (edges) in a Graph. GQL provides a REDUCE function for this. We will use Ultimate Beneficial Ownership (UBO) graph to demonstrate this.

A beneficial owner is defined as a physical person “who directly or indirectly owns or controls a sufficient part of the shares or the voting rights, or who practices control through other means.”.

If a company is owned by another company (e.g. a holding company), it is the physical person(s) who ultimately own(s) the company who shall be registered as beneficial owner(s). Identifying Ultimate Beneficial Ownership (UBO) can be challenging due to complex and opaque ownership structures. But Property Graphs and GQL make this process easier.

Let’s take a look at the following Beneficial Ownership chart

Let’s create a the dataset based on the above Beneficial Ownership chart:

create graph ubo_graph {
  node person ({name string})
  , node business ({name string})
  , edge owns ()-[{percentage float}]->()
};

use ubo_graph;

insert 
  (n_uroosa:person {_id:'uroosa', name:'Uroosa'})
  , (n_fatima:person {_id:'fatima', name:'Fatima'})
  , (n_maryam:person {_id:'maryam', name:'Maryam'})
  , (n_zainab:person {_id:'zainab', name:'Zainab'})
  , (n_hannah:person {_id:'hannah', name:'Hannah'})
  , (n_acme_corp:business {_id:'acme corp', name:'Acme Corp.'})
  , (n_northwind:business {_id:'northwind', name:'Northwind'})
  , (n_fourth_cofee:business {_id:'fourth coffee', name:'Fourth Coffee'})
  , (n_tasmanian_traders:business {_id:'tasmanian traders', name:'Tasmanian Traders'})
  , (n_coffee_corp:business {_id:'coffee corp', name:'Coffe Corp.'})
  , (n_blue_yonder:business {_id:'blue yonder', name:'Blue Yonder'})
  , (n_hotel_kitchen_sink:business {_id:'hotel kitchen sink', name:'Hotel Kitchen Sink'})
  , (n_uroosa)-[e1:owns {percentage:0.80}]->(n_acme_corp)
  , (n_uroosa)-[e2:owns {percentage:0.15}]->(n_hotel_kitchen_sink)
  , (n_fatima)-[e3:owns {percentage:0.20}]->(n_acme_corp)
  , (n_zainab)-[e4:owns {percentage:0.19}]->(n_fourth_cofee)
  , (n_maryam)-[e5:owns {percentage:1.0}]->(n_coffee_corp)
  , (n_acme_corp)-[e6:owns {percentage:0.30}]->(n_fourth_cofee)
  , (n_northwind)-[e7:owns {percentage:0.09}]->(n_hotel_kitchen_sink)
  , (n_tasmanian_traders)-[e8:owns {percentage:0.49}]->(n_blue_yonder)
  , (n_coffee_corp)-[e9:owns {percentage:0.51}]->(n_blue_yonder)
  , (n_blue_yonder)-[e10:owns {percentage:0.51}]->(n_fourth_cofee)
  , (n_fourth_cofee)-[e11:owns {percentage:0.51}]->(n_hotel_kitchen_sink)
  , (n_hannah)-[e12:owns {percentage:0.25}]->(n_hotel_kitchen_sink)
;


create graph ubo_graph {
  node person ({name string})
  , node business ({name string})
  , edge owns ()-[{percentage float}]->()
};

use ubo_graph;

insert 
  (n_uroosa:person {_id:'uroosa', name:'Uroosa'})
  , (n_fatima:person {_id:'fatima', name:'Fatima'})
  , (n_maryam:person {_id:'maryam', name:'Maryam'})
  , (n_zainab:person {_id:'zainab', name:'Zainab'})
  , (n_hannah:person {_id:'hannah', name:'Hannah'})
  , (n_acme_corp:business {_id:'acme corp', name:'Acme Corp.'})
  , (n_northwind:business {_id:'northwind', name:'Northwind'})
  , (n_fourth_cofee:business {_id:'fourth coffee', name:'Fourth Coffee'})
  , (n_tasmanian_traders:business {_id:'tasmanian traders', name:'Tasmanian Traders'})
  , (n_coffee_corp:business {_id:'coffee corp', name:'Coffe Corp.'})
  , (n_blue_yonder:business {_id:'blue yonder', name:'Blue Yonder'})
  , (n_hotel_kitchen_sink:business {_id:'hotel kitchen sink', name:'Hotel Kitchen Sink'})
  , (n_uroosa)-[e1:owns {percentage:0.80}]->(n_acme_corp)
  , (n_uroosa)-[e2:owns {percentage:0.15}]->(n_hotel_kitchen_sink)
  , (n_fatima)-[e3:owns {percentage:0.20}]->(n_acme_corp)
  , (n_zainab)-[e4:owns {percentage:0.19}]->(n_fourth_cofee)
  , (n_maryam)-[e5:owns {percentage:1.0}]->(n_coffee_corp)
  , (n_acme_corp)-[e6:owns {percentage:0.30}]->(n_fourth_cofee)
  , (n_northwind)-[e7:owns {percentage:0.09}]->(n_hotel_kitchen_sink)
  , (n_tasmanian_traders)-[e8:owns {percentage:0.49}]->(n_blue_yonder)
  , (n_coffee_corp)-[e9:owns {percentage:0.51}]->(n_blue_yonder)
  , (n_blue_yonder)-[e10:owns {percentage:0.51}]->(n_fourth_cofee)
  , (n_fourth_cofee)-[e11:owns {percentage:0.51}]->(n_hotel_kitchen_sink)
  , (n_hannah)-[e12:owns {percentage:0.25}]->(n_hotel_kitchen_sink)
;

Now let’s see how the REDUCE function can be used to calculate the aggregate along the path. We will calculate the Uroosa’s total Ownership for the Hotel Kitchen Sink. Notice that Uroosa directly owns a share of the Hotel Kitchen Sink and also through a separate business entities (Acme Corp. and Fourth Coffee). We will need to account for both of these paths:

We can use the following GQL to Calculated the Ownership Percentage along each Path:

match path = all (a:person {_id:'uroosa'})
  -[owns]
  ->{0,} (b:business {_id:'hotel kitchen sink'})
return path, 
  reduce(total_ownership = 1 , own in owns | total_ownership * own.percentage);

match path = all (a:person {_id:'uroosa'})
  -[owns]
  ->{0,} (b:business {_id:'hotel kitchen sink'})
return path, 
  reduce(total_ownership = 1 , own in owns | total_ownership * own.percentage);

Notice the use of the REDUCE function to calculate the total_ownership. This will output:

This corresponds to 0.8 x 0.3 x 0.51 = 0.1224 through Acme Corp. and Fourth Coffee and direct share of 0.15 in Hotel Kitchen Sink.

We can also use the Linear Composition in GQL to sum up the Ownership percentages (0.15 + 0.1224) as following

match path = all (a:person {_id:'uroosa'})-[owns]->{0,} (b:business {_id:'hotel kitchen sink'})
return path, 
reduce(total_ownership = 1 , own in owns | total_ownership * own.percentage) as ownership
next 
return sum(ownership)
;

match path = all (a:person {_id:'uroosa'})-[owns]->{0,} (b:business {_id:'hotel kitchen sink'})
return path, 
reduce(total_ownership = 1 , own in owns | total_ownership * own.percentage) as ownership
next 
return sum(ownership)
;

Notice the use of NEXT to formulate the Linear Statement Composition. This will output:

December 22, 2025

Decomposing path into steps in Property Graphs – Part 2

In the post titled, , we looked at how we can unnest / flatten a path in Graph using GQL. In this post we will look at the how to identify each match when there are multiple paths . Let’s start with our basic Social Network example.

We are interested in path(s) from Saqib to Uroosa. Notice that are two paths for that 1) Saqib -> Angela -> Scott -> Uroosa; 2) Saqib -> Fatima -> Eleni -> Uroosa. We are also interested in the exact nature of the relationship i.e. co-worker vs. friend.

Let’s start by create a Graph that represents that above relationships

create graph social_graph{

 node person ({name string})
, edge relationship (person)-[{type string}]->(person)
};

use social_graph;


INSERT (n1:person {_id:'saqib', name:"Saqib"})
  , (n2:person {_id:'angela', name:"Angela"})
  , (n3:person {_id:'scott', name:"Scott"})
  , (n4:person {_id:'uroosa', name:"Uroosa"})
  , (n5:person {_id:'fatima', name:"Fatima"})
  , (n6:person {_id:'eleni', name:"Eleni"})
  , (n1)-[link1:relationship {type:"friend"}]->(n2)
  , (n2)-[link2:relationship {type:"co-worker"}]->(n3)
  , (n3)-[link3:relationship {type:"friend"}]->(n4)
  , (n1)-[link4:relationship {type:"co-worker"}]->(n5)
  , (n5)-[link5:relationship {type:"friend"}]->(n6)
  , (n6)-[link6:relationship {type:"friend"}]->(n4)


create graph social_graph{

 node person ({name string})
, edge relationship (person)-[{type string}]->(person)
};

use social_graph;


INSERT (n1:person {_id:'saqib', name:"Saqib"})
  , (n2:person {_id:'angela', name:"Angela"})
  , (n3:person {_id:'scott', name:"Scott"})
  , (n4:person {_id:'uroosa', name:"Uroosa"})
  , (n5:person {_id:'fatima', name:"Fatima"})
  , (n6:person {_id:'eleni', name:"Eleni"})
  , (n1)-[link1:relationship {type:"friend"}]->(n2)
  , (n2)-[link2:relationship {type:"co-worker"}]->(n3)
  , (n3)-[link3:relationship {type:"friend"}]->(n4)
  , (n1)-[link4:relationship {type:"co-worker"}]->(n5)
  , (n5)-[link5:relationship {type:"friend"}]->(n6)
  , (n6)-[link6:relationship {type:"friend"}]->(n4)

Next use the pnodes function to get the two possible PATH(s) as JSON

use social_graph;
MATCH path = all SHORTEST  (a {_id:'saqib'})-[links]->{1,4}(b {_id:'uroosa'})
return pnodes(path);

use social_graph;
MATCH path = all SHORTEST  (a {_id:'saqib'})-[links]->{1,4}(b {_id:'uroosa'})
return pnodes(path);

This will output:

------------ path 1    
[{"schema":"person","id":"saqib","uuid":"11024814086826229762","values":{"name":"Saqib"}},{"schema":"person","id":"angela","uuid":"7205761602816049155","values":{"name":"Angela"}},{"schema":"person","id":"scott","uuid":"2666133178426589188","values":{"name":"Scott"}},{"schema":"person","id":"uroosa","uuid":"14771808976798482437","values":{"name":"Uroosa"}}]
    
------------ path 2
[{"schema":"person","id":"saqib","uuid":"11024814086826229762","values":{"name":"Saqib"}},{"schema":"person","id":"fatima","uuid":"6413128068398841862","values":{"name":"Fatima"}},{"schema":"person","id":"eleni","uuid":"9583662206067671047","values":{"name":"Eleni"}},{"schema":"person","id":"uroosa","uuid":"14771808976798482437","values":{"name":"Uroosa"}}]

------------ path 1    
[{"schema":"person","id":"saqib","uuid":"11024814086826229762","values":{"name":"Saqib"}},{"schema":"person","id":"angela","uuid":"7205761602816049155","values":{"name":"Angela"}},{"schema":"person","id":"scott","uuid":"2666133178426589188","values":{"name":"Scott"}},{"schema":"person","id":"uroosa","uuid":"14771808976798482437","values":{"name":"Uroosa"}}]
    
------------ path 2
[{"schema":"person","id":"saqib","uuid":"11024814086826229762","values":{"name":"Saqib"}},{"schema":"person","id":"fatima","uuid":"6413128068398841862","values":{"name":"Fatima"}},{"schema":"person","id":"eleni","uuid":"9583662206067671047","values":{"name":"Eleni"}},{"schema":"person","id":"uroosa","uuid":"14771808976798482437","values":{"name":"Uroosa"}}]

Now let’s see look at how to get the each hop as row and also get the relationship type:

MATCH path = all SHORTEST  (a {_id:'saqib'})-[links]->{1,}(b {_id:'uroosa'})
for link in links WITH ORDINALITY hop_number
return table(link._from, link._to, link.type, hop_number, pnodeIds(path), PATH_LENGTH(path));

MATCH path = all SHORTEST  (a {_id:'saqib'})-[links]->{1,}(b {_id:'uroosa'})
for link in links WITH ORDINALITY hop_number
return table(link._from, link._to, link.type, hop_number, pnodeIds(path), PATH_LENGTH(path));

This will output:

Notice that the output contains two possible paths. pnodeIds are Path Identifiers. hop_number indicates the ordinal hop number in the path.

December 9, 2025

Decomposing path into steps in GQL

When working with Property Graphs, resolving the data path aka. decomposing Graph path is key to understanding the path traversed by a MATCH query in GQL. To facilitate this unnesting / flattening of path data, GQL provides few options. We will look at two options and apply them to following Graph to unnest the SHORTEST path between Alice and Diana i.e. Alice -> Chalie -> Diana

Let’s start by creating a sample Sample Graph:

create graph social_network {
  NODE PERSON ({name STRING, age int}), 
  EDGE FOLLOWS ()-[]->()
};
use graph social_network;

SHOW NODE SCHEMA;
SHOW EDGE SCHEMA;
INSERT (n1:PERSON {_id:'alice', name: "Alice", age:25})
  , (n2:PERSON {_id:'bob', name: "Bob", age:30})
  , (n3:PERSON {_id:'diana', name: "Diana", age:28})
  , (n4:PERSON {_id:'charlie', name: "Charlie", age:32});
MATCH (n1:PERSON {_id:'alice'})
  , (n2:PERSON {_id:'bob'})
  , (n3:PERSON {_id:'diana'})
  , (n4:PERSON {_id:'charlie'})
INSERT (n1)-[e1:FOLLOWS]->(n2)
  , (n2)-[e2:FOLLOWS]->(n1)
  , (n2)-[e3:FOLLOWS]->(n4)
  , (n1)-[e4:FOLLOWS]->(n4)
  , (n3)-[e5:FOLLOWS]->(n4)
  , (n4)-[e6:FOLLOWS]->(n3)
  , (n3)-[e7:FOLLOWS]->(n2) 
RETURN n1, n2, n3, n4;

create graph social_network {
  NODE PERSON ({name STRING, age int}), 
  EDGE FOLLOWS ()-[]->()
};
use graph social_network;

SHOW NODE SCHEMA;
SHOW EDGE SCHEMA;
INSERT (n1:PERSON {_id:'alice', name: "Alice", age:25})
  , (n2:PERSON {_id:'bob', name: "Bob", age:30})
  , (n3:PERSON {_id:'diana', name: "Diana", age:28})
  , (n4:PERSON {_id:'charlie', name: "Charlie", age:32});
MATCH (n1:PERSON {_id:'alice'})
  , (n2:PERSON {_id:'bob'})
  , (n3:PERSON {_id:'diana'})
  , (n4:PERSON {_id:'charlie'})
INSERT (n1)-[e1:FOLLOWS]->(n2)
  , (n2)-[e2:FOLLOWS]->(n1)
  , (n2)-[e3:FOLLOWS]->(n4)
  , (n1)-[e4:FOLLOWS]->(n4)
  , (n3)-[e5:FOLLOWS]->(n4)
  , (n4)-[e6:FOLLOWS]->(n3)
  , (n3)-[e7:FOLLOWS]->(n2) 
RETURN n1, n2, n3, n4;

First let’s use pnodes function to get the unnested path as a JSON.

MATCH paths =  SHORTEST 1 (n1 {_id:'alice'})-[links]->{0,4}(n2 {_id:'diana'}) 
LET length = PATH_LENGTH(paths)
RETURN pnodes(paths);

MATCH paths =  SHORTEST 1 (n1 {_id:'alice'})-[links]->{0,4}(n2 {_id:'diana'}) 
LET length = PATH_LENGTH(paths)
RETURN pnodes(paths);

This will output a JSON array as following:

    [{"schema":"PERSON","id":"alice","uuid":"5116091375716139010","values":{"age":25,"name":"Alice"}},{"schema":"PERSON","id":"charlie","uuid":"11961562809319292933","values":{"age":32,"name":"Charlie"}},{"schema":"PERSON","id":"diana","uuid":"16789421609860464644","values":{"age":28,"name":"Diana"}}]

    [{"schema":"PERSON","id":"alice","uuid":"5116091375716139010","values":{"age":25,"name":"Alice"}},{"schema":"PERSON","id":"charlie","uuid":"11961562809319292933","values":{"age":32,"name":"Charlie"}},{"schema":"PERSON","id":"diana","uuid":"16789421609860464644","values":{"age":28,"name":"Diana"}}]

Another way to UNNEST is to produce one row per hop. We can use the FOR loop for this

MATCH paths =  (n1 {_id:'alice'})-[links]->{0,4}(n2 {_id:'diana'})
FOR link in links
RETURN link;

MATCH paths =  (n1 {_id:'alice'})-[links]->{0,4}(n2 {_id:'diana'})
FOR link in links
RETURN link;

This will output:

Notice that we used Variable Length Path Pattern {0,4} in the aforementioned query. GQL supports the following Variable Length Path Patterns:

Quantifier	Description
`{n}`	Exactly `n`
`{n, m}`	Between `n` and `m` (inclusive)
`{, m}`	Between `0` and `m` (inclusive)
`{m,}`	`m` (inclusive) or higher

December 8, 2025

Decomposing PATH into Steps using GQL

Returning one or more paths during a graph traversal is a central feature of regular path queries in the GQL. To illustrate we will use the following example of a travel network:

We are interested in all the SHORTEST PATHS available from Bayreuth to Santiago.

Let’s start by creating a sample dataset

create graph travel_network {
  node city ({name string, country string})
  , edge train (city)-[{duration int}]->(city) -- duration in minutes
  , edge flight (city)-[{code string, duration int}]->(city) -- duration in minutes
  , edge car (city)-[{tolls float}]->(city)
};

use travel_network;

-- Create City nodes
INSERT 
  (n_bayreuth:city {_id: 'Bayreuth', name: 'Bayreuth', country: 'Germany'})
  , (n_nürnberg:city {_id: 'Nürnberg', name: 'Nürnberg', country: 'Germany'})
  , (n_stuttgart:city {_id: 'Stuttgart', name: 'Stuttgart', country: 'Germany'})
  , (n_frankfurt:city {_id: 'Frankfurt', name: 'Frankfurt', country: 'Germany'})
  , (n_paris:city {_id:'Paris', name: 'Paris', country: 'France'})
  , (n_amsterdam:city {_id:'Amsterdam', name: 'Amsterdam', country: 'Netherlands'})
  , (n_rome:city {_id:'Rome', name: 'Rome', country: 'Italy'})
  , (n_santiago:city {_id:'Santiago', name: 'Santiago', country: 'Chile'})
  , (n_bayreuth)-[:train]->(n_nürnberg)
  , (n_nürnberg)-[:train]->(n_bayreuth)
  , (n_nürnberg)-[:train {duration: 120}]->(n_stuttgart)
  , (n_nürnberg)-[:train]->(n_frankfurt)
  , (n_frankfurt)-[:train]->(n_paris)
  , (n_stuttgart)-[:train]->(n_paris)
  , (n_paris)-[:flight {code: 'AF406', duration: 780}]->(n_santiago)
  , (n_amsterdam)-[:flight]->(n_santiago)
  , (n_amsterdam)-[:flight]->(n_rome)
  , (n_rome)-[:car {tolls: 74}]->(n_bayreuth)
  , (n_frankfurt)-[:train]->(n_amsterdam)
   
;

create graph travel_network {
  node city ({name string, country string})
  , edge train (city)-[{duration int}]->(city) -- duration in minutes
  , edge flight (city)-[{code string, duration int}]->(city) -- duration in minutes
  , edge car (city)-[{tolls float}]->(city)
};

use travel_network;

-- Create City nodes
INSERT 
  (n_bayreuth:city {_id: 'Bayreuth', name: 'Bayreuth', country: 'Germany'})
  , (n_nürnberg:city {_id: 'Nürnberg', name: 'Nürnberg', country: 'Germany'})
  , (n_stuttgart:city {_id: 'Stuttgart', name: 'Stuttgart', country: 'Germany'})
  , (n_frankfurt:city {_id: 'Frankfurt', name: 'Frankfurt', country: 'Germany'})
  , (n_paris:city {_id:'Paris', name: 'Paris', country: 'France'})
  , (n_amsterdam:city {_id:'Amsterdam', name: 'Amsterdam', country: 'Netherlands'})
  , (n_rome:city {_id:'Rome', name: 'Rome', country: 'Italy'})
  , (n_santiago:city {_id:'Santiago', name: 'Santiago', country: 'Chile'})
  , (n_bayreuth)-[:train]->(n_nürnberg)
  , (n_nürnberg)-[:train]->(n_bayreuth)
  , (n_nürnberg)-[:train {duration: 120}]->(n_stuttgart)
  , (n_nürnberg)-[:train]->(n_frankfurt)
  , (n_frankfurt)-[:train]->(n_paris)
  , (n_stuttgart)-[:train]->(n_paris)
  , (n_paris)-[:flight {code: 'AF406', duration: 780}]->(n_santiago)
  , (n_amsterdam)-[:flight]->(n_santiago)
  , (n_amsterdam)-[:flight]->(n_rome)
  , (n_rome)-[:car {tolls: 74}]->(n_bayreuth)
  , (n_frankfurt)-[:train]->(n_amsterdam)
   
;

We can use the following GQL to get all the possible SHORTEST PATHS.

match paths = all shortest  (a:city {_id:'Bayreuth'})-[edges:train|flight]->{1,}(b:city {_id:'Santiago'})
for edge in edges
return distinct table(edge._from, edge._to, labels(edge), pnodeIds(paths), length(paths));

match paths = all shortest  (a:city {_id:'Bayreuth'})-[edges:train|flight]->{1,}(b:city {_id:'Santiago'})
for edge in edges
return distinct table(edge._from, edge._to, labels(edge), pnodeIds(paths), length(paths));

This will return each of the path decomposed into the each hop (step) as following

In the above GQL Query a is matched to the city of Bayreuth, and b to Santiago city. The PATH(s) are all possible combinations of the edges between between a and b which minimize the number of hops. In our case 4 is the minimum number of hops and there 3 possible PATHs that qualify.

December 3, 2025

GQL Playground

We are working on a interactive GQL Playground. The GQL Playground will provide interactive learning experience with guided examples. Stay tuned!

December 2, 2025

Linear Composition in GQL

The workhorse of Graph Query Language (GQL) is pattern matching. GQL is oblivious to how a graph is stored. The table resulting from pattern matching is manipulated by a sequence of operators that modify it in an imperative style that that is referred to as “Linear Composition” or “Sequential Composition”. In “Linear Composition” we can simply add clauses to the already existing query which apply new operations to the result of already processed clauses.

In the following example we will demonstrate how a sequence of operators can continue after the RETURN clause.

Let’s start with a simple Graph.

Let’s say we interested in the nodes that have the longest path and then we want to find out all the paths between those two nodes. We can start with finding the longest path i.e. Saqib -> Angela -> Scott -> Uroosa and then use Linear Composition to add another Pattern Matching query to identify the all the paths between Saqib and Uroosa

Let’s start by create some sample dataset:

create graph social_graph{

 node person ({name string})
, edge relationship (person)-[{type string}]->(person)
};

use social_graph;


INSERT (n1:person {_id:'saqib', name:"Saqib"})
  , (n2:person {_id:'angela', name:"Angela"})
  , (n3:person {_id:'scott', name:"Scott"})
  , (n4:person {_id:'uroosa', name:"Uroosa"})
  , (n5:person {_id:'fatima', name:"Fatima"})
  , (n6:person {_id:'eleni', name:"Eleni"})
  , (n1)-[link1:relationship {type:"friend"}]->(n2)
  , (n2)-[link2:relationship {type:"co-worker"}]->(n3)
  , (n3)-[link3:relationship {type:"friend"}]->(n4)
  , (n1)-[link4:relationship {type:"co-worker"}]->(n5)
  , (n5)-[link5:relationship {type:"friend"}]->(n6)
  , (n1)-[link:relationship {type:"friend"}]->(n4)
;

create graph social_graph{

 node person ({name string})
, edge relationship (person)-[{type string}]->(person)
};

use social_graph;


INSERT (n1:person {_id:'saqib', name:"Saqib"})
  , (n2:person {_id:'angela', name:"Angela"})
  , (n3:person {_id:'scott', name:"Scott"})
  , (n4:person {_id:'uroosa', name:"Uroosa"})
  , (n5:person {_id:'fatima', name:"Fatima"})
  , (n6:person {_id:'eleni', name:"Eleni"})
  , (n1)-[link1:relationship {type:"friend"}]->(n2)
  , (n2)-[link2:relationship {type:"co-worker"}]->(n3)
  , (n3)-[link3:relationship {type:"friend"}]->(n4)
  , (n1)-[link4:relationship {type:"co-worker"}]->(n5)
  , (n5)-[link5:relationship {type:"friend"}]->(n6)
  , (n1)-[link:relationship {type:"friend"}]->(n4)
;

Now let’s find the Longest path using PATTERN MATCHING

MATCH path = ALL  (a)-[links]->{1,}(b)
RETURN a, b, PATH_LENGTH(path) as path_length
ORDER BY path_length DESC limit 1

MATCH path = ALL  (a)-[links]->{1,}(b)
RETURN a, b, PATH_LENGTH(path) as path_length
ORDER BY path_length DESC limit 1

This will result in:

Next let’s add use Linear Composition to add another Pattern Matching query to identify the all the paths between Saqib and Uroosa

MATCH path = ALL  (a)-[links]->{1,}(b)
RETURN a, b, PATH_LENGTH(path) as path_length, path
ORDER BY path_length DESC limit 1
NEXT
MATCH path = ALL (a)-[links]->{1,}(b)
RETURN table(a.name, b.name, pedges(path));


MATCH path = ALL  (a)-[links]->{1,}(b)
RETURN a, b, PATH_LENGTH(path) as path_length, path
ORDER BY path_length DESC limit 1
NEXT
MATCH path = ALL (a)-[links]->{1,}(b)
RETURN table(a.name, b.name, pedges(path));

This will result in

Observe that GQL processes results of pattern matching in sequential, or pipelined way where the output of each operation in a sequence serves as the input for the next operation. GQL standard refers to this as Linear Statement Composition or simple Linear Composition.

September 4, 2025