Mingxi Wu | The Blog Pros

September 9, 2021

Truth Behind Neo4j’s “Trillion” Relationship Graph

My interest in graph database benchmarks started with a conversation with Peter Boncz (LDBC chair) at SIGMOD 2018. He introduced me to the “choke point” concept that he used in analyzing the well-known TPC-H relational database benchmark. He also shared that he had employed the same choke-point methodology in designing the graph data management benchmark, the LDBC Social Network Benchmark (LDBC-SNB). At that point, I had just finished comparing six graph databases. For simplicity, I had used topology-only data sets (vertices and edges without attributes) and relied on the k-hop query and well-known iterative algorithms such as PageRank and Connected Components to do the comparison (an industry first, later followed by benchmark reports from other graph startups and academia). As I worked with the largest banking, healthcare, retail, and manufacturing enterprises at TigerGraph, I realized that those benchmarks did not meet the real-world requirements for evaluating an enterprise-grade graph database. I was looking for graph benchmarks closer to the complex query patterns of real-world customers. And LDBC-SNB fits the bill!

Ever since then, TigerGraph has adopted the LDBC-SNB as the benchmark suite to evaluate our release performance, and we have continued to set world records in scalability using the largest LDBC-SNB data sets (1TB in 2019, 5TB in 2020).

April 6, 2020

Accumulators vs SQL GROUP BY Aggregation

The graph query language has always been among the top considerations when users choose a graph database for serious production use. Considerations include, but are not limited to, ease of use, expressiveness, and conformance to ISO standards. When it comes to putting graph databases into production, our experience shows that sufficient expressive power comes first.

In our previous blog post, we dissected the basic semantics and usage patterns of accumulators. We got a lot of feedback. One of the most frequently asked questions is: can you do all accumulator-based aggregation with SQL GROUP BY-style aggregation?

July 16, 2019

Accumulator 101

Motivation

GSQL is a Turing-complete graph database query language. Compared to other graph query languages, its biggest advantage is its support for accumulators, which can be global or attached to each vertex.

In addition to providing the classic pattern match syntax, which is easy to master, GSQL supports powerful run-time vertex attributes (a.k.a. local accumulators) and global state variables (a.k.a. global accumulators). I have seen users learn and adopt the pattern match syntax within ten minutes. However, I have also seen beginners struggle to learn and adopt accumulators.
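To make the local/global distinction concrete, here is a minimal sketch in plain Python, not GSQL. It only emulates the idea: a dictionary keyed by vertex stands in for a local accumulator, and a single counter stands in for a global accumulator. The graph data and the names `count_in_edges`, `in_degree`, and `total_edges` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Toy directed graph as (source, target) edges. Purely illustrative data.
edges = [("alice", "bob"), ("alice", "carol"), ("bob", "carol")]


def count_in_edges(edges):
    """Traverse every edge once, updating a per-vertex and a global accumulator."""
    in_degree = defaultdict(int)  # stand-in for a local (per-vertex) sum accumulator
    total_edges = 0               # stand-in for a global sum accumulator
    for _source, target in edges:
        in_degree[target] += 1    # accumulate into the state attached to the target vertex
        total_edges += 1          # accumulate into the single shared state
    return dict(in_degree), total_edges


in_degree, total_edges = count_in_edges(edges)
print(in_degree)    # {'bob': 1, 'carol': 2}
print(total_edges)  # 3
```

Both values are gathered in one pass over the edges: the per-vertex state answers "how many edges arrive at this vertex?", while the global state answers "how many edges were visited in total?".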

April 19, 2019 | August 12, 2019

What Are the Major Advantages of Using a Graph Database?

A graph database is data management software whose building blocks are vertices and edges. To put it in a more familiar context, a relational database is also data management software, but its building blocks are tables. Both require loading data into the software and using a query language or APIs to access the data.
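As a rough illustration of the two kinds of building blocks, the sketch below holds the same tiny data set first as tables and then as vertices and edges. It is an in-memory toy in plain Python, not a description of how any real database stores or queries data, and every name in it (`person_table`, `friend_table`, `vertices`, `out_edges`, and the two functions) is hypothetical.

```python
# Relational view: one table of people and one table of friendships (rows are tuples).
person_table = [(1, "Ann"), (2, "Ben"), (3, "Cy")]
friend_table = [(1, 2), (2, 3)]


def friends_relational(person_id):
    """Scan the friendship table, then join back to the person table for names."""
    names = dict(person_table)
    return [names[dst] for src, dst in friend_table if src == person_id]


# Graph view: vertices keyed by id, edges kept as adjacency lists per vertex.
vertices = {1: "Ann", 2: "Ben", 3: "Cy"}
out_edges = {1: [2], 2: [3], 3: []}


def friends_graph(person_id):
    """Follow the outgoing edges of one vertex directly."""
    return [vertices[neighbor] for neighbor in out_edges[person_id]]


print(friends_relational(1))  # ['Ben']
print(friends_graph(1))       # ['Ben']
```

Both functions return the same answer. The difference is in how the relationship is represented: as rows to be matched in the table version, and as edges attached to a vertex in the graph version.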

Relational databases boomed in the 1980s. Many commercial companies (e.g., Oracle, Ingres, IBM) backed the relational model (tabular organization) of data management. In that era, the main data management need was to generate reports.