My interest in graph database benchmarks started with a conversation with Peter Boncz (LDBC chair) at SIGMOD 2018. He introduced me to the concept of “choke points,” which he had used in analyzing the well-known TPC-H relational database benchmark. He also told me that he had applied the same choke-point methodology in designing the LDBC Social Network Benchmark (LDBC-SNB), a benchmark for graph data management. At that point, I had just finished comparing six graph databases. For simplicity, that comparison used topology-only data sets (vertices and edges without attributes) and relied on the k-hop query and well-known iterative algorithms such as PageRank and Connected Components. It was an industry first, later followed by benchmark reports from other graph startups and from academia. As I worked with the largest banking, healthcare, retail, and manufacturing enterprises at TigerGraph, I realized that those benchmarks did not meet the real-world requirements for evaluating an enterprise-grade graph database. I was looking for graph benchmarks that came closer to the complex query patterns of real-world customer workloads. LDBC-SNB fit the bill!
Since then, TigerGraph has used LDBC-SNB as the benchmark suite for evaluating the performance of our releases, and we have continued to set world records in scalability on the largest LDBC-SNB data sets (1 TB in 2019, 5 TB in 2020).