How I Cracked Chinese Wordle Using a Knowledge Graph

Wordle is going viral on social media these days. The game, created by Josh Wardle, gives players six tries to guess a five-letter word, with feedback after each guess in the form of colored tiles indicating whether a letter appears in the word and whether it is in the correct position.

We have seen many Wordle variants for languages that use the Latin script, such as the Spanish Wordle, French Wordle, and German Wordle. However, for non-alphabetic languages like Chinese, a simple adaptation of the English Wordle's rules just won't work.

Nebula Graph: How to Implement Variable-Length Pattern Matching

At the very heart of openCypher, the MATCH clause allows you to specify query patterns to retrieve relationships from a graph database. A variable-length pattern is commonly used to describe paths, and supporting it in the MATCH clause is Nebula Graph's first step toward making nGQL compatible with openCypher.

As shown in the previous articles of this series, an execution plan is composed of physical operators, each responsible for a unique piece of computational logic. Implementing the MATCH clause requires operators such as GetNeighbors, GetVertices, Join, Project, Filter, and Loop, all of which have been introduced earlier in the series. Unlike the tree-shaped plans of a relational database, an execution plan in Nebula Graph describes a cyclic graph of operators. How to transform a variable-length pattern into such a physical plan is the central concern of the Planner. In this article, we will introduce how variable-length pattern matching is implemented in Nebula Graph.
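To make the pattern concrete, here is a minimal sketch of running a variable-length MATCH from Python. It assumes the nebula3-python client, a graphd listening on 127.0.0.1:9669 with the default root/nebula credentials, and the sample basketballplayer space; these details are assumptions for illustration rather than part of the article.

    # Minimal sketch: run a variable-length MATCH pattern against Nebula Graph.
    # The client library, address, credentials, and sample space are assumptions.
    from nebula3.Config import Config
    from nebula3.gclient.net import ConnectionPool

    pool = ConnectionPool()
    pool.init([("127.0.0.1", 9669)], Config())
    session = pool.get_session("root", "nebula")
    try:
        session.execute("USE basketballplayer;")
        # One to three "follow" hops starting from a single player vertex.
        result = session.execute(
            'MATCH (v:player{name: "Tim Duncan"})-[e:follow*1..3]->(v2) RETURN v2;'
        )
        print(result.is_succeeded())
    finally:
        session.release()
        pool.close()

The query matches every vertex reachable from the starting player through one, two, or three follow edges, which is the kind of variable-length pattern the Planner expands using the Loop and GetNeighbors operators mentioned above.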

How to Generate an Execution Plan With Planner

Planner is the execution plan generator. It takes the semantically valid AST checked by Validator, generates an execution plan from it, and passes the plan to Optimizer, which produces an optimized execution plan for Executor to run. An execution plan is composed of a series of nodes (PlanNode).
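As a mental model only, the flow can be sketched in Python as below. Nebula Graph's query engine is actually written in C++, and every name in this sketch is hypothetical, chosen only to mirror the Validator, Planner, Optimizer, and Executor stages described above.

    # Illustrative sketch only: Nebula Graph's real engine is C++, and all
    # names here are hypothetical. The point is the flow
    # validated AST -> Planner -> plan -> Optimizer -> Executor.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class PlanNode:
        kind: str                                   # e.g. "GetNeighbors", "Project"
        deps: List["PlanNode"] = field(default_factory=list)

    def build_plan(validated_ast) -> PlanNode:
        # Planner: turn a validated AST into a graph of PlanNodes.
        scan = PlanNode("GetNeighbors")
        return PlanNode("Project", deps=[scan])

    def optimize(plan: PlanNode) -> PlanNode:
        # Optimizer: apply rewrite rules (a no-op placeholder here).
        return plan

    def execute(plan: PlanNode) -> None:
        # Executor: run upstream operators first, then the node itself.
        for dep in plan.deps:
            execute(dep)
        print("executing", plan.kind)

    execute(optimize(build_plan(validated_ast={"kind": "MATCH"})))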

Structure of Source Files

Here is the structure of source files for Planner.  

Practicing Nebula Operator on Cloud

Hi, everybody! As you may know, Nebula Operator is now open source, and we have published an introduction to it. Today, I would like to share my experience of running Nebula Operator in the cloud.

About Nebula Operator

What is Nebula Operator? You can refer to Nebula Operator Overview: Automated Operation on Kubernetes. This article focuses on putting Nebula Operator into practice, and I hope it helps you get started quickly and enjoy Nebula Graph.

BDD-Based Integration Testing Framework for Nebula Graph: Part 2

In BDD-Based Integration Testing Framework for Nebula Graph: Part 1, I introduced the evolution of integration testing for Nebula Graph. In this article, I will introduce how to add a test case into the test set and run all the test cases successfully.

Preparing Testing Environment

At the beginning of building the testing framework for Nebula Graph 2.0, we developed some tool classes to help the framework quickly start and stop a single-node Nebula Graph cluster, including checking for port conflicts and overriding part of the configuration. Here is the original execution procedure:

Understanding Subgraph in Nebula Graph 2.0

Introduction

In An Introduction to Nebula Graph 2.0 Query Engine, I introduced how the query engine differs between V2.0 and V1.0 of Nebula Graph.

The preceding figure shows how, when a query statement is sent from the client, the query engine parses the statement, generates an AST, validates the AST, and then generates an execution plan. In this article, I will dig deeper into the query engine through the new subgraph feature in Nebula Graph 2.0, focusing on how the execution plan is generated, to help you understand the source code better.
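For reference, a subgraph statement can be sent through the Python client as in the sketch below. This assumes the nebula3-python client, a local graphd on port 9669 with default credentials, and the sample basketballplayer space; the exact GET SUBGRAPH syntax may also vary slightly between Nebula Graph versions, so treat this as an assumption-laden sketch rather than the article's own example.

    # Hedged sketch: issue a GET SUBGRAPH statement through the Python client.
    # Connection details, the sample space, and the starting vertex ID are
    # assumptions for illustration only.
    from nebula3.Config import Config
    from nebula3.gclient.net import ConnectionPool

    pool = ConnectionPool()
    pool.init([("127.0.0.1", 9669)], Config())
    session = pool.get_session("root", "nebula")
    try:
        session.execute("USE basketballplayer;")
        # Expand two hops from one vertex and return the induced subgraph.
        result = session.execute('GET SUBGRAPH 2 STEPS FROM "player100";')
        print(result.is_succeeded())
    finally:
        session.release()
        pool.close()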

Community Contribution: Nebula Graph 2.0 Performance Testing

This article is shared by Fanfan from the Nebula Graph community. It is about his practice of performance testing on Nebula Graph 2.0 and optimizing the data import performance of Nebula Importer. In this article, “I” refers to the author.

Background

I did some research on Nebula Graph and ran tests to evaluate its performance. During the process, I received a lot of help from the Nebula Graph team, and I would like to thank them.

Full-Text Indexing in Nebula Graph 2.0

1. Introduction

Nebula Graph 2.0 supports full-text indexing by using an external full-text search engine. To understand this new feature, let’s review the architecture and storage model of Nebula Graph 2.0.

1.1 Architecture of Nebula Graph


Storage Format in Nebula Graph v2.0.0

Nebula Graph 2.0 has changed a lot compared with earlier releases. In the storage architecture, the change with the most significant impact on users is the new encoding format. In Nebula Graph, data is stored as key-value pairs in RocksDB. This article covers the differences between the old and new encoding formats and explains why the format had to be changed.

Encoding Format in Nebula Graph 1.0

Let's start with a brief review of the encoding format in Nebula Graph 1.0. If you are not familiar with it, I recommend reading this post first: An Introduction to Nebula Graph's Storage Engine. In Nebula Graph 1.0, vertex IDs can only be integers, so all VertexIDs are stored as int64.
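To show what an int64 VertexID inside a key-value pair looks like in practice, here is a rough Python sketch of packing a 1.0-style vertex key. The field order follows the usual description of the 1.0 format (partition ID, vertex ID, tag ID, version), but the byte widths and endianness used here are assumptions for illustration, not Nebula Graph's exact encoding.

    # Rough sketch of a Nebula Graph 1.0-style vertex key as it might be laid
    # out for a KV engine such as RocksDB. Byte widths and endianness are
    # assumptions for illustration only.
    import struct

    def vertex_key_v1(part_id: int, vertex_id: int, tag_id: int, version: int) -> bytes:
        # 4-byte partition ID, 8-byte int64 VertexID, 4-byte tag ID, 8-byte version
        return struct.pack("<iqiq", part_id, vertex_id, tag_id, version)

    key = vertex_key_v1(part_id=7, vertex_id=1001, tag_id=2, version=0)
    print(len(key), key.hex())  # 24 bytes; the VertexID always occupies a fixed int64 slot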

Step-by-Step Tutorial: From Data Preprocessing to Using Graph Database

This article is contributed by Jiayi98, a Nebula Graph user. She shared her experience in deploying Nebula Graph offline and preprocessing a dataset provided by LDBC. It is a beginner-friendly step-by-step guide to learn Nebula Graph.

This is not standard stress testing, but a small-scale test. Through this test, I became familiar with the deployment of Nebula Graph, its data import tool, its graph query language, its Java API, and data migration. I also gained a basic understanding of its cluster performance.

Practicing Nebula Graph at Boss Zhipin, a Chinese Recruitment Platform

Business Background

Boss Zhipin, a Chinese recruitment platform, relies on large-scale graph storage and mining computation for its security and risk control. The platform originally built a self-managed, highly available Neo4j cluster to handle these needs. However, Neo4j does not work well for real-time analysis because it cannot keep up with a daily increase of 1 billion relationships.

We first adopted Dgraph to meet our needs. After half a year of workarounds and struggles with Dgraph, we finally made up our minds to migrate to Nebula Graph, a database that better fits our scenarios. This post won't cover benchmarks, because there are plenty of them on the forum. Instead, we will share our technology evaluation and selection process, plus a comparison between the two, which I think you will find more interesting.

Hands-On Experience: Import Data to Nebula Graph With Spark

This article is written by Liu Jiahao, an engineer at the big data team of IntSig Information Co. Ltd (IntSig). He has been playing around with Nebula Graph and is one of our proud GitHub contributors. This post shares his experience importing data to Nebula Graph with Spark.

Why Nebula Graph?

Our graph-related business has grown more and more complex, and we have identified performance bottlenecks in some popular graph databases; for example, a single machine struggles to scale to larger graphs. In terms of performance, the native graph storage of Neo4j has irreplaceable advantages, and in my survey, JanusGraph, Dgraph, and other graph databases cannot match Neo4j in this regard. JanusGraph performs very well in OLAP and supports OLTP to some extent, but this is no longer a distinguishing advantage, because technologies such as GraphFrames are already sufficient for OLAP requirements. Besides, now that Spark 3.0 supports Cypher, I found that, compared with the OLTP requirements of graphs, the OLAP requirements can be satisfied by many more technologies. Therefore, Nebula Graph undoubtedly turns out to be a breakthrough among the otherwise inefficient distributed OLTP graph databases.

Nebula Operator: Automating Nebula Graph Cluster Deployment and Maintenance on K8s

Nebula Operator is a plug-in that deploys, operates, and maintains Nebula Graph automatically on K8s. Building on the excellent extensibility mechanism of K8s, we encoded the operation and maintenance knowledge of Nebula Graph into the K8s system as a CRD plus a Controller, which makes Nebula Graph a truly cloud-native graph database.

Nebula Graph is a high-performance distributed open source graph database. From the architecture chart below, we can see that a complete Nebula Graph cluster is composed of three types of services, namely the Meta Service, Query Service (Computation Layer) and Storage Service (Storage Layer).

Practicing Graph Computation With GraphX in Nebula Graph

With the rapid development of network and information technology, data is becoming increasingly multi-source and heterogeneous, and countless intertwined relationships lie hidden inside it. These relationships, together with the network structure they form, are essential for data analysis. Unfortunately, when it comes to large-scale data analysis, traditional relational databases are poor at detecting and expressing associations. As a result, graph data has attracted great attention for its expressive power. Graph computing uses a graph model to express and solve problems, and graphs can integrate data from multiple sources and of multiple types.

In addition to presenting the static, basic features of data, graph computing can also reveal the graph structure and relationships hidden in the data. This makes graphs an important analysis tool in social networks, recommendation systems, knowledge graphs, financial risk control, network security, and text retrieval.

Automating Your Project Processes with GitHub Actions

It's common in both company and personal projects to use tools to handle repetitive tasks and improve efficiency.

This is especially true for front-end development, because manually tackling repetitive tasks such as building, deployment, and unit testing is tedious and time-consuming.

How to Reduce Docker Image Size

If there were a list of the top ten buzzwords in the technology industry in 2019, containers would surely be on it. With the popularity of Docker, it is being used in more and more front-end scenarios. This article shows how we use Docker for the visualization interface of Nebula Graph, a distributed open-source graph database.

Why Use Docker

Docker is widely used in daily front-end development. Nebula Graph Studio (a visualization tool for Nebula Graph) uses Docker based on the following considerations: