Auto_Explain: How to Log Slow Postgres Query Plans Automatically

Do you want to know why a PostgreSQL query is slow? Then EXPLAIN ANALYZE is a great starting point. But query plans can depend on other server activity, can take a while to run, and can change over time, so if you want to see the actual execution plans of your slowest queries, auto_explain is the tool you need. In this post, we’ll look into what it does, how to configure it, and how to use those logs to speed up your queries.

What Is Auto_Explain?

Auto_explain is a PostgreSQL extension that lets you log the execution plans of queries slower than a configurable threshold. This is incredibly useful for debugging slow queries, especially those that are only sometimes problematic. It is one of the contrib modules that ship with PostgreSQL, so it can be installed and configured easily on regular PostgreSQL.

Memoization in Cost-based Optimizers

Query optimization is an expensive process that needs to explore multiple alternative ways to execute the query. The query optimization problem is NP-hard, and the number of possible plans grows exponentially with the query’s complexity. For example, a typical TPC-H query may have up to several thousand possible join orders, 2–3 algorithms per join, a couple of access methods per table, some filter/aggregate pushdown alternatives, etc. Combined, this could quickly explode the search space to millions of alternative plans.

This blog post will discuss memoization — an important technique that allows cost-based optimizers to consider billions of alternative plans in a reasonable time.
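To make the idea concrete, here is a minimal sketch of memoized join enumeration. It is not taken from any particular optimizer: the table cardinalities, the uniform selectivity, and the toy cost formula are all assumptions made up for illustration. The point is only that the best plan for each subset of tables is computed once and then reused by every larger subset that contains it.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical row counts and join selectivity, invented for this example.
CARD = {"A": 1000, "B": 100, "C": 10, "D": 10000}
SELECTIVITY = 0.01


def splits(tables):
    """Yield each unordered split of `tables` into two non-empty parts."""
    items = sorted(tables)
    first, rest = items[0], items[1:]
    # Pin the first table to the left side so every split appears once.
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            left = frozenset((first, *extra))
            right = tables - left
            if right:
                yield left, right


@lru_cache(maxsize=None)  # the memo: one entry per distinct subset of tables
def best_plan(tables):
    """Return (cost, rows, plan) for the cheapest join tree over `tables`."""
    if len(tables) == 1:
        (name,) = tables
        return 0.0, float(CARD[name]), name
    best = None
    for left, right in splits(tables):
        lcost, lrows, lplan = best_plan(left)
        rcost, rrows, rplan = best_plan(right)
        cost = lcost + rcost + lrows * rrows  # toy cost: pairs compared
        rows = lrows * rrows * SELECTIVITY
        if best is None or cost < best[0]:
            best = (cost, rows, f"({lplan} JOIN {rplan})")
    return best


print(best_plan(frozenset(CARD)))
print(best_plan.cache_info())
```

With four tables there are only 15 non-empty subsets, so the memo holds at most 15 entries even though the number of distinct join trees is much larger. Real optimizers memoize groups of equivalent expressions rather than plain table sets, but the reuse principle is the same.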

Rule-Based Query Optimization

The goal of the query optimizer is to find the query execution plan that computes the requested result efficiently. In this blog post, we discuss rule-based optimization, a common pattern that modern optimizers use to explore equivalent plans. We then analyze how rule-based optimization is implemented in several state-of-the-art optimizers: Apache Calcite, Presto, and CockroachDB.

Transformations

A query optimizer must explore the space of equivalent execution plans and pick the optimal one. Intuitively, plan B is equivalent to plan A if it produces the same result for all possible inputs.
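As a small illustration of a transformation, the sketch below applies a join commutativity rule to a toy plan tree and checks that both plans return the same rows on one sample input. The `Scan` and `Join` classes, the `commute_join` rule, and the sample data are hypothetical and are not the API of any real optimizer. Agreement on a single input is only a sanity check, not a proof of equivalence for all inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Join:
    left: object
    right: object
    key: str


def commute_join(plan):
    """Rule: Join(L, R) -> Join(R, L). Returns None if the rule does not apply."""
    if isinstance(plan, Join):
        return Join(plan.right, plan.left, plan.key)
    return None


def execute(plan, db):
    """Naive interpreter: rows are dicts, joins are nested loops on `key`."""
    if isinstance(plan, Scan):
        return list(db[plan.table])
    if isinstance(plan, Join):
        left = execute(plan.left, db)
        right = execute(plan.right, db)
        return [{**l, **r} for l in left for r in right
                if l[plan.key] == r[plan.key]]
    raise TypeError(f"unknown plan node: {plan!r}")


def canonical(rows):
    """Order-insensitive form of a result set, for comparing two plans."""
    return sorted(tuple(sorted(row.items())) for row in rows)


# Made-up sample data; the only shared column is the join key.
SAMPLE_DB = {
    "orders": [{"cid": 1, "oid": 10}, {"cid": 2, "oid": 11}],
    "customers": [{"cid": 1, "name": "ann"}],
}

plan_a = Join(Scan("orders"), Scan("customers"), "cid")
plan_b = commute_join(plan_a)
print(canonical(execute(plan_a, SAMPLE_DB)) == canonical(execute(plan_b, SAMPLE_DB)))
```

A rule-based optimizer applies many such rules repeatedly, each producing a new plan that is equivalent to the one it matched.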

Assembling a Query Optimizer with Apache Calcite

Introduction

Apache Calcite is a dynamic data management framework with a SQL parser, optimizer, executor, and JDBC driver.

Many examples of Apache Calcite usage demonstrate the end-to-end execution of queries using the JDBC driver, some built-in optimization rules, and the Enumerable executor. Our customers often have their own execution engines and JDBC drivers. So how can you use Apache Calcite for query optimization only, without its JDBC driver and Enumerable executor?

Inside Presto Optimizer

Abstract

Presto is an open-source distributed SQL query engine for big data. Presto provides a connector API to interact with different data sources, including RDBMSs, NoSQL products, Hadoop, and stream processing systems. Created by Facebook, Presto has been widely adopted both in the open-source world (Presto, Trino) and by commercial companies (e.g., Ahana, Qubole).

Presto comes with a sophisticated query optimizer that applies various rewrites to the query plan. In this blog post series, we investigate the internals of the Presto optimizer. In the first part, we discuss the optimizer interface and the design of the rule-based optimizer.

Custom Traits in Apache Calcite

Abstract

Physical properties are an essential part of the optimization process, allowing you to explore more alternative plans.

Apache Calcite comes with convention and collation (sort order) properties. Many query engines require custom properties. For example, distributed and heterogeneous engines that we often see in our daily practice need to carefully plan the movement of data between machines and devices, which requires a custom property to describe data location.

Oracle SQL Performance Plan Review Automation

Why Do We Need a SQL Performance Review?

  1. The current code review process is manual and doesn’t capture the Explain Plan for all modified queries.
  2. Currently, lead devs, along with developers, run Explain Plans manually in Toad/SQL Developer.
  3. An automated tool can capture queries that look problematic from an Explain Plan perspective and reduce manual oversight.
  4. Performance audits need to be backed by data points.

Solution

  • Oracle keeps executed SQL statements in memory (the shared pool) and exposes them, indexed by SQL ID, through views such as gv$sqltext.
  • During development (PLSQL/Java/OA Framework/XML Publisher), tag all the desired SQL queries with a code comment (Release Name/User Story Number).
  • Develop a PLSQL program with the following features:
    • Analyze the Execution Plan of queries executed by a concurrent program/Java program/OAF code/forms/reports/BI publisher reports.
    • Generate a report of queries that could impact performance. For example, queries whose plans contain a full table scan (TABLE ACCESS FULL), a Cartesian merge join (MERGE JOIN CARTESIAN), or a full index scan (INDEX FULL SCAN).
    • Capability to categorize queries at User Story, Sprint, Release, and Scrum Team levels based on program input parameters.
    • Capability to analyze queries executed by a concurrent program/package/Java modules.
    • Capture data of relevant queries analyzed in a table for future analysis and dashboards.
    • Store the generated Explain Plan in a database table for audit purposes.
  • In the Oracle E-Business Suite world, this program can be registered as an executable of concurrent programs and assigned to a request group.
  • Before a release migration, the SysAdmin can put together a business process to execute this concurrent program, review the Explain Plan for that release's SQL queries, and catch any troublemaking queries well in advance.
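The reporting feature in the list above can be sketched outside the database as follows. This is only an illustration in Python, not the PLSQL program the post describes: the function name and the input shape (tuples of SQL ID, operation, and options, as they might be read from gv$sql_plan) are assumptions.

```python
# Operation/options pairs treated as risky in this sketch.
RISKY_STEPS = {
    ("TABLE ACCESS", "FULL"),
    ("MERGE JOIN", "CARTESIAN"),
    ("INDEX", "FULL SCAN"),
}


def flag_risky_queries(plan_rows):
    """Group risky plan steps by SQL ID.

    plan_rows: iterable of (sql_id, operation, options) tuples.
    Returns a dict mapping sql_id to the list of risky steps found.
    """
    flagged = {}
    for sql_id, operation, options in plan_rows:
        if (operation, options) in RISKY_STEPS:
            flagged.setdefault(sql_id, []).append(f"{operation} {options}")
    return flagged


# Made-up plan rows for two hypothetical SQL IDs.
rows = [
    ("sql_a", "SELECT STATEMENT", None),
    ("sql_a", "TABLE ACCESS", "FULL"),
    ("sql_b", "INDEX", "UNIQUE SCAN"),
    ("sql_a", "MERGE JOIN", "CARTESIAN"),
]
print(flag_risky_queries(rows))
```

Whether a flagged step is really a problem depends on table size and query intent, so a report like this is a starting point for review rather than a verdict.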

Sample Query

    SELECT DISTINCT obj.object_name
                  , program_line#
                  , cpu_time / 1000000 AS cpu_time_in_secs
                  , elapsed_time / 1000000 AS elapsed_time_in_secs
                  , buffer_gets
                  , disk_reads
                  , end_of_fetch_count AS rows_fetched_per_execution
                  , executions
                  , optimizer_cost
                  , vsql.sql_id
                  , NULL operation
                  , NULL options
                  , vsplan.plan_hash_value
                  , sql_text
                  , ( SUBSTR ( DBMS_LOB.SUBSTR ( sql_fulltext, 4000, 1 )
                             , INSTR ( DBMS_LOB.SUBSTR ( sql_fulltext, 4000, 1 ), '/*' ) + 2
                             , INSTR ( DBMS_LOB.SUBSTR ( sql_fulltext, 4000, 1 ), '*/' )
                               - INSTR ( DBMS_LOB.SUBSTR ( sql_fulltext, 4000, 1 ), '/*' ) - 2
                             )
                    ) release_string
                  , NULL error_flag
                  , NULL error_message
                  , loads
                  , first_load_time
                  , user_io_wait_time
                  , rows_processed
                  , last_load_time
                  , vsql.module
                  , fnd_global.user_id created_by
                  , SYSDATE creation_date
                  , fnd_global.user_id last_updated_by
                  , SYSDATE last_update_date
                  , '-1' last_update_login
    FROM            gv$sql vsql
                  , gv$sql_plan vsplan
                  , all_objects obj
    WHERE           1 = 1
    AND             vsql.sql_id = vsplan.sql_id
    AND             vsql.sql_fulltext NOT LIKE '%sql_text%'
    AND             vsql.program_id = obj.object_id(+)
    AND             vsql.sql_fulltext LIKE lv_pattern
    AND             TO_DATE ( last_load_time, 'YYYY-MM-DD/HH24:MI:SS' ) >= ( SYSDATE - p_hours_from / 24 )
    AND             NVL(obj.object_name, '-') = NVL(p_program, NVL(obj.object_name, '-'))
    AND             NVL(vsql.module, '-') = NVL(p_module, NVL(vsql.module, '-'))
    ORDER BY        last_load_time DESC;


Use fnd_file.put_line(fnd_file.output, 'String..') to print the SQL plan to the concurrent program's output file.
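The release_string column in the sample query pulls out the text of the first code comment in the statement, which is where the Release Name/User Story tag is expected to live. The same logic can be sketched in Python as follows. The function name is hypothetical and the example tag is made up. One deliberate difference is noted in the comments: this sketch returns None when no opening marker is present.

```python
def extract_release_tag(sql_fulltext):
    """Return the text between the first '/*' and the first '*/', or None.

    Mirrors SUBSTR(text, INSTR(text, '/*') + 2,
                   INSTR(text, '*/') - INSTR(text, '/*') - 2)
    over the first 4000 characters. Like the SQL, it does not trim
    surrounding whitespace.
    """
    text = sql_fulltext[:4000]  # same limit as DBMS_LOB.SUBSTR(..., 4000, 1)
    start = text.find("/*")
    end = text.find("*/")
    # Assumption of this sketch: a missing opening marker yields None,
    # whereas the SQL expression would still compute a substring.
    if start == -1:
        return None
    length = end - start - 2
    if length <= 0:  # no closing marker, or an empty comment
        return None
    return text[start + 2:start + 2 + length]


print(repr(extract_release_tag("SELECT /* R1 US42 */ 1 FROM dual")))
```

Because only the first comment is read, the tag needs to be the first comment in the statement for the categorization by User Story or Release to work.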

Sample Output

7 Database Optimization Hacks for Web Developers

Optimizing your database comes with great rewards. Higher performance and increased query efficiency are just a few examples of these benefits.

However, the means aren’t always straightforward and may require changing the rules altogether within a developer team. Furthermore, the examples listed here might not work for your database, based on the system you use. In that case, try to follow the core principle and translate the action into the means your system allows.

SQL Plan Management With TiDB: A Review

The SQL execution plan is a critical factor that affects SQL statement performance. The stability of the SQL execution plans heavily influences the entire cluster's performance. If a relational database's optimizer chooses a wrong execution plan for a query, it usually has a negative impact on the system; for example, operations might take longer to respond or the database might get overloaded.

We've done a lot of work on optimizer stability for TiDB. However, SQL execution plans are affected by various factors, so a plan may change unexpectedly and, as a result, the execution time might become too long.

Index Advisor Service for Couchbase N1QL (SQL for JSON)

Couchbase N1QL is a SQL-like language for JSON data. To retrieve and manipulate JSON data effectively, we need appropriate indexes. The rules for creating these indexes can be read here. But that involves too much reading, hence we now have an Index Advisor service that accepts a query and gives out an index recommendation that would meet the expectations of the Couchbase query engine — all without downloading the latest Couchbase server.

This service will provide index recommendations to help DBAs, developers, and architects optimize query performance and meet the SLAs.