Friends, PG Day'16 Russia successfully completed, we have translated the spirit and already thinking of how to make the upcoming event even more interesting and useful for you. We continue to publish interesting, in our opinion, of Postgres and communicate with you in the comments. Today we present the translation of the article by Pat Shaughnessy about what are indexes in PostgreSQL.

We all know that indexes are one of the most powerful and important functions of relational database servers. How to find the value? To create the index. What you need to remember to do when you merge two tables? To create the index. How to speed up an SQL query, which slowly began to work? To create an index.

But what are these indexes? And they accelerate the database search? To find out, I decided to read the source code of the PostgreSQL database server in C and see how it looks for the index for the simple text value. I expected to find sophisticated algorithms and efficient data structures. And I found them. Today I will show you how to look inside the indices of Postgres, and explain how they work.

What I didn't expect to find it — which I first discovered by reading the source code of Postgres is the theory of computer science in the basis of what he's doing. Reading the source code of Postgres turned to return to school and study that subject that I never have enough time in his youth. Comments on C inside Postgres explain not only what he does, but why.

the

Sequential scan: mindless search

When we left the crew of the Nautilus, they were exhausted and almost fainted: algorithm sequential scan of Postgres mindlessly dodged all the records in the table users! Remember my previous post, in which we performed this simple SQL query to find the Captain Nemo

Postgres processed, analyzed and planned the query. Then ExecSeqScanthe C function inside Postgres that performs site plan Sequential scan (SEQSCAN), quickly found Captain Nemo:

But then Postgres inexplicably continued to loop through the entire table of users, comparing each name with “Captain Nemo,” although we already found what I was looking for!

Imagine that our table would have millions of records, the process would take a very long time. Of course, we could avoid this by removing the sort and rewrite our query so that only the first name, but the deeper problem lies in the inefficiency of the way that Postgres searches for our target string. Use sequential scan to compare each value in the table of users with “Captain Nemo” is a slow, inefficient and depends on the random order in which names appear in the table. What are we doing wrong? There must be a better way!

The answer is simple: we forgot to create the index. Let's do it now.

the

creating the index

To create an index is very simple – you only have to run this command:

As Ruby developers, we certainly would have used instead, migration add_index ActiveRecord, which will execute the same command CREATE INDEX “under the hood”. When we re-run our select query, Postgres, as usual, will create a tree plan, but this time it will be a little different:

Creating the index solved our performance issue, but has left us with many interesting unanswered questions:

the

that represents the index in Postgres? the

If I could climb into a database Postgres and get a better look at the index, what would it have been like?

How the index speeds up the search?

Let's try to answer these questions by studying the source code of Postgres.

the

What is the index in Postgres?

We can start with learning documentation for the CREATE INDEX command.

Here you can see all the options we can use to create an index such as UNIQUE and CONCURRENTLY. Please note that there is another option, such as a method USING. He said Postgres what kind of index we need. Below on the same page there is information about method argument to a keyword USING:

It turns out that Postgres implementeret four different types of indices in [Prim. lane: now more article was written before there was BRIN, and other new options indexes]. You can use them for different data types in different situations. Since we did not specify USING our index_users_on_name index is “btree” (or B-Tree) index, the default type.

This is our first hint: the index of Postgres is a B-Tree. But what is B-Tree? Where can we find him? Inside Postgres, of course! Let's search the source code of Postgres C files containing “btree:”

The key outcome highlighted in bold: “./backend/access/nbtree.” Inside this directory there is a README file. Let's read:

miraculously, this README file was a detailed document on 12 pages! The source code of Postgres contains not only the useful and interesting comments to the code, but also documentation about the theory and implementation of database servers. Read and understand the code in open source is often difficult and scary, but not in PostgreSQL. The developers behind Postgresol, made a huge effort that we could understand their work.

The name of the document README — “Btree Indexing” — confirms that the directory contains C code that implements B-Tree indexes Postgres. But the first sentence is of even more interest: this is a reference to scientific work, which explains what a B-Tree, and how the indexes Postgres: Efficient Locking for Concurrent Operations on B-Trees, authored by Lehman (Lehman) and Yao (Yao).

We will try to understand what a B-Tree with the help of this scientific work.

the

How does a B-Tree index?

The work of Lehman and Yao explains the innovative changes they made to the algorithm B-Tree in 1981. We'll talk about that later. But they start with a simple introduction to data structure B-Tree, which was invented 9 years earlier — in 1972. One of their diagrams shows an example of a simple B-Tree:

The term B-Tree is an abbreviation for "balanced tree" — "balanced tree". The algorithm makes the search simple and fast. For example, if we wanted to find the value of 53 in this example, we would begin with the root node containing the value of 40 is:

We compare our search value 53 value that we found in the tree node. 53 is more or less than 40? Because 53 is greater than 40, we follow the pointer down and to the right. If we were looking for 29, we would go down to the left. Pointers to the right lead to high values, and the left — to the smaller ones.
Following the down pointer to the next child node of the tree we find a node that contains 2 values:

This time we compare 53 from 47 and 62 and find that 47 < 53 < 62. Please note that the values in the tree node are sorted, so this would be easy. Now we follow down through the main index.

Here we have another tree node, with three values:

After reviewing the sorted list of numbers, we find 51 < 53 < 56 and follow down to the second of the four pointers.

Finally, we come to the node a leaf of the tree:

And here it is, our search value 53!

Algorithm B-Tree accelerates the search because:

the

it sorts the values (called keys) within each site;
it is balanced: the keys evenly among the nodes, minimizing the number of transitions from one node to another. Each pointer leads to the child node that contains approximately the same number of keys as each successive child node.

the

looks Like the index of Postgres?

Lehman and Yao drew this diagram for more than 30 years ago. What does it have to do with how Postgres working today? Amazingly, the index index_users_on_name that we created earlier, is very similar to the diagram from scientific work: we have created in 2014 index, which looks exactly the same as the chart from 1981!

When we executed the command CREATE INDEX, Postgres kept all the names from our users table in B-Tree. They were the keys of the tree. Here is a node within a B-Tree index Postgres:

Each entry in the index consists of C structures called IndexTupleData followed by a bitmap (bitmap) and a value. Postgres uses a bitmap to record whether any attributes of the index key is NULL, to save space. The actual values are in the index after the bitmap.

Let's take a closer look at the structure IndexTupleData:

The image above shows that each structure IndexTupleData contains:

the

t_tid: this is a pointer to either another index tuple or record in the database. Note that this is not a pointer to physical memory. Instead, it contains numbers that Postgres can use to find the value on the memory pages.
t_info: this contains information about the elements of the index, for example, how many meanings can it have and can they be equal to null.

To understand this better, let's show some recordings from our index index_users_on_name:

I replaced the value with some names from my users table. The top node of the tree contains the key “Dr. Edna Kunde” and “Julius Powlowski” and the lower “Julius Powlowski” and “Juston Quitzon”. Please note that unlike the chart of Lehman and Yao, Postgres repeats the parent key in each child node. Here “Julius Powlowski” is the key at the top and child nodes. Pointer t_tid at the top node refers to the same named Julius in the bottom node.

To learn more about how Postgres stores the key values in the node B-Tree, refer to the header file itup.h:

the

IndexTupleData

view on postgresql.org

the

Search site B-Tree that contains Captain Nemo

Let's return to our original SELECT query:

This is the root node B-Tree index_users_on_name. I put the tree on its side, so the names got. You can see 4 and one NULL. Postgres created this root node when I created index_users_on_name. Note that in addition to the first NULL, which denotes the beginning index, the remaining 4 values more or less evenly distributed in alphabetical order.

Recall that B-Tree is a balanced tree. In this example, B-Tree has 5 child nodes:

the

the names alphabetically until Dr. Edna Kunde;
names that are among Dr. Edna Kunde and Julius Powlowski;
names located between Julius Powlowski and Monte Nicolas;
, etc.

Since we are looking for the name of Captain Nemo, Postgres clicks on the upper arrow to the right. This is because Captain Nemo alphabetically comes before Dr. Edna Kunde:

As you can see, right Postgres found node B-Tree, which contains Captain Nemo. For my test I added in the user table 1000 names. This child node B-Tree includes about 200 names (240, to be exact). So the algorithm of B-Tree is significantly narrowed down our search to Postgres.

To learn more about the specific algorithm used by Postgres to find target node B-Tree of all nodes read function _bt_search.

the

_bt_search

view on postgresql.org

the

the Search for Captain Nemo inside a specific node B-Tree

Now that Postgres narrowed the space for the search to node B-Tree, containing about 200 names, he still needs to find one of Captain Nemo. How is he going to do? Whether he will use a sequential scan to that shortened list?

No. To search for a key value inside of a tree node Postgres switches to use a binary search algorithm. He begins to compare the key located at the 50% position in the tree node with “Captain Nemo”

Since Captain Nemo alphabetically comes after Breana Witting, Postgres jumps to the position of 75% and conducts another comparison:

At this time, Captain Nemo goes to Curtis Wolf, so Postgres returns a bit ago. After another few iterations (Postgres took 8 comparisons to find the Captain Nemo in my example), Postgres finally finds what we were looking for.

To learn more about how Postgres searches for a value in a particular node B-Tree, read the _bt_binsrch function:

the

_bt_binsrch

view on postgresql.org

the

Still have much to learn

I don't have enough space in this post to tell you about all the exciting details regarding B-Tree indexes database or the internals of Postgres... maybe I should write a book Postgres under the microscope :) in the meantime here's some fun facts from the theory which you can read in Efficient Locking for Concurrent Operations on B-Trees or other scholarly work referenced.

the

Deletion from B-Tree: reverse operation is also interesting. When the key is removed from the node, Postgres consolidates sibling nodes, if possible, removing the key from their parent node. This operation can also be recursive.
B-Link - Tree: the Work of Lehman and Yao talks about innovation, which they researched in relation to concurrency and locking when multiple threads use the same B-Tree. As you remember, the code and algorithms of Postgres needs to be multithreaded because many clients can simultaneously search for and modify the same index. Adding even one pointer from each node of B-Tree to the next child node, the so – called "right-arrow" — one thread can search in the tree, even if the second stream divides the nodes without blocking the entire code:

the

don't be afraid to explore the invisible part of the iceberg

Professor Aronnax risked his life and career to find the elusive Nautilus and join Captain Nemo in a long series of stunning underwater adventure. We should do the same: don't be afraid to dive under water to the depth of the tools, languages and technologies that you use every day. You can know many things about Postgres, but do you know how it works from the inside? Take a look inside, and before you know it, you will find yourself in an underwater adventure.

Studying computer science at work behind our app is not just entertainment, but an important part of the development process of the developer. Tools for software development improves every year, writing web sites and mobile apps, it's easy, but we must not lose sight of the fundamental science upon which we depend. We all stand on the shoulders of giants – people like Lehman and Yao, as well as open source developers, who used their theory to create Postgres. Do not take the tools you use every day for granted, and learn their device. You will become wiser as a developer and will find ideas and knowledge, which was not even suspected.

Article based on information from habrahabr.ru

Поиск по этому блогу

computer express

Informatics for index Postgres

Sequential scan: mindless search

creating the index

What is the index in Postgres?

How does a B-Tree index?

looks Like the index of Postgres?

IndexTupleData

Search site B-Tree that contains Captain Nemo

_bt_search

the Search for Captain Nemo inside a specific node B-Tree

_bt_binsrch

Still have much to learn

don't be afraid to explore the invisible part of the iceberg

Комментарии

Отправить комментарий

Популярные сообщения из этого блога

A Bunch Of MODx Revolution + LiveStreet

Monitoring PostgreSQL with Zabbix

MODX Revolution meets Fenom