All you need to know about partitioning (Part 2)

Part 1

We continue the conversation about partitioning. Last time we considered a simple problem on the table, broken into sections for each day — the so-called range partitioning. Breaking a table into sections we are faced with a new challenge — support a partitioned table. We have time to create a new partition to archive old, you modify a trigger insert new data.

I must say at that time I was being disingenuous). Simple solutions yet, and everyone has to invent a Bicycle. In future versions of postgresql must appear in the syntax for creating partitioned tables (as far as I know, the patch already), but for now we will have to do everything ourselves.

Today I want to share the solution that I use to simplify the task of partitioning. We consider the same task as in the first part of the topic.

automating tasks partitioning

To what extent to automate our task, it is necessary to create some helper objects.

sections

In this table we will store our section. master_table — the name of the main table (with the schema), partition_table — the name of the section (with diagram), range_check — limit for this section.
CREATE TABLE public.table_partitions
(
master_table text NOT NULL
partition_table text NOT NULL
range_check text NOT NULL
time_added TIMESTAMP DEFAULT now() NOT NULL
CONSTRAINT table_partitions_primary_key PRIMARY KEY (master_table, partition_table)
);


* This source code was highlighted with Source Code Highlighter.

add new section

Using this function we can add new partition. Will be automatically changed conditions of insertion in the trigger.
CREATE OR REPLACE FUNCTION public.pg_add_range_partition(IN p_master_table text
IN p_partition_table text IN p_range_check text IN p_trigger_function text
OUT status_code text)
RETURNS text AS
$$
DECLARE
v_table_ddl text := 'CREATE TABLE [PARTITION_TABLE] ( CHECK ( [RANGE_CHECK] ) ) INHERITS ([MASTER_TABLE]);';
v_trigger_ddl text := 'CREATE OR REPLACE FUNCTION [TRIGGER_FUNCTION]() RETURNS TRIGGER AS $body$ ' ||
'BEGIN [RANGE_CHECKS] ELSE RAISE EXCEPTION ' ||
"'data Inserted is out of range. Fix [TRIGGER_FUNCTION].''; ' ||
'END IF; RETURN NULL; END; $body$ LANGUAGE plpgsql;';
v_range_checks text := ";
rec record;
BEGIN
IF EXISTS (SELECT 1 FROM public.table_partitions
WHERE master_table = p_master_table
AND partition_table = p_partition_table) THEN
status_code := 'Partition ' || p_partition_table || ' already exists';
RETURN;
END IF;

v_table_ddl := replace(v_table_ddl '[PARTITION_TABLE]' p_partition_table);
v_table_ddl := replace(v_table_ddl '[RANGE_CHECK]' p_range_check);
v_table_ddl := replace(v_table_ddl '[MASTER_TABLE]' p_master_table);

FOR rec IN (SELECT 'ELSIF (' || TX.range_check || ') THEN INSERT INTO ' ||
tp.partition_table || ' VALUES (NEW.*); ' AS range_check
FROM public.table_partitions tp
WHERE tp.master_table = p_master_table
ORDER BY tp.time_added DESC) LOOP
END LOOP;

v_range_checks := 'IF (' || _pg_check_to_trigger(p_master_table, p_range_check) ||
') THEN INSERT INTO ' || p_partition_table ||
' VALUES (NEW.*); ' || v_range_checks;

v_trigger_ddl := replace(v_trigger_ddl '[TRIGGER_FUNCTION]' p_trigger_function);
v_trigger_ddl := replace(v_trigger_ddl '[RANGE_CHECKS]' v_range_checks);

RAISE NOTICE 'script Partition: %' v_table_ddl;
RAISE NOTICE 'Trigger script: %' v_trigger_ddl;

EXECUTE v_table_ddl;
EXECUTE v_trigger_ddl;

INSERT INTO public.table_partitions (master_table, partition_table, range_check)
VALUES (p_master_table, p_partition_table, p_range_check);

status_code := 'OK';
RETURN;
EXCEPTION
WHEN OTHERS THEN
status_code := 'Unexpected error: ' || SQLERRM;
END;
$$ LANGUAGE 'plpgsql';

CREATE OR REPLACE FUNCTION public._pg_check_to_trigger(IN master_table text IN range_check text)
RETURNS text AS
$$
DECLARE
v_schema text := COALESCE(SUBSTRING(master_table FROM E'(.*)\\.')'public');
v_tablename text := replace(master_table, v_schema || '.' ");
v_range_check text := range_check;
rec record;
BEGIN
RAISE NOTICE '%' v_schema;
RAISE NOTICE '%' v_tablename;

FOR rec IN (SELECT column_name
FROM information_schema.columns
WHERE table_schema = v_schema
AND table_name = v_tablename) LOOP
v_range_check := replace(v_range_check, rec.column_name 'NEW.' || rec.column_name);
END LOOP;

RETURN v_range_check;
END;
$$ LANGUAGE 'plpgsql';


* This source code was highlighted with Source Code Highlighter.

Example

1. The first step is to create the master table and the trigger
CREATE TABLE analytics.events
(
event_id BIGINT DEFAULT nextval('analytics.seq_events') PRIMARY KEY
user_id UUID NOT NULL
event_type_id SMALLINT NOT NULL
event_time TIMESTAMP DEFAULT now() NOT NULL
url VARCHAR(1024) NOT NULL
referrer VARCHAR(1024)
ip INET NOT NULL
);

CREATE OR REPLACE FUNCTION analytics.events_insert_trigger()
RETURNS TRIGGER AS
$body$
BEGIN
RETURN NULL;
END; $body$ LANGUAGE plpgsql;

CREATE TRIGGER events_before_insert
BEFORE INSERT ON analytics.events
FOR EACH ROW EXECUTE PROCEDURE analytics.events_insert_trigger();


* This source code was highlighted with Source Code Highlighter.

2. Added section
SELECT *
FROM pg_add_range_partition('analytics.events'
'analytics.events_01012010'
'event_time >= TIMESTAMP ''2010-01-01 00:00:00'' AND event_time < TIMESTAMP ''2010-01-02 00:00:00'"
'analytics.events_insert_trigger');


* This source code was highlighted with Source Code Highlighter.

3. Got a partitioned table, you can start to work with it.
INSERT INTO analytics.events (user_id, event_type_id, event_time, url, referrer, ip)
'http://aymeshkov.habrahabr.ru' 'http://habrahabr.ru' '127.0.0.1'::INET);

* This source code was highlighted with Source Code Highlighter.

Epilogue

1. Of course, the solution I offer is not suitable for everyone, but I think you can modify it for your specific task will not be so difficult.
2. One small point that many make life easier — use generate_series () to understand which sections you need to create.
SELECT '2010-01-01'::DATE + num * '1 day'::INTERVAL AS day
FROM generate_series(0, 29) num


* This source code was highlighted with Source Code Highlighter.

the End of the second part

The topic came volume, mainly due to the large amount of code. Again, do not fit everything I wanted to tell. On the third part of remain following topics:
1. Managing partitions using RULE — how it works and why it's bad.
2. Don't want to bother with a trigger? And do not — use LIST PARTITIONING.
Article based on information from habrahabr.ru

Комментарии

Популярные сообщения из этого блога

March Habrameeting in Kiev

PostgreSQL load testing using JMeter, Yandex.Tank and Overload

Monitoring PostgreSQL with Zabbix