表分区

针对大的数据仓库事实表,往往我们可以通过对表进行分区的方式来把一个很大的表拆分成多个子表。这样的话,有两个好处:

  • 查询优化器可以针对分区表进行优化,如果查询只设计到某些分区,则查询计划只需要扫描这些分区,从而加速查询
  • 如果我们按照日期进行分区的话,我们可以简单的加入分区和删除过期的分区。

分区表的目的是为了提高查询性能,但并非所有的表都适合做分区。只有大型事实表、经常使用特定>条件[日期、地区等]查询数据、维护历史数据、数据分布均匀的情况,可通过分区策略大大提升使用>性能。且分区数不宜过多,否则会影响维护和管理工作速度。

OushuDB支持基于Range和List的两种分区方式:

  • Range分区:表被一个或者多个关键列分区成”范围”,这些范围在不同的分区里没有重叠。 比如,依据数值范围进行分区,比如日期,价格等。
  • List分区:依据一个值的列表进行分区,比如地区等。

创建新的分区表使用CREATE TABLE … PATITION BY语法实现,创建分区表步骤如下:

  1. 确定分区类型:范围分区RANGE或列表分区RANGE。
  2. 选定分区字段:范围分区多为日期、数值类型字段;列表分区多为枚举类型的特定列表;
  3. 确定分区级别:OushuDB支持子分区SUBPARTITION,可以按照不同维度实现多级分区。

定义Range分区

比如,假设我们为一个巨大的冰激凌公司构造数据库。该公司每天都测量最高温度,以及每个地区的冰激凌销售。且大多数查询都只会访问最后一周,最后一个月或者最后一个季度的数据,因为这个表的主要用途是为管理准备在线报告。为了减少需要存储的旧数据,我们决定只保留最近三年的有用数据。在每个月的开头,我们都会删除最旧的一个月的数据。因此,我们会选定日期字段作按月的范围分区:

CREATE TABLE measurement (
  city_id int not null,
  logdate date not null,
  peaktemp int,
  unitsales int
)
DISTRIBUTED BY (city_id)
PARTITION BY RANGE (logdate)
(START (date '2020-01-01') INCLUSIVE END (date '2020-02-01') EXCLUSIVE EVERY (INTERVAL '1 month'));

又例如:

# 创建一个sales表,按照date列Range分区,从2008年到2009年每月创建一个分区

postgres=# CREATE TABLE sales (id int, date date, amt decimal(10,2))
PARTITION BY RANGE (date)
( START (date '2008-01-01') INCLUSIVE
   END (date '2009-01-01') EXCLUSIVE
   EVERY (INTERVAL '1 month') );

查看创建的表信息,d+给出该表的所有信息

postgres=# \d+ sales
              Append-Only Table "public.sales"
 Column |     Type      | Modifiers | Storage | Description
--------+---------------+-----------+---------+-------------
 id     | integer       |           | plain   |
 date   | date          |           | plain   |
 amt    | numeric(10,2) |           | main    |
Compression Type: None
Compression Level: 0
Block Size: 32768
Checksum: f
Child tables: sales_1_prt_1,
              sales_1_prt_10,
              sales_1_prt_11,
              sales_1_prt_12,
              sales_1_prt_2,
              sales_1_prt_3,
              sales_1_prt_4,
              sales_1_prt_5,
              sales_1_prt_6,
              sales_1_prt_7,
              sales_1_prt_8,
              sales_1_prt_9
Has OIDs: no
Options: appendonly=true
Distributed randomly
Partition by: (date)

你也可以显式得声明子分区并指定子表名字。

CREATE TABLE sales_exp (id int, date date, amt decimal(10,2))
PARTITION BY RANGE (date)
( PARTITION Jan08 START (date '2008-01-01') INCLUSIVE ,
  PARTITION Feb08 START (date '2008-02-01') INCLUSIVE ,
  PARTITION Mar08 START (date '2008-03-01') INCLUSIVE ,
  PARTITION Apr08 START (date '2008-04-01') INCLUSIVE ,
  PARTITION May08 START (date '2008-05-01') INCLUSIVE ,
  PARTITION Jun08 START (date '2008-06-01') INCLUSIVE ,
  PARTITION Jul08 START (date '2008-07-01') INCLUSIVE ,
  PARTITION Aug08 START (date '2008-08-01') INCLUSIVE ,
  PARTITION Sep08 START (date '2008-09-01') INCLUSIVE ,
  PARTITION Oct08 START (date '2008-10-01') INCLUSIVE ,
  PARTITION Nov08 START (date '2008-11-01') INCLUSIVE ,
  PARTITION Dec08 START (date '2008-12-01') INCLUSIVE
                   END (date '2009-01-01') EXCLUSIVE );

查看创建的表信息

postgres=# \d+ sales_exp
            Append-Only Table "public.sales_exp"
 Column |     Type      | Modifiers | Storage | Description
--------+---------------+-----------+---------+-------------
 id     | integer       |           | plain   |
 date   | date          |           | plain   |
 amt    | numeric(10,2) |           | main    |
Compression Type: None
Compression Level: 0
Block Size: 32768
Checksum: f
Child tables: sales_exp_1_prt_apr08,
              sales_exp_1_prt_aug08,
              sales_exp_1_prt_dec08,
              sales_exp_1_prt_feb08,
              sales_exp_1_prt_jan08,
              sales_exp_1_prt_jul08,
              sales_exp_1_prt_jun08,
              sales_exp_1_prt_mar08,
              sales_exp_1_prt_may08,
              sales_exp_1_prt_nov08,
              sales_exp_1_prt_oct08,
              sales_exp_1_prt_sep08
Has OIDs: no
Options: appendonly=true
Distributed randomly
Partition by: (date)

下面是另外一个根据Range分区的例子,这次使用的是整型列进行分区。这里面我们添加了一个DEFAULT PARTITION, 在不满足其他分区的条件下,数据会被插入DEFAULT PARTITION。

CREATE TABLE rank (id int, rank int, year int, gender char(1), count int)
PARTITION BY RANGE (year)
( START (2001) END (2008) EVERY (1),
  DEFAULT PARTITION extra );

定义List分区

列表分区的分区字段可以是任意可进行等于比较的数据类型,且允许多列组合进行分区,而范围分区只允许选定一列作分区字段。列表分区时必须显式为每个分区申明分区值,比如还是上述提到的冰淇淋数据表,我们可以按照地区列表对数据进行分区操作:

CREATE TABLE magma_measurement_city (
  city text,
  logdate date,
  peaktemp int,
  unitsales int
) format 'magmaap'
PARTITION BY LIST (city)
( PARTITION beijing VALUES ('BJ'),
PARTITION shanghai VALUES ('SH'),
PARTITION guangzhou VALUES ('GZ'),
PARTITION shenzhen VALUES ('SZ'),
DEFAULT PARTITION other );

又如下面的例子创建了一个基于List的分区表。List分区表可以基于任意支持等值比较的数据类型。对与List分区,你需要 显式的指定所有子分区。

CREATE TABLE rank (id int, rank int, year int, gender char(1), count int )
  PARTITION BY LIST (gender)
  ( PARTITION girls VALUES ('F'),
   PARTITION boys VALUES ('M'),
   DEFAULT PARTITION other );

查看表信息

postgres=# \d+ rank
              Append-Only Table "public.rank"
 Column |     Type     | Modifiers | Storage  | Description
--------+--------------+-----------+----------+-------------
 id     | integer      |           | plain    |
 rank   | integer      |           | plain    |
 year   | integer      |           | plain    |
 gender | character(1) |           | extended |
 count  | integer      |           | plain    |
Compression Type: None
Compression Level: 0
Block Size: 32768
Checksum: f
Child tables: rank_1_prt_boys,
              rank_1_prt_girls,
              rank_1_prt_other
Has OIDs: no
Options: appendonly=true
Distributed randomly
Partition by: (gender)

定义多级分区

多级分区使用PARTITION BY和SUBPARTITION BY实现,使用SUBPARTITION TEMPLATE可以保证所有的子分区都一样,包括后续新增的子分区。还是上述实例,我们可以将范围分区和列表分区进行组合,使用多级分区实现:

CREATE TABLE mult_measurement (
  city text,
  logdate date,
  peaktemp int,
  unitsales int
) WITH (APPENDONLY = true, OIDS = FALSE, ORIENTATION = orc)
PARTITION BY RANGE (logdate)
SUBPARTITION BY LIST (city)
SUBPARTITION TEMPLATE
(SUBPARTITION beijing VALUES ('BJ'),
 SUBPARTITION shanghai VALUES ('SH'),
 SUBPARTITION guangzhou VALUES ('GZ'),
 SUBPARTITION shenzhen VALUES ('SZ'),
 DEFAULT SUBPARTITION other)
(START (date '2020-01-01') INCLUSIVE END (date '2020-02-01') EXCLUSIVE EVERY (INTERVAL '1 month'));

或者按照年、月进行多级分区组合:

CREATE TABLE ym_measurement (
  city_id int not null,
  r_year int,
  r_month int,
  peaktemp int,
  unitsales int
) format 'magmaap'
PARTITION BY RANGE (r_year)
SUBPARTITION BY RANGE (r_month)
SUBPARTITION TEMPLATE
(START (1) END (13) EVERY (1),DEFAULT SUBPARTITION other_months)
(START (2017) END (2020) EVERY (1), DEFAULT PARTITION other_years);

由或对上述三个条件进行组合可创建三级分区:

CREATE TABLE mult_measurement_3 (
  city text,
  r_year int,
  r_month int,
  peaktemp int,
  unitsales int
) with (appendonly=true, orientation=parquet)
PARTITION BY RANGE (r_year)
SUBPARTITION BY RANGE (r_month)
  SUBPARTITION TEMPLATE
    (START (1) END (13) EVERY (1),DEFAULT SUBPARTITION other_months)
    SUBPARTITION BY LIST (city)
      SUBPARTITION TEMPLATE (
        SUBPARTITION beijing VALUES ('BJ'),
        SUBPARTITION shanghai VALUES ('SH'),
        SUBPARTITION guangzhou VALUES ('GZ'),
        SUBPARTITION shenzhen VALUES ('SZ'),
        DEFAULT SUBPARTITION other)
(START (2017) END (2020) EVERY (1), DEFAULT PARTITION other_years);

Note

AO格式,PARQUET格式,ORC格式,MAGMA格式的表都支持分区(包括多级分区)。

Tip

当你在使用多级分区的时候,系统会产生大量的小表,有些表可能没有数据或包含很少数据,这样会对系统元数据管理产生过多压力。 建议不要创建具有过多分区的表。一般限制分区数在100或以内比较合理。

分区现有表

如果你想对一张已有大表数据进行分区,步骤如下:

  1. 按照现有大表结构创建一个新的分区表
  2. 将现有大表中的数据load到新创建的分区表中
  3. 删除原来的大表
  4. 按照原来大表名称重命名分区表名称

假设上述已有表measure,先需要进行分区,操作如下:

CREATE TABLE measurement_cp (like measurement)
PARTITION BY RANGE (logdate)
(START (date '2020-01-01') INCLUSIVE END (date '2020-04-01') EXCLUSIVE EVERY (INTERVAL '1 month'));

INSERT INTO measurement_cp SELECT * FROM measurement;

DROP TABLE measurement;

ALTER TABLE measurement_cp RENAME TO measurement;

ANALYZE measurement;

加载分区表

创建分区表后,主表中没有数据,一般不在这个父表定义任何检查约束,除非你希望约束同样也适用于所有分区。数据会路由保存到各个分区子表中。在多级分区中,只有最底层的子分区才保存数据。

一般使用默认分区接收无法映射到子表分区中的数据,任何与分区检查约束不匹配的数据会被加载到默认分区表中。查询优化器会在运行过程中扫描整个分区表的层次结构,并使用检查约束来确定要扫描哪些子分区。默认分区总会被扫描,所以默认分区中数据多是会影响扫描性能。

当使用COPY或INSERT加载数据到主表时,数据会自动路由保存到正确的分区表中。

查看你的分区设计

你可以通过pg_partitions视图来查看你的分区表设计。例如通过下面的语句可以查看出sales表的分区设计。

postgres=# SELECT partitionboundary, partitiontablename, partitionname, partitionlevel, partitionrank
postgres-# FROM pg_partitions
postgres-# WHERE tablename='sales';
                                           partitionboundary                                          | partitiontablename | partitionname | partitionlevel | partitionrank
------------------------------------------------------------------------------------------------------+--------------------+---------------+----------------+---------------
 START ('2008-01-01'::date) END ('2008-02-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_1      |               |              0 |             1
 START ('2008-02-01'::date) END ('2008-03-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_2      |               |              0 |             2
 START ('2008-03-01'::date) END ('2008-04-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_3      |               |              0 |             3
 START ('2008-04-01'::date) END ('2008-05-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_4      |               |              0 |             4
 START ('2008-05-01'::date) END ('2008-06-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_5      |               |              0 |             5
 START ('2008-06-01'::date) END ('2008-07-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_6      |               |              0 |             6
 START ('2008-07-01'::date) END ('2008-08-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_7      |               |              0 |             7
 START ('2008-08-01'::date) END ('2008-09-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_8      |               |              0 |             8
 START ('2008-09-01'::date) END ('2008-10-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_9      |               |              0 |             9
 START ('2008-10-01'::date) END ('2008-11-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_10     |               |              0 |            10
 START ('2008-11-01'::date) END ('2008-12-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_11     |               |              0 |            11
 START ('2008-12-01'::date) END ('2009-01-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | sales_1_prt_12     |               |              0 |            12
(12 rows)

又例如:

SELECT partitionboundary, partitiontablename, partitionname,partitionlevel, partitionrank FROM pg_partitions WHERE tablename='measurement';

                                        partitionboundary                                           | partitiontablename  | partitionname | partitionlevel | partitionrank
------------------------------------------------------------------------------------------------------+---------------------+---------------+----------------+---------------
 START ('2020-01-01'::date) END ('2020-02-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | measurement_1_prt_1 |               |              0 |             1
 START ('2020-02-01'::date) END ('2020-03-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | measurement_1_prt_2 |               |              0 |             2
 START ('2020-03-01'::date) END ('2020-04-01'::date) EVERY ('1 mon'::interval) WITH (appendonly=true) | measurement_1_prt_3 |               |              0 |             3

通过查询pg_partition_templates查看使用子分区模板创建的子分区信息,例如:

select * from pg_partition_templates where tablename = 'mult_measurement_3';

 schemaname |     tablename      | partitionname | partitiontype | partitionlevel | partitionrank | partitionposition | partitionlistvalues | partitionrangestart | partitionstartinclusive | partitionrangeend | partitionendinclusive | partitioneveryclause | partitionisdefault |          partitionboundary
------------+--------------------+---------------+---------------+----------------+---------------+-------------------+---------------------+---------------------+-------------------------+-------------------+-----------------------+----------------------+--------------------+-------------------------------------
 public     | mult_measurement_3 | other_months  | range         |              1 |             1 |                 1 |                     |                     | f                       |                   | f                     |                      | t                  | DEFAULT SUBPARTITION other_months
 public     | mult_measurement_3 |               | range         |              1 |             2 |                 2 |                     | 1                   | t                       | 2                 | f                     | 1                    | f                  | START (1) END (2) EVERY (1)
 public     | mult_measurement_3 |               | range         |              1 |             3 |                 3 |                     | 2                   | t                       | 3                 | f                     | 1                    | f                  | START (2) END (3) EVERY (1)
 public     | mult_measurement_3 |               | range         |              1 |             4 |                 4 |                     | 3                   | t                       | 4                 | f                     | 1                    | f                  | START (3) END (4) EVERY (1)
 public     | mult_measurement_3 |               | range         |              1 |             5 |                 5 |                     | 4                   | t                       | 5                 | f                     | 1                    | f                  | START (4) END (5) EVERY (1)
 public     | mult_measurement_3 |               | range         |              1 |             6 |                 6 |                     | 5                   | t                       | 6                 | f                     | 1                    | f                  | START (5) END (6) EVERY (1)
 public     | mult_measurement_3 |               | range         |              1 |             7 |                 7 |                     | 6                   | t                       | 7                 | f                     | 1                    | f                  | START (6) END (7) EVERY (1)
 public     | mult_measurement_3 |               | range         |              1 |             8 |                 8 |                     | 7                   | t                       | 8                 | f                     | 1                    | f                  | START (7) END (8) EVERY (1)
 public     | mult_measurement_3 |               | range         |              1 |             9 |                 9 |                     | 8                   | t                       | 9                 | f                     | 1                    | f                  | START (8) END (9) EVERY (1)
 public     | mult_measurement_3 |               | range         |              1 |            10 |                10 |                     | 9                   | t                       | 10                | f                     | 1                    | f                  | START (9) END (10) EVERY (1)
 public     | mult_measurement_3 |               | range         |              1 |            11 |                11 |                     | 10                  | t                       | 11                | f                     | 1                    | f                  | START (10) END (11) EVERY (1)
 public     | mult_measurement_3 |               | range         |              1 |            12 |                12 |                     | 11                  | t                       | 12                | f                     | 1                    | f                  | START (11) END (12) EVERY (1)
 public     | mult_measurement_3 |               | range         |              1 |            13 |                13 |                     | 12                  | t                       | 13                | f                     | 1                    | f                  | START (12) END (13) EVERY (1)
 public     | mult_measurement_3 | beijing       | list          |              2 |               |                 1 | 'BJ'::text          |                     |                         |                   |                       |                      | f                  | SUBPARTITION beijing VALUES('BJ')
 public     | mult_measurement_3 | shanghai      | list          |              2 |               |                 2 | 'SH'::text          |                     |                         |                   |                       |                      | f                  | SUBPARTITION shanghai VALUES('SH')
 public     | mult_measurement_3 | guangzhou     | list          |              2 |               |                 3 | 'GZ'::text          |                     |                         |                   |                       |                      | f                  | SUBPARTITION guangzhou VALUES('GZ')
 public     | mult_measurement_3 | shenzhen      | list          |              2 |               |                 4 | 'SZ'::text          |                     |                         |                   |                       |                      | f                  | SUBPARTITION shenzhen VALUES('SZ')
 public     | mult_measurement_3 | other         | list          |              2 |               |                 5 |                     |                     |                         |                   |                       |                      | t                  | DEFAULT SUBPARTITION other

查看分区表使用的分区键列通过查询pg_partition_columns获取:

select * from pg_partition_columns where tablename = 'measurement';

 schemaname |  tablename  | columnname | partitionlevel | position_in_partition_key
------------+-------------+------------+----------------+---------------------------
 public     | measurement | logdate    |              0 |                         1