Release Notes

Oushu Database 3.1.2 new features (Jun 6, 2018)

Oushu Database 3.1.2 has the following enhancements over Oushu Database 3.1.1:

  • Support the &&, @> and <@ operators on bigint[] in the new executor (see the sketch after this list).
  • Add the hive protocol, which lets HAWQ access Hive tables directly; only the CSV, TEXT and ORC formats are supported.
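
For illustration, a minimal sketch of the array operators (the table and data are hypothetical, using the internal ORC table syntax shown later in these notes):

SET new_executor = on;
CREATE TABLE t_arr (id int, vals bigint[]) format 'orc';
INSERT INTO t_arr VALUES (1, ARRAY[1,2,3]), (2, ARRAY[4,5]);
SELECT id FROM t_arr WHERE vals && ARRAY[2,4]::bigint[];   -- overlaps
SELECT id FROM t_arr WHERE vals @> ARRAY[1,2]::bigint[];   -- contains
SELECT id FROM t_arr WHERE vals <@ ARRAY[1,2,3]::bigint[]; -- is contained by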

The following list gives the operating systems supported by Oushu Database:

OS Version
Redhat/Centos 7.0
Redhat/Centos 7.1
Redhat/Centos 7.2
Redhat/Centos 7.3
Redhat/Centos 7.4

Upgrade from 3.1.1 to 3.1.2

Upgrading from 3.1.1 to 3.1.2 only requires replacing the binaries; there are no catalog or storage format changes. Before upgrading, back up your configuration files, and copy them back after replacing the binaries.

Oushu Database 3.1.1 new features (Mar 28, 2018)

Oushu Database 3.1.1 has the following enhancements over Oushu Database 3.1.0:

The following list gives the operating systems supported by Oushu Database:

OS Version
Redhat/Centos 7.0
Redhat/Centos 7.1
Redhat/Centos 7.2
Redhat/Centos 7.3

Upgrade from 3.1.0 to 3.1.1

Upgrading from 3.1.0 to 3.1.1 only requires replacing the binaries; there are no catalog or storage format changes. Before upgrading, back up your configuration files, and copy them back after replacing the binaries.

Oushu Database 3.1.0 new features (Mar 6, 2018)

Oushu Database 3.1.0 has the following enhancements over Oushu Database 3.0.1:

  • Support limit operator for new executor.
  • Support sort operator for new executor.
  • Support max/min aggregate function for new executor.
  • Add built-in euclidean metric function for new executor.
  • Improve performance for table scan with new executor.
  • Improve performance for aggregate function with new executor.
  • Support dictionary encoding for text column of ORC table.
  • Fix compatibility issues with the Apache ORC format.
  • Support more data types for ORC tables: bool, char, varchar, bytea, date, time, timestamp, array (see the sketch after this list).
  • Support parallel read & write for ORC tables.
  • Support ORC partitioned tables.
  • Key bug fixes:
    • oushu-17 : Coredump when querying a large ORC table with a text column
    • oushu-22 : GROUP BY performance issue
    • oushu-350 : HDFS external table nameservice is case sensitive
    • oushu-374 : Cancelling a query in the new executor has an adverse effect on the next query
    • oushu-421 : Memory leak in the new executor
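
A minimal sketch of the expanded ORC support (the table names are hypothetical, and the ordering of the format and partition clauses is an assumption based on the usual HAWQ/Greenplum syntax):

-- internal ORC table using some of the newly supported types
CREATE TABLE t_types (b bool, c char(2), v varchar(10), d date, ts timestamp, a int[]) format 'orc';

-- partitioned internal ORC table
CREATE TABLE t_part (i int, d date) format 'orc'
PARTITION BY RANGE (d)
(START (date '2018-01-01') INCLUSIVE END (date '2018-07-01') EXCLUSIVE EVERY (INTERVAL '1 month'));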

The following list gives the operating systems supported by Oushu Database:

OS Version
Redhat/Centos 7.0
Redhat/Centos 7.1
Redhat/Centos 7.2
Redhat/Centos 7.3

Upgrade from 3.0.1 to 3.1.0

  1. Stop 3.0.1 cluster on master node
sudo su gpadmin
hawq stop cluster -a
  2. Back up the 3.0.1 configuration on each hawq node
hawq ssh -f hostfile -e 'cp -rf $GPHOME/etc ~/'
  3. Set up the 3.1.0 yum repository on each hawq node

3.1 If the system supports AVX instructions (the avx flag appears in /proc/cpuinfo), configure the following YUM repo:

hawq ssh -f hostfile -e 'sudo wget -O /etc/yum.repos.d/oushu-database.repo http://yum.oushu-tech.com/oushurepo/yumrepo/release/oushu-database/centos7/3.1.0.0/release/oushu-database.repo'
hawq ssh -f hostfile -e 'sudo wget -O /etc/yum.repos.d/oushu-database-utils.repo http://yum.oushu-tech.com/oushurepo/yumrepo/oushu-database-utils/centos7/1.1.0.0/oushu-database-utils.repo'
hawq ssh -f hostfile -e 'sudo yum makecache'

3.2 If the system does not support AVX instructions, configure the following YUM repo:

hawq ssh -f hostfile -e 'sudo wget -O /etc/yum.repos.d/oushu-database.repo http://yum.oushu-tech.com/oushurepo/yumrepo/release/oushu-database/centos7/3.1.0.0/noavx/oushu-database.repo'
hawq ssh -f hostfile -e 'sudo wget -O /etc/yum.repos.d/oushu-database-utils.repo http://yum.oushu-tech.com/oushurepo/yumrepo/oushu-database-utils/centos7/1.1.0.0/oushu-database-utils.repo'
hawq ssh -f hostfile -e 'sudo yum makecache'
  4. Install the 3.1.0 rpm package on each hawq node
hawq ssh -f hostfile -e 'sudo yum remove -y hawq'
sudo yum install -y hawq    # install on the master node first
hawq ssh -f hostfile -e 'sudo yum install -y hawq'
  5. Configure 3.1.0 on each node
hawq ssh -f hostfile -e 'cp -rf ~/etc/* $GPHOME/etc/'
  6. Install the array distance functions (after an upgrade, the array distance functions are only supported by the old executor; to use them with the new executor, reinitialize the cluster)

    1) Configure the cluster to upgrade mode

    hawq start cluster
    hawq config -c upgrade_mode -v on --skipvalidation
    hawq restart cluster -a
    

    2) Install the array distance functions into the pg_catalog.pg_proc table in template1 on the master node. Find the value for $hawq_master_address_port in $GPHOME/etc/hawq-site.xml

    PGOPTIONS='-c gp_session_role=utility' psql -a -p $hawq_master_address_port -d template1 -f $GPHOME/share/postgresql/array_distance_install.sql > array_distance_install.out 2>&1
    

    3) Install the array distance functions into the pg_catalog.pg_proc table in template1 on each segment node. Find the value for $hawq_segment_address_port in $GPHOME/etc/hawq-site.xml

    source /usr/local/hawq/greenplum_path.sh
    PGOPTIONS='-c gp_session_role=utility' psql -a -p $hawq_segment_address_port -d template1 -f $GPHOME/share/postgresql/array_distance_install.sql > array_distance_install.out 2>&1
    

    4) Configure the cluster to normal mode

    hawq config -c upgrade_mode -v off --skipvalidation
    hawq restart cluster -a
    

    5) Get the user database names on the master node

    psql -a -d template1 -c "select datname from pg_database where datname not in ('hcatalog', 'template0', 'template1') order by datname;"
    

    6) Install the array distance functions into the pg_catalog.pg_proc table in each user database on the master node

    psql -a -d $user_database_name -f $GPHOME/share/postgresql/array_distance_install.sql > array_distance_install.out 2>&1
    

    7) Restart cluster on master node

    hawq restart cluster -a
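
    To confirm the installation, list the newly registered functions from the catalog (the LIKE pattern below is an assumption; the actual function names are defined in array_distance_install.sql):

    SELECT proname FROM pg_catalog.pg_proc WHERE proname LIKE '%distance%';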
    

Oushu Database 3.0.1 new features (Oct 30, 2017)

Oushu Database 3.0.1 has the following enhancements over Oushu Database 3.0:

  • Support CSV, TEXT and ORC as internal table formats. The new executor released in 3.0 only supports the ORC format, so internal ORC tables were added to let users use the new executor in more scenarios.
  • Before 3.0.1, an external table created with default parameters was read-only. 3.0.1 changes the default to readable and writable, because more and more users create tables that need to be both readable and writable.
  • The default value of the “new_executor” GUC is changed from “auto” to “on”. With this change, users receive an explicit error when a query on an ORC table uses a function not implemented in the new executor. To fall back to the old executor automatically for unsupported functions, set the GUC to “auto” (see the example after this list).
  • Key bug fixes:
    • oushu-8 : Fix a potential coredump when reading an ORC file with a stripe in dictionary encoding
    • oushu-9 : Fix a wrong-result issue when using the count() function with a scalar parameter
    • oushu-10: Fix a potential coredump when using the avg() function
    • oushu-11: COPY of a UDT reports “type not found” in the segment
    • oushu-12: Fix a potential coredump when reading an ORC file with null values in a string column
    • oushu-13: QD hangs when an error occurs during query dispatch
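
For example, the executor choice can be controlled per session:

SET new_executor = on;   -- the 3.0.1 default: report an error for functions the new executor lacks
SET new_executor = auto; -- fall back to the old executor for unsupported functions
SET new_executor = off;  -- always use the old executor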

The following list gives the operating systems supported by Oushu Database:

OS Version
Redhat/Centos 7.0
Redhat/Centos 7.1
Redhat/Centos 7.2
Redhat/Centos 7.3

Support CSV, TEXT and ORC as internal table formats

This section shows how to use the new formats supported for internal tables.

Create internal ORC, TEXT and CSV tables

CREATE TABLE test_orc  (i int, j float) format 'orc';
CREATE TABLE test_text (i int, j float) format 'text';
CREATE TABLE test_csv  (i int, j float) format 'csv';

The location of the internal table is:

hdfs://hawq_dfs_url/tablespacename/dbname/schemaname/tablename

where hawq_dfs_url is the value specified in hawq-site.xml.

Delete the internal table

drop table test_orc;

After an internal table is dropped, its storage files are deleted automatically. This is different from an external table: dropping an external table only drops the metadata, and the external files are kept.

External tables created with default parameters change from readable to readable and writable

The following example shows the change when creating an external table.

-- create a writable external table
CREATE EXTERNAL TABLE test (i int, j float) location ('hdfs://localhost:9000/hawq_default/test');
INSERT INTO test VALUES (1,0.1);
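
Because the table is now readable as well, the inserted row can be queried back directly:

SELECT * FROM test;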

Upgrade from 3.0 to 3.0.1

Upgrading from 3.0 to 3.0.1 only requires replacing the binaries; there are no catalog or storage format changes. Before upgrading, back up your configuration files, and copy them back after replacing the binaries.

Oushu Database 3.0 new features (Sep 20, 2017)

Oushu Database 3.0 is a major release. Its headline feature is a brand-new executor that is about 10 times faster than the old executor. The new executor makes full use of modern hardware instructions such as SIMD.

The executor in Oushu Database 2.2 is based on the Apache HAWQ executor, which comes from the Greenplum database and, in turn, from PostgreSQL. It is a very mature executor that has been optimized for many years, so achieving a 10x speedup is very challenging; it took the Oushu development team many months to make the breakthrough.

More specifically, Oushu Database 3.0 delivers the following new features:

  • Brand new executor. Basic operators, such as scan, projection, filter and aggregation, are supported. It is about 10x faster than the old executor. Operators that are not yet supported fall back to the old executor.
  • Support ORC as an external storage format. Used together with the new executor, queries run about 10x-50x faster.
  • Support a new pluggable storage framework. Users only need to write several functions to add a new storage format to HAWQ.
  • Support PostGIS, which provides the capability to store and query spatial objects (see the sketch after this list).
  • Key bug fixes.
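
As an illustration of the PostGIS support, using standard PostGIS functions (the table and coordinates here are hypothetical):

CREATE TABLE places (id int, geom geometry(Point, 4326));
INSERT INTO places VALUES (1, ST_SetSRID(ST_MakePoint(116.40, 39.90), 4326));
-- find places within roughly 0.2 degrees of a query point
SELECT id FROM places
WHERE ST_DWithin(geom, ST_SetSRID(ST_MakePoint(116.50, 39.90), 4326), 0.2);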

The following list gives the operating systems supported by Oushu Database:

OS Version
Redhat/Centos 7.0
Redhat/Centos 7.1
Redhat/Centos 7.2
Redhat/Centos 7.3

Try new executor

This section shows the basic usage of the new executor.

Create the lineitem table and generate the TPCH lineitem data:

CREATE EXTERNAL WEB TABLE E_LINEITEM ( L_ORDERKEY    INT8 ,
                                       L_PARTKEY     INTEGER ,
                                       L_SUPPKEY     INTEGER ,
                                       L_LINENUMBER  INTEGER ,
                                       L_QUANTITY    FLOAT ,
                                       L_EXTENDEDPRICE  FLOAT ,
                                       L_DISCOUNT    FLOAT ,
                                       L_TAX         FLOAT ,
                                       L_RETURNFLAG  VARCHAR(1) ,
                                       L_LINESTATUS  VARCHAR(1) ,
                                       L_SHIPDATE    TEXT ,
                                       L_COMMITDATE  TEXT ,
                                       L_RECEIPTDATE TEXT ,
                                       L_SHIPINSTRUCT CHAR(25) ,
                                       L_SHIPMODE     VARCHAR(10) ,
                                       L_COMMENT      VARCHAR(44) )
                                       EXECUTE 'bash -c "$GPHOME/bin/dbgen -b $GPHOME/bin/dists.dss -T L -s 1 -N 6 -n $((GP_SEGMENT_ID + 1))"'
                                       on 6 format 'text' (delimiter '|');

Create an ORC external table. “localhost:9000” is the HDFS URL.

CREATE WRITABLE EXTERNAL TABLE lineitem
                            ( L_ORDERKEY    INT8,
                              L_PARTKEY     INTEGER,
                              L_SUPPKEY     INTEGER,
                              L_LINENUMBER  INTEGER,
                              L_QUANTITY    FLOAT,
                              L_EXTENDEDPRICE  FLOAT,
                              L_DISCOUNT    FLOAT,
                              L_TAX         FLOAT,
                              L_RETURNFLAG  TEXT,
                              L_LINESTATUS  TEXT,
                              L_SHIPDATE    TEXT,
                              L_COMMITDATE  TEXT,
                              L_RECEIPTDATE TEXT,
                              L_SHIPINSTRUCT TEXT,
                              L_SHIPMODE     TEXT,
                              L_COMMENT      TEXT)
LOCATION ('hdfs://localhost:9000/hawq_default/lineitem')
FORMAT 'orc';

Insert data:

INSERT INTO lineitem SELECT * FROM e_lineitem;

Use the new executor. (Note: the new executor currently only supports the ORC format; features that are not supported fall back to the old executor automatically.)

-----Get row count------
postgres=# set new_executor = on;
SET
postgres=# SELECT COUNT(*) FROM lineitem;
count
---------
6001215
(1 row)
Time: 17.006 ms

postgres=# set new_executor = off;
SET
postgres=# SELECT COUNT(*) FROM lineitem;
count
---------
6001215
(1 row)
Time: 213.248 ms

-----TPCH Q1 ------
postgres=# set new_executor = on;
SET
postgres=#  SELECT
        l_returnflag,
        l_linestatus,
        sum(l_quantity)::bigint as sum_qty,
        sum(l_extendedprice)::bigint as sum_base_price,
        sum(l_extendedprice * (1 - l_discount))::bigint as sum_disc_price,
        sum(l_extendedprice * (1 - l_discount) * (1 + l_tax))::bigint as sum_charge,
        avg(l_quantity)::bigint as avg_qty,
        avg(l_extendedprice)::bigint as avg_price,
        avg(l_discount)::bigint as avg_disc,
        count(*) as count_order
FROM
        lineitem
WHERE
        l_shipdate <= '1998-08-20'
GROUP BY
        l_returnflag,
        l_linestatus;

l_returnflag | l_linestatus | sum_qty  | sum_base_price | sum_disc_price |  sum_charge  | avg_qty | avg_price | avg_disc | count_order
--------------+--------------+----------+----------------+----------------+--------------+---------+-----------+----------+-------------
R            | F            | 37719753 |    56568041381 |    53741292685 |  55889619120 |      26 |     38251 |        0 |     1478870
N            | F            |   991417 |     1487504710 |     1413082168 |   1469649223 |      26 |     38284 |        0 |       38854
A            | F            | 37734107 |    56586554401 |    53758257135 |  55909065223 |      26 |     38273 |        0 |     1478493
N            | O            | 73808911 |   110700990251 |   105167436999 | 109377979031 |      26 |     38248 |        0 |     2894278
(4 rows)

Time: 234.376 ms

postgres=# set new_executor = off;
SET
postgres=#  SELECT
        l_returnflag,
        l_linestatus,
        sum(l_quantity)::bigint as sum_qty,
        sum(l_extendedprice)::bigint as sum_base_price,
        sum(l_extendedprice * (1 - l_discount))::bigint as sum_disc_price,
        sum(l_extendedprice * (1 - l_discount) * (1 + l_tax))::bigint as sum_charge,
        avg(l_quantity)::bigint as avg_qty,
        avg(l_extendedprice)::bigint as avg_price,
        avg(l_discount)::bigint as avg_disc,
        count(*) as count_order
FROM
        lineitem
WHERE
        l_shipdate <= '1998-08-20'
GROUP BY
        l_returnflag,
        l_linestatus;

l_returnflag | l_linestatus | sum_qty  | sum_base_price | sum_disc_price |  sum_charge  | avg_qty | avg_price | avg_disc | count_order
--------------+--------------+----------+----------------+----------------+--------------+---------+-----------+----------+-------------
R            | F            | 37719753 |    56568041381 |    53741292685 |  55889619120 |      26 |     38251 |        0 |     1478870
N            | F            |   991417 |     1487504710 |     1413082168 |   1469649223 |      26 |     38284 |        0 |       38854
A            | F            | 37734107 |    56586554401 |    53758257135 |  55909065223 |      26 |     38273 |        0 |     1478493
N            | O            | 73808911 |   110700990251 |   105167436999 | 109377979031 |      26 |     38248 |        0 |     2894278
(4 rows)

Time: 2341.147 ms

Upgrade from 2.2.0 to 3.0.0

  1. Stop 2.2.0 cluster on master node
sudo su gpadmin
hawq stop cluster -a
  2. Back up the 2.2.0 configuration on each hawq node
hawq ssh -f hostfile -e 'cp -rf $GPHOME/etc ~/'
  3. Set up the 3.0.0 yum repository on each hawq node
hawq ssh -f hostfile -e 'sudo wget -O /etc/yum.repos.d/oushu-database.repo http://yum.oushu-tech.com/oushurepo/yumrepo/release/oushu-database/centos7/3.0.0.0/oushu-database.repo'
hawq ssh -f hostfile -e 'sudo wget -O /etc/yum.repos.d/oushu-database-utils.repo http://yum.oushu-tech.com/oushurepo/yumrepo/oushu-database-utils/centos7/1.1.0.0/oushu-database-utils.repo'
hawq ssh -f hostfile -e 'sudo yum makecache'
  4. Install the 3.0.0 rpm package on each hawq node
hawq ssh -f hostfile -e 'sudo yum remove -y hawq'
sudo yum install -y hawq    # install on the master node first
hawq ssh -f hostfile -e 'sudo yum install -y hawq'
  5. Configure 3.0.0 on each node
hawq ssh -f hostfile -e 'cp -rf ~/etc/* $GPHOME/etc/'
  6. Install the ORC format

    1) Configure the cluster to upgrade mode

    hawq start cluster
    hawq config -c upgrade_mode -v on --skipvalidation
    hawq restart cluster -a
    

    2) Install the ORC format into the pg_catalog.pg_proc table in template1 on the master node. Find the value for $hawq_master_address_port in $GPHOME/etc/hawq-site.xml

    PGOPTIONS='-c gp_session_role=utility' psql -a -p $hawq_master_address_port -d template1 -f $GPHOME/share/postgresql/orc_install.sql > orc_install.out 2>&1
    

    3) Install the ORC format into the pg_catalog.pg_proc table in template1 on each segment node. Find the value for $hawq_segment_address_port in $GPHOME/etc/hawq-site.xml

    source /usr/local/hawq/greenplum_path.sh
    PGOPTIONS='-c gp_session_role=utility' psql -a -p $hawq_segment_address_port -d template1 -f $GPHOME/share/postgresql/orc_install.sql > orc_install.out 2>&1
    

    4) Configure the cluster to normal mode

    hawq config -c upgrade_mode -v off --skipvalidation
    hawq restart cluster -a
    

    5) Get the user database names on the master node

    psql -a -d template1 -c "select datname from pg_database where datname not in ('hcatalog', 'template0', 'template1') order by datname;"
    

    6) Install the ORC format into the pg_catalog.pg_proc table in each user database on the master node

    psql -a -d $user_database_name -f $GPHOME/share/postgresql/orc_install.sql > orc_install.out 2>&1
    

    7) Restart cluster on master node

    hawq restart cluster -a
    

Oushu Database 2.2.0 new features (June 19, 2017)

Oushu Database 2.2.0 has the following enhancements over Oushu Database 2.1.1:

  • Support multi-byte delimiters for the CSV and TEXT formats (see the sketch after this list). In Oushu Database 2.1.1, only single-byte delimiters are supported.
  • Key bug fixes
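
A minimal sketch (the path and table name are illustrative), using a two-byte delimiter:

CREATE EXTERNAL TABLE test_md (i int, j float)
LOCATION ('hdfs://localhost:9000/hawq_default/test_md')
FORMAT 'text' (delimiter '||');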

The following list gives the operating systems supported by Oushu Database:

OS Version
Redhat/Centos 7.0
Redhat/Centos 7.1
Redhat/Centos 7.2
Redhat/Centos 7.3

Upgrade from 2.1.0/2.1.1 to 2.2.0

Upgrading from 2.1.0/2.1.1 to 2.2.0 only requires replacing the binaries; there are no catalog or storage format changes. Before upgrading, back up your configuration files, and copy them back after replacing the binaries.

Oushu Database 2.1.1 new features (May 10, 2017)

Oushu Database 2.1.1 is a bug fix release. It fixes several bugs in Pluggable External Storage from the 2.1.0 release.

  • Fix a bug that occurs when non-admin users access Pluggable External Storage.
  • Fix a bug that occurs when HDFS files are empty in Pluggable External Storage.

The following list gives the operating systems supported by Oushu Database:

OS Version
Redhat/Centos 7.0
Redhat/Centos 7.1
Redhat/Centos 7.2
Redhat/Centos 7.3

Oushu Database 2.1.0 new features (March 24, 2017)

The new features in Oushu Database 2.1.0 release are:

  • Pluggable external storage (see the sketch after this list)
    • Replaces the Java-based PXF and is several times faster. The pluggable external storage component is native to HAWQ, so no extra components need to be installed, which greatly simplifies deployment and maintenance.
    • Supports the CSV and TEXT formats
    • Can be used to share data between different clusters, for example, to transfer data between data warehouses and data marts
    • Can be used for high-speed data loading and exporting
    • Can be used to implement efficient backup and restore
    • Can be used to implement pluggable file systems, for example, S3 and Ceph
    • Can be used to implement pluggable file formats, for example, ORC and Parquet
  • Support Redhat/Centos 7.3
  • Key bug fixes
  • HAWQ 2.1.0 is compatible with Ambari 2.4.2 and HDP 2.5.3
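
A sketch of high-speed data export and exchange through pluggable external storage (the path and table names are illustrative):

-- write data out to HDFS as CSV through a writable external table
CREATE WRITABLE EXTERNAL TABLE sales_export (id int, amount float)
LOCATION ('hdfs://localhost:9000/hawq_default/sales_export')
FORMAT 'csv';
INSERT INTO sales_export SELECT id, amount FROM sales;

-- read the same files back, for example from another cluster
CREATE EXTERNAL TABLE sales_import (id int, amount float)
LOCATION ('hdfs://localhost:9000/hawq_default/sales_export')
FORMAT 'csv';
SELECT count(*) FROM sales_import;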

The following list gives the operating systems supported by Oushu Database:

OS Version
Redhat/Centos 7.0
Redhat/Centos 7.1
Redhat/Centos 7.2
Redhat/Centos 7.3

Oushu Database 2.0.1 new features (Jan 20, 2017)

  • Support Oushu Lava Cloud
  • Support Kubernetes 1.4
  • Support simple deployment on Lava and Kubernetes platform
  • Support automatic node recovery

The following list gives the operating systems supported by Oushu Database:

OS Version
Redhat/Centos 7.0
Redhat/Centos 7.1
Redhat/Centos 7.2