
An Overview of Database Diagram Tools Available for PostgreSQL


What’s in a Database Diagram?

“Data are just summaries of thousands of stories – tell a few of those stories to help make the data meaningful” - Chip & Dan Heath

Before you start playing with data that is meaningful in context, make sure it has been collected and filtered by a design that harnesses that meaning.

Modeling and designing a database is a foundational step towards a working database that will back any software exposed to the outside world. Let's be honest, it can get tricky and complex, can't it? The answer is clarity and simplicity, on paper and in thought.

How Does a DBMS Handle This?

Don't you agree that visuals are a great way to bring clarity to a complex design or concept, making things self-explanatory and easy to comprehend?

To save time and reduce complexity, a tool that generates database diagrams at the

  • Conceptual level
  • Logical level and
  • Physical level

is a handy feature for a DBMS to have. The good news is that most DBMSs either have this feature built in or have 3rd party tools to support it.

A DBMS lacking this feature these days, with no support from third party tools either, can hurt some of its audience, if not all of it. Wondering how? Imagine you have been asked to extend the database design of an already built e-commerce web system, or to design a custom payroll system, and, to make it more complex, you have to do it manually. Mapping each table, building relationships, implementing constraints and translating them back to business requirements can easily burn you out.

What About PostgreSQL?

Well, you can do it with PostgreSQL as well, and quite efficiently. PostgreSQL is the world's most advanced open source database. It has a wide variety of 3rd party tools that support data modeling and diagram generation. In fact, depending on the nature of the requirement, the context of use, the operating system you are working on, the formats you want to import & export, and the price you can afford (some are even free), you will definitely find one that suits you well.

Let's have a look at these tools suggested by the PostgreSQL community. It's a long list, so don't be surprised if a tool you use is missing from it.

Aqua Data Studio

Company: AquaFold Inc (IDERA)

License: Proprietary

OS: Windows , Linux, macOS

Last Release: 20.0 (May 2019)

PostgreSQL Version Supported: 10.4, 9.x

Features:

Aqua Data Studio is a database IDE, and its ER modeler has some really nice features up its sleeve. You can reverse engineer an existing database, quickly search entities, annotate, compare ER models, forward engineer a model into the database, import a database into an ER model and generate HTML reports.

You can find a complete list of features supported by Aqua Data Studio for PostgreSQL here.

Dataedo

Company: Dataedo

License: Proprietary, Free (Students and Teachers) , Open Source

OS: Windows , Linux, macOS

PostgreSQL Version Supported: 9.3, 9.4, 9.5, 9.6, 10

Last Release: Dataedo 7.4.2 (May 16th, 2019)

Features:

Dataedo can generate ER diagrams with its simple drag and drop feature. You can select custom columns to include in the displayed diagram. Its cross-platform database server and engine diagram creation is surely an attractive feature. It supports reverse engineering and can document table relationships efficiently even for missing FK constraints. All these features can be handy for querying, reporting services and database development. You can see more of what Dataedo offers for PostgreSQL on their site.

DBSchema

Company: WISE CODERS GmbH

License: Proprietary, Free (Limited to 12 tables with few features)

OS: Windows , Linux, macOS

Last Release: DbSchema 8.1.6 (May 2019)

Features:

DbSchema claims that no database or SQL experience is required to use its visual tool to manage a PostgreSQL database. It offers editing tables in the diagrams. You can create multiple layouts of the schema for a better understanding; these can be saved and edited offline as well. It manages its own version of the schema, which can be deployed on multiple databases. It can print high quality layout images that can be exported to HTML5. Visit them for more PostgreSQL specific details.

DBVisualizer

Company: DbVis Software

License: Proprietary, Free (Limited Feature set)

OS: Windows , Linux, macOS

PostgreSQL Version Supported: PostgreSQL 8.x, 9.x, 10.x, 11.x

Last Release: 10.0.21 (June 2, 2019)

Features:

DBVisualizer has a long and high profile client list. It renders schema diagrams in a graph-like manner that shows all key constraints, using its reference graph feature. It has multiple layouts available for graphs, i.e. Hierarchic, Organic, Orthogonal, or Circular, to view table nodes and relations. These graphs can be zoomed, fitted and animated, and have a navigator pane for navigation. You can export to multiple formats and print as well. The above are a few of its PostgreSQL supported features.

DBWrench

Company: Nizana Systems

License: Proprietary, Free

OS: Windows , Linux, macOS

Last Release: 4.2.1 (May 2019)

Features:

DBWrench, with its forward and reverse engineering capabilities, claims to provide easy-to-manage database development. You can edit database objects directly in the diagrams, so there is no need to navigate between nodes, and the navigator helps you manage large diagrams easily. It supports multiple ER notations and you can also generate HTML documentation of these diagrams.

DeZign

Company: Datanamic

License: Proprietary

OS: Windows

PostgreSQL Version Supported: 7, 8, 9, 10, 11

Last Release: 11.0.3 (April, 2019)

Features:

Like many of their competitors, Datanamic have been in the market for quite some time. Their flagship product DeZign has some great features to boast of. Its easy-to-use data design and modeling features are equipped with forward and reverse engineering techniques. Its data modeling offers a bi-directional compare and synchronize feature for multiple use cases. They support a teamwork feature so that more than one person can work on the same data model. DeZign supports exporting model reports in HTML, Word and PDF formats.

ModelRight

Company: ModelRight

License: Proprietary

OS: Windows

PostgreSQL Version Supported: 11, 10, 9.6, 9.4, 9.0, 8.4, 8.3

Last Release: 4.1 (Dec 2016)

Features:

One of the interesting facts about ModelRight is that it was built by the person who led the software development of the famous ERwin in its earlier years. The UI may not look modern, but the features are worth looking into. You will find most of the features we discussed above, like forward engineering, reverse engineering into the model, model comparison, on-diagram editing, model subsets of a primary model, navigator and zoom, and HTML report generation with model information and images linked to the ER diagrams.

OpenSystemArchitect

Company: System Architect by codebydesign (Community Maintained)

License: Mainly Free (GPL), Proprietary

OS: Windows , Linux, macOS

PostgreSQL Version Supported: 9.x , 10.x

Last Release: 4.0.0 (2018)

Features:

Available under the GPL, Open System Architect is focused on data modeling at the logical and physical levels. It supports ERD validation and documentation. It is free and could be worth trying if you are a student or low on cash.

PgModeler

Company: PgModeler ( Community Maintained)

License: Proprietary(Compiled Binary Packages), Open Source GPLv3 (Compile yourself)

OS: Windows , Linux, macOS

Last Release: 0.9.1 (May, 2018)

Features:

An easy to use, open source and cross-platform data modeler application for PostgreSQL. Notable features include, but are not limited to, its ability to generate a model in four different ways and to generate models from existing databases. To ensure no rules or references are affected during export, it incorporates a model validation feature as well. Like many of the tools above, it can export/import models and generate diffs for model comparison.


PostgreSQL Maestro

Company: SQL Maestro Group

License: Proprietary, Free

OS: Windows

PostgreSQL Version Supported: 7.3 to 10.0

Last Release: 18.12 (Dec, 2018)

Features:

A Windows GUI admin tool for PostgreSQL development and management that supports all PostgreSQL versions from 7.3 to 10. It is an easy database object management system with a handy schema designer feature that can easily reverse engineer a database into an ER diagram. All objects are editable, with support for adding more tables or defining new relationships between them.

SQL Power Architect

Company: SQL Power Group Inc

License: Free GPLv3, Proprietary

OS: Windows , Linux, macOS

PostgreSQL Version Supported: 8.0 or later

Last Release: 1.0.8 (May, 2016)

Features:

A cross-platform data modeling and profiling tool. Among its many visual-specific features are forward/reverse engineering, data model and data structure comparison, automatically generated source-to-target visual mapping reports and an easy-to-navigate tree view. Its database structure snapshot feature allows users to design data models while working offline. Above all, it's free as well.

DBeaver

Company: Community Maintained

License: Apache License (Free), Enterprise Edition

OS: Windows, Linux, MacOS, Solaris

Last Release: 6.0.5 (May 2019)

Features:

DBeaver is a free community database tool and, like all of the above, supports multiple databases alongside PostgreSQL. It has a closed-source enterprise edition that is sold under a commercial license. DBeaver supports automatically generated ER diagrams at schema and table level. Diagrams can be exported in multiple formats. You can create custom ER diagrams as well, which may contain tables from any database.

Vertabelo

Company: Vertabelo

License: Proprietary, Free (for educational purposes)

OS: Web based, OS independent

PostgreSQL Version Supported: 9.x

Last Release:

Features:

An intuitive web based system. Vertabelo allows multiple ways to create a data model, i.e. blank from your DB engine, through an example diagram, or by importing a SQL model or an XML model. It supports multiple databases, so while working on diagrams you have access to the appropriate data types. They have done well to manage large diagrams using table grouping by "subject areas", with a navigation tree containing the list of all subject areas. More cool features include live validation of the model and collaboration, where you can share a read-only version of your model. It supports model versioning and export to multiple formats. For using Vertabelo with PostgreSQL and to learn more of its features, please see the details here.

Toad

Company: Quest

License: Proprietary

OS: Windows

PostgreSQL Version Supported: 8.x, 9.x

Last Release: 6.4 (April,2018)

Features:

Toad Data Modeler by Quest offers data modeling features for logical and physical models. You can build ER models and forward/reverse engineer databases. Model comparison, synchronization and customization are also supported, with detailed reporting. The feature list is even bigger, matching its price. Have a look here.

Valentina Studio

Company: Paradigma Software

License: Proprietary, Free

OS: Windows, Linux, MacOS

PostgreSQL Version Supported: 8.4 onwards

Last Release: 9.2 (June,2019)

Features:

Valentina Studio offers automatic ER diagram generation in its free version; adding custom elements requires an upgrade to the PRO version. Similarly, the free version supports reverse engineering but not forward engineering. It offers native applications and promises to be fast. It is free and offers good features, so it is worth trying.

DataGrip

Company: JetBrains

License: Proprietary, Free (Conditional)

OS: Windows, Linux, MacOS

Last Release: 2019.1.3 (May ,2019)

Features:

A complete database IDE that supports multiple databases in addition to PostgreSQL. DataGrip offers a visual table editor and supports viewing tables and their relationships in an insightful diagram that can later be exported as images. To learn more about how PostgreSQL works with DataGrip, see the details here.

Navicat Data Modeler

Company: PremiumSoft

License: Proprietary

OS: Windows, Linux MacOS

PostgreSQL Version Supported: 7.3, 7.4, 8.0, 8.1, 8.2, 8.3, 8.4, 9.0, 9.1, 9.2, 9.3, 9.4

Last Release: 2.1 ( January , 2019)

Features:

Navicat is a well known name and a widely used database tool. Navicat Data Modeler is a standalone product that lets you create a conceptual business model and convert it into a logical relational model and finally into a physical model (database). You can create or customize ER diagrams from existing databases using its reverse engineering feature, or generate scripts using its forward engineering. It is a user friendly drawing tool for creating database diagrams that can later be exported as PDF or image files. You can sync your models to the cloud for easy access using the integrated Navicat Cloud feature.

Erwin Data Modeler

Company: Erwin Inc

License: Proprietary, Academic (Limited features for students and needs approval)

OS: Windows

PostgreSQL Version Supported: Certified to work with PostgreSQL v9.6.12, v10.7, v11.2

Last Release: erwin DM 2019 R1 (April, 2019)

Features:

Here comes another big player. Erwin has been in the market for quite some time; it is a tested and trusted product and offers a wide variety of database related tools. Erwin Data Modeler is an integrated data modeling tool offering conceptual, logical, physical and dimensional modeling with forward/reverse data engineering, model comparison and export features. It has comprehensive model reporting and a centralized model management and collaboration system.


MySQL Replication with ProxySQL on WHM/cPanel Servers - Part 1


WHM and cPanel are no doubt the most popular hosting control panel for Linux based environments. It supports a number of database backends - MySQL, MariaDB and PostgreSQL - as the application datastore. WHM only supports standalone database setups, and you can either have one deployed locally (the default configuration) or remotely, by integrating with an external database server. The latter is better if you want better load distribution, as WHM/cPanel handles a number of processes and applications like HTTP(S), FTP, DNS, MySQL and so on.

In this blog post, we are going to show you how to integrate an external MySQL replication setup into WHM seamlessly, to improve the database availability and offload the WHM/cPanel hosting server. Hosting providers who run MySQL locally on the WHM server would know how demanding MySQL is in terms of resource utilization (depending on the number of accounts it hosts and the server specs).

MySQL Replication on WHM/cPanel

By default, WHM natively supports both MariaDB and MySQL as standalone setups. You can attach an external MySQL server to WHM, but it will act as a standalone host. Plus, cPanel users have to know the IP address of the MySQL server and manually specify the external host in their web application if this feature is enabled.

In this blog post, we are going to use the ProxySQL UNIX socket file to trick WHM/cPanel into connecting to the external MySQL server via a UNIX socket file. This way, you get the feel of running MySQL locally, so users can use "localhost" with port 3306 as their MySQL database host.

The following diagram illustrates the final architecture:

We have a new WHM server, installed with WHM/cPanel 80.0 (build 18). Then we have another three servers - one for ClusterControl and two for master-slave replication. ProxySQL will be installed on the WHM server itself.

Deploying MySQL Replication

At the time of this writing, we are using WHM 80.0 (build 18) which only supports up to MySQL 5.7 and MariaDB 10.3. In this case, we are going to use MySQL 5.7 from Oracle. We assume you have already installed ClusterControl on the ClusterControl server.

Firstly, set up passwordless SSH from the ClusterControl server to the MySQL replication servers. On the ClusterControl server, do:

$ ssh-copy-id 192.168.0.31
$ ssh-copy-id 192.168.0.32

Make sure you can run the following commands on ClusterControl without a password prompt in between:

$ ssh 192.168.0.31 "sudo ls -al /root"
$ ssh 192.168.0.32 "sudo ls -al /root"

Then go to ClusterControl -> Deploy -> MySQL Replication and enter the required information. On the second step, choose Oracle as the vendor and 5.7 as the database version:

Then, specify the IP address of the master and slave:

Pay attention to the green tick right before the IP address. It means ClusterControl is able to connect to the server and is ready for the next step. Click Deploy to start the deployment. The deployment process should take 15 to 20 minutes.

Deploying ProxySQL on WHM/cPanel

Since we want ProxySQL to take over the default MySQL port 3306, we first have to modify the existing MySQL server installed by WHM to listen on a different port and a different socket file. In /etc/my.cnf, modify the following lines (add them if they do not exist):

socket=/var/lib/mysql/mysql2.sock
port=3307
bind-address=127.0.0.1

Then, restart MySQL server on cPanel server:

$ systemctl restart mysqld

At this point, the local MySQL server should be listening on port 3307, bound to localhost only (we close it off from external access to be more secure). Now we can proceed to deploy ProxySQL on the WHM host, 192.168.0.16, via ClusterControl.

First, set up passwordless SSH from the ClusterControl node to the WHM server on which we want to install ProxySQL:

(clustercontrol)$ ssh-copy-id root@192.168.0.16

Make sure you can run the following command on ClusterControl without a password prompt in between:

(clustercontrol)$ ssh 192.168.0.16 "sudo ls -al /root"

Then, go to ClusterControl -> Manage -> Load Balancers -> ProxySQL -> Deploy ProxySQL and specify the required information:

Fill in all necessary details as highlighted by the arrows in the screenshot above. The server address is the WHM server, 192.168.0.16. The listening port is 3306 on the WHM server, taking over the local MySQL which is already running on port 3307. Further down, we specify the ProxySQL admin and monitoring users' passwords. Then include both MySQL servers in the load balancing set and choose "No" in the Implicit Transactions section. Click Deploy ProxySQL to start the deployment.

Our ProxySQL is now installed and configured with two host groups for MySQL Replication - one for the writer group (hostgroup 10), where all connections will be forwarded to the master, and one for the reader group (hostgroup 20), where all read-only workloads will be balanced across both MySQL servers.

The next step is to grant access to the MySQL root user and import it into ProxySQL. Occasionally, WHM connects to the database via a TCP connection, bypassing the UNIX socket file. In this case, we have to allow MySQL root access from both root@localhost and root@192.168.0.16 (the IP address of the WHM server) in our replication cluster.

Thus, running the following statement on the master server (192.168.0.31) is necessary:

(master)$ mysql -uroot -p
mysql> GRANT ALL PRIVILEGES ON *.* TO root@'192.168.0.16' IDENTIFIED BY 'M6sdk1y3PPk@2' WITH GRANT OPTION;

Then, import 'root'@'localhost' user from our MySQL server into ProxySQL user by going to ClusterControl -> Nodes -> pick the ProxySQL node -> Users -> Import Users. You will be presented with the following dialog:

Tick on the root@localhost checkbox and click Next. In the User Settings page, choose hostgroup 10 as the default hostgroup for the user:

We can then verify if ProxySQL is running correctly on the WHM/cPanel server by using the following command:

$ netstat -tulpn | grep -i proxysql
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN      17306/proxysql
tcp        0      0 0.0.0.0:6032            0.0.0.0:*               LISTEN      17306/proxysql

Port 3306 is what ProxySQL should be listening to accept all MySQL connections. Port 6032 is the ProxySQL admin port, where we will connect to configure and monitor ProxySQL components like users, hostgroups, servers and variables.
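
If you are curious about what ClusterControl has configured, you can connect to that admin port and query ProxySQL's configuration tables directly. Below is a minimal sketch; the proxysql-admin user and its password are the ones you specified in the deployment dialog:

(whm)$ mysql -uproxysql-admin -p -h192.168.0.16 -P6032
mysql> SELECT hostgroup_id, hostname, port, status FROM mysql_servers;
mysql> SELECT rule_id, active, match_digest, destination_hostgroup FROM mysql_query_rules;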

At this point, if you go to ClusterControl -> Topology, you should see the following topology:

Configuring MySQL UNIX Socket

In a Linux environment, if you define the MySQL host as "localhost", the client/application will try to connect via the UNIX socket file, which by default is located at /var/lib/mysql/mysql.sock on the cPanel server. Using the socket file is the recommended way to access MySQL server, because it has less overhead compared to TCP connections. A socket file doesn't actually contain data, it transports it. It is like a local pipe that the server and the clients on the same machine can use to connect and exchange requests and data.

Having said that, if your application connects via "localhost" and port 3306 as the database host and port, it will connect via the socket file. If you use "127.0.0.1" and port 3306, the application will most likely connect to the database via TCP. This behaviour is well explained in the MySQL documentation. In simple words, use the socket file (or "localhost") for local communication and use TCP if the application is connecting remotely.
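
To illustrate the difference, here is a quick sketch of the two connection styles from the WHM host ("someuser" is just a placeholder account):

(whm)$ mysql -h localhost -u someuser -p                          # goes through the UNIX socket file
(whm)$ mysql -h 127.0.0.1 -P 3306 --protocol=TCP -u someuser -p   # forces a TCP connection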

In cPanel, the MySQL socket file is monitored by the cpservd process and will be linked to another socket file if we configure a path different from the default one. For example, suppose we configured a non-default MySQL socket file as in the previous section:

$ cat /etc/my.cnf | grep socket
socket=/var/lib/mysql/mysql2.sock

cPanel, via the cpservd process, would correct this by creating a symlink to the default socket path:

(whm)$ ls -al /var/lib/mysql/mysql.sock
lrwxrwxrwx. 1 root root 34 Jul  4 12:25 /var/lib/mysql/mysql.sock -> ../../../var/lib/mysql/mysql2.sock

To prevent cpservd from automatically re-correcting this (cPanel has a term for this behaviour: "automagically"), we have to disable MySQL monitoring by going to WHM -> Service Manager (we are not going to use the local MySQL anyway) and unchecking the "Monitor" checkbox for MySQL, as shown in the screenshot below:

Save the changes in WHM. It's now safe to remove the default socket file and create a symlink to ProxySQL socket file with the following command:

(whm)$ ln -s /tmp/proxysql.sock /var/lib/mysql/mysql.sock

Verify that the MySQL socket file is now redirected to the ProxySQL socket file:

(whm)$ ls -al /var/lib/mysql/mysql.sock
lrwxrwxrwx. 1 root root 18 Jul  3 12:47 /var/lib/mysql/mysql.sock -> /tmp/proxysql.sock

We also need to change the default login credentials inside /root/.my.cnf as follows:

(whm)$ cat ~/.my.cnf
[client]
#password="T<y4ar&cgjIu"
user=root
password='M6sdk1y3PPk@2'
socket=/var/lib/mysql/mysql.sock

A bit of explanation - the first line that we commented out is the MySQL root password generated by cPanel for the local MySQL server. We are not going to use that, therefore the '#' is at the beginning of the line. Then, we added the MySQL root password for our MySQL replication setup and the UNIX socket path, which is now a symlink to the ProxySQL socket file.

At this point, on the WHM server, you should be able to access our MySQL replication cluster as the root user by simply typing "mysql", for example:

(whm)$ mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 239
Server version: 5.5.30 (ProxySQL)

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Notice the server version is 5.5.30 (ProxySQL). If you can connect as above, we can configure the integration part as described in the next section.

WHM/cPanel Integration

WHM supports a number of database servers, namely MySQL 5.7, MariaDB 10.2 and MariaDB 10.3. Since WHM now only sees ProxySQL, and it is detected as version 5.5.30 (as stated above), WHM will complain about an unsupported MySQL version. You can go to WHM -> SQL Services -> Manage MySQL Profiles and click on the Validate button. You should get a red toaster notification in the top-right corner telling you about this error.

Therefore, we have to change the MySQL version in ProxySQL to the same version as our MySQL replication cluster. You can get this information by running the following statement on the master server:

mysql> SELECT @@version;
+------------+
| @@version  |
+------------+
| 5.7.26-log |
+------------+

Then, log in to the ProxySQL admin console to change the mysql-server_version variable:

(whm)$ mysql -uproxysql-admin -p -h192.168.0.16 -P6032

Use the SET statement as below:

mysql> SET mysql-server_version = '5.7.26';

Then load the variable into runtime and save it into disk to make it persistent:

mysql> LOAD MYSQL VARIABLES TO RUNTIME;
mysql> SAVE MYSQL VARIABLES TO DISK;

Finally, verify the version that ProxySQL will report:

mysql> SHOW VARIABLES LIKE 'mysql-server_version';
+----------------------+--------+
| Variable_name        | Value  |
+----------------------+--------+
| mysql-server_version | 5.7.26 |
+----------------------+--------+

If you try again to connect to the MySQL by running "mysql" command, you should now get "Server version: 5.7.26 (ProxySQL)" in the terminal.

Now we can update the MySQL root password under WHM -> SQL Services -> Manage MySQL Profiles. Edit the localhost profile by changing the Password field at the bottom to the MySQL root password of our replication cluster. Click on the Save button once done. We can then click on "Validate" to verify whether WHM can access our MySQL replication cluster via the ProxySQL service correctly. You should get the following green toaster notification in the top right corner:

If you get the green toaster notification, we can proceed to integrate ProxySQL via cPanel hook.

ProxySQL Integration via cPanel Hook

ProxySQL, as the middleman between WHM and MySQL replication, needs to have a username and password for every MySQL user that will be passing through it. With the current architecture, if one creates a user via the control panel (WHM via account creation, or cPanel via the MySQL Database wizard), WHM will automatically create the user directly in our MySQL replication cluster using root@localhost (which has been imported into ProxySQL beforehand). However, the same database user would not be added into the ProxySQL mysql_users table automatically.

From the end-user perspective, this would not work because all localhost connections at this point should be passed through ProxySQL. We need a way to integrate cPanel with ProxySQL, whereby for any MySQL user related operations performed by WHM and cPanel, ProxySQL must be notified and do the necessary actions to add/remove/update its internal mysql_users table.

The best way to automate and integrate these components is by using the cPanel standardized hook system. Standardized hooks trigger applications when cPanel & WHM performs an action. Use this system to execute custom code (hook action code) to customize how cPanel & WHM functions in specific scenarios (hookable events).

Firstly, create a Perl module file called ProxysqlHook.pm under the /usr/local/cpanel directory:

$ touch /usr/local/cpanel/ProxysqlHook.pm

Then, copy and paste the lines from here. For more info, check out the GitHub repository, ProxySQL cPanel Hook.

Configure the ProxySQL admin interface from line 16 until 19:

my $proxysql_admin_host = '192.168.0.16';
my $proxysql_admin_port = '6032';
my $proxysql_admin_user = 'proxysql-admin';
my $proxysql_admin_pass = 'mys3cr3t';

Now that the hook is in place, we need to register it with the cPanel hook system:

(whm)$ /usr/local/cpanel/bin/manage_hooks add module ProxysqlHook
info [manage_hooks] **** Reading ProxySQL information: Host: 192.168.0.16, Port: 6032, User: proxysql-admin *****
Added hook for Whostmgr::Accounts::Create to hooks registry
Added hook for Whostmgr::Accounts::Remove to hooks registry
Added hook for Cpanel::UAPI::Mysql::create_user to hooks registry
Added hook for Cpanel::Api2::MySQLFE::createdbuser to hooks registry
Added hook for Cpanel::UAPI::Mysql::delete_user to hooks registry
Added hook for Cpanel::Api2::MySQLFE::deletedbuser to hooks registry
Added hook for Cpanel::UAPI::Mysql::set_privileges_on_database to hooks registry
Added hook for Cpanel::Api2::MySQLFE::setdbuserprivileges to hooks registry
Added hook for Cpanel::UAPI::Mysql::rename_user to hooks registry
Added hook for Cpanel::UAPI::Mysql::set_password to hooks registry

From the output above, this module hooks into a number of cPanel and WHM events:

  • Whostmgr::Accounts::Create - WHM -> Account Functions -> Create a New Account
  • Whostmgr::Accounts::Remove - WHM -> Account Functions -> Terminate an Account
  • Cpanel::UAPI::Mysql::create_user - cPanel -> Databases -> MySQL Databases -> Add New User 
  • Cpanel::Api2::MySQLFE::createdbuser - cPanel -> Databases -> MySQL Databases -> Add New User (required for Softaculous integration).
  • Cpanel::UAPI::Mysql::delete_user - cPanel -> Databases -> MySQL Databases -> Delete User
  • Cpanel::Api2::MySQLFE::deletedbuser - cPanel -> Databases -> MySQL Databases -> Delete User (required for Softaculous integration).
  • Cpanel::UAPI::Mysql::set_privileges_on_database - cPanel -> Databases -> MySQL Databases -> Add User To Database
  • Cpanel::Api2::MySQLFE::setdbuserprivileges - cPanel -> Databases -> MySQL Databases -> Add User To Database (required for Softaculous integration).
  • Cpanel::UAPI::Mysql::rename_user - cPanel -> Databases -> MySQL Databases -> Rename User
  • Cpanel::UAPI::Mysql::set_password - cPanel -> Databases -> MySQL Databases -> Change Password

If any of the events above is triggered, the module will execute the necessary actions to sync up the mysql_users table in ProxySQL. It performs the operations via the ProxySQL admin interface running on port 6032 on the WHM server. Thus, it's vital to specify the correct credentials for the ProxySQL admin user to make sure all users will be synced with ProxySQL correctly.
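
Under the hood, that synchronization boils down to a handful of statements against the ProxySQL admin interface. The sketch below is illustrative only (the username and password are placeholders), but it shows the general shape of what the hook does when a new cPanel database user is created:

mysql> INSERT INTO mysql_users (username, password, default_hostgroup, active) VALUES ('severaln_someuser', 'user_password_here', 10, 1);
mysql> LOAD MYSQL USERS TO RUNTIME;
mysql> SAVE MYSQL USERS TO DISK;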

Take note that this module, ProxysqlHook.pm, has never been tested in a real hosting environment (with many accounts and many third-party plugins) and obviously does not cover all MySQL related events within cPanel. We have tested it with the Softaculous free edition and it worked well via the cPanel API2 hooks. Some further modification might be required for full automation.

That's it for now. In the next part, we will look into post-deployment operations and what we can gain with our highly available MySQL server solution for our hosting servers, compared to a standard standalone MySQL setup.

MySQL Replication with ProxySQL on WHM/cPanel Servers - Part 2


In the first part of the series, we showed you how to deploy a MySQL Replication setup with ProxySQL with WHM and cPanel. In this part, we are going to show some post-deployment operations for maintenance, management and failover, as well as advantages over the standalone setup.

MySQL User Management

With this integration enabled, MySQL user management has to be done from WHM or cPanel. Otherwise, the ProxySQL mysql_users table would not be in sync with what is configured on our replication master. Suppose we have already created a user called severaln_user1 (the MySQL username is automatically prefixed by cPanel to comply with MySQL limitations), and we would like to assign it to the database severaln_db1 like below:

The above will result in the following mysql_users table output in ProxySQL:

If you would like to create MySQL resources outside of cPanel, you can use the ClusterControl -> Manage -> Schemas and Users feature and then import the database user into ProxySQL by going to ClusterControl -> Nodes -> pick the ProxySQL node -> Users -> Import Users.

The ProxysqlHook module that we use to sync up ProxySQL users sends its debugging logs to /usr/local/cpanel/logs/error_log. Use this file to inspect and understand what happens behind the scenes. The following lines would appear in the cPanel log file if we installed a web application called Zikula using Softaculous:

[2019-07-08 11:53:41 +0800] info [mysql] Creating MySQL database severaln_ziku703 for user severalnines
[2019-07-08 11:53:41 +0800] info [mysql] Creating MySQL virtual user severaln_ziku703 for user severalnines
[2019-07-08 11:53:41 +0800] info [cpanel] **** Reading ProxySQL information: Host: 192.168.0.16, Port: 6032, User: proxysql-admin *****
[2019-07-08 11:53:41 +0800] info [cpanel] **** Checking if severaln_ziku703 exists inside ProxySQL mysql_users table *****
[2019-07-08 11:53:41 +0800] info [cpanel] **** Inserting severaln_ziku703 into ProxySQL mysql_users table *****
[2019-07-08 11:53:41 +0800] info [cpanel] **** Save and load user into ProxySQL runtime *****
[2019-07-08 11:53:41 +0800] info [cpanel] **** Checking if severaln_ziku703 exists inside ProxySQL mysql_users table *****
[2019-07-08 11:53:41 +0800] info [cpanel] **** Checking if severaln_ziku703 exists inside ProxySQL mysql_users table *****
[2019-07-08 11:53:41 +0800] info [cpanel] **** Updating severaln_ziku703 default schema in ProxySQL mysql_users table *****
[2019-07-08 11:53:41 +0800] info [cpanel] **** Save and load user into ProxySQL runtime *****

You will see some repeated lines like "Checking if {DB user} exists" because WHM creates multiple MySQL user/host entries for every create database user request. In our example, WHM would create these 3 users:

  • severaln_ziku703@localhost
  • severaln_ziku703@'<WHM IP address>'
  • severaln_ziku703@'<WHM FQDN>'

ProxySQL only needs the username, password and default hostgroup information when adding a user. Therefore, the checking lines are there to avoid multiple inserts of the exact same user.

If you would like to modify the module and make some improvements to it, don't forget to re-register the module by running the following command on the WHM server:

(whm)$ /usr/local/cpanel/bin/manage_hooks add module ProxysqlHook

Query Monitoring and Caching

With ProxySQL, you can monitor all queries coming from the application that have passed through it or are passing through it. Standard WHM does not provide this level of detail in MySQL query monitoring. The following shows all MySQL queries that have been captured by ProxySQL:

With ClusterControl, you can easily look up the most repeated queries and cache them via the ProxySQL query cache feature. Use the "Order By" dropdown to sort the queries by "Count Star", hover over the query that you want to cache and click the "Cache Query" button underneath it. The following dialog will appear:

The result set of cached queries will be stored and served by ProxySQL itself, reducing the number of hits to the backend, which will offload your MySQL replication cluster as a whole. The ProxySQL query cache implementation is fundamentally different from the MySQL query cache. It is a time-based cache that expires after a timeout called "Cache TTL". In this configuration, we would like to cache the above query for 5 seconds (5000 ms) before it hits the reader group, where the destination hostgroup is 20.
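
Behind the scenes, ClusterControl translates the "Cache Query" action into a ProxySQL query rule with a cache_ttl value. A quick way to verify it on the admin interface (port 6032), sketched here for illustration, is:

mysql> SELECT rule_id, active, match_digest, cache_ttl, destination_hostgroup FROM mysql_query_rules WHERE cache_ttl IS NOT NULL;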

Read/Write Splitting and Balancing

By listening on the default MySQL port 3306, ProxySQL sort of acts like the MySQL server itself. It speaks MySQL protocols on both the frontend and the backend. The query rules defined by ClusterControl when setting up ProxySQL will automatically split all reads (^SELECT .* in regex language) to hostgroup 20, which is the reader group, while the rest will be forwarded to the writer hostgroup 10, as shown in the following query rules section:
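
As a rough illustration of what such rules look like at the ProxySQL level, the pair below (with made-up rule_ids) sends SELECT ... FOR UPDATE statements to writer hostgroup 10 and every other SELECT to reader hostgroup 20. ClusterControl creates rules of this shape for you, so treat this only as a sketch:

mysql> INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply) VALUES (100, 1, '^SELECT .* FOR UPDATE', 10, 1);
mysql> INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply) VALUES (200, 1, '^SELECT .*', 20, 1);
mysql> LOAD MYSQL QUERY RULES TO RUNTIME;
mysql> SAVE MYSQL QUERY RULES TO DISK;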

With this architecture, you don't have to worry about splitting up read/write queries, as ProxySQL will do the job for you. Users need minimal to no changes to their code, allowing hosting users to use all the applications and features provided by WHM and cPanel natively, just as when connecting to a standalone MySQL setup.

In terms of connection balancing, if you have more than one active node in a particular hostgroup (like reader hostgroup 20 in this example), ProxySQL will automatically spread the load between them based on a number of criteria - weights, replication lag, connections used, overall load and latency. ProxySQL is known to be very good in high concurrency environments, thanks to its advanced connection pooling mechanism. Quoting a ProxySQL blog post, ProxySQL doesn't just implement persistent connections, but also connection multiplexing. In fact, ProxySQL can handle hundreds of thousands of clients, yet forward all their traffic to a few connections to the backend. So ProxySQL can handle N client connections and M backend connections, where N > M (N can even be thousands of times bigger than M).

MySQL Failover and Recovery

With ClusterControl managing the replication cluster, failover is performed automatically if automatic recovery is enabled. In case of a master failure:

  • ClusterControl will detect and verify the master failure via MySQL client, SSH and ping.
  • ClusterControl will wait for 3 seconds before commencing a failover procedure.
  • ClusterControl will promote the most up-to-date slave to become the next master.
  • If the old master comes back online, it will be started as read-only, without participating in the active replication.
  • It's up to users to decide what will happen to the old master. It could be introduced back to the replication chain by using "Rebuild Replication Slave" functionality in ClusterControl.
  • ClusterControl will only attempt to perform the master failover once. If it fails, user intervention is required.

You can monitor the whole failover process under ClusterControl -> Activity -> Jobs -> Failover to a new master as shown below:

During the failover, all connections to the database servers will be queued up in ProxySQL. They won't be terminated until they time out, which is controlled by the mysql-default_query_timeout variable, 86400000 milliseconds or 24 hours by default. At this point the applications would most likely not see errors or failures to the database, but the tradeoff is increased latency, within a configurable threshold.
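
If 24 hours is too long for your workload, the variable can be inspected and lowered via the ProxySQL admin interface; the value below (1 hour) is only an example:

mysql> SELECT * FROM global_variables WHERE variable_name = 'mysql-default_query_timeout';
mysql> SET mysql-default_query_timeout = 3600000;
mysql> LOAD MYSQL VARIABLES TO RUNTIME;
mysql> SAVE MYSQL VARIABLES TO DISK;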

At this point, ClusterControl will present the topology as below:

If we would like to allow the old master to join back into the replication after it is up and available, we need to rebuild it as a slave by going to ClusterControl -> Nodes -> pick the old master -> Rebuild Replication Slave -> pick the new master -> Proceed. Once the rebuilding is complete, you should get the following topology (notice that 192.168.0.32 is the master now):

Server Consolidation and Database Scaling

With this architecture, we can consolidate the many MySQL servers which reside on every WHM server into one single replication setup. You can scale with more database nodes as you grow, or have multiple replication clusters to support all of them, managed by a single ClusterControl server. The following architecture diagram illustrates two WHM servers connected to one single MySQL replication cluster via the ProxySQL socket file:

The above allows us to separate the two most important tiers in our hosting system - the application (front-end) and the database (back-end). As you might know, co-locating MySQL on the WHM server commonly results in resource exhaustion, as MySQL needs a huge upfront RAM allocation to start up and perform well (mostly depending on the innodb_buffer_pool_size variable). Provided the disk space is sufficient, with the above setup you can host more accounts per server, where all the server resources can be utilized by the front-end tier applications.

Scaling up the MySQL replication cluster will be much simpler with a separate tier architecture. If, let's say, the master requires scale-up maintenance (upgrading RAM, hard disk, RAID, NIC), we can switch the master role over to another slave (ClusterControl -> Nodes -> pick a slave -> Promote Slave) and then perform the maintenance task without affecting the MySQL service as a whole. For a scale-out operation (adding more slaves), you can perform the staging directly from any active slave without even affecting the master. With ClusterControl, you can even stage a new slave from an existing MySQL backup (PITR-compatible only):

Rebuilding a slave from a backup will not bring an additional burden to the master. ClusterControl will copy the selected backup file from the ClusterControl server to the target node and perform the restoration there. Once done, the node will connect to the master, start retrieving all the missing transactions since the restore time and catch up with the master. While it is lagging, ProxySQL will not include the node in the load balancing set until the replication lag is less than 10 seconds (configurable when adding the server entry to the mysql_servers table via the ProxySQL admin interface).
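
That threshold lives in the max_replication_lag column of the mysql_servers table. A minimal sketch of adjusting it for the reader hostgroup via the admin interface (the value 10 matches what was described above):

mysql> UPDATE mysql_servers SET max_replication_lag = 10 WHERE hostgroup_id = 20;
mysql> LOAD MYSQL SERVERS TO RUNTIME;
mysql> SAVE MYSQL SERVERS TO DISK;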

Final Thoughts

ProxySQL extends the capabilities of WHM/cPanel in managing MySQL Replication. With ClusterControl managing your replication cluster, all the complex tasks involved in managing it are now easier than ever before.

Deploying & Configuring MongoDB Shards with Ansible


Database systems work better when the workload is distributed among a number of running instances, or rather when the data is partitioned in a manageable way. MongoDB utilizes sharding such that data in a given database is grouped according to some key. Sharding enhances horizontal scaling, which consequently results in better performance and increased reliability. In general, MongoDB offers horizontal and vertical scaling, as opposed to SQL DBMSs such as MySQL, which only promote vertical scaling.

MongoDB has a looser consistency model, whereby a document in a collection may have an additional key that is absent from other documents in the same collection.

Sharding

Sharding is basically partitioning data into separate chunks and then assigning ranges of chunks to different shard servers. A shard key, which is often a field that is present in all the documents in the database to be sharded, is used to group the data. Sharding works hand in hand with replication to increase read throughput by ensuring a distributed workload among a number of servers rather than depending on a single server. Besides, replication ensures copies of the written data are available.

Let's say we have 120 docs in a collection; this data can be sharded such that we have 3 replica sets, each with 40 docs, as depicted in the configuration setup below. If two clients send requests, one to fetch a document at index 35 and the other a document at index 92, the requests are received by the query router (a mongos process), which in turn contacts the configuration node that keeps a record of how the ranges of chunks are distributed among the shards. When the specified document identity is found, it is then fetched from the associated shard. In the example above, the first client's document will be fetched from Shard A and the second client's document from Shard C. In general, there will be a distributed workload, which is what we call horizontal scaling.

For the given shards, if the size of a collection in a shard exceeds the chunk_size, the collection will be split and balanced across the shards automatically using the defined shard key. In the deployment setup for the example below, we will need 3 replica sets, each with a primary and some secondaries. The primary nodes act as the sharding servers too.
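
For reference, once such a cluster is up, enabling sharding for a database and sharding a collection on a chosen key is done from the mongos shell. The database, collection and key names below are purely illustrative:

mongos> sh.enableSharding("mydb")
mongos> sh.shardCollection("mydb.docs", { user_id: 1 })
mongos> sh.status()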

The minimum recommended configuration for a MongoDB production deployment will be at least three shard servers each with a replica set. For best performance, the mongos servers are deployed on separate servers while the configuration nodes are integrated with the shards.

Deploying MongoDB Shards with Ansible

Configuring the shards and replica sets of a cluster separately is a cumbersome undertaking, hence we resort to simple tools like Ansible to achieve the required results with a lot of ease. Playbooks are used to write the required configurations and tasks that Ansible will execute.

The systematic playbook process should be:

  1. Install mongo base packages (no-server, pymongo and command line interface)
  2. Install mongodb server. Follow this guide to get started.
  3. Set up mongod instances and their corresponding replica sets.
  4. Configure and set up the config servers
  5. Configure and set up the Mongos routing service.
  6. Add the shards to your cluster.

The top-level playbook should look like this:

- name: install mongo base packages
  include: mongod.yml
  tags:
  - mongod

- name: configure config server
  include: configServer.yml
  when: inventory_hostname in groups['mongoc-servers']
  tags:
  - cs

- name: configure mongos server
  include: configMongos.yml
  when: inventory_hostname in groups['mongos-servers']
  tags:
  - mongos

- name: add shards
  include: addShards.yml
  when: inventory_hostname in groups['mongos-servers']
  tags:
  - mongos
  - shards

We can save the file above as mongodbCluster.yml.
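
The playbook above references inventory groups such as mongoc-servers and mongos-servers, so a matching hosts file is needed. A possible sketch, with purely illustrative hostnames, could be:

[mongod-servers]
mongodb1.example.net
mongodb2.example.net
mongodb3.example.net

[mongoc-servers]
mongoc1.example.net

[mongos-servers]
mongos1.example.net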


A simple mongodb.yml file will look like:

---
- hosts: ansible-test
  remote_user: root
  become: yes
  tasks:
  - name: Import the public key used by the package management system
    apt_key: keyserver=hkp://keyserver.ubuntu.com:80 id=7F0CEB10 state=present
  - name: Add MongoDB repository
    apt_repository: repo='deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' state=present
  - name: install mongodb
    apt: pkg=mongodb-org state=latest update_cache=yes
    notify:
    - start mongodb
  handlers:
    - name: start mongodb
      service: name=mongod state=started

In addition to the general parameters required in the deployment of a replica set, we need these two more in order to add the shards:

  • shard: null by default. This is a shard connection string, which should be in the format <replicaset>/host:port, for example replica0/siteurl1.com:27017.
  • state: present by default, which dictates that the shard should exist; otherwise one can set it to absent.

After deploying a replica set as explained in this blog, you can proceed to add the shards.

# add a replicaset shard named replica0 with a member running on port 27017 on mongodb0.example.net
- mongodb_shard:
    login_user: admin
    login_password: root
    shard: "replica0/mongodb1.example.net:27017"
    state: present

# add a standalone mongod shard running on port 27018 of mongodb2.example.net
- mongodb_shard:
    login_user: admin
    login_password: root
    shard: "mongodb2.example.net:27018"
    state: present

# Single node shard running on localhost
- name: Ensure shard replica0 exists
  mongodb_shard:
    login_user: admin
    login_password: root
    shard: "replica0/localhost:3001"
    state: present

# Single node shard running on localhost
- name: Ensure shard replica0 exists
  mongodb_shard:
    login_user: admin
    login_password: root
    shard: "replica0/localhost:3002"
    state: present

After setting up all these configurations, we run the playbook with the command:

ansible-playbook -i hosts mongodbCluster.yml

Once the playbook completes, we can log into any of the mongos servers and issue the command sh.status(). If the output is something like below, the shards have been deployed. Besides, you can check the result of the mongodb_shard task to see whether it reports success.

mongos> sh.status()
    --- Sharding Status --- 
      sharding version: { "_id" : 1, "version" : 3 }
      shards:
        {  "_id" : "shardA",  "host" : "locahhost1/web2:2017,locahhost3:2017" }
        {  "_id" : "shardB",  "host" : "locahhost3/web2:2018,locahhost3:2019" }
{  "_id" : "shardC",  "host" : "locahhost3/web2:2019,locahhost3:2019" }

    databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }

To remove a shard called replica0:

- mongodb_shard:
    login_user: admin
    login_password: root
    shard: replica0
    state: absent

Conclusion

Ansible has played a major role in making the deployment process easy, since we only need to define the tasks that need to be executed. Imagine, for example, that you had 40 replica set members and you needed to add shards to each. Going the normal way would take you ages and is prone to a lot of human error. With Ansible you just define these tasks in a simple file called a playbook, and Ansible takes care of them when the file is executed.

An Introduction to Data Lakes


The increase of unstructured data is a severe challenge for enterprises. Over the past decade, we have observed the rapid growth of data being produced and innovative changes to the way information is processed.

With the increasing number of portable devices, we can see the expansion of various data formats like binary data (images, audio/video), CSV, logs, XML, JSON, or unstructured data (emails, documents), which are challenging for the database systems we knew.

Moreover, maintaining data flows from each of the various data access points causes trouble for commonly used data warehouses based on relational database systems.

It's quite common that with rapid application development, companies may not even have a plan for how the data will be processed, but they have a strong intent to use it at some point. While it's possible to store unstructured data in an RDBMS, storing it can be costly and complicated.

So how can we address all these problems?

In this blog, we would like to introduce you to the basics of the interesting "Data Lake" concept, which may help you address the challenges mentioned. We hope that this article will help you get familiar with unstructured big data, especially if your main focus so far has been on relational database systems.

What is a Data Lake?

A data lake is an abstract idea. In October of 2010, James Dixon, founder of Pentaho (now Hitachi Vantara), came up with the term "Data Lake." In describing his concept, he said:

"If you think of a Data Mart as a store of bottled water, cleaned and packaged and structured for easy Consumption, the Data Lake is a large body of water in a more natural state. The contents of the Data Lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples."

From a more technical point of view, we can imagine a data lake as a repository that stores a vast amount of unprocessed data in its original format. While the hierarchical data warehouse systems store information in tables, a data lake uses flat architecture to store data. Each element in the “repository” has a unique identifier assigned and is marked with a set of metadata tags. When a business query arises, the catalog can be searched for specific information, and then a smaller, separated collection of data can be analyzed to help solve a particular problem.

The idea is simple: instead of placing data in a purpose-built data store, you move it into a data lake in its original format. This eliminates the upfront costs of data ingestion, like transformation. Once data is placed into the lake, it's available for analysis by everyone in the organization.

The data lake metaphor was developed because 'lakes' are a great concept to explain one of the basic principles of big data: the need to collect all data and detect exceptions, trends, and patterns using analytics and machine learning. One of the basic principles of data science is that the more data you get, the better your data model will ultimately be. With access to all the data, you can model using the entire data set instead of just a sample, which reduces the number of false positives you might get.

Data Lake vs. Data Warehouse

Data lakes and data warehouses are both widely used for storing "big data", but they are not interchangeable terms. A data lake is a vast pool of raw data, the purpose of which is not yet defined, while a data warehouse is a repository for structured, filtered data that has already been processed for a specific purpose.

Raw data is data that has not yet been processed for a purpose. The most significant difference between data lakes and data warehouses is the diverse structure of raw vs. processed data. Data lakes essentially store raw, unprocessed data, while data warehouses store processed and refined data.

Processed data is raw data that has been put to a particular use. Since data warehouses only house processed data, all of the data in a data warehouse has been used for a special purpose within the organization. This means that storage space is not wasted on data that may never be used.

Hadoop is Not a Replacement for MySQL

Data lakes are designed to handle large volumes of data and a large number of concurrent clients. They really shine when you operate them at scale. If the storage requirements of an application do not call for a distributed database, Hadoop is not a replacement for MySQL. Such solutions are complementary, and since Hadoop MapReduce requires specific data processing knowledge, some companies try to combine the best of SQL and Hadoop.

Open Source Data Lake On-Prem Platform With SQL

SQL on Hadoop opens the door for Hadoop to gain popularity in business, because corporations do not have to invest in highly qualified personnel writing, e.g., scripts in Java.

Apache Hive has long been offering a structured, SQL-like language for Hadoop. However, commercially available alternatives provide greater efficiency and are continually getting faster (Cloudera Impala, Presto DB, Spark SQL).

An interesting idea here is to reuse your existing database clusters and combine them into a single system. In November 2013 Facebook introduced Presto. Presto allows direct connection to various data sources, including Hive, Cassandra, relational databases or even proprietary data stores, so the data is queried where it lives.

Data Lake in the Cloud

Cloud providers are constantly increasing the range of services they offer and big data processing seems to be in the center of their focus. Let’s take a look at the three main cloud providers and how they support the idea of a data lake.

AWS S3 (Simple Storage Service)

AWS Cloud offers many "built-in services" which can be used to create your data lake. The core service is Amazon S3, which provides the data storage; besides it we can find:

  • ETL platform AWS Glue,
  • search and analytics platform Elasticsearch,
  • Amazon Athena, a query service to analyze data in Amazon S3 using standard SQL,
  • others which can also be used to create a data lake (Amazon API Gateway, Amazon Cognito, Active Directory, IAM roles, Amazon CloudWatch).
Amazon AWS example Data Lake architecture. Source: https://aws.amazon.com/solutions/data-lake-solution/

As mentioned, AWS S3 is the main component of an AWS data lake. It provides unlimited storage, as well as archival storage, including the functionality to replicate data to more than one geographic region.

With advanced S3 functionality such as "lifecycle management", data can be archived to lower cost long-term storage and can be expired after a user-defined length of time, to manage your use of the service and the associated costs automatically.

Massive-scale object stores such as AWS S3 (Simple Storage Service) provide effectively infinite storage capacity for your data, including backups, which could simply be another type of data that you may want to store in your data lake.
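
As a rough illustration of lifecycle management, a rule like the following (illustrative rule name, prefix and day counts) transitions objects to Glacier after 90 days and expires them after a year; it could be applied with the aws s3api put-bucket-lifecycle-configuration command:

{
  "Rules": [
    {
      "ID": "archive-then-expire",
      "Filter": { "Prefix": "" },
      "Status": "Enabled",
      "Transitions": [ { "Days": 90, "StorageClass": "GLACIER" } ],
      "Expiration": { "Days": 365 }
    }
  ]
}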

Google Cloud

The equivalent of Amazon S3 in GCP is Google Cloud Storage. Similar to S3, it offers unlimited capacity and "several nines" of annual durability (99.999999999%). Google heavily promotes its Machine Learning capabilities. While it has fewer built-in services than Microsoft and AWS, it still may be a good option, especially if you already have data processing pipelines in place.

Google Cloud Data Lake Architecture. Source: https://cloud.google.com/solutions/build-a-data-lake-on-gcp

Microsoft Azure ADLS Gen 2

Microsoft offers dedicated object storage for data lakes. It seems to be a mature solution with support for various components, either created by Microsoft or from well-known services (HDInsight, Spark, U-SQL, WebHDFS).

In 2018 Microsoft introduced its second generation of data lake implementation, called ADLS Gen2. ADLS Gen2 is the combination of the Microsoft Data Lake solutions and Blob storage. ADLS Gen2 has all the features of both, which means it has limitless storage capacity, supports all Blob tiers, Azure Active Directory integration, a hierarchical file system, and read-access geo-redundant storage.

That's all for part one. In the next section, we will show you how you can build your data lake using existing open source databases and SQL.

An introduction to Full Text Search in MariaDB


Databases are intended to store and query data efficiently. The problem is that there are many different types of data we can store: numbers, strings, JSON, geometrical data. Databases use different methods to store different types of data - table structures, indexes. The same way of storing and querying data is not always efficient for all of its types, which makes it quite hard to use a one-size-fits-all solution. As a result, databases try to use different approaches for different data types. For example, in MySQL or MariaDB we have a generic, well-performing solution like InnoDB, which works fine in the majority of cases, but we also have separate functions to work with JSON data, separate spatial indexes to speed up querying geometric data, and fulltext indexes to help with text data. In this blog, we will take a look at how MariaDB can be used to work with full text data.

Regular B+Tree indexes in InnoDB can also be used to speed up searches on text data. The main issue is that, due to their structure and nature, they can only help with searches for the leftmost prefix. It is also expensive to index large volumes of text (which, given the leftmost prefix limitation, doesn’t really make sense anyway). Why? Let’s take a look at a simple example. We have the following sentence:

“The quick brown fox jumps over the lazy dog”

Using regular indexes in InnoDB we can index the full sentence:

“The quick brown fox jumps over the lazy dog”

The point is, when looking for this data, we have to look up the full leftmost prefix. So a query like:

SELECT text FROM mytable WHERE sentence LIKE “The quick brown fox jumps”;

Will benefit from this index but a query like:

SELECT text FROM mytable WHERE sentence LIKE “quick brown fox jumps”;

Will not. There’s no entry in the index that starts with ‘quick’. There’s an entry in the index that contains ‘quick’, but it starts with ‘The’, thus it cannot be used. As a result, it is virtually impossible to efficiently query text data using B+Tree indexes. Luckily, both MyISAM and InnoDB implement FULLTEXT indexes, which can be used to actually work with text data in MariaDB. The syntax is slightly different than with regular SELECTs, so let’s take a look at what we can do with them. As for the data, we used a random index file from a dump of the Wikipedia database. The data structure is as below:

617:11539268:Arthur Hamerschlag
617:11539269:Rooster Cogburn (character)
617:11539275:Membership function
617:11539282:Secondarily Generalized Tonic-Clonic Seizures
617:11539283:Corporate Challenge
617:11539285:Perimeter Mall
617:11539286:1994 St. Louis Cardinals season

As a result, we created a table with two BIGINT columns and one VARCHAR column.

MariaDB [(none)]> CREATE TABLE ft_data.ft_table (c1 BIGINT, c2 BIGINT, c3 VARCHAR(255), PRIMARY KEY (c1, c2));

Afterwards we loaded the data:

MariaDB [ft_data]> LOAD DATA INFILE '/vagrant/enwiki-20190620-pages-articles-multistream-index17.txt-p11539268p13039268' IGNORE INTO  TABLE ft_table COLUMNS TERMINATED BY ':';
MariaDB [ft_data]> ALTER TABLE ft_table ADD FULLTEXT INDEX idx_ft (c3);
Query OK, 0 rows affected (5.497 sec)
Records: 0  Duplicates: 0  Warnings: 0

We also created the FULLTEXT index. As you can see, the syntax is similar to a regular index; we just had to pass the index type, as it defaults to B+Tree. Then we were ready to run some queries.

MariaDB [ft_data]> SELECT * FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('Starship');
+-----------+----------+------------------------------------+
| c1        | c2       | c3                                 |
+-----------+----------+------------------------------------+
| 119794610 | 12007923 | Starship Troopers 3                |
| 250627749 | 12479782 | Miranda class starship (Star Trek) |
| 250971304 | 12481409 | Starship Hospital                  |
| 253430758 | 12489743 | Starship Children's Hospital       |
+-----------+----------+------------------------------------+
4 rows in set (0.009 sec)

As you can see, the syntax of the SELECT is slightly different than what we are used to. For fulltext search you should use the MATCH() … AGAINST () syntax, where in MATCH() you pass the column or columns you want to search and in AGAINST() you pass a comma-delimited list of keywords. You can see from the output that by default the search is case insensitive and it searches the whole string, not just the beginning as with B+Tree indexes. Let’s compare how it looks if we add a normal index on the ‘c3’ column - FULLTEXT and B+Tree indexes can coexist on the same column without any problems. Which one is used is decided based on the SELECT syntax.
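
MATCH() … AGAINST() can also be placed in the column list, where it returns a relevance score, so the same search can, for example, be ordered from the most to the least relevant row:

MariaDB [ft_data]> SELECT c3, MATCH(c3) AGAINST ('Starship') AS score FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('Starship') ORDER BY score DESC;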

MariaDB [ft_data]> ALTER TABLE ft_data.ft_table ADD INDEX idx_c3 (c3);
Query OK, 0 rows affected (1.884 sec)
Records: 0  Duplicates: 0  Warnings: 0

After the index has been created, let’s take a look at the search output:

MariaDB [ft_data]> SELECT * FROM ft_data.ft_table WHERE c3 LIKE 'Starship%';
+-----------+----------+------------------------------+
| c1        | c2       | c3                           |
+-----------+----------+------------------------------+
| 253430758 | 12489743 | Starship Children's Hospital |
| 250971304 | 12481409 | Starship Hospital            |
| 119794610 | 12007923 | Starship Troopers 3          |
+-----------+----------+------------------------------+
3 rows in set (0.001 sec)

As you can see, our query returned only three rows. This is expected, as we are looking only for rows which start with the string ‘Starship’.

MariaDB [ft_data]> EXPLAIN SELECT * FROM ft_data.ft_table WHERE c3 LIKE 'Starship%'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ft_table
         type: range
possible_keys: idx_c3,idx_ft
          key: idx_c3
      key_len: 103
          ref: NULL
         rows: 3
        Extra: Using where; Using index
1 row in set (0.000 sec)

When we check the EXPLAIN output, we can see that the index has been used to look up the data. But what if we want to look for all the rows which contain the string ‘Starship’, no matter whether it is at the beginning or not? We have to write the following query:

MariaDB [ft_data]> SELECT * FROM ft_data.ft_table WHERE c3 LIKE '%Starship%';
+-----------+----------+------------------------------------+
| c1        | c2       | c3                                 |
+-----------+----------+------------------------------------+
| 250627749 | 12479782 | Miranda class starship (Star Trek) |
| 253430758 | 12489743 | Starship Children's Hospital       |
| 250971304 | 12481409 | Starship Hospital                  |
| 119794610 | 12007923 | Starship Troopers 3                |
+-----------+----------+------------------------------------+
4 rows in set (0.084 sec)

The output matches what we got from the fulltext search.

MariaDB [ft_data]> EXPLAIN SELECT * FROM ft_data.ft_table WHERE c3 LIKE '%Starship%'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ft_table
         type: index
possible_keys: NULL
          key: idx_c3
      key_len: 103
          ref: NULL
         rows: 473367
        Extra: Using where; Using index
1 row in set (0.000 sec)

The EXPLAIN is different though - as you can see, it still uses the index, but this time it does a full index scan. That is possible because we indexed the full c3 column, so all the data is available in the index. An index scan will result in random reads from the table, but for such a small table MariaDB decided it is more efficient than reading the whole table. Please note the execution time: 0.084s for our regular SELECT. Compared to the fulltext query, it is bad:

MariaDB [ft_data]> SELECT * FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('Starship');
+-----------+----------+------------------------------------+
| c1        | c2       | c3                                 |
+-----------+----------+------------------------------------+
| 119794610 | 12007923 | Starship Troopers 3                |
| 250627749 | 12479782 | Miranda class starship (Star Trek) |
| 250971304 | 12481409 | Starship Hospital                  |
| 253430758 | 12489743 | Starship Children's Hospital       |
+-----------+----------+------------------------------------+
4 rows in set (0.001 sec)

As you can see, the query which uses the FULLTEXT index took 0.001s to execute. We are talking about orders of magnitude difference here.

MariaDB [ft_data]> EXPLAIN SELECT * FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('Starship')\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ft_table
         type: fulltext
possible_keys: idx_ft
          key: idx_ft
      key_len: 0
          ref:
         rows: 1
        Extra: Using where
1 row in set (0.000 sec)

Here’s what the EXPLAIN output looks like for the query using the FULLTEXT index - that fact is indicated by type: fulltext.

Fulltext queries also have some other features. It is possible, for example, to return rows which might be relevant to the search term. MariaDB looks at the words found in the most relevant rows returned for your search term and then runs an additional search using those words as well.

MariaDB [(none)]> SELECT * FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('Starship');
+-----------+----------+------------------------------------+
| c1        | c2       | c3                                 |
+-----------+----------+------------------------------------+
| 119794610 | 12007923 | Starship Troopers 3                |
| 250627749 | 12479782 | Miranda class starship (Star Trek) |
| 250971304 | 12481409 | Starship Hospital                  |
| 253430758 | 12489743 | Starship Children's Hospital       |
+-----------+----------+------------------------------------+
4 rows in set (0.001 sec)

In our case, the word ‘Starship’ can be related to words like ‘Troopers’, ‘class’, ‘Star Trek’, ‘Hospital’ etc. To use this feature, we should run the query with the “WITH QUERY EXPANSION” modifier:

MariaDB [(none)]> SELECT * FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('Starship' WITH QUERY EXPANSION) LIMIT 10;
+-----------+----------+-------------------------------------+
| c1        | c2       | c3                                  |
+-----------+----------+-------------------------------------+
| 250627749 | 12479782 | Miranda class starship (Star Trek)  |
| 119794610 | 12007923 | Starship Troopers 3                 |
| 253430758 | 12489743 | Starship Children's Hospital        |
| 250971304 | 12481409 | Starship Hospital                   |
| 277700214 | 12573467 | Star ship troopers                  |
|  86748633 | 11886457 | Troopers Drum and Bugle Corps       |
| 255120817 | 12495666 | Casper Troopers                     |
| 396408580 | 13014545 | Battle Android Troopers             |
|  12453401 | 11585248 | Star trek tos                       |
|  21380240 | 11622781 | Who Mourns for Adonais? (Star Trek) |
+-----------+----------+-------------------------------------+
10 rows in set (0.002 sec)

The output contained a large number of rows, but this sample is enough to see how it works. The query returned rows like:

“Troopers Drum and Bugle Corps”

“Battle Android Troopers”

Those are based on the search for the word ‘Troopers’. It also returned rows with strings like:

“Star trek tos”

“Who Mourns for Adonais? (Star Trek)”

Which, obviously, are based on the lookup for the phrase ‘Star Trek’.

If you need more control over the terms you want to search for, you can use “IN BOOLEAN MODE”. It allows you to use additional operators. The full list is in the documentation; we’ll show just a couple of examples.

Let’s say we want to search not just for the word ‘Star’ but also for other words which start with the string ‘Star’:

MariaDB [(none)]> SELECT * FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('Star*' IN BOOLEAN MODE) LIMIT 10;
+----------+----------+---------------------------------------------------+
| c1       | c2       | c3                                                |
+----------+----------+---------------------------------------------------+
| 20014704 | 11614055 | Ringo Starr and His third All-Starr Band-Volume 1 |
|   154810 | 11539775 | Rough blazing star                                |
|   154810 | 11539787 | Great blazing star                                |
|   234851 | 11540119 | Mary Star of the Sea High School                  |
|   325782 | 11540427 | HMS Starfish (19S)                                |
|   598616 | 11541589 | Dwarf (star)                                      |
|  1951655 | 11545092 | Yellow starthistle                                |
|  2963775 | 11548654 | Hydrogenated starch hydrolysates                  |
|  3248823 | 11549445 | Starbooty                                         |
|  3993625 | 11553042 | Harvest of Stars                                  |
+----------+----------+---------------------------------------------------+
10 rows in set (0.001 sec)

As you can see, in the output we have rows that contain strings like ‘Stars’, ‘Starfish’ or ‘starch’.

Another use case for BOOLEAN mode: let’s say we want to search for rows which are relevant to the House of Representatives in Pennsylvania. If we run a regular query, we will get results somehow related to any of those words:

MariaDB [ft_data]> SELECT COUNT(*) FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('House, Representatives, Pennsylvania');
+----------+
| COUNT(*) |
+----------+
|     1529 |
+----------+
1 row in set (0.005 sec)
MariaDB [ft_data]> SELECT * FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('House, Representatives, Pennsylvania') LIMIT 20;
+-----------+----------+--------------------------------------------------------------------------+
| c1        | c2       | c3                                                                       |
+-----------+----------+--------------------------------------------------------------------------+
| 198783294 | 12289308 | Pennsylvania House of Representatives, District 175                      |
| 236302417 | 12427322 | Pennsylvania House of Representatives, District 156                      |
| 236373831 | 12427423 | Pennsylvania House of Representatives, District 158                      |
| 282031847 | 12588702 | Pennsylvania House of Representatives, District 47                       |
| 282031847 | 12588772 | Pennsylvania House of Representatives, District 196                      |
| 282031847 | 12588864 | Pennsylvania House of Representatives, District 92                       |
| 282031847 | 12588900 | Pennsylvania House of Representatives, District 93                       |
| 282031847 | 12588904 | Pennsylvania House of Representatives, District 94                       |
| 282031847 | 12588909 | Pennsylvania House of Representatives, District 193                      |
| 303827502 | 12671054 | Pennsylvania House of Representatives, District 55                       |
| 303827502 | 12671089 | Pennsylvania House of Representatives, District 64                       |
| 337545922 | 12797838 | Pennsylvania House of Representatives, District 95                       |
| 219202000 | 12366957 | United States House of Representatives House Resolution 121              |
| 277521229 | 12572732 | United States House of Representatives proposed House Resolution 121     |
|  20923615 | 11618759 | Special elections to the United States House of Representatives          |
|  20923615 | 11618772 | List of Special elections to the United States House of Representatives  |
|  37794558 | 11693157 | Nebraska House of Representatives                                        |
|  39430531 | 11699551 | Belgian House of Representatives                                         |
|  53779065 | 11756435 | List of United States House of Representatives elections in North Dakota |
|  54048114 | 11757334 | 2008 United States House of Representatives election in North Dakota     |
+-----------+----------+--------------------------------------------------------------------------+
20 rows in set (0.003 sec)

As you can see, we found some useful data, but we also found data which is totally irrelevant to our search. Luckily, we can refine such a query:

MariaDB [ft_data]> SELECT * FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('+House, +Representatives, +Pennsylvania' IN BOOLEAN MODE);
+-----------+----------+-----------------------------------------------------+
| c1        | c2       | c3                                                  |
+-----------+----------+-----------------------------------------------------+
| 198783294 | 12289308 | Pennsylvania House of Representatives, District 175 |
| 236302417 | 12427322 | Pennsylvania House of Representatives, District 156 |
| 236373831 | 12427423 | Pennsylvania House of Representatives, District 158 |
| 282031847 | 12588702 | Pennsylvania House of Representatives, District 47  |
| 282031847 | 12588772 | Pennsylvania House of Representatives, District 196 |
| 282031847 | 12588864 | Pennsylvania House of Representatives, District 92  |
| 282031847 | 12588900 | Pennsylvania House of Representatives, District 93  |
| 282031847 | 12588904 | Pennsylvania House of Representatives, District 94  |
| 282031847 | 12588909 | Pennsylvania House of Representatives, District 193 |
| 303827502 | 12671054 | Pennsylvania House of Representatives, District 55  |
| 303827502 | 12671089 | Pennsylvania House of Representatives, District 64  |
| 337545922 | 12797838 | Pennsylvania House of Representatives, District 95  |
+-----------+----------+-----------------------------------------------------+
12 rows in set (0.001 sec)

As you can see, by adding the ‘+’ operator we made it clear that we are only interested in output where the given word exists. As a result, the data we got in response is exactly what we were looking for.

We can also exclude words from the search. Let’s say that we are looking for flying things but our search results are contaminated by different flying animals we are not interested in. We can easily get rid of foxes, squirrels and frogs:

MariaDB [ft_data]> SELECT * FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('+flying -fox* -squirrel* -frog*' IN BOOLEAN MODE) LIMIT 10;
+----------+----------+-----------------------------------------------------+
| c1       | c2       | c3                                                  |
+----------+----------+-----------------------------------------------------+
| 13340153 | 11587884 | List of surviving Boeing B-17 Flying Fortresses     |
| 16774061 | 11600031 | Flying Dutchman Funicular                           |
| 23137426 | 11631421 | 80th Flying Training Wing                           |
| 26477490 | 11646247 | Kites and Kite Flying                               |
| 28568750 | 11655638 | Fear of Flying                                      |
| 28752660 | 11656721 | Flying Machine (song)                               |
| 31375047 | 11666654 | Flying Dutchman (train)                             |
| 32726276 | 11672784 | Flying Wazuma                                       |
| 47115925 | 11728593 | The Flying Locked Room! Kudou Shinichi's First Case |
| 64330511 | 11796326 | The Church of the Flying Spaghetti Monster          |
+----------+----------+-----------------------------------------------------+
10 rows in set (0.001 sec)

The final feature we would like to show is the ability to search for an exact phrase:

MariaDB [ft_data]> SELECT * FROM ft_data.ft_table WHERE MATCH(c3) AGAINST ('"People\'s Republic of China"' IN BOOLEAN MODE) LIMIT 10;
+-----------+----------+------------------------------------------------------------------------------------------------------+
| c1        | c2       | c3                                                                                                   |
+-----------+----------+------------------------------------------------------------------------------------------------------+
|  12093896 | 11583713 | Religion in the People's Republic of China                                                           |
|  25280224 | 11640533 | Political rankings in the People's Republic of China                                                 |
|  43930887 | 11716084 | Cuisine of the People's Republic of China                                                            |
|  62272294 | 11789886 | Office of the Commissioner of the Ministry of Foreign Affairs of the People's Republic of China in t |
|  70970904 | 11824702 | Scouting in the People's Republic of China                                                           |
| 154301063 | 12145003 | Tibetan culture under the People's Republic of China                                                 |
| 167640800 | 12189851 | Product safety in the People's Republic of China                                                     |
| 172735782 | 12208560 | Agriculture in the people's republic of china                                                        |
| 176185516 | 12221117 | Special Economic Zone of the People's Republic of China                                              |
| 197034766 | 12282071 | People's Republic of China and the United Nations                                                    |
+-----------+----------+------------------------------------------------------------------------------------------------------+
10 rows in set (0.001 sec)

As you can see, fulltext search in MariaDB works quite well, and it is also faster and more flexible than search using B+Tree indexes. Please keep in mind though that this is by no means a way of handling large volumes of data - as the data grows, the feasibility of this solution decreases. Still, for small data sets this solution is perfectly valid. It can definitely buy you more time before, eventually, implementing dedicated full text search solutions like Sphinx or Lucene. Of course, all of the features we described are available in MariaDB clusters deployed from ClusterControl.


How to Monitor PostgreSQL Running Inside a Docker Container


Monitoring is the action of watching and checking over a period of time in order to see how what you are monitoring develops and performs. You do it so you can make any necessary changes to ensure things work correctly. It is essential that processes are monitored to produce good insights into the activities that are being performed in order to plan, organize, and run an efficient solution.

Docker is a program created to operate as a builder, ready to answer the question “Will the software run on my machine?” It basically assembles different parts together, creating a model that is easy to store and transport. The model, also known as a container, can be shipped to any computer which has Docker installed.

In this article we will introduce Docker, describing some ways of configuring it and comparing them. Then PostgreSQL comes into play: we will deploy it inside a Docker container in a smart way, and finally we will see the benefits that ClusterControl can provide to monitor key metrics about PostgreSQL and the OS outside of it.

Docker Overview

When Docker starts, it creates a new network connection to allow the transmission of data between two endpoints, called the bridge, and this is where new containers are attached by default.

In the following, we will see details about this bridge and discuss why it’s not a good idea to use it in production.

$ docker network inspect bridge
Inspecting the Docker default bridge docker0.

Please note the embedded options, because if you run a container without specifying the desired network, it will be attached to this default bridge and accept those options.

On this default network connection, we lose some good advantages of networking, like DNS. Imagine that you want to access Google: which address do you prefer, www.google.com or 172.217.165.4?

I don’t know about you, but I prefer the former, and to be honest, I don’t even type the ‘www.’.

User-Defined Bridge Network

So we want DNS in our network connection, as well as the benefit of isolation; imagine a scenario where you deploy different containers inside the same network.

When we create a Docker container, we can give it a name, or Docker generates one for us from two random words connected by an underscore ('_').

If we don’t use a User-Defined Bridge Network, we may run into problems with the IP addresses in the future, because we are clearly not machines, and remembering these values can be hard and error-prone.

Creating a custom bridge, or a User-Defined Bridge Network, is very easy.

Before creating our first one, to understand the process better, let’s open a new terminal, type a Linux command from the iproute2 package, and leave it running for now.

$ ip monitor all
Initializing a terminal to monitor the network traffic in the Docker Host.

This terminal will now be listening to the network activity and displaying it.

You may have seen commands like “ifconfig” or “netstat” before; they have been deprecated since 2001. The net-tools package is still widely used, but not updated anymore.

Now it’s time to create our first custom network, so open another terminal and enter:

$ docker network create --driver bridge br-db
Creating the User-Defined Bridge Network "br-db".

This very long mix of letters and numbers is called a UUID, or Universally Unique IDentifier. It basically says that the network has been created successfully.

The name given to our first manually created network is “br-db”, and it needs to be at the end of the command. Before that, we have the argument “--driver” with the value “bridge”. Why?

In the Community Edition, Docker provides three different drivers: bridge, host, and none. At creation time, like this, the default is bridge and it doesn’t need to be specified, but we are learning about it here.

If you’ve followed along with me, look at the other terminal, because I will explain to you what is going on.

Docker has created the network, called “br-db”, but this name is only used by Docker itself.

On the other side of this newly created custom bridge, there is another layer, the OS. There, the name of the same bridge network is different: it becomes the bridge nomenclature, “br-”, followed by the first 12 characters of the UUID value above, shown in red.

Monitoring Docker IP address changes.

With our new network connection, we have a subnet, gateway, and broadcast. 

The Gateway, as the name suggests, is where all the packet traffic between the bridge endpoints passes, and it is labeled “inet” by the OS, as you can see.

The Broadcast address is the last IP address, and it is responsible for sending the same data traffic to all the IP addresses in the subnet when necessary.

They are always present in network connections, and this is why we have the value “[ADDR]” at the beginning of the output. This value represents IP address changes, and it is like an event being fired for the network activity monitoring, because we have created a new network connection!

Docker Container

Please visit Docker Hub and see that what is stored there is known as a Docker Image, which is basically the skeleton of a container (the model).

Docker Images are generated by Dockerfiles, and they contain all the information needed to create a container, in order to make it easy to maintain and customize.

If you look carefully at Docker Hub, it’s easy to see that the PostgreSQL image, called postgres, has different tags and versions, and if you don’t specify one of them the default, latest, will be used. If in the future you need different PostgreSQL containers working together, you may want them to be the same version.
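
If you want to pin a version rather than rely on latest, you can pull a specific tag beforehand, for example the major-version tag (the tags published on Docker Hub change over time, so check what is currently available):

$ docker pull postgres:11
$ docker image ls postgres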

Let’s create our first proper container now. Remember the ‘--network’ argument, or it will be deployed in the default bridge.

$ docker container run --name postgres-1 --network br-db -e POSTGRES_PASSWORD=5af45Q4ae3Xa3Ff4 -p 6551:5432 -d postgres
Running a PostgreSQL container into the network "br-db".

Again the UUID, success, and in the other terminal, what is going on?

A network interface change is the event happening right now, or simply “[LINK]”. You can forget anything after the “veth”, trust me; the value always changes when the container is restarted or something similar happens.

Monitoring Docker network interface changes.

The other option, “-e POSTGRES_PASSWORD=?”, sets an Environment variable. This particular variable is used by the postgres image and configures the password for the superuser account in the database, called postgres.

Publish is the long name of the “-p 6551:5432” parameter, and it basically binds the OS port 6551 bi-directionally to port 5432 of the container.

Last but not least is the Detach option, “-d”, which makes the container run independently of this terminal.

The Docker Image name must come at the end, following the same pattern as the network creation: all the arguments and options on the left, and at the end the most important thing, in this case the image name.
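
As a quick sanity check (assuming the psql client is installed on the Docker host), the published port lets us reach the database from the OS; psql will prompt for the password we set with POSTGRES_PASSWORD:

$ psql -h 127.0.0.1 -p 6551 -U postgres -c 'SELECT version();'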

Remember that the containers live in the network subnet, each one standing on an allowed IP address, and they will never get the first or last address, because the Gateway and Broadcast will always be there.

We have shown the details of the created bridge; now let’s display the details kept by each one of the endpoints.

$ docker network inspect br-db
Inspecting the Docker User-Defined Network Interface "br-db".
$ brctl show
Displaying information about the User-Defined Bridge Network by the Docker Host.

As you can see, the network and container names differ: the container is recognized by the OS as an interface called “veth768ff71”, while Docker keeps the original name we gave it, “postgres-1”.

In the Docker command output it is possible to see the IP address of the container created earlier, the subnet matching what appeared in the other terminal opened moments ago, and lastly the empty options for this custom network.

The Linux command “brctl show” is part of the package bridge-utils, and as the name suggests, its purpose is to provide a set of tools to configure and manage Linux Ethernet bridges.

Another Custom Bridge Network

We will discuss DNS soon, but it is very helpful to keep things simple for ourselves for the future. Good configurations tend to make the DBA’s life easier later on.

The same goes for networks, so we can change how the OS recognizes the subnet address and the network name by adding more arguments at creation time.

$ docker network create --driver bridge --subnet 172.23.0.0/16 -o "com.docker.network.bridge.name"="bridge-host" bridge-docker
Creating a User-Defined Bridge Network Interface with custom options.
$ ip route show
Displaying the Docker routing table.

With this Linux command, we can see almost the same thing as with the earlier command, but now, instead of listing the “veth” interfaces for each network, we simply have “linkdown” marking those which are empty.

The desired subnet address has been specified as an argument and, similar to the Environment option for container creation, for networks we have “-o” followed by a key and value pair. In this case, we are telling Docker to tell the OS that it should call the network “bridge-host”.

The existence of those three networks can be verified in Docker too, just enter:

$ docker network ls
Displaying network interfaces on Docker Engine.

We discussed earlier that the “veth” values for the containers don’t matter, and now I will show you this in practice.

The exercise consists of verifying the current value, then restarting the container, then verifying again. To do so, we will need a mix of Linux commands used before and Docker ones, which are new here but very simple:

$ brctl show
$ docker container stop postgres-1
$ docker container start postgres-1
$ brctl show
Checking how "iptables" makes the container names volatile for the Docker Host.

When the container is stopped, its IP address is freed and it may receive a new one when it starts again, and that’s a reminder of how important DNS can be.

The OS assigns these interface names every time a container takes an IP address, and they are managed using the iptables package, which will soon be replaced by the new framework called nftables.

It’s not recommended to change these rules, but there are tools available to help visualize them if necessary, like dockerveth.

Container Restart Policy

When we restart the Docker program, or even the computer, the networks created by it are kept in the OS, but empty.

Containers have what is called a Container Restart Policy, and this is another very important argument specified at creation time. For PostgreSQL as a production database, availability is crucial, and this is how Docker can help with it.

$ docker container run --name postgres-2 --network bridge-docker --restart always -e POSTGRES_PASSWORD=5af45Q4ae3Xa3Ff4 -p 6552:5432 -d postgres
Specifying the Container Restart Policy at creation time.

Unless you stop it manually, this “postgres-2” container will always be available.

To understand it better, we will list the running containers, restart the Docker program, and then list them again:

$ docker container ls
$ systemctl restart docker
$ docker container ls
Checking the Container Restart Policy in "postgres-2".

Only the “postgres-2” container is running; the other “postgres-1” container doesn’t start on its own. More information about this can be found in the Docker documentation.
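
As a side note, if you later decide that the first “postgres-1” container should also survive restarts, the restart policy of an existing container can be changed without recreating it:

$ docker update --restart always postgres-1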

Domain Name System (DNS)

One benefit of a User-Defined Bridge Network is certainly organization, because we can create as many networks as we want to run new containers and even connect old ones, but another benefit that we don’t have with the Docker default bridge is DNS.

When containers need to communicate with each other, it can be painful for the DBA to memorize the IP addresses, and we discussed this earlier with the example of www.google.com and 172.217.165.4. DNS solves this problem seamlessly, making it possible to interact with containers using the names we gave them at creation time, like “postgres-2”, instead of the IP address “172.23.0.2”.

Let’s see how it works. We will create another container, called “postgres-3”, in the same network just for demonstration purposes; then we’ll install the iputils-ping package to transmit packets of data to the “postgres-2” container, using its name of course.

$ docker container run --name postgres-3 --network bridge-docker --restart always -e POSTGRES_PASSWORD=5af45Q4ae3Xa3Ff4 -p 6553:5432 -d postgres
$ docker container exec -it postgres-3 bash

For a better understanding, let’s separate it into parts. In the following outputs, the blue arrow indicates when the command is performed inside a container, and the red one when it is performed in the OS.

Running a temporary container to test the DNS provided by the User-Defined Bridge Network Interface.
$ apt-get update && apt-get install -y iputils-ping
Installing the package "iputils-ping" for testing the DNS.
$ ping postgres-2
Showing the DNS working successfully.
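
Since the postgres image already ships with the psql client, we can go one step further than ping and, still inside “postgres-3”, open a database connection to the other instance by name (note that inside the bridge we talk to the container port 5432 directly, not to the published one):

$ PGPASSWORD=5af45Q4ae3Xa3Ff4 psql -h postgres-2 -U postgres -c 'SELECT 1;'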

Summary

PostgreSQL is running inside Docker, and its availability is now guaranteed. When used inside a User-Defined Bridge Network, the containers can be managed more easily, with many benefits such as DNS, custom subnet addresses and OS names for the networks.

We have seen details about Docker, and the importance of being aware of updated packages and frameworks on the OS.

Announcing ClusterControl 1.7.3: Improved PostgreSQL Support & New Cloud Deployment Options


We’re excited to announce the 1.7.3 release of ClusterControl - the only database management system you’ll ever need to take control of your open source database infrastructure. 

In this release we have added support for running multiple PostgreSQL instances on the same server, as well as improvements to pgBackRest, continuing to expand our support for PostgreSQL environments.

We have also added additional cluster types to our cloud deployment and support for scaling out cloud deployed clusters with automated instance creation. You can now deploy MySQL Replication, PostgreSQL, and TimeScaleDB clusters from ClusterControl onto Amazon AWS, Google Cloud Platform, and Microsoft Azure.

Release Highlights

PostgreSQL Improvements

  • Manage multiple PostgreSQL instances on the same host
  • Improvements to our support for pgBackRest by adding non-standard instance ports and custom stanzas
  • New PostgreSQL Configuration Management page to manage your database configuration files
  • Newly added metrics allowing you to monitor PostgreSQL Logical Replication setups

Improved Cloud Integration

  • Automatically launch a cloud instance and scale out your database cluster by adding a new DB node (Galera) or replication slave (Replication).
  • Deploy the following new replication database clusters:
    • Oracle MySQL Server 8.0
    • Percona Server 8.0
    • MariaDB Server 10.3
    • PostgreSQL 11.0 (Streaming Replication)
    • TimescaleDB 11.0 (Streaming Replication)

Additional Improvements

  • Backup verification jobs with xtrabackup can use the --use-memory parameter to limit the memory usage.
  • A running backup verification server now shows up in the Topology viewer
  • MongoDB sharded clusters can now add or register an existing MongoDB configuration node
  • Improved Configuration Management for MySQL, MongoDB, and MySQL NDB Cluster.
  • Improved Email Notification Settings
  • New Performance->Transaction Logs
  • Code Clean-up: legacy ExtJS pages have been migrated to AngularJS
  • CMON API Deprecation: the clustercontrol-cmonapi package is deprecated from now on, as it is no longer required for ClusterControl operations

View Release Details and Resources

Release Details

Running Multiple PostgreSQL Instances from a Single Host

Saving money and resources is something every SysAdmin and DBA is looking to do. One of the ways this can be achieved is by leveraging the same hardware to run multiple database instances, allowing the operating system (not the server) to handle the traffic routing virtually. In this release we are enabling support for this type of setup for PostgreSQL. Look for more information soon on how this can be achieved.

New Cloud Deployment Options

ClusterControl has offered the ability to deploy databases (and backup databases) in the cloud since ClusterControl 1.6. In this new version we have expanded this functionality to include new database types, rounding out support for the most popular open source databases on the market. With this release ClusterControl can now deploy…

  • MySQL
  • MySQL Galera Cluster
  • MySQL Replication
  • MariaDB
  • PostgreSQL Streaming Replication
  • Percona Server
  • TimescaleDB Streaming Replication
  • MongoDB Replica Set

Running Multiple PostgreSQL Instances on a Single Host


We recently announced the release of ClusterControl 1.7.3, which includes a variety of improvements and newly added features. One of these new features is the addition of support in ClusterControl to allow a user to set up and manage multiple PostgreSQL instances on the same host. This new feature is what we will be discussing in our blog below, including reasons why this type of setup can help you save on resources, as well as step-by-step instructions on how to achieve this type of installation in ClusterControl.

Why Would You Need A Multiple-PostgreSQL Installation on a Single Host?

With today's rapid development and improvement of technologies from hardware to software, requirements have become more adaptable, flexible, and scalable. Some organizations even prefer to leverage the technology stack, as scaling it is easier. In addition, there are situations where you might want to deploy a database server on a high-end, powerful server which contains a large CPU, lots of memory, and fast, powerful, non-volatile storage devices such as SSD/Fusion IO/NVMe. This, however, can sometimes be a waste of resources if you only need to run the shared resources of a database server (such as using it as a slave, a hot-backup machine, or even as a backup verification server). In certain setups, you might want to use the resources available in your powerful server as both your development and QA server to avoid unwanted hardware costs (instead of buying a dedicated machine or spawning a new compute instance in the cloud).

How To Setup A Multi-PostgreSQL Installation

For this example we'll create a cluster with a multi-PostgreSQL installation along with multi-PostgreSQL running instances in a single host using ClusterControl.

Note: as of the current version (i.e. ClusterControl 1.7.3), ClusterControl does not allow you to create or initialize a cluster if you specify master and slave information on a host with multiple PostgreSQL versions installed or with multiple PostgreSQL instances running. However, you can import a node with multiple versions installed or multiple PostgreSQL instances running on a single host.

Server Details and Information

Since we cannot currently initiate or create a cluster when there are multiple versions of PostgreSQL installed, we'll import an existing, running instance of PostgreSQL. Below is the server information.

IP: 192.168.30.10
OS user: vagrant
OS type and version: Ubuntu 16.04.6 LTS (xenial)

and some information from my /etc/postgresql/9.6/multi_pg/postgresql.conf,

data_directory = '/data/pgsql/master/data'
hba_file = '/etc/postgresql/9.6/multi_pg/pg_hba.conf'   
ident_file = '/etc/postgresql/9.6/multi_pg/pg_ident.conf'
external_pid_file = '/var/run/postgresql/9.6-main.pid'  
listen_addresses = '*'  
port = 7654
max_connections = 100   
shared_buffers = 511995kB
work_mem = 10239kB
maintenance_work_mem = 127998kB 
dynamic_shared_memory_type = posix
wal_level = hot_standby 
full_page_writes = on   
wal_log_hints = on
checkpoint_completion_target = 0.9
max_wal_senders = 16
wal_keep_segments = 32  
hot_standby = on
effective_cache_size = 1535985kB
logging_collector = on  
log_timezone = 'Etc/UTC'
cluster_name = '9.6/multi_pg'   
stats_temp_directory = '/var/run/postgresql/9.6-main.pg_stat_tmp'
datestyle = 'iso, mdy'
timezone = 'Etc/UTC'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8' 
default_text_search_config = 'pg_catalog.english'

The following PostgreSQL versions are already installed on this host:

root@debnode1:/home/vagrant# dpkg -l | grep 'object-relational'
ii  postgresql-11                     11.4-1.pgdg16.04+1                         amd64        object-relational SQL database, version 11 server
ii  postgresql-9.2                    9.2.24-1.pgdg16.04+1                       amd64        object-relational SQL database, version 9.2 server
ii  postgresql-9.6                    9.6.14-1.pgdg16.04+1                       amd64        object-relational SQL database, version 9.6 server

Additionally, for this setup, there are other instances already running...

root@debnode1:/data/pgsql/master# ps axufwww | grep 'postgre[s]'
postgres  1243  0.0  0.8 186064 17916 ?        S    15:59   0:00 /usr/lib/postgresql/9.2/bin/postgres -D /var/lib/postgresql/9.2/main -c config_file=/etc/postgresql/9.2/main/postgresql.conf
postgres  1285  0.0  0.1 186064  3860 ?        Ss   15:59   0:00  \_ postgres: checkpointer process   
postgres  1286  0.0  0.2 186064  4620 ?        Ss   15:59   0:00  \_ postgres: writer process   
postgres  1287  0.0  0.1 186064  3860 ?        Ss   15:59   0:00  \_ postgres: wal writer process   
postgres  1288  0.0  0.2 186808  6008 ?        Ss   15:59   0:00  \_ postgres: autovacuum launcher process   
postgres  1289  0.0  0.1 145808  3736 ?        Ss   15:59   0:00  \_ postgres: stats collector process   
postgres  1246  0.0  1.2 309600 25884 ?        S    15:59   0:00 /usr/lib/postgresql/11/bin/postgres -D /var/lib/postgresql/11/main -c config_file=/etc/postgresql/11/main/postgresql.conf
postgres  1279  0.0  0.1 309600  4028 ?        Ss   15:59   0:00  \_ postgres: 11/main: checkpointer   
postgres  1280  0.0  0.1 309600  4028 ?        Ss   15:59   0:00  \_ postgres: 11/main: background writer   
postgres  1281  0.0  0.4 309600  9072 ?        Ss   15:59   0:00  \_ postgres: 11/main: walwriter   
postgres  1282  0.0  0.3 310012  6496 ?        Ss   15:59   0:00  \_ postgres: 11/main: autovacuum launcher   
postgres  1283  0.0  0.1 164516  3528 ?        Ss   15:59   0:00  \_ postgres: 11/main: stats collector   
postgres  1284  0.0  0.3 309892  6596 ?        Ss   15:59   0:00  \_ postgres: 11/main: logical replication launcher  

For this example, we will use PostgreSQL 9.6.

Building The Master-Slave PostgreSQL Cluster

In order to create a cluster, we need to set up the PostgreSQL instance manually and then import it into ClusterControl later. Alternatively, we can create a cluster with just one master node and let ClusterControl handle it, but to do this we would need to shut down all other running nodes. This would not be ideal if you are operating on busy PostgreSQL database servers.

Now, let's do the manual setup... 

root@debnode1:/etc/postgresql/9.6/multi_pg# sudo -iu postgres /usr/lib/postgresql/9.6/bin/pg_ctl -D /data/pgsql/master/data initdb
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /data/pgsql/master/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default timezone ... Etc/UTC
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/lib/postgresql/9.6/bin/pg_ctl -D /data/pgsql/master/data -l logfile start

Then start the database by running the command below,

root@debnode1:/etc/postgresql/9.6/multi_pg# sudo -iu postgres /usr/lib/postgresql/9.6/bin/pg_ctl -D /data/pgsql/master/data  -o "-c config_file=/etc/postgresql/9.6/multi_pg/postgresql.conf" -l /var/log/postgresql/postgresql-9.6-master.log start  
server starting

Now, let's verify that the instance is running and listening on the port we configured:

root@debnode1:/etc/postgresql/9.6/multi_pg# netstat -ntlvp46|grep postgres
tcp        0      0 127.0.0.1:5432          0.0.0.0:*               LISTEN      1246/postgres
tcp        0      0 127.0.0.1:5433          0.0.0.0:*               LISTEN      1243/postgres
tcp        0      0 0.0.0.0:7654            0.0.0.0:*               LISTEN      18403/postgres
tcp6       0      0 :::7654                 :::*           

Now, it looks correct. The PID of 18403 shows that the instance is running, with both IPv4 and IPv6 sockets open.
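
A quick way to double-check that we are talking to the right instance (and not to the clusters on ports 5432 or 5433) is to connect on that port and ask for its data directory; assuming the pg_hba.conf referenced in the configuration above allows local connections, this should return /data/pgsql/master/data:

root@debnode1:/etc/postgresql/9.6/multi_pg# psql -h 127.0.0.1 -p 7654 -U postgres -c 'SHOW data_directory;'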

Now, let's import this into ClusterControl. Go to Deploy → Import Existing Server/Database to import the desired master node we just set up.

After you hit the Import button, you'll have a cluster with one master node, just like below:

Now, let's create a slave within the same host (i.e. with IP 192.168.30.10).

And don’t worry, ClusterControl will handle it for you, as the sample job activity log below shows.

You can see that it was successfully set up and installed. Technically, ClusterControl will create a directory under /etc/postgresql/<version>/p<port_number> for Debian/Ubuntu based systems and generate the required configuration files, while for RHEL/CentOS/Fedora based systems it will generate them under the data_dir path.

Now let's confirm with pg_lsclusters and see whether the multiple PostgreSQL instances are running in parallel on the host. See below:

root@debnode1:/var/log/postgresql# pg_lsclusters 
Ver Cluster  Port Status          Owner    Data directory               Log file
9.2 main     5433 online          postgres /var/lib/postgresql/9.2/main /var/log/postgresql/postgresql-9.2-main.log
9.6 multi_pg 7654 online          postgres /data/pgsql/master/data      /var/log/postgresql/postgresql-9.6-master.log
9.6 pg_7653  7653 online,recovery postgres /data/pgsql/slave/data       pg_log/postgresql-%Y-%m-%d_%H%M%S.log
11  main     5432 online          postgres /var/lib/postgresql/11/main  /var/log/postgresql/postgresql-11-main.log

In addition to this, the metrics for the Logical Replication clusters can be seen below:

Promoting the Slave With Multiple PostgreSQL Instances Running on a Single Host

Slave promotion is easy with multiple PostgreSQL instances running on a single host. As you can see below, this type of environment works flawlessly when it’s handled by ClusterControl.

Now, let's see what happens in the background while ClusterControl promotes the slave. See the complete job spec and details:

[09:01:02]:Successfully promoted a new master.
[09:01:02]:192.168.30.10:7653: promote finished (this is the new master).
[09:01:02]:Servers after promote:
192.168.30.10:7653:
• Role: master (slaves: 1)
• Status: CmonHostOnline (NODE_CONNECTED)
• Receive/replay: 0/30020C0; 0/30020C0

192.168.30.10:7654:
• Role: slave (slaves: 0)
• Status: CmonHostOnline (NODE_CONNECTED)
• Receive/replay: 0/30020C0; 0/30020C0
• Master: 192.168.30.10:7653

[09:01:02]:192.168.30.10:7654: Restarted with new master.
[09:01:02]:192.168.30.10:7654: Started PostgreSQL.
[09:00:53]:192.168.30.10: done
server started
[09:00:53]:192.168.30.10: waiting for server to start....
[09:00:52]:192.168.30.10:7654: Executing: su - postgres -c '/usr/lib/postgresql/9.6/bin/pg_ctl start -w -o "-p 7654" --pgdata=/etc/postgresql/9.6/multi_pg/ --log /var/log/postgresql/postgresql-11-main.log'
[09:00:51]:192.168.30.10:7654: Start postgreSQL node.
[09:00:51]:192.168.30.10:7654: Starting PostgreSQL.
[09:00:51]:192.168.30.10:7654: Successfully created '/data/pgsql/master/data/recovery.conf'.
[09:00:50]:192.168.30.10:7654: Creating '/data/pgsql/master/data/recovery.conf': Setting 192.168.30.10:7653 as master.
[09:00:50]:192.168.30.10: servers diverged at WAL position 0/3001890 on timeline 1
no rewind required
[09:00:49]:Running /usr/lib/postgresql/9.6/bin/pg_rewind --target-pgdata=/data/pgsql/master/data --source-server="host=192.168.30.10 port=7653 user=dbapgadmin password=***** dbname=postgres"
[09:00:47]:192.168.30.10:7653: Granting host (192.168.30.10:7654).
[09:00:45]:192.168.30.10:7654: Stopped PostgreSQL.
[09:00:38]:192.168.30.10:7654: Waiting to stop.
[09:00:38]:192.168.30.10:7654: node is already stopped. No need to stop it.
[09:00:38]:192.168.30.10:7654: Stop postgreSQL node.
[09:00:38]:192.168.30.10:7654: Stopping PostgreSQL.
[09:00:38]:Switching slaves to the new master.
[09:00:38]:192.168.30.10:7653: Became master, ok.
[09:00:37]:192.168.30.10:7653: Waiting to become a master.
[09:00:37]:192.168.30.10: server promoting
[09:00:36]:192.168.30.10:7653: Attempting to promote using pg_ctl.
[09:00:36]:192.168.30.10:7653: Promoting host.
[09:00:35]:192.168.30.10:7654: Stopped PostgreSQL.
[09:00:28]:192.168.30.10:7654: Waiting to stop.
[09:00:28]:192.168.30.10: done
server stopped
[09:00:28]:192.168.30.10: waiting for server to shut down....
[09:00:27]:192.168.30.10:7654: Executing: su - postgres -c '/usr/lib/postgresql/9.6/bin/pg_ctl stop --pgdata=/etc/postgresql/9.6/multi_pg/'
[09:00:26]:192.168.30.10:7654: Stop postgreSQL node.
[09:00:26]:192.168.30.10:7654: Stopping PostgreSQL.
[09:00:26]:192.168.30.10:7654: Stopping the current master.
[09:00:26]:Switching over to 192.168.30.10:7653 (previous master is 192.168.30.10:7654)
[09:00:26]:Servers:
192.168.30.10:7653:
• Role: slave (slaves: 0)
• Status: CmonHostOnline (NODE_CONNECTED)
• Receive/replay: 0/3001820; 0/3001820
• Master: 192.168.30.10:7654

192.168.30.10:7654:
• Role: master (slaves: 1)
• Status: CmonHostOnline (NODE_CONNECTED)
• Receive/replay: 0/3001820; 0/3001820

[09:00:26]:192.168.30.10:7653: Current master is 192.168.30.10:7654.
[09:00:26]:192.168.30.10:7653: Promoting server to master.
Job spec: {
    "command": "promote_replication_slave",
    "group_id": 1,
    "group_name": "admins",
    "job_data": 
    {
        "clusterId": "6",
        "slave_address": "192.168.30.10:7653"
    },
    "user_id": 1,
    "user_name": "paul@severalnines.com"
}

As you can see, it was handled smoothly even on the same host. The topology result shows that it has been successfully promoted.

Conclusion

We’re excited about the release of ClusterControl 1.7.3 and think it has a lot to offer. We also think this new feature for running multiple PostgreSQL instances on the same host is another great step in improving our overall support for PostgreSQL. Try it out and let us know your thoughts on this new feature below.

Webmaster / Web Developer


The Severalnines Webmaster is responsible for developing, maintaining, and updating Severalnines’ online properties. The ideal candidate has a strong background in developing Drupal websites and is looking to expand beyond the traditional developer role to flex their muscles in project management, operations, and web strategy. The role will be based inside the Marketing team, working hand-in-hand on acquisition marketing programs and projects with a focus on creating a positive user experience.

The Webmaster will maintain and develop our overall online presence. This is a critical marketing position that will have a direct impact on generating sales pipeline and contributing to the revenue stream of Severalnines. It requires an individual who has excellent communication skills and a keen business sense, who is detail-oriented and ready to tackle any challenge.

Responsibilities

  • Function as primary contact for the day-to-day operations and content management for all of our online properties
  • Develop new services or functionalities for our web properties
  • Maintain a development environment for each web property
  • Ensure our sites are up-to-date, secure, and running at peak efficiency 
  • Monitor web usage & performance; proactively making recommendations for improvements
  • Build, implement & manage any new web properties
  • Oversee web project unit and user testing
  • Support our processes for and integrations with CRM & Marketing Automation tools (Salesforce and Pardot)
  • Work closely with our design team to implement new design features and page layouts
  • Work closely with our design team to maintain our design house
  • Work closely with our support team for server and hosting related management & maintenance 
  • Create, document, and maintain all web-related processes & procedures
  • Provide occasional support for our internal Wiki

Requirements

  • 3-5 years experience building and maintaining Drupal websites
  • 3+ years strong Drupal 7 platform knowledge with custom module development experience
  • Advanced knowledge of LAMP stack, HTML, CSS, JavaScript, jQuery
  • Advanced experience using GIT
  • Experience using version control and bug tracking systems such as GitHub and JIRA
  • Experience with integrating Drupal with other systems like Salesforce, Pardot, GoToWebinar, etc.
  • Experience with using vector graphics (SVG) 
  • Experience with HTML email design & cross-platform layout building
  • Experience with Google Analytics (including custom dashboards, goal creation & web integrations)
  • Experience with Google Webmaster Tools and general knowledge of SEO best practices
  • Understanding of how websites function for acquisition marketing 
  • Strong project management, presentation and communication skills
  • Ability to meet deadlines, manage multiple tasks and influence peers at all levels

Additional Skills

These items are nice-to-haves for this role but not required.

  • Experience using Project Management tools like Asana & Trello
  • Experience with video editing
  • Experience with Photoshop, Illustrator, InDesign
  • Experience with building ebooks
  • Experience with Apex (Salesforce Development Language)
  • Experience with databases (such as MySQL & MariaDB)
  • Experience working with the Linux operating system

Additional Details

We’re a software startup with a globally distributed team. As we don’t have any offices, you will be working remotely; in other words, we all work from our homes. Therefore, a good laptop and access to a reliable, high-speed Internet connection are required. The company is based in Europe and although no travel is required, we do encourage attendance at our annual meeting which rotates cities. Contact us to find out more about the working life at Severalnines!

Scaling PostgreSQL for Large Amounts of Data


Nowadays, it’s common to see a large amount of data in a company’s database, but depending on the size, it can be hard to manage, and performance can suffer during high traffic if we don’t configure or implement it correctly. In general, if we have a huge database and we want a low response time, we’ll want to scale it. PostgreSQL is no exception to this. There are many approaches available to scale PostgreSQL, but first, let’s learn what scaling is.

Scalability is the property of a system/database to handle a growing amount of demand by adding resources.

The reasons for this growth in demand can be temporary, for example, if we’re running a discount or a sale, or permanent, due to an increase in customers or employees. In any case, we should be able to add or remove resources to manage these changes in demand or increases in traffic.

In this blog, we’ll look at how we can scale our PostgreSQL database and when we need to do it.

Horizontal Scaling vs Vertical Scaling

There are two main ways to scale our database...

  • Horizontal Scaling (scale-out): It’s performed by adding more database nodes creating or increasing a database cluster.
  • Vertical Scaling (scale-up): It’s performed by adding more hardware resources (CPU, Memory, Disk) to an existing database node.

For Horizontal Scaling, we can add more database nodes as slave nodes. It can help us improve read performance by balancing the traffic between the nodes. In this case, we’ll need to add a load balancer to distribute traffic to the correct node, depending on the policy and the node state.

To avoid creating a single point of failure by adding only one load balancer, we should consider adding two or more load balancer nodes and using a tool like Keepalived to ensure availability.

As PostgreSQL doesn’t have native multi-master support, if we want to implement it to improve write performance, we’ll need to use an external tool for this task.

For Vertical Scaling, we may need to change some configuration parameters to allow PostgreSQL to make use of new or better hardware resources. Let’s see some of these parameters from the PostgreSQL documentation; a short example of adjusting them follows the list.

  • work_mem: Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. Several running sessions could be doing such operations concurrently, so the total memory used could be many times the value of work_mem.
  • maintenance_work_mem: Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. Larger settings might improve performance for vacuuming and for restoring database dumps.
  • autovacuum_work_mem: Specifies the maximum amount of memory to be used by each autovacuum worker process.
  • autovacuum_max_workers: Specifies the maximum number of autovacuum processes that may be running at any one time.
  • max_worker_processes: Sets the maximum number of background processes that the system can support. This limits processes such as vacuuming, checkpoints, and other maintenance jobs.
  • max_parallel_workers: Sets the maximum number of workers that the system can support for parallel operations. Parallel workers are taken from the pool of worker processes established by the previous parameter.
  • max_parallel_maintenance_workers: Sets the maximum number of parallel workers that can be started by a single utility command. Currently, the only parallel utility command that supports the use of parallel workers is CREATE INDEX, and only when building a B-tree index.
  • effective_cache_size: Sets the planner's assumption about the effective size of the disk cache that is available to a single query. This is factored into estimates of the cost of using an index; a higher value makes it more likely index scans will be used, a lower value makes it more likely sequential scans will be used.
  • shared_buffers: Sets the amount of memory the database server uses for shared memory buffers. Settings significantly higher than the minimum are usually needed for good performance.
  • temp_buffers: Sets the maximum number of temporary buffers used by each database session. These are session-local buffers used only for access to temporary tables.
  • effective_io_concurrency: Sets the number of concurrent disk I/O operations that PostgreSQL expects can be executed simultaneously. Raising this value will increase the number of I/O operations that any individual PostgreSQL session attempts to initiate in parallel. Currently, this setting only affects bitmap heap scans.
  • max_connections: Determines the maximum number of concurrent connections to the database server. Increasing this parameter allows PostgreSQL to run more backend processes simultaneously.
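
As an illustration only, these parameters can be adjusted in postgresql.conf or online with ALTER SYSTEM; the values below are arbitrary examples and must be sized to your actual hardware:

-- Illustrative values only; size them to the memory and CPU actually available
ALTER SYSTEM SET shared_buffers = '4GB';
ALTER SYSTEM SET work_mem = '64MB';
ALTER SYSTEM SET maintenance_work_mem = '512MB';
ALTER SYSTEM SET effective_cache_size = '12GB';
ALTER SYSTEM SET max_connections = 300;
SELECT pg_reload_conf();  -- reloads the configuration; shared_buffers and max_connections still require a restart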

At this point, there is a question we must ask: how can we know if we need to scale our database, and what is the best way to do it?

Monitoring

Scaling our PostgreSQL database is a complex process, so we should check some metrics to be able to determine the best strategy to scale it.

We can monitor the CPU, memory, and disk usage to determine if there is a configuration issue or if we actually need to scale our database. For example, if we’re seeing a high server load but the database activity is low, it’s probably not necessary to scale it; we only need to check the configuration parameters so they match our hardware resources.

Checking the disk space used by the PostgreSQL node per database can help us confirm whether we need more disk space or even table partitioning. To check the disk space used by a database or table, we can use PostgreSQL functions such as pg_database_size or pg_table_size.
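
For example, assuming a database called mydb and a table called mytable (placeholder names, not from this setup), the sizes can be checked like this:

SELECT pg_size_pretty(pg_database_size('mydb'));
SELECT pg_size_pretty(pg_table_size('mytable'));
SELECT pg_size_pretty(pg_total_relation_size('mytable'));  -- table plus its indexes and TOAST data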

From the database side, we should check:

  • Number of connections
  • Running queries
  • Index usage
  • Bloat
  • Replication lag
These metrics can help confirm whether scaling our database is needed; the example queries below show one way to check some of them.
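
A rough sketch of such checks using the standard statistics views (the replication lag query assumes PostgreSQL 10 or later):

-- Number of connections per state
SELECT state, count(*) FROM pg_stat_activity GROUP BY state;

-- Queries running for more than one minute
SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '1 minute';

-- Indexes that have never been used
SELECT schemaname, relname, indexrelname
FROM pg_stat_user_indexes
WHERE idx_scan = 0;

-- Replication lag in bytes, as seen from the master
SELECT client_addr, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes
FROM pg_stat_replication;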

ClusterControl as a Scaling and Monitoring System

ClusterControl can help us with both of the scaling approaches we saw earlier, and it can monitor all the necessary metrics to confirm the scaling requirement. Let’s see how...

If you’re not using ClusterControl yet, you can install it and deploy or import your current PostgreSQL database by selecting the “Import” option and following the steps, to take advantage of all the ClusterControl features like backups, automatic failover, alerts, monitoring, and more.

Horizontal Scaling

For horizontal scaling, if we go to cluster actions and select “Add Replication Slave”, we can either create a new replica from scratch or add an existing PostgreSQL database as a replica.

Let's see how adding a new replication slave can be a really easy task.

As you can see in the image, we only need to choose our Master server, enter the IP address for our new slave server and the database port. Then, we can choose if we want ClusterControl to install the software for us and if the replication slave should be Synchronous or Asynchronous.

In this way, we can add as many replicas as we want and spread read traffic between them using a load balancer, which we can also implement with ClusterControl.

Now, if we go to cluster actions and select “Add Load Balancer”, we can deploy a new HAProxy Load Balancer or add an existing one.

And then, in the same load balancer section, we can add a Keepalived service running on the load balancer nodes for improving our high availability environment.

Vertical Scaling

For vertical scaling, with ClusterControl we can monitor our database nodes from both the operating system and the database side. We can check metrics like CPU usage, memory, connections, top queries, running queries, and more. We can also enable the Dashboard section, which allows us to see the metrics in a more detailed and friendlier way.

From ClusterControl, you can also perform different management tasks like Reboot Host, Rebuild Replication Slave or Promote Slave, with one click.

Conclusion

Scaling our PostgreSQL database can be a time-consuming task. We need to know what we need to scale and the best way to do it. As we have seen, there are some metrics to take into account when it is time to scale, and they can help us decide what to do.

ClusterControl provides a whole range of features, from monitoring, alerting, automatic failover, backup, point-in-time recovery, backup verification, to scaling of read replicas. This can help us to scale our PostgreSQL database in a horizontal or vertical way from a friendly and intuitive UI.

How to Manage MariaDB 10.3 with ClusterControl


MariaDB Server is no longer a straight imitation of MySQL. It grew into a mature fork which implements new functionality similar to what proprietary database systems offer, beyond what is available upstream. MariaDB 10.3 greatly extends the list of enterprise features, and with the new SQL_MODE=Oracle it becomes an exciting choice for companies that would like to migrate their Oracle databases to an open source database. However, operational management is an area where there is still some catching up to do, and MariaDB requires that you build your own scripts.

Perhaps a good opportunity to look into an automation system?

Automated procedures are accurate and consistent. They can give you much-needed repeatability so you can minimize the risk of change in the production systems. However, as modern open source databases develop so fast, it's more challenging to keep your management systems on par with all new features.

The natural next step is to look for automation platforms. There are many platforms that you can use to deploy systems. Puppet, Chef, and Ansible are probably the best examples of that new trend. These platforms are suitable for the fast deployment of various software services. They are perfect for deployments, but still require you to maintain the code, cover feature changes, and usually, they cover just one aspect of your work. Things like backups, performance, and maintenance still need external tools or scripts.

On the other side, we have cloud platforms, with polished interfaces and a variety of additional services for a fully managed experience. However, they may not be feasible; for instance, in hybrid environments you might be using the cloud but still have a significant on-prem footprint.

So, how about a dedicated management layer for your MariaDB databases?

ClusterControl was designed to automate the deployment and management of MariaDB as well as other open-source databases. At the core of ClusterControl is functionality that lets you automate the database tasks you have to perform regularly, like deploying new database instances and clusters, managing backups, high availability and failover, topology changes, upgrades, scaling new nodes and more.

ClusterControl installation

To start with ClusterControl, you need a dedicated virtual machine or host. The VM and supported system requirements are described here. At the minimum, you can start with a tiny VM with 2 GB RAM, 2 CPU cores, and 20 GB of storage space, either on-prem or in the cloud.

The primary installation method is to download an installation wizard that walks you through all the steps (OS configuration, package download and installation, metadata creation, and others).

For environments without internet access, you can use the offline installation process.

ClusterControl is agentless so you don't need to install additional software. It requires only SSH access to the database hosts. It also supports agent-based monitoring for higher resolution monitoring data.

To set up passwordless SSH to all target nodes (ClusterControl and all database hosts), run the following commands on the ClusterControl server:

$ ssh-keygen -t rsa # press enter on all prompts
$ ssh-copy-id -i ~/.ssh/id_rsa [ClusterControl IP address]
$ ssh-copy-id -i ~/.ssh/id_rsa [Database nodes IP address] # repeat this to all target database nodes

One of the most convenient ways to try out ClusterControl may be to run it in a Docker container.

docker run -d --name clustercontrol \
--network db-cluster \
--ip 192.168.10.10 \
-h clustercontrol \
-p 5000:80 \
-p 5001:443 \
-v /storage/clustercontrol/cmon.d:/etc/cmon.d \
-v /storage/clustercontrol/datadir:/var/lib/mysql \
-v /storage/clustercontrol/sshkey:/root/.ssh \
-v /storage/clustercontrol/cmonlib:/var/lib/cmon \
-v /storage/clustercontrol/backups:/root/backups \
severalnines/clustercontrol
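
Note that the command above assumes a user-defined Docker network called db-cluster covering 192.168.10.0/24 already exists; if it does not, one way to create it is:

docker network create --subnet=192.168.10.0/24 db-cluster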

After successful deployment, you should be able to access the ClusterControl Web UI at {host's IP address}:{host's port}, for example:

HTTP: http://192.168.10.100:5000/clustercontrol
HTTPS: https://192.168.10.100:5001/clustercontrol

Installation of MariaDB Cluster

Once we enter the ClusterControl interface, the first thing to do is to deploy a new database or import an existing one. Version 1.7.2 introduced support for MariaDB 10.3 (along with 10.0, 10.1, 10.2). In 1.7.3, which was released this week, we can see improved deployment of installations in the cloud.

ClusterControl: Deploy/Import

At the time of writing this blog, the current version is 10.3.16, and the latest packages are picked up by default. Select the option "Deploy Database Cluster" and follow the instructions that appear.

Now is the time to provide the data needed for the connection between ClusterControl and the DB nodes. At this step, you would have clean VMs or the OS images that you use inside your organization. When choosing MariaDB, we must specify the user, key or password, and port to connect by SSH to our servers.

ClusterControl: Deploy Database Cluster

After setting up the SSH access information, we must enter the data to access our database, for MariaDB that will be the superuser root. We can also specify which repository to use. You can have three types of repositories when deploying database server/cluster using ClusterControl:

  • Use Vendor Repository. Provision software by setting up and using the database vendor's preferred software repository. ClusterControl will install the latest version of what is provided by the database vendor repository.
  • Do Not Setup Vendor Repositories. No repositories will be set up by ClusterControl. ClusterControl will rely on the system configuration (your default repository files).
  • Create and mirror the current database vendor's repository and then deploy using the local mirrored repository. This allows you to "freeze" the current versions of the software packages.

When all is set, hit the deploy button. The deployment process will also take care of installing additional tools provided by MariaDB, like mariabackup, as well as popular database administration tools from external vendors.

Import a New Cluster

We also have the option to manage an existing setup by importing it into ClusterControl. Such an environment can be created by ClusterControl or other methods (puppet, chef, ansible, docker …). The process is simple and doesn't require specialized knowledge.

First, we must enter the SSH access credentials to our existing database servers. Then we enter the access credentials to our database, the server data directory, and the version. We add the nodes by IP or hostname, in the same way as when we deploy, and press on Import. Once the task is finished, we are ready to manage our cluster from ClusterControl. At this point, we can also define the options for the node or cluster auto recovery.

ClusterControl: Import existing 10.3 database cluster

Scaling MariaDB, Adding More Nodes to DB Cluster

With ClusterControl, adding more servers to the cluster is an easy step. You can do that from the GUI or the CLI. For more advanced users, you can use ClusterControl Developer Studio and write resource-based conditions to expand your cluster automatically.

ClusterControl: Adding MariaDB Node

ClusterControl supports an option to use an existing backup, so there is no need to overwhelm the production master node with additional work.

Securing MariaDB

The default MariaDB installation comes with relaxed security. This has been improved in recent versions; however, production-grade systems still require tweaks to the default my.cnf configuration. ClusterControl deployments come with non-default my.cnf settings (different for different cluster types).

ClusterControl removes human error and provides access to a suite of security features, to automatically protect your databases from hacks and other threats.

ClusterControl: Security Panel

ClusterControl enables SSL support for MariaDB connections. Enabling SSL adds another level of security for communication between the applications (including ClusterControl) and database. MariaDB clients open encrypted connections to the database servers and verify the identity of those servers before transferring any sensitive information.

ClusterControl will execute all necessary steps, including creating certificates on all database nodes. Such certificates can be maintained later on in the Key Management tab.

With ClusterControl you can also enable auditing. It uses the audit plugin provided by MariaDB. Continuous auditing is an imperative task for monitoring your database environment. By auditing your database, you can achieve accountability for actions taken or content accessed. Moreover, the audit may include some critical system components, such as the ones associated with financial data to support a precise set of regulations like SOX, or the EU GDPR regulation. The guided process lets you choose what should be audited and how to maintain the audit log files.

Monitoring and Alerting

When working with database systems, you should be able to monitor them. That will enable you to identify trends, plan for upgrades or improvements or react effectively to any problems or errors that may arise.

ClusterControl: Overview

The new ClusterControl is using Prometheus as the data store with PromQL query language. The list of dashboards includes Server General, Server Caches, InnoDB Metrics, Replication Master, Replication Slave, System Overview, and Cluster Overview Dashboards.

ClusterControl: Dashboard

ClusterControl installs Prometheus agents, configures metrics, and maintains access to the Prometheus exporter configuration via its GUI, so you can better manage parameter configuration, like collector flags for the Prometheus exporters.

As a database operator, we need to be informed whenever something critical occurs in our database. The three main methods to get an alert in ClusterControl are:

  • email notifications
  • integrations
  • advisors
ClusterControl: Integration Services

You can set email notifications at the user level. Go to Settings > Email Notifications, where you can choose the criticality and type of alerts to be sent.

The next method is to use the Integration services. These pass specific categories of events to other services like ServiceNow, Slack, PagerDuty, etc., so you can create advanced notification methods and integrations within your organization.

The last one is to involve sophisticated metrics analysis in the Advisor section, where you can build intelligent checks and triggers.

ClusterControl: Advisors

SQL Monitoring

The SQL Monitoring is divided into three sections.

  • Top Queries - presents information about queries that take a significant chunk of resources.
    Query Monitor: Top Queries
  • Running Queries - a process list combined from all database cluster nodes into one view. You can use it to kill queries that affect your database operations.
    Query Monitor: Running Queries
  • Query Outliers - presents the list of queries with execution time longer than average.
    Query Monitor: Query Outliers

Backup and Recovery

Now that you have your MariaDB up and running, and have your monitoring in place, it is time for the next step: ensure you have a backup of your data.

ClusterControl: Backup repository

ClusterControl provides an interface for MariaDB backup management with support for scheduling and creating reports. It gives you two options for backup methods:

  • Logical backup (text): mysqldump
  • Binary backups: xtrabackup (lower versions), mariabackup

A good backup strategy is a critical part of any database management system. ClusterControl offers many options for backups and recovery/restore.

ClusterControl backup retention is configurable; you can choose to retain your backup for any time period or to never delete backups. AES256 encryption is employed to secure your backups against rogue elements. For rapid recovery, backups can be restored directly into a new cluster - ClusterControl handles the full restore process from the launch of a new database setup to the recovery of data, removing error-prone manual steps from the process.

Backups can be automatically verified upon completion, and then uploaded to cloud storage services (AWS, Azure and Google). Different retention policies can be defined for local backups in the data center as well as backups that are uploaded in the cloud.

Node and cluster auto-recovery

ClusterControl provides advanced support for failure detection and handling. It also allows you to deploy different proxies to integrate them with your HA stack, so there is no need to adjust application connection string or DNS entry to redirect the application to the new master node.

When the master server is down, ClusterControl will create a job to perform automatic failover. ClusterControl does all the background work to elect a new master, deploy failover slave servers, and configure load balancers.

ClusterControl automatic failover was designed with the following principles:

  • Make sure the master is really dead before you failover
  • Failover only once
  • Do not failover to an inconsistent slave
  • Only write to the master
  • Do not automatically recover the failed master

With the built-in algorithms, failover can often be performed pretty quickly, so you can assure the highest SLAs for your database environment.

ClusterControl: Auto Recovery

The process is highly configurable. It comes with multiple parameters that you can use to adapt recovery to the specifics of your environment. Among the different options you can find replication_stop_on_error, replication_auto_rebuild_slave, replication_failover_blacklist, replication_failover_whitelist, replication_skip_apply_missing_txs, replication_onfail_failover_script and many others.

Failover is the process of moving to a healthy standby component during a failure or maintenance event in order to preserve uptime. The quicker it can be done, the faster you can be back online. If you're looking to minimize downtime and meet your SLAs through an automated approach for MariaDB, this feature is for you.

MaxScale Load Balancer

In addition to MariaDB 10.3, ClusterControl adds an option of MaxScale 2.3 load balancer. MaxScale is a SQL-aware proxy that can be used to build highly available environments. It comes with numerous features, however, the main goal is to enable load balancing and high availability.

ClusterControl: MaxScale

MaxScale can be used to track the health of the master MariaDB node and, should it fail, perform a fast, automatic failover. Automated failover is crucial in building up a highly available solution that can recover promptly from the failure.

Load Balance Database Sessions

Read-write splitting is a critical feature that allows read scaling. It is enough for the application to connect to MaxScale; it detects the topology, determines which MariaDB node acts as the master and which act as slaves, and routes the traffic accordingly.

Summary

We hope that this blog helps you to get familiar with ClusterControl and MariaDB 10.3 administration modules. The best option is to download ClusterControl and test each of them.

How to Monitor PostgreSQL Running Inside a Docker Container - part 2


This is the second part of the multi-series How to Monitor PostgreSQL Running Inside a Docker Container. In Part 1, I presented an overview of docker containers, policies and networking. In this part we will continue with docker configuration and finally enable monitoring via ClusterControl.

PostgreSQL is an old-school, open-source database whose popularity is still increasing, and it is widely used and accepted in most of today’s cloud environments.

When it’s used inside of a container, it can be easily configured and managed by Docker, using different tags for image creation, and shipped to any computer in the world with Docker installed. But that is all about Docker.

Now we will discuss PostgreSQL, and to start, let’s list its six main processes running simultaneously inside of the container under the OS user called “postgres”. Note that this is a different user from the “postgres” user inside of the database, which is a superuser.

Remember that the blue arrow in the images is displaying commands entered inside of a container.

$ ps auxww
PostgreSQL main processes

The first process on the list is the PostgreSQL server, and the others are started by it. Their duties are basically to analyze what is going on in the server, with sub-processes handling statistics input, writing logs, and these kinds of things.

We will use ClusterControl to monitor the activity inside of this container running the PostgreSQL server, and to do so we will need to have SSH installed in order to connect to it safely.

Secure Shell Server (SSH)

To collect information about the PostgreSQL container, nothing is better than SSH. It gives remote access from one IP address to another, and this is all that ClusterControl needs to perform the job.

SSH needs to be installed from the repository, and to do so we must be inside of the container.

$ docker container exec -it postgres-2 bash
$ apt-get update && apt-get install -y openssh-server openssh-client
Installing SSH in the container "postgres-2"

After installation, we’ll edit the configuration, start the service, set up a password for the root user, and finally leave the container:

$ sed -i 's|^PermitRootLogin.*|PermitRootLogin yes|g' /etc/ssh/sshd_config
$ sed -i 's|^#PermitRootLogin.*|PermitRootLogin yes|g' /etc/ssh/sshd_config
$ service ssh start
$ passwd
$ exit
Configuring the SSH in the "postgres-2" container, part 1/2

Monitoring with ClusterControl

ClusterControl has an in-depth integration system able to monitor all the processes of PostgreSQL in real time. It also comes with a library of Advisors to keep the data secure, track database performance, and, of course, provide alerts when anomalies happen.

With SSH configured, ClusterControl can monitor the OS hardware activity and give insights about both the database and the outside layer.

We will be running a new container and publishing it to port 5000 of our computer, so we will be able to access the system through our browser.

$ docker container run --name s9s-ccontrol --network bridge-docker -p 5000:80 -d severalnines/clustercontrol
Running the container "s9s-ccontrol" for Severalnines ClusterControl

Once deployed, only the SSH configuration remains, and we have good news: because we are on a user-defined bridge network, we can use DNS!

$ docker container exec -it s9s-ccontrol bash
$ ssh-copy-id postgres-2
Configuring the SSH in the "postgres-2" container, part 2/2

After entering “yes” and specifying the password provided earlier, it’s possible to access the “postgres-2” container as root using SSH:

$ ssh postgres-2
Checking the SSH connection successfully

This new color, light blue, will be used in the following images to represent the activity inside of the database. In the example above we accessed the “postgres-2” container from “s9s-ccontrol”, but we are still the root user, so keep that in mind.

So the next step is to go to the browser, access http://localhost:5000/clustercontrol/users/welcome/, and register an account, or if you already have one, visit http://localhost:5000/clustercontrol/users/login.

Then choose “Import Existing Server/Cluster” and enter the configuration from earlier. The “PostgreSQL & TimescaleDB” tab must be selected. In the “SSH User” field, for this demonstration, just type “root”. Then finally enter the “Cluster Name”; it can be any name that you want, and it will simply hold as many PostgreSQL containers as you want to import and monitor.

Importing the "postgres-2" database, part 1/2

Now it’s time to enter the information about the PostgreSQL container, the User “postgres”, Password “5af45Q4ae3Xa3Ff4” and the desired containers. It’s extremely important to remember that the SSH service must be active inside of the PostgreSQL container.

Importing the "postgres-2" container, part 2/2

After pressing the “Import” button, ClusterControl will start to manage the PostgreSQL container “postgres-2” inside of the cluster called “PostgreSQL”, and it will inform you when the import process is done.

Log about the process of importing the container "postgres-2"

Once finished, our most recently created cluster will be shown under the Clusters tab, with different options separated into sections.

Our first visualization step will be in the Overview option.

PostgreSQL Cluster imported successfully

As you can imagine, our database is empty and we don’t have any chaos here for our fun yet, but the graphs still work on a small scale, showing statistics about the SQL and database processes.

Displaying statistics about the SQL and Database activity

Real World Scenario Simulation

To get some action going, I’ve created a CSV file using Python, exploring the GitHub repository of Socratica, who provide amazing courses on YouTube and make those files available for free.

In summary, the created CSV file has 9,999,999 records about persons, each one containing an ID, first name, last name, and birthday. The size of the file is 324 MB:

$ du -s -h persons.csv
Checking the size of the CSV file

We will copy this CSV file into the PostgreSQL container, then copy it again but this time into the database, and finally check the statistics in ClusterControl.

$ docker container cp persons.csv postgres-2:/persons.csv
$ docker container exec -it postgres-2 bash
$ du -s -h persons.csv
$ su - postgres
$ psql
Transferring the CSV file to the container and entering the database

OK, so we are in the database now, as the superuser “postgres”; please note the different colors of the arrows.

Now, we must create the database and the table, populate it with the data contained in the CSV file, and finally check that everything is working fine.

$ CREATE DATABASE severalnines;
$ \c severalnines;
$ CREATE TABLE persons (id SERIAL NOT NULL, first_name VARCHAR(50), last_name VARCHAR(50), birthday DATE, CONSTRAINT persons_pkey PRIMARY KEY (id));
$ COPY persons (id, first_name, last_name, birthday) FROM '/persons.csv' DELIMITER ',' CSV HEADER;
Connecting to the new database and importing the CSV file

This process takes some minutes to complete.

Ok, so now let’s enter some queries:

Queries in the database "severalnines"
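
The exact statements are in the screenshot; queries of roughly this kind (purely illustrative, the values are arbitrary) are enough to generate some activity:

SELECT count(*) FROM persons;
SELECT * FROM persons WHERE first_name = 'John' LIMIT 10;
SELECT date_part('year', birthday) AS birth_year, count(*)
FROM persons
GROUP BY birth_year
ORDER BY count(*) DESC
LIMIT 5;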

If you look at ClusterControl now, you will see some movement in the hardware statistics:

Displaying statistics about the CPU inside of ClusterControl

An entire section to monitor queries is provided, with an easy-to-use UI:

Displaying statistics about the Queries inside of ClusterControl

Statistics about the PostgreSQL database help the best DBAs perform their main duties to their full potential, and ClusterControl is a complete system to analyze every activity happening in real time, giving information based on all the data collected from the database processes.

With ClusterControl, the DBA can also easily extend their skills using a full set of tools to create backups locally or in the cloud, set up replication and load balancers, and use integrations with services, LDAP, ChatOps, Prometheus, and much more.

Conclusion

In this article, we configured PostgreSQL inside of Docker and integrated it with ClusterControl using a user-defined bridge network and SSH, simulated a scenario by populating the database with a CSV file, and then did a quick overall check in the ClusterControl user interface.

Oracle’s Ironclad Autonomous Database


NEWS ALERT: “The Marriott just revealing a massive data breach involving guest reservation database at its Starwood brand. Marriott says it was unauthorized access since 2014. This includes up to 500 million guests... For approximately 327 million of these guests, the information that was exposed includes a combination of name, mailing address, phone number, email address, you ready for this - passport number… Starwood preferred guest account information, date of birth, gender, arrival and departure information… including reservation dates and communication preference. As you know, when you go to a hotel, especially internationally, they take your passport. Often times, they take a copy of your passport.”– CNBC News Report

This is a real excerpt given by a CNBC News reporter in 2018. It took Marriott’s technical team almost four years to detect the breach of their system. The damages? It has been estimated to be between $200 and $600 million dollars.

Unfortunately, these types of reports are becoming all too common. When a hacker wants to breach a system, they use botnets to increase the success of their criminal activity. These internet-connected private computers are equipped with malicious software and are used to steal personal data.

Most of the time, they are acting for financial gain. In rare cases, attacks are performed to retrieve highly classified information. The motive for the attack is determined only after analysts track precisely what the data has been used for. If the data does not show up on the dark web, there is a high probability that nation-states executed the breach.

Regulatory requirements have become more demanding but fail to slow down cybercriminals. The 2019 Data Breach Report revealed additional shocking results. Hackers have been able to successfully infiltrate systems due to a broad range of root causes.

The causes of these security breaches include, but are not limited to:

  • Organizations Failing to Implement Security Patches
  • Social Engineering Attacks
  • Human Error
  • Downloading Malware
  • Misuse by Authorized Users

To prevent these intrusions, organizations rely heavily on database administrators (DBA) to perform critical daily tasks. However, network penetrations are still on the rise. If you ponder over the statement mentioned above, it does not make any sense. I’ll repeat it. To prevent these intrusions, organizations rely heavily on database administrators (DBA) to perform critical daily tasks. Again, human error is part of this massive problem. But how can this be? Utilizing DBAs to prevent cyber attacks has been used for decades.

Everyone knows that computers can complete computations at a high rate of speed, 24 hours a day until a system has been successfully infiltrated. It’s virtually impossible for anyone to secure and maintain a database under these conditions. The expectation for a human being to compete with the performance of a malicious computer is not reasonable unless you are on the other end to pull the plug. To put up the best defense against cybercriminals, you must use a computer to defend your network against a computer.

The Oracle Autonomous Database does just that.

The Database That Uses Machine Learning

To grasp full knowledge of the Autonomous Database, you must understand its infrastructure. New hardware and software have been carefully crafted to meet the needs of database users. Oracle has established an advanced network of dedicated computers to protect the parameter of their cloud, in addition to each customer zone. This prevents the malicious activity of one customer from spreading to another customer using the Oracle service. Therefore, each customer will receive the highest level of protection.

The database itself is made up of two separate products that seamlessly work together to generate the highest level of performance. The Autonomous Data Warehouse (ADW) manages analytic workloads, which systematically stores data on metrics and performance. The Autonomous Transaction Processing (ATP) manages mixed workloads, which oversees the activity of business transactions. Combined together, each product plays a critical role in the function of the system.

As you can see from figure one, both ADW and ATP include an exciting element called machine learning. It’s a branch of general artificial intelligence that’s modeled after the human brain. Computer scientists take a sizable amount of data and upload it to the neural network. Using that data as its base for accurate database performance, the computer will be able to identify patterns and make highly accurate predictions. These predictions can be used to make business decisions and increase revenue.

The database can also monitor and detect unusual activity within the system, even if the event is something the computer has never seen before. Abnormal events will be automatically flagged for further analysis. After careful inspection of the event, the machine will decide if access should be granted, or if it should be revoked to maintain the security of the system. If a manual database alteration causes an increase in vulnerability, the system will automatically take steps to correct the human error. As time goes on, the database becomes more secure because it will be comprised of a larger volume of accurate data.

Apache Zeppelin SQL Notebook is the back-end technology that allows for data processing to occur, and it is bundled with the ADW cloud service. Information is collected and displayed in the form of tables and charts for further analysis. Data scientists can then work together to review fraudulent transactions, analyze customer statistics, and gain access to the machine learning algorithms that are embedded with the service.

Key Features of the Apache Zeppelin SQL Notebook include:

  • System Tutorials for Ease of Use
  • Access to Shared Scheduler, Permissions, & Pre-Built Templates
  • Individual & Shared Access to SQL Notebooks
  • Scripting Language Assistance
  • Access to Machine Learning Algorithms

Ultimately, machine learning gives this database the ability to inspect large quantities of data, learn from it with no human interference, and modify system activities to stop cybercriminals dead in their tracks. While your typical database requires downtime for such activities, the Autonomous Database does not. The system can automatically monitor, scale, tune, upgrade, and install security patches. The encryption of data in transit and at rest is enabled by default.

The Collaboration of Human and Machine

Even though the database is automated, its security is a combination of machine learning and human implementation. It’s a shared responsibility because each DBA must ensure they are entering the appropriate information for each user. The role each user is assigned plays a large part in the level of access that is granted.

When a user severs ties with the organization, the DBA should act swiftly to revoke access to maintain security. If these critical steps are not executed appropriately, the organization will be at risk of experiencing a data breach.

It’s essential to test the security of your system before a hacker does it for you. However, it can be challenging to measure the level of protection your system has, especially after a DBA completes manual changes. Oracle has equipped their system with a Database Security Assessment Tool (DBSAT). It’s a free application that analyzes the database and provides a detailed report on how secure your system is. The report also includes an analysis of user privileges based on their job description and configuration settings.

Security should always be a top priority for any business. It only takes one data breach for customers to lose their trust in the company, and discontinue association. As a result, a loss in business revenue will be unavoidable. Large businesses will typically see a significant drop in the stock market, while small businesses will be forced to close their doors. The risk is simply too great. Take advantage of the highest level of database security and go autonomous.

PostgreSQL Index vs InnoDB Index - Understanding the Differences


Considering that the current major use-case of a database is to retrieve data, it becomes very important that its performance is very high, and that can only be achieved if data is fetched in the most efficient possible way from storage. There have been many successful inventions and implementations to achieve this. One of the well-known approaches adopted by most databases is to have an index on the table.

What is a Database Index?

A database index, as the name suggests, maintains an index to the actual data and thereby improves the performance of retrieving data from the actual table. In database terminology, the index allows fetching the page containing indexed data with a minimal traversal, as the data is sorted in a specific order. The index benefit comes at the cost of additional storage space in order to write additional data. Indexes are specific to the underlying table and consist of one or more keys (i.e. one or more columns of the specified table). There are primarily two types of index architecture:

  • Clustered Index – Index data is stored along with the rest of the data, and the data is sorted based on the index key. There can be at most one index of this category for a given table.
  • Non-Clustered Index – Index data is stored separately and has a pointer to the storage where the rest of the data is stored. This is also known as a secondary index. There can be as many indexes of this category as you want on a given table.

There are various data structures used for implementing indexes, some of the widely adopted by the majority of databases are B-Tree and Hash.

What is a PostgreSQL Index?

PostgreSQL supports only non-clustered indexes. That means index data and the complete data (from here onward referred to as heap data) are stored in separate storage. Non-clustered indexes are like the “Table of Contents” in a document, where we first check the page number and then go to that page to read the content. In order to get the complete data based on an index, it maintains a pointer to the corresponding heap data. It’s the same as, after knowing the page number, going to that page and getting the actual content of the page.

PostgreSQL: Data read using Index

For example, consider a table with three columns and an index on column ID. In order to READ the data based on the key ID=2, first the indexed data with the ID value 2 is searched. This contains a pointer (called an Item Pointer) in terms of the page number (i.e. block number) and the offset of data within that page. In the current example, the index points to page number 5 and the second line item in the page, which in turn keeps the offset to the whole data (2, ”Shaun”, 100). Notice that the whole data also contains the indexed data, which means the same data is repeated in two storages.

How does an INDEX help to improve performance? Well, in order to select any indexed record, it does not scan all pages sequentially; rather, it just needs to partially scan some of the pages using the underlying index data structure. But there is a twist: for each record found in the index data, it needs to look in the heap data for the whole data, which causes a lot of random I/O, and that is considered slower than sequential I/O. So only if a small percentage of records is being selected (which is decided based on the PostgreSQL optimizer cost) does PostgreSQL choose an Index Scan; otherwise, even though there is an index on the table, it continues to use a Sequential Scan.

In summary, though index creation speeds up read performance, an index should be chosen carefully, as it has overhead in terms of storage and degraded INSERT performance.

Now we may wonder: in case we need only the index part of the data, can we fetch it only from the index storage page? Well, the answer is directly related to how MVCC works on the index storage, as explained next.

Using MVCC for Indexing

Like heap pages, an index page maintains multiple versions of an index tuple, but it does not maintain visibility information. As explained in my previous MVCC blog, in order to decide the suitable visible version of a tuple, it requires comparing transactions. The transactions which inserted/updated/deleted a tuple are maintained along with the heap tuple, but the same is not maintained with the index tuple. This is purely done to save storage, and it’s a trade-off between space and performance.

Now coming back to the original question: since the visibility information is not in the index tuple, it needs to consult the corresponding heap tuple to see if the data selected is visible. So even though the other parts of the data from the heap tuple are not required, it still needs to access the heap pages to check visibility. But again, there is a twist: in case all tuples on a given page (the page pointed to by the index, i.e. by the ItemPointer) are visible, it does not need to refer to each item of the heap page for a “visibility check”, and hence the data can be returned only from the index page. This special case is called an “Index Only Scan”. In order to support this, PostgreSQL maintains a visibility map for each page to check the page-level visibility.

As shown in the above image, there is an index on the table “demo“ with a key on column “id”. If we try to select only the index field (i.e. id), then it chooses the “Index Only Scan” (considering the referred pages are fully visible).
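
A minimal sketch of how this can be reproduced (the table contents are made up, and the chosen plan depends on statistics and on the state of the visibility map):

CREATE TABLE demo (id int, name varchar(50), salary int);
INSERT INTO demo SELECT i, 'emp_' || i, 1000 + i FROM generate_series(1, 100000) AS i;
CREATE INDEX demo_idx ON demo (id);
VACUUM ANALYZE demo;                           -- refreshes statistics and the visibility map

EXPLAIN SELECT id FROM demo WHERE id < 100;    -- may show "Index Only Scan using demo_idx"
EXPLAIN SELECT * FROM demo WHERE id < 100;     -- needs heap access, so an Index Scan or Bitmap Heap Scan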

Clustered Index

There is no support of direct clustered index in PostgreSQL but there is an indirect way to partially achieve the same. This is achieved by below SQL commands:

CLUSTER [VERBOSE] table_name [ USING index_name ]
CLUSTER [VERBOSE]

The first command instructs the database to cluster a table (i.e. to sort the table) using the given index. This index should already have been created. The clustering is a one-time operation and its effect does not persist through subsequent operations on this table, i.e. if more records are inserted/updated, the table may not remain ordered. If the user needs to keep the table clustered (ordered), they can re-run the first command, without giving an index name.

The second command is only useful to re-cluster tables (i.e. tables which were already clustered using some index). It re-clusters all tables in the current database that are visible to the currently connected user.

For example, in the below figure, the first SELECT returns records in unsorted order, as there is no clustered index. Even though there is already a non-clustered index, the records get selected from the heap area, where the records are not sorted.

The second SELECT returns the records sorted by column “id” as it has been clustered using index containing column “id”.

The third SELECT returns some records in sorted order, but newly inserted records are not sorted. The fourth SELECT again returns all records in sorted order, as the table has been clustered again.

PostgreSQL Cluster Command
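
Continuing the hypothetical demo table from the earlier sketch, the commands behind the figure look roughly like this:

CLUSTER demo USING demo_idx;   -- one-time physical reordering of the table by id
SELECT * FROM demo LIMIT 5;    -- typically returned in clustered order, though SQL guarantees no order without ORDER BY
CLUSTER demo;                  -- re-clusters using the previously chosen index
CLUSTER;                       -- re-clusters every previously clustered table visible to the current user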

Index Type

PostgreSQL provides several types of indexes as below:

  • B-Tree
  • Hash
  • GiST
  • GIN
  • BRIN

Each index type implements a different kind of underlying data structure, best suited for different types of queries. By default, a B-Tree index gets created, which is the most widely used index type. Details of each index type will be covered in a future blog.

Misc: Partial and Expression Index

We have only discussed indexes on one or more columns of a table, but there are two other ways to create indexes in PostgreSQL:

  • Partial Index: A partial index is an index built over the subset of a table defined by a conditional expression given during CREATE INDEX. With a partial index, storage space for index data is saved. The user should choose the condition so that it excludes very common values, since for frequent (common) values an index scan will not be chosen anyway. The rest of the functionality remains the same as for a normal index (see the sketch after this list).
    Example: Partial Index
  • Expression Index: Expression indexes give another kind of flexibility in PostgreSQL. All indexes discussed so far, including partial indexes, are on a particular set of columns. But what if a query accesses a table based on an expression (involving one or more columns)? Without an expression index, it would not choose an index scan. To make such queries fast, PostgreSQL allows you to create an index on an expression. The rest of the functionality remains the same as for a normal index (see the sketch after this list).
    Example: Expression Index
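The following hypothetical example (the orders table and its columns are assumptions, not taken from the screenshots above) illustrates both kinds of indexes:

-- Partial index: only index the rows that are still pending, assuming most rows are not
CREATE INDEX idx_orders_pending ON orders (order_id) WHERE status = 'pending';

-- Expression index: allow index scans for case-insensitive lookups on email
CREATE INDEX idx_orders_email_lower ON orders (lower(email));
SELECT * FROM orders WHERE lower(email) = 'alice@example.com';  -- can use idx_orders_email_lower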

Index Storage in InnoDB

The usage and functionality of indexes in InnoDB are mostly the same as in PostgreSQL, with a major difference in terms of the clustered index.

InnoDB supports two categories of Indexes:

  • Clustered Index
  • Secondary Index

Clustered Index

The clustered index is a special kind of index in InnoDB. Here the indexed data is not stored separately; rather, it is part of the whole row data. In other words, the clustered index forces the table data to be sorted physically using the key column of the index. It can be thought of as a “dictionary”, where data is sorted alphabetically.

Since the clustered index sorts rows using an index key, there can be only one clustered index. Also, there must be one clustered index, as InnoDB uses it to optimally manipulate data during various data operations.

A clustered index is created automatically (as part of table creation) using one of the table columns, with the following priority:

  • Using the primary key if the primary key is mentioned as part of the table creation.
  • Otherwise, chooses a unique index where all the key columns are NOT NULL.
  • Otherwise internally generates a hidden clustered index on a system column which contains the row ID of each row.

Unlike a PostgreSQL non-clustered index, InnoDB accesses a row faster using the clustered index, because the index search leads directly to the page with all the row data, avoiding random I/O.

Also, getting the table data in sorted order using the clustered index is very fast, as all data is already sorted and the whole row is available there.

Secondary Index

An index created explicitly in InnoDB is considered a secondary index, which is similar to a PostgreSQL non-clustered index. Each record in the secondary index storage contains the primary key columns of the row (which were used for creating the clustered index) as well as the columns specified when creating the secondary index.

InnoDB: Data read using index

Fetching data using a secondary index is similar to PostgreSQL, except that the InnoDB secondary index lookup gives a primary key as a pointer to fetch the remaining data from the clustered index.

For example, as shown in the above picture, the clustered index is on column ID, so the table data is sorted by it. The secondary index is on column “name”, so the secondary index has both the ID and name values. Once we look up using the secondary index, it finds the appropriate slot with the corresponding key value. Then the corresponding primary key is used to fetch the remaining part of the data from the clustered index.
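
A minimal InnoDB sketch of this layout (table, column names, and values are hypothetical):

CREATE TABLE employees (
    id INT NOT NULL,
    name VARCHAR(50),
    salary INT,
    PRIMARY KEY (id)                          -- becomes the clustered index
) ENGINE=InnoDB;

CREATE INDEX idx_name ON employees (name);    -- secondary index; each entry also stores the primary key (id)

-- The lookup below first finds (name, id) entries in idx_name, then uses id
-- to fetch the full row from the clustered index.
SELECT * FROM employees WHERE name = 'Shaun';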

MVCC for Index

The clustered index MVCC uses the traditional InnoDB undo model (actually the same as the whole-data MVCC, as the clustered index is nothing but the whole data).

But the secondary index uses a slightly different approach to maintain MVCC. On update of the secondary index, the old index entry is delete-marked and new records are inserted in the same storage, i.e. the UPDATE is not in-place. Finally, old index entries get purged. By now you might have noticed that the InnoDB secondary index MVCC is almost the same as the PostgreSQL MVCC model.

Index Type

InnoDB supports only the B-Tree type of index, and hence the index type does not need to be specified while creating an index.

Misc: Adaptive Hash Indexes

As mentioned in the previous section, only the B-Tree index type is supported by InnoDB, but there is a twist. InnoDB can automatically detect whether a query can benefit from building a hash index and whether the whole data of the table can fit into memory; if so, it automatically builds one.

The hash index is built using an existing B-Tree index, depending on the query. If there are multiple secondary B-Tree indexes, it will choose the one that qualifies for the query. The hash index built is not complete; it is only a partial index reflecting the data usage pattern.

This is one of the really powerful features to dynamically improve query performance.
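
As a hedged sketch, the feature can be inspected and toggled with standard server variables, and the engine status output reports hash versus non-hash searches:

-- Check whether the adaptive hash index is enabled (it is ON by default):
SHOW VARIABLES LIKE 'innodb_adaptive_hash_index';

-- Disable it globally if profiling shows the workload does not benefit from it:
SET GLOBAL innodb_adaptive_hash_index = OFF;

-- The "INSERT BUFFER AND ADAPTIVE HASH INDEX" section shows hash search statistics:
SHOW ENGINE INNODB STATUS\G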

Conclusion

Using an index in any database really helps to improve READ performance, but at the same time it degrades INSERT/UPDATE performance since additional data needs to be written. So indexes should be chosen wisely and created only if the index keys are used as predicates to fetch data.

InnoDB provides a very good feature in terms of the clustered index, which might be very useful depending on the use-cases. Also, its adaptive hash indexing is very powerful.

PostgreSQL, on the other hand, provides various types of indexes, which offer feature-rich options, and one or all of them can be used depending on the business use case. Partial and expression indexes are also quite useful depending on the use case.


How to Run PHP 5 Applications with MySQL 8.0 on CentOS 7


Despite the fact that PHP 5 has reached end-of-life, there are still legacy applications built on top of it that need to run in production or test environments. If you are installing PHP packages via operating system repository, there is still a chance you will end up with PHP 5 packages, e.g. CentOS 7 operating system. Having said that, there is always a way to make your legacy applications run with the newer database versions, and thus take advantage of new features.

In this blog post, we’ll walk you through how we can run PHP 5 applications with the latest version of MySQL 8.0 on CentOS 7 operating system. This blog is based on actual experience with an internal project that required PHP 5 application to be running alongside our new MySQL 8.0 in a new environment. Note that it would work best to run the latest version of PHP 7 alongside MySQL 8.0 to take advantage of all of the significant improvements introduced in the newer versions.

PHP and MySQL on CentOS 7

First of all, let's see what files are being provided by php-mysql package:

$ cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
$ repoquery -q -l --plugins php-mysql
/etc/php.d/mysql.ini
/etc/php.d/mysqli.ini
/etc/php.d/pdo_mysql.ini
/usr/lib64/php/modules/mysql.so
/usr/lib64/php/modules/mysqli.so
/usr/lib64/php/modules/pdo_mysql.so

By default, if we install the standard LAMP stack components that come with CentOS 7, for example:

$ yum install -y httpd php php-mysql php-gd php-curl mod_ssl

You would get the following related packages installed:

$ rpm -qa | egrep 'php-mysql|mysql|maria'
php-mysql-5.4.16-46.el7.x86_64
mariadb-5.5.60-1.el7_5.x86_64
mariadb-libs-5.5.60-1.el7_5.x86_64
mariadb-server-5.5.60-1.el7_5.x86_64

The following MySQL-related modules will then be loaded into PHP:

$ php -m | grep mysql
mysql
mysqli
pdo_mysql

When looking at the API version reported by phpinfo() for MySQL-related clients, they are all matched to the MariaDB version that we have installed:

$ php -i | egrep -i 'client.*version'
Client API version => 5.5.60-MariaDB
Client API library version => 5.5.60-MariaDB
Client API header version => 5.5.60-MariaDB
Client API version => 5.5.60-MariaDB

At this point, we can conclude that the installed php-mysql module is built and compatible with MariaDB 5.5.60.

Installing MySQL 8.0

However, in this project, we are required to run on MySQL 8.0 so we chose Percona Server 8.0 to replace the default existing MariaDB installation we have on that server. To do that, we have to install Percona Repository and enable the Percona Server 8.0 repository:

$ yum install https://repo.percona.com/yum/percona-release-latest.noarch.rpm
$ percona-release setup ps80
$ yum install percona-server-server

However, we got the following error after running the very last command:

--> Finished Dependency Resolution
Error: Package: 1:mariadb-5.5.60-1.el7_5.x86_64 (@base)
           Requires: mariadb-libs(x86-64) = 1:5.5.60-1.el7_5
           Removing: 1:mariadb-libs-5.5.60-1.el7_5.x86_64 (@anaconda)
               mariadb-libs(x86-64) = 1:5.5.60-1.el7_5
           Obsoleted By: percona-server-shared-compat-8.0.15-6.1.el7.x86_64 (ps-80-release-x86_64)
               Not found
Error: Package: 1:mariadb-server-5.5.60-1.el7_5.x86_64 (@base)
           Requires: mariadb-libs(x86-64) = 1:5.5.60-1.el7_5
           Removing: 1:mariadb-libs-5.5.60-1.el7_5.x86_64 (@anaconda)
               mariadb-libs(x86-64) = 1:5.5.60-1.el7_5
           Obsoleted By: percona-server-shared-compat-8.0.15-6.1.el7.x86_64 (ps-80-release-x86_64)
               Not found
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

The above simply means that the Percona Server shared compat package obsoletes mariadb-libs-5.5.60, which is required by the already installed mariadb-server packages. Since this is a brand new server, removing the existing MariaDB packages is not a big issue. Let's remove them first and then try to install Percona Server 8.0 once more:

$ yum remove mariadb mariadb-libs
...
Resolving Dependencies
--> Running transaction check
---> Package mariadb-libs.x86_64 1:5.5.60-1.el7_5 will be erased
--> Processing Dependency: libmysqlclient.so.18()(64bit) for package: perl-DBD-MySQL-4.023-6.el7.x86_64
--> Processing Dependency: libmysqlclient.so.18()(64bit) for package: 2:postfix-2.10.1-7.el7.x86_64
--> Processing Dependency: libmysqlclient.so.18()(64bit) for package: php-mysql-5.4.16-46.el7.x86_64
--> Processing Dependency: libmysqlclient.so.18(libmysqlclient_18)(64bit) for package: perl-DBD-MySQL-4.023-6.el7.x86_64
--> Processing Dependency: libmysqlclient.so.18(libmysqlclient_18)(64bit) for package: 2:postfix-2.10.1-7.el7.x86_64
--> Processing Dependency: libmysqlclient.so.18(libmysqlclient_18)(64bit) for package: php-mysql-5.4.16-46.el7.x86_64
--> Processing Dependency: mariadb-libs(x86-64) = 1:5.5.60-1.el7_5 for package: 1:mariadb-5.5.60-1.el7_5.x86_64
---> Package mariadb-server.x86_64 1:5.5.60-1.el7_5 will be erased
--> Running transaction check
---> Package mariadb.x86_64 1:5.5.60-1.el7_5 will be erased
---> Package perl-DBD-MySQL.x86_64 0:4.023-6.el7 will be erased
---> Package php-mysql.x86_64 0:5.4.16-46.el7 will be erased
---> Package postfix.x86_64 2:2.10.1-7.el7 will be erased

Removing mariadb-libs will also remove other packages that depend on this from the system. Our primary concern is the php-mysql packages which will be removed because of the dependency on libmysqlclient.so.18 provided by mariadb-libs. We will fix that later.

After that, we should be able to install Percona Server 8.0 without error:

$ yum install percona-server-server

At this point, here are MySQL related packages that we have in the server:

$ rpm -qa | egrep 'php-mysql|mysql|maria|percona'
percona-server-client-8.0.15-6.1.el7.x86_64
percona-server-shared-8.0.15-6.1.el7.x86_64
percona-server-server-8.0.15-6.1.el7.x86_64
percona-release-1.0-11.noarch
percona-server-shared-compat-8.0.15-6.1.el7.x86_64

Notice that we don't have php-mysql packages that provide modules to connect our PHP application with our freshly installed Percona Server 8.0 server. We can confirm this by checking the loaded PHP module. You should get empty output with the following command:

$ php -m | grep mysql

Let's install it again:

$ yum install php-mysql
$ systemctl restart httpd

Now we do have them and are loaded into PHP:

$ php -m | grep mysql
mysql
mysqli
pdo_mysql

And we can also confirm that by looking at the PHP info via command line:

$ php -i | egrep -i 'client.*version'
Client API version => 5.6.28-76.1
Client API library version => 5.6.28-76.1
Client API header version => 5.5.60-MariaDB
Client API version => 5.6.28-76.1

Notice the difference between the Client API library version and the API header version. We will see the effect of that later during the test.

Let's start our MySQL 8.0 server to test out our PHP5 application. Since we had MariaDB use the datadir in /var/lib/mysql, we have to wipe it out first, re-initialize the datadir, assign proper ownership and start it up:

$ rm -Rf /var/lib/mysql
$ mysqld --initialize
$ chown -Rf mysql:mysql /var/lib/mysql
$ systemctl start mysql

Grab the temporary MySQL root password generated by Percona Server from the MySQL error log file:

$ grep root /var/log/mysqld.log
2019-07-22T06:54:39.250241Z 5 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: 1wAXsGrISh-D

Use it to log in the first time as user root@localhost. We have to change the temporary password to something else before we can perform any further action on the server:

$ mysql -uroot -p
mysql> ALTER USER root@localhost IDENTIFIED BY 'myP455w0rD##';

Then, proceed to create our database resources required by our application:

mysql> CREATE SCHEMA testdb;
mysql> CREATE USER testuser@localhost IDENTIFIED BY 'password';
mysql> GRANT ALL PRIVILEGES ON testdb.* TO testuser@localhost;

Once done, import the existing data from backup into the database, or create your database objects manually. Our database is now ready to be used by our application.

Errors and Warnings

In our application, we had a simple test file to make sure the application is able to connect via socket, or in other words via localhost on port 3306, to eliminate database connections over the network. Immediately, we would get the version mismatch warning:

$ php -e test_mysql.php
PHP Warning:  mysqli::mysqli(): Headers and client library minor version mismatch. Headers:50560 Library:50628 in /root/test_mysql.php on line 9

At the same time, you would also encounter the authentication error with php-mysql module:

$ php -e test_mysql.php
PHP Warning:  mysqli::mysqli(): (HY000/2059): Authentication plugin 'caching_sha2_password' cannot be loaded: /usr/lib64/mysql/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory in /root/test_mysql.php on line 9

Or, if you were running with MySQL native driver library (php-mysqlnd), you would get the following error:

$ php -e test_mysql.php
PHP Warning:  mysqli::mysqli(): The server requested authentication method unknown to the client [caching_sha2_password] in /root/test_mysql.php on line 9

Plus, there would also be another issue regarding the charset:

PHP Warning:  mysqli::mysqli(): Server sent charset (255) unknown to the client. Please, report to the developers in /root/test_mysql.php on line 9

Solutions and Workarounds

Authentication plugin

Neither the php-mysqlnd nor the php-mysql library for PHP 5 supports the new authentication method of MySQL 8.0. Starting from MySQL 8.0.4, the default authentication method has been changed to 'caching_sha2_password', which offers more secure password hashing compared to 'mysql_native_password', the default in previous versions.

To allow backward compatibility on our MySQL 8.0 server, add the following line under the [mysqld] section inside the MySQL configuration file:

default-authentication-plugin=mysql_native_password

Restart the MySQL server and you should be good. If the database user has been created before the above change, e.g. via backup and restore, re-create the user by using DROP USER and CREATE USER statements. MySQL will follow the new default authentication plugin when creating the new user.
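
For example, a hedged sketch using the test account created earlier (drop and recreate it so it picks up the new default, or pin the plugin explicitly):

DROP USER testuser@localhost;
CREATE USER testuser@localhost IDENTIFIED WITH mysql_native_password BY 'password';
GRANT ALL PRIVILEGES ON testdb.* TO testuser@localhost;

-- Verify which authentication plugin each account uses:
SELECT user, host, plugin FROM mysql.user WHERE user = 'testuser';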

Minor version mismatch

With php-mysql package, if we check the library version installed, we would notice the difference:

$ php -i | egrep -i 'client.*version'
Client API version => 5.6.28-76.1
Client API library version => 5.6.28-76.1
Client API header version => 5.5.60-MariaDB
Client API version => 5.6.28-76.1

The PHP library is compiled with MariaDB 5.5.60 libmysqlclient, while the client API version is on version 5.6.28, provided by percona-server-shared-compat package. Despite the warning, you can still get a correct response from the server.

To suppress this warning on library version mismatch, use php-mysqlnd package, which does not depend on MySQL Client Server library (libmysqlclient). This is the recommended way, as stated in MySQL documentation.

To replace php-mysql library with php-mysqlnd, simply run:

$ yum remove php-mysql
$ yum install php-mysqlnd
$ systemctl restart httpd

If replacing php-mysql is not an option, the last resort is to compile PHP against the MySQL 8.0 Client Server library (libmysqlclient) manually and copy the compiled library files into the /usr/lib64/php/modules/ directory, replacing the old mysqli.so, mysql.so and pdo_mysql.so. This is a bit of a hassle with a small chance of success, mostly due to deprecated header file dependencies in the current MySQL version. Knowledge of programming is required to work around that.

Incompatible Charset

Starting from MySQL 8.0.1, MySQL has changed the default character set from latin1 to utf8mb4. The utf8mb4 character set is useful because nowadays the database has to store not only language characters but also symbols, newly introduced emojis, and so on. The utf8mb4 charset is a UTF-8 encoding of the Unicode character set using one to four bytes per character, compared to the standard utf8 (a.k.a. utf8mb3), which uses one to three bytes per character.

Many legacy applications were not built on top of the utf8mb4 character set, so it is a good idea to change the character setting of the MySQL server to something understandable by our legacy PHP driver. Add the following two lines into the MySQL configuration under the [mysqld] section:

collation-server = utf8_unicode_ci
character-set-server = utf8

Optionally, you can also add the following lines into MySQL configuration file to streamline all client access to use utf8:

[client]
default-character-set=utf8

[mysql]
default-character-set=utf8

Don't forget to restart the MySQL server for the changes to take effect. At this point, our application should be getting along with MySQL 8.0.
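
After the restart, a quick sanity check confirms the server-wide defaults picked up the change:

SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'collation_server';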

That's it for now. Do share any feedback with us in the comments section if you have any other issues moving legacy applications to MySQL 8.0.

Big Data with PostgreSQL and Apache Spark


PostgreSQL is well known as the most advanced open source database, and it helps you to manage your data no matter how big, small, or varied the dataset is, so you can use it to manage or analyze your big data. There are, of course, several ways to make this possible, e.g. Apache Spark. In this blog, we’ll see what Apache Spark is and how we can use it to work with our PostgreSQL database.

For big data analytics, we have two different types of analytics:

  • Batch analytics: Based on the data collected over a period of time.
  • Real-time (stream) analytics: Based on immediate data for an instant result.

What is Apache Spark?

Apache Spark is a unified analytics engine for large-scale data processing that can work on both batch and real-time analytics in a faster and easier way.

It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs.

Apache Spark Components

Apache Spark Libraries

Apache Spark includes different libraries:

  • Spark SQL: It’s a module for working with structured data using SQL or a DataFrame API. It provides a common way to access a variety of data sources, including Hive, Avro, Parquet, ORC, JSON, and JDBC. You can even join data across these sources.
  • Spark Streaming: It makes it easy to build scalable fault-tolerant streaming applications using a language-integrated API for stream processing, letting you write streaming jobs the same way you write batch jobs. It supports Java, Scala and Python. Spark Streaming recovers both lost work and operator state out of the box, without any extra code on your part. It lets you reuse the same code for batch processing, join streams against historical data, or run ad-hoc queries on stream state.
  • MLlib (Machine Learning): It’s a scalable machine learning library. MLlib contains high-quality algorithms that leverage iteration and can yield better results than the one-pass approximations sometimes used on MapReduce.
  • GraphX: It’s an API for graphs and graph-parallel computation. GraphX unifies ETL, exploratory analysis, and iterative graph computation within a single system. You can view the same data as both graphs and collections, transform and join graphs with RDDs efficiently, and write custom iterative graph algorithms using the Pregel API.

Apache Spark Advantages

According to the official documentation, some advantages of Apache Spark are:

  • Speed: Run workloads 100x faster. Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG (Directed Acyclic Graph) scheduler, a query optimizer, and a physical execution engine.
  • Ease of Use: Write applications quickly in Java, Scala, Python, R, and SQL. Spark offers over 80 high-level operators that make it easy to build parallel apps. You can use it interactively from the Scala, Python, R, and SQL shells.
  • Generality: Combine SQL, streaming, and complex analytics. Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application.
  • Runs Everywhere: Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud. It can access diverse data sources. You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes. Access data in HDFS, Alluxio, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.

Now, let’s see how we can integrate this with our PostgreSQL database.

How to Use Apache Spark with PostgreSQL

We’ll assume you have your PostgreSQL cluster up and running. For this task, we’ll use a PostgreSQL 11 server running on CentOS 7.

First, let’s create our testing database on our PostgreSQL server:

postgres=# CREATE DATABASE testing;
CREATE DATABASE
postgres=# \c testing
You are now connected to database "testing" as user "postgres".

Now, we’re going to create a table called t1:

testing=# CREATE TABLE t1 (id int, name text);
CREATE TABLE

And insert some data there:

testing=# INSERT INTO t1 VALUES (1,'name1');
INSERT 0 1
testing=# INSERT INTO t1 VALUES (2,'name2');
INSERT 0 1

Check the data created:

testing=# SELECT * FROM t1;
 id | name
----+-------
  1 | name1
  2 | name2
(2 rows)

To connect Apache Spark to our PostgreSQL database, we’ll use a JDBC connector. You can download it from here.

$ wget https://jdbc.postgresql.org/download/postgresql-42.2.6.jar

Now, let’s install Apache Spark. For this, we need to download the spark packages from here.

$ wget http://us.mirrors.quenda.co/apache/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
$ tar zxvf spark-2.4.3-bin-hadoop2.7.tgz
$ cd spark-2.4.3-bin-hadoop2.7/

To run the Spark shell, we’ll need Java installed on our server:

$  yum install java

So now, we can run our Spark Shell:

$ ./bin/spark-shell
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://ApacheSpark1:4040
Spark context available as 'sc' (master = local[*], app id = local-1563907528854).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.3
      /_/

Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_212)
Type in expressions to have them evaluated.
Type :help for more information.

scala>

We can access the Spark context Web UI available on port 4040 on our server:

Apache Spark UI

Inside the Spark shell, we need to add the PostgreSQL JDBC driver:

scala> :require /path/to/postgresql-42.2.6.jar
Added '/path/to/postgresql-42.2.6.jar' to classpath.
scala> import java.util.Properties
import java.util.Properties

And add the JDBC information to be used by Spark:

scala> val url = "jdbc:postgresql://localhost:5432/testing"
url: String = jdbc:postgresql://localhost:5432/testing
scala> val connectionProperties = new Properties()
connectionProperties: java.util.Properties = {}
scala> connectionProperties.setProperty("Driver", "org.postgresql.Driver")
res6: Object = null

Now, we can execute SQL queries. First, let’s define query1 as SELECT * FROM t1, our testing table.

scala> val query1 = "(SELECT * FROM t1) as q1"
query1: String = (SELECT * FROM t1) as q1

And create the DataFrame:

scala> val query1df = spark.read.jdbc(url, query1, connectionProperties)
query1df: org.apache.spark.sql.DataFrame = [id: int, name: string]

So now, we can perform an action over this DataFrame:

scala> query1df.show()
+---+-----+
| id| name|
+---+-----+
|  1|name1|
|  2|name2|
+---+-----+
scala> query1df.explain
== Physical Plan ==
*(1) Scan JDBCRelation((SELECT * FROM t1) as q1) [numPartitions=1] [id#19,name#20] PushedFilters: [], ReadSchema: struct<id:int,name:string>

We can add more values and run it again just to confirm that it’s returning the current values.

PostgreSQL

testing=# INSERT INTO t1 VALUES (10,'name10'), (11,'name11'), (12,'name12'), (13,'name13'), (14,'name14'), (15,'name15');
INSERT 0 6
testing=# SELECT * FROM t1;
 id |  name
----+--------
  1 | name1
  2 | name2
 10 | name10
 11 | name11
 12 | name12
 13 | name13
 14 | name14
 15 | name15
(8 rows)

Spark

scala> query1df.show()
+---+------+
| id|  name|
+---+------+
|  1| name1|
|  2| name2|
| 10|name10|
| 11|name11|
| 12|name12|
| 13|name13|
| 14|name14|
| 15|name15|
+---+------+

In our example, we’re showing only how Apache Spark works with our PostgreSQL database, not how it manages our Big Data information.

Conclusion

Nowadays, it’s pretty common to face the challenge of managing big data in a company, and as we could see, we can use Apache Spark to cope with it and make use of all the features that we mentioned earlier. Big data is a huge world, so you can check the official documentation for more information about the usage of Apache Spark with PostgreSQL and fit it to your requirements.

Handling Large Transactions with Streaming Replication and MariaDB 10.4


Dealing with large transactions was always a pain point in Galera Cluster. The way in which Galera writeset certification works causes trouble when transactions are long or when a single row is modified often on multiple nodes. As a result, transactions have to be rolled back and retried, causing performance drops. Luckily, this problem has been addressed in Galera 4, a new release of Galera from Codership. This library is used in MariaDB 10.4, so installing MariaDB 10.4 is the easiest way of testing the newly introduced features. In this blog post we will take a look at how streaming replication can be used to mitigate problems which used to be a standard issue in previous Galera versions.

We will use a three-node MariaDB Galera Cluster version 10.4.6, which comes with Galera version 26.4.2.

MariaDB [(none)]> show global status like 'wsrep_provider%';
+-----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name               | Value                                                                                                                                          |
+-----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| wsrep_provider_capabilities | :MULTI_MASTER:CERTIFICATION:PARALLEL_APPLYING:TRX_REPLAY:ISOLATION:PAUSE:CAUSAL_READS:INCREMENTAL_WRITESET:UNORDERED:PREORDERED:STREAMING:NBO: |
| wsrep_provider_name         | Galera                                                                                                                                         |
| wsrep_provider_vendor       | Codership Oy <info@codership.com>                                                                                                              |
| wsrep_provider_version      | 26.4.2(r4498)                                                                                                                                  |
+-----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.001 sec)

There are three main pain points streaming replication is intended to deal with:

  • Long transactions
  • Large transactions
  • Hot spots in tables

Let’s consider them one by one and see how streaming replication may help us to deal with them, but first let’s focus on writeset certification - the root cause of those issues.

Writeset certification in Galera cluster

Galera Cluster consists of multiple writeable nodes. Each transaction executed on the Galera cluster forms a writeset. Every writeset has to be sent to all of the nodes in the cluster for certification - a process which ensures that all the nodes can apply the given transaction. Writesets have to be executed on all of the cluster nodes, so if there is any conflict, the transaction cannot be committed. What are the typical reasons why a transaction cannot be committed? Well, the three points we listed earlier:

  • Long transactions - the longer the transaction takes, the more likely it is that in the meantime another node will execute updates which will eventually conflict with the writeset and prevent it from passing certification
  • Large transactions - first of all, large transactions are also longer than small ones, so that triggers the first problem. The second problem, strictly related to large transactions, is the volume of the changes. The more rows are going to be updated, the more likely it is that some write on another node will result in a conflict and the whole transaction will have to be rolled back.
  • Hot spots in tables - the more likely a given row is to be updated, the more probable it is that such an update will happen simultaneously on multiple nodes, resulting in some of the transactions being rolled back

The main issue here is that Galera does not introduce any locking on nodes other than the initial node on which the transaction was opened. The certification process is based on the hope that if one node could execute a transaction, the others should be able to do it too. It is true but, as we discussed, there are corner cases in which the probability of this happening is significantly reduced.
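
If you suspect you are hitting these corner cases, a quick check of the Galera status counters (a sketch; both counters are available in MariaDB 10.4) shows how often certification conflicts and brute-force aborts occur:

SHOW GLOBAL STATUS LIKE 'wsrep_local_cert_failures';
SHOW GLOBAL STATUS LIKE 'wsrep_local_bf_aborts';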

In Galera 4, with streaming replication, the behavior has changed and locks are taken on all of the nodes. Transactions will be split into parts and each part will be certified on all nodes. After successful certification, rows will be locked on all nodes in the cluster. There are a couple of variables that govern how exactly this is done - wsrep_trx_fragment_size and wsrep_trx_fragment_unit define how large the fragment should be and in which unit it is measured. It is very fine-grained control: you can define the fragment unit as bytes, statements or rows, which makes it possible to run the certification for every row modified in the transaction. Let’s take a look at how you can benefit from streaming replication in real life.
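
Before the walkthrough below, you can inspect the defaults and enable streaming replication for a single session only; a minimal sketch (certifying after every 10 modified rows, an arbitrary example value):

SHOW GLOBAL VARIABLES LIKE 'wsrep_trx_fragment%';

SET SESSION wsrep_trx_fragment_unit = 'rows';
SET SESSION wsrep_trx_fragment_size = 10;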

Working with the streaming replication

Let’s consider the following scenario. We have a transaction to run that takes at least 30 seconds:

BEGIN; UPDATE sbtest.sbtest1 SET k = k - 2 WHERE id < 2000 ; UPDATE sbtest.sbtest1 SET k = k + 1 WHERE id < 2000 ; UPDATE sbtest.sbtest1 SET k = k + 1 WHERE id < 2000 ; SELECT SLEEP(30); COMMIT;

Then, while it is running, we will execute SQL that touches similar rows. This will be executed on another node:

BEGIN; UPDATE sbtest.sbtest1 SET k = k - 1 WHERE id < 20 ; UPDATE sbtest.sbtest1 SET k = k + 1 WHERE id < 20 ; COMMIT;

What would be the result?

The first transaction is rolled back as soon as the second one is executed:

MariaDB [sbtest]> BEGIN; UPDATE sbtest.sbtest1 SET k = k - 2 WHERE id < 2000 ; UPDATE sbtest.sbtest1 SET k = k + 1 WHERE id < 2000 ; UPDATE sbtest.sbtest1 SET k = k + 1 WHERE id < 2000 ; SELECT SLEEP(30); COMMIT;
Query OK, 0 rows affected (0.001 sec)

Query OK, 667 rows affected (0.020 sec)
Rows matched: 667  Changed: 667  Warnings: 0

Query OK, 667 rows affected (0.010 sec)
Rows matched: 667  Changed: 667  Warnings: 0

Query OK, 667 rows affected (0.009 sec)
Rows matched: 667  Changed: 667  Warnings: 0

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Query OK, 0 rows affected (0.001 sec)

The transaction on the second node succeeded:

MariaDB [(none)]> BEGIN; UPDATE sbtest.sbtest1 SET k = k - 1 WHERE id < 20 ; UPDATE sbtest.sbtest1 SET k = k + 1 WHERE id < 20 ; COMMIT;
Query OK, 0 rows affected (0.000 sec)

Query OK, 7 rows affected (0.002 sec)
Rows matched: 7  Changed: 7  Warnings: 0

Query OK, 7 rows affected (0.001 sec)
Rows matched: 7  Changed: 7  Warnings: 0

Query OK, 0 rows affected (0.004 sec)

What we can do to avoid it is to use streaming replication for the first transaction. We will ask Galera to certify every row change:

MariaDB [sbtest]> BEGIN; SET SESSION wsrep_trx_fragment_size=1 ; SET SESSION wsrep_trx_fragment_unit='rows' ; UPDATE sbtest.sbtest1 SET k = k - 2 WHERE id < 2000 ; UPDATE sbtest.sbtest1 SET k = k + 1 WHERE id < 2000 ; UPDATE sbtest.sbtest1 SET k = k + 1 WHERE id < 2000 ; SELECT SLEEP(30); COMMIT; SET SESSION wsrep_trx_fragment_size=0;
Query OK, 0 rows affected (0.001 sec)

Query OK, 0 rows affected (0.000 sec)

Query OK, 0 rows affected (0.000 sec)

Query OK, 667 rows affected (1.757 sec)
Rows matched: 667  Changed: 667  Warnings: 0

Query OK, 667 rows affected (1.708 sec)
Rows matched: 667  Changed: 667  Warnings: 0

Query OK, 667 rows affected (1.685 sec)
Rows matched: 667  Changed: 667  Warnings: 0

As you can see, this time it worked just fine. On the second node:

MariaDB [(none)]> BEGIN; UPDATE sbtest.sbtest1 SET k = k - 1 WHERE id < 20 ; UPDATE sbtest.sbtest1 SET k = k + 1 WHERE id < 20 ; COMMIT;
Query OK, 0 rows affected (0.000 sec)

Query OK, 7 rows affected (33.942 sec)
Rows matched: 7  Changed: 7  Warnings: 0

Query OK, 7 rows affected (0.001 sec)
Rows matched: 7  Changed: 7  Warnings: 0

Query OK, 0 rows affected (0.026 sec)

What is interesting is that the UPDATE took almost 34 seconds to execute - this was caused by the fact that the initial transaction, through streaming replication, locked all modified rows on all of the nodes, and our second transaction had to wait for the first one to complete even though both transactions were executed on different nodes.

This is basically it when it comes to the streaming replication. Depending on the requirements and the traffic you may use it in a less strict manner - we certified every row but you can change that to every n-th row or every statement. You can even decide on the volume of data to certify. This should be enough to match the requirements of your environment.

There are a couple more things we would like you to keep in mind. First of all, streaming replication is by no means a solution that should be used by default. This is the reason why it is disabled by default. The recommended use case is to manually decide on transactions that would benefit from streaming replication and enable it at the session level. This is the reason why our examples end with:

SET SESSION wsrep_trx_fragment_size=0;

This statement (setting wsrep_trx_fragment_size to 0) disables streaming replication for current session.

Another thing worth remembering - if you happen to use streaming replication, it will use the ‘wsrep_streaming_log’ table in the ‘mysql’ schema for persistently storing the data that is being streamed. Using this table you can get some idea about the data that is being transferred across the cluster via streaming replication.
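
A minimal sketch of inspecting it while a streaming transaction is in flight (the table typically only holds rows while transactions are being streamed):

SELECT COUNT(*) FROM mysql.wsrep_streaming_log;
SELECT * FROM mysql.wsrep_streaming_log LIMIT 5;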

Finally, the performance. This is also one of the reasons why you do not want to use streaming replication all the time. The main reason for that is locking - with streaming replication you have to acquire row locks on all of the nodes. This takes time, and should you have to roll back the transaction, it also puts pressure on all nodes to perform the rollback. We ran a very quick test of the performance impact that streaming replication has. The environment is strictly a test one, so do not assume those results will be the same on production-grade hardware; it is more for you to see what the impact could be.

We tested four scenarios:

  1. Baseline, set global wsrep_trx_fragment_size=0;
  2. set global wsrep_trx_fragment_unit='rows'; set global wsrep_trx_fragment_size=1;
  3. set global wsrep_trx_fragment_unit='statements'; set global wsrep_trx_fragment_size=1;
  4. set global wsrep_trx_fragment_unit='statements'; set global wsrep_trx_fragment_size=5;

We used sysbench r/w test:

sysbench /root/sysbench/src/lua/oltp_read_write.lua --threads=4 --events=0 --time=300 --mysql-host=10.0.0.141 --mysql-user=sbtest --mysql-password=sbtest --mysql-port=3306 --tables=32 --report-interval=1 --skip-trx=off --table-size=100000 --db-ps-mode=disable run

The results are:

  1. Transactions: 82.91 per sec., queries: 1658.27 per sec. (100%)
  2. Transactions: 54.72 per sec., queries: 1094.43 per sec. (66%)
  3. Transactions: 54.76 per sec., queries: 1095.18 per sec. (66%)
  4. Transactions: 70.93 per sec., queries: 1418.55 per sec. (86%)

As you can see, the impact is significant: performance drops by as much as 33%.

We hope you found this blog post informative and it gave you some insights into the streaming replication that comes with Galera 4 and MariaDB 10.4. We tried to cover use cases and potential drawbacks related to this new technology.

Cloud Vendor Deep-Dive: PostgreSQL on AWS Aurora


How deep should we go with this? I’ll start by saying that as of this writing, I could locate only 3 books on Amazon about PostgreSQL in the cloud, and 117 discussions on PostgreSQL mailing lists about Aurora PostgreSQL. That doesn’t look like a lot, and it leaves me, the curious PostgreSQL end user, with the official documentation as the only place where I could really learn some more. As I don’t have the ability, nor the knowledge to adventure myself much deeper, there is AWS re:Invent 2018 for those who are looking for that kind of thrill. I can settle for Werner’s article on quorums.

To get warmed up, I started from the Aurora PostgreSQL homepage where I noted that the benchmark showing that Aurora PostgreSQL is three times faster than a standard PostgreSQL running on the same hardware dates back to PostgreSQL 9.6. As I’ve learned later, 9.6.9 is currently the default option when setting up a new cluster. That is very good news for those who don’t want to, or cannot upgrade right away. And why only 99.99% availability? One explanation can be found in Bruce Momjian’s article.

Compatibility

According to AWS, Aurora PostgreSQL is a drop-in replacement for PostgreSQL, and the documentation states:

The code, tools, and applications you use today with your existing MySQL and PostgreSQL databases can be used with Aurora.

That is reinforced by Aurora FAQs:

It means that most of the code, applications, drivers and tools you already use today with your PostgreSQL databases can be used with Aurora with little or no change. The Amazon Aurora database engine is designed to be wire-compatible with PostgreSQL 9.6 and 10, and supports the same set of PostgreSQL extensions that are supported with RDS for PostgreSQL 9.6 and 10, making it easy to move applications between the two engines.

“most” in the above text suggests that there isn’t a 100% guarantee, in which case those seeking certainty should consider purchasing technical support from either AWS Professional Services or Amazon Aurora partners. As a side note, I did notice that none of the professional PostgreSQL hosting providers employing core community contributors are on that list.

From Aurora FAQs page we also learn that Aurora PostgreSQL supports the same extensions as RDS, which in turn lists most of the community extensions and a few extras.

Concepts

As part of Amazon RDS, Aurora PostgreSQL comes with its own terminology:

  • Cluster: A Primary DB instance in read/write mode and zero or more Aurora Replicas. The primary DB is often labeled a Master in AWS diagrams, or Writer in the AWS Console. Based on the reference diagram we can make an interesting observation: Aurora writes three times. As the latency between AZs is typically higher than within the same AZ, the transaction is considered committed as soon as it is written to the data copy within the same AZ, which shields commits from the latency and potential outages between AZs.
  • Cluster Volume: Virtual database storage volume spanning multiple AZs.
  • Aurora URL: A `host:port` pair.
  • Cluster Endpoint: Aurora URL for the Primary DB. There is one Cluster Endpoint.
  • Reader Endpoint: Aurora URL for the replica set. To make an analogy with DNS it's an alias (CNAME). Read requests are load balanced between available replicas.
  • Custom Endpoint: Aurora URL to a group consisting of one or more DB instances.
  • Instance Endpoint: Aurora URL to a specific DB instance.
  • Aurora Version: Product version returned by `SELECT AURORA_VERSION();`.

PostgreSQL Performance and Monitoring on AWS Aurora

Sizing

Aurora PostgreSQL applies a best guess configuration which is based on the DB instance size and storage capacity, leaving further tuning to the DBA through the use of DB Parameters groups.

When selecting the DB instance, base your selection on the desired value for max_connections.
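
A quick check, using standard PostgreSQL commands that work on Aurora as well, compares the configured ceiling with what is currently in use:

SHOW max_connections;
SELECT count(*) AS current_connections FROM pg_stat_activity;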

Scaling

Aurora PostgreSQL features auto and manual scaling. Horizontal scaling of read replicas is automated through the use of performance metrics. Vertical scaling can be automated via APIs.

Horizontal scaling takes the instance offline for a few minutes while replacing the compute engine and performing any maintenance operations (upgrades, patching). Therefore AWS recommends performing such operations during maintenance windows.

Scaling in both directions is a breeze:

Vertical scaling: modifying instance class
Horizontal scaling: adding reader replica.

At the storage level, space is added in 10G increments. Allocated storage is never reclaimed, see below for how to address this limitation.

Storage

As mentioned above, Aurora PostgreSQL was engineered to take advantage of quorums in order to improve performance consistency.

Since the underlying storage is shared by all DB instances within the same cluster, no additional writes on standby nodes are required. Also, adding or removing DB instances doesn’t change the underlying data.

Wondering what those IO units mean on the monthly bill? Aurora FAQs comes to the rescue again to explain what an IO is in the context of monitoring and billing: a Read IO is the equivalent of an 8KiB database page read, and a Write IO is the equivalent of 4KiB written to the storage layer.

High Concurrency

In order to take full advantage of Aurora’s high-concurrency design, it is recommended that applications are configured to drive a large number of concurrent queries and transactions.

Applications designed to direct read and write queries to respectively standby and primary database nodes will benefit from Aurora PostgreSQL reader replica endpoint.

Connections are load balanced between read replicas.

Using custom endpoints, database instances with more capacity can be grouped together in order to run an intensive workload such as analytics.

DB Instance Endpoints can be used for fine-grained load balancing or fast failover.

Note that in order for the Reader Endpoints to load balance individual queries, each query must be sent as a new connection.

Caching

Aurora PostgreSQL uses a Survivable Cache Warming technique which ensures that the data in the buffer cache is preserved, eliminating the need for repopulating or warming up the cache following a database restart.

Replication

Replication lag time between replicas is kept within single-digit milliseconds. Although not available for PostgreSQL, it’s good to know that cross-region replication lag is kept within tens of milliseconds.

According to the documentation, replica lag increases during periods of heavy write requests.

Query Execution Plans

Based on the assumption that query performance degrades over time due to various database changes, the role of this Aurora PostgreSQL component is to maintain a list of approved or rejected query execution plans.

Plans are approved or rejected using either proactive or reactive methods.

When an execution plan is marked as rejected, the Query Execution Plan overrides the PostgreSQL optimizer decisions and prevents the “bad” plan from being executed.

This feature requires Aurora 2.1.0 or later.

PostgreSQL High Availability and Replication on AWS Aurora

At the storage layer, Aurora PostgreSQL ensures durability by replicating each 10GB of storage volume, six times across 3 AZs (each region consists of typically 3 AZs) using physical synchronous replication. That makes it possible for database writes to continue working even when 2 copies of data are lost. Read availability survives the loss of 3 copies of data.

Read replicas ensure that a failed primary instance can be quickly replaced by promoting one of the 15 available replicas. When selecting a multi-AZ deployment one read replica is automatically created. Failover requires no user intervention, and database operations resume in less than 30 seconds.

For single-AZ deployments, the recovery procedure includes a restore from the last known good backup. According to Aurora FAQs the process completes in under 15 minutes if the database needs to be restored in a different AZ. The documentation isn’t that specific, claiming that it takes less than 10 minutes to complete the restore process.

No change is required on the application side in order to connect to the new DB instance as the cluster endpoint doesn’t change during a replica promotion or instance restore.

Step 1: delete the primary instance to force a failover:

Automatic failover Step 1: delete primary

Step 2: automatic failover completed

Automatic failover Step 2: failover completed.

For busy databases, the recovery time following a restart or crash is dramatically reduced since Aurora PostgreSQL doesn’t need to replay the transaction logs.

As part of full-managed service, bad data blocks and disks are automatically replaced.

Failover when replicas exist takes up to 120 seconds, and often completes in under 60 seconds. Faster recovery times can be achieved when failover conditions are pre-determined, in which case replicas can be assigned failover priorities.

Aurora PostgreSQL plays nice with Amazon RDS – an Aurora instance can act as a read replica for a primary RDS instance.

Aurora PostgreSQL supports Logical Replication which, just like in the community version, can be used to overcome built-in replication limitations. There is no automation or AWS console interface.

Security for PostgreSQL on AWS Aurora

At network level, Aurora PostgreSQL leverages AWS core components, VPC for cloud network isolation and Security Groups for network access control.

There is no superuser access. When creating a cluster, Aurora PostgreSQL creates a master account with a subset of superuser permissions:

postgres@pg107-dbt3medium-restored-cluster:5432 postgres> \du+ postgres
                            List of roles
 Role name |          Attributes           |    Member of    | Description
-----------+-------------------------------+-----------------+-------------
 postgres  | Create role, Create DB       +| {rds_superuser} |
           | Password valid until infinity |                 |

To secure data in transit, Aurora PostgreSQL provides native SSL/TLS support which can be configured per DB instance.

All data at rest can be encrypted with minimal performance impact. This also applies to backups, snapshots, and replicas.

Encryption at rest.

Authentication is controlled by IAM policies, and tagging allows further control over what users are allowed to do and on what resources.

API calls used by all cloud services are logged in CloudTrail.

Client side Restricted Password Management is available via the rds.restrict_password_commands parameter.

PostgreSQL Backup and Recovery on AWS Aurora

Backups are enabled by default and cannot be disabled. They provide point-in-time-recovery using a full daily snapshot as a base backup.

Restoring from an automated backup has a couple of disadvantages: the time to restore may be several hours and data loss may be up to 5 minutes preceding the outage. Amazon RDS Multi-AZ Deployments solve this problem by promoting a read replica to primary, with no data loss.

Database Snapshots are fast and don’t impact the cluster performance. They can be copied or shared with other users.

Taking a snapshot is almost instantaneous:

Snapshot time.

Restoring a snapshot is also fast. Compare with PITR:

Backups and snapshots are stored in S3 which offers eleven 9’s of durability.

Aside from backups and snapshots, Aurora PostgreSQL allows databases to be cloned. This is an efficient method for creating copies of large data sets. For example, cloning multiple terabytes of data takes only minutes and there is no performance impact.

Aurora PostgreSQL - Point-in-Time Recovery Demo

Connecting to cluster:

~ $ export PGUSER=postgres PGPASSWORD=postgres PGHOST=s9s-us-east-1.cluster-ctfirtyhadgr.us-east-1.rds.amazonaws.com
~ $ psql
Pager usage is off.
psql (11.3, server 10.7)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

Populate a table with data:

postgres@s9s-us-east-1:5432 postgres> create table s9s (id serial not null, msg text, created timestamptz not null default now());
CREATE TABLE

postgres@s9s-us-east-1:5432 postgres> select * from s9s;
id | msg  |            created
----+------+-------------------------------
1 | test | 2019-06-25 07:57:40.022125+00
2 | test | 2019-06-25 07:57:57.666222+00
3 | test | 2019-06-25 07:58:05.593214+00
4 | test | 2019-06-25 07:58:08.212324+00
5 | test | 2019-06-25 07:58:10.156834+00
6 | test | 2019-06-25 07:59:58.573371+00
7 | test | 2019-06-25 07:59:59.5233+00
8 | test | 2019-06-25 08:00:00.318474+00
9 | test | 2019-06-25 08:00:11.153298+00
10 | test | 2019-06-25 08:00:12.287245+00
(10 rows)

Initiate the restore:

Point-in-Time Recovery: initiate restore.

Once the restore is complete log in and check:

~ $ psql -h pg107-dbt3medium-restored-cluster.cluster-ctfirtyhadgr.us-east-1.rds.amazonaws.com
Pager usage is off.
psql (11.3, server 10.7)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

postgres@pg107-dbt3medium-restored-cluster:5432 postgres> select * from s9s;
id | msg  |            created
----+------+-------------------------------
1 | test | 2019-06-25 07:57:40.022125+00
2 | test | 2019-06-25 07:57:57.666222+00
3 | test | 2019-06-25 07:58:05.593214+00
4 | test | 2019-06-25 07:58:08.212324+00
5 | test | 2019-06-25 07:58:10.156834+00
6 | test | 2019-06-25 07:59:58.573371+00
(6 rows)

Best Practices

Monitoring and Auditing

Replication

Encryption

Master Account

Sizing

Parameter Groups

Parameter Groups Demo

Current settings:

postgres@s9s-us-east-1:5432 postgres> show shared_buffers ;
shared_buffers
----------------
10112136kB
(1 row)

Create a new parameter group and set the new cluster wide value:

Updating shared_buffers cluster wide.

Associate the custom parameter group with the cluster:

Reboot the writer and check the value:

postgres@s9s-us-east-1:5432 postgres> show shared_buffers ;
shared_buffers
----------------
1GB
(1 row)

By default, the timezone is in UTC:

postgres@s9s-us-east-1:5432 postgres> show timezone;
TimeZone
----------
UTC
(1 row)

Setting the new timezone:

Configuring timezone

And then check:

postgres@s9s-us-east-1:5432 postgres> show timezone;
TimeZone
------------
US/Pacific
(1 row)

Note that the list of timezone values accepted by Amazon Aurora is not the same as the timezone sets found in upstream PostgreSQL.

  • Review instance parameters that are overridden by cluster parameters
  • Use the parameter group comparison tool.

Snapshots

  • Avoid additional storage charges by sharing snapshots with other accounts to allow restoring into their respective environments.

Maintenance

Failover

DBA Beware!

In addition to the known limitations, avoid or be aware of the following:

Encryption

Aurora Serverless

Parallel Query

Endpoints

From Amazon Connection Management:

  • 5 Custom Endpoints per cluster
  • Custom Endpoint names cannot exceed 63 characters
  • Cluster Endpoint names are unique within the same region
  • As seen in the above screenshot (aurora-custom-endpoint-details) READER and ANY custom endpoint types aren’t available, use the CLI
  • Custom Endpoints are unaware of replicas becoming temporarily unavailable

Replication

Storage

  • Maximum allocated storage does not shrink when data is deleted, neither is space reclaimed by restoring from snapshots. The only way to reclaim space is by performing a logical dump into a new cluster.

Backup and Recovery

Snapshots

Billing

  • The 10 minutes bill applies to new instances, as well as following a capacity change (compute, or storage).

Authentication

Starting and Stopping

From Overview of Stopping and Starting an Aurora DB Cluster:

  • Clusters cannot be left stopped indefinitely as they are started automatically after 7 days.
  • Individual DB instances cannot be stopped.

Upgrades

Cloning

  • 15 clones per database (original or copy).
  • Clones are not removed when deleting the source database.

Scaling

  • Auto-Scaling requires that all replicas are available.
  • There can be only one auto-scaling policy per metric per cluster.
  • Horizontal scaling of the primary DB instance (instance class) is not fully automatic. Before scaling the cluster triggers an automatic failover to one of the replicas. After scaling completes the new instance must be manually promoted from reader to writer:
    New instance left in reader mode after DB instance class change.

Monitoring

Migration

Sizing

  • The smallest available instance class is db.t3.medium and the largest is db.r5.24xlarge. For comparison, the MySQL engine offers db.t2.small and db.t2.medium, but there is no db.r5.24xlarge in the upper range.
  • max_connections upper limit is 262,143.

Query Plan Management

Migration

Aurora PostgreSQL does not provide direct migration services, rather the task is offloaded to a specialized AWS product, namely AWS DMS.

Conclusion

As a fully-managed drop-in replacement for the upstream PostgreSQL, Amazon Aurora PostgreSQL takes advantage of the technologies that power the AWS cloud to remove the complexity required to setup services such as auto-scaling, query load-balancing, low-level data replication, incremental backups, and encryption.

The architecture and a conservative approach for upgrading the PostgreSQL engine provides the performance and the stability organizations from small to large are looking for.

The inherent limitations are just a proof that building a large scale Database as a Service is a complex task, leaving the highly specialized PostgreSQL Hosting Providers with a niche market they can tap into.

Running Big Data Analytics Queries Using SQL and Presto


Presto is an open-source, distributed, parallel SQL engine for big data processing. It was developed from the ground up by Facebook. The first internal release took place in 2013 and was quite a revolutionary solution for their big data problems.

With hundreds of geo-located servers and petabytes of data, Facebook started to look for an alternative platform for their Hadoop clusters. Their infrastructure team wanted to reduce the time needed to run analytics batch jobs and simplify pipeline development by using a programming language widely known in the organization - SQL.

According to Presto foundation, “Facebook uses Presto for interactive queries against several internal data stores, including their 300PB data warehouse. Over 1,000 Facebook employees use Presto daily to run more than 30,000 queries that in total scan over a petabyte each per day.”

While Facebook has an exceptional data warehouse environment, the same challenges are present in many organizations dealing with big data.

In this blog, we will take a look at how to set up a basic Presto environment, either with a pre-configured Docker image or from the server tar file. As a data source, we will focus on the MySQL data source, but it could be any other popular RDBMS.

Running Presto in Big Data Environment

Before we start, let's take a quick look at its main architecture principles. Presto is an alternative to tools that query HDFS using pipelines of MapReduce jobs, such as Hive. Unlike Hive, Presto doesn't use MapReduce. Presto runs with a special-purpose query execution engine with high-level operators and in-memory processing.

In contrast to Hive, Presto can stream data through all the stages at once, running data chunks concurrently. It is designed to run ad-hoc analytic queries against single or distributed heterogeneous data sources. It can reach out from a Hadoop platform to query relational databases or other data stores like flat files.

Presto uses standard ANSI SQL, including aggregations, joins and analytic window functions. SQL is well known and much easier to use compared to MapReduce jobs written in Java.

Deploying Presto to Docker

The basic Presto configuration can be deployed with a pre-configured Docker image or presto server tarball.

The Docker server and Presto CLI containers can be easily deployed with:

docker run -d -p 127.0.0.1:8080:8080 --name presto starburstdata/presto
docker exec -it presto presto-cli

You may choose between two Presto server versions: the Apache version and the Enterprise version from Starburst. Since we are going to run it in a non-production sandbox environment, we will use the Apache version in this article.

Pre-requirements

Presto is implemented entirely in Java and requires a JVM to be installed on your system. It runs on both OpenJDK and Oracle Java. The minimum version is Java 8u151 or Java 11.

To download the Java JDK, visit https://openjdk.java.net/ or https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

You can check your Java version with

$ java -version
openjdk version "1.8.0_222"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_222-b10)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.222-b10, mixed mode)

Presto Installation

To install Presto we are going to download server tar and Presto CLI jar executable.

The tarball will contain a single top-level directory, presto-server-0.223, which we will call the installation directory.

$ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.223/presto-server-0.223.tar.gz
$ tar -xzvf presto-server-0.223.tar.gz
$ cd presto-server-0.223/
$ wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.223/presto-cli-0.223-executable.jar
$ mv presto-cli-0.223-executable.jar presto
$ chmod +x presto

Additionally, Presto needs a data directory for storing logs, etc.

It’s recommended to create a data directory outside of the installation directory.

$ mkdir -p ~/data/presto/

This location is the place where we start our troubleshooting.

Configuring Presto

Before we start our first instance we need to create a bunch of configuration files. Start with the creation of an etc/ directory inside the installation directory. This location will hold the following configuration files:

etc/

  • Node Properties - node environmental configuration
  • JVM Config (jvm.config) - Java Virtual Machine config
  • Config Properties (config.properties) - configuration for the Presto server
  • Catalog Properties - configuration for Connectors (data sources)
  • Log Properties - Loggers configuration

Below you can find some basic configuration to run Presto sandbox. For more details visit documentation.

vi etc/config.properties

coordinator = true
node-scheduler.include-coordinator = true
http-server.http.port = 8080
query.max-memory = 5GB
query.max-memory-per-node = 1GB
discovery-server.enabled = true
discovery.uri = http://localhost:8080

vi etc/jvm.config

-server
-Xmx8G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError

vi etc/log.properties

com.facebook.presto = INFO

vi etc/node.properties

node.environment = production
node.id = ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir = /Users/bartez/data/presto

The basic etc/ structure may look as follows:

The next step is to set up the MySQL connector. We are going to connect to one of the nodes of a three-node MariaDB Cluster.

And another standalone instance running Oracle MySQL 5.7.

The MySQL connector allows querying and creating tables in an external MySQL database. This can be used to join data between different systems like MariaDB and MySQL from Oracle.

Presto uses pluggable connectors and the configuration is very easy. To configure the MySQL connector, create a catalog properties file in etc/catalog named, for example, mysql.properties, to mount the MySQL connector as the mysql catalog. Each of the files represents a connection to a different server. In this case, we have two files:

vi etc/catalog/mysql.properties:

connector.name=mysql
connection-url=jdbc:mysql://node1.net:3306
connection-user=bart
connection-password=secret

vi etc/catalog/mysql2.properties

connector.name=mysql
connection-url=jdbc:mysql://node4.net:3306
connection-user=bart2
connection-password=secret

Running Presto

When all is set, it’s time to start the Presto instance. To start Presto, go to the bin directory under the Presto installation and run the following:

$ bin/launcher start
Started as 18363

To stop Presto, run:

$ bin/launcher stop

Now that the server is up and running, we can connect to Presto with the CLI and query the MySQL database.

To start Presto console run:

./presto --server localhost:8080 --catalog mysql --schema employees

Now we can query our databases via CLI.

presto:mysql> select * from mysql.employees.departments;
 dept_no |     dept_name
---------+--------------------
 d009    | Customer Service
 d005    | Development
 d002    | Finance
 d003    | Human Resources
 d001    | Marketing
 d004    | Production
 d006    | Quality Management
 d008    | Research
 d007    | Sales
(9 rows)

Query 20190730_232304_00019_uq3iu, FINISHED, 1 node
Splits: 17 total, 17 done (100,00%)
0:00 [9 rows, 0B] [81 rows/s, 0B/s]

Both databases, the MariaDB Cluster and MySQL, have been fed with the employees sample database:

wget https://github.com/datacharmer/test_db/archive/master.zip

mysql -uroot -psecret < employees.sql
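
Since both catalogs expose the same employees schema, a single Presto query can join data across the two servers. A hedged sketch, assuming the second catalog is named mysql2 as configured above:

SELECT d.dept_name, count(de.emp_no) AS emp_count
FROM mysql.employees.departments d
JOIN mysql2.employees.dept_emp de ON d.dept_no = de.dept_no
GROUP BY d.dept_name
ORDER BY emp_count DESC;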

The status of the query is also visible in the Presto web console: http://localhost:8080/ui/#

Presto Cluster overview

Conclusion

Many well-known companies (like Airbnb, Netflix, Twitter) are adopting Presto for low-latency performance. It is without a doubt very interesting software which may eliminate the need for running heavy ETL data warehouse processes. In this blog, we just took a brief look at the MySQL connector, but you can use Presto to analyze data from HDFS, object stores, RDBMS (SQL Server, Oracle, PostgreSQL), Kafka, Cassandra, MongoDB, and many others.
