Simplifying MySQL Sharding with ProxySQL: A Step-by-Step Guide

Vivek Thakkar

Cloud & DevOps

Tags:

MySQL

ProxySQL

devops

Database Sharding

Introduction:

ProxySQL is a powerful SQL-aware proxy designed to sit between database servers and client applications, optimizing database traffic with features like load balancing, query routing, and failover. This article focuses on simplifying the setup of ProxySQL, especially for users implementing data-based sharding in a MySQL database.

What is Sharding?

Sharding involves partitioning a database into smaller, more manageable pieces called shards based on certain criteria, such as data attributes. ProxySQL supports data-based sharding, allowing users to distribute data across different shards based on specific conditions.

Understanding the Need for ProxySQL:

ProxySQL is an intermediary layer that enhances database management, monitoring, and optimization. With features like data-based sharding, ProxySQL is an ideal solution for scenarios where databases need to be distributed based on specific data attributes, such as geographic regions.

‍Installation & Setup:‍

There are two ways to install the proxy, either by installing it using packages or running ProxySQL in docker. ProxySQL can be installed using two methods: via packages or running it in a Docker container. For this guide, we will focus on the Docker installation.

1. Install ProxySQL and MySQL Docker Images:

To start, pull the necessary Docker images for ProxySQL and MySQL using the following commands:

CODE: https://gist.github.com/velotiotech/e4ac77a89fe51a75421421119351bcf9.js

2. Create Docker Network:

Create a Docker network for communication between MySQL containers:

CODE: https://gist.github.com/velotiotech/f2fe4a4cfd6f26bfbc52de08ec987b19.js

Note: ProxySQL setup will need connections to multiple SQL servers. So, we will set up multiple SQL servers on our docker inside a Docker network.

Containers within the same Docker network can communicate with each other using their container names or IP addresses.

You can check the list of all the Docker networks currently present by running the following command:

CODE: https://gist.github.com/velotiotech/6db66a3d41bd14afe211d88fb8d12395.js

3. Set Up MySQL Containers:

Now, create three MySQL containers within the network:

Note: We can create any number of MySQL containers.

CODE: https://gist.github.com/velotiotech/3c63c0cdfbd0f2390dec741e509cd19c.js ‍‍

Note: Adjust port numbers as necessary.

The default MySQL protocol port is 3306, but since we cannot access all three of our MySQL containers on the same port, we have set their ports to 3307, 3308, and 3309. Although internally, all MySQL containers will connect using port 3306.

--network=multi-tenant-network. This specifies that the container should be created under the specified network.

We have also specified the root password of the MySQL container to log into it, where the username is “root” and the password is “pass123” for all three of them.

After running the above three commands, three MySQL containers will start running inside the network. You can connect to these three hosts using host = localhost or 127.0.0.1 and port = 3307 / 3308 / 3309.

To ping the port, use the following command:

for macOS:

CODE: https://gist.github.com/velotiotech/e48d067957e6f073da0bbb4a351a4977.js

for Windows:

CODE: https://gist.github.com/velotiotech/36e4e9e97c1e19dcc8e47b8bb4c439f9.js

for Linux:

CODE: https://gist.github.com/velotiotech/301e3f453c7880a6a2c94b2386600c39.js

‍

4. Create Users in MySQL Containers:

Create “user_shard” and “monitor” users in each MySQL container.

The “user_shard” user will be used by the proxy to make queries to the DB.

The “monitor” user will be used by the proxy to monitor the DB.

Note: To access the MySQL container mysql_host_1, use the command:

CODE: https://gist.github.com/velotiotech/996afe24af1a56583cdbd81fad96fd25.js

Use the following commands inside the MySQL container to create the user:‍‍

CODE: https://gist.github.com/velotiotech/700a29637a3327e601184eccf032b068.js

Repeat the above steps for mysql_host_2 & mysql_host_3.‍

If, at any point, you need to drop the user, you can use the following command:

CODE: https://gist.github.com/velotiotech/1c87aeb59559c2835e719d2e311ce903.js

5. Prepare ProxySQL Configuration:

To prepare the configuration, we will need the IP addresses of the MySQL containers. To find those, we can use the following command:

CODE: https://gist.github.com/velotiotech/2bb0e50b918d2a89c3ad70a563bb4cb5.js ‍‍

By running these commands, you will get all the details of the MySQL Docker container under a field named “IPAddress” inside your network. That is the IP address of that particular MySQL container.

Example:
mysql_host_1: 172.19.0.2‍

mysql_host_2: 172.19.0.3‍

mysql_host_3: 172.19.0.4

Reference image for IP address of *mysql_host_1*: 172.19.0.2

Now, create a ProxySQL configuration file named proxysql.cnf. Include details such as IP addresses of MySQL containers, administrative credentials, and MySQL users.

Below is the content that needs to be added to the proxysql.cnf file:

CODE: https://gist.github.com/velotiotech/651c4761165736d6bdde28cf64d1548d.js

Most of the settings are default; we won’t go into much detail for each setting.

admin_variables: These variables are used for ProxySQL's administrative interface. It allows you to connect to ProxySQL and perform administrative tasks such as configuring runtime settings, managing servers, and monitoring performance.

mysql_variables, monitor_username, and monitor_password are used to specify the username that ProxySQL will use when connecting to MySQL servers for monitoring purposes. This monitoring user is used to execute queries and gather statistics about the health and performance of the MySQL servers. This is the user we created during step 4.

mysql_servers will contain all the MySQL servers we want to be connected with ProxySQL. Each entry will have the IP address of the MySQL container, port, host group, and max_connections. Mysql_users will have all the users we created during step 4.

7. Run ProxySQL Container:

Inside the same directory where the proxysql.cnf file is located, run the following command to start ProxySQL:

CODE: https://gist.github.com/velotiotech/0846e2f890594382566ac6c53d5de2ba.js

Here, port 6032 is used for ProxySQL's administrative interface. It allows you to connect to ProxySQL and perform administrative tasks such as configuring runtime settings, managing servers, and monitoring performance.

Port 6033 is the default port for ProxySQL's MySQL protocol interface. It is used for handling MySQL client connections. Our application will use it to access the ProxySQL db and make SQL queries.

The above command will make ProxySQL run on our Docker with the configuration provided in the proxysql.cnf file.

Inside ProxySQL Container:

8. Access ProxySQL Admin Console:

Now, to access the ProxySQL Docker container, use the following command:

CODE: https://gist.github.com/velotiotech/ba76c127f3dda04d7852f743ec9373c0.js

Now, once you’re inside the ProxySQL Docker container, you can access the ProxySQL admin console using the command:

CODE: https://gist.github.com/velotiotech/4e9602738c93a2ace9f820cee9dcd2a2.js

You can run the following queries to get insights into your ProxySQL server:

i) To get the list of all the connected MySQL servers:

CODE: https://gist.github.com/velotiotech/5f767cf8c62d4467080d2f2301c5a235.js

ii) Verify the status of the MySQL backends in the monitor database tables in ProxySQL admin using the following command:

CODE: https://gist.github.com/velotiotech/11f660f92e56531137fddc99b0131a65.js

If this returns an empty set, it means that the monitor username and password are not set correctly. You can do so by using the below commands:

CODE: https://gist.github.com/velotiotech/0c0c3b90927f2edf86bf9754785846b8.js

And then restart the proxy Docker container:

iii) Check the status of DBs connected to ProxySQL using the following command:

CODE: https://gist.github.com/velotiotech/c6054d61bca6319d07f7c6e2aa62e50e.js

iv) To get a list of all the ProxySQL global variables, use the following command:

CODE: https://gist.github.com/velotiotech/43a43734675e4b97054cf3893a806297.js

v) To get all the queries made on ProxySQL, use the following command:

CODE: https://gist.github.com/velotiotech/b49e76daa01947332c8fe995eebf113e.js

Note: Whenever we change any row, use the below commands to load them:

Change in variables:

CODE: https://gist.github.com/velotiotech/ce86c6c75a6af7b3eab910fb923d1260.js

And then restart the proxy docker container.

IMPORTANT:

To connect to ProxySQL’s admin console, first get into the Docker container using the following command:

CODE: https://gist.github.com/velotiotech/05d30554ac8bb6e506acf3c435afbff1.js

Then, to access the ProxySQL admin console, use the following command:

CODE: https://gist.github.com/velotiotech/c6ef6b74bd2711e5cff440e23357e5a4.js

To access the ProxySQL MySQL console, we can directly access it using the following command without going inside the Docker ProxySQL container:

CODE: https://gist.github.com/velotiotech/af13194ffb94596d15dab2648dffea34.js

To make queries to the database, we make use of ProxySQL’s 6033 port, where MySQL is being accessed.

9. Define Query Rules:

We can add custom query rules inside the mysql_query_rules table to redirect queries to specific databases based on defined patterns. Load the rules to runtime and save to disk.

12. Sharding Example:

Now, let's illustrate how to leverage ProxySQL's data-based sharding capabilities through a practical example. We'll create three MySQL containers, each containing data from different continents in the “world” database, specifically within the “countries” table.

Step 1: Create 3 MySQL containers named mysql_host_1, mysql_host_2 & mysql_host_3.

Inside all containers, create a database named “world” with a table named “countries”.

i) Inside mysql_host_1: Insert countries using the following query:

CODE: https://gist.github.com/velotiotech/fe20656e87c70e634166359c16f572e7.js

ii) Inside mysql_host_2: Insert countries using the following query:

CODE: https://gist.github.com/velotiotech/0a23ce55cfdbf4da85b147e45701c459.js

iii) Inside mysql_host_3: Insert countries using the following query:

CODE: INSERT INTO `countries` VALUES (1, "Italy", "Europe"), (2, "Germany", "Europe"), (3, "France", "Europe");

Now, we have distinct data sets for Asia & North America in mysql_host_1, Africa & South America in mysql_host_2, and Europe in mysql_host_3..js

Now, we have distinct data sets for Asia & North America in mysql_host_1, Africa & South America in mysql_host_2, and Europe in mysql_host_3.

Step 2: Define Query Rules for Sharding

Let's create custom query rules to redirect queries based on the continent specified in the SQL statement.

For example, if the query contains the continent “Asia,” we want it to be directed to mysql_host_1.

-- Query Rule for Asia and North America

CODE: https://gist.github.com/velotiotech/5ee79b936735e38e0dab56c7024fa0b5.js

-- Query Rule for Africa and South America

CODE: https://gist.github.com/velotiotech/c69e6470f689fe880835f85a5266a404.js

-- Query Rule for Europe ‍

CODE: https://gist.github.com/velotiotech/b2b02d8642748eaa53a001b2e5790584.js ‍

Step 3: Apply and Save Query Rules

After adding the query rules, ensure they take effect by running the following commands:‍

CODE: https://gist.github.com/velotiotech/b088aea0013f318de1339a0b509cc2f6.js

Step 4: Test Sharding

Now, access the MySQL server using the ProxySQL port and execute queries:

CODE: https://gist.github.com/velotiotech/95bc2c5ae3a7092d31aa1bed49a32208.js

CODE: https://gist.github.com/velotiotech/4d1a02dc68d4c7e858766523447a898d.js

-- Example Queries:

CODE: https://gist.github.com/velotiotech/8def6e5fd0df4b304e7742f01a2eed95.js

-- This will return id=1, name=India, continent=Asia

CODE: https://gist.github.com/velotiotech/b4a3e69bfefb7b61d9ee258b2f5e626d.js

-- This will return id=1, name=Kenya, continent=Africa.

CODE: https://gist.github.com/velotiotech/e0869084891a636777c32b07bbabb86a.js

Based on the defined query rules, the queries will be redirected to the specified MySQL host groups. If no rules match, the default host group that’s specified in mysql_users inside proxysql.cnf will be used.

Conclusion:

ProxySQL simplifies access to distributed data through effective sharding strategies. Its flexible query rules, combined with regex patterns and host group definitions, offer significant flexibility with relative simplicity.

By following this step-by-step guide, users can quickly set up ProxySQL and leverage its capabilities to optimize database performance and achieve efficient data distribution.

References:

Download and Install ProxySQL - ProxySQL

How to configure ProxySQL for the first time - ProxySQL

Admin Variables - ProxySQL‍

Thank you! Your submission has been received!

Oops! Something went wrong while submitting the form.

Simplifying MySQL Sharding with ProxySQL: A Step-by-Step Guide

Introduction:

What is Sharding?

Understanding the Need for ProxySQL: