{"id":43914,"date":"2024-06-14T08:26:25","date_gmt":"2024-06-14T08:26:25","guid":{"rendered":"https:\/\/accuweb.cloud\/resource\/?post_type=faq&#038;p=43914"},"modified":"2026-02-18T13:04:36","modified_gmt":"2026-02-18T13:04:36","slug":"galera-cluster-limitations-and-recovery","status":"publish","type":"faq","link":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery","title":{"rendered":"Galera Cluster Limitations and Recovery"},"content":{"rendered":"<h2 class=\"ack-h2\">Galera Cluster Limitations and Recovery<\/h2>\n<p><span style=\"font-weight: 400;\">Galera Cluster failures can cause write locks, split-brain situations, or complete database downtime if not handled correctly.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This guide explains common Galera Cluster limitations and shows how to safely recover a MariaDB Galera Cluster after node crashes, network failures, or improper shutdowns using proven recovery steps.<\/span><\/p>\n<div style=\"padding: 15px 30px; border-radius: 15px; border: 1px solid #ccc; box-shadow: 0 0 10px #ccc; margin-bottom: 12px;\">\n<h3 class=\"ack-h3\">Galera Cluster Recovery<\/h3>\n<p><span style=\"font-weight: 400;\">To recover a Galera Cluster after a crash, identify the node with the highest sequence number, bootstrap the cluster from that node, and start the remaining nodes sequentially.<\/span><\/p>\n<\/div>\n<div class=\"main-tooltip-btn\"><a class=\"tooltip-link\" href=\"https:\/\/accuweb.cloud\/resource\/articles\/how-to-set-up-mariadb-galera-cluster-multi-primary-synchronous\" target=\"_blank\" rel=\"noopener\"><button class=\"tooltip-btn\">Learn to Setup Galera Cluster <i class=\"fa-solid fa-arrow-right-long\"><\/i><br \/>\n<\/button><\/a><\/div>\n<h2 class=\"ack-h2\">Common Galera Cluster Limitations<\/h2>\n<ul class=\"ack-ul\">\n<li><a class=\"ack-link-color\" href=\"#primary_key_req\" rel=\"noopener\"><span style=\"text-decoration: underline;\">Primary Key Requirement<\/span><\/a>: 
All tables must contain a Primary Key.<\/li>\n<li><a class=\"ack-link-color\" href=\"#ssr_specifics\" rel=\"noopener\"><span style=\"text-decoration: underline;\">Stop\/Start\/Restart Specifics<\/span><\/a>: Specific procedures must be followed for stopping, starting, or restarting nodes.<\/li>\n<li><a class=\"ack-link-color\" href=\"#node_max_trans\" rel=\"noopener\"><span style=\"text-decoration: underline;\">Node with Maximum Transactions<\/span><\/a>: Special handling is needed for the node with the highest transaction count.<\/li>\n<li><a class=\"ack-link-color\" href=\"#start_cluster_crash\" rel=\"noopener\"><span style=\"text-decoration: underline;\">Starting Cluster after Crash<\/span><\/a>: Procedures for safely starting the cluster after a crash.<\/li>\n<li><a class=\"ack-link-color\" href=\"#single_node_fail\" rel=\"noopener\"><span style=\"text-decoration: underline;\">Single Node Failure<\/span><\/a>: Steps to recover from the failure of a single node.<\/li>\n<li><a class=\"ack-link-color\" href=\"#monitoring_claera_cluster\" rel=\"noopener\"><span style=\"text-decoration: underline;\">Monitoring Galera Cluster<\/span><\/a>: Techniques and tools for monitoring cluster health and performance.<\/li>\n<\/ul>\n<p id=\"primary_key_req\">For a comprehensive list of Galera Cluster limitations, visit the <a class=\"ack-link-color\" href=\"https:\/\/mariadb.com\/kb\/en\/mariadb-galera-cluster-known-limitations\/\" target=\"_blank\" rel=\"noopener\">official website<\/a>. Here are some key limitations relevant to the platform:<\/p>\n<h2 id=\"primary_key_req\" class=\"ack-h2\">Primary Key Requirement in Galera Cluster<\/h2>\n<p><span style=\"font-weight: 400;\">Use the following query to identify tables that violate Galera Cluster requirements:<\/span><\/p>\n<p>All tables must have a Primary Key. 
To identify tables without a Primary Key, run the following query:<\/p>\n<div class=\"article-space\"><\/div>\n<pre><code class=\"language-javascript\">\r\nselect tab.table_schema as database_name,\r\ntab.table_name\r\nfrom information_schema.tables tab\r\nleft join information_schema.table_constraints tco\r\non tab.table_schema = tco.table_schema\r\nand tab.table_name = tco.table_name\r\nand tco.constraint_type = 'PRIMARY KEY'\r\nwhere tco.constraint_type is null\r\nand tab.table_schema not in ('mysql', 'information_schema',\r\n'sys', 'performance_schema')\r\nand tab.table_type = 'BASE TABLE'\r\norder by tab.table_schema,\r\ntab.table_name;\r\n<\/code><\/pre>\n<div class=\"article-extra-space\"><\/div>\n<div class=\"ack-formula\"><strong>TIP<\/strong>: All tables must have a primary key; multi-column primary keys are supported. Tables without a primary key cannot support DELETE operations, and rows in such tables may appear in different orders on different nodes.<\/div>\n<div class=\"article-space\"><\/div>\n<p><strong>MyISAM Tables:<\/strong> An experimental parameter, wsrep_replicate_myisam, has been added to the configuration file to support <a class=\"ack-link-color\" href=\"https:\/\/mariadb.com\/kb\/en\/myisam-storage-engine\/\" target=\"_blank\" rel=\"noopener\">MyISAM<\/a> tables.<\/p>\n<div class=\"article-space\"><\/div>\n<div id=\"ssr_specifics\" class=\"ack-formula\"><strong>TIP<\/strong>: Replication currently works only with the InnoDB storage engine. Writes to tables of other types, including system tables (mysql.*), are not replicated. However, DDL statements that implicitly modify the mysql.* tables, such as CREATE USER, are replicated. 
There is experimental support for <a class=\"ack-link-color\" href=\"https:\/\/mariadb.com\/kb\/en\/myisam-storage-engine\/\" target=\"_blank\" rel=\"noopener\">MyISAM<\/a> via the <a class=\"ack-link-color\" href=\"https:\/\/mariadb.com\/kb\/en\/galera-cluster-system-variables\/#wsrep_replicate_myisam\" target=\"_blank\" rel=\"noopener\">wsrep_replicate_myisam<\/a> system variable.<\/div>\n<div class=\"article-space\"><\/div>\n<h2 class=\"ack-h2\">Galera Cluster Stop, Start, and Restart Behavior<\/h2>\n<p id=\"node_max_trans\">Stopping a cluster requires a sequential shutdown of all its nodes. The final container sets itself in bootstrap mode, initiating the cluster startup from this node.<br \/>\nThe platform streamlines this process, eliminating the need for manual intervention. You can manage the Galera Cluster like any other environment, initiating start\/stop\/restart actions via the dashboard. Special events handle all necessary actions, such as sequentially withdrawing nodes from the cluster, in the background.<br \/>\nWhen restarting a single node in the cluster, standard actions apply.<\/p>\n<h2 class=\"ack-h2\">Identifying the Galera Node with Maximum Transactions<\/h2>\n<p>It&#8217;s crucial to create a backup for the <strong>\/var\/lib\/mysql<\/strong> directory on each cluster node before proceeding with any actions.<\/p>\n<h3 class=\"ack-h3\">How to Find the Correct Node for Cluster Bootstrap<\/h3>\n<p>During cluster recovery, identifying the node with the highest sequence number of the last transaction is essential, as the cluster should start from this node. 
You can retrieve the sequence number of the last transaction from the <strong>seqno<\/strong> value in the <strong>\/var\/lib\/mysql\/grastate.dat<\/strong> file of each node.<\/p>\n<div class=\"article-space\"><\/div>\n<pre><code class=\"language-javascript\">\r\nmysql @ node270011-env-0668554 ~ $ cat \/var\/lib\/mysql\/grastate.dat | grep seqno\r\nseqno: 1133\r\nmysql @ node270016-env-0668554 ~ $ cat \/var\/lib\/mysql\/grastate.dat | grep seqno\r\nseqno: 1134\r\nmysql @ node270017-env-0668554 ~ $ cat \/var\/lib\/mysql\/grastate.dat | grep seqno\r\nseqno: 1134\r\n<\/code><\/pre>\n<div class=\"article-space\"><\/div>\n<p>Typically, selecting a node with the highest parameter value suffices. If multiple nodes share the same highest value, opt for any of them, preferably the master container of the layer.<\/p>\n<p>However, if at least one node registers a value of <strong>-1<\/strong>, it indicates potential inconsistency among nodes (the parameter resets to -1 during service restart on a non-functional cluster). 
In such instances, recover the last committed position by starting <strong>mysqld<\/strong> with the <strong>--wsrep-recover<\/strong> option.<\/p>\n<div class=\"article-space\"><\/div>\n<pre><code class=\"language-javascript\">\r\n<strong>mysqld --wsrep-recover<\/strong><\/code><\/pre>\n<div class=\"article-space\"><\/div>\n<p>Read the <strong>recovered position<\/strong> from the number after the last colon at the end of the line (e.g., 85340 in the provided example).<br \/>\n<a href=\"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/06\/GCL-R01-1.png\"><img fetchpriority=\"high\" decoding=\"async\" class=\"ack-article-image alignnone wp-image-43917 size-full\" title=\"recovered position data\" src=\"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/06\/GCL-R01-1.png\" alt=\"recovered position data\" width=\"1120\" height=\"347\" srcset=\"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/06\/GCL-R01-1.png 1120w, https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/06\/GCL-R01-1-300x93.png 300w, https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/06\/GCL-R01-1-1024x317.png 1024w, https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/06\/GCL-R01-1-768x238.png 768w\" sizes=\"(max-width: 1120px) 100vw, 1120px\" \/><\/a><\/p>\n<pre><code class=\"language-javascript\">\r\n....2020-12-24 10:51:15 0 [Note] WSREP: Recovered position: e94ca741-44f5-11eb-9bc4-b2e17ef1657d:85340\r\n....<\/code><\/pre>\n<div class=\"article-space\"><\/div>\n<p id=\"start_cluster_crash\">Compare the <strong>recovered<\/strong> positions on all nodes. Select the node with the highest value for bootstrap. If several nodes share the highest value, any of them can be chosen. 
Then, set the <strong>safe_to_bootstrap<\/strong> variable to <strong>1<\/strong> in the <strong>grastate.dat<\/strong> file and bootstrap from this node.<\/p>\n<h2 class=\"ack-h2\">How to Start Galera Cluster After a Crash<\/h2>\n<p><span style=\"font-weight: 400;\">Follow these steps carefully to avoid data loss when recovering a crashed Galera Cluster.<\/span><\/p>\n<p>Verify the state of the <strong>MySQL<\/strong> processes on the nodes. Sometimes, these processes may appear as &#8220;<strong>running<\/strong>,&#8221; but they might not respond to normal operations like establishing a connection or stopping them via the init script. If this is the case, you&#8217;ll need to manually terminate these hung processes.<\/p>\n<p>Once you&#8217;ve ensured that there are no lingering MySQL processes, proceed to restart all MySQL containers in your cluster.<\/p>\n<p>Before initiating the cluster startup, check the value of the <strong>safe_to_bootstrap<\/strong> parameter in the <strong>\/var\/lib\/mysql\/grastate.dat<\/strong> file. 
After a crash, this parameter is expected to be 0 on every node.<\/p>\n<p>A value of 1 means that Galera recorded the node as the last one to leave the cluster during a clean shutdown. Before relying on it, confirm that this node also holds the highest sequence number.<\/p>\n<div class=\"article-space\"><\/div>\n<pre><code class=\"language-javascript\">\r\ngrep safe_to_bootstrap \/var\/lib\/mysql\/grastate.dat\r\nsafe_to_bootstrap: 0\r\n<\/code><\/pre>\n<div class=\"article-space\"><\/div>\n<p>On the node with the highest transaction count, set the <strong>safe_to_bootstrap<\/strong> parameter to <strong>1<\/strong>, and then start the MySQL process.<\/p>\n<div class=\"article-space\"><\/div>\n<pre><code class=\"language-javascript\">\r\nsed -i 's\/safe_to_bootstrap: 0\/safe_to_bootstrap: 1\/g' \/var\/lib\/mysql\/grastate.dat\r\ngrep safe_to_bootstrap \/var\/lib\/mysql\/grastate.dat\r\nsafe_to_bootstrap: 1\r\nservice mysql start\r\n<\/code><\/pre>\n<div class=\"article-space\"><\/div>\n<p>Then start the MySQL process on the remaining nodes, one node at a time.<\/p>\n<div class=\"article-space\"><\/div>\n<pre><code class=\"language-javascript\">service mysql start<\/code><\/pre>\n<div class=\"article-extra-space\"><\/div>\n<div class=\"ack-formula\">Keep in mind that identifying the node with the highest transaction number after a cluster crash might be challenging. In such instances, it&#8217;s advisable to set safe_to_bootstrap to 1 on the master node initially.<\/div>\n<div class=\"article-space\"><\/div>\n<p>If MySQL starts successfully on the second node, you can continue with the remaining nodes.<\/p>\n<p>If an error occurs, inspect the <strong>mysqld.log<\/strong> on the second node. 
Look for a message resembling the following:<\/p>\n<div class=\"article-space\"><\/div>\n<pre><code class=\"language-javascript\">\r\n2020-11-19 16:55:20 0 [ERROR] WSREP: gcs\/src\/gcs_group.cpp:group_post_state_exchange():422: Reversing history: 3151891 -&gt; 3150782, this member has applied 1109 more events than the primary component.Data loss is possible. Aborting.\r\n<\/code><\/pre>\n<div class=\"article-space\"><\/div>\n<p id=\"single_node_fail\">If such a record is present, it indicates that your second node has more transactions than the initially selected one (i.e., the first node where you set <strong>safe_to_bootstrap<\/strong> to <strong>1<\/strong>).<\/p>\n<p>In that case, return to the beginning of this section and repeat the process, this time setting <strong>safe_to_bootstrap<\/strong> to <strong>1<\/strong> on the second node and starting the cluster from it.<\/p>\n<h2 class=\"ack-h2\">Recovering from a Single Galera Node Failure<\/h2>\n<p>In the event of a single node failure, often caused by request processing limits, you should:<\/p>\n<p>1. Verify that no MySQL processes are running on the affected node.<br \/>\n2. Set the <strong>safe_to_bootstrap<\/strong> parameter to <strong>0<\/strong> in the <strong>\/var\/lib\/mysql\/grastate.dat<\/strong> file.<br \/>\n3. 
Restart the node using the init script.<\/p>\n<div class=\"article-space\"><\/div>\n<pre><code class=\"language-javascript\">\r\n\/etc\/init.d\/mysql restart\r\n<\/code><\/pre>\n<div class=\"article-extra-space\"><\/div>\n<div id=\"monitoring_claera_cluster\" class=\"ack-formula\" style=\"    margin-bottom: 20px;\"><strong>Note:<\/strong> If the underlying Galera Cluster limitation is not addressed, the error may recur after some time.<\/div>\n<div class=\"ack-formula\"><strong>Note:<\/strong> Repeated node failures often indicate workload or resource constraints and should be investigated.<\/div>\n<div class=\"article-space\"><\/div>\n<h2 class=\"ack-h2\">Monitoring Galera Cluster Health<\/h2>\n<p><span style=\"font-weight: 400;\">Use the following commands to verify Galera Cluster size, state, and node synchronization.<\/span><\/p>\n<div class=\"main-tooltip-btn\"><a class=\"tooltip-link\" href=\"https:\/\/accuweb.cloud\/register\" target=\"_blank\" rel=\"noopener\"><button class=\"tooltip-btn\">Host Galera Cluster in 1 Click <i class=\"fa-solid fa-arrow-right-long\"><\/i><br \/>\n<\/button><\/a><\/div>\n<p>You can monitor the state and various parameters of the Galera Cluster by executing the <strong>SHOW GLOBAL STATUS LIKE<\/strong> command on any node within the cluster. This command allows you to view different aspects of the cluster&#8217;s status. 
For example, the following commands check the cluster size, the cluster status, and the state of the local node:<\/p>\n<div class=\"article-space\"><\/div>\n<pre><code class=\"language-javascript\">\r\nmysql -uuser -ppass -e \"SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';\"\r\n+ -------------------- + ------- +\r\n| Variable_name | Value |\r\n+ -------------------- + ------- +\r\n| wsrep_cluster_size | 3 |\r\n+ -------------------- + ------- +\r\nmysql -uuser -ppass -e \"SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';\"\r\n+ ---------------------- + --------- +\r\n| Variable_name | Value |\r\n+ ---------------------- + --------- +\r\n| wsrep_cluster_status | Primary |\r\n+ ---------------------- + --------- +\r\nmysql -uuser -ppass -e \"SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';\"\r\n+ --------------------------- + -------- +\r\n| Variable_name | Value |\r\n+ --------------------------- + -------- +\r\n| wsrep_local_state_comment | Synced |\r\n+ --------------------------- + -------- +\r\n<\/code><\/pre>\n<div class=\"article-extra-space\"><\/div>\n<div class=\"ack-formula\"><strong>Tip:<\/strong> Refer to the official Galera cluster documentation for more examples and detailed information.<\/div>\n<div class=\"article-space\"><\/div>\n<p>These commands display the cluster size, the cluster status, and the synchronization state of the local node. 
Additionally, if your cluster includes <strong>ProxySQL<\/strong> nodes, you can check the status of the database nodes registered in ProxySQL by executing the following query on any ProxySQL node:<\/p>\n<div class=\"article-space\"><\/div>\n<pre><code class=\"language-javascript\">\r\nmysql -uadmin -padmin -P6032 -h127.0.0.1 -e \"select * from runtime_mysql_servers;\"\r\nWarning: Using a password on the command line interface can be insecure.\r\n+--------------+----------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+\r\n| hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |\r\n+--------------+----------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+\r\n| 2 | node3303 | 3306 | 0 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |\r\n| 3 | node3304 | 3306 | 0 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |\r\n| 3 | node3303 | 3306 | 0 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |\r\n| 2 | node3304 | 3306 | 0 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |\r\n+--------------+----------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+\r\n<\/code><\/pre>\n<div class=\"article-space\"><\/div>\n<p>This command shows the status of each database node as seen by ProxySQL. 
All nodes should be ONLINE for the cluster to operate effectively.<\/p>\n<div class=\"cta-btn-mob-space\"><\/div>\n<h3 class=\"ack-h3\">FAQs<\/h3>\n<p><b>Q) What are the main limitations of a Galera Cluster?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A) Galera Cluster requires all tables to have a primary key, supports only the InnoDB storage engine for replication, and needs strict procedures for stopping, starting, and recovering nodes after failures.<\/span><\/p>\n<p><b>Q) Why is 
a primary key mandatory in Galera Cluster?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A) Galera uses primary keys to ensure row-level consistency across nodes. Tables without a primary key cannot reliably replicate DELETE operations and may cause data inconsistency.<\/span><\/p>\n<p><b>Q) What is split-brain in a Galera Cluster?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A) Split-brain occurs when cluster nodes lose connectivity and form separate primary components, potentially accepting conflicting writes and risking data inconsistency.<\/span><\/p>\n<p><b>Q) How do I recover a Galera Cluster after a crash?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A) To recover a Galera Cluster, identify the node with the highest transaction sequence number, set <\/span><span style=\"font-weight: 400;\">safe_to_bootstrap=1<\/span><span style=\"font-weight: 400;\"> on that node, bootstrap the cluster, and then start the remaining nodes sequentially.<\/span><\/p>\n<p><b>Q) How do I find the correct Galera node to bootstrap?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A) Check the <\/span><span style=\"font-weight: 400;\">seqno<\/span><span style=\"font-weight: 400;\"> value in the <\/span><span style=\"font-weight: 400;\">\/var\/lib\/mysql\/grastate.dat<\/span><span style=\"font-weight: 400;\"> file on each node. The node with the highest sequence number should be used for cluster bootstrap.<\/span><\/p>\n<p><b>Q) What does <\/b><b>safe_to_bootstrap<\/b><b> mean in Galera Cluster?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A) The <\/span><span style=\"font-weight: 400;\">safe_to_bootstrap<\/span><span style=\"font-weight: 400;\"> parameter indicates whether a node is safe to use for initializing the cluster after a crash. 
It must be set to <\/span><span style=\"font-weight: 400;\">1<\/span><span style=\"font-weight: 400;\"> only on the selected bootstrap node.<\/span><\/p>\n<p><b>Q) What should I do if <\/b><b>seqno<\/b><b> is set to <\/b><b>-1<\/b><b>?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A) A <\/span><span style=\"font-weight: 400;\">seqno<\/span><span style=\"font-weight: 400;\"> value of <\/span><span style=\"font-weight: 400;\">-1<\/span><span style=\"font-weight: 400;\"> indicates an inconsistent cluster state. In this case, run <\/span><span style=\"font-weight: 400;\">mysqld --wsrep-recover<\/span><span style=\"font-weight: 400;\"> to determine the last committed transaction before choosing the bootstrap node.<\/span><\/p>\n<p><b>Q) Can a Galera Cluster recover automatically after a node failure?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A) Single node failures may recover automatically, but full cluster crashes or split-brain scenarios usually require manual recovery steps.<\/span><\/p>\n<p><b>Q) How do I recover from a single Galera node failure?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A) Ensure no MySQL processes are running on the failed node, set <\/span><span style=\"font-weight: 400;\">safe_to_bootstrap<\/span><span style=\"font-weight: 400;\"> to <\/span><span style=\"font-weight: 400;\">0<\/span><span style=\"font-weight: 400;\">, and restart the node so it rejoins the cluster.<\/span><\/p>\n<p><b>Q) How can I monitor the health of a Galera Cluster?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A) Cluster health can be monitored by running <\/span><span style=\"font-weight: 400;\">SHOW GLOBAL STATUS<\/span><span style=\"font-weight: 400;\"> queries for status variables such as <\/span><span style=\"font-weight: 400;\">wsrep_cluster_size<\/span><span style=\"font-weight: 400;\">, <\/span><span style=\"font-weight: 400;\">wsrep_cluster_status<\/span><span style=\"font-weight: 400;\">, and <\/span><span style=\"font-weight: 
400;\">wsrep_local_state_comment<\/span><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><b>Q) Is Galera Cluster suitable for production environments?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">A) Yes. When properly configured with primary keys, monitoring, backups, and correct recovery procedures, Galera Cluster is suitable for production-grade MariaDB workloads.<\/span><\/p>\n<h2 class=\"ack-h2\">Conclusion<\/h2>\n<p><span style=\"font-weight: 400;\">Galera Cluster provides high availability for MariaDB, but recovery must be handled carefully after crashes, network failures, or split-brain situations. Understanding cluster limitations and following the correct recovery sequence is essential to avoid data loss and prolonged downtime.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">By identifying the correct node to bootstrap, restarting nodes in the proper order, and continuously monitoring cluster health, you can safely restore a Galera Cluster to a consistent and operational state. 
With the right infrastructure and operational practices, Galera Cluster remains a reliable choice for production database workloads.<\/span><\/p>\n","protected":false},"author":1,"featured_media":52879,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","class_list":["post-43914","faq","type-faq","status-publish","has-post-thumbnail","hentry","faq_topics-databases","faq_topics-galera-cluster-recovery","faq_topics-mmp-high-availability-cluster","faq_topics-kb","faq_topics-mysql-mariadb-percona","faq_topics-product-documentation"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v20.10 (Yoast SEO v24.5) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Galera Cluster Recovery Guide: Fix Crashes &amp; Split Brain<\/title>\n<meta name=\"description\" content=\"Step-by-step guide to recover a MariaDB Galera Cluster after crashes, split-brain issues, or node failures with real commands.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Galera Cluster Limitations and Recovery\" \/>\n<meta property=\"og:description\" content=\"Step-by-step guide to recover a MariaDB Galera Cluster after crashes, split-brain issues, or node failures with real commands.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery\" \/>\n<meta property=\"og:site_name\" content=\"AccuWeb Cloud\" \/>\n<meta property=\"article:modified_time\" content=\"2026-02-18T13:04:36+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/07\/NEW-OG-IMAGE-URL.jpg\" \/>\n\t<meta 
property=\"og:image:width\" content=\"1280\" \/>\n\t<meta property=\"og:image:height\" content=\"720\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"8 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#article\",\"isPartOf\":{\"@id\":\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery\"},\"author\":{\"name\":\"Jilesh Patadiya\",\"@id\":\"https:\/\/accuweb.cloud\/resource\/#\/schema\/person\/a7a4cbe8405202b537509c757b588c58\"},\"headline\":\"Galera Cluster Limitations and Recovery\",\"datePublished\":\"2024-06-14T08:26:25+00:00\",\"dateModified\":\"2026-02-18T13:04:36+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery\"},\"wordCount\":1686,\"publisher\":{\"@id\":\"https:\/\/accuweb.cloud\/resource\/#organization\"},\"image\":{\"@id\":\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#primaryimage\"},\"thumbnailUrl\":\"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/07\/NEW-OG-IMAGE-URL.jpg\",\"inLanguage\":\"en-US\"},{\"@type\":[\"WebPage\",\"FAQPage\"],\"@id\":\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery\",\"url\":\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery\",\"name\":\"Galera Cluster Recovery Guide: Fix Crashes & Split 
Brain\",\"isPartOf\":{\"@id\":\"https:\/\/accuweb.cloud\/resource\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#primaryimage\"},\"image\":{\"@id\":\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#primaryimage\"},\"thumbnailUrl\":\"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/07\/NEW-OG-IMAGE-URL.jpg\",\"datePublished\":\"2024-06-14T08:26:25+00:00\",\"dateModified\":\"2026-02-18T13:04:36+00:00\",\"description\":\"Step-by-step guide to recover a MariaDB Galera Cluster after crashes, split-brain issues, or node failures with real commands.\",\"breadcrumb\":{\"@id\":\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#primaryimage\",\"url\":\"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/07\/NEW-OG-IMAGE-URL.jpg\",\"contentUrl\":\"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/07\/NEW-OG-IMAGE-URL.jpg\",\"width\":1280,\"height\":720},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/accuweb.cloud\/resource\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Galera Cluster Limitations and Recovery\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/accuweb.cloud\/resource\/#website\",\"url\":\"https:\/\/accuweb.cloud\/resource\/\",\"name\":\"AccuWeb Cloud\",\"description\":\"Cutting Edge Cloud 
Computing\",\"publisher\":{\"@id\":\"https:\/\/accuweb.cloud\/resource\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/accuweb.cloud\/resource\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/accuweb.cloud\/resource\/#organization\",\"name\":\"AccuWeb.Cloud\",\"url\":\"https:\/\/accuweb.cloud\/resource\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/accuweb.cloud\/resource\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/04\/accuwebcloud_logo_black_tagline.jpg\",\"contentUrl\":\"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/04\/accuwebcloud_logo_black_tagline.jpg\",\"width\":156,\"height\":87,\"caption\":\"AccuWeb.Cloud\"},\"image\":{\"@id\":\"https:\/\/accuweb.cloud\/resource\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/accuweb.cloud\/resource\/#\/schema\/person\/a7a4cbe8405202b537509c757b588c58\",\"name\":\"Jilesh Patadiya\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/accuweb.cloud\/resource\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/2cea2bdb5bbabb771ee67e96acad7396f25cb1a0c360b9bc4a9ac40cea9cd8b2?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/2cea2bdb5bbabb771ee67e96acad7396f25cb1a0c360b9bc4a9ac40cea9cd8b2?s=96&d=mm&r=g\",\"caption\":\"Jilesh Patadiya\"},\"description\":\"Jilesh Patadiya, the visionary Co-Founder and Chief Technology Officer (CTO) behind AccuWeb.Cloud. Founder &amp; CTO at AccuWebHosting.com. He shares his web hosting insights on the AccuWeb.Cloud blog. 
He mostly writes on the latest web hosting trends, WordPress, storage technologies, and Windows and Linux hosting platforms.\",\"sameAs\":[\"https:\/\/accuweb.cloud\/resource\",\"https:\/\/www.facebook.com\/accuwebhosting\",\"https:\/\/www.instagram.com\/accuwebhosting\/\",\"https:\/\/www.linkedin.com\/company\/accuwebhosting\/\",\"https:\/\/x.com\/accuwebhosting\",\"https:\/\/www.youtube.com\/c\/Accuwebhosting\"],\"url\":\"https:\/\/accuweb.cloud\/resource\/author\/accuwebadmin\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Galera Cluster Recovery Guide: Fix Crashes & Split Brain","description":"Step-by-step guide to recover a MariaDB Galera Cluster after crashes, split-brain issues, or node failures with real commands.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery","og_locale":"en_US","og_type":"article","og_title":"Galera Cluster Limitations and Recovery","og_description":"Step-by-step guide to recover a MariaDB Galera Cluster after crashes, split-brain issues, or node failures with real commands.","og_url":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery","og_site_name":"AccuWeb Cloud","article_modified_time":"2026-02-18T13:04:36+00:00","og_image":[{"width":1280,"height":720,"url":"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/07\/NEW-OG-IMAGE-URL.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#article","isPartOf":{"@id":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery"},"author":{"name":"Jilesh Patadiya","@id":"https:\/\/accuweb.cloud\/resource\/#\/schema\/person\/a7a4cbe8405202b537509c757b588c58"},"headline":"Galera Cluster Limitations and Recovery","datePublished":"2024-06-14T08:26:25+00:00","dateModified":"2026-02-18T13:04:36+00:00","mainEntityOfPage":{"@id":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery"},"wordCount":1686,"publisher":{"@id":"https:\/\/accuweb.cloud\/resource\/#organization"},"image":{"@id":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#primaryimage"},"thumbnailUrl":"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/07\/NEW-OG-IMAGE-URL.jpg","inLanguage":"en-US"},{"@type":["WebPage","FAQPage"],"@id":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery","url":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery","name":"Galera Cluster Recovery Guide: Fix Crashes & Split Brain","isPartOf":{"@id":"https:\/\/accuweb.cloud\/resource\/#website"},"primaryImageOfPage":{"@id":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#primaryimage"},"image":{"@id":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#primaryimage"},"thumbnailUrl":"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/07\/NEW-OG-IMAGE-URL.jpg","datePublished":"2024-06-14T08:26:25+00:00","dateModified":"2026-02-18T13:04:36+00:00","description":"Step-by-step guide to recover a MariaDB Galera Cluster after crashes, split-brain issues, or node failures with real 
commands.","breadcrumb":{"@id":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#primaryimage","url":"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/07\/NEW-OG-IMAGE-URL.jpg","contentUrl":"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/07\/NEW-OG-IMAGE-URL.jpg","width":1280,"height":720},{"@type":"BreadcrumbList","@id":"https:\/\/accuweb.cloud\/resource\/articles\/galera-cluster-limitations-and-recovery#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/accuweb.cloud\/resource\/"},{"@type":"ListItem","position":2,"name":"Galera Cluster Limitations and Recovery"}]},{"@type":"WebSite","@id":"https:\/\/accuweb.cloud\/resource\/#website","url":"https:\/\/accuweb.cloud\/resource\/","name":"AccuWeb Cloud","description":"Cutting Edge Cloud 
Computing","publisher":{"@id":"https:\/\/accuweb.cloud\/resource\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/accuweb.cloud\/resource\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/accuweb.cloud\/resource\/#organization","name":"AccuWeb.Cloud","url":"https:\/\/accuweb.cloud\/resource\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/accuweb.cloud\/resource\/#\/schema\/logo\/image\/","url":"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/04\/accuwebcloud_logo_black_tagline.jpg","contentUrl":"https:\/\/accuweb.cloud\/resource\/wp-content\/uploads\/2024\/04\/accuwebcloud_logo_black_tagline.jpg","width":156,"height":87,"caption":"AccuWeb.Cloud"},"image":{"@id":"https:\/\/accuweb.cloud\/resource\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/accuweb.cloud\/resource\/#\/schema\/person\/a7a4cbe8405202b537509c757b588c58","name":"Jilesh Patadiya","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/accuweb.cloud\/resource\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/2cea2bdb5bbabb771ee67e96acad7396f25cb1a0c360b9bc4a9ac40cea9cd8b2?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/2cea2bdb5bbabb771ee67e96acad7396f25cb1a0c360b9bc4a9ac40cea9cd8b2?s=96&d=mm&r=g","caption":"Jilesh Patadiya"},"description":"Jilesh Patadiya, the visionary Co-Founder and Chief Technology Officer (CTO) behind AccuWeb.Cloud. Founder &amp; CTO at AccuWebHosting.com. He shares his web hosting insights on the AccuWeb.Cloud blog. 
He mostly writes on the latest web hosting trends, WordPress, storage technologies, and Windows and Linux hosting platforms.","sameAs":["https:\/\/accuweb.cloud\/resource","https:\/\/www.facebook.com\/accuwebhosting","https:\/\/www.instagram.com\/accuwebhosting\/","https:\/\/www.linkedin.com\/company\/accuwebhosting\/","https:\/\/x.com\/accuwebhosting","https:\/\/www.youtube.com\/c\/Accuwebhosting"],"url":"https:\/\/accuweb.cloud\/resource\/author\/accuwebadmin"}]}},"_links":{"self":[{"href":"https:\/\/accuweb.cloud\/resource\/wp-json\/wp\/v2\/faq\/43914","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/accuweb.cloud\/resource\/wp-json\/wp\/v2\/faq"}],"about":[{"href":"https:\/\/accuweb.cloud\/resource\/wp-json\/wp\/v2\/types\/faq"}],"author":[{"embeddable":true,"href":"https:\/\/accuweb.cloud\/resource\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/accuweb.cloud\/resource\/wp-json\/wp\/v2\/comments?post=43914"}],"version-history":[{"count":23,"href":"https:\/\/accuweb.cloud\/resource\/wp-json\/wp\/v2\/faq\/43914\/revisions"}],"predecessor-version":[{"id":53132,"href":"https:\/\/accuweb.cloud\/resource\/wp-json\/wp\/v2\/faq\/43914\/revisions\/53132"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/accuweb.cloud\/resource\/wp-json\/wp\/v2\/media\/52879"}],"wp:attachment":[{"href":"https:\/\/accuweb.cloud\/resource\/wp-json\/wp\/v2\/media?parent=43914"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}