What is the use case where having a witness node in a PG cluster provides a benefit?

23 Views Asked by Mukesh Tanuku At 09 February 2024 at 04:01

I have a PostgreSQL high-availability cluster with 1 primary and 1 standby in the same location (as set in the repmgr config file). I have now added a witness node in the same location. I am testing a network disconnection between the primary and the standby, while the network remains available between the primary and the witness and between the standby and the witness. In this test case the standby gets promoted even though the witness can still see the primary. Why does the standby win the election and promote itself as the new primary?
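To make the test case explicit, here is a minimal sketch of the partition as a set of links that are still up. It is only a model of the scenario described above, not anything repmgr runs; the host names are the ones from the cluster listings in this post, and `links_up` and `can_reach` are hypothetical names introduced for illustration.

```python
# Links that remain up during the test; each link is an unordered pair of hosts.
links_up = {
    frozenset({"host001", "host004"}),  # primary <-> witness
    frozenset({"host002", "host004"}),  # standby <-> witness
}


def can_reach(a, b):
    """Return True if the direct link between hosts a and b is still up."""
    return frozenset({a, b}) in links_up


primary = "host001"
# Which of the other two nodes can still reach the primary directly?
visibility = {node: can_reach(node, primary) for node in ("host002", "host004")}
print(visibility)  # {'host002': False, 'host004': True}
```

So the standby has lost the primary, but the witness has not, and the standby can still ask the witness about it.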

Cluster info before network disconnection:

 ID | Name              | Role    | Status    | Upstream          | Location | Priority | Timeline | Connection string
----+-------------------+---------+-----------+-------------------+----------+----------+----------+--------------------------------------------------------------------
 1  | host001           | primary | * running |                   | dc1      | 100      | 16       | host=host001 user=repmgr dbname=repmgr connect_timeout=2
 2  | host002           | standby |   running | host001           | dc1      | 100      | 16       | host=host002 user=repmgr dbname=repmgr connect_timeout=2
 4  | host004           | witness | * running | host001           | dc1      | 0        | n/a      | host=host004 user=repmgr dbname=repmgr connect_timeout=2

Cluster info after network disconnection:

 ID | Name              | Role    | Status        | Upstream            | Location | Priority | Timeline | Connection string                               
----+-------------------+---------+---------------+---------------------+----------+----------+----------+--------------------------------------------------------------------
 1  | host001           | primary | * running     |                     | dc1      | 100      | 16       | host=host001 user=repmgr dbname=repmgr connect_timeout=2
 2  | host002           | standby | ? unreachable | ? host001           | dc1      | 100      |          | host=host002 user=repmgr dbname=repmgr connect_timeout=2
 4  | host004           | witness | * running     | host001             | dc1      | 0        | n/a      | host=host004 user=repmgr dbname=repmgr connect_timeout=2

WARNING: following issues were detected
  - unable to connect to node "host002" (ID: 2)
  - node "host002" (ID: 2) is registered as an active standby but is unreachable

Standby repmgr log file:

[2024-02-08 22:22:38] [INFO] 1 active sibling nodes registered
[2024-02-08 22:22:38] [INFO] 3 total nodes registered
[2024-02-08 22:22:38] [INFO] primary node  "host001" (ID: 1) and this node have the same location ("dc1")
[2024-02-08 22:22:38] [INFO] local nodes last receive lsn: 0/83009110
[2024-02-08 22:22:38] [INFO] checking state of sibling node "host004" (ID: 4)
[2024-02-08 22:22:38] [INFO] node "host004" (ID: 4) reports its upstream is node 1, last seen 0 second(s) ago
[2024-02-08 22:22:38] [NOTICE] witness node "host004" (ID: 4) last saw primary node 0 second(s) ago, considering primary still visible
[2024-02-08 22:22:38] [INFO] 1 nodes can see the primary
[2024-02-08 22:22:38] [DETAIL] following nodes can see the primary:
 - node "host004" (ID: 4): 0 second(s) ago

[2024-02-08 22:22:38] [INFO] visible nodes: 2; total nodes: 2; no nodes have seen the primary within the last 4 seconds
[2024-02-08 22:22:38] [NOTICE] promotion candidate is "host002" (ID: 2)
[2024-02-08 22:22:38] [NOTICE] this node is the winner, will now promote itself and inform other nodes
[2024-02-08 22:22:38] [INFO] promote_command is:
  "repmgr standby promote -f /u01/app/admin/Data/repmgr.conf --log-to-file --siblings-follow"
[2024-02-08 22:22:38] [NOTICE] redirecting logging output to "/u01/app/admin/Data/PG_LOGS/repmgr.log"

[2024-02-08 22:22:40] [NOTICE] promoting standby to primary
[2024-02-08 22:22:40] [DETAIL] promoting server "host002" (ID: 2) using pg_promote()
[2024-02-08 22:22:40] [NOTICE] waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
[2024-02-08 22:22:41] [NOTICE] STANDBY PROMOTE successful
[2024-02-08 22:22:41] [DETAIL] server "host002" (ID: 2) was successfully promoted to primary
[2024-02-08 22:22:41] [NOTICE] executing STANDBY FOLLOW on 1 of 1 siblings
INFO:  node 4 received notification to follow node 2
[2024-02-08 22:22:42] [INFO] STANDBY FOLLOW successfully executed on all reachable sibling nodes
[2024-02-08 22:22:42] [INFO] checking state of node 2, 1 of 6 attempts
[2024-02-08 22:22:42] [NOTICE] node 2 has recovered, reconnecting
[2024-02-08 22:22:42] [INFO] connection to node 2 succeeded
[2024-02-08 22:22:42] [INFO] original connection is still available
[2024-02-08 22:22:42] [INFO] 1 followers to notify
[2024-02-08 22:22:42] [NOTICE] notifying node "host004" (ID: 4) to follow node 2
INFO:  node 4 received notification to follow node 2
[2024-02-08 22:22:42] [INFO] switching to primary monitoring mode
[2024-02-08 22:22:42] [NOTICE] monitoring cluster primary "host002" (ID: 2)
[2024-02-08 22:22:42] [INFO] child node "host004" (ID: 4) is not yet attached
[2024-02-08 22:27:43] [INFO] monitoring primary node "host002" (ID: 2) in normal state
[2024-02-08 22:32:44] [INFO] monitoring primary node "host002" (ID: 2) in normal state
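The part of the log that confuses me can be pulled out with a small script. This is a rough sketch that assumes the `[timestamp] [LEVEL] message` line format shown above; `summarize_election` is a hypothetical helper written for this post, not part of repmgr, and the sample holds three lines copied from the standby log.

```python
import re

# Matches log lines of the form "[timestamp] [LEVEL] message".
LOG_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] (?P<msg>.*)$")


def summarize_election(log_text):
    """Extract how many nodes saw the primary, the candidate, and whether promotion happened."""
    summary = {"nodes_seeing_primary": None, "candidate": None, "promoted": False}
    for line in log_text.splitlines():
        m = LOG_LINE.match(line.strip())
        if not m:
            continue  # blank lines and continuation lines have no level tag
        msg = m.group("msg")
        seen = re.match(r"(\d+) nodes can see the primary", msg)
        if seen:
            summary["nodes_seeing_primary"] = int(seen.group(1))
        cand = re.match(r'promotion candidate is "([^"]+)"', msg)
        if cand:
            summary["candidate"] = cand.group(1)
        if msg.startswith("STANDBY PROMOTE successful"):
            summary["promoted"] = True
    return summary


sample = """\
[2024-02-08 22:22:38] [INFO] 1 nodes can see the primary
[2024-02-08 22:22:38] [NOTICE] promotion candidate is "host002" (ID: 2)
[2024-02-08 22:22:41] [NOTICE] STANDBY PROMOTE successful
"""

result = summarize_election(sample)
print(result)
```

On these lines the summary shows one node still seeing the primary and a successful promotion of host002 in the same election, which is exactly the behaviour I am asking about.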
