Why does my postgresql+repmgr+timescaledb container stop right after start up?

I was trying to build a customized postgresql+repmgr+timescaledb image with Docker.

Here is my dockerfile:

FROM bitnami/postgresql-repmgr:12.4.0-debian-10-r90
USER root
RUN apt-get update \
 && apt-get -y install \
    gcc cmake git clang-format clang-tidy openssl libssl-dev \
 && git clone https://github.com/timescale/timescaledb.git
RUN cd timescaledb \
 && git checkout 2.8.1 \
 && ./bootstrap -DREGRESS_CHECKS=OFF -DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
 && cd build \
 && make \
 && make install
RUN echo 'en_US.UTF-8 UTF-8' >> /etc/locale.gen && locale-gen
USER 1001

build command:

docker build -f dockerfile -t my/pg-repmgr-12-tsdb:12.4.0-debian-10-r90 .

When I tested it, the primary node ran perfectly, but when I tried to bring up a standby node, the instance stopped almost immediately after starting, leaving these logs:

postgresql-repmgr 18:51:11.00 
postgresql-repmgr 18:51:11.00 Welcome to the Bitnami postgresql-repmgr container
postgresql-repmgr 18:51:11.00 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-postgresql-repmgr
postgresql-repmgr 18:51:11.00 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-postgresql-repmgr/issues
postgresql-repmgr 18:51:11.01 
postgresql-repmgr 18:51:11.03 INFO  ==> ** Starting PostgreSQL with Replication Manager setup **
postgresql-repmgr 18:51:11.05 INFO  ==> Validating settings in REPMGR_* env vars...
postgresql-repmgr 18:51:11.06 INFO  ==> Validating settings in POSTGRESQL_* env vars..
postgresql-repmgr 18:51:11.06 INFO  ==> Querying all partner nodes for common upstream node...
postgresql-repmgr 18:51:11.13 INFO  ==> Auto-detected primary node: 'pg-0:5432'
postgresql-repmgr 18:51:11.14 INFO  ==> Preparing PostgreSQL configuration...
postgresql-repmgr 18:51:11.14 INFO  ==> postgresql.conf file not detected. Generating it...
postgresql-repmgr 18:51:11.26 INFO  ==> Preparing repmgr configuration...
postgresql-repmgr 18:51:11.27 INFO  ==> Initializing Repmgr...
postgresql-repmgr 18:51:11.28 INFO  ==> Waiting for primary node...
postgresql-repmgr 18:51:11.30 INFO  ==> Cloning data from primary node...
postgresql-repmgr 18:51:12.11 INFO  ==> Initializing PostgreSQL database...
postgresql-repmgr 18:51:12.11 INFO  ==> Cleaning stale /bitnami/postgresql/data/standby.signal file
postgresql-repmgr 18:51:12.12 INFO  ==> Custom configuration /opt/bitnami/postgresql/conf/postgresql.conf detected
postgresql-repmgr 18:51:12.13 INFO  ==> Custom configuration /opt/bitnami/postgresql/conf/pg_hba.conf detected
postgresql-repmgr 18:51:12.16 INFO  ==> Deploying PostgreSQL with persisted data...
postgresql-repmgr 18:51:12.19 INFO  ==> Configuring replication parameters
postgresql-repmgr 18:51:12.23 INFO  ==> Configuring fsync
postgresql-repmgr 18:51:12.25 INFO  ==> Setting up streaming replication slave...
postgresql-repmgr 18:51:12.28 INFO  ==> Starting PostgreSQL in background...
postgresql-repmgr 18:51:12.52 INFO  ==> Unregistering standby node...
postgresql-repmgr 18:51:12.59 INFO  ==> Registering Standby node...
postgresql-repmgr 18:51:12.64 INFO  ==> Running standby follow...
postgresql-repmgr 18:51:12.71 INFO  ==> Stopping PostgreSQL...
waiting for server to shut down.... done
server stopped

whereas the logs of a normal standby startup continue through several restarts. The logs were confusing because no error was reported.

Thanks to the first comment, I found that the PostgreSQL logs (which I accessed later through a volume) said:

2022-10-17 12:37:51.070 GMT [171] LOG:  pgaudit extension initialized
2022-10-17 12:37:51.070 GMT [171] LOG:  starting PostgreSQL 12.4 on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
2022-10-17 12:37:51.072 GMT [171] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2022-10-17 12:37:51.072 GMT [171] LOG:  listening on IPv6 address "::", port 5432
2022-10-17 12:37:51.074 GMT [171] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-10-17 12:37:51.106 GMT [171] LOG:  redirecting log output to logging collector process
2022-10-17 12:37:51.106 GMT [171] HINT:  Future log output will appear in directory "/opt/bitnami/postgresql/logs".
2022-10-17 12:37:51.119 GMT [173] LOG:  database system was interrupted; last known up at 2022-10-17 12:37:49 GMT
2022-10-17 12:37:51.242 GMT [173] LOG:  entering standby mode
2022-10-17 12:37:51.252 GMT [173] LOG:  redo starts at 0/E000028
2022-10-17 12:37:51.266 GMT [173] LOG:  consistent recovery state reached at 0/E000100
2022-10-17 12:37:51.266 GMT [171] LOG:  database system is ready to accept read only connections
2022-10-17 12:37:51.274 GMT [177] LOG:  started streaming WAL from primary at 0/F000000 on timeline 1
2022-10-17 12:37:51.579 GMT [171] LOG:  received fast shutdown request
2022-10-17 12:37:51.580 GMT [171] LOG:  aborting any active transactions
2022-10-17 12:37:51.580 GMT [177] FATAL:  terminating walreceiver process due to administrator command
2022-10-17 12:37:51.581 GMT [174] LOG:  shutting down
2022-10-17 12:37:51.601 GMT [171] LOG:  database system is shut down
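Because nothing in this log is marked ERROR, the telling line is easy to miss: "received fast shutdown request" means the server was asked to stop rather than crashing on its own, which lines up with the "Stopping PostgreSQL..." step in the container log. A minimal sketch for pulling such lines out of a longer log follows; `pg_log_highlights` is a hypothetical helper name, not part of PostgreSQL or the Bitnami image, and the pattern only assumes the default log line format shown above.

```shell
#!/bin/sh
# pg_log_highlights (hypothetical helper): print only lines whose severity
# is above LOG, plus any line that mentions a shutdown request.
# Reads the files given as arguments, or stdin when none are given.
pg_log_highlights() {
  grep -E ' (WARNING|ERROR|FATAL|PANIC): |shutdown request' "$@"
}

# Possible usage, assuming the logs were copied out of the stopped container
# (the directory comes from the HINT line in the log; file names will vary):
#   docker cp pg-1:/opt/bitnami/postgresql/logs ./pg-1-logs
#   pg_log_highlights ./pg-1-logs/*
```

Run against the log above, this should leave only the fast shutdown request and the walreceiver FATAL line. That narrows the question to what asked the server to stop, but it does not explain why the container exited afterwards.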

Can someone please tell me what I did wrong? Much appreciated!

Additional information on reproducing:

the command used for the primary node instance:

docker run --detach \
  --name pg-0 \
  --network my-network \
  --env REPMGR_PARTNER_NODES=pg-0,pg-1 \
  --env REPMGR_NODE_NAME=pg-0 \
  --env REPMGR_NODE_NETWORK_NAME=pg-0 \
  --env REPMGR_PRIMARY_HOST=pg-0 \
  --env REPMGR_PASSWORD=repmgrpass \
  --env POSTGRESQL_POSTGRES_PASSWORD=adminpassword \
  --env POSTGRESQL_USERNAME=customuser \
  --env POSTGRESQL_PASSWORD=custompassword \
  --env POSTGRESQL_DATABASE=customdatabase \
  --env POSTGRESQL_SHARED_PRELOAD_LIBRARIES=repmgr,pgaudit,timescaledb \
  -p 5420:5432 \
  -v /etc/localtime:/etc/localtime:ro \
  my/pg-repmgr-12-tsdb:12.4.0-debian-10-r90

the command used for the standby node instance:

docker run \
  --name pg-1 \
  --network my-network \
  --env REPMGR_PARTNER_NODES=pg-0,pg-1 \
  --env REPMGR_NODE_NAME=pg-1 \
  --env REPMGR_NODE_NETWORK_NAME=pg-1 \
  --env REPMGR_PRIMARY_HOST=pg-0 \
  --env REPMGR_PASSWORD=repmgrpass \
  --env POSTGRESQL_POSTGRES_PASSWORD=adminpassword \
  --env POSTGRESQL_USERNAME=customuser \
  --env POSTGRESQL_PASSWORD=custompassword \
  --env POSTGRESQL_DATABASE=customdatabase \
  --env POSTGRESQL_SHARED_PRELOAD_LIBRARIES=repmgr,pgaudit,timescaledb \
  -v /etc/localtime:/etc/localtime:ro \
  -p 5421:5432 \
  my/pg-repmgr-12-tsdb:12.4.0-debian-10-r90

Answer by jyoudan:

Sometimes the best way through is simply to find another way. I changed the base image from 12.4.0 to 13.6.0 and the timescaledb checkout from 2.8.1 to 2.8.0, so the dockerfile became:

FROM bitnami/postgresql-repmgr:13.6.0-debian-10-r90
USER root
RUN apt-get update \
 && apt-get -y install \
    gcc cmake git clang-format clang-tidy openssl libssl-dev \
 && git clone https://github.com/timescale/timescaledb.git
RUN cd timescaledb \
 && git checkout 2.8.0 \
 && ./bootstrap -DREGRESS_CHECKS=OFF -DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
 && cd build \
 && make \
 && make install
RUN echo 'en_US.UTF-8 UTF-8' >> /etc/locale.gen && locale-gen
USER 1001

and the problem was solved.

I have no idea whether the base image version or the timescaledb version did the magic, but either way my problem is solved. I hope anyone who runs into the same issue later can benefit from my struggle. >3<
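Since two things changed at once between the question's dockerfile and this one (base image 12.4.0 to 13.6.0, timescaledb 2.8.1 to 2.8.0), one way to find out which change mattered would be to build the two untested combinations and repeat the standby test with each. The sketch below is only that, a sketch: `render_dockerfile` is a hypothetical helper that prints the same dockerfile with the two versions as parameters, and the image tags in the usage comments are made-up names. I have not run these builds, so I cannot say which combination works.

```shell
#!/bin/sh
# render_dockerfile (hypothetical helper): print the dockerfile from this
# thread with the base image tag ($1) and timescaledb git tag ($2) filled in.
# Inside the unquoted heredoc, "\\" is emitted as a single "\".
render_dockerfile() {
  base_tag=$1
  tsdb_tag=$2
  cat <<EOF
FROM bitnami/postgresql-repmgr:${base_tag}
USER root
RUN apt-get update \\
 && apt-get -y install \\
    gcc cmake git clang-format clang-tidy openssl libssl-dev \\
 && git clone https://github.com/timescale/timescaledb.git
RUN cd timescaledb \\
 && git checkout ${tsdb_tag} \\
 && ./bootstrap -DREGRESS_CHECKS=OFF -DCMAKE_EXPORT_COMPILE_COMMANDS=ON \\
 && cd build \\
 && make \\
 && make install
RUN echo 'en_US.UTF-8 UTF-8' >> /etc/locale.gen && locale-gen
USER 1001
EOF
}

# The two combinations not covered in this thread (image names are made up;
# "docker build -" reads the dockerfile from stdin with no build context):
#   render_dockerfile 12.4.0-debian-10-r90 2.8.0 | docker build -t my/pg-repmgr-tsdb:pg12.4.0-ts2.8.0 -
#   render_dockerfile 13.6.0-debian-10-r90 2.8.1 | docker build -t my/pg-repmgr-tsdb:pg13.6.0-ts2.8.1 -
```

If the standby survives with one of these images and not the other, that would point at the component that changed between them.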