MongoDB is in a stopped state and fails when we try to start it


We have MongoDB installed on an Amazon Linux instance, and for the past couple of days it has been in a stopped state. When we try to start it, it errors out. The following is what we did to troubleshoot the issue.

service mongod status
mongod is stopped

service mongod start
Starting mongod:                                           [FAILED]

We tried to run a mongod repair, but it kept pointing at the wrong data directory. Our data directory is set up at /mongo/data, but the repair pointed to the default location, as shown below:

2024-01-15T10:41:27.563-0500 I CONTROL  [initandlisten] MongoDB starting : pid=30057 port=27017 dbpath=/data/db 64-bit host=DAT-MGOT-SB
2024-01-15T10:41:27.564-0500 I CONTROL  [initandlisten] db version v3.2.22
2024-01-15T10:41:27.564-0500 I CONTROL  [initandlisten] git version: 105acca0d443f9a47c1a5bd608fd7133840a58dd
2024-01-15T10:41:27.564-0500 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.0-fips 29 Mar 2010
2024-01-15T10:41:27.564-0500 I CONTROL  [initandlisten] allocator: tcmalloc
2024-01-15T10:41:27.564-0500 I CONTROL  [initandlisten] modules: none
2024-01-15T10:41:27.564-0500 I CONTROL  [initandlisten] build environment:
2024-01-15T10:41:27.564-0500 I CONTROL  [initandlisten]     distmod: amazon
2024-01-15T10:41:27.564-0500 I CONTROL  [initandlisten]     distarch: x86_64
2024-01-15T10:41:27.564-0500 I CONTROL  [initandlisten]     target_arch: x86_64
2024-01-15T10:41:27.564-0500 I CONTROL  [initandlisten] options: { repair: true }
2024-01-15T10:41:27.582-0500 I STORAGE  [initandlisten] exception in initAndListen: 29 Data directory /data/db not found., terminating
2024-01-15T10:41:27.582-0500 I CONTROL  [initandlisten] dbexit:  rc: 100

We looked into the log, which gave some hints, and tried to resolve the permissions issue. We changed the permissions of the journal's recovery log, but that did not help either.

[root@DAT-MGOT-SB sam]# tail /mongo/log/mongod.log
2024-01-15T05:12:55.401-0500 I -        [initandlisten] Detected data files in /mongo/data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2024-01-15T05:12:55.401-0500 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=12G,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),verbose=(recovery_progress),
2024-01-15T05:12:55.412-0500 E STORAGE  [initandlisten] WiredTiger (13) [1705313575:412243][28878:0x7f833f240dc0], txn-recover: /mongo/data/journal/WiredTigerLog.0003340817: handle-open: open: Permission denied
2024-01-15T05:12:55.412-0500 E STORAGE  [initandlisten] WiredTiger (0) [1705313575:412265][28878:0x7f833f240dc0], txn-recover: WiredTiger is unable to read the recovery log.
2024-01-15T05:12:55.412-0500 E STORAGE  [initandlisten] WiredTiger (0) [1705313575:412270][28878:0x7f833f240dc0], txn-recover: This may be due to the log files being encrypted, being from an older version or due to corruption on disk
2024-01-15T05:12:55.412-0500 E STORAGE  [initandlisten] WiredTiger (0) [1705313575:412275][28878:0x7f833f240dc0], txn-recover: You should confirm that you have opened the database with the correct options including all encryption and compression options
2024-01-15T05:12:55.412-0500 E STORAGE  [initandlisten] WiredTiger (13) [1705313575:412280][28878:0x7f833f240dc0], txn-recover: Recovery failed: Permission denied
2024-01-15T05:12:55.413-0500 I -        [initandlisten] Assertion: 28595:13: Permission denied
2024-01-15T05:12:55.413-0500 I STORAGE  [initandlisten] exception in initAndListen: 28595 13: Permission denied, terminating
2024-01-15T05:12:55.413-0500 I CONTROL  [initandlisten] dbexit:  rc: 100

Based on the above log, I checked for an existing mongod.pid file, tried removing it and restarting, but that did not work either.

I then checked the log again, but it did not tell me much. The output is listed below:

 mongod(+0x555BD0) [0x955bd0]
 mongod(main+0x15D) [0x9591cd]
 libc.so.6(__libc_start_main+0xF5) [0x7f00d868a585]
 mongod(+0x551579) [0x951579]
-----  END BACKTRACE  -----
2024-01-15T10:39:38.730-0500 I -        [initandlisten] 

***aborting after invariant() failure

I am not sure what we should try next to resolve this issue. Here is the mongod.conf:

cat /etc/mongod.conf
# mongod.conf

# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
#  path: /var/log/mongodb/mongod.log
  path: /mongo/log/mongod.log

# Where and how to store data.
storage:
#  dbPath: /var/lib/mongo
  dbPath: /mongo/data
  journal:
    enabled: true
  directoryPerDB: true
#  engine:
#  mmapv1:
  wiredTiger:
      engineConfig:
         cacheSizeGB: 12
#         statisticsLogDelaySecs: <number>
         journalCompressor: snappy
#         directoryForIndexes: <boolean>
#      collectionConfig:
#         blockCompressor: <string>
#      indexConfig:
#         prefixCompression: <boolean>

# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile

# network interfaces
net:
  port: 27017
#  bindIp: 127.0.0.1  # Listen to local interface only, comment to listen on all interfaces.


security:
   keyFile: /mongo/data/keyfile
#   clusterAuthMode: <string>
   authorization: enabled
#   javascriptEnabled:  <boolean>
#   sasl:
#      hostName: <string>
#      serviceName: <string>
#      saslauthdSocketPath: <string>
#   enableEncryption: <boolean>
#   encryptionCipherMode: <string>
#   encryptionKeyFile: <string>
#   kmip:
#      keyIdentifier: <string>
#      rotateMasterKey: <boolean>
#      serverName: <string>
#      port: <string>
#      clientCertificateFile: <string>
#      clientCertificatePassword: <string>
#      serverCAFile: <string>

#operationProfiling:

#replication:
#   oplogSizeMB: <int>
#   replSetName: <string>
#   replSetName: rs0
#   secondaryIndexPrefetch: <string>
#   enableMajorityReadConcern: <boolean>

#sharding:

## Enterprise-Only Options

#auditLog:

#snmp:

Amazon Linux distribution details:

NAME="Amazon Linux AMI"
VERSION="2018.03"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2018.03"
PRETTY_NAME="Amazon Linux AMI 2018.03"
ANSI_COLOR="0;33"
CPE_NAME="cpe:/o:amazon:linux:2018.03:ga"
HOME_URL="http://aws.amazon.com/amazon-linux-ami/"

Any advice that helps resolve this issue would be highly appreciated.

Wernfried Domscheit On

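First, check which service file is used by running systemctl show mongod -p FragmentPath. If the host has no systemd (Amazon Linux AMI 2018.03 uses SysV-style init scripts, as the service mongod start call suggests), look at /etc/init.d/mongod and /etc/sysconfig/mongod instead.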

It should return something like

FragmentPath=/etc/systemd/system/mongod.service

Then look at this service file; it shows which config file is loaded (e.g. /etc/mongod.conf):

cat /etc/systemd/system/mongod.service
    
[Service]
User=mongod
Group=mongod
Environment="OPTIONS=-f /etc/mongod.conf"
Environment="MONGODB_CONFIG_OVERRIDE_NOFORK=1"
EnvironmentFile=-/etc/sysconfig/mongod
ExecStart=/usr/bin/mongod $OPTIONS
RuntimeDirectory=mongodb
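
To pull the config path out of a service file like the one above without reading it by eye, something along these lines may help. It is only a sketch: the function name mongod_config_path is made up for illustration, and it assumes the path follows -f or --config on an OPTIONS= or ExecStart= line, as in the example unit file.

```shell
# Hypothetical helper (not part of MongoDB): print the config file that a
# mongod unit file passes via -f or --config.
# Assumption: the option sits on an OPTIONS= or ExecStart= line.
mongod_config_path() {
    # $1: path to the service file
    grep -E 'OPTIONS=|ExecStart=' "$1" \
        | sed -nE 's/.*(-f|--config)[ =]+([^ "]+).*/\2/p' \
        | head -n 1
}

# Example call, using the path from the FragmentPath output above:
#   mongod_config_path /etc/systemd/system/mongod.service
```

If the function prints nothing, the service probably starts mongod without a config file, in which case the built-in defaults apply.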

Then check the config file.
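
When checking the config file, it is worth confirming which dbPath is actually active. The repair output in the question shows dbpath=/data/db and options: { repair: true }, which suggests mongod was started without the config file and fell back to its built-in default. A minimal sketch, assuming the YAML layout from the question (the function name active_dbpath is hypothetical):

```shell
# Hypothetical helper: print the uncommented storage.dbPath value from a
# YAML mongod.conf. Assumes one indented "dbPath: <path>" line with no
# spaces in the path; commented-out lines (starting with #) are ignored.
active_dbpath() {
    # $1: path to mongod.conf
    sed -nE 's/^[[:space:]]+dbPath:[[:space:]]*([^[:space:]#]+).*/\1/p' "$1" \
        | head -n 1
}

# Possible use for a repair against the configured directory, run as the
# service user so the data files keep their owner (assumes that user is
# named mongod, as in the service file above):
#   sudo -u mongod mongod --repair --dbpath "$(active_dbpath /etc/mongod.conf)"
```

Running the repair as root instead can leave files in the data directory owned by root, which the mongod service user may then be unable to open.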

Regarding your settings:

  • According to the documentation, net.bindIp defaults to localhost (in older versions it was different). It is better to set the binding explicitly, e.g. net.bindIpAll: true. Note that bindIpAll exists only from MongoDB 3.6 onward; on an older release such as your 3.2.22, set net.bindIp instead.
  • You run a stand-alone MongoDB instance. security.keyFile applies only to a replica set or sharded cluster, so it is useless in your case.
  • storage.wiredTiger.engineConfig.journalCompressor defaults to snappy, so you can skip it.
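
Separately from the settings above, the mongod.log excerpt in the question fails with "Permission denied" while opening a file under /mongo/data/journal. One way to narrow that down is to list files in the data directory that are not owned by the service user. This is a sketch only: the function name files_not_owned_by is made up, and the user name mongod is an assumption taken from the service file shown earlier.

```shell
# Hypothetical helper: list everything under a directory that is NOT owned
# by the given user (name or numeric uid). Any output means a process
# running as that user may be denied access to those files.
files_not_owned_by() {
    # $1: directory to scan, $2: expected owner
    find "$1" ! -user "$2" -print
}

# Example with the paths from the question (run as root):
#   files_not_owned_by /mongo/data mongod
# If files are listed, restoring ownership is one possible fix:
#   chown -R mongod:mongod /mongo/data
```

Ownership is only one possible cause; directory permissions along the path, or SELinux where it is enabled, can produce the same error.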