Spymemcached issue with 2 nodes configured


I am using spymemcached with the Ketama hash algorithm in my project. I have two memcached servers running for HA (high availability), and my configuration is:

hibernate.cache.use_second_level_cache=true
hibernate.cache.use_query_cache=true
hibernate.cache.region.factory_class=kr.pe.kwonnam.hibernate4memcached.Hibernate4MemcachedRegionFactory
hibernate.cache.default_cache_concurrency_strategy=NONSTRICT_READ_WRITE
hibernate.cache.region_prefix=myProjectCache
hibernate.cache.use_structured_entries=false
h4m.adapter.class=kr.pe.kwonnam.hibernate4memcached.spymemcached.SpyMemcachedAdapter
h4m.timestamper.class=kr.pe.kwonnam.hibernate4memcached.timestamper.HibernateCacheTimestamperMemcachedImpl

h4m.adapter.spymemcached.hosts=host1:11211,host2:11211
h4m.adapter.spymemcached.hashalgorithm=KETAMA_HASH
h4m.adapter.spymemcached.operation.timeout.millis=5000
h4m.adapter.spymemcached.transcoder=kr.pe.kwonnam.hibernate4memcached.spymemcached.KryoTranscoder
h4m.adapter.spymemcached.cachekey.prefix=myProject
h4m.adapter.spymemcached.kryotranscoder.compression.threashold.bytes=20000

# 10 minutes
h4m.expiry.seconds=600
# a day
h4m.expiry.seconds.validatorCache.org.hibernate.cache.spi.UpdateTimestampsCache=86400
# 1 hour
h4m.expiry.seconds.validatorCache.org.hibernate.cache.internal.StandardQueryCache=3600
# 30 minutes
h4m.expiry.seconds.myProjectCache.database1=1800
h4m.expiry.seconds.myProjectCache.database2=1800

The configuration follows the documentation linked below:

SpyMemcachedAdapter

Both nodes host1 and host2 are reachable, up and running.

Issue:

As part of HA testing, when I bring down one memcached node (host1), my application does connect to host2, but only after first trying host1 on every request. That attempt times out because host1 is down, so every request takes far too long. The exception below is thrown for every request:

2017-07-07 17:27:31.915 [SimpleAsyncTaskExecutor-6] ERROR u.c.o.sProcessor - TransId:004579 - Exception occurred while processing request :Timeout waiting for value: waited 5,000 ms. Node status: Connection Status { /host1:11211 active: false, authed: true, last read: 247,290 ms ago /host2:11211 active: true, authed: true, last read: 5 ms ago }
 2017-07-07 17:28:54.666 INFO net.spy.memcached.MemcachedConnection:  Reconnecting due to failure to connect to {QA sa=/host1:11211, #Rops=0, #Wops=214, #iq=0, topRop=null, topWop=Cmd: 5 Opaque: 341143 Key: myProject.myProjectCache.databse1@ Amount: 0 Default: 1499444604639 Exp: 2592000, toWrite=0, interested=0}
    java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
        at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:677)
        at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:436)
        at net.spy.memcached.MemcachedConnection.run(MemcachedConnection.java:1446)
    2017-07-07 17:28:54.666 WARN net.spy.memcached.MemcachedConnection:  Closing, and reopening {QA sa=/host1:11211, #Rops=0, #Wops=214, #iq=0, topRop=null, topWop=Cmd: 5 Opaque: 341143 Key: myProject.myProjectCache.databse1@ Amount: 0 Default: 1499444604639 Exp: 2592000, toWrite=0, interested=0}, attempt 14.
    2017-07-07 17:28:54.841 WARN net.spy.memcached.MemcachedConnection:  Could not redistribute to another node, retrying primary node for myProject.myProjectCache.databse1@-1:my.co.org.myProject.dao.entity.databse1.tablexyz#14744.

I am using memcached for the first time and am not sure whether this is the expected behavior of spymemcached. Am I missing something in my configuration? Or would changing the timeout settings reduce the time taken to process each request?

Any suggestions/help much appreciated.

Answer by chandan.khatri:

If you are using DefaultConnectionFactory (or an out-of-the-box ConnectionFactoryBuilder, which uses the same defaults), the reconnect happens only after the count of failed operations reaches timeoutExceptionThreshold, which is initialized to 998 in version 2.7 of spymemcached. If you create your own ConnectionFactory and set timeoutExceptionThreshold to a lower value, you should see the automatic recovery.
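As a rough illustration of the suggestion above, here is a minimal sketch that builds a ConnectionFactory with a lower threshold through ConnectionFactoryBuilder. The class name LowThresholdClient and the value 10 are made up for the example, and the 5000 ms timeout is simply copied from the question's configuration. How to plug such a factory into the hibernate4memcached adapter is not covered here and would depend on that library.

```java
import net.spy.memcached.AddrUtil;
import net.spy.memcached.ConnectionFactory;
import net.spy.memcached.ConnectionFactoryBuilder;
import net.spy.memcached.MemcachedClient;

import java.io.IOException;

public class LowThresholdClient {

    // Hypothetical value for illustration only; tune it for your workload.
    static final int TIMEOUT_EXCEPTION_THRESHOLD = 10;

    // Builds a factory whose timeout-exception threshold is far below the default.
    static ConnectionFactory buildFactory() {
        return new ConnectionFactoryBuilder()
                .setOpTimeout(5000) // milliseconds, same as the question's config
                .setTimeoutExceptionThreshold(TIMEOUT_EXCEPTION_THRESHOLD)
                .build();
    }

    // Creates a client for the two nodes named in the question.
    static MemcachedClient connect() throws IOException {
        return new MemcachedClient(
                buildFactory(),
                AddrUtil.getAddresses("host1:11211 host2:11211"));
    }
}
```

Depending on the spymemcached version, the effective value reported by the factory may differ slightly from the argument passed to the builder, so it is worth checking getTimeoutExceptionThreshold() on the result rather than assuming the exact number.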

Hope this helps.