We recently set up a static outgoing IP address for our Google App Engine Java backends according to these instructions: https://cloud.google.com/appengine/docs/standard/outbound-ip-addresses

Since enabling the static outgoing IP, we have started getting UNAVAILABLE disconnects from our gRPC streams after 15 minutes of inactivity (we then wait 5 seconds and re-connect each stream). We want to have these streams always up and listening for messages.

We have a keepalive timer for the gRPC streams set to 15 minutes (.keepAliveTime(15, TimeUnit.MINUTES)). It could be that the stream is closed even earlier, and that the backend only notices when it sends the 15-minute keepalive ping. We added this 15-minute ping quite some time ago to avoid this kind of disconnection, which used to happen after around 40 minutes without it.

I saw that there are some timeout settings for the Cloud NAT gateway that we created to get this static IP address. I changed the TCP established connection timeout from 20 minutes to 1 hour to see if this had any effect (see image), but it didn't.

Are there some other timeout settings having to do with our new static outgoing IP setup that we need to change to avoid these 15-minute disconnects of the gRPC streams?

Cloud NAT gateway timeout settings

Fariya Rahmat On

Because the keepalive time for your gRPC streams is set to 15 minutes, you are seeing the UNAVAILABLE disconnects after 15 minutes of inactivity.

The cause can also be a device along the network path that kills the connection after a period of idleness. It could be a proxy, NAT, or firewall.

gRPC supports keepalive, which is designed for this scenario. After a period of inactivity, gRPC generates activity on the connection by sending a ping, both to check that the connection is still good and to signal to networking devices that the connection is still in use.

You can configure keepalive on the client side or the server side. If the networking device is part of the server deployment, having the server manage the keepalive is best. If the client has a non-routable IP, then the problem may be the NAT. Try configuring keepalive on the client side by setting,

keepAliveTime(150, TimeUnit.SECONDS)
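For context, here is a minimal sketch of where that setting could go when building the channel with grpc-java. The class name, the `host`/`port` parameters, and the 20-second `keepAliveTimeout` are illustrative assumptions, not taken from your setup; only the 150-second keepalive time comes from the suggestion above.

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; the class and method names are placeholders.
public final class KeepAliveChannelFactory {
    private KeepAliveChannelFactory() {}

    // host and port stand in for your real gRPC endpoint.
    public static ManagedChannel create(String host, int port) {
        return ManagedChannelBuilder.forAddress(host, port)
                .useTransportSecurity()
                // Send an HTTP/2 PING after 150 seconds without read activity,
                // well below the 15 minutes at which the disconnects appear.
                .keepAliveTime(150, TimeUnit.SECONDS)
                // Assumed value: treat the connection as dead if the ping
                // is not acknowledged within 20 seconds.
                .keepAliveTimeout(20, TimeUnit.SECONDS)
                .build();
    }
}
```

Your streams are long-lived open calls, so pings are sent while they are idle. `keepAliveWithoutCalls(true)` would only matter if you also wanted pings when no call is open at all.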

To prevent abuse, gRPC servers by default do not permit keepalive pings more frequent than one every 5 minutes. So you will also need to change the server configuration to permit the shorter interval.
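If you control the server and it is also a grpc-java server on the Netty transport, the change might look roughly like the sketch below. This assumes the unshaded `grpc-netty` artifact (with `grpc-netty-shaded` the import would be `io.grpc.netty.shaded.io.grpc.netty.NettyServerBuilder`). The class name, parameters, and the 2-minute value are illustrative assumptions; the permitted interval only needs to be no larger than the client's 150 seconds.

```java
import io.grpc.BindableService;
import io.grpc.Server;
import io.grpc.netty.NettyServerBuilder;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; the class and method names are placeholders.
public final class KeepAliveServerFactory {
    private KeepAliveServerFactory() {}

    // port and service stand in for your real server setup.
    public static Server create(int port, BindableService service) {
        return NettyServerBuilder.forPort(port)
                // Accept client keepalive pings as often as every 2 minutes
                // (assumed value) instead of the default 5-minute minimum,
                // so a 150-second client keepalive is not rejected.
                .permitKeepAliveTime(2, TimeUnit.MINUTES)
                .addService(service)
                .build();
    }
}
```

If the server is a managed Google service rather than your own, you cannot change this setting, and the client keepalive time has to stay within whatever that service permits.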