I have a long-running Go process on an App Engine flex instance, deployed via a Docker image.

Most of the time, when I deploy to the live version, the app is sent a SIGTERM. I can catch this and do a graceful shutdown. It's great.

Other times, the process just seems to disappear and a new instance is created. I don't get any log output or any other indication of what happened. This definitely seems to happen if I change the number of instances (via manual_scaling), but sometimes it also happens on a normal deploy.

Is there a way to get a SIGTERM consistently? Are there other strategies I can use to know when the instance is being killed/restarted?

Update: I tried a few test cases:

  1. "Delete" instance in App Engine UI. The instance cleanly shut down - sending signals - and rebooted since it's configured to have one instance.
  2. Deploy, changing from 1 to 2 instances. Existing instance rebooted cleanly with signals. New instance came up.
  3. Deploy, changing from 2 to 1 instances. One existing instance rebooted cleanly with signals. The other one went poof, for lack of a better description. Viewing 'All logs' shows STDERR from my app, then nothing. There is no output in the vm.events, vm.syslog, or vm.shutdown logs, which report lots of interesting stuff during a reboot. I also know that signals weren't received by my app because the database is left in a dirty state.

It's this last case that I'd love some more insight into, thanks! Please also let me know if there's a better place or way to ask this question.

1
  • rcarver, was the answer by Soni Sol useful?
    – Javier A
    Sep 17, 2021 at 7:54

1 Answer

On App Engine Flex, instances are restarted once a week to apply critical updates to the runtime.

These restarts always send a SIGTERM and then a SIGKILL, starting 3 seconds before the restart happens.

If your app has long-running processes that the weekly restarts can affect, a good alternative may be to run it on Compute Engine, where instances are more under the user's control and are not restarted weekly.

I just replicated this and can confirm that:

  • Redeploying with the same instance count (1 to 1): SIGTERM and SIGKILL are sent
  • On the weekly restart: SIGTERM and SIGKILL are sent
  • Redeploying with more instances: SIGTERM and SIGKILL are sent
  • Redeploying with fewer instances (2 to 1): I can see SIGTERM and SIGKILL sent on each instance

These entries are written to the log appengine.googleapis.com/vm.shutdown. So if you want to continue using App Engine, you can wait for these signals, which will give you 3 seconds before the actual shutdown happens.