Why doesn't systemd capture output from subprocesses of a user service written in shell script?

Question

Here is my amazing service, written in shell script:

$ cat ~/junk/demoapp 
#! /bin/bash -eu

while true
do
    echo "in shell"
    ( echo "in subshell" )
    /usr/bin/echo "in subprocess"
    sleep 1
done

It produces some repetitive output:

$ ~/junk/demoapp
in shell
in subshell
in subprocess
in shell
in subshell
in subprocess
in shell
in subshell
in subprocess
in shell
in subshell
in subprocess
^C

Here is a user service configuration for it:

$ cat ~/.config/systemd/user/demoapp.service 
[Unit]
Description=Demo App

[Service]
Type=exec
ExecStart=/home/tomanderson/junk/demoapp

But when i run this service, using systemd 239, the logged output is missing the lines produced by the subshell and the subprocess:

$ systemctl --user daemon-reload

$ systemctl --user start demoapp

$ journalctl --user --unit demoapp

Sep 12 18:53:27 myhost systemd[539847]: Started Demo App.
Sep 12 18:53:27 myhost demoapp[559387]: in shell
Sep 12 18:53:28 myhost demoapp[559387]: in shell
Sep 12 18:53:29 myhost demoapp[559387]: in shell
Sep 12 18:53:30 myhost demoapp[559387]: in shell
Sep 12 18:53:31 myhost demoapp[559387]: in shell
Sep 12 18:53:32 myhost demoapp[559387]: in shell
Sep 12 18:53:33 myhost demoapp[559387]: in shell
Sep 12 18:53:34 myhost demoapp[559387]: in shell
Sep 12 18:53:35 myhost demoapp[559387]: in shell

Any idea why? From reading around, it seems that systemd would normally capture output from subprocesses here. Is this something specific that the shell is doing which is interacting with this?

Googling around, i see people having problems like this with Python, where buffering is the culprit, but i don't see how that can be relevant here.

EDIT: I see exactly the same behaviour after taking shell script out of the equation, using two simple C programs. I do not see this behaviour with a simple parent process taking the place of systemd and collecting output via a pipe. This strongly indicates that it is systemd that is doing something funky. See: https://github.com/tomwhoiscontrary/child-stdout-demo

EDIT 2: An observant colleague who has root reports that (a) the subprocess output is in the journal, it's just not associated with the service, and (b) he only sees this behaviour with a user service; if he sets up a system service with the same code, the subprocess output is associated with it! Surely this is a systemd bug?

I'll test this tomorrow. PS it would be helpful to know your used systemd version — paladin, Sep 12 at 20:52
@paladin good point, i am using systemd 239, and have edited the question to say so — Tom Anderson, Sep 13 at 14:03

u1686_grawity · Accepted Answer · 2023-09-16 11:23:56Z

EDIT 2: An observant colleague who has root reports that (a) the subprocess output is in the journal, it's just not associated with the service, and (b) he only sees this behaviour with a user service; if he sets up a system service with the same code, the subprocess output is associated with it! Surely this is a systemd bug?

It's a known, long-lived issue; the problem is that the kernel does not provide sufficient means to associate a socket client with a cgroup (unlike, for example, the ability to retrieve the client's PID). So whenever journald receives a message it only knows the sender's PID but has to asynchronously look up its unit name from /proc/<pid>/cgroup. If the process is very short-lived (such as a subshell), it could very well be that it will exit before journald is even woken up – and by the time its message is processed, the information required to associate its output with the service is no longer available.

I'm a bit hazy on the details, but from what I remember, recent systemd versions have a partial workaround that only works if the stdout "pipe" to journald (which is really a socketpair) has been set up by a privileged process, whereas your "user" services are set up by another systemd instance that only has the same privileges as yourself.

That sounds like the answer. Do you by any chance have a link to (discussion of) the issue, or something i could search for? I haven't quite managed to accept that they've messed this up so badly. — Tom Anderson, Sep 17 at 13:22

larsks · Accepted Answer · 2023-09-13 02:36:30Z

1

While I haven't investigated the underlying technical details, the same solution that works for Python (disable buffering) also works in this situation. If I use this unit file...

[Unit]
Description=Demo App

[Service]
Type=exec
ExecStart=/usr/bin/unbuffer %h/bin/demoapp

...then the expected output is logged in the journal.

The unbuffer command is part of the expect package.

This works by forcing the command to execute as an interactive process attached to a pty device, which disables the normal buffering.

If you don't have the unbuffer command handy, you can use the script command instead:

ExecStart=/usr/bin/script -c %h/bin/demoapp /dev/null

edited Sep 13 at 2:36

answered Sep 13 at 2:27

larsks

43.8k14 gold badges122 silver badges181 bronze badges

1

Aha, that does indeed work. But what is actually going on here? I have reproduced this loss of output with a couple of trivial C programs, no shell or built-in commands involved. And i can add as many fflush and fsync calls as i like to both of them, and the output still gets lost. This doesn't feel like behaviour i have ever seen before on unix!
– Tom Anderson
Sep 13 at 14:06

Add a comment |

Stack Exchange Network

Why doesn't systemd capture output from subprocesses of a user service written in shell script?

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged
systemd
shell
stdout
systemd-service
.

Hot Network Questions

Why doesn't systemd capture output from subprocesses of a user service written in shell script?

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged systemdshellstdoutsystemd-service.

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
systemd
shell
stdout
systemd-service
.