LibreOffice only works in the terminal: the weirdest GNU/Linux issue I've encountered to date
You’re right Richard, this is your fault too.
First, some context. At work, we have Linux workstations with NFS home folders. This is awesome, since you can move basically seamlessly between computers, but can also bring with it many interesting bugs. This is one of them.
When I tried to open LibreOffice today by double-clicking on a file, it didn’t
work. So I ran LibreOffice from my terminal to see its output and, lo and
behold, it worked just fine. And the behaviour was quite reliable: started from
a terminal, it worked; started any other way (from dmenu, using i3-msg or from
a GUI), it didn’t. All terminals worked, all shells worked.
So I called in my colleagues and we started investigating.
LEdoian (who also helped with most of the investigation) suggested looking into
the .xsession-errors file, and the error found there was not very helpful:
grep: write error: Permission denied
Error: The debug options --record, --backtrace, --strace, and --valgrind cannot be used together.
Please, use them one by one.
At first I thought that some calling script along the way had somehow added
these parameters in response to something in the environment, so I started to
dig there. Eventually, I found the /usr/bin/soffice file, which is responsible
for launching LibreOffice.
The error is generated by the following snippet of code:
if echo "$checks" | grep -q "cc" ; then
    echo "Error: The debug options --record, --backtrace, --strace, and --valgrind cannot be used together."
    echo " Please, use them one by one."
    exit 1;
fi
Let’s see where that $checks variable comes from:
# count number of selected checks; only one is allowed
checks=
EXTRAOPT=
# force the --valgrind option if the VALGRIND variable is set
test -n "$VALGRIND" && EXTRAOPT="--valgrind"
# force the --record option if the RR variable is set
test -n "$RR" && EXTRAOPT="--record"
for arg in "$@" $EXTRAOPT ; do
    case "$arg" in
        --record)
            if which rr >/dev/null 2>&1 ; then
                # smoketest may already be recorded => ignore nested
                RRCHECK="rr record --nested=ignore"
                checks="c$checks"
            else
                echo "Error: Can't find the tool \"rr\", --record option will be ignored."
                exit 1
            fi
            ;;
        --backtrace)
            if which gdb >/dev/null 2>&1 ; then
                GDBTRACECHECK="gdb -nx --command=$sd_prog/gdbtrace --args"
                checks="c$checks"
            else
                echo "Error: Can't find the tool \"gdb\", --backtrace option will be ignored."
                exit 1
            fi
            ;;
        # (other options redacted)
    esac
done
The variable is used, as the error message would suggest, to ensure that two or more conflicting options are not set at the same time: every recognised debug option prepends one c, so a value containing cc means at least two were selected. You can also see that some of the options can be set using environment variables, but there was quite an easy way to check this: I added the following snippet just before the multiple option check:
echo "--------------- Cut here ---------------"
echo "$checks"
echo "$@"
echo "--------------- Stop cutting here ------"
When I launched localc again, I got the expected output:
--------------- Cut here ---------------
--calc
--------------- Stop cutting here ------
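The output contains no c at all, which is what the counting loop should produce when no debug option is passed. As a minimal sketch of that counting idea (count_checks is a hypothetical helper, and unlike the real soffice script it does not check whether rr, gdb and friends are installed):

```shell
# Simplified illustration of the counting logic in soffice.
# count_checks is a hypothetical name; the real script also verifies that the
# debugging tool for each option exists before counting it.
count_checks() {
    checks=
    for arg in "$@"; do
        case "$arg" in
            --record|--backtrace|--strace|--valgrind)
                # one "c" per selected debug option
                checks="c$checks"
                ;;
        esac
    done
    printf '%s\n' "$checks"
}

count_checks --calc                 # prints an empty line
count_checks --calc --record        # prints: c
count_checks --record --backtrace   # prints: cc
```

Only the last call yields a string that a search for cc would match, so with --calc alone the conflict error should never fire.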
Only at this point did I notice the first line of the original error message, which I managed to overlook:
grep: write error: Permission denied
So I added another set of lines to the debug prints:
echo "--------------- Cut here ---------------"
echo "$checks"
echo "$@"
echo "$checks" | grep "cc"
echo $?
echo "--------------- Stop cutting here ------"
This, again, produced the expected output:
--------------- Cut here ---------------
--calc
grep: write error: Permission denied
2
--------------- Stop cutting here ------
This made me question everything I knew about UNIX. I thought that in a POSIX
shell the if statement evaluates the first branch when the command returns 0
and the second one with all other return codes, but clearly, grep returned 2
and sh still went into the first branch. HUH?
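For reference, the rule being doubted here is easy to confirm in isolation. A small sketch (branch_for_status is a hypothetical helper name) that feeds a chosen exit status to if:

```shell
# Report which branch a POSIX shell `if` takes for a given exit status.
# branch_for_status is a hypothetical helper used only for this illustration.
branch_for_status() {
    # (exit N) runs a subshell that terminates with status N
    if (exit "$1"); then
        echo "then"
    else
        echo "else"
    fi
}

branch_for_status 0   # prints: then
branch_for_status 1   # prints: else
branch_for_status 2   # prints: else
```

So a command that really exits with status 2 cannot send the shell into the first branch; something else had to be going on.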
Undeterred, we started investigating why grep gets that permission denied
error in the first place. After adding ls -la /proc/self/fd to see what stdin,
stdout and stderr are connected to, we got the following output:
total 0
dr-x------ 2 jan users 0 Oct 31 18:18 .
dr-xr-xr-x 9 jan users 0 Oct 31 18:18 ..
lr-x------ 1 jan users 64 Oct 31 18:18 0 -> pipe:[45923638]
l-wx------ 1 jan users 64 Oct 31 18:18 1 -> /nfs/home/jan/.xsession-errors
l-wx------ 1 jan users 64 Oct 31 18:18 2 -> /nfs/home/jan/.xsession-errors
lr-x------ 1 jan users 64 Oct 31 18:18 3 -> /proc/23830/fd
ls: write error: Permission denied
First of all, stdout and stderr point to .xsession-errors, so that’s probably
the culprit. Sure enough, running localc >> .xsession-errors in a terminal
behaves as if it were executed from dmenu: it reports the aforementioned error
and exits. Second of all, ls also seems to be affected by the issue but,
weirdly, that doesn’t impede its functionality in any way.
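A narrower check than listing the whole fd directory is to resolve only the three standard descriptors. This is a minimal sketch, assuming Linux with /proc mounted; show_std_fds is a hypothetical helper, not something we used during the investigation:

```shell
# Print where stdin, stdout and stderr of the current shell point (Linux only).
# show_std_fds is a hypothetical helper; it relies on /proc being mounted.
show_std_fds() {
    for fd in 0 1 2; do
        # $$ is the PID of the main shell, so a redirection applied to this
        # function call is visible even from inside the command substitution.
        # (This does not hold when the function runs as part of a pipeline.)
        printf '%s -> %s\n' "$fd" "$(readlink "/proc/$$/fd/$fd")"
    done
}

show_std_fds
```

Calling it with a redirection, for example show_std_fds > /tmp/fds.txt, should make the line for descriptor 1 point at that file, which mirrors what the listing above shows for .xsession-errors.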
So we turned to strace, tracing id (as it is the simplest command we could
think of in a hurry that also exhibited the error) and we saw the first properly
cursed thing of the investigation:
write(1, "uid=3262(jan) gid=1000(users) gr"...,) = 162
close(1) = -1 EACCES (Permission denied)
write(2, "id: ", 4id: ) = 4
write(2, "write error", 11write error) = 11
write(2, ": Permission denied", 19: Permission denied) = 19
write(2, "\n", 1
) = 1
exit_group(1) = ?
+++ exited with 1 +++
The kernel replied with EACCES to a close syscall, which is, unsurprisingly,
not expected behaviour. The system call manual page specifies EBADF, EINTR,
EIO, ENOSPC and EDQUOT. POSIX only allows the first three. We figured this was
probably caused by a main server crash we experienced earlier this week: the
server hosts NFS, and the workstation, including my session, has been running
since before the crash, so something about NFS probably didn’t close properly.
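The broken NFS state itself is hard to reproduce, but the general mechanism (a tool discovering only at exit that its stdout is unusable) can be observed on a healthy Linux machine with /dev/full, which fails every write with ENOSPC. The sketch below assumes GNU coreutils; status_with_full_stdout is a hypothetical helper. Note the difference from our case: here the error comes from the final write, not from close.

```shell
# Run a command with stdout connected to /dev/full and print its exit status.
# status_with_full_stdout is a hypothetical helper name; /dev/full is a Linux
# device that rejects every write with ENOSPC.
status_with_full_stdout() {
    "$@" >/dev/full 2>/dev/null
    echo $?
}

status_with_full_stdout id     # non-zero: flushing the buffered output fails
status_with_full_stdout true   # 0: nothing is written, so nothing can fail
```

Without the 2>/dev/null, id should complain with a "write error" message much like the one in the trace above, just with a different errno text.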
But still, why in the name of sanity would the if statement from earlier act in
such a peculiar way? Unless grep for some reason returned 0. But we checked
that, right? Well, not quite. The grep in the original if statement is called
with the -q option, while the one in our debug print was not.
The man page states:
-q, --quiet, --silent
Quiet; do not write anything to standard output. Exit immediately with zero status if any match is found, even if
an error was detected. Also see the -s or --no-messages option.
At first we were a bit confused by the wording of the help text, thinking it
says (match or error) ⇒ exit 0, but it most likely means the more reasonable
(match and error) ⇒ exit 0 in addition to match ⇒ exit 0. This is confirmed by
subjecting grep to any other error:
$ grep -q . /aaa; echo $?
grep: /aaa: No such file or directory
2
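The four combinations of match and error can be checked side by side. This is a sketch assuming GNU grep; grep_q_status is a hypothetical helper and /nonexistent/file is a path assumed not to exist:

```shell
# Print the exit status of `grep -q` for a given stdin text and argument list.
# grep_q_status is a hypothetical helper; error messages are discarded.
grep_q_status() {
    input=$1
    shift
    printf '%s\n' "$input" | grep -q "$@" 2>/dev/null
    echo $?
}

grep_q_status "cc" cc                       # match, no error          -> 0
grep_q_status "c"  cc                       # no match, no error       -> 1
grep_q_status "c"  cc /nonexistent/file     # error, no match          -> 2
grep_q_status "cc" cc /nonexistent/file -   # error, then match on "-" -> 0
```

Only the last case is the special one described by the man page: an error was detected, but a match was found as well, so -q still reports success.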
However, there seems to be a bug in GNU grep that causes it to return 0 when it
is run with -q and receives EACCES as a response to close. This includes
situations in which grep encounters other errors:
$ grep -q > .xsession-errors; echo $?
Usage: grep [OPTION]... PATTERNS [FILE]...
Try 'grep --help' for more information.
grep: write error: Permission denied
0
The bug doesn’t occur without the -q option: grep then correctly returns 2,
even when it would otherwise return 0:
$ echo 'cc' | grep "cc" > .xsession-errors; echo $?
grep: write error: Permission denied
2
Coming back to the original snippet, you can now see why it behaved the way it did:
if echo "$checks" | grep -q "cc" ; then
    echo "Error: The debug options --record, --backtrace, --strace, and --valgrind cannot be used together."
    echo " Please, use them one by one."
    exit 1;
fi
When soffice is run from dmenu, its output is redirected to .xsession-errors,
which is coincidentally broken by some (possibly) NFS magic. This causes grep
to receive EACCES in response to a close syscall and, combined with -q, a bug
causes it to return 0 when it really shouldn’t.
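One way to sidestep this whole class of problem is to test the string with the shell's own case pattern matching, which starts no external process and never touches stdout. This is only a sketch of the idea, not what LibreOffice does and not a proposed patch; has_conflicting_checks is a hypothetical name:

```shell
# Succeed (status 0) if the checks string records two or more debug options.
# has_conflicting_checks is a hypothetical helper; it uses only shell
# built-ins, so the state of stdout cannot influence its result.
has_conflicting_checks() {
    case "$1" in
        *cc*) return 0 ;;
        *)    return 1 ;;
    esac
}

checks="c"
if has_conflicting_checks "$checks"; then
    echo "Error: conflicting debug options"
else
    echo "ok"
fi
# prints: ok
```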