Chatty kernel logs
Most people don’t care about the kernel until it breaks or they think it is
broken. When this happens, usually the first place people look is the kernel
logs by using dmesg
or journalctl -k
. This dumps the output of the in-
kernel ringbuffer. The messages from the kernel ring buffer mostly come from
the kernel itself calling printk
. The ring buffer can’t hold an infinite
amount of data, and even if it could more information isn’t necessarily better.
In general, the kernel community tries to limit kernel prints to error messages
or limited probe information. Under normal operation the kernel should be
neither seen nor heard. The kernel doesn’t always match those guidelines though
and not every kernel message is an urgent problem to be fixed.
The kernel starts dumping information almost immediately after the bootloader passes it control. Very early information is designed to give an idea what kind of kernel is running and what kind of system it is running on. This may include dumping out CPU features and what areas of RAM were found by the kernel. As the kernel continues booting and initializing, the printed messages get more selective. Drivers may print out only hardware information or nothing at all. The latter is preferred in most cases.
I’ve had many arguments on both sides about whether a driver should be printing
something. There’s usually a lot of “but my driver reaaaallly needs this”. The
preferred solution to this problem is usually to adjust the log level the
messages are printed at. The kernel provides several
log levels to filter out appropriate messages. Most drivers will make use
of KERN_ERR
KERN_WARN
and KERN_INFO
. These have the meaning you would
expect: true errors, just gentle warning and some useful information.
KERN_DEBUG
should be used to provide more verbose debugging/tracing on an as
needed basis. The kernel option CONFIG_DYNAMIC_DEBUG
can be used to
enable and disable
individual pr_debug
messages at runtime. This option is enabled on Fedora
kernels.
Even with the different levels of kernel messages available, it may not always
be clear how important a message actually is. It’s very common to get Fedora
kernel bug reports of “dmesg had message X”. If there is nothing else going
wrong on the system, the bug may either get closed as NOTABUG
or placed at
low priority. A common ongoing complaint is with firmware. Drivers may look for
multiple firmware versions, starting with the newest. If a particular firmware
isn’t available, the firmware layer itself will spit out an error even if a
matching version may eventually be found. Sometimes the kernel driver may not
match the hardware exactly and the driver will choose to be ‘helpful’ and
indicate there may be problems. While it is true that hardware which is not
100% compliant may cause issues, messages like this are unhelpful without a
message that this isn’t likely to be fixed in the kernel. Even with statements
like “the kernel is fine”, it can be confusing and difficult to explain this
to users.
The kernel logs are a vital piece of information for reporting problems. Not all messages in the logs are an indication of a problem that needs a kernel fix. It’s still important to report bugs so we know what bothers people. It may be possible to report issues to upstream driver owners to report better error messages and make the kernel logs more useful for everyone.