Dave Young
2017-04-12 08:24:33 UTC
Hi,
commit 021182e52fe01 ("x86/mm: Enable KASLR for physical mapping memory
regions") causes some of my systems with persistent memory (whether real
or emulated) to fail to boot with a couple of different crash
signatures. The first signature is a NMI watchdog lockup of all but 1
cpu, which causes much difficulty in extracting useful information from
the console. The second variant is an invalid paging request, listed
below.
On some systems, I haven't hit this problem at all. Other systems
experience a failed boot maybe 20-30% of the time. To reproduce it,
configure some emulated pmem on your system. You can find directions
for that here: https://nvdimm.wiki.kernel.org/
Install ndctl (https://github.com/pmem/ndctl).
# ndctl create-namespace -f -e namespace0.0 -m memory
Then just reboot several times (5 should be enough), and hopefully
you'll hit the issue.
I've attached both my .config and the dmesg output from a successful
boot at the end of this mail.
[snip]
I ran some tests with emulated pmem configured via memmap=. The kdump kernel
either hangs or just reboots early, in the compressed kernel stage, and I have
no clue yet how to handle that.
Since KASLR is pointless for the kdump kernel, a workaround is to boot it with
"nokaslr". In Fedora or RHEL, just add "nokaslr" to KDUMP_COMMANDLINE_APPEND
in /etc/sysconfig/kdump.
Can you check whether this works?
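For reference, a rough sketch of how the edit could be scripted. The helper
name add_nokaslr is made up for illustration, and it assumes GNU sed and that
the existing KDUMP_COMMANDLINE_APPEND value, if any, is double-quoted:

```shell
#!/bin/sh
# Hypothetical helper (not part of kexec-tools): add "nokaslr" to the
# KDUMP_COMMANDLINE_APPEND line of the kdump config file given as $1.
add_nokaslr() {
    conf="$1"
    # Nothing to do if the option is already on that line.
    if grep -q '^KDUMP_COMMANDLINE_APPEND=.*nokaslr' "$conf"; then
        return 0
    fi
    if grep -q '^KDUMP_COMMANDLINE_APPEND="' "$conf"; then
        # Append inside the existing double-quoted value (GNU sed -i).
        sed -i 's/^\(KDUMP_COMMANDLINE_APPEND="[^"]*\)"/\1 nokaslr"/' "$conf"
    else
        # No double-quoted assignment found: add a new one at the end.
        # Assumption: the file is sourced as shell, so the last assignment wins.
        echo 'KDUMP_COMMANDLINE_APPEND="nokaslr"' >> "$conf"
    fi
}

# Usage, as root:
#   add_nokaslr /etc/sysconfig/kdump
#   systemctl restart kdump   # assumed to reload the kdump kernel with the new command line
```

Editing the file by hand works just as well. The kdump service has to be
restarted afterwards so the crash kernel is reloaded with the new options.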
Thanks
Dave