Exadata, dracut and LVM locking_type as read-only

Sometimes even a perfectly healthy server can run into problems after a simple reboot. That is exactly what happened recently with one Exadata database node, and it was not the first time I had seen this error. Since there is no well-documented step-by-step fix, I documented the steps below. So, check how to fix the issue related to the read-only LVM locking_type, detected when dracut cannot find the disks.

The error appears after a reboot, when dracut cannot find the disk labels that LVM provides for the operating system. The kernel boots, but the boot process stops after that. The first symptom is that after the reboot (or power off/power on) it is impossible to log in to the machine. To check what is happening, you need to open the console (xm console for a VM, or the ILOM):

[DOM0 - root@exadbsrv04 ~]$  xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  8886     4     r----- 6065187.7
exadbvm01.mynt.simon.net                     9 614403    40     r----- 35496330.2
exadbvm03.mynt.simon.net                    17 614403    32     r----- 21859903.0
exadbvm05.mynt.simon.net                    19 61443    10     -b----     43.9
[DOM0 - root@exadbsrv04 ~]$ 
[DOM0 - root@exadbsrv04 ~]$  xm console exadbvm05.mynt.simon.net
...
...
[    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000efffffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x0000000f0fffffff] usable
...

After some messages, you can see that dracut stops with: “Warning: /dev/disk/by-label/DBSYS does not exist”:

[  215.765829] dracut-initqueue[437]: Warning: dracut-initqueue timeout - starting timeout scripts

[  216.351880] dracut-initqueue[437]: Warning: dracut-initqueue timeout - starting timeout scripts

[  216.935966] dracut-initqueue[437]: Warning: dracut-initqueue timeout - starting timeout scripts

[  217.520974] dracut-initqueue[437]: Warning: dracut-initqueue timeout - starting timeout scripts

[  217.527029] dracut-initqueue[437]: Warning: Could not boot.

[  218.460844] dracut-initqueue[437]: Warning: /dev/disk/by-label/DBSYS does not exist
         Starting Dracut Emergency Shell...
Warning: /dev/disk/by-label/DBSYS does not exist

Generating "/run/initramfs/rdsosreport.txt"


Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
after mounting them and attach it to a bug report.


dracut:/#

To solve the issue, the first step is to check the locking_type setting inside the lvm.conf file. If the value is 4, LVM is using read-only locking, which forbids any operation that might change metadata, and the value needs to be changed to 1 (local file-based locking, the standard mode):

dracut:/etc/lvm# vi lvm.conf
…
#     LVM uses built-in clustered locking with clvmd.
#     This is incompatible with lvmetad. If use_lvmetad is enabled,
#     LVM prints a warning and disables lvmetad use.
#   4
#     LVM uses read-only locking which forbids any operations that
#     might change metadata.
#   5
#     Offers dummy locking for tools that do not need any locks.
#     You should not need to set this directly; the tools will select
#     when to use it instead of the configured locking_type.
#     Do not use lvmetad or the kernel device-mapper driver with this
#     locking type. It is used by the --readonly option that offers
#     read-only access to Volume Group metadata that cannot be locked
#     safely because it belongs to an inaccessible domain and might be
#     in use, for example a virtual machine image or a disk that is
#     shared by a clustered machine.
#
locking_type = 4

# Configuration option global/wait_for_locks.
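
The same check can be done without opening the editor. The following is only a sketch: the function name get_locking_type is hypothetical, and it assumes that sed is available inside the initramfs of your image. It prints the value of the uncommented locking_type line and skips the commented documentation around it:

```shell
#!/bin/sh
# Print the active (uncommented) locking_type value from an lvm.conf-style file.
# Assumption: sed exists in the emergency shell; the function name is illustrative.
get_locking_type() {
    conf="${1:-/etc/lvm/lvm.conf}"
    # -n suppresses normal output; only lines whose first non-blank text is
    # "locking_type" match, so lines starting with "#" are ignored.
    sed -n 's/^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf"
}
```

Calling get_locking_type with no argument reads /etc/lvm/lvm.conf. If it prints 4, the file has the read-only setting described above.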

If that is the case, the next step is to create a backup copy of lvm.conf and edit the original file:

dracut:/etc/lvm# cd /etc/lvm
dracut:/etc/lvm# cp lvm.conf lvm1.conf
dracut:/etc/lvm# vi lvm.conf
# This is an example configuration file for the LVM2 system.
...
...
       #     locking type. It is used by the --readonly option that offers
       #     read-only access to Volume Group metadata that cannot be locked
       #     safely because it belongs to an inaccessible domain and might be
       #     in use, for example a virtual machine image or a disk that is
       #     shared by a clustered machine.
       #
       locking_type = 1

       # Configuration option global/wait_for_locks.
       # When disabled, fail if a lock request would block.
"lvm.conf" 2152L, 95859C written
dracut:/etc/lvm#
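
If you prefer not to edit the file by hand, the backup and the change can be sketched as one small function. This is an illustration and not the procedure I ran: the function name set_locking_type_local is hypothetical, and it assumes a sed that supports in-place editing with -i (as GNU sed does). It changes only an uncommented locking_type line whose value is 4:

```shell
#!/bin/sh
# Back up an lvm.conf-style file and switch locking_type from 4 to 1.
# Assumptions: cp and a sed with -i are available; the name is illustrative.
set_locking_type_local() {
    conf="${1:-/etc/lvm/lvm.conf}"
    # Keep a copy next to the original (lvm.conf -> lvm1.conf), as in the manual step.
    cp "$conf" "${conf%.conf}1.conf" || return 1
    # The address selects only the uncommented locking_type line; the
    # substitution then replaces a trailing value of 4 with 1.
    sed -i '/^[[:space:]]*locking_type[[:space:]]*=/s/=[[:space:]]*4[[:space:]]*$/= 1/' "$conf"
}
```

Running set_locking_type_local with no argument works on /etc/lvm/lvm.conf. Any other value of locking_type is left as it is.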

After that, we need to rescan the LVM devices (physical volumes, volume groups, and logical volumes, in this exact order):

dracut:/etc/lvm# lvm pvscan
 Scan of VG VGExaDb from /dev/xvdf found metadata seqno 82 vs previous 80.
 Scan of VG VGExaDb from /dev/xvdg found metadata seqno 82 vs previous 80.
 Scan of VG VGExaDb from /dev/xvdf found metadata seqno 82 vs previous 80.
 Scan of VG VGExaDb from /dev/xvdg found metadata seqno 82 vs previous 80.    
 Scan of VG VGExaDb from /dev/xvdf found metadata seqno 82 vs previous 80.
 Scan of VG VGExaDb from /dev/xvdg found metadata seqno 82 vs previous 80.
 WARNING: Missing device /dev/xvda2 reappeared, updating metadata for VG VGExaDb to version 82.
 WARNING: Device /dev/xvda2 still marked missing because of allocated data on it, remove volumes and consider vgreduce --removemissing.
 WARNING: Missing device /dev/xvdd1 reappeared, updating metadata for VG VGExaDb to version 82.
 WARNING: Device /dev/xvdd1 still marked missing because of allocated data on it, remove volumes and consider vgreduce --removemissing.
 WARNING: Inconsistent metadata found for VG VGExaDb - updating to use version 82
 PV /dev/xvda2   VG VGExaDb         lvm2 [<24.50 GiB / 0    free]
 PV /dev/xvdd1   VG VGExaDb         lvm2 [<62.00 GiB / 0    free]
 PV /dev/xvdf    VG VGExaDb         lvm2 [<50.00 GiB / 0    free]
 PV /dev/xvdg    VG VGExaDb         lvm2 [<150.00 GiB / 0    free]
 Total: 4 [286.48 GiB] / in use: 4 [286.48 GiB] / in no VG: 0 [0   ]
dracut:/etc/lvm# lvm vgscan
 Reading all physical volumes.  This may take a while...
 Found volume group "VGExaDb" using metadata type lvm2
dracut:/etc/lvm# lvm lvscan
 inactive          '/dev/VGExaDb/LVDbSys1' [24.00 GiB] inherit
 inactive          '/dev/VGExaDb/LVDbSys2' [24.00 GiB] inherit
 inactive          '/dev/VGExaDb/LVDbOra1' [221.48 GiB] inherit
 inactive          '/dev/VGExaDb/LVDbSwap1' [16.00 GiB] inherit
 inactive          '/dev/VGExaDb/LVDoNotRemoveOrUse' [1.00 GiB] inherit
dracut:/etc/lvm#
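
The three scans above can also be wrapped in a small helper so the order is never mixed up. This is a minimal sketch with a hypothetical function name (rescan_lvm). It assumes that each scan returns a non-zero exit status when it fails, which I did not verify for every case:

```shell
#!/bin/sh
# Run the LVM scans in the order used above: physical volumes,
# volume groups, then logical volumes. Stops at the first failure.
# Assumption: a failing scan is reported through its exit status.
rescan_lvm() {
    for step in pvscan vgscan lvscan; do
        lvm "$step" || { echo "lvm $step failed" >&2; return 1; }
    done
}
```

Calling rescan_lvm from the dracut shell prints the same output as the three separate commands.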

As you can see above, LVM reports that the missing devices reappeared, and now we can reboot:

dracut:/etc/lvm# reboot
[ 2371.212140] dracut-initqueue[437]: Job for dracut-emergency.service canceled.

[ 2371.220275] dracut-initqueue[437]: Warning: Not all disks have been found.

[ 2371.220334] dracut-initqueue[437]: Warning: You might want to regenerate your initramfs.

[  OK  ] Started Show Plymouth Reboot Screen.

[    **] A start job is running for dev-disk...YS.device (39min 33s / no limit)
...
...
[DOM0 - root@exadbsrv04 ~]$  
[DOM0 - root@exadbsrv04 ~]$  
[DOM0 - root@exadbsrv04 ~]$  
[DOM0 - root@exadbsrv04 ~]$  xm console exadbvm05.mynt.simon.net
...
...
[    3.108250] systemd[1]: Detected architecture x86-64.
[    3.113372] systemd[1]: Running in initial RAM disk.

Welcome to Oracle Linux Server 7.9 dracut-033-572.0.3.el7 (Initramfs)!

[    3.124900] systemd[1]: Set hostname to <exadbvm05.mynt.simon.net>.
[    3.166337] systemd[1]: Created slice Root Slice.
[  OK  ] Created slice Root Slice.
[    3.178137] systemd[1]: Created slice System Slice.
...
[  145.928065] dracut-initqueue[435]: Warning: dracut-initqueue timeout - starting timeout scripts

[  OK  ] Found device /dev/disk/by-label/DBSYS.

         Starting File System Check on /dev/disk/by-label/DBSYS...

[  OK  ] Started dracut initqueue hook.

[  OK  ] Reached target Remote File Systems (Pre).
...
...
         Starting Terminate Plymouth Boot Screen...



 
exadbvm05 login: [  252.544220] rc-oracle-exadata[10761]: Run validation misceachboot - PASSED
[  252.574278] rc-oracle-exadata[10761]: 2021-06-07 10:27:05 +0200 The each boot completed with SUCCESS

The only remaining detail is to verify that you don’t have any snapshot at the LVM level. If you do, the issue may return at the next boot, and the recommendation is to remove the snapshot. In that case, check the note Exadata VM Goes To Reboot Loop Due to Existing Root Snapshot (Doc ID 2326897.1):

[root@exadbvm05 ~]# lvdisplay  | grep snap
[root@exadbvm05 ~]#
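
The grep above depends on the word "snap" appearing in the lvdisplay output. As an alternative sketch (not what I ran on this node), you can look at the origin column reported by lvs, since a snapshot has an origin volume. The function name list_snapshots is hypothetical, and the check assumes classic LVM snapshots, because other volume types can also report an origin:

```shell
#!/bin/sh
# Print every logical volume that reports an origin volume.
# Input on stdin: output of
#   lvs --noheadings --separator '|' -o lv_name,origin
# Assumption: a non-empty origin means a classic LVM snapshot.
list_snapshots() {
    awk -F'|' '{
        gsub(/^[ \t]+|[ \t]+$/, "", $1)   # trim padding around the LV name
        gsub(/[ \t]/, "", $2)             # drop padding in the origin field
        if ($2 != "") print $1 " (snapshot of " $2 ")"
    }'
}
```

On the running VM it can be used as: lvs --noheadings --separator '|' -o lv_name,origin | list_snapshots. Empty output means that no volume reports an origin.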

Disclaimer: “The postings on this site are my own and don’t necessarily represent my actual employer positions, strategies or opinions. The information here was edited to be useful for general purpose, specific data and identifications were removed to allow reach the generic audience and to be useful for the community. Post protected by copyright.”
