ASM, REPLACE DISK Command

The REPLACE DISK command was released with 12.1 and allows an online replacement of a failed disk. This command is important because it reduces the rebalance time by performing just the SYNC phase. Compared with the normal disk replacement (DROP and ADD in the same command), REPLACE does just a mirror resync.

Basically, when the REPLACE command is called, the rebalance just copies/syncs the data from the surviving disk (the partner disk of the mirror). It is faster because the previous drop/add method executes a complete rebalance over all AUs of the diskgroup, running both the REBALANCE and SYNC phases.
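For comparison, the pre-12.1 way of swapping a failed disk was a DROP and ADD in a single statement, which triggers a full rebalance of the diskgroup. A hedged sketch of that older approach (disk names, failgroup, and path are illustrative, matching the environment used later in this post):

```sql
-- Pre-12.1 approach: drop the failed disk and add the new one in one command.
-- This triggers a full REBALANCE + SYNC over all AUs of the diskgroup,
-- which is what REPLACE DISK avoids.
ALTER DISKGROUP DATA
  DROP DISK DISK02 FORCE
  ADD FAILGROUP FAIL01 DISK 'ORCL:DISK07'
  REBALANCE POWER 2;
```

The FORCE option is needed here because the failed disk is offline and can no longer be read.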

The REPLACE DISK command is important for the disk swap process on Exadata (where you add the new 14TB disks), since it makes the rebalance of the diskgroup faster.

Below is one example of this behavior. Note that the AUs from DISK01 were SYNCED with the new disk:

Compare it with the previous DROP/ADD approach, where all AUs from all disks were rebalanced:

Current Environment and Simulating the Failure

In this post, to simulate and show how REPLACE DISK works, I have the DATA diskgroup with 6 disks (DISK01-06). DISK07 is not in use.

SQL> select NAME,FAILGROUP,LABEL,PATH from v$asm_disk order by FAILGROUP, label;

NAME                           FAILGROUP                      LABEL      PATH
------------------------------ ------------------------------ ---------- -----------
DISK01                         FAIL01                         DISK01     ORCL:DISK01
DISK02                         FAIL01                         DISK02     ORCL:DISK02
DISK03                         FAIL02                         DISK03     ORCL:DISK03
DISK04                         FAIL02                         DISK04     ORCL:DISK04
DISK05                         FAIL03                         DISK05     ORCL:DISK05
DISK06                         FAIL03                         DISK06     ORCL:DISK06
RECI01                         RECI01                         RECI01     ORCL:RECI01
SYSTEMIDG01                    SYSTEMIDG01                    SYSI01     ORCL:SYSI01
                                                              DISK07     ORCL:DISK07

9 rows selected.

SQL>

To simulate the error, I disconnected the disk from the operating system (since I used iSCSI, I just logged off the target for DISK02):

[root@asmrec ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:tsn.eff4683320e8 -p 172.16.0.3:3260 -u
Logging out of session [sid: 41, target: iqn.2006-01.com.openfiler:tsn.eff4683320e8, portal: 172.16.0.3,3260]
Logout of [sid: 41, target: iqn.2006-01.com.openfiler:tsn.eff4683320e8, portal: 172.16.0.3,3260] successful.
[root@asmrec ~]#

At the same moment, the ASM alert log detected the error and reported that the mirror copy was found on another disk (DISK06):

2020-03-29T00:42:11.160695+01:00
WARNING: Read Failed. group:3 disk:1 AU:29 offset:0 size:4096
path:ORCL:DISK02
         incarnation:0xf0f0c113 synchronous result:'I/O error'
         subsys:/opt/oracle/extapi/64/asm/orcl/1/libasm.so krq:0x7f3df8db35b8 bufp:0x7f3df8c9c000 osderr1:0x3 osderr2:0x2e
         IO elapsed time: 0 usec Time waited on I/O: 0 usec
WARNING: cache failed reading from group=3(DATA) fn=8 blk=0 count=1 from disk=1 (DISK02) mirror=0 kfkist=0x20 status=0x02 osderr=0x3 file=kfc.c line=13317
WARNING: cache succeeded reading from group=3(DATA) fn=8 blk=0 count=1 from disk=5 (DISK06) mirror=1 kfkist=0x20 status=0x01 osderr=0x0 file=kfc.c line=13366

So, at this moment DISK02 will not be removed instantly, but only after disk_repair_time expires:

WARNING: Started Drop Disk Timeout for Disk 1 (DISK02) in group 3 with a value 43200
WARNING: Disk 1 (DISK02) in group 3 will be dropped in: (43200) secs on ASM inst 1
cluster guid (e4db41a22bd95fc6bf79d2e2c93360c7) generated for PST Hbeat for instance 1
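The 43200 seconds above come from the diskgroup's disk_repair_time attribute (the default is 3.6h, so in this environment it was raised to 12h). A quick way to check and, if needed, adjust it (the 24h value below is just an example):

```sql
-- Check the current repair timer for each diskgroup
SELECT g.name, a.value
  FROM v$asm_attribute a
  JOIN v$asm_diskgroup g
    ON a.group_number = g.group_number
 WHERE a.name = 'disk_repair_time';

-- Optionally extend it to buy more time before the offline disk is dropped
ALTER DISKGROUP DATA SET ATTRIBUTE 'disk_repair_time' = '24h';
```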

If you want to check the full output from the ASM alert log, you can access it here: ASM-ALERTLOG-Output-Online-Disk-Error.txt

So, the diskgroup now looks like this:

SQL> select NAME,FAILGROUP,LABEL,PATH from v$asm_disk order by FAILGROUP, label;

NAME                           FAILGROUP                      LABEL      PATH
------------------------------ ------------------------------ ---------- -----------
DISK01                         FAIL01                         DISK01     ORCL:DISK01
DISK02                         FAIL01
DISK03                         FAIL02                         DISK03     ORCL:DISK03
DISK04                         FAIL02                         DISK04     ORCL:DISK04
DISK05                         FAIL03                         DISK05     ORCL:DISK05
DISK06                         FAIL03                         DISK06     ORCL:DISK06
RECI01                         RECI01                         RECI01     ORCL:RECI01
SYSTEMIDG01                    SYSTEMIDG01                    SYSI01     ORCL:SYSI01
                                                              DISK07     ORCL:DISK07

9 rows selected.

SQL>

REPLACE DISK

Since the old disk was lost (due to hardware failure or something similar), it is impossible to bring it back online. A new disk (DISK07 in this example) was attached to the server and will be added to the diskgroup.

So, we just need to execute the REPLACE DISK command:

SQL> alter diskgroup DATA
  2  REPLACE DISK DISK02 with 'ORCL:DISK07'
  3  power 2;

Diskgroup altered.

SQL>

The command is simple: we replace the failed disk with the new disk path. It is also possible to replace more than one disk in the same command and to specify the power of the rebalance.
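A sketch of a multi-disk replacement, assuming a second failed disk and a second spare (DISK04 and ORCL:DISK08 are hypothetical here, used only to illustrate the syntax):

```sql
-- Replace two failed disks in a single command; the subsequent
-- resync runs at power 4
ALTER DISKGROUP DATA
  REPLACE DISK DISK02 WITH 'ORCL:DISK07',
               DISK04 WITH 'ORCL:DISK08'
  POWER 4;
```

Keep in mind the known issue described at the end of this post before replacing many disks at once on 18c and higher.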

In the ASM alert log we can see a lot of messages about this replacement; note the resync of the disk. The full output can be found here: ASM-ALERTLOG-Output-Replace-Disk.txt

Some points here:

2020-03-29T00:44:31.602826+01:00
SQL> alter diskgroup DATA
replace disk DISK02 with 'ORCL:DISK07'
power 2
2020-03-29T00:44:31.741335+01:00
NOTE: cache closing disk 1 of grp 3: (not open) DISK02
2020-03-29T00:44:31.742068+01:00
NOTE: GroupBlock outside rolling migration privileged region
2020-03-29T00:44:31.742968+01:00
NOTE: client +ASM1:+ASM:asmrec no longer has group 3 (DATA) mounted
2020-03-29T00:44:31.746444+01:00
NOTE: Found ORCL:DISK07 for disk DISK02
NOTE: initiating resync of disk group 3 disks
DISK02 (1)

NOTE: process _user20831_+asm1 (20831) initiating offline of disk 1.4042309907 (DISK02) with mask 0x7e in group 3 (DATA) without client assisting
2020-03-29T00:44:31.747191+01:00
NOTE: sending set offline flag message (2044364809) to 1 disk(s) in group 3
…
…
2020-03-29T00:44:34.558097+01:00
NOTE: PST update grp = 3 completed successfully
2020-03-29T00:44:34.559806+01:00
SUCCESS: alter diskgroup DATA
replace disk DISK02 with 'ORCL:DISK07'
power 2
2020-03-29T00:44:36.805979+01:00
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
2020-03-29T00:44:36.820900+01:00
NOTE: starting rebalance of group 3/0xf99030d7 (DATA) at power 2

After that, we can see that the rebalance runs only the RESYNC phase:

SQL> select * from gv$asm_operation;

   INST_ID GROUP_NUMBER OPERA PASS      STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE     CON_ID
---------- ------------ ----- --------- ---- ---------- ---------- ---------- ---------- ---------- ----------- ---------- ----------
         1            3 REBAL COMPACT   WAIT          2          2          0          0          0           0                     0
         1            3 REBAL REBALANCE WAIT          2          2          0          0          0           0                     0
         1            3 REBAL REBUILD   WAIT          2          2          0          0          0           0                     0
         1            3 REBAL RESYNC    RUN           2          2        231       1350        513           2                     0

SQL>
SQL> /

   INST_ID GROUP_NUMBER OPERA PASS      STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE     CON_ID
---------- ------------ ----- --------- ---- ---------- ---------- ---------- ---------- ---------- ----------- ---------- ----------
         1            3 REBAL COMPACT   WAIT          2          2          0          0          0           0                     0
         1            3 REBAL REBALANCE WAIT          2          2          0          0          0           0                     0
         1            3 REBAL REBUILD   WAIT          2          2          0          0          0           0                     0
         1            3 REBAL RESYNC    RUN           2          2        373       1350        822           1                     0

SQL>
SQL> /

   INST_ID GROUP_NUMBER OPERA PASS      STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE     CON_ID
---------- ------------ ----- --------- ---- ---------- ---------- ---------- ---------- ---------- ----------- ---------- ----------
         1            3 REBAL COMPACT   REAP          2          2          0          0          0           0                     0
         1            3 REBAL REBALANCE DONE          2          2          0          0          0           0                     0
         1            3 REBAL REBUILD   DONE          2          2          0          0          0           0                     0
         1            3 REBAL RESYNC    DONE          2          2       1376       1350          0           0                     0

SQL> /

no rows selected

SQL>

In the end, after the rebalance finishes, we have:

SQL> select NAME,FAILGROUP,LABEL,PATH from v$asm_disk order by FAILGROUP, label;

NAME                           FAILGROUP                      LABEL      PATH
------------------------------ ------------------------------ ---------- -----------
DISK01                         FAIL01                         DISK01     ORCL:DISK01
DISK02                         FAIL01                         DISK07     ORCL:DISK07
DISK03                         FAIL02                         DISK03     ORCL:DISK03
DISK04                         FAIL02                         DISK04     ORCL:DISK04
DISK05                         FAIL03                         DISK05     ORCL:DISK05
DISK06                         FAIL03                         DISK06     ORCL:DISK06
RECI01                         RECI01                         RECI01     ORCL:RECI01
SYSTEMIDG01                    SYSTEMIDG01                    SYSI01     ORCL:SYSI01

8 rows selected.

SQL>

An important detail is that the NAME of the disk does not change; it cannot be changed with the REPLACE DISK command. As you can see above, the disk named DISK02 now has the label DISK07 (here the label comes from the ASMLIB disk).

Known Issues

There is a known issue with REPLACE DISK for GI 18c and higher where the rebalance can take ages to finish. This happens because, when replacing more than one disk at a time, it executes the SYNC disk by disk. As one example, on an Exadata the replacement of a complete cell took more than 48 hours, while a DROP/ADD took just 12 hours for the same disks.

So, it is recommended to have the fixes for Bug 30582481 and Bug 31062010 applied. The detail is that patch 30582481 (Patch 30582481: ASM REPLACE DISK COMMAND EXECUTED ON ALL CELLDISKS OF A FAILGROUP, ASM RUNNING RESYNC ONE DISK AT A TIME) was withdrawn and replaced by bug/patch 31062010, which is not yet available (at the moment I write this post – March 2020).

So, be careful when doing this on an engineered system or when you need to replace a lot of disks at the same time.


Disclaimer: “The postings on this site are my own and don’t necessarily represent my actual employer positions, strategies or opinions. The information here was edited to be useful for general purpose, specific data and identifications were removed to allow reach the generic audience and to be useful for the community.”
