Monday, May 11, 2015

That RAID10 Mystery..

Not all RAIDs are created equal. I recently stumbled upon an issue where I was at a loss to explain why two configs that are reported differently (both RAID-10) seemed to have the same characteristics.
On a server with many disks (xd configs), I created two LD’s (logical drives): one with 10 disks, the other with 12 disks.
I asked for a RAID10 config in both cases, but I didn't use the same tool to create each RAID.


  • Creating the two LD’s
    • First LD (racadm on the iDrac):


                send "racadm storage createvd:RAID.Integrated.1-1 -rl r10 -wp wb -rp ra -name Virtual_Disk_1 -ss 512k -pdkey:"
                send "Disk.Bay.2:Enclosure.Internal.0-1:RAID.Integrated.1-1"
                send ",Disk.Bay.3:Enclosure.Internal.0-1:RAID.Integrated.1-1"
                send ",Disk.Bay.4:Enclosure.Internal.0-1:RAID.Integrated.1-1"
                send ",Disk.Bay.5:Enclosure.Internal.0-1:RAID.Integrated.1-1"
                send ",Disk.Bay.6:Enclosure.Internal.0-1:RAID.Integrated.1-1"
                send ",Disk.Bay.7:Enclosure.Internal.0-1:RAID.Integrated.1-1"
                send ",Disk.Bay.8:Enclosure.Internal.0-1:RAID.Integrated.1-1"
                send ",Disk.Bay.9:Enclosure.Internal.0-1:RAID.Integrated.1-1"
                send ",Disk.Bay.10:Enclosure.Internal.0-1:RAID.Integrated.1-1"
                send ",Disk.Bay.11:Enclosure.Internal.0-1:RAID.Integrated.1-1"
                send "\r"
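The expect snippet above sends the racadm command in pieces, one physical disk key per line. As a minimal sketch, the same command string can be assembled programmatically. The function name and its parameters are made up for illustration; the flags and disk key format are copied from the snippet above, not from the racadm documentation.

```python
def racadm_createvd_r10(bays, controller="RAID.Integrated.1-1",
                        enclosure="Enclosure.Internal.0-1"):
    """Build the createvd command used above (hypothetical helper)."""
    # One key per disk, in the same Disk.Bay.N:enclosure:controller form.
    pdkeys = ",".join(
        f"Disk.Bay.{bay}:{enclosure}:{controller}" for bay in bays
    )
    return (
        f"racadm storage createvd:{controller} -rl r10 -wp wb -rp ra "
        f"-name Virtual_Disk_1 -ss 512k -pdkey:{pdkeys}"
    )


# Bays 2 through 11, i.e. the ten disks of the first LD.
command = racadm_createvd_r10(range(2, 12))
print(command)
```

This only builds the string; sending it to the iDRAC is still left to whatever transport you already use (expect, ssh, ...).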


    • Second LD (MegaCli):

$ sudo MegaCli64 -CfgSpanAdd -r10 -Array0[32:12,32:13,32:14,32:15,32:16,32:17] -Array1[32:18,32:19,32:20,32:21,32:22,32:24] -a0


  • Configuration Results

$ sudo MegaCli64 -LDInfo -l1 -a0 -NoLog
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 1 (Target Id: 1)
Name                :Virtual_Disk_1
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 2.725 TB
Sector Size         : 512
Is VD emulated      : No
Mirror Data         : 2.725 TB
State               : Optimal
Strip Size          : 512 KB
Number Of Drives    : 10
Span Depth          : 1

$ sudo MegaCli64 -LDInfo -l2 -a0 -NoLog
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 2 (Target Id: 2)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 3.271 TB
Sector Size         : 512
Is VD emulated      : No
Mirror Data         : 3.271 TB
State               : Optimal
Strip Size          : 64 KB
Number Of Drives per span:6
Span Depth          : 2
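To see at a glance where the two reports differ, here is a small sketch that parses "key : value" lines and lists only the fields that do not match. The helper names are hypothetical and the sample text is a subset of the two outputs above; it assumes every interesting line contains a colon.

```python
def parse_ldinfo(text):
    """Turn 'Key : value' lines into a dict, skipping lines with no value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def diff_fields(a, b):
    """Return {key: (value_in_a, value_in_b)} for fields that differ."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Excerpts of the -LDInfo output shown above.
LD1 = """\
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Strip Size          : 512 KB
Number Of Drives    : 10
Span Depth          : 1
"""
LD2 = """\
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Strip Size          : 64 KB
Number Of Drives per span:6
Span Depth          : 2
"""

diff = diff_fields(parse_ldinfo(LD1), parse_ldinfo(LD2))
for key, (first, second) in diff.items():
    print(f"{key}: LD1={first} LD2={second}")
```

The RAID Level line is identical for both LD's, so it drops out of the diff; only the strip size, the span depth and the way the drive count is reported remain.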

Given how different the two reports look (Span Depth 1 with 10 drives vs. Span Depth 2 with 6 drives per span), you would expect the two LD's to be RAID'ed differently: one RAID0+1, the other RAID1+0.

Let’s look at the LD’s:

    • First LD

$ sudo ~/bin/megaclisas-status|egrep '(Status|c0u1)'
-- ID | Type    |    Size |  Strpsz |   Flags | DskCache |  Status |  OS Path | InProgress   
c0u1  | RAID-10 |   2725G |  512 KB |   RA,WB | Disabled | Offline |        1 | None         
-- ID   | Type | Drive Model                      | Size     | Status          | Speed    | Temp | Slot ID  | LSI Device ID
c0u1p0  | HDD  | SEAGATE ST600MM0006 LS0AS0MH6WD1 | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:2]   | 2       
c0u1p1  | HDD  | SEAGATE ST600MM0006 LS0AS0MM9VMK | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:3]   | 3       
c0u1p2  | HDD  | SEAGATE ST600MM0006 LS0AS0MPU9UD | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:4]   | 4       
c0u1p3  | HDD  | SEAGATE ST600MM0006 LS0AS0M0DYA3 | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:5]   | 5       
c0u1p4  | HDD  | SEAGATE ST600MM0006 LS0AS0MPE83S | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:6]   | 6       
c0u1p5  | HDD  | SEAGATE ST600MM0006 LS0AS0MCS44B | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 31C  | [32:7]   | 7       
c0u1p6  | HDD  | SEAGATE ST600MM0006 LS0AS0MV4CDQ | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 32C  | [32:8]   | 8       
c0u1p7  | HDD  | SEAGATE ST600MM0006 LS0AS0MKZITD | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:9]   | 9       
c0u1p8  | HDD  | SEAGATE ST600MM0006 LS0AS0MG5JK5 | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:10]  | 10      
c0u1p9  | HDD  | SEAGATE ST600MM0006 LS0AS0MNQKGJ | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:11]  | 11      


    • Second LD

$ sudo ~/bin/megaclisas-status|egrep '(Status|c0u2)'
-- ID | Type    |    Size |  Strpsz |   Flags | DskCache |  Status |  OS Path | InProgress   
c0u2  | RAID-10 |   3271G |   64 KB | ADRA,WB |  Default | Optimal | /dev/sdc | Background Initialization: Completed 2%, Taken 2 min. 
-- ID   | Type | Drive Model                      | Size     | Status          | Speed    | Temp | Slot ID  | LSI Device ID
c0u2p0  | HDD  | SEAGATE ST600MM0006 LS0AS0M8U2J0 | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:12]  | 12      
c0u2p1  | HDD  | SEAGATE ST600MM0006 LS0AS0MR90JD | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:13]  | 13      
c0u2p2  | HDD  | SEAGATE ST600MM0006 LS0AS0MEHKQG | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:14]  | 14      
c0u2p3  | HDD  | SEAGATE ST600MM0006 LS0AS0MD8GJM | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 32C  | [32:15]  | 15      
c0u2p4  | HDD  | SEAGATE ST600MM0006 LS0AS0MIQWWY | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 32C  | [32:16]  | 16      
c0u2p5  | HDD  | SEAGATE ST600MM0006 LS0AS0MPL5QR | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:17]  | 17      
c0u2p0  | HDD  | SEAGATE ST600MM0006 LS0AS0MJ59BJ | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:18]  | 18      
c0u2p1  | HDD  | SEAGATE ST600MM0006 LS0AS0MB5F8B | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:19]  | 19      
c0u2p2  | HDD  | SEAGATE ST600MM0006 LS0AS0MLCFI8 | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:20]  | 20      
c0u2p3  | HDD  | SEAGATE ST600MM0006 LS0AS0M7N594 | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:21]  | 21      
c0u2p4  | HDD  | SEAGATE ST600MM0006 LS0AS0MXA9G2 | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:22]  | 22      
c0u2p5  | HDD  | SEAGATE ST600MM0006 LS0AS0MM3BLR | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 52C  | [32:24]  | 24      


  • Punching holes through the RAID drives

Let’s take some drives offline. With a RAID0+1 config (a mirror of two stripes), one offline drive is enough to break its whole stripe, so as soon as we hit a drive in the second stripe, the LD will most likely go down:


    • First LD (10 disks):

$ sudo MegaCli64 -PDOffline -PhysDrv '[32:2]' -a0
$ sudo MegaCli64 -PDOffline -PhysDrv '[32:4]' -a0
$ sudo MegaCli64 -PDOffline -PhysDrv '[32:6]' -a0
$ sudo MegaCli64 -PDOffline -PhysDrv '[32:8]' -a0
$ sudo MegaCli64 -PDOffline -PhysDrv '[32:10]' -a0

The LD is still online, albeit Degraded (which points to RAID10, not RAID0+1):

$ sudo ~/bin/megaclisas-status|egrep '(Status|c0u1)'
Password:
-- ID | Type    |    Size |  Strpsz |   Flags | DskCache |  Status |  OS Path | InProgress   
c0u1  | RAID-10 |   2725G |  512 KB |   RA,WB | Disabled | Degraded | /dev/sdb | None         
-- ID   | Type | Drive Model                      | Size     | Status          | Speed    | Temp | Slot ID  | LSI Device ID
c0u1p0  | HDD  | SEAGATE ST600MM0006 LS0AS0MH6WD1 | 558.3 Gb | Offline         | 6.0Gb/s  | 30C  | [32:2]   | 2       
c0u1p1  | HDD  | SEAGATE ST600MM0006 LS0AS0MM9VMK | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:3]   | 3       
c0u1p2  | HDD  | SEAGATE ST600MM0006 LS0AS0MPU9UD | 558.3 Gb | Offline         | 6.0Gb/s  | 30C  | [32:4]   | 4       
c0u1p3  | HDD  | SEAGATE ST600MM0006 LS0AS0M0DYA3 | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:5]   | 5       
c0u1p4  | HDD  | SEAGATE ST600MM0006 LS0AS0MPE83S | 558.3 Gb | Offline         | 6.0Gb/s  | 30C  | [32:6]   | 6       
c0u1p5  | HDD  | SEAGATE ST600MM0006 LS0AS0MCS44B | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 32C  | [32:7]   | 7       
c0u1p6  | HDD  | SEAGATE ST600MM0006 LS0AS0MV4CDQ | 558.3 Gb | Offline         | 6.0Gb/s  | 33C  | [32:8]   | 8       
c0u1p7  | HDD  | SEAGATE ST600MM0006 LS0AS0MKZITD | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:9]   | 9       
c0u1p8  | HDD  | SEAGATE ST600MM0006 LS0AS0MG5JK5 | 558.3 Gb | Offline         | 6.0Gb/s  | 30C  | [32:10]  | 10      
c0u1p9  | HDD  | SEAGATE ST600MM0006 LS0AS0MNQKGJ | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:11]  | 11      


    • Second LD (12 disks)

Let’s take some drives offline:
$ sudo MegaCli64 -PDOffline -PhysDrv '[32:13]' -a0
$ sudo MegaCli64 -PDOffline -PhysDrv '[32:15]' -a0
$ sudo MegaCli64 -PDOffline -PhysDrv '[32:17]' -a0
$ sudo MegaCli64 -PDOffline -PhysDrv '[32:19]' -a0
$ sudo MegaCli64 -PDOffline -PhysDrv '[32:21]' -a0
$ sudo MegaCli64 -PDOffline -PhysDrv '[32:24]' -a0

That second LD is also still online:
$ sudo ~/bin/megaclisas-status|egrep '(Status|c0u2)'
-- ID | Type    |    Size |  Strpsz |   Flags | DskCache |  Status |  OS Path | InProgress   
c0u2  | RAID-10 |   3271G |   64 KB | ADRA,WB |  Default | Degraded | /dev/sdc | None         
-- ID   | Type | Drive Model                      | Size     | Status          | Speed    | Temp | Slot ID  | LSI Device ID
c0u2p0  | HDD  | SEAGATE ST600MM0006 LS0AS0M8U2J0 | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:12]  | 12      
c0u2p1  | HDD  | SEAGATE ST600MM0006 LS0AS0MR90JD | 558.3 Gb | Offline         | 6.0Gb/s  | 30C  | [32:13]  | 13      
c0u2p2  | HDD  | SEAGATE ST600MM0006 LS0AS0MEHKQG | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 30C  | [32:14]  | 14      
c0u2p3  | HDD  | SEAGATE ST600MM0006 LS0AS0MD8GJM | 558.3 Gb | Offline         | 6.0Gb/s  | 32C  | [32:15]  | 15      
c0u2p4  | HDD  | SEAGATE ST600MM0006 LS0AS0MIQWWY | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 32C  | [32:16]  | 16      
c0u2p5  | HDD  | SEAGATE ST600MM0006 LS0AS0MPL5QR | 558.3 Gb | Offline         | 6.0Gb/s  | 29C  | [32:17]  | 17      
c0u2p0  | HDD  | SEAGATE ST600MM0006 LS0AS0MJ59BJ | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:18]  | 18      
c0u2p1  | HDD  | SEAGATE ST600MM0006 LS0AS0MB5F8B | 558.3 Gb | Offline         | 6.0Gb/s  | 29C  | [32:19]  | 19      
c0u2p2  | HDD  | SEAGATE ST600MM0006 LS0AS0MLCFI8 | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:20]  | 20      
c0u2p3  | HDD  | SEAGATE ST600MM0006 LS0AS0M7N594 | 558.3 Gb | Offline         | 6.0Gb/s  | 30C  | [32:21]  | 21      
c0u2p4  | HDD  | SEAGATE ST600MM0006 LS0AS0MXA9G2 | 558.3 Gb | Online, Spun Up | 6.0Gb/s  | 29C  | [32:22]  | 22      
c0u2p5  | HDD  | SEAGATE ST600MM0006 LS0AS0MM3BLR | 558.3 Gb | Offline         | 6.0Gb/s  | 51C  | [32:24]  | 24      
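The two experiments above can be replayed against a toy model of each layout. This is only a sketch under assumptions that the controller output does not confirm: for RAID1+0, adjacent slots are treated as mirror pairs; for RAID0+1, the first and second half of the disks are treated as the two stripes.

```python
def raid10_survives(disks, failed):
    """RAID1+0: a stripe over mirror pairs (adjacent disks assumed paired)."""
    pairs = [disks[i:i + 2] for i in range(0, len(disks), 2)]
    # The LD stays up while every mirror pair keeps at least one member.
    return all(any(d not in failed for d in pair) for pair in pairs)


def raid01_survives(disks, failed):
    """RAID0+1: a mirror of two stripes (first half / second half assumed)."""
    half = len(disks) // 2
    stripes = [disks[:half], disks[half:]]
    # The LD stays up while at least one stripe has no failed disk at all.
    return any(all(d not in failed for d in stripe) for stripe in stripes)


# Slot numbers and offlined drives, as in the two experiments above.
ld1 = list(range(2, 12))
ld1_failed = {2, 4, 6, 8, 10}
ld2 = [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 24]
ld2_failed = {13, 15, 17, 19, 21, 24}

for name, disks, failed in (("LD1", ld1, ld1_failed), ("LD2", ld2, ld2_failed)):
    print(name,
          "RAID1+0 survives:", raid10_survives(disks, failed),
          "RAID0+1 survives:", raid01_survives(disks, failed))
```

Under these assumptions, the RAID1+0 model stays up with every other drive offline while the RAID0+1 model does not. In the RAID1+0 model any additional offline drive removes the last member of some pair, so `raid10_survives(ld1, ld1_failed | {3})` is false.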

In both cases, after taking one more drive offline, the associated LD went down.
In any case, it doesn't look like RAID0+1, and it does seem that both LD's are RAID10.
The MegaCli output is very different between LD1 and LD2, and I am at a loss to explain what this means (if you know, please do tell :) ).

The modified version of megaclisas-status can be found here:
https://github.com/ElCoyote27/hwraid/blob/master/wrapper-scripts/megaclisas-status
