Locating a Failed Disk in TrueNAS Scale with sas2ircu

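Scenario

A long-running ZFS pool had a disk fail, with SMART tests reporting a large number of errors. The drives were not labeled when they were installed, so with 12 bays there was a real risk of pulling the wrong disk and taking the whole pool down. The pool was also busy serving reads and writes and resilvering onto a new disk, so shutting down to pull drives and inspect them was not practical. The failed disk had to be located while the machine was running.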

Environment

Server: RH2288H V2

Disk controller (HBA): LSI SAS2308

Operating system: ESXi 8 with the SAS2308 passed through to the guest, TrueNAS-SCALE-22.02.4

Procedure

1. Log in to TrueNAS Scale over SSH

If `SAS2IRCU: MPTLib2 Error 1` appears at any point, it is usually a permissions problem: run the command with sudo or as root.

admin@truenas[/mnt]$ sas2ircu list
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.

SAS2IRCU: MPTLib2 Error 1
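
The sudo-or-root advice can be wrapped in a small helper so the same command works from either account. This is only a sketch: `sas2ircu_root` and the `SAS2IRCU_BIN` override are names made up for illustration, and it assumes `sudo` is available to the admin user.

```shell
# Run sas2ircu with root privileges.
# If we are already root, call it directly; otherwise go through sudo.
# SAS2IRCU_BIN is a hypothetical override for the binary name or path.
sas2ircu_root() {
    local bin="${SAS2IRCU_BIN:-sas2ircu}"
    if [ "$(id -u)" -eq 0 ]; then
        "$bin" "$@"
    else
        sudo "$bin" "$@"
    fi
}
```

For example, `sas2ircu_root list` behaves like `sas2ircu list` for root and like `sudo sas2ircu list` for any other user.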

2. Check that sas2ircu can see the controller

root@truenas[~]# sas2ircu list
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.


         Adapter      Vendor  Device                       SubSys  SubSys
 Index    Type          ID      ID    Pci Address          Ven ID  Dev ID
 -----  ------------  ------  ------  -----------------    ------  ------
   0     SAS2308_2     1000h    87h   00h:0bh:00h:00h      1000h   0087h
SAS2IRCU: Utility Completed Successfully.
root@truenas[~]# sas2ircu 0 display
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.

Read configuration has been initiated for controller 0
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
  Controller type                         : SAS2308_2
  BIOS version                            : 7.25.00.00
  Firmware version                        : 15.00.03.00
  Channel description                     : 1 Serial Attached SCSI
  Initiator ID                            : 0
  Maximum physical devices                : 255
  Concurrent commands supported           : 3072
  Slot                                    : 0
  Segment                                 : 0
  Bus                                     : 11
  Device                                  : 0
  Function                                : 0
  RAID Support                            : Yes
------------------------------------------------------------------------
IR Volume information
------------------------------------------------------------------------
------------------------------------------------------------------------
Physical device information
------------------------------------------------------------------------
(omitted)
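
To pick up the controller index in a script rather than by eye, the table printed by `sas2ircu list` can be filtered with awk. A minimal sketch, assuming the table layout shown above (numeric index in the first column); `controller_indexes` is a hypothetical helper name, not part of sas2ircu.

```shell
# Read `sas2ircu list` output on stdin and print one controller index per line.
# A data row is a line whose first field is a number and that has at least
# six fields; banner, header, and separator lines do not match.
controller_indexes() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 6 { print $1 }'
}
```

Used as `sas2ircu list | controller_indexes`, this should print the index that is then passed to `sas2ircu <index> display`.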

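3. In the TrueNAS Scale web UI, find the serial number of the failed disk (Storage -> Disks -> Serial). Use the serial number (Serial No), not the model (Model Number).

4. Find that disk's entry in the controller's physical device information. `grep -B 8` also prints the 8 lines before the matching Serial No line, which is enough to include Enclosure # and Slot #.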

root@truenas[~]# sas2ircu 0 display | grep -B 8 WCC4E3LJFF91
  Enclosure #                             : 2
  Slot #                                  : 5
  SAS Address                             : 500e004-a-aaaa-aa05
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 3815447/7814037167
  Manufacturer                            : ATA
  Model Number                            : WDC WD40PURX-64G
  Firmware Revision                       : 0A80
  Serial No                               : WDWCC4E3LJFF91

5. Read the Enclosure # and Slot # values from that output and join them as Enclosure:Slot to get the bay identifier. In this example it is 2:5.
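
Steps 4 and 5 can be combined so the bay identifier is printed directly. The sketch below assumes the `display` output uses the field labels seen in the example (`Enclosure #`, `Slot #`, `Serial No`) and that the serial from the web UI appears as a substring of the `Serial No` value; `find_bay` is a hypothetical helper name.

```shell
# Read `sas2ircu <index> display` output on stdin and print Enclosure:Slot
# for every device whose "Serial No" value contains the serial given as $1.
find_bay() {
    awk -v serial="$1" -F ':' '
        # Remember the most recent enclosure and slot numbers seen.
        $1 ~ /^[ \t]*Enclosure #[ \t]*$/ { gsub(/[ \t]/, "", $2); enc = $2 }
        $1 ~ /^[ \t]*Slot #[ \t]*$/      { gsub(/[ \t]/, "", $2); slot = $2 }
        # On a matching serial, emit the bay the device belongs to.
        $1 ~ /^[ \t]*Serial No[ \t]*$/ && index($2, serial) > 0 {
            print enc ":" slot
        }
    '
}
```

With the example disk, `sas2ircu 0 display | find_bay WCC4E3LJFF91` would be expected to print `2:5`. Matching on a substring mirrors the grep in step 4, so a short or ambiguous serial fragment can return more than one bay.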

6. Run the locate command with that bay identifier to turn on the bay's indicator LED

root@truenas[~]# sas2ircu 0 locate 2:5 on
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.

SAS2IRCU: LOCATE command completed successfully.
SAS2IRCU: Command LOCATE Completed Successfully.
SAS2IRCU: Utility Completed Successfully.

Turn the LED off

root@truenas[~]# sas2ircu 0 locate 2:5 off
LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.

SAS2IRCU: LOCATE command completed successfully.
SAS2IRCU: Command LOCATE Completed Successfully.
SAS2IRCU: Utility Completed Successfully.
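
Since the LED has to be switched off by hand afterwards, the on/off pair can be wrapped so it is not forgotten. This is a sketch under assumptions: `blink_bay` and its default of 60 seconds are invented for illustration, and it relies only on the `locate <Enclosure:Slot> on|off` form shown above.

```shell
# Turn the locate LED on for a bay, wait, then turn it off again.
# Usage: blink_bay <controller-index> <enclosure:slot> [seconds]
blink_bay() {
    local ctrl="$1" bay="$2" secs="${3:-60}"

    # Accept only digits around a single colon, e.g. 2:5.
    case "$bay" in
        *[!0-9:]* | *:*:* | :* | *:) bay="" ;;
        *:*) ;;
        *) bay="" ;;
    esac
    if [ -z "$bay" ]; then
        echo "bay must look like Enclosure:Slot, e.g. 2:5" >&2
        return 2
    fi

    sas2ircu "$ctrl" locate "$bay" on || return 1
    sleep "$secs"
    sas2ircu "$ctrl" locate "$bay" off
}
```

For example, `blink_bay 0 2:5 120` would keep the LED on bay 2:5 lit for two minutes, which leaves time to walk to the rack and check the chassis.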

7. While locate is on, the indicator for that bay on the chassis lights up (or blinks)

