Urgent: hdiskpower missing

Posted on 2022-08-11 18:49:33

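Environment: AIX 5.3 with HA 5.4. Two IBM P52A machines serve as the APP and DB hosts, each with its own resource group. Storage is an EMC disk array. The system runs critical business workloads.
Purpose: add a SCSI card to the HA two-node cluster (P52A machines) in order to attach an external tape drive.
Steps: First I failed the DB host over to APP: DB released its resource group and IP, and the APP host took them over. That succeeded. I then shut down the DB host to install the SCSI card. With DB powered off, the fibre cables, power cords, network cables, and heartbeat cables were all unplugged from it; after the SCSI card was installed they were reconnected, and none was plugged into the wrong position. I rebooted the DB host and attached the tape drive, which works normally. Starting the cluster software on the DB host then reported: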
ERROR: Cluster verification detected that some of the disks on the cluster
use both hdisks and device paths on different nodes. To ensure correct
device processing, please confgure all nodes to use either hdisk
or vpath devices for the following PVIDs:
WARNING: Application monitors are required for detecting application failures
in order for HACMP to recover from them. Application monitors are started
by HACMP when the resource group in which they participate is activated.
The following application(s), shown with their associated resource group,
do not have an application monitor configured:

  Application Server             Resource Group
  --------------------------------  ---------------------------------
app_srv                         app_rg
db_srv                         db_rg

COMMAND STATUS

Command: running    stdout: yes          stderr: no
After that, I checked the system error log:
# errpt -d H
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
0BA49C99 1122032408 T H scsi2       SCSI BUS ERROR
0BA49C99 1122032308 T H scsi2       SCSI BUS ERROR
0BA49C99 1122025008 T H scsi2       SCSI BUS ERROR
lspv shows that hdiskpower3 is missing (no PVID and no volume group):
# lspv
hdisk0       0008f2421ffe51c6                   rootvg       active
hdisk1       0008f24224048f3d                   rootvg       active
hdisk2       none                               None
hdisk3       none                               None
hdisk4       none                               None
hdisk5       none                               None
hdisk6       none                               None
hdisk7       none                               None
hdisk9       none                               None
hdisk10       none                               None
hdisk11       none                               None
hdisk12       none                               None
hdisk13       none                               None
hdisk14       none                               None
hdisk15       none                               None
hdisk17       none                               None
hdiskpower0    0008f28847e6dc34                   logvg
hdiskpower1    0008f28847e5a383                   oravg
hdiskpower2    0008f28847f149c6                   oravg
hdiskpower3    none                               None
hdisk18       none                               None
hdisk19       none                               None
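
To pick the affected pseudo device out of a long lspv listing, a small filter along these lines might help. This is only a sketch: the function name pseudo_without_pvid is made up for illustration, and it assumes the lspv column layout shown above (device name first, PVID second).

```shell
# Hypothetical helper (name is an assumption, not part of AIX or PowerPath).
# Reads `lspv` output on stdin and prints every hdiskpower pseudo device
# whose PVID column is "none".
pseudo_without_pvid() {
    awk '$1 ~ /^hdiskpower[0-9]+$/ && $2 == "none" { print $1 }'
}

# Intended use on the host:
#   lspv | pseudo_without_pvid
```

Plain hdiskN entries are skipped on purpose, since the native paths underneath PowerPath normally show no PVID of their own in this listing.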
Checking the array status shows path problems on hdiskpower3:
# powermt display dev=all
Pseudo name=hdiskpower2
CLARiiON ID=CK200050800542 [SMG4]
Logical device ID=60060160359414000C2F2F7155BEDC11 [LUN 10]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B    Array failover mode: 1
==============================================================================
---------------- Host --------------- - Stor - -- I/O Path -  -- Stats ---
###  HW Path             I/O Paths Interf. Mode State  Q-IOs Errors
==============================================================================
1 fscsi1                   hdisk11 SP A1    active  alive    0    0
1 fscsi1                   hdisk15 SP B1    active  alive    0    0
0 fscsi0                   hdisk3 SP A0    active  alive    0    0
0 fscsi0                   hdisk7 SP B0    active  alive    0    0

Pseudo name=hdiskpower3
CLARiiON ID=CK200050800542 [SMG4]
Logical device ID=600601603594140015015EAF59BEDC11 [LUN 20]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B    Array failover mode: 1
==============================================================================
---------------- Host --------------- - Stor - -- I/O Path -  -- Stats ---
###  HW Path             I/O Paths Interf. Mode State  Q-IOs Errors
==============================================================================
1 fscsi1                   hdisk12 SP A1    active  alive    0    0
1 fscsi1                   hdisk16 SP B1    active  dead    0    0
0 fscsi0                   hdisk18 SP B0    active  alive    0    0
1 fscsi1                   hdisk19 SP B1    active  alive    0    0
0 fscsi0                   hdisk4 SP A0    active  alive    0    0
0 fscsi0                   hdisk8 SP B0    active  dead    0    0
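
To list only the dead paths from this output, a filter like the one below could be used. It is a sketch under assumptions: the function name dead_paths is hypothetical, and the parsing relies on the exact `powermt display dev=all` layout pasted above (a "Pseudo name=" line per device, then path rows whose second field is the fscsi adapter and whose seventh field is the path state).

```shell
# Hypothetical helper (name is an assumption, not a PowerPath command).
# Reads `powermt display dev=all` output on stdin and prints one line per
# dead path: pseudo device, HW path, native hdisk, and SP port.
dead_paths() {
    awk '
        # Remember which pseudo device the following path rows belong to.
        /^Pseudo name=/ { sub(/^Pseudo name=/, ""); dev = $1; next }
        # Path rows: "<n> fscsiN hdiskM SP <port> <mode> <state> <q-ios> <errors>"
        $2 ~ /^fscsi[0-9]+$/ && $7 == "dead" { print dev, $2, $3, $4, $5 }
    '
}

# Intended use on the host:
#   powermt display dev=all | dead_paths
```

Applied to the output above, the two dead paths are hdisk16 (fscsi1, SP B1) and hdisk8 (fscsi0, SP B0), both under hdiskpower3. These are also the two hdisk numbers that do not appear in the lspv listing.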

Has anyone run into this situation before? Any help analyzing it would be much appreciated.
