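The RAC test environment was running short of space, and a failure occurred while I was adding new disk space to ASM.
The steps were roughly as follows. I started dbca on node 1 to manage the ASM devices. Some of the configured raw devices were not visible in the ASM graphical interface, so on node 1 I used the root user to grant the oracle user access to those raw devices.
After that, the raw devices showed up among the candidate disks in the graphical interface, and I added them to the disk group through that interface.
The operation, however, returned two errors: ORA-15032 and ORA-15075.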
ORA-15032: not all alterations performed
Cause: At least one ALTER DISKGROUP action failed.
Action: Check the other messages issued along with this summary error.
ORA-15075: disk(s) are not visible cluster-wide
Cause: An ALTER DISKGROUP ADD DISK command specified a disk that could not be discovered by one or more nodes in a RAC cluster configuration.
Action: Determine which disks are causing the problem from the GV$OSM_DISK fixed view. Check operating system permissions for the device and the storage sub-system configuration on each node in a RAC cluster that cannot identify the disk.
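The Action text above suggests using a fixed view to find the disks that are not visible on every node. A minimal sketch of that check follows, written as a small shell function that prints a query for SQL*Plus. It makes several assumptions: the view is reachable as GV$ASM_DISK (the message text quoted above says GV$OSM_DISK, but as far as I know the view itself carries the ASM name), each device has the same path on every node, and the function name asm_disk_visibility_sql is purely illustrative.

```shell
#!/bin/sh
# Print a query that counts, for each device path, how many ASM
# instances in the cluster have discovered it.
# Assumption: the view name is GV$ASM_DISK and paths match across nodes.
asm_disk_visibility_sql() {
  # Quoted heredoc delimiter so the shell does not expand "$asm_disk".
  cat <<'EOF'
SET LINESIZE 200 PAGESIZE 100
COLUMN path FORMAT A40
SELECT path,
       COUNT(DISTINCT inst_id) AS instances_seeing_disk
FROM   gv$asm_disk
GROUP  BY path
ORDER  BY path;
EOF
}

# Usage (assumption: ORACLE_SID points at the local ASM instance):
#   asm_disk_visibility_sql | sqlplus -S "/ as sysdba"
```

A path whose count is lower than the number of nodes in the cluster is a candidate for the disk that ORA-15075 is complaining about.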
The message in ORA-15075 is actually clear enough. With some experience, or simply by analyzing this error, the cause of the problem can be found.
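The ORA-15075 Action text also points at operating system permissions for the device on each node. One way to compare them is sketched below, assuming GNU stat is available. The helper name show_dev_perms and the device paths in the usage comment are hypothetical, not taken from this environment.

```shell
#!/bin/sh
# Print "owner:group mode path" for every device path given,
# or a "missing" line when the path does not exist on this node.
# Assumption: GNU coreutils stat (the -c format option).
show_dev_perms() {
  for dev in "$@"; do
    if [ -e "$dev" ]; then
      stat -c '%U:%G %a %n' "$dev"
    else
      echo "missing $dev"
    fi
  done
}

# Usage, run on each node in turn (hypothetical device names):
#   show_dev_perms /dev/raw/raw12 /dev/raw/raw13
```

Running the same command on every node and comparing the output shows whether a device is owned by oracle on one node but not on another.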
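However, another unexpected event changed the direction of the troubleshooting.
One odd thing was that, although I believed the operation had failed, these raw devices were already visible in the ASM configuration in dbca.
While I was still checking the two error messages, a colleague told me that the instance on node 2 could no longer be reached.
Operating system commands showed that database instance 2 had shut down, while the ASM instance on node 2 was still running. This looked strange: the error came from an operation on ASM, yet the ASM instance was fine and it was the database instance that went down.
I checked the alert file and tried restarting the system to see what errors were reported: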
$ tail -500 alert*
List of nodes:
.
.
.
Thu Mar 29 17:10:24 2007
SUCCESS: disk DISK_0012 (12.4042303515) added to diskgroup DISK
SUCCESS: disk DISK_0013 (13.4042303516) added to diskgroup DISK
SUCCESS: disk DISK_0014 (14.4042303517) added to diskgroup DISK
SUCCESS: disk DISK_0015 (15.4042303518) added to diskgroup DISK
SUCCESS: disk DISK_0016 (16.4042303519) added to diskgroup DISK
Thu Mar 29 17:25:36 2007
SUCCESS: disk DISK_0017 (17.4042303525) added to diskgroup DISK
SUCCESS: disk DISK_0018 (18.4042303520) added to diskgroup DISK
SUCCESS: disk DISK_0019 (19.4042303521) added to diskgroup DISK
SUCCESS: disk DISK_0020 (20.4042303522) added to diskgroup DISK
SUCCESS: disk DISK_0021 (21.4042303523) added to diskgroup DISK
SUCCESS: disk DISK_0022 (22.4042303524) added to diskgroup DISK
Thu Mar 29 17:29:45 2007
SUCCESS: diskgroup DISK was dismounted
SUCCESS: diskgroup DISK was dismounted
Thu Mar 29 17:29:46 2007
Errors in file /data/oracle/admin/testrac/bdump/testrac2_lmon_2789.trc:
ORA-00202: control file: '+DISK/testrac/control01.ctl'
ORA-15078: ASM diskgroup was forcibly dismounted
Thu Mar 29 17:29:46 2007
Errors in file /data/oracle/admin/testrac/bdump/testrac2_lmon_2789.trc:
ORA-00204: error in reading (block 35, # blocks 1) of control file
ORA-00202: control file: '+DISK/testrac/control01.ctl'
ORA-15078: ASM diskgroup was forcibly dismounted
Thu Mar 29 17:29:46 2007
LMON: terminating instance due to error 204
Thu Mar 29 17:29:46 2007
Errors in file /data/oracle/admin/testrac/bdump/testrac2_pmon_2754.trc:
ORA-00204: error in reading (block , # blocks ) of control file
Thu Mar 29 17:29:46 2007
System state dump is made for local instance
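When the alert log is long, a quick tally of the ORA- codes helps separate the root error from the ones that follow it. The function below is only a sketch for that purpose. The name ora_error_summary is made up for illustration, and it assumes a grep that supports the -o option.

```shell
#!/bin/sh
# Count each ORA-nnnnn error code found in the given alert log file(s),
# printing the most frequent codes first.
# Assumption: grep supports -o (print only the matching part).
ora_error_summary() {
  grep -ohE 'ORA-[0-9]{5}' "$@" | sort | uniq -c | sort -rn
}

# Usage:
#   ora_error_summary alert*
```

The tally only shows which codes occur and how often. The order of events still has to be read from the timestamps in the log itself.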