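ASM error reported, instance restarted automatically
Environment: AIX 6 + Oracle 11.2.0.3.0, two-node RAC
The alert log shows the following errors: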
Fri Apr 18 07:56:00 2014
Time drift detected. Please check VKTM trace file for more details.
Fri Apr 18 07:59:25 2014
Thread 1 advanced to log sequence 713 (LGWR switch)
Current log# 17 seq# 713 mem# 0: +DATA/zhbdbst/onlinelog/group_17.323.844214311
Fri Apr 18 07:59:28 2014
Archived Log entry 3652 added for thread 1 sequence 712 ID 0xffffffffca3f0a9c dest 1:
Fri Apr 18 08:00:58 2014
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x595959592D4D4D2D] [PC:0x10123F4FC, kglic0()+828] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/trace/zhbdb1_m000_12649526.trc (incident=957242):
ORA-07445: exception encountered: core dump [kglic0()+828] [SIGSEGV] [ADDR:0x595959592D4D4D2D] [PC:0x10123F4FC] [Address not mapped to object] []
Incident details in: /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/incident/incdir_957242/zhbdb1_m000_12649526_i957242.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Apr 18 08:01:01 2014
Dumping diagnostic data in directory=[cdmp_20140418080101], requested by (instance=1, osid=12649526 (M000)), summary=[incident=957242].
Fri Apr 18 08:02:21 2014
LMS3 (ospid: 39715556) waits for latch 'row cache objects' for 81 secs.
LMS4 (ospid: 39256696) waits for latch 'row cache objects' for 88 secs.
Fri Apr 18 08:02:36 2014
Errors in file /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/trace/zhbdb1_lmhb_44171576.trc (incident=944210):
ORA-29771: process USER (OSID 20250796) blocks LMS3 (OSID 39715556) for more than 70 seconds
Incident details in: /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/incident/incdir_944210/zhbdb1_lmhb_44171576_i944210.trc
USER (ospid: 20250796) is blocking LMS3 (ospid: 39715556) in a wait
LMHB (ospid: 44171576) kills USER (ospid: 20250796).
Please check LMHB trace file for more detail.
Errors in file /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/trace/zhbdb1_lmhb_44171576.trc (incident=944211):
ORA-29771: process USER (OSID 20250796) blocks LMS4 (OSID 39256696) for more than 70 seconds
Incident details in: /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/incident/incdir_944211/zhbdb1_lmhb_44171576_i944211.trc
USER (ospid: 20250796) is blocking LMS4 (ospid: 39256696) in a wait
LMHB (ospid: 44171576) kills USER (ospid: 20250796).
Please check LMHB trace file for more detail.
Fri Apr 18 08:02:54 2014
LCK0 (ospid: 41157420) waits for latch 'shared pool' for 83 secs.
Fri Apr 18 08:02:59 2014
Errors in file /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/trace/zhbdb1_cjq0_20316518.trc (incident=945906):
ORA-00445: background process "J001" did not start after 120 seconds
Incident details in: /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/incident/incdir_945906/zhbdb1_cjq0_20316518_i945906.trc
Errors in file /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/trace/zhbdb1_lmhb_44171576.trc (incident=944212):
ORA-29771: process USER (OSID 9044888) blocks LCK0 (OSID 41157420) for more than 70 seconds
Incident details in: /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/incident/incdir_944212/zhbdb1_lmhb_44171576_i944212.trc
Fri Apr 18 08:03:02 2014
PMON failed to acquire latch, see PMON dump
kkjcre1p: unable to spawn jobq slave process
Errors in file /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/trace/zhbdb1_cjq0_20316518.trc:
Fri Apr 18 08:03:05 2014
USER (ospid: 9044888) is blocking LCK0 (ospid: 41157420) in a wait
LMHB (ospid: 44171576) kills USER (ospid: 9044888).
Please check LMHB trace file for more detail.
Fri Apr 18 08:04:03 2014
PMON failed to acquire latch, see PMON dump
Fri Apr 18 16:04:57 2014
WARNING: ASM communication error: op 18 state 0x50 (3113)
ERROR: slave communication error with ASM
NOTE: Deferred communication with ASM instance
Errors in file /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/trace/zhbdb1_ora_2622168.trc:
ORA-03113: end-of-file on communication channel
Process ID:
Session ID: 1024 Serial number: 57
NOTE: deferred map free for map id 4276
Fri Apr 18 08:05:33 2014
PMON failed to acquire latch, see PMON dump
Fri Apr 18 08:06:33 2014
PMON failed to acquire latch, see PMON dump
Fri Apr 18 08:08:03 2014
PMON failed to acquire latch, see PMON dump
Fri Apr 18 08:09:03 2014
PMON failed to acquire latch, see PMON dump
Fri Apr 18 08:10:34 2014
PMON failed to acquire latch, see PMON dump
Fri Apr 18 08:11:34 2014
PMON failed to acquire latch, see PMON dump
Fri Apr 18 08:12:04 2014
PMON (ospid: 40501984): terminating the instance due to error 471
Fri Apr 18 08:12:04 2014
System state dump requested by (instance=1, osid=40501984 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/zhbdb/zhbdb1/trace/cdmp_20140418080101/zhbdb1_ora_23986638_bucket.trc
Fri Apr 18 08:12:07 2014
ORA-1092 : opitsk aborting process
Fri Apr 18 08:12:07 2014
ORA-1092 : opitsk aborting process
Fri Apr 18 08:12:07 2014
License high water mark = 1817
Instance terminated by PMON, pid = 40501984
USER (ospid: 40043070): terminating the instance
Instance terminated by USER, pid = 40043070
Fri Apr 18 08:12:31 2014
Adjusting the default value of parameter parallel_max_servers
from 5120 to 3485 due to the value of parameter processes (3500)
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Private Interface 'en1' configured from GPnP for use as a private interconnect.
[name='en1', type=1, ip=169.254.119.231, mac=32-13-77-83-34-0c, net=169.254.0.0/17, mask=255.255.128.0, use=haip:cluster_interconnect/62]
Private Interface 'en2' configured from GPnP for use as a private interconnect.
[name='en2', type=1, ip=169.254.145.101, mac=32-13-77-83-34-0d, net=169.254.128.0/17, mask=255.255.128.0, use=haip:cluster_interconnect/62]
Public Interface 'en0' configured from GPnP for use as a public interface.
[name='en0', type=1, ip=10.6.135.52, mac=32-13-77-83-34-0b, net=10.6.135.0/24, mask=255.255.255.0, use=public/1]
Public Interface 'en0' configured from GPnP for use as a public interface.
[name='en0', type=1, ip=10.6.135.54, mac=32-13-77-83-34-0b, net=10.6.135.0/24, mask=255.255.255.0, use=public/1]
Picked latch-free SCN scheme 3
Autotune of undo retention is turned on.
WARNING: The parameter cursor_sharing was found to be set
to the value SIMILAR. This setting will be ignored and
cursor sharing will operate as though the value was set
to FORCE instead.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options.
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
System name: AIX
Node name: ora7a
Release: 1
Version: 6
Machine: 00F89A424C00
Using parameter settings in server-side pfile /u01/app/oracle/product/11.2.0/db_1/dbs/initzhbdb1.ora
System parameters with non-default values:
processes = 3500
sessions = 5376
resource_limit = TRUE
sga_max_size = 150G
spfile = "+DATA/zhbdbst/spfile/spfilezhbdbst.ora"
sga_target = 150G
control_files = "+DATA/zhbdbst/datafile/control01.ctl"
control_files = "+DATA/zhbdbst/datafile/control02.ctl"
db_block_size = 8192
db_writer_processes = 12
compatible = "11.2.0.0.0"
log_archive_dest_1 = "location=+data/zhbdbst/arch"
log_archive_format = "%t_%s_%r.dbf"
log_archive_max_processes= 8
cluster_database = TRUE
thread = 1
undo_tablespace = "UNDOTBS1"
instance_number = 1
db_securefile = "PERMITTED"
remote_login_passwordfile= "EXCLUSIVE"
db_domain = ""
local_listener = "LISTENER_DB"
remote_listener = "LISTENER_SCAN"
job_queue_processes = 1000
cursor_sharing = "SIMILAR"
parallel_min_servers = 16
audit_file_dest = "/u01/app/oracle/admin/zhbdbst/adump"
audit_trail = "DB"
db_name = "zhbdb"
open_cursors = 5000
_serial_direct_read = "NEVER"
pga_aggregate_target = 50G
optimizer_dynamic_sampling= 0
diagnostic_dest = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
169.254.119.231
169.254.145.101
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Fri Apr 18 08:12:34 2014
PMON started with pid=2, OS id=39322322
Fri Apr 18 08:12:35 2014
PSP0 started with pid=3, OS id=42009508
Fri Apr 18 08:12:35 2014
VKTM started with pid=4, OS id=40043084 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Fri Apr 18 08:12:35 2014
GEN0 started with pid=5, OS id=40632430
Fri Apr 18 08:12:35 2014
DIAG started with pid=6, OS id=12059112
Fri Apr 18 08:12:35 2014
DBRM started with pid=7, OS id=42271554
Fri Apr 18 08:12:35 2014
PING started with pid=8, OS id=21955342
Fri Apr 18 08:12:36 2014
ACMS started with pid=9, OS id=24051936
Fri Apr 18 08:12:36 2014
DIA0 started with pid=10, OS id=8782382
Fri Apr 18 08:12:36 2014
LMON started with pid=11, OS id=25363808
Fri Apr 18 08:12:36 2014
SKGXP:[1106fccc8.1]{-}: WARNING: Failed to set buffer limit on IPC interconnect socket
SKGXP:[1106fccc8.1]{-}: Oracle requires that the socket receive buffer size be tunable up to 2048 KB.
Please make sure the kernel parameter which limits the receive socket space set by
applications (i.e. SO_RCVBUF) is at least that value.
SKGXP:[1106fccc8.2]{-}: WARNING: Failed to set buffer limit on IPC interconnect socket
SKGXP:[1106fccc8.2]{-}: Oracle requires that the socket receive buffer size be tunable up to 2048 KB.
Please make sure the kernel parameter which limits the receive socket space set by
applications (i.e. SO_RCVBUF) is at least that value.
LMD0 started with pid=12, OS id=12256248
* System load used for high load check
* New Low - High Load Threshold Range = [442368 - 589824]
Fri Apr 18 08:12:36 2014
LMS0 started with pid=13, OS id=38339380 at elevated priority
Fri Apr 18 08:12:36 2014
LMS1 started with pid=14, OS id=43712996 at elevated priority
Fri Apr 18 08:12:37 2014
LMS2 started with pid=15, OS id=36373990 at elevated priority
Fri Apr 18 08:12:37 2014
LMS3 started with pid=16, OS id=35325246 at elevated priority
Fri Apr 18 08:12:37 2014
LMS4 started with pid=17, OS id=23330966 at elevated priority
Fri Apr 18 08:12:37 2014
LMS5 started with pid=18, OS id=3670706 at elevated priority
Fri Apr 18 08:12:37 2014
RMS0 started with pid=19, OS id=35063260
Fri Apr 18 08:12:37 2014
LMHB started with pid=20, OS id=40632850
Fri Apr 18 08:12:37 2014
MMAN started with pid=21, OS id=31523910
Fri Apr 18 08:12:38 2014
DBW0 started with pid=22, OS id=35259670
Fri Apr 18 08:12:38 2014
DBW1 started with pid=23, OS id=22872316
Fri Apr 18 08:12:38 2014
DBW2 started with pid=24, OS id=44433872
Fri Apr 18 08:12:38 2014
DBW3 started with pid=25, OS id=18219812
Fri Apr 18 08:12:39 2014
DBW4 started with pid=26, OS id=40960120
Fri Apr 18 08:12:39 2014
DBW5 started with pid=27, OS id=29951152
Fri Apr 18 08:12:39 2014
DBW6 started with pid=28, OS id=43909516
Fri Apr 18 08:12:39 2014
DBW7 started with pid=29, OS id=40502006
Fri Apr 18 08:12:39 2014
DBW8 started with pid=30, OS id=33555566
Fri Apr 18 08:12:39 2014
DBW9 started with pid=31, OS id=38143116
Fri Apr 18 08:12:40 2014
DBWa started with pid=32, OS id=23789666
Fri Apr 18 08:12:40 2014
DBWb started with pid=33, OS id=4981588
Fri Apr 18 08:12:40 2014
LGWR started with pid=34, OS id=29295734
Fri Apr 18 08:12:40 2014
CKPT started with pid=35, OS id=40436332
Fri Apr 18 08:12:40 2014
SMON started with pid=36, OS id=22675456
Fri Apr 18 08:12:40 2014
RECO started with pid=37, OS id=21233910
Fri Apr 18 08:12:40 2014
RBAL started with pid=38, OS id=17105778
Fri Apr 18 08:12:40 2014
ASMB started with pid=39, OS id=34997578
Fri Apr 18 08:12:41 2014
MMON started with pid=40, OS id=40370712
Fri Apr 18 08:12:41 2014
MMNL started with pid=41, OS id=23199966
NOTE: initiating MARK startup
Starting background process MARK
lmon registered with NM - instance number 1 (internal mem no 0)
Fri Apr 18 08:12:41 2014
MARK started with pid=42, OS id=17236910
NOTE: MARK has subscribed
Fri Apr 18 08:12:44 2014
Sweep [inc][957242]: completed
Sweep [inc][945906]: completed
Sweep [inc][944212]: completed
Sweep [inc][944211]: completed
Sweep [inc][944210]: completed
Sweep [inc2][957242]: completed
Sweep [inc2][945906]: completed
Sweep [inc2][944212]: completed
Sweep [inc2][944211]: completed
Sweep [inc2][944210]: completed
Fri Apr 18 08:12:49 2014
Reconfiguration started (old inc 0, new inc 286)
List of instances:
1 2 (myinst: 1)
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
Fri Apr 18 08:12:52 2014
* domain 0 not valid according to instance 2
* domain 0 valid = 0 according to instance 2
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Fri Apr 18 08:12:52 2014
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Fri Apr 18 08:12:52 2014
LMS 5: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Fri Apr 18 08:12:52 2014
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Fri Apr 18 08:12:52 2014
LMS 3: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Fri Apr 18 08:12:52 2014
LMS 4: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Fri Apr 18 08:12:52 2014
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Submitted all GCS remote-cache requests
Fri Apr 18 08:13:12 2014
Fix write in gcs resources
Reconfiguration complete
Fri Apr 18 08:13:14 2014
LCK0 started with pid=44, OS id=28968142
Fri Apr 18 08:13:14 2014
Starting background process RSMN
Fri Apr 18 08:13:14 2014
RSMN started with pid=45, OS id=40174208
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /u01/app/oracle
Fri Apr 18 08:13:17 2014
ALTER DATABASE MOUNT /* db agent *//* {0:9:12} */
Fri Apr 18 08:13:17 2014
NOTE: Loaded library: System
Fri Apr 18 08:13:17 2014
SUCCESS: diskgroup DATA was mounted
Fri Apr 18 08:13:18 2014
ERROR: failed to establish dependency between database zhbdb and diskgroup resource ora.DATA.dg
Fri Apr 18 08:13:32 2014
Successful mount of redo thread 1, with mount id 3393117013
Fri Apr 18 08:13:32 2014
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: ALTER DATABASE MOUNT /* db agent *//* {0:9:12} */
ALTER DATABASE OPEN /* db agent *//* {0:9:12} */
Picked broadcast on commit scheme to generate SCNs
LGWR: STARTING ARCH PROCESSES
Fri Apr 18 08:13:39 2014
ARC0 started with pid=64, OS id=37029184
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Fri Apr 18 08:13:41 2014
ARC1 started with pid=66, OS id=19529734
Fri Apr 18 08:13:41 2014
ARC2 started with pid=68, OS id=24839270
Thread 1 opened at log sequence 714
Current log# 25 seq# 714 mem# 0: +DATA/zhbdbst/onlinelog/group_25.312.844214197
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Apr 18 08:13:41 2014
SMON: enabling cache recovery
Fri Apr 18 08:13:41 2014
ARC3 started with pid=67, OS id=15664002
Fri Apr 18 08:13:41 2014
ARC4 started with pid=69, OS id=42467808
Fri Apr 18 08:13:41 2014
ARC5 started with pid=70, OS id=38208244
Fri Apr 18 08:13:41 2014
ARC6 started with pid=71, OS id=24183840
ARC1: Archival started
ARC2: Archival started
ARC3: Archival started
ARC4: Archival started
ARC5: Archival started
ARC6: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC3: Becoming the heartbeat ARCH
Fri Apr 18 08:13:41 2014
ARC7 started with pid=72, OS id=19071044
ARC7: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Archived Log entry 3654 added for thread 1 sequence 713 ID 0xffffffffca3f0a9c dest 1:
Fri Apr 18 08:13:44 2014
minact-scn: Inst 1 is a slave inc#:286 mmon proc-id:40370712 status:0x2
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000
Fri Apr 18 08:13:56 2014
[43975040] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:2042050521 end:2042064034 diff:13513 (135 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
Fri Apr 18 08:13:56 2014
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
No Resource Manager plan active
Starting background process GTX0
Fri Apr 18 08:13:59 2014
GTX0 started with pid=78, OS id=41550202
Starting background process RCBG
Fri Apr 18 08:13:59 2014
RCBG started with pid=79, OS id=42074370
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Fri Apr 18 08:14:00 2014
QMNC started with pid=81, OS id=41419162
Completed: ALTER DATABASE OPEN /* db agent *//* {0:9:12} */
Starting background process SMCO
Fri Apr 18 08:14:05 2014
SMCO started with pid=120, OS id=31916402
Fri Apr 18 08:14:13 2014
Auto-tuning: Starting background process GTX1
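To make the sequence in the log above easier to follow (ORA-07445 at 08:00:58, latch waits from 08:02, instance termination at 08:12), here is a minimal Python sketch that pulls ORA- codes and latch-wait messages out of an alert-log excerpt. It assumes the plain-text 11g layout seen in this post, where a timestamp line such as `Fri Apr 18 08:02:21 2014` precedes the messages it applies to. The function name `summarize_alert_log` is hypothetical and is not part of any Oracle tool.

```python
import re
from datetime import datetime

# Timestamp lines in this alert log look like "Fri Apr 18 08:02:21 2014".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$")
# The log prints both zero-padded (ORA-07445) and short (ORA-1092) codes.
ORA_RE = re.compile(r"\bORA-\d{1,5}\b")
# e.g. "LMS3 (ospid: 39715556) waits for latch 'row cache objects' for 81 secs."
LATCH_RE = re.compile(r"^(\w+) \(ospid: (\d+)\) waits for latch '([^']+)' for (\d+) secs")


def summarize_alert_log(text):
    """Return (ora_errors, latch_waits) found in an alert-log excerpt.

    ora_errors:  list of (timestamp, "ORA-nnnnn")
    latch_waits: list of (timestamp, process, latch_name, seconds)
    The timestamp is the most recent timestamp line seen, or None.
    """
    current_ts = None
    ora_errors, latch_waits = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if TS_RE.match(line):
            current_ts = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
            continue
        for code in ORA_RE.findall(line):
            ora_errors.append((current_ts, code))
        m = LATCH_RE.match(line)
        if m:
            latch_waits.append((current_ts, m.group(1), m.group(3), int(m.group(4))))
    return ora_errors, latch_waits


if __name__ == "__main__":
    sample = """Fri Apr 18 08:02:21 2014
LMS3 (ospid: 39715556) waits for latch 'row cache objects' for 81 secs.
LMS4 (ospid: 39256696) waits for latch 'row cache objects' for 88 secs.
"""
    for ts, proc, latch, secs in summarize_alert_log(sample)[1]:
        print(ts, proc, latch, secs)
```

Note that the timestamp parsing relies on English day and month names, so it assumes the script runs under the default C/English locale.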
Comments (4)
This is a bug: 13629056
Dumping diagnostic data in directory=[cdmp_20140418080101], requested by (instance=1, osid=12649526 (M000)), summary=[incident=957242].
Fri Apr 18 08:02:21 2014
LMS3 (ospid: 39715556) waits for latch 'row cache objects' for 81 secs.
LMS4 (ospid: 39256696) waits for latch 'row cache objects' for 88 secs.
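The processes were probably unable to acquire the latch, which caused them to fail. Please upload the systemstate dump file.
It seems to be pointing to a memory problem.
This looks to me like insufficient memory. Could you post the OS and database memory settings?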
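On the database side, part of that answer is already in the startup section of the alert log, which lists `sga_max_size = 150G`, `sga_target = 150G` and `pga_aggregate_target = 50G`. The following is a rough Python sketch, assuming parameter lines in the `name = value` form printed by the alert log, that converts those values to bytes so they can be compared with the physical memory of the host. The helper names `parse_size` and `configured_memory` are hypothetical. The OS memory is not shown anywhere in the post, so it is not filled in here.

```python
import re

# Size suffixes as they appear in the parameter listing (binary multiples).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(value):
    """Convert a value such as '150G' or '3500' to an integer (bytes if suffixed)."""
    m = re.fullmatch(r"(\d+)([KMGT]?)", value.strip().upper())
    if not m:
        raise ValueError(f"not a size: {value!r}")
    return int(m.group(1)) * UNITS.get(m.group(2), 1)


def configured_memory(param_lines, names=("sga_max_size", "pga_aggregate_target")):
    """Pick the named parameters out of 'name = value' lines and return sizes in bytes."""
    found = {}
    for line in param_lines:
        name, sep, value = line.partition("=")
        if sep and name.strip() in names:
            found[name.strip()] = parse_size(value)
    return found


if __name__ == "__main__":
    # Lines copied from the startup section of the alert log in this post.
    params = [
        "sga_max_size = 150G",
        "sga_target = 150G",
        "pga_aggregate_target = 50G",
    ]
    sizes = configured_memory(params)
    total_gib = sum(sizes.values()) / 1024**3
    print(sizes)
    print(f"SGA max + PGA target = {total_gib:.0f} GiB")
```

The sum is only a starting point: in 11.2 `pga_aggregate_target` is a target rather than a hard limit, so actual PGA usage can exceed it, and the figure says nothing about memory used by ASM, Grid Infrastructure, or the OS itself.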