
Using EMS on HP-UX

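Instructions for using EMS on HP-UX

Stopping sendmail and SNMP manually:

    /sbin/init.d/sendmail stop
    /sbin/init.d/SnmpMaster stop

To keep sendmail from starting at boot, edit /etc/rc.config.d/mailservs and change =1 to =0.
To keep SNMP from starting at boot, edit /etc/rc.config.d/SnmpMaster and change =1 to =0.

Stopping the rstatd service: first check with rpcinfo whether it is registered. If it is, comment out the rstat entry in /etc/inetd.conf.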
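The inetd.conf edit described above can be sketched in Python. This is only an illustration, assuming the rstat entry is an ordinary uncommented line that contains the text "rstat"; the function name comment_out_service and the sample inetd.conf lines are hypothetical and not taken from this document. The sketch works on text in memory and does not write to /etc/inetd.conf.

```python
def comment_out_service(conf_text: str, keyword: str = "rstat") -> str:
    """Return conf_text with every active line that mentions keyword commented out.

    Lines that are blank or already start with '#' are left untouched,
    so running the function twice gives the same result.
    """
    out = []
    for line in conf_text.splitlines():
        stripped = line.lstrip()
        if stripped and not stripped.startswith("#") and keyword in stripped:
            out.append("#" + line)
        else:
            out.append(line)
    trailing_newline = "\n" if conf_text.endswith("\n") else ""
    return "\n".join(out) + trailing_newline


if __name__ == "__main__":
    # Illustrative entries only; check the real file on the host.
    sample = (
        "ftp stream tcp nowait root /usr/lbin/ftpd ftpd -l\n"
        "rpc dgram udp wait root /usr/lib/netsvc/rstat/rpc.rstatd 100001 2-4 rpc.rstatd\n"
    )
    print(comment_out_service(sample))
```

On a real host it would be safer to keep a backup copy of the file and review the result before replacing it.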

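Then, as root, run /sbin/init.d/inetd to restart the inetd service.

1. Introduction to EMS

EMS (Event Monitoring Service) is a service integrated into HP-UX. It monitors the host hardware in real time and reports the monitoring information to system maintainers through a configured notification method. This helps operations staff detect host failures promptly and accurately, helps locate the fault, and improves host uptime.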

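EMS is managed through MRM (Monitoring Request Manager). MRM lets you set the scope of EMS monitoring, the condition that triggers an event alert, and the method used to deliver the alert.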

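To start MRM:

(1) Log in to the host as root.
(2) Run /etc/opt/resmon/lbin/monconfig.
(3) Configure monitoring from the Monitoring Request Manager (MRM) Main Menu.

From the MRM menu you can view, check, modify, delete, enable, and disable monitors.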

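The main menu looks like this:

    ============================================================================
    ===================     Event Monitoring Service     =======================
    ===================    Monitoring Request Manager    =======================
    ============================================================================

    EVENT MONITORING IS CURRENTLY ENABLED.
    EMS Version : A.04.10
    STM Version : C.46.15

    ============================================================================
    ==============     Monitoring Request Manager Main Menu     ================
    ============================================================================

    Note: Monitoring requests let you specify the events for monitors
    to report and the notification methods to use.

    Select:
    (S)how monitoring requests configured via monconfig
    (C)heck detailed monitoring status
    (L)ist descriptions of available monitors
    (A)dd a monitoring request
    (D)elete a monitoring request
    (M)odify an existing monitoring request
    (E)nable Monitoring
    (K)ill (disable) monitoring
    (H)elp
    (Q)uit
    Enter selection: [s]

The following example creates a custom monitoring request to show how MRM is configured:

(1) Log in to the system as root.
(2) Run /etc/opt/resmon/lbin/monconfig to enter the MRM main menu (shown above).
(3) Type a and press Enter. This selects (A)dd a monitoring request.
(4) The hardware modules available for monitoring are listed. Usually all of them are selected: type a and press Enter.
(5) Select the baseline event severity. 2) MINOR WARNING is recommended.
(6) Select the condition that triggers the alert: choose 4) >=.
(7) Select the notification method for event information: choose 6) EMAIL.
(8) Enter the recipient of the alert mail. Use whatever user name you need, for example monitor.
(9) Add a comment describing this monitoring request: choose (A)dd.
(10) At Client Configuration File, choose (C)lear.
(11) Save the configuration. You are then returned to the main menu.
(12) In the main menu, choose (S)how monitoring requests configured via monconfig to confirm that the new request exists.
(13) Back in the MRM main menu, choose (C)heck detailed monitoring status to view every active monitoring status. The list depends on the host configuration: EMS ignores hardware that is not present in the host, even if step 4 selected all hardware.
(14) Choose (E)nable Monitoring to turn the EMS service on.

Note: with the steps above, the new monitoring request watches all hardware modules (step 4) in real time, but only events whose severity is greater than or equal to Minor Warning (steps 5 and 6) are reported, by e-mail (step 7), to the user monitor (step 8).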
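The trigger rule configured in steps 5 and 6 can be illustrated with a small Python sketch. It is not EMS code. The values MINOR WARNING = 2 and CRITICAL = 5 appear in this document; the other severity names and numbers, the operator table beyond >=, and the name should_notify are assumptions made for the illustration.

```python
import operator
from enum import IntEnum


class Severity(IntEnum):
    # 2 and 5 appear in this document; 1, 3 and 4 are assumed.
    INFORMATION = 1
    MINOR_WARNING = 2
    MAJOR_WARNING = 3
    SERIOUS = 4
    CRITICAL = 5


# Only ">=" is confirmed by the menu walkthrough; the rest are assumed.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
    "!=": operator.ne,
}


def should_notify(event_severity, threshold=Severity.MINOR_WARNING, op=">="):
    """Return True when the event severity satisfies 'severity <op> threshold'."""
    return OPERATORS[op](int(event_severity), int(threshold))


if __name__ == "__main__":
    for level in Severity:
        print(level.name, should_notify(level))
```

With the default threshold and operator, an INFORMATION event is filtered out while MINOR WARNING and everything above it would trigger a notification, which matches the behaviour described in the note above.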

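2. Getting information from event mail

Event alert mail generated by EMS can be received over the internal network. No separate domain name server configuration is needed.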

EMS sends its mail to the target user defined beforehand (monitor in this example). The mail can be retrieved with a mail client on a PC, such as Outlook.

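Taking Outlook as an example: to receive event mail, create a new mail account in the client. The user name is the HP-UX user name specified in MRM, the password is that user's HP-UX password, and the POP3/SMTP server is the IP address of the monitored host. Set Outlook to check for new mail automatically at a regular interval so that event information from EMS arrives promptly.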
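The same retrieval can be scripted instead of using a mail client. The sketch below uses Python's standard poplib module and assumes the monitored host serves POP3 on port 110 and accepts the HP-UX user name and password, as in the Outlook setup above. The function names and the banner check are illustrative choices, not part of EMS.

```python
import poplib
from email import message_from_bytes

# Text that appears in the first line of an EMS notification body.
BANNER = "Event Monitoring Service Event Notification"


def is_ems_notification(raw_message: bytes) -> bool:
    """Return True if a raw single-part mail message looks like an EMS notification."""
    msg = message_from_bytes(raw_message)
    payload = msg.get_payload(decode=True) or b""  # None for multipart mail
    return BANNER in payload.decode("ascii", errors="replace")


def fetch_ems_mail(host: str, user: str, password: str, port: int = 110):
    """Download all messages from the POP3 mailbox and keep the EMS ones."""
    conn = poplib.POP3(host, port, timeout=30)
    try:
        conn.user(user)
        conn.pass_(password)
        _, listing, _ = conn.list()
        found = []
        for entry in listing:
            number = int(entry.split()[0])  # entry looks like b"1 1234"
            _, lines, _ = conn.retr(number)
            raw = b"\r\n".join(lines)
            if is_ems_notification(raw):
                found.append(raw)
        return found
    finally:
        conn.quit()
```

A call such as fetch_ems_mail("15.85.114.14", "monitor", password) would need the real host address and the password of the mail user. Plain POP3 sends the password unencrypted, so this only suits a trusted internal network.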

Notes:

(1) Because of HP-UX's own security mechanism, mail for the root user cannot be retrieved with a client program. Specify an ordinary user as the recipient of event mail in MRM. In this example a user named monitor was created for that purpose.
(2) Ports 110 and 109 (POP3 and POP2) should be open on the network.
(3) The user that receives event mail is an HP-UX user and can also log in to the host. Change this user's HP-UX password regularly, and update the password in Outlook to match.

The following example shows the content of an event alert mail generated by EMS. The fault was caused by deliberately pulling a hard disk out of a powered-on system. Text after "<--" is annotation and is not part of the mail.

    >------------ Event Monitoring Service Event Notification ------------<
    Notification Time: Wed Jun 8 23:26:18 2005            <-- time the event was triggered
    hpux1 sent Event Monitor notification information:    <-- shows the host name
    /storage/events/disks/default/0_0_1_1.15.0 is >= 2.   <-- hardware module and trigger
    Its current value is CRITICAL(5).                     <-- severity of the event

    User Comments:
    Just a test:)

    Event data from monitor:
    Event Time..........: Wed Jun 8 23:26:16 2005
    Severity............: CRITICAL
    Monitor.............: disk_em
    Event #.............: 101
    System..............: hpux1

    Summary:                                              <-- event summary
    Disk at hardware path 0/0/1/1.15.0 : Device removed from monitoring

    Description of Error:                                 <-- fault description
    The device has been removed from the list of devices being monitored
    by this monitor.

    Probable Cause / Recommended Action:                  <-- probable cause / recommended action
    The device was removed from the system, has stopped responding to the
    system or it has been replaced with a device that is not supported by
    this monitor.
    Run ioscan to determine the state and type of the device.
    Check the /var/stm/data/os_decode_xref for the information indicating
    which devices are supported by this monitor.
    Check other monitors to determine if they are now monitoring the
    device by running /etc/opt/resmon/lbin/monconfig and using the
    "Check monitoring" command.

    Additional Event Data:
    System IP Address...: 15.85.114.14                    <-- host IP address
    Event Id............: 0x42a70e1800000000
    Monitor Version.....: B.01.01
    Event Class.........: I/O                             <-- event class
    Client Configuration File...........:
    /var/stm/config/tools/monitor/default_disk_em.clcfg
    Client Configuration File Version...: A.01.00
    Qualification criteria met.
    Number of events..: 1
    Associated OS error log entry id(s):
    None

    Additional System Data:
    System Model Number.............: 9000/800/A500-44    <-- host model number
    OS Version......................: B.11.11             <-- operating system version
    STM Version.....................: A.45.00
    EMS Version.....................: A.04.00
    Latest information on this event:
    /hpux/content/hardware/ems/disk_em.htm#101

    v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v

    Component Data:
    Physical Device Path...: 0/0/1/1.15.0                 <-- physical path of the failed device
    Device Class...........: Disk                         <-- device type
    Inquiry Vendor ID......: SEAGATE                      <-- device manufacturer
    Inquiry Product ID.....: ST34572WC                    <-- product ID
    Firmware Version.......: HP03                         <-- firmware version
    Serial Number..........: JKJ118650QPJCX               <-- serial number of the failed part
    >---------- End Event Monitoring Service Event Notification ----------<

The event mail shows the time of the failure, the host name, the event severity, the physical path of the failed disk, the disk's product ID, the recommended checks, the host model, the operating system version, and other details that help detect and troubleshoot host hardware faults.
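The labelled fields of such a notification can be pulled out programmatically. The following Python sketch assumes the "Label.....: value" layout seen in the sample mail and skips labels whose value is on the next line (for example Summary:). The regular expression and the name parse_notification are illustrative, and other monitors may format their mail differently.

```python
import re

# A label made of letters, spaces, '#' or '/', optional dot padding, a colon,
# then a value on the same line. [ \t]* keeps the match from spilling onto
# the next line when a label has no value of its own.
FIELD_RE = re.compile(r"^([A-Za-z#/ ]+?)\.*:[ \t]*(\S.*)$", re.MULTILINE)


def parse_notification(text: str) -> dict:
    """Return a dict of label -> value for every single-line field in the mail."""
    fields = {}
    for match in FIELD_RE.finditer(text):
        key = match.group(1).strip()
        value = match.group(2).strip()
        fields.setdefault(key, value)  # keep the first occurrence of a label
    return fields


if __name__ == "__main__":
    excerpt = """Notification Time: Wed Jun 8 23:26:18 2005
Event Time..........: Wed Jun 8 23:26:16 2005
Severity............: CRITICAL
Monitor.............: disk_em
Event #.............: 101
System..............: hpux1
Summary:
Disk at hardware path 0/0/1/1.15.0 : Device removed from monitoring
Physical Device Path...: 0/0/1/1.15.0
"""
    for key, value in parse_notification(excerpt).items():
        print(f"{key} = {value}")
```

A script built on this could, for example, forward only the Severity, System and Physical Device Path values to an on-call channel.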
