
New OEM 12c/13c Agent Install Won’t Keep Running – Dies After a While

Many agents. Agents are usually hard to kill. Images copyright Warner Bros. and The Wachowski Brothers; used without permission as educational content.

OEM agents tend to use memory in proportion to the number of targets they have to keep track of on a particular host.  At another organization, we tended to spin up VMs for each instance environment, so at most, a particular agent might have a few hundred targets (especially on an e-Business Suite Applications Tier).  In those circumstances, the default Java memory settings are probably fine.

In this environment, we run our hosts to death, and this particular proof-of-concept host has 43 instances running on it, a mix of 10g, 11g, and 12c databases.

We are doing a fresh install of OEM 12.1.0.5.0 for our POC before setting up the 13c production OMS. After we deployed the agent to this particular database host, the agent would start up fine, run for about 20 minutes, and then abruptly die without warning.

It restarts fine, passes the usual tests (as long as they run before the 20 minutes or so are up), and then dies again.

AGENT_INST=/u01/app/oracle/agent12c/agent_inst

cd $AGENT_INST/bin

./emctl status agent

Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version          : 12.1.0.5.0
OMS Version            : 12.1.0.5.0
Protocol Version       : 12.1.0.1.0
Agent Home             : /u01/app/oracle/agent12c/agent_inst
Agent Log Directory    : /u01/app/oracle/agent12c/agent_inst/sysman/log
Agent Binaries         : /u01/app/oracle/agent12c/core/12.1.0.5.0
Agent Process ID       : 10598
Parent Process ID      : 10499
Agent URL              : https://itsrv33c.mydomain:3872/emd/main/
Local Agent URL in NAT : https://itsrv33c.mydomain:3872/emd/main/
Repository URL         : https://itsrv35g.mydomain:1159/empbs/upload
Started at             : 2016-11-09 09:57:05
Started by user        : oracle
Operating System       : HP-UX version B.11.31 (IA64W)
Last Reload            : (none)
Last successful upload                       : 2016-11-09 10:19:26
Last attempted upload                        : 2016-11-09 10:19:26
Total Megabytes of XML files uploaded so far : 0.2
Number of XML files pending upload           : 0
Size of XML files pending upload(MB)         : 0
Available disk space on upload filesystem    : 16.92%
Collection Status                            : Collections enabled
Heartbeat Status                             : Ok
Last attempted heartbeat to OMS              : 2016-11-09 10:19:53
Last successful heartbeat to OMS             : 2016-11-09 10:19:53
Next scheduled heartbeat to OMS              : 2016-11-09 10:20:53

---------------------------------------------------------------
Agent is Running and Ready

./emctl pingOMS

Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
EMD pingOMS completed successfully

$AGENT_INST/sysman/log/gcagent.log contains:

----- Wed Nov  9 09:39:43 2016::26900::Agent Launched with PID 27336 at time Wed Nov  9 09:39:43 2016 -----
----- Wed Nov  9 09:39:43 2016::27336::Time elapsed between Launch of Watchdog process and execing EMAgent is 34 secs -----
2016-11-09 09:39:44,287 [1:main] WARN - Missing filename for log handler 'wsm'
2016-11-09 09:39:44,302 [1:main] WARN - Missing filename for log handler 'opss'
2016-11-09 09:39:44,305 [1:main] WARN - Missing filename for log handler 'opsscfg'
Agent is going down due to an OutOfMemoryError
----- Wed Nov  9 09:40:06 2016::26900::Checking status of EMAgent : 27336 -----
----- Wed Nov  9 09:40:06 2016::26900::EMAgent exited at Wed Nov  9 09:40:06 2016 with return value 57. -----
----- Wed Nov  9 09:40:06 2016::26900::EMAgent will be restarted because of an Out of Memory Exception. -----
----- Wed Nov  9 09:40:06 2016::26900::writeAbnormalExitTimestampToAgntStmp: exitCause=OOM : restartRequired=1 -----
----- Wed Nov  9 09:40:06 2016::26900::Restarting EMAgent. -----

That means the agent is starting, then stopping, then restarting, then stopping again (aka “thrashing”).
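To get a feel for how often the agent is thrashing, the out-of-memory exits can be counted straight from the log. This is only a rough sketch: the function name `count_oom_exits` is made up for illustration, and it assumes the log contains the "Agent is going down due to an OutOfMemoryError" line exactly as shown above.

```shell
# Hypothetical helper: count the out-of-memory exits recorded in an agent log.
# Assumes the message text matches the log excerpt quoted in this post.
count_oom_exits() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # that number is 0, so "|| true" keeps the function's status clean.
    grep -c 'Agent is going down due to an OutOfMemoryError' "$1" || true
}
```

It could be called as `count_oom_exits "$AGENT_INST/sysman/log/gcagent.log"`; a number that keeps climbing between runs suggests the agent is still cycling.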

Look for the running agent processes at the OS level:

ps -ef | grep agent12c
oracle 26900     1  0 09:39:09 pts/0     0:00 /u01/app/oracle/agent12c/core/12.1.0.5.0/perl/bin/perl /u01/app/oracle/agent12c/core/12.1.0.5.0/bin/emwd.pl agent /u01/app/oracle/…
oracle 27665 26900  0 09:40:12 pts/0     1:01 /u01/app/oracle/agent12c/core/12.1.0.5.0/jdk/bin/IA64W/java -Xmx169M -XX:MaxPermSize=96M -server -Djava.security.egd=file:///de…

Oh, the agent JVM is set up with the default maximum heap of 169 MB (-Xmx169M).  Check My Oracle Support:

EM 12c: emctl start agent Fails ‘Fatal agent error: State Manager failed at Startup’ ‘restarted because of an Out of Memory Exception’ Reported in emagent.nohup /gcagent.log (Doc ID 1950490.1)
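On a host with many Java processes it can be handy to pull just the heap limit out of the process listing. The snippet below is a small sketch, not anything from the support note; `extract_xmx` is a hypothetical name, and it assumes the command line contains a single `-Xmx` option in the usual form (digits followed by an optional k, m, or g suffix).

```shell
# Hypothetical helper: read command lines on stdin and print the value of
# the -Xmx option (for example 169M) for every line that has one.
extract_xmx() {
    # -n suppresses default output; the trailing p prints only lines where
    # the substitution matched, so lines without -Xmx produce nothing.
    sed -n 's/.*-Xmx\([0-9][0-9]*[kKmMgG]\{0,1\}\).*/\1/p'
}
```

A possible use is `ps -ef | grep agent12c | extract_xmx`, which prints only the heap setting of the agent's Java process.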

The same setting also shows up in $AGENT_INST/sysman/log/gcagent.log:

----- Wed Nov  9 09:40:06 2016::26900::Auto tuning the agent at time Wed Nov  9 09:40:06 2016 -----
inMemoryLoggingSize=6291456
_SchedulePersistTimer=30
MaxThreads=10
agentJavaDefines=-Xmx169M -XX:MaxPermSize=96M
SchedulerRandomSpreadMins=5
UploadMaxNumberXML=5000
UploadMaxMegaBytesXML=50.0
Auto tuning was successful

Well, it’s trying. Per Doc ID 1950490.1 above, stop the agent:

$AGENT_INST/bin/emctl stop agent

Edit $AGENT_INST/sysman/config/emd.properties (this file contains the runtime parameters for the agent):

old entry:
agentJavaDefines=-Xmx169M -XX:MaxPermSize=96M

new entry:
agentJavaDefines=-Xmx512M -XX:MaxPermSize=96M

(You may tune these values up or down according to your environment requirements)
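If the same change has to be made on several hosts, the edit can be scripted. The following is a minimal sketch under a few assumptions: `set_agent_xmx` is a hypothetical helper name, the file has a single `agentJavaDefines=` line formatted like the one above, and the agent is stopped before the file is touched. It avoids `sed -i`, which is not part of POSIX and may be missing on some platforms.

```shell
# Hypothetical helper: set the -Xmx value on the agentJavaDefines line.
# Usage: set_agent_xmx FILE NEW_SIZE   (for example: set_agent_xmx emd.properties 512M)
set_agent_xmx() {
    file=$1
    size=$2
    # Keep a copy of the original so the change is easy to roll back.
    cp "$file" "$file.bak" || return 1
    # Only the agentJavaDefines line is touched; every other line passes
    # through unchanged. Output goes back over the original file.
    sed "/^agentJavaDefines=/s/-Xmx[0-9][0-9]*[kKmMgG]*/-Xmx$size/" "$file.bak" > "$file"
}
```

For example, `set_agent_xmx "$AGENT_INST/sysman/config/emd.properties" 512M` would apply the new entry shown above and leave the old file next to it as `emd.properties.bak`.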

Restart the agent:

$AGENT_INST/bin/emctl start agent

The agent now runs, and keeps running like the E-Bunny.

New feature in the 12.1.0.4.x OEM agents – Metrics Browser

Actual content being collected depends on what plugins are available on the agent.

https://(agenthostname):(port)/emd/browser/main

(agenthostname):(port) can be obtained from the "Agent URL" line of $AGENT_HOME/bin/emctl status agent
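The browser address can also be derived from the status output instead of being typed by hand. This is just a sketch: both function names are hypothetical, and it assumes the status output has an "Agent URL" line ending in /emd/main/, as in the emctl status agent output earlier on this page.

```shell
# Hypothetical helper: read "emctl status agent" output on stdin and print
# the value of the "Agent URL" line (the "Local Agent URL in NAT" line is
# skipped because it does not start with "Agent URL").
agent_url_from_status() {
    sed -n 's/^Agent URL *: *//p'
}

# Hypothetical helper: turn an agent URL into the Metric Browser URL by
# swapping the /emd/main/ suffix for /emd/browser/main.
metric_browser_url() {
    printf '%s\n' "$1" | sed 's|/emd/main/*$|/emd/browser/main|'
}
```

Something like `metric_browser_url "$($AGENT_HOME/bin/emctl status agent | agent_url_from_status)"` should then print the address to paste into a web browser.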


Metric Browser Login

The login form has the following fields:

Agent UserName or root
Password
PDP Type: None, Sudo, or PowerBroker
RunAs Username
Profile name (only applicable if PowerBroker)

Enter the user ID and password, and then submit the form.


Screenshot of OEM 12.1.0.4.0 Agent Metric Browser

EMAGENT 12.1.0.4.0

Health Meter Score: 100.0
Links: Schedule, Properties, Upload System, Top Target/Metric Cpu Reports, System State Dumps, Agent Key Performance Charts (Agent KPIs)

Target List

| TargetType | TargetName | BrokenCode | BrokenReason | Status | Version | Runtime Version | Blackout Status | Master | ScheduleStatus | HealthScore |
|---|---|---|---|---|---|---|---|---|---|---|
| Host | (hostname) | 0 | | MONITORED | 4.4 | 2 | false | true | OPERATIONAL | 99.6 |
| Oracle Concurrent Processing | (ORACLE_SID)-Core Managers for Concurrent Processing | 0 | | MONITORED | 12.03 | 2 | false | true | OPERATIONAL | |
| Custom Oracle Concurrent Program | (ORACLE_SID)-AUS_FNDGSCST | 0 | | MONITORED | 12.02 | 2 | false | true | OPERATIONAL | |
| Custom Oracle Concurrent Program | (ORACLE_SID)-(ORACLE_SID)_7BK_AGING | 0 | | MONITORED | 12.02 | 2 | false | true | OPERATIONAL | |
| Oracle E-Business Suite Custom Objects | (ORACLE_SID)-Oracle E-Business Suite Custom Objects Configuration | 0 | | MONITORED | 12.01 | 2 | false | true | OPERATIONAL | |
| Internal Concurrent Manager | (ORACLE_SID)-Internal Concurrent Manager | 0 | | MONITORED | 12.02 | 2 | false | true | OPERATIONAL | |
| Oracle E-Business Suite Node | (ORACLE_SID)-Infrastructure (ORACLE_SID)_(hostname)-Database Context | 0 | | MONITORED | 12.03 | 2 | false | true | OPERATIONAL | |
| Oracle E-Business Suite Patch Information | (ORACLE_SID)-Oracle E-Business Suite Patch Information Configuration | 0 | | MONITORED | 12.03 | 2 | false | true | OPERATIONAL | |
| Oracle E-Business Suite Workflow | (ORACLE_SID)-Workflow Infrastructure | 0 | | MONITORED | 12.02 | 2 | false | true | OPERATIONAL | |
| Oracle Workflow Agent Listener | (ORACLE_SID)-Oracle Workflow Agent Listener | 0 | | MONITORED | 12.01 | 2 | false | true | OPERATIONAL | |
| Oracle Workflow Background Engine | (ORACLE_SID)-Workflow Background Engine | 0 | | MONITORED | 12.01 | 2 | false | true | OPERATIONAL | |
| Oracle Workflow Notification Mailer | (ORACLE_SID)-Oracle Workflow Notification Mailer | 0 | | MONITORED | 12.01 | 2 | false | true | OPERATIONAL | |
| Database Instance | (ORACLE_SID) | 0 | | MONITORED | 5.3 | 2 | false | true | OPERATIONAL | 99.3 |
| Database Instance | (ORACLE_SID) | 0 | | MONITORED | 5.3 | 2 | false | true | OPERATIONAL | 100.0 |
| Oracle E-Business Suite | (ORACLE_SID)-Oracle E-Business Suite | 0 | | MONITORED | 12.03 | 2 | false | true | OPERATIONAL | |
| Agent | (hostname):(port) | 0 | | MONITORED | 12.4 | 6 | false | true | OPERATIONAL | |
| Agent proxy | (hostname):(port)_proxy | 0 | | MONITORED | 12.01 | 2 | false | true | OPERATIONAL | |
| Oracle Home | OraDb11g_home1_(hostname) | 0 | | MONITORED | 2.0 | 2 | false | true | OPERATIONAL | |
| Oracle Home | OraDb11g_home2_(hostname) | 0 | | MONITORED | 2.0 | 2 | false | true | OPERATIONAL | |
| Oracle Home | agent12c1_(hostname) | 0 | | MONITORED | 2.0 | 2 | false | true | OPERATIONAL | |
| Oracle Home | agent12c2_18_(hostname) | 0 | | MONITORED | 2.0 | 2 | false | true | OPERATIONAL | |
| Listener | LISTENER_(hostname) | 0 | | MONITORED | 2.7 | 2 | false | true | OPERATIONAL | |
| Listener | (ORACLE_SID)_(hostname) | 0 | | MONITORED | 2.7 | 2 | false | true | OPERATIONAL | |

Each row also ends with four links: Severities, Schedule, CollectionItems, and Target Events.

Timestamp = 2015-06-16T10:37:34.693-07:00

Logout https://(hostname):(port)/emd/browser/logout

Copyright (c) 1996, 2013, Oracle and/or its affiliates. All rights reserved

Each clickable link lets you see what a given metric is actually collecting, and what the current values being passed to the OEM OMS Repository look like.

This is referenced in the My Oracle Support document:

List Of All Metrics In Enterprise Manager 12c Cloud Control (Doc ID 1678449.1)

Existing OEM 12c Agent Fails Startup and Resecure on Hostname Change

I had an issue where a new database host (Oracle Enterprise Linux 5.9, 64-bit) had an incorrect hostname and alias at the time I installed the OEM 12c (12.1.0.3) Agent. As a result, the OMS repository targets were all named incorrectly, even though the Agent was secured and registered.

In the $AGENT_HOME/sysman/log/emagent.nohup log was the following:

--- EMState agent
----- Sat Mar 1 10:42:22 2014::10437::Auto tuning the agent at time Sat Mar 1 10:42:22 2014 -----
----- Sat Mar 1 10:42:23 2014::10437::Finished auto tuning the agent at time Sat Mar 1 10:42:23 2014 -----
----- Sat Mar 1 10:42:23 2014::10437::Launching the JVM with following options: -Xmx128M -server -Djava.security.egd=file:///dev/./urandom -Dsun.lang.ClassLoader.allowArraySyntax=true -XX:+UseLinuxPosixThreadCPUClocks -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+UseCompressedOops -----
----- Sat Mar 1 10:42:23 2014::10491::Time elapsed between Launch of Watchdog process and execing EMAgent is 2 secs -----
----- Sat Mar 1 10:42:23 2014::10437::Agent Launched with PID 10491 at time Sat Mar 1 10:42:23 2014 -----
2014-03-01 10:42:23,962 [1:main] WARN - Missing filename for log handler 'wsm'
2014-03-01 10:42:23,971 [1:main] WARN - Missing filename for log handler 'opss'
2014-03-01 10:42:23,972 [1:main] WARN - Missing filename for log handler 'opsscfg'
OMS decided to shutdown the agent because of the following reason sent from OMS: EM_PLUGIN_MISMATCH_AND_AGENT_NOT_YET_MANAGED
----- Sat Mar 1 10:42:37 2014::10437::Checking status of EMAgent : 10491 -----
----- Sat Mar 1 10:42:37 2014::10437::EMAgent exited at Sat Mar 1 10:42:37 2014 with return value 0. -----
----- Sat Mar 1 10:42:37 2014::10437::EMAgent was shutdown normally. -----
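When the nohup log is long, the reason code the OMS sent can be filtered out rather than hunted for by eye. The sketch below makes some assumptions: `oms_shutdown_reasons` is a made-up name, and it relies on the message wording being exactly what appears in the log above.

```shell
# Hypothetical helper: list each distinct shutdown reason the OMS sent,
# based on the message text seen in emagent.nohup in this post.
oms_shutdown_reasons() {
    # Strip the fixed prefix and print only the reason code, then collapse
    # repeats so a restart loop does not flood the output.
    sed -n 's/^OMS decided to shutdown the agent because of the following reason sent from OMS: *//p' "$1" | sort -u
}
```

Running `oms_shutdown_reasons "$AGENT_HOME/sysman/log/emagent.nohup"` against the log above would be expected to print EM_PLUGIN_MISMATCH_AND_AGENT_NOT_YET_MANAGED once, however many times the agent was shut down for that reason.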

./emctl secure agent

./emctl start agent

Both commands resulted in the same repeated failures.

I removed the target host and the agent from the OEM 12c OMS:

Targets -> Hosts -> (select host) -> [Remove], and then

Setup -> Manage Cloud Control -> Agents -> (click on Agent:(port)) -> Agent menu (upper-left dropdown) -> Target Setup -> Remove Target

I then re-deployed the agent using the faster method of silent Agent deployment (Bobby Curtis has this covered at http://dbasolved.com/2013/04/10/install-oem-agents-silently-in-any-environment ).

Everything was ready to proceed again.