
New OEM 12c/13c Agent Install Won’t Keep Running – Dies After a While

Agents are often hard to kill, usually. Images Copyright by Warner Bros., The Wachowski Brothers; used without permission as educational content.

OEM agents tend to consume memory in proportion to the number of targets they have to track on a particular host.  At another organization, we tended to spin up VMs for each instance environment, so at most a particular agent might have a few hundred targets (especially on an E-Business Suite applications tier).  In those circumstances, the default Java memory settings are probably fine.

In this environment, we run our hosts to death, and this particular proof-of-concept host has 43 instances running on it, a mix of 10g, 11g, and 12c databases.

We are doing a fresh install of OEM 12.1.0.5.0 for our POC before setting up the 13c production OMS. After we deployed the agent to this particular database host, the agent would start up fine, run for 20 minutes or so, and then abruptly die without warning.

It restarts fine, passes the usual checks (as long as you run them before the 20 minutes or so are up), and then dies again.

AGENT_INST=/u01/app/oracle/agent12c/agent_inst

cd $AGENT_INST/bin

./emctl status agent

Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version          : 12.1.0.5.0
OMS Version            : 12.1.0.5.0
Protocol Version       : 12.1.0.1.0
Agent Home             : /u01/app/oracle/agent12c/agent_inst
Agent Log Directory    : /u01/app/oracle/agent12c/agent_inst/sysman/log
Agent Binaries         : /u01/app/oracle/agent12c/core/12.1.0.5.0
Agent Process ID       : 10598
Parent Process ID      : 10499
Agent URL              : https://itsrv33c.mydomain:3872/emd/main/
Local Agent URL in NAT : https://itsrv33c.mydomain:3872/emd/main/
Repository URL         : https://itsrv35g.mydomain:1159/empbs/upload
Started at             : 2016-11-09 09:57:05
Started by user        : oracle
Operating System       : HP-UX version B.11.31 (IA64W)
Last Reload            : (none)
Last successful upload                       : 2016-11-09 10:19:26
Last attempted upload                        : 2016-11-09 10:19:26
Total Megabytes of XML files uploaded so far : 0.2
Number of XML files pending upload           : 0
Size of XML files pending upload(MB)         : 0
Available disk space on upload filesystem    : 16.92%
Collection Status                            : Collections enabled
Heartbeat Status                             : Ok
Last attempted heartbeat to OMS              : 2016-11-09 10:19:53
Last successful heartbeat to OMS             : 2016-11-09 10:19:53
Next scheduled heartbeat to OMS              : 2016-11-09 10:20:53

---------------------------------------------------------------
Agent is Running and Ready

./emctl pingOMS

Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
EMD pingOMS completed successfully

$AGENT_INST/sysman/log/gcagent.log contains

----- Wed Nov  9 09:39:43 2016::26900::Agent Launched with PID 27336 at time Wed Nov  9 09:39:43 2016 -----
----- Wed Nov  9 09:39:43 2016::27336::Time elapsed between Launch of Watchdog process and execing EMAgent is 34 secs -----
2016-11-09 09:39:44,287 [1:main] WARN - Missing filename for log handler 'wsm'
2016-11-09 09:39:44,302 [1:main] WARN - Missing filename for log handler 'opss'
2016-11-09 09:39:44,305 [1:main] WARN - Missing filename for log handler 'opsscfg'
Agent is going down due to an OutOfMemoryError
----- Wed Nov  9 09:40:06 2016::26900::Checking status of EMAgent : 27336 -----
----- Wed Nov  9 09:40:06 2016::26900::EMAgent exited at Wed Nov  9 09:40:06 2016 with return value 57. -----
----- Wed Nov  9 09:40:06 2016::26900::EMAgent will be restarted because of an Out of Memory Exception. -----
----- Wed Nov  9 09:40:06 2016::26900::writeAbnormalExitTimestampToAgntStmp: exitCause=OOM : restartRequired=1 -----
----- Wed Nov  9 09:40:06 2016::26900::Restarting EMAgent. -----
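If you want a quick read on how often this is happening, counting the watchdog's restart message is one way to do it. This is only a sketch: the function name count_oom_restarts is made up for illustration, and it assumes the message sits on a single line in the real log file (the wrapping above comes from the terminal paste).

```shell
#!/bin/sh
# Hypothetical helper: count OOM-triggered restarts recorded in an agent log.
# Usage: count_oom_restarts /path/to/logfile
count_oom_restarts() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the function's exit status clean.
    grep -c 'restarted because of an Out of Memory Exception' "$1" || true
}
```

Something like count_oom_restarts $AGENT_INST/sysman/log/gcagent.log would then print a single number; anything above zero means the watchdog has been reviving the agent after it ran out of heap.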

In other words, the agent starts, dies, gets restarted by its watchdog, and dies again (aka “thrashing”).

Look for the running agent processes at the OS level:

ps -ef | grep agent12c
oracle 26900     1  0 09:39:09 pts/0     0:00 /u01/app/oracle/agent12c/core/12.1.0.5.0/perl/bin/perl /u01/app/oracle/agent12c/core/12.1.0.5.0/bin/emwd.pl agent /u01/app/oracle/…
oracle 27665 26900  0 09:40:12 pts/0     1:01 /u01/app/oracle/agent12c/core/12.1.0.5.0/jdk/bin/IA64W/java -Xmx169M -XX:MaxPermSize=96M -server -Djava.security.egd=file:///de…
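The part of that listing that matters is the -Xmx flag on the java command line. If you have a lot of hosts to check, a small filter can pull it out. This is a rough sketch: extract_xmx is a made-up name, and it assumes the value is digits followed by an optional unit letter, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: print the -Xmx flag from each stdin line that has one.
extract_xmx() {
    # -n suppresses normal output; the trailing "p" prints only lines where
    # the substitution matched, reduced to the captured -Xmx token.
    sed -n 's/.*\(-Xmx[0-9][0-9]*[A-Za-z]*\).*/\1/p'
}
```

You would feed it the ps output, for example ps -ef | grep agent12c | extract_xmx. Lines without a heap flag, such as the perl watchdog process, produce no output.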

Oh, it’s running with the default maximum Java heap of 169MB (-Xmx169M).  Time to check My Oracle Support.

EM 12c: emctl start agent Fails ‘Fatal agent error: State Manager failed at Startup’ ‘restarted because of an Out of Memory Exception’ Reported in emagent.nohup /gcagent.log (Doc ID 1950490.1)

You can also confirm this setting in $AGENT_INST/sysman/log/gcagent.log:

----- Wed Nov  9 09:40:06 2016::26900::Auto tuning the agent at time Wed Nov  9 09:40:06 2016 -----
inMemoryLoggingSize=6291456
_SchedulePersistTimer=30
MaxThreads=10
agentJavaDefines=-Xmx169M -XX:MaxPermSize=96M
SchedulerRandomSpreadMins=5
UploadMaxNumberXML=5000
UploadMaxMegaBytesXML=50.0
Auto tuning was successful

Well, it’s trying.  Per Doc ID 1950490.1 above, the fix is to raise the heap size by hand:

Stop the agent.

$AGENT_INST/bin/emctl stop agent

Edit $AGENT_INST/sysman/config/emd.properties (this file contains the runtime parameters for the agent):

old entry:
agentJavaDefines=-Xmx169M -XX:MaxPermSize=96M

new entry:
agentJavaDefines=-Xmx512M -XX:MaxPermSize=96M

(You may tune these values up or down according to your environment requirements)
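If you would rather script the edit than open the file in vi, something along these lines should work. Treat it as a sketch: set_agent_xmx is a name I made up, it only touches the agentJavaDefines line, and it assumes the existing value looks like -Xmx169M (digits plus a unit letter). It keeps a .bak copy next to the file and avoids "sed -i", which not every sed supports.

```shell
#!/bin/sh
# Hypothetical helper: change the -Xmx value on the agentJavaDefines line.
# Usage: set_agent_xmx /path/to/emd.properties 512M
set_agent_xmx() {
    props=$1
    new_xmx=$2
    # Keep a backup copy; give up if it cannot be written.
    cp "$props" "$props.bak" || return 1
    # Read from the backup and write over the original, so the file keeps
    # its existing ownership and permissions. Only the agentJavaDefines
    # line is edited; every other line passes through unchanged.
    sed "/^agentJavaDefines=/s/-Xmx[0-9][0-9]*[A-Za-z]*/-Xmx$new_xmx/" \
        "$props.bak" > "$props"
}
```

With the agent stopped, set_agent_xmx $AGENT_INST/sysman/config/emd.properties 512M makes the same change as the manual edit above.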

Restart the agent:

$AGENT_INST/bin/emctl start agent

Agent runs, and keeps running like the E-Bunny.
