Syed Saad Ali

Oracle ACE Pro

Oracle Solution Architect

Oracle E-Business Suite

Oracle Cloud Infrastructure

Oracle Fusion Middleware

Oracle Database Administration

Oracle Weblogic Administration

Troubleshooting AD Controller

Fixing a “Failed” worker

When a job fails for the first time, it is deferred to the end of the phase and another job is assigned to that worker.

 

What if the job fails a second time?

  • If the run time of the job was less than 10 minutes, the job is deferred to the end of the phase again and another job is assigned to that worker.
  • If the run time of the job was 10 minutes or more, the job status is shown as “Failed”.

 

What if the job fails a third time?

If the job fails a third time, the job status is shown as “Failed”.
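
The retry rule above can be summarised as a small decision function. This is only an illustrative sketch of the behaviour described in this post, not code taken from the AD utilities; the function name job_outcome and its two arguments are hypothetical.

```shell
# Hypothetical helper that models the retry rule described above.
# $1 = number of times the job has failed so far (1, 2, 3, ...)
# $2 = run time of the job in whole minutes
job_outcome() {
  failures=$1
  runtime_min=$2
  if [ "$failures" -le 1 ]; then
    # First failure: always deferred to the end of the phase
    echo "deferred"
  elif [ "$failures" -eq 2 ] && [ "$runtime_min" -lt 10 ]; then
    # Second failure of a short job: deferred once more
    echo "deferred"
  else
    # Second failure of a long job, or any third failure
    echo "failed"
  fi
}

job_outcome 1 25   # first failure
job_outcome 2 4    # second failure, under 10 minutes
job_outcome 2 10   # second failure, 10 minutes or more
job_outcome 3 1    # third failure
```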

 

Review the worker log file at:

$APPL_TOP/admin/<SID>/log/adworkNNN.log

Example: adwork001.log is the log file for worker number 1.
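
As a convenience, the log path can be built from the worker number. The sketch below assumes the three-digit, zero-padded naming shown in the example above; the function name worker_log_path is hypothetical, and the SID is passed in as an argument rather than read from the environment.

```shell
# Hypothetical helper: build the log path for a given worker.
# $1 = SID (the <SID> part of the path), $2 = worker number without leading zeros
worker_log_path() {
  sid=$1
  worker=$2
  # %03d pads the worker number to three digits: 1 -> 001, 12 -> 012
  printf '%s/admin/%s/log/adwork%03d.log\n' "$APPL_TOP" "$sid" "$worker"
}
```

With the applications environment sourced, the last lines of the log for worker 1 could then be viewed with tail -n 50 "$(worker_log_path <SID> 1)", replacing <SID> with the real value.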

 

After fixing the error, start AD Controller (if it is not already running) and choose option 2, “Tell worker to restart a failed job”.

When prompted, specify the number of the worker to restart.

If all the workers have failed, type all to restart all of them.

Restarting a Failed Patch Process

During a patch process (or an adadmin process), if a job fails and cannot be restarted, the patch itself must be restarted.

Here are the steps for doing this in AD Controller:

  1. Tell worker to quit (for all workers). This shuts the workers down manually.
  2. Tell manager that a worker failed its job.
  3. Tell manager that a worker acknowledges quit. The manager stops, and AutoPatch restarts the patch.

PLEASE NOTE: When the patch restarts, all the information in the database about this session must be accurate.

 

How to determine whether a process is hanging

  1. Check the log file to see whether new information is still being written to it.
  2. Check whether the worker process is consuming CPU by issuing the following command (workerid is a placeholder for the text that identifies the worker process):

$ ps -eo pcpu,pid,user,args | grep workerid

  3. Check whether any child processes are consuming CPU by issuing the following command (replace <Parent Process> with the parent process ID):

$ ps -eo pcpu,pid,ppid,user,args | grep <Parent Process> | grep -v grep
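
The first check, whether the log is still being written, can also be done from the command line. The following is a minimal sketch, assuming a find that supports -mmin (GNU and BSD find do, although it is not part of POSIX); the function name log_activity and the 15-minute threshold in the usage note are arbitrary choices for illustration.

```shell
# Hypothetical helper: report whether a log file was modified recently.
# $1 = path to the log file, $2 = threshold in minutes
log_activity() {
  logfile=$1
  minutes=$2
  if [ ! -f "$logfile" ]; then
    echo "missing"
    return 1
  fi
  # find prints the file name only if it was modified less than $minutes ago
  if [ -n "$(find "$logfile" -mmin "-$minutes" 2>/dev/null)" ]; then
    echo "active"
  else
    echo "stale"
  fi
}
```

For example, log_activity "$APPL_TOP/admin/<SID>/log/adwork001.log" 15 prints stale if nothing has been written to the log of worker 1 in the last 15 minutes. A stale log alone does not prove a hang, since a long-running job may simply be quiet, so it should be combined with the CPU checks above.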

 

 

Restarting a Hanging Worker Process

1. Kill, at the OS level, the processes associated with the hanging worker:

$ kill -9 ProcessNumber

2. Fix the problem.

3. Restart the worker (or the job).
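
Before killing a hanging worker, it can help to list its direct child processes so that none of them is left behind. The sketch below is one possible way to do that with standard ps and awk; the function name children_of is hypothetical, and it only lists process IDs without killing anything.

```shell
# Hypothetical helper: print the PIDs of the direct children of a process.
# $1 = parent process ID
children_of() {
  parent=$1
  # "pid=" and "ppid=" suppress the header line; awk keeps rows whose PPID matches
  ps -e -o pid= -o ppid= | awk -v p="$parent" '$2 == p { print $1 }'
}
```

Running children_of with the PID of the worker prints one child PID per line. Each of them can then be reviewed with ps and, if it really belongs to the hanging job, stopped with kill as in step 1.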

 

 

Restart an AD utility after a Node Crash

a. Start AD Controller

b. Choose “4. Tell manager that a worker failed its job”

c. Choose “2. Tell worker to restart a failed job”

d. Restart the AD utility that was running when the node crashed

Shutting down the Manager

1. Start AD Controller

2. Choose “3. Tell worker to quit”

3. Verify that no worker processes are running

 

 
