Oracle AD Controller

April 4, 2017 APPS DBA, Errors/Workarounds-Applications by Syed Saad Ali

AD Controller (adctrl) is an AD utility used to monitor and control the execution of the workers.

Running AD Controller

Step 1: Log in as the Applications Tier user and source the environment file. The environment file is located in the APPL_TOP directory.

$ cd /prod/ebs/apps/apps_st/appl

$ cd $APPL_TOP

$ . ./APPSebs_example.env
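Before starting adctrl it can help to confirm that the environment file really was sourced. The following is a minimal sketch, not part of the AD utilities: `require_env` is a hypothetical helper name, and it only checks that a variable such as APPL_TOP is non-empty.

```shell
# Hypothetical helper (not an AD utility): fail early if a required
# environment variable is empty or unset.
require_env() {
    var_name=$1
    # Indirectly read the variable whose name was passed in.
    eval "value=\${$var_name:-}"
    if [ -z "$value" ]; then
        echo "$var_name is not set; source the environment file first" >&2
        return 1
    fi
    echo "$var_name=$value"
}
```

For example, `require_env APPL_TOP && adctrl` would start AD Controller only when APPL_TOP is set.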

Step 2 : Run the following AD controller command.

$ adctrl

Note: You will be prompted for the APPL_TOP location, the APPLSYS username, and the APPS password. After you provide this information, the AD Controller menu appears as shown below.

AD Controller Menu

----------------------------------------------------

1. Show worker status
2. Tell worker to restart a failed job
3. Tell worker to quit
4. Tell manager that a worker failed its job
5. Tell manager that a worker acknowledges quit
6. Restart a worker on the current machine
7. Exit

Enter your choice [1] :

Checking the status of the workers

After adctrl has started, choose the first option, “Show worker status”.

Please note: If no AD session is currently using the workers, the following message will appear:

Error: The FND_INSTALL_PROCESSES table does not exist.
This table is used for communication with the
worker processes, and if it does not exist, it
means that the workers are not running, because
the ad utility has not started them yet.

You should check the file
adctrl.log
for errors.

This happens because the FND_INSTALL_PROCESSES table is created when the AD parallel jobs start (not when the AD utility itself starts) and is dropped when the task is completed.

The meaning of each worker status.

STATUS	DESCRIPTION
Waiting	The worker is idle.
Assigned	The manager has assigned a job to the worker, but the worker has not started it yet.
Running	The worker is running a job.
Failed	The job failed due to an error.
Fixed, Restart	The error has been fixed and the worker has been told to retry the failed job.
Restarted	The worker is rerunning the job that failed. After the error has been fixed, the status goes from “Fixed, Restart” to “Restarted” (it does not change to “Running”).
Completed	The job has completed and the manager has not yet assigned another job to that worker.

Database Processing Phases concept

When a database patch or operation runs, its tasks are divided by function into phases. Oracle defines this grouping when the patch is created. Suppose a patch creates 2 tables and 2 sequences. In that case the patch driver contains 2 phases, one for creating the tables and one for creating the sequences. Because the two sequences can be created at the same time, that phase is run in parallel by using several workers.

Here are some Database Processing Phases:

seq = create sequence

tab = create tables and synonyms, grant privileges on tables

pls = create package specification

plb = create package body

vw = create views

Fixing a “Failed” worker

If the job fails the 1st time

The job is deferred to the end of the phase and another job is assigned to that worker.

If the job fails the 2nd time

- If the run time of the job was < 10 min, the job is deferred to the end of the phase again and another job is assigned to that worker.

- If the run time of the job was >= 10 min, the job status becomes “Failed”.

If the job fails the 3rd time

The job status becomes “Failed”.
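The deferral rule above can be written down as a small decision function. This is only an illustration of the rule as described in this post; `job_outcome` is a hypothetical name, and the real decision is made internally by the AD manager.

```shell
# Illustrative only: mirrors the deferral rule described above.
# $1 = number of times the job has failed so far
# $2 = run time of the job in whole minutes
job_outcome() {
    failures=$1
    run_minutes=$2
    if [ "$failures" -le 1 ]; then
        # First failure: always deferred to the end of the phase.
        echo "deferred"
    elif [ "$failures" -eq 2 ] && [ "$run_minutes" -lt 10 ]; then
        # Second failure of a short job: deferred once more.
        echo "deferred"
    else
        # Second failure of a long job, or any third failure.
        echo "failed"
    fi
}
```

For example, `job_outcome 2 5` prints `deferred`, while `job_outcome 2 10` prints `failed`.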

To review the worker log information, check the following file:

$APPL_TOP/admin/<SID>/log/adworkNNN.log

Example: adwork001.log will be the log file for the worker number 1.
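The naming pattern above (a three-digit, zero-padded worker number) can be built with `printf`. This is a minimal sketch; `worker_log_path` is a hypothetical helper, and the SID is passed in explicitly rather than guessed from the environment.

```shell
# Hypothetical helper: build the path of a worker log file.
# $1 = APPL_TOP directory, $2 = SID, $3 = worker number (no leading zeros)
worker_log_path() {
    appl_top=$1
    sid=$2
    worker_number=$3
    # %03d pads the worker number to three digits: 1 -> 001, 12 -> 012.
    printf '%s/admin/%s/log/adwork%03d.log\n' "$appl_top" "$sid" "$worker_number"
}
```

A typical use would be `tail -n 50 "$(worker_log_path "$APPL_TOP" PROD 1)"`, where PROD stands in for your own SID.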

After fixing the error, start AD Controller (if it is not already running) and use:

Option 2: “Tell worker to restart a failed job”.

Restarting a Failed Patch Process

During a patch process (or an adadmin process), if a job fails and cannot be restarted, the patch itself must be restarted.

Here are the steps for doing this:

Option 3. Tell worker to quit (for all workers) [this manually shuts down the workers]

Option 4. Tell manager that a worker failed its job

Option 5. Tell manager that a worker acknowledges quit [the manager stops, and AutoPatch stops with it]

Then restart the patch.

Please note: When the patch is restarted, all the information in the database about this session must be accurate.

Determine if a process is Hanging or not

We can check the worker log file to see whether new information is still being added to it.
We can determine whether the worker process is consuming CPU by issuing the command below.

$ ps -eo pcpu,pid,user,args | grep workerid

We can check whether any child processes are consuming CPU by issuing the following command:

$ ps -eo pcpu,pid,ppid,user,args | grep <Parent Process> | grep -v grep
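The log file check mentioned at the start of this section can be sketched as a small helper that compares the file size before and after a wait. `log_grew` is a hypothetical name, not an AD utility, and the 60 second default is an arbitrary assumption.

```shell
# Hypothetical helper: report whether a log file grew during a wait interval.
# $1 = log file, $2 = seconds to wait (defaults to 60)
log_grew() {
    log_file=$1
    wait_seconds=${2:-60}
    # Arithmetic expansion strips any padding that wc adds on some systems.
    size_before=$(( $(wc -c < "$log_file") ))
    sleep "$wait_seconds"
    size_after=$(( $(wc -c < "$log_file") ))
    if [ "$size_after" -gt "$size_before" ]; then
        echo "growing"
    else
        echo "unchanged"
    fi
}
```

An `unchanged` result is only a hint: a long-running job may legitimately write nothing for a while, so combine it with the CPU checks above before deciding that the worker is hanging.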

Restarting a Hanging Worker Process

1. Kill the processes associated with the hanging worker at the OS level.

$ kill -9 <process ID>

2. Fix the problem.

3. Restart the worker (or the job).

Restart an AD utility after a Node Crash

1. Start AD Controller.
2. Choose option “4. Tell manager that a worker failed its job”.
3. Choose option “2. Tell worker to restart a failed job”.
4. Restart the AD utility that was running when the node crashed.

Shutting down the Manager

1. Start AD Controller.
2. Choose option “3. Tell worker to quit”.
3. Verify that no worker processes are running.
