set up job for control-m agent down

Post by karan » 22 Jul 2013 8:44

Hi all,

I want to set up a job that checks agent status and, if any agent is down, informs ops via an alert.

Is this possible?

Post by Manii » 22 Jul 2013 4:21

Hi,

A standard alert is already raised whenever an agent goes down.

The message looks like this:

STATUS OF AGENT PLATFORM <Agent> CHANGED TO UNAVAILABLE

So there is no need to add a separate monitoring job.

Post by cjdesch » 22 Jul 2013 8:59

It depends on your alerting system too (not everyone can tie directly into Remedy or whatever alerting system they have). Some configurations (using the Patrol Control-M KM) will not alert on an Agent Unavailable message but will alert on job failures. I believe this was corrected, or added as a feature, in the Patrol KM for Control-M 8.0, but it is not documented.

To check the status of an agent with a job, all you need to do is schedule a job that runs 'ls -la' or some other generic command, with a Late Sub alert set after the run time. If the agent is available, the job will run and complete. If it is not available, the job will sit in "Wait for Resource" and trigger the Late Sub alert because it cannot run on time. This gives you a decent check on the health of your agent, since it shows whether the agent can run work by going through each step of the process: submitting, running, completing, and returning a completion code.

There are a couple of problems with this method:

1) You must schedule a separate job for every hour of the day (or whatever interval you choose) at which you want to check the agent, because cyclic jobs won't work with Late Sub alerts. If you have a lot of agents, this could mean a lot of jobs: 24 hourly checks * 50 agents * 365 days in a year = 438,000 extra jobs.
2) If the agent goes down right after a check, it can stay down unnoticed until the next check.
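
The estimate in point 1 and the gap in point 2 can be checked with a little shell arithmetic. This is only a rough sketch: the hourly interval, 50 agents and 365 days come from the example above, and the variable names are illustrative.

```shell
#!/bin/sh
# Rough cost of scheduled (non-cyclic) agent checks.
# All three values are illustrative; substitute your own.
checks_per_day=24    # one check per hour
agents=50
days_per_year=365

jobs_per_year=$((checks_per_day * agents * days_per_year))
echo "extra jobs per year: $jobs_per_year"

# Longest time (in minutes) an agent can be down before the next check is due.
max_gap_minutes=$((24 * 60 / checks_per_day))
echo "worst-case gap between checks: $max_gap_minutes minutes"
```

Checking more often shrinks the gap but multiplies the job count by the same factor.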



Advantages:
1) Some agents go down routinely and come back up automatically within 5 minutes. With this method such brief outages are not alerted on immediately, so they do not generate extra tickets.

NOTE: Configure the job to delete its sysout after every run (success or failure); otherwise the proclog on the agent can fill up and, in some cases, consume the file system.

Post by Surya47 » 16 Jan 2014 7:14

Create a cyclic job with the task type set to Command, running:

ctm_agstat -LIST "*"|grep -v jbs|grep Unavail
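
One thing to watch with the pipeline above: grep exits with 0 when it finds a match, so the command returns 0 exactly when an unavailable agent is listed and non-zero when none is. If you want the job to fail when an agent is down, the exit status has to be inverted. Here is a minimal sketch, assuming (as the command above implies) that unavailable agents appear on lines containing "Unavail" and that lines containing "jbs" should be ignored. The function name report_unavailable is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: reads agent status lines on stdin, prints the ones
# that look unavailable, and returns 1 if any were found (0 otherwise).
# Assumption: same filters as the original pipeline ("jbs" out, "Unavail" in).
report_unavailable() {
    down=$(grep -v jbs | grep Unavail) || true
    if [ -n "$down" ]; then
        printf '%s\n' "$down"
        return 1
    fi
    return 0
}
```

Saved in a script whose last line is `ctm_agstat -LIST "*" | report_unavailable`, this makes the script exit non-zero when at least one agent is reported unavailable, which a command job would by default treat as a failure, and the offending lines end up in the sysout.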
