If a job doesn't start or complete by a specific time

Post by donnawonna » 19 Apr 2011 12:46

Our developers want to set a maximum time limit for a job. If the job has not completed by, say, 8:00am, it should stop executing and be forced OK. Can Control-M be configured to do all this without manual intervention from an operator? If so, how?

Post by sandu » 20 Apr 2011 10:01

In the PostProc tab:

When: Late Time
Param: 0800
To: killjob
Urgency: Urgent
Message: %%ORDERID

where killjob should be defined as a kill script in the ctm_menu/Shout Destination table.

e.g.

11 P S killjob <absolute_path>/killjob.sh


killjob.sh script looks like:

#!/usr/bin/sh

# The shout message (%%ORDERID in the job definition) is read from the second argument
ctmkilljob -ORDERID "$2"
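
A slightly more defensive variant of the script above could look like the sketch below. It assumes, as in this thread, that the order ID arrives as the second argument. The names CTMKILLJOB and KILLJOB_LOG are hypothetical variables introduced only for this example (they are not Control-M settings), and the log path is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only. Assumption from this thread: the shout message (%%ORDERID)
# arrives as the second argument. CTMKILLJOB and KILLJOB_LOG are hypothetical
# override variables for this sketch, not Control-M settings.
CTMKILLJOB="${CTMKILLJOB:-ctmkilljob}"
KILLJOB_LOG="${KILLJOB_LOG:-/tmp/killjob.log}"

# Kill the job with the given order ID and record what happened.
kill_job() {
    order_id="$1"
    if [ -z "$order_id" ]; then
        echo "$(date) no order ID received, nothing killed" >> "$KILLJOB_LOG"
        return 1
    fi
    echo "$(date) killing order ID $order_id" >> "$KILLJOB_LOG"
    "$CTMKILLJOB" -ORDERID "$order_id" >> "$KILLJOB_LOG" 2>&1
}

# Act only when called with arguments, as a shout destination would call it.
if [ "$#" -gt 0 ]; then
    kill_job "${2:-}"
fi
```

Writing every call to a log file makes it easier to tell afterwards whether the script was invoked at all, which is a separate question from whether ctmkilljob itself succeeded.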

Post by donnawonna » 23 Apr 2011 8:01

Thanks for your response. It's greatly appreciated!

Post by futre25 » 25 Apr 2011 4:38

Hi,

I ran a test with the settings indicated above, but the job is not killed and no alert is sent to me.

What am I doing wrong?

Here is my setup:

1. I created a new shout destination entry and a new script:

Shout Destination Table 'SYSTEM '
------------------------------------

#   Destination Type  Adr  Logical Name       Physical Name
--- ----------------  ---  -----------------  ----------------------------
1   O                 S    CONSOLE
2   E                 S    ECS
3   L                 S    IOALOG
4   P                 S    KILLJOB            /tmp/killjob.sh

q) Quit   e#) Edit entry #   n) New entry   d#) Delete entry #


I also created a new script in /tmp:

xxxx_xxxx-xxxxxx [56] ls -ltr killjob.sh
-rwxrwxrwx 1 xxxxxx controlm 72 Apr 25 16:15 killjob.sh
xxxxxx-xxxxx [57] more killjob.sh
echo "Lanzado KILL JOB: " > /tmp/prueba_kill.txt
ctmkilljob -ORDERID $2

In my PostProc tab:

When: Exectime
Param: >1
To: KILLJOB
Urgency: Urgent
Message: %%ORDERID

My job takes 3 minutes to execute.

Thanks to all.

Post by sandu » 25 Apr 2011 6:56

What does it say in the log of the job?

Post by futre25 » 26 Apr 2011 9:31

Hi.

The job's command is a sleep 250. The problem is that the ctmkilljob command is never executed, because the script is never called.

The sleep runs, but no alert is shown and the job is not killed. It ends successfully.

Thanks for your answer.

Post by Walty » 26 Apr 2011 12:17

Hi futre25,

In your script:

echo "Lanzado KILL JOB: $2" > /tmp/prueba_kill.txt &
ctmkilljob -ORDERID $2 &
ctmshout -DEST ECS -SEVERITY U -MESSAGE "JOB with ORDERID=$2 was Killed " &


An ampersand (&) must be present at the end of each command.
The ctmshout utility sends an alert to the GAS.
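
As a side note on the ampersand: in a shell, a trailing & starts the command in the background, so the script returns without waiting for it to finish. Whether this is the reason it matters for shout scripts is an assumption here; it is not confirmed anywhere in this thread. The sketch below only demonstrates the shell behaviour, and run_detached is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: run_detached is a hypothetical helper name.
# It starts a command in the background and returns immediately.
run_detached() {
    # Redirect output so the background command does not hold the caller's
    # stdout/stderr open; the trailing & is what makes the call return at once.
    "$@" > /dev/null 2>&1 &
}

start=$(date +%s)
run_detached sleep 3
end=$(date +%s)
echo "returned after $((end - start)) second(s)"
```

The reported time stays well below the 3 seconds of the sleep, because the caller does not wait for the background command.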

In PostProc tab:

When: Exectime
Param: >1
To: KILLJOB
Urgency: Regular
Message: %%ORDERID

Best regards
Walty

Post by futre25 » 26 Apr 2011 1:28

Hi.

Thanks for your answer.

The problem is not in the script itself; the script never gets as far as executing.

I think the problem is in how I created the new Shout Destination table entry, KILLJOB.

The script is never executed.

Thanks for your help and dedication.

Post by sandu » 26 Apr 2011 4:47

Try and see if EXECTIME > 001 works instead of >1.

Post by futre25 » 26 Apr 2011 5:10

Thanks for your response.

The Exectime condition is fine: when I tested the same Exectime setting with ECS as the destination, it worked correctly and an alert was sent.

What am I doing wrong?

Thanks sandu.

Post by sandu » 26 Apr 2011 6:58

If you did the whole setup correctly it should work. I would go again over the shout definition, script permissions and job definitions to see if they are correct.
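
The script-side part of that review can be sketched in plain shell. check_shout_script below is a hypothetical helper, not a Control-M utility. It only checks that the file exists, is executable for the current user, and starts with a "#!" interpreter line. The last check is reported as a warning only, since a script without that line may still run, depending on how it is launched.

```shell
#!/bin/sh
# Sketch only: check_shout_script is a hypothetical helper, not a Control-M utility.
# It checks the basics of a script used as a shout destination program.
check_shout_script() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        return 1
    fi
    # An interpreter line starts with "#!"; "#/usr/bin/sh" is only a comment.
    first_line=$(head -n 1 "$script")
    case "$first_line" in
        '#!'*) echo "ok: $script" ;;
        *)     echo "warning: no #! interpreter line: $script" ;;
    esac
}
```

For example, check_shout_script /tmp/killjob.sh, preferably run under the same account that Control-M/Server uses, since the executable check depends on who runs it.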

Post by Admin007 » 26 Apr 2011 9:28

I also tried this and it does not appear to be working.

I created a script in /tmp: killjob.sh

#!/usr/bin/sh

# The shout message (%%ORDERID in the job definition) is read from the second argument
ctmkilljob -ORDERID "$2"

I set up ctmsys as follows:

5 P S killjob /tmp/killjob.sh

I set up the job definition on PostProc tab as follows:

When: Late Time
Param: 1520 (just using any test time)
To: killjob
Urgency: Urgent
Message: %%ORDERID

To test this I created a very basic job definition performing an ls -ltr

I added a PRECMD of sleep 360 to force the job to sleep for 6 minutes before it executes.

The job slept and executed fine but never called the script to kill it even though the 1520 time came and went.

Thinking about it, could it be that the job is not actually executing when 1520 arrives, since it is sleeping? I would still expect the Late Time to apply, because the job was submitted to the system by Control-M and only then sleeps.

Post by sandu » 26 Apr 2011 9:40

This will not work. PRECMD is not part of the job, so it is not included in the job's runtime. Look at the statistics for the job to see the elapsed time.
The ls command might be too quick to leave a chance to kill the job before it completes. Sometimes there is a delay, depending on how many jobs you run in your datacentre. Try again as before, with sleep 240 in the command line for the job. Then log on to the actual agent box and monitor the process while it runs.
Does the script reside on the Control-M/Server box?

Try running the ctmkilljob -ORDERID $2 command from the Control-M/Server box, replacing $2 with the order ID of the job after it has started.

Post by Admin007 » 26 Apr 2011 10:01

I set it up with the command line to sleep 240, removed the PRECMD and attempted it numerous times without success.

Yes, the killjob.sh script resides on the Control-M/Server box.

I was able to kill the job while watching it process on the box by issuing:

ctmkilljob -ORDERID 00gqo

I then reran the job after adjusting the Late Time parm and then ran my script from the home directory and it killed the job.

I know the script works. It appears the job never calls the script though via the PostProc tab.
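
One way to confirm whether the shout destination calls the script at all is to have the script record every invocation before doing anything else. The sketch below is only an illustration: TRACE_FILE is a hypothetical path chosen for this example, and the meaning of each argument is not assumed; the script simply writes down whatever it receives.

```shell
#!/bin/sh
# Sketch only: TRACE_FILE is a hypothetical path chosen for this example.
TRACE_FILE="${TRACE_FILE:-/tmp/killjob_trace.txt}"

# Append a timestamp, the argument count, and every argument to the trace file.
log_invocation() {
    {
        echo "$(date) called with $# argument(s)"
        i=1
        for arg in "$@"; do
            echo "  arg $i: $arg"
            i=$((i + 1))
        done
    } >> "$TRACE_FILE"
}

log_invocation "$@"
```

If the trace file stays empty after the Late Time has passed, the script was never started, which points at the shout definition or the job definition and not at ctmkilljob.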

Post by Walty » 27 Apr 2011 8:24

Very strange.
I use Shout Destination tables for multiple actions without any particular problems (v6.3.01.700).
Is the Shout Destination table you are using active? (ctmshtb)
I tried two executions (Exectime and Late Time) and both worked without problems.

Shout Destination table (SYSTEM):
11 P S KILLJOB /tmp/killjob.sh

Script:
ls -lrt killjob.sh ; more killjob.sh
-rwxrwxrwx 1 labctm01 controlm 157 Apr 26 12:24 killjob.sh

echo "Lanzado KILL JOB: $2" > /tmp/prueba_kill.txt &
ctmkilljob -ORDERID $2 &
ctmshout -DEST ECS -SEVERITY U -MESSAGE "JOB with ORDERID=$2 was Killed " &

Job1 definition:
Task Type: Command
Command: sleep 240
PostProc: Exectime
Param: >1
To : KILLJOB
Urgency: Regular
Message: %%ORDERID

Log execution job1:
Date Time Code Job Name Job Id ----- Message -----

27/04/11 07:52:55 CS5065 ORDERED JOB:16783; DAILY FORCED, ODATE 20110426
27/04/11 07:52:56 SL5105 SUBMITTED TO labctm01
27/04/11 07:53:01 TR5101 STARTED AT 20110427075256 ON labctm01
27/04/11 07:53:01 TR5120 JOB STATE CHANGED TO Executing
27/04/11 07:53:56 TR5201 SHOUT TO KILLJOB PERFORMED
27/04/11 07:53:57 UT5409 JOB KILLED BY USER labctm01
27/04/11 07:54:01 TR5100 ENDED AT 20110427075401. OSCOMPSTAT 143. RUNCNT 1

27/04/11 07:54:01 TR5134 ENDED NOTOK
27/04/11 07:54:01 TR5120 JOB STATE CHANGED TO Analyzed
27/04/11 07:54:01 SL5120 JOB STATE CHANGED TO Post processed
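
A note on OSCOMPSTAT 143 in the log above: common shells such as bash and dash report a command terminated by signal N as exit status 128 + N, so 143 is consistent with the job process having received signal 15 (SIGTERM). That reading is an inference from the shell convention, not something stated by Control-M in this thread. A minimal sketch of the arithmetic, with signal_from_status as a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: signal_from_status is a hypothetical helper name.
# Common shells report a command terminated by signal N as exit status 128 + N.
signal_from_status() {
    status="$1"
    if [ "$status" -gt 128 ]; then
        echo $((status - 128))
    else
        echo 0   # not terminated by a signal
    fi
}

signal_from_status 143   # prints 15, the number of SIGTERM
```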

Job2 definition:
Task Type: Command
Command: sleep 480
PostProc: Late Time
Param: 0800
To : KILLJOB
Urgency: Regular
Message: %%ORDERID

Log execution job2:
Date Time Code Job Name Job Id ----- Message -----

27/04/11 07:55:07 CS5065 ORDERED JOB:16786; DAILY FORCED, ODATE 20110426
27/04/11 07:55:07 SL5105 SUBMITTED TO labctm01
27/04/11 07:55:11 TR5101 STARTED AT 20110427075507 ON labctm01
27/04/11 07:55:11 TR5120 JOB STATE CHANGED TO Executing
27/04/11 08:00:06 TR5201 SHOUT TO KILLJOB PERFORMED
27/04/11 08:00:08 UT5409 JOB KILLED BY USER labctm01
27/04/11 08:00:11 TR5100 ENDED AT 20110427080011. OSCOMPSTAT 143. RUNCNT 1

27/04/11 08:00:11 TR5134 ENDED NOTOK
27/04/11 08:00:11 TR5120 JOB STATE CHANGED TO Analyzed
27/04/11 08:00:12 SL5120 JOB STATE CHANGED TO Post processed

To be continued... maybe other users will have new suggestions.

Best regards
Walty
