I came across some peculiar behaviour with one job: it was submitted but never changed to 'STARTED' status.
The job was submitted at 3:13:05 PM but did not start executing.
I then simply held the job at 3:33:58 PM and released it, and it started executing right away.
Has anyone else seen this kind of behaviour? Can someone tell me why the job did not start?
Here is the log:
3/12/2013 7:00:13 AM CS5065 ORDERED JOB:210944; DAILY SYSTEM, ODATE 20130312
3/12/2013 7:01:21 AM SL5120 JOB STATE CHANGED TO Wait Condition
3/13/2013 3:13:05 PM SL5208 QUANTITATIVE RESOURCE xxxxx QUANTITY 1 ALLOCATED
3/13/2013 3:13:05 PM SL5201 SHOUT TO EM PERFORMED
3/13/2013 3:13:05 PM SL5105 SUBMITTED TO xxxxxxx
3/13/2013 3:33:58 PM CS5401 HELD BY USER *******
3/13/2013 3:34:07 PM CS5402 FREED BY USER *******
3/13/2013 3:34:46 PM TR5136 JOB STATE CHANGED TO RETRY-SUBMIT
3/13/2013 3:34:46 PM TR5120 JOB STATE CHANGED TO Retry Submitted
3/13/2013 3:34:47 PM SL5214 QUANTITATIVE RESOURCES RELEASED
3/13/2013 3:34:47 PM SL5120 JOB STATE CHANGED TO Wait Scheduling
3/13/2013 3:34:47 PM SL5208 QUANTITATIVE RESOURCE xxxxx QUANTITY 1 ALLOCATED
3/13/2013 3:34:47 PM SL5201 SHOUT TO EM PERFORMED
3/13/2013 3:34:47 PM SL5105 SUBMITTED TO xxxxxxx
3/13/2013 3:34:49 PM TR5101 STARTED AT 20130313153447 ON xxxxxxx
3/13/2013 3:34:49 PM TR5120 JOB STATE CHANGED TO Executing
3/13/2013 3:38:43 PM TR5100 ENDED AT 20130313153843. OSCOMPSTAT 0. RUNCNT 1
3/13/2013 3:38:43 PM TR5133 ENDED OK
Job submitted but not started
It seems that the SL process or the TR process hung for some time, or the status directory was not updated for the job ID; that is why the job hung.
As soon as you held the job, it triggered the CS process again, which caused the job's status to change in the status directory. That refreshed the SL and TR processes for the job, and it started executing.
You can check the TR.* and SL* log files in the proclog directory on the server for more detail.
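If this happens often, a small script can help spot the affected jobs in a job log like the one posted above. The following is only a rough sketch in Python, not a Control-M tool: the function names `parse_line` and `submit_start_gaps` are made up for illustration, the line layout is assumed to match the posted log exactly (date, time, AM/PM, message code, text), and the one-minute threshold is arbitrary. The message codes SL5105 (submitted) and TR5101 (started) are taken from the log itself.

```python
from datetime import datetime, timedelta

# Sample lines copied from the log posted in the question.
LOG = """\
3/13/2013 3:13:05 PM SL5105 SUBMITTED TO xxxxxxx
3/13/2013 3:33:58 PM CS5401 HELD BY USER *******
3/13/2013 3:34:07 PM CS5402 FREED BY USER *******
3/13/2013 3:34:47 PM SL5105 SUBMITTED TO xxxxxxx
3/13/2013 3:34:49 PM TR5101 STARTED AT 20130313153447 ON xxxxxxx
"""

def parse_line(line):
    """Split one log line into (timestamp, message code, message text)."""
    date, time, ampm, code, *rest = line.split()
    stamp = datetime.strptime(f"{date} {time} {ampm}", "%m/%d/%Y %I:%M:%S %p")
    return stamp, code, " ".join(rest)

def submit_start_gaps(lines, threshold=timedelta(minutes=1)):
    """Return (submitted_at, next_event_at, delay) for every submission that
    waited longer than `threshold` for a start or a resubmission."""
    gaps = []
    pending = None  # time of a submission that has not started yet
    for line in lines:
        if not line.strip():
            continue
        stamp, code, _ = parse_line(line)
        if code not in ("SL5105", "TR5101"):
            continue
        # A start or a resubmission closes the pending submission.
        if pending is not None and stamp - pending > threshold:
            gaps.append((pending, stamp, stamp - pending))
        # Only a new submission opens another waiting period.
        pending = stamp if code == "SL5105" else None
    return gaps

for submitted, resolved, delay in submit_start_gaps(LOG.splitlines()):
    print(f"submitted {submitted}, no start until {resolved} (waited {delay})")
```

For the posted log this reports a single gap, between the first submission at 3:13:05 PM and the resubmission at 3:34:47 PM; the second submission started within two seconds and is not flagged.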