| CONDOR_Q(1) | HTCondor Manual | CONDOR_Q(1) |
condor_q - HTCondor Manual
Display information about jobs in queue
condor_q [-help [Universe | State] ]
condor_q [-debug ] [general options ] [restriction list ] [output options ] [analyze options ]
condor_q displays information about jobs in the HTCondor job queue. By default, condor_q queries the local job queue, but this behavior may be modified by specifying one of the general options.
As of version 8.5.2, condor_q defaults to querying only the current user's jobs. This default is overridden when the restriction list has usernames and/or job ids, when the -submitter or -allusers arguments are specified, or when the current user is a queue superuser. It can also be overridden by setting the CONDOR_Q_ONLY_MY_JOBS configuration macro to False.
As of version 8.5.6, condor_q defaults to batch-mode output (see -batch in the Options section below). The old behavior can be obtained by specifying -nobatch on the command line. To change the default back to its pre-8.5.6 value, set the new configuration variable CONDOR_Q_DASH_BATCH_IS_DEFAULT to False.
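For a one-off query, both defaults can also be overridden on the command line rather than in the configuration. The following is a minimal sketch: the function name q_everyone_per_job is hypothetical, while -allusers and -nobatch are the options described on this page.

```shell
#!/bin/sh
# Hypothetical wrapper: show every user's jobs, one line per job,
# regardless of how CONDOR_Q_ONLY_MY_JOBS and
# CONDOR_Q_DASH_BATCH_IS_DEFAULT are set.
# Extra arguments (for example a cluster ID) are passed through unchanged.
q_everyone_per_job() {
    condor_q -allusers -nobatch "$@"
}
```

Calling q_everyone_per_job 27 would then run condor_q -allusers -nobatch 27.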
As of version 8.5.6, condor_q defaults to displaying information about batches of jobs, rather than individual jobs. The intention is that this will be a more useful, and user-friendly, format for users with large numbers of jobs in the queue. Ideally, users will specify meaningful batch names for their jobs, to make it easier to keep track of related jobs.
(For information about specifying batch names for your jobs, see the condor_submit and condor_submit_dag manual pages.)
A batch of jobs is defined as follows:
There are many output options that modify the output generated by condor_q. The effects of these options, and the meanings of the various output data, are described below.
If the -long option is specified, condor_q displays a long description of the queried jobs by printing the entire job ClassAd for all jobs matching the restrictions, if any. Individual attributes of the job ClassAd can be displayed by means of the -format option, which displays attributes with a printf(3) format, or with the -autoformat option. Multiple -format options may be specified in the option list to display several attributes of the job.
For most output options (except as specified), the last line of condor_q output contains a summary of the queue: the total number of jobs, and the number of jobs in the completed, removed, idle, running, held and suspended states.
If no output options are specified, condor_q now defaults to batch mode, and displays the following columns of information, with one line of output per batch of jobs:
OWNER, BATCH_NAME, SUBMITTED, DONE, RUN, IDLE, [HOLD,] TOTAL, JOB_IDS
Note that the HOLD column is only shown if there are held jobs in the output or if there are no jobs in the output.
If the -nobatch option is specified, condor_q displays the following columns of information, with one line of output per job:
ID, OWNER, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD
If the -dag option is specified (in conjunction with -nobatch), condor_q displays the following columns of information, with one line of output per job; the owner is shown only for top-level jobs, and for all other jobs (including sub-DAGs) the node name is shown:
ID, OWNER/NODENAME, SUBMITTED, RUN_TIME, ST, PRI, SIZE, CMD
If the -run option is specified (in conjunction with -nobatch), condor_q displays the following columns of information, with one line of output per running job:
ID, OWNER, SUBMITTED, RUN_TIME, HOST(S)
Also note that the -run option disables output of the totals line.
If the -grid option is specified, condor_q displays the following columns of information, with one line of output per job:
ID, OWNER, STATUS, GRID->MANAGER, HOST, GRID_JOB_ID
If the -grid:ec2 option is specified, condor_q displays the following columns of information, with one line of output per job:
ID, OWNER, STATUS, INSTANCE ID, CMD
If the -goodput option is specified, condor_q displays the following columns of information, with one line of output per job:
ID, OWNER, SUBMITTED, RUN_TIME, GOODPUT, CPU_UTIL, Mb/s
If the -io option is specified, condor_q displays the following columns of information, with one line of output per job:
ID, OWNER, RUNS, ST, INPUT, OUTPUT, RATE, MISC
If the -cputime option is specified (in conjunction with -nobatch), condor_q displays the following columns of information, with one line of output per job:
ID, OWNER, SUBMITTED, CPU_TIME, ST, PRI, SIZE, CMD
If the -hold option is specified, condor_q displays the following columns of information, with one line of output per job:
ID, OWNER, HELD_SINCE, HOLD_REASON
If the -totals option is specified, condor_q displays only one line of output no matter how many jobs and batches of jobs are in the queue. That line of output contains the total number of jobs, and the number of jobs in the completed, removed, idle, running, held and suspended states.
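A script that needs per-state counts does not have to parse the totals line. The sketch below tallies the JobStatus attribute instead. It assumes the usual JobStatus codes (1 idle, 2 running, 3 removed, 4 completed, 5 held, 7 suspended); the function name count_states is hypothetical, and the result may not match condor_q's own totals in every case.

```shell
#!/bin/sh
# Hypothetical helper: read one JobStatus integer per line on stdin and
# print a count per state. Any other status value is counted only in the
# overall job total.
# Intended producer (not run here): condor_q -autoformat JobStatus
count_states() {
    awk '
        $1 == 1 { idle++ }
        $1 == 2 { running++ }
        $1 == 3 { removed++ }
        $1 == 4 { completed++ }
        $1 == 5 { held++ }
        $1 == 7 { suspended++ }
        END {
            printf "%d jobs; %d completed, %d removed, %d idle, %d running, %d held, %d suspended\n",
                NR, completed, removed, idle, running, held, suspended
        }'
}

# Demonstration on made-up input: three running jobs and one held job.
printf '2\n2\n2\n5\n' | count_states
```

Reading the attribute directly keeps the script independent of any change to the human-readable output format.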
The available output data are as follows:
The -analyze or -better-analyze options can be used to determine why certain jobs are not running by performing an analysis on a per machine basis for each machine in the pool. The reasons can vary among failed constraints, insufficient priority, resource owner preferences and prevention of preemption by the PREEMPTION_REQUIREMENTS expression. If the analyze option -verbose is specified along with the -analyze option, the reason for failure is displayed on a per machine basis. -better-analyze differs from -analyze in that it will do matchmaking analysis on jobs even if they are currently running, or if the reason they are not running is not due to matchmaking. -better-analyze also produces more thorough analysis of complex Requirements and shows the values of relevant job ClassAd attributes. When only a single machine is being analyzed via -machine or -mconstraint, the values of relevant attributes of the machine ClassAd are also displayed.
To restrict the display to jobs of interest, a list of zero or more restriction options may be supplied. Each restriction may be one of:
If cluster or cluster.process is specified, and the job matching that restriction is a condor_dagman job, information for all jobs of that DAG is displayed in batch mode (in non-batch mode, only the condor_dagman job itself is displayed).
If no owner restrictions are present, the job matches the restriction list if it matches at least one restriction in the list. If owner restrictions are present, the job matches the list if it matches one of the owner restrictions and at least one non-owner restriction.
Also change the output columns as noted above.
Note that, as of version 8.5.6, -batch is the default, unless the CONDOR_Q_DASH_BATCH_IS_DEFAULT configuration variable is set to False.
It is assumed that no attribute names begin with a dash character, so that the next word that begins with a dash is the start of the next option. The autoformat option may be followed by a colon character and formatting qualifiers to change the output formatting from the default:
j print the job ID as the first field,
l label each field,
h print column headings before the first line of output,
V use %V rather than %v for formatting (string values are quoted),
r print "raw", or unevaluated values,
, add a comma character after each field,
t add a tab character before each field instead of the default space character,
n add a newline character after each field,
g add a newline character between ClassAds, and suppress spaces before each field.
Use -af:h to get tabular values with headings.
Use -af:lrng to get -long equivalent format.
The newline and comma characters may not be used together. The l and h characters may not be used together.
priority to consider user priority during the analysis
summary to show a one line summary for each job or machine
reverse to analyze machines, rather than jobs
The default output from condor_q is formatted to be human readable, not script readable. In an effort to make the output fit within 80 characters, values in some fields might be truncated. Furthermore, the HTCondor Project can (and does) change the formatting of this default output as we see fit. Therefore, any script that is attempting to parse data from condor_q is strongly encouraged to use the -format option (described above, examples given below).
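As a sketch of that advice, the following filters held jobs from -format output rather than from the default display. The helper name held_job_ids is hypothetical, and it assumes that a JobStatus value of 5 means held; the condor_q invocation shown in the comment uses only the -format option.

```shell
#!/bin/sh
# Hypothetical helper: read "ClusterId.ProcId JobStatus" lines on stdin
# and print the IDs of jobs whose JobStatus is 5 (held).
# Intended producer (not run here):
#   condor_q -format "%d." ClusterId -format "%d " ProcId -format "%d\n" JobStatus
held_job_ids() {
    while read -r job_id status; do
        if [ "$status" = "5" ]; then
            printf '%s\n' "$job_id"
        fi
    done
}

# Demonstration on made-up input in the same shape as the command's output.
printf '27.0 2\n27.1 5\n27.2 1\n' | held_job_ids
```

Because every field is requested explicitly, nothing is truncated to fit an 80-character line and the field order is under the script's control.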
Although -analyze provides a very good first approximation, the analyzer cannot diagnose all possible situations, because the analysis is based on instantaneous and local information. Some situations therefore cannot be diagnosed accurately, such as when several submitters are contending for resources or when the pool is rapidly changing state.
It is possible to hold jobs that are in the X (removed) state. To avoid this, construct a -constraint expression that contains JobStatus != 3.
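One way to apply this consistently is to wrap whatever constraint is already in use. The sketch below is only an illustration: the function name exclude_removed is hypothetical, and the Owner expression is an example value.

```shell
#!/bin/sh
# Hypothetical helper: wrap a ClassAd constraint so that it never matches
# jobs in the removed (X) state, whose JobStatus is 3.
exclude_removed() {
    printf '(%s) && JobStatus != 3\n' "$1"
}

# Prints: (Owner == "jdoe") && JobStatus != 3
exclude_removed 'Owner == "jdoe"'
```

The resulting string can be passed as the argument of -constraint, for example condor_q -constraint "$(exclude_removed 'Owner == "jdoe"')". The parentheses keep the original expression intact if it contains an || operator.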
The -format option provides a way to specify both the job attributes and formatting of those attributes. There must be only one conversion specification per -format option. As an example, to list only Jane Doe's jobs in the queue, choosing to print and format only the owner of the job, the command line arguments for the job, and the process ID of the job:
$ condor_q -submitter jdoe -format "%s" Owner -format " %s " Args -format " ProcId = %d\n" ProcId
jdoe 16386 2800 ProcId = 0
jdoe 16386 3000 ProcId = 1
jdoe 16386 3200 ProcId = 2
jdoe 16386 3400 ProcId = 3
jdoe 16386 3600 ProcId = 4
jdoe 16386 4200 ProcId = 7
To display only the job IDs of Jane Doe's jobs, use the following:
$ condor_q -submitter jdoe -format "%d." ClusterId -format "%d\n" ProcId
27.0
27.1
27.2
27.3
27.4
27.7
An example that shows the analysis in summary format:
$ condor_q -analyze:summary
-- Submitter: submit-1.chtc.wisc.edu : <192.168.100.43:9618?sock=11794_95bb_3> :
submit-1.chtc.wisc.edu
Analyzing matches for 5979 slots
           Autocluster  Matches  Machine      Running    Serving
JobId      Members/Idle Reqmnts  Rejects Job  Users Job  Other User Avail Owner
---------- ------------ -------- ------------ ---------- ---------- ----- -----
25764522.0 7/0          5910     820          7/10       5046       34    smith
25764682.0 9/0          2172     603          9/9        1531       29    smith
25765082.0 18/0         2172     603          18/9       1531       29    smith
25765900.0 1/0          2172     603          1/9        1531       29    smith
An example that shows summary information by machine:
$ condor_q -ana:sum,rev
-- Submitter: s-1.chtc.wisc.edu : <192.168.100.43:9618?sock=11794_95bb_3> : s-1.chtc.wisc.edu
Analyzing matches for 2885 jobs
Slot                              Slot's Req   Job's Req    Both
Name                         Type Matches Job  Matches Slot Match %
------------------------     ---- ------------ ------------ ----------
slot1@INFO.wisc.edu          Stat 2729         0            0.00
slot2@INFO.wisc.edu          Stat 2729         0            0.00
slot1@aci-001.chtc.wisc.edu  Part 0            2793         0.00
slot1_1@a-001.chtc.wisc.edu  Dyn  2644         2792         91.37
slot1_2@a-001.chtc.wisc.edu  Dyn  2623         2601         85.10
slot1_3@a-001.chtc.wisc.edu  Dyn  2644         2632         85.82
slot1_4@a-001.chtc.wisc.edu  Dyn  2644         2792         91.37
slot1@a-002.chtc.wisc.edu    Part 0            2633         0.00
slot1_10@a-002.chtc.wisc.edu Dyn  2623         2601         85.10
An example with two independent DAGs in the queue:
$ condor_q
-- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:35169?...
OWNER  BATCH_NAME SUBMITTED  DONE RUN IDLE TOTAL JOB_IDS
wenger DAG: 3696  2/12 11:55 _    10  _    10    3698.0 ... 3707.0
wenger DAG: 3697  2/12 11:55 1    1   1    10    3709.0 ... 3710.0
14 jobs; 0 completed, 0 removed, 1 idle, 13 running, 0 held, 0 suspended
Note that the "13 running" in the last line is two more than the total of the RUN column, because the two condor_dagman jobs themselves are counted in the last line but not the RUN column.
Also note that the "completed" value in the last line does not correspond to the total of the DONE column, because the "completed" value in the last line only counts jobs that are completed but still in the queue, whereas the DONE column counts jobs that are no longer in the queue.
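The arithmetic behind the first of these notes can be checked against the example output. The sketch below sums the RUN column of the two batch lines copied from that example and compares the result with the running count in the totals line. It is an illustration only: the function name sum_run_column is hypothetical, the field position depends on the batch name containing exactly one space, and a "_" entry is treated as zero. Scripts that need such numbers should query attributes with -format rather than rely on this layout.

```shell
#!/bin/sh
# Hypothetical helper: sum the RUN column of batch-mode lines on stdin.
# RUN is the 7th whitespace-separated field here because the batch name
# ("DAG: 3696") and the submit time ("2/12 11:55") each span two fields.
sum_run_column() {
    awk '{ if ($7 != "_") total += $7 } END { print total + 0 }'
}

# The two batch lines from the example output.
run_total=$(printf '%s\n' \
    'wenger DAG: 3696 2/12 11:55 _ 10 _ 10 3698.0 ... 3707.0' \
    'wenger DAG: 3697 2/12 11:55 1 1 1 10 3709.0 ... 3710.0' |
    sum_run_column)

running_in_totals=13   # from "13 running" in the totals line

# The difference is the two condor_dagman jobs, which are counted in the
# totals line but not in the RUN column.
echo "RUN column total: $run_total"
echo "condor_dagman jobs: $((running_in_totals - run_total))"
```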
Here's an example with a held job, illustrating the addition of the HOLD column to the output:
$ condor_q
-- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
OWNER  BATCH_NAME     SUBMITTED  DONE RUN IDLE HOLD TOTAL JOB_IDS
wenger CMD: /bin/slee 9/13 16:25 _    3   _    1    4     599.0 ...
4 jobs; 0 completed, 0 removed, 0 idle, 3 running, 1 held, 0 suspended
Here are some examples with a nested-DAG workflow in the queue, which is one of the most complicated cases. The workflow consists of a top-level DAG with nodes NodeA and NodeB, each with two two-proc clusters; and a sub-DAG SubZ with nodes NodeSA and NodeSB, each with two two-proc clusters.
First of all, non-batch mode with all of the node jobs in the queue:
$ condor_q -nobatch
-- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
ID    OWNER  SUBMITTED  RUN_TIME   ST PRI SIZE CMD
591.0 wenger 9/13 16:05 0+00:00:13 R  0   2.4  condor_dagman -p 0
592.0 wenger 9/13 16:05 0+00:00:07 R  0   0.0  sleep 60
592.1 wenger 9/13 16:05 0+00:00:07 R  0   0.0  sleep 300
593.0 wenger 9/13 16:05 0+00:00:07 R  0   0.0  sleep 60
593.1 wenger 9/13 16:05 0+00:00:07 R  0   0.0  sleep 300
594.0 wenger 9/13 16:05 0+00:00:07 R  0   2.4  condor_dagman -p 0
595.0 wenger 9/13 16:05 0+00:00:01 R  0   0.0  sleep 60
595.1 wenger 9/13 16:05 0+00:00:01 R  0   0.0  sleep 300
596.0 wenger 9/13 16:05 0+00:00:01 R  0   0.0  sleep 60
596.1 wenger 9/13 16:05 0+00:00:01 R  0   0.0  sleep 300
10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended
Now non-batch mode with the -dag option (unfortunately, condor_q doesn't do a good job of grouping procs in the same cluster together):
$ condor_q -nobatch -dag
-- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
ID    OWNER/NODENAME SUBMITTED  RUN_TIME   ST PRI SIZE CMD
591.0 wenger         9/13 16:05 0+00:00:27 R  0   2.4  condor_dagman -
592.0 |-NodeA        9/13 16:05 0+00:00:21 R  0   0.0  sleep 60
593.0 |-NodeB        9/13 16:05 0+00:00:21 R  0   0.0  sleep 60
594.0 |-SubZ         9/13 16:05 0+00:00:21 R  0   2.4  condor_dagman -
595.0 |-NodeSA       9/13 16:05 0+00:00:15 R  0   0.0  sleep 60
596.0 |-NodeSB       9/13 16:05 0+00:00:15 R  0   0.0  sleep 60
592.1 |-NodeA        9/13 16:05 0+00:00:21 R  0   0.0  sleep 300
593.1 |-NodeB        9/13 16:05 0+00:00:21 R  0   0.0  sleep 300
595.1 |-NodeSA       9/13 16:05 0+00:00:15 R  0   0.0  sleep 300
596.1 |-NodeSB       9/13 16:05 0+00:00:15 R  0   0.0  sleep 300
10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended
Now, finally, the batch (default) mode:
$ condor_q
-- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
OWNER  BATCH_NAME  SUBMITTED  DONE RUN IDLE TOTAL JOB_IDS
wenger ex1.dag+591 9/13 16:05 _    8   _    5     592.0 ... 596.1
10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended
There are several things about this output that may be slightly confusing:
Now here is non-batch mode after proc 0 of each node job has finished:
$ condor_q -nobatch
-- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
ID    OWNER  SUBMITTED  RUN_TIME   ST PRI SIZE CMD
591.0 wenger 9/13 16:05 0+00:01:19 R  0   2.4  condor_dagman -p 0
592.1 wenger 9/13 16:05 0+00:01:13 R  0   0.0  sleep 300
593.1 wenger 9/13 16:05 0+00:01:13 R  0   0.0  sleep 300
594.0 wenger 9/13 16:05 0+00:01:13 R  0   2.4  condor_dagman -p 0
595.1 wenger 9/13 16:05 0+00:01:07 R  0   0.0  sleep 300
596.1 wenger 9/13 16:05 0+00:01:07 R  0   0.0  sleep 300
6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended
The same state also with the -dag option:
$ condor_q -nobatch -dag
-- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
ID    OWNER/NODENAME SUBMITTED  RUN_TIME   ST PRI SIZE CMD
591.0 wenger         9/13 16:05 0+00:01:30 R  0   2.4  condor_dagman -
592.1 |-NodeA        9/13 16:05 0+00:01:24 R  0   0.0  sleep 300
593.1 |-NodeB        9/13 16:05 0+00:01:24 R  0   0.0  sleep 300
594.0 |-SubZ         9/13 16:05 0+00:01:24 R  0   2.4  condor_dagman -
595.1 |-NodeSA       9/13 16:05 0+00:01:18 R  0   0.0  sleep 300
596.1 |-NodeSB       9/13 16:05 0+00:01:18 R  0   0.0  sleep 300
6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended
And, finally, that state in batch (default) mode:
$ condor_q
-- Schedd: wenger@manta.cs.wisc.edu : <128.105.14.228:9619?...
OWNER  BATCH_NAME  SUBMITTED  DONE RUN IDLE TOTAL JOB_IDS
wenger ex1.dag+591 9/13 16:05 _    4   _    5     592.1 ... 596.1
6 jobs; 0 completed, 0 removed, 0 idle, 6 running, 0 held, 0 suspended
condor_q will exit with a status value of 0 (zero) upon success, and it will exit with the value 1 (one) upon failure.
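A script can branch on this exit status without looking at the output at all. The following is a minimal sketch; the function names queue_query_ok and report_queue_status are hypothetical, and -totals is used only to keep the discarded output short.

```shell
#!/bin/sh
# Hypothetical helper: return condor_q's exit status (0 on success,
# non-zero on failure). Output is discarded; only the status is used.
queue_query_ok() {
    condor_q -totals >/dev/null 2>&1
}

# Hypothetical helper: report the outcome and propagate failure.
report_queue_status() {
    if queue_query_ok; then
        echo "condor_q succeeded"
    else
        echo "condor_q failed" >&2
        return 1
    fi
}
```

Calling report_queue_status at the start of a longer script lets it stop early when the queue cannot be queried.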
HTCondor Team
1990-2024, Center for High Throughput Computing, Computer Sciences Department, University of Wisconsin-Madison, Madison, WI, US. Licensed under the Apache License, Version 2.0.
| August 25, 2024 |