Job Commands#
In this section, we will review the most important commands to manage
your jobs. We assume that the system administrator has installed the
cluster and that you have successfully executed the grbcluster login
command with the appropriate flags to access your cluster.
Submitting Interactive Jobs#
The Gurobi command-line tool gurobi_cl can be used to submit
optimization jobs. Once you have successfully executed the
grbcluster login command, you can submit a job using gurobi_cl. This is
the example we used earlier in this document:
> gurobi_cl ResultFile=solution.sol stein9.mps
Using license file /opt/gurobi950/manager.lic
Set parameter CSManager to value "http://server1:61080"
Set parameter LogFile to value "gurobi.log"
Compute Server job ID: 1e9c304c-a5f2-4573-affa-ab924d992f7e
Capacity available on 'server1:61000' - connecting...
Established HTTP unencrypted connection
Gurobi Optimizer version 10.0.3 build v10.0.3rc0 (linux64)
Copyright (c) 2022, Gurobi Optimization, LLC
...
Gurobi Compute Server Worker version 10.0.3 build v10.0.3rc0 (linux64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
...
Optimal solution found (tolerance 1.00e-04)
Best objective 5.000000000000e+00, best bound 5.000000000000e+00, gap 0.0000%
Compute Server communication statistics:
Sent: 0.002 MBytes in 9 msgs and 0.01s (0.26 MB/s)
Received: 0.007 MBytes in 26 msgs and 0.09s (0.08 MB/s)
The initial log output indicates that a Compute Server job was created,
that the Compute Server cluster had capacity available to run that job,
and that an unencrypted HTTP connection was established with a server in
that cluster. The log concludes with statistics about the communication
performed between the client machine and the Compute Server. Note that
the result file solution.sol is also retrieved.
This is an interactive optimization task because the connection with the job must be kept alive and the progress messages are displayed in real time. Stopping or killing the command also terminates the job.
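If you launch interactive jobs from a script, it can help to capture the job ID that gurobi_cl prints so that you can refer to the job later. The following is a minimal sketch, not part of the Gurobi tooling: the function names are hypothetical, and it assumes the output contains a line of the form "Compute Server job ID: ..." as in the log above.

```python
import re
import subprocess

# Matches the "Compute Server job ID: <id>" line shown in the log above.
JOB_ID_RE = re.compile(r"^Compute Server job ID:\s*([0-9a-f-]+)\s*$", re.MULTILINE)


def extract_job_id(log_text):
    """Return the full job ID found in gurobi_cl output, or None if absent."""
    match = JOB_ID_RE.search(log_text)
    return match.group(1) if match else None


def run_interactive(model_path, result_file):
    """Run gurobi_cl and wait for it to finish (hypothetical wrapper).

    The call blocks for the whole solve. Because the output is captured,
    progress messages are not shown in real time; killing this process
    would also terminate the job.
    """
    completed = subprocess.run(
        ["gurobi_cl", f"ResultFile={result_file}", model_path],
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.returncode, extract_job_id(completed.stdout)
```

For example, run_interactive("stein9.mps", "solution.sol") would return the exit code of gurobi_cl together with the job ID, provided gurobi_cl is on your PATH and you are logged in to the cluster.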
Listing Jobs#
The optimization jobs running on a Compute Server cluster can be listed
using the jobs command:
> grbcluster jobs
JOBID    ADDRESS STATUS  #Q STIME               USER  PRIO API
58780a22 server1 RUNNING    2019-04-07 14:36:49 jones 0    gurobi_cl
The jobs command is actually a shortcut for the job list command:
> grbcluster job list
JOBID    ADDRESS STATUS  #Q STIME               USER  PRIO API
58780a22 server1 RUNNING    2019-04-07 14:36:49 jones 0    gurobi_cl
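When the listing is consumed by a script rather than read by a person, the first three columns are usually enough to identify each job. The sketch below is an illustration only: parse_running_jobs is a hypothetical helper, and it assumes the text contains just the header and one row per job, as in the output above. It reads only JOBID, ADDRESS, and STATUS, so an empty #Q column does not shift the fields it uses.

```python
def parse_running_jobs(listing):
    """Return (job_id, address, status) tuples from `grbcluster jobs` output."""
    jobs = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the header row, which starts with JOBID.
        if len(fields) < 3 or fields[0] == "JOBID":
            continue
        jobs.append((fields[0], fields[1], fields[2]))
    return jobs
```

The listing text itself could be obtained with subprocess.run(["grbcluster", "jobs"], capture_output=True, text=True).stdout.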
Note that you can get more information by using the --long flag. This
flag also displays the complete job ID, which is unique, instead of the
short ID:
> grbcluster jobs --long
JOBID        ADDRESS STATUS  #Q STIME               USER  PRIO API       RUNTIME PID   HOST     IP
58780a22-... server1 RUNNING    2019-04-07 14:36:49 jones 0    gurobi_cl 9.5.0   20656 machine1 [::1]
The jobs command only shows jobs that are currently running. To obtain
information on jobs that were processed recently, run the job recent
command:
> grbcluster job recent
JOBID    ADDRESS STATUS    STIME               USER  OPT     API
58780a22 server1 COMPLETED 2019-04-07 14:36:54 jones OPTIMAL gurobi_cl
The information displayed by the jobs and job recent commands can be
changed using the --view flag. The default view for the two commands is
the status view. The available views are:
status - List all jobs and their statuses
model - List all jobs, and include information about the models solved
simplex - List jobs that used the SIMPLEX algorithm
barrier - List jobs that used the BARRIER algorithm
mip - List jobs that used the MIP algorithm
For example, the model view gives details about the model, including
the number of rows, columns, and non-zeros in the constraint matrix:
> grbcluster job recent --view=model
JOBID    STATUS    STIME               ROWS COLS NONZ ALG OBJ DURATION
58780a22 COMPLETED 2019-04-07 14:36:54 331  45   1034 MIP 30  4.901s
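A row of the model view can be turned into typed values for further processing, for instance to compare model sizes or solve times across jobs. This is a sketch under stated assumptions: the helper names are hypothetical, every column is assumed to be non-empty, and only the two duration formats that appear in this document (a value ending in "s" or in "ms") are handled.

```python
def parse_duration(text):
    """Convert a DURATION value such as '4.901s' or '319ms' to seconds."""
    # Check the "ms" suffix first, since it also ends in "s".
    if text.endswith("ms"):
        return float(text[:-2]) / 1000.0
    if text.endswith("s"):
        return float(text[:-1])
    raise ValueError(f"unsupported duration format: {text!r}")


def parse_model_row(row):
    """Parse one data row of the model view into a dictionary."""
    fields = row.split()
    # STIME contributes two tokens (date and time), hence 10 tokens in total.
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}")
    return {
        "JOBID": fields[0],
        "STATUS": fields[1],
        "STIME": f"{fields[2]} {fields[3]}",
        "ROWS": int(fields[4]),
        "COLS": int(fields[5]),
        "NONZ": int(fields[6]),
        "ALG": fields[7],
        "OBJ": float(fields[8]),
        "DURATION": parse_duration(fields[9]),
    }
```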
To get an explanation of the meanings of the different fields within a
view, add the --describe flag. For example:
> grbcluster job recent --view=model --describe
JOBID - Unique job ID, use --long to display full ID
STATUS - Job status
STIME - Job status updated time
ROWS - Number of rows
COLS - Number of columns
NONZ - Number of non zero
ALG - Algorithm MIP, SIMPLEX or BARRIER
OBJ - Best objective
DURATION - Solve duration
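The --describe output has a regular "NAME - description" shape, so a script can load it into a lookup table, for example to label columns in a report. The helper below is a hypothetical sketch that assumes each line separates the field name from its description with " - ", as shown above.

```python
def parse_field_descriptions(text):
    """Map each field name in --describe output to its description."""
    descriptions = {}
    for line in text.splitlines():
        # Split on the first " - " only; descriptions may contain hyphens.
        name, separator, description = line.partition(" - ")
        if separator:
            descriptions[name.strip()] = description.strip()
    return descriptions
```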
For a Mixed-Integer Program (MIP), the mip view provides progress
information for the branch-and-cut tree. For example:
> grbcluster job recent --view=mip
JOBID    STATUS    STIME               OBJBST OBJBND NODCNT SOLCNT CUTCNT NODLFT
58780a22 COMPLETED 2019-04-07 14:36:54 30     30     54868  4      19     0
Again, --describe explains the meanings of the different fields:
> grbcluster job recent --view=mip --describe
JOBID - Unique job ID, use --long to display full ID
STATUS - Job status
STIME - Job status updated time
OBJBST - Current best objective
OBJBND - Current best objective bound
NODCNT - Current explored node count
SOLCNT - Current count of feasible solutions found
CUTCNT - Current count of cutting planes applied
NODLFT - Current unexplored node count
Note that the jobs command provides live status information. For
example, it shows current MIP progress information while the solve is
in progress.
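The OBJBST and OBJBND columns of the mip view are enough to estimate how far a solve is from proving optimality. The sketch below computes the relative gap as |OBJBND - OBJBST| / |OBJBST|, which is how Gurobi defines the relative MIP gap. The helper names are hypothetical, and the parser assumes every column holds a plain number, which may not be the case before a first feasible solution is found.

```python
def relative_gap(objbst, objbnd):
    """Relative MIP gap: |bound - incumbent| / |incumbent|."""
    if objbst == 0.0:
        # Zero incumbent: the gap is 0 if the bound is also 0, else infinite.
        return 0.0 if objbnd == 0.0 else float("inf")
    return abs(objbnd - objbst) / abs(objbst)


def parse_mip_row(row):
    """Parse one data row of the mip view and attach the relative gap."""
    fields = row.split()
    # STIME contributes two tokens (date and time), hence 10 tokens in total.
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}")
    objbst, objbnd = float(fields[4]), float(fields[5])
    return {
        "JOBID": fields[0],
        "STATUS": fields[1],
        "OBJBST": objbst,
        "OBJBND": objbnd,
        "NODCNT": int(fields[6]),
        "SOLCNT": int(fields[7]),
        "CUTCNT": int(fields[8]),
        "NODLFT": int(fields[9]),
        "GAP": relative_gap(objbst, objbnd),
    }
```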
The other views (simplex and barrier) are similar, although they
provide slightly different information.
Accessing Job Logs#
Gurobi Optimizer log output from a previous or currently running job can
be retrieved by using the job log command. For example:
> grbcluster job log 58780a22
info : Found matching job is 58780a22-8acc-499e-b73c-da6f2df59669
Flow cover: 20
Zero half: 4
Explored 54868 nodes (381866 simplex iterations) in 4.27 seconds
Thread count was 4 (of 4 available processors)
Solution count 4: 30 31 32 33
Optimal solution found (tolerance 1.00e-04)
Best objective 3.000000000000e+01, best bound 3.000000000000e+01, gap 0.0000%
The argument to this command is the JOBID for the job of interest
(which can be retrieved using the jobs command). You can use the full ID
or the short ID. If you don’t specify a JOBID, the command will display
the log for the last job submitted.
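A script that retrieves a log often only needs the final summary line. The following sketch extracts the objective, bound, and gap from a line of the form "Best objective ..., best bound ..., gap ...%", as it appears at the end of the log above. The function name is hypothetical, and the function returns None when no such line with numeric values is present.

```python
import re

# Matches the final summary line of a Gurobi log, as shown above.
SUMMARY_RE = re.compile(r"Best objective (\S+), best bound (\S+), gap (\S+)%")


def parse_log_summary(log_text):
    """Return (objective, bound, gap_percent) from a job log, or None."""
    match = SUMMARY_RE.search(log_text)
    if match is None:
        return None
    try:
        objective, bound, gap = (float(value) for value in match.groups())
    except ValueError:
        # One of the values was not numeric.
        return None
    return objective, bound, gap
```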
The job log command accepts the following flags:
Usage:
grbcluster job log <JOBID> [flags]
Flags can be set using --flag=value or the short form -f=value if defined.
A boolean flag can be enabled using --flag or the short form -f if defined.
Flags:
-b, --begin Display log from the beginning
-f, --continuous Display log continuously until job completion
-h, --help help for log
-n, --lines int Display only the last n lines (default 10)
For example, to get the entire log, from the beginning of the job, use
the -b (or --begin) flag:
> grbcluster job log 58780a22 -b
You can get a continuous feed of the log for a running optimization job
with the -f (or --continuous) flag.
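If you call job log from a script, assembling the argument list in one place keeps the flag spelling consistent. The sketch below uses only the flags listed in the usage text above (--begin, --continuous, and --lines); the function itself is a hypothetical helper, not part of grbcluster.

```python
def job_log_command(job_id=None, begin=False, continuous=False, lines=None):
    """Build the argument list for `grbcluster job log`."""
    command = ["grbcluster", "job", "log"]
    if job_id is not None:
        # Without a JOBID, the log of the last submitted job is displayed.
        command.append(job_id)
    if begin:
        command.append("--begin")
    if continuous:
        command.append("--continuous")
    if lines is not None:
        if lines < 1:
            raise ValueError("lines must be a positive integer")
        command.append(f"--lines={lines}")
    return command
```

The resulting list can be passed to subprocess.run. With continuous=True the call returns only once the job completes.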
Accessing Job Parameters#
The Gurobi Optimizer provides a number of parameters that can be
modified by the user. The job params command allows you to inspect the
values of these parameters in a Compute Server job:
> grbcluster job params 58780a22
info : Found matching job is 58780a22-8acc-499e-b73c-da6f2df59669
ComputeServer=server1
TimeLimit=60
The argument to this command is the JOBID for the job of interest
(which can be retrieved using the jobs command). You can use the full ID
or the short ID. If you don’t specify a JOBID, the command will display
the changed parameters of the last job submitted.
The following example illustrates how the grbcluster job params
command can be used in practice. The first step is to start an
optimization job on a Compute Server cluster with one modified
parameter:
> gurobi_cl TimeLimit=120 glass4.mps
Once the job starts, you can use the grbcluster jobs command to
retrieve the associated JOBID (or you can read it off from the output of
gurobi_cl). For jobs that have already been processed, you would run the
job recent command instead.
> grbcluster jobs
JOBID    ADDRESS STATUS  #Q STIME               USER  PRIO API
7c51bf74 server1 RUNNING    2019-04-07 14:50:56 jones 0    gurobi_cl
Once you obtain the JOBID, the job params command shows the modified
parameter settings for the job:
> grbcluster job params 7c51bf74
info : Found matching job is 7c51bf74-ba02-4239-875e-c8ea388f9427
ComputeServer=server1
TimeLimit=120
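Because the output is a list of Name=Value lines, it is easy to load into a dictionary, for example to check that a job really ran with the settings you intended. The helper below is a hypothetical sketch; it assumes that informational lines start with "info" and that every parameter appears on its own line, as in the output above. Values are kept as strings.

```python
def parse_job_params(text):
    """Return the changed parameters of a job as a name-to-value mapping."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and messages such as "info : Found matching job".
        if not line or line.startswith("info"):
            continue
        name, separator, value = line.partition("=")
        if separator:
            params[name] = value
    return params
```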
The full list of Gurobi parameters can be found in the Parameters section of the Gurobi Reference Manual.
Aborting Jobs#
Jobs that are running on a Compute Server can be aborted by using the
job abort command. For example:
> grbcluster job abort e7022667
The following steps illustrate how you would start and subsequently
abort a job. First, use the Gurobi command-line tool (gurobi_cl) to
start a long-running optimization job on your Compute Server:
> gurobi_cl glass4.mps
Once the job starts, you can use the grbcluster jobs command to
retrieve the associated JOBID (or you can read it off from the output of
gurobi_cl):
> grbcluster jobs
JOBID    ADDRESS STATUS  #Q STIME               PRIO
8f9b15d9 server1 RUNNING    2017-10-10 17:30:33 0
The full or short JOBID can be used to abort the job as follows:
> grbcluster job abort 8f9b15d9
If no JOBID is specified, the most recently started job will be
aborted.
After the abort command is issued, the status of the job can be
retrieved using the job recent command:
> grbcluster job recent
JOBID    ADDRESS STATUS  STIME               USER  OPT
8f9b15d9 server1 ABORTED 2017-10-10 17:41:33 user1 OPTIMAL
As you can see, the status of the job has changed to ABORTED.
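The listing and abort commands can be combined into a simple watchdog that aborts jobs whose last status update is older than a chosen limit. The sketch below only selects the job IDs and builds the abort command; both helpers are hypothetical. It assumes that STIME is printed as "YYYY-MM-DD HH:MM:SS" and is expressed in the same time zone as the now value you pass in, which you should verify for your cluster.

```python
import re
from datetime import datetime, timedelta

# STIME values in the listings above look like "2017-10-10 17:30:33".
STIME_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def stale_job_ids(listing, now, max_age):
    """Return IDs of listed jobs whose STIME is more than max_age before now."""
    stale = []
    for line in listing.splitlines():
        fields = line.split()
        match = STIME_RE.search(line)
        # The header row and blank lines contain no timestamp.
        if not fields or match is None:
            continue
        stime = datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S")
        if now - stime > max_age:
            stale.append(fields[0])
    return stale


def abort_command(job_id):
    """Build the argument list for `grbcluster job abort <JOBID>`."""
    return ["grbcluster", "job", "abort", job_id]
```

A caller would feed the output of grbcluster jobs to stale_job_ids and then run abort_command(job_id) for each returned ID, for example with subprocess.run.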
Accessing the Job History#
The job history is only supported by the Cluster Manager. While the nodes have a limited recent history of jobs, the Cluster Manager is able to record the jobs and the logs over a longer period of time. The exact lifespan is a configuration parameter that can be set by the system administrator. History information is persistent and can be accessed even if the cluster nodes are not available.
By default, the job history command will display the most recent jobs
in your cluster:
> grbcluster job history
JOBID    BATCHID  ADDRESS       STATUS    STIME               USER  OPT     API
910878b9 4aba4ad3 server1:61000 COMPLETED 2019-09-22 15:55:24 jones OPTIMAL grbcluster
594d82fc 9bc34333 server1:61000 ABORTED   2019-09-22 15:52:36 admin         grbcluster
ce7ab3a4 2e05810c server1:61000 COMPLETED 2019-09-22 14:20:14 jones OPTIMAL grbcluster
66d4783b ada0a345 server1:61000 COMPLETED 2019-09-22 14:17:58 admin OPTIMAL grbcluster
In addition, flags are available to query the history by status,
username, time period, application name, and more. For example, the
following command lists the last two jobs that were ABORTED by the user
gurobi:
> grbcluster job history --status=ABORTED --length=2 --username=gurobi
JOBID    BATCHID  ADDRESS       STATUS  STIME               USER   OPT API
80334e25 5a4764cc server1:61000 ABORTED 2019-09-20 16:53:23 gurobi     Python
0d6d140b f94228b6 server1:61000 ABORTED 2019-09-16 22:04:09 gurobi     Python
The job history command gives you access to the same views as the
job recent command, but with more filters. For example, the following
command shows the model characteristics of the last two jobs that were
COMPLETED by the application app1:
> grbcluster job history --status=COMPLETED --length=2 --view=model --app=app1
JOBID    STATUS    STIME               ROWS COLS NONZ ALG OBJ DURATION
b9063f12 COMPLETED 2019-09-19 11:22:18 13   9    45   MIP 5   319ms
964a9405 COMPLETED 2019-09-19 11:22:17 13   9    45   MIP 5   126ms
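Since history queries combine several optional filters, a small builder function can keep scripted queries readable. The sketch below covers only the flags used in the examples above (--status, --length, --username, --view, and --app); the function is a hypothetical helper, and other filters offered by the command are not modeled.

```python
def job_history_command(status=None, length=None, username=None, view=None, app=None):
    """Build the argument list for `grbcluster job history` with filters."""
    command = ["grbcluster", "job", "history"]
    # Each filter is added only when the caller supplies a value.
    if status is not None:
        command.append(f"--status={status}")
    if length is not None:
        command.append(f"--length={length}")
    if username is not None:
        command.append(f"--username={username}")
    if view is not None:
        command.append(f"--view={view}")
    if app is not None:
        command.append(f"--app={app}")
    return command
```

For example, job_history_command(status="ABORTED", length=2, username="gurobi") reproduces the first filtered query shown above, and the list can be passed directly to subprocess.run.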