Job Commands

In this section, we will review the most important commands to manage your jobs. We assume that the system administrator has installed the cluster and that you have successfully executed the grbcluster login command with the appropriate flags to access your cluster.

Submitting Interactive Jobs

The Gurobi command-line tool gurobi_cl can be used to submit optimization jobs once you have successfully executed the grbcluster login command. This is the example we used earlier in this document:

> gurobi_cl ResultFile=solution.sol stein9.mps
Using license file /opt/gurobi950/manager.lic
Set parameter CSManager to value "http://server1:61080"
Set parameter LogFile to value "gurobi.log"
Compute Server job ID: 1e9c304c-a5f2-4573-affa-ab924d992f7e
Capacity available on 'server1:61000' - connecting...
Established HTTP unencrypted connection

Gurobi Optimizer version 10.0.3 build v10.0.3rc0 (linux64)
Copyright (c) 2022, Gurobi Optimization, LLC
...
Gurobi Compute Server Worker version 10.0.3 build v10.0.3rc0 (linux64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
...
Optimal solution found (tolerance 1.00e-04)
Best objective 5.000000000000e+00, best bound 5.000000000000e+00, gap 0.0000%

Compute Server communication statistics:
  Sent: 0.002 MBytes in 9 msgs and 0.01s (0.26 MB/s)
  Received: 0.007 MBytes in 26 msgs and 0.09s (0.08 MB/s)

The initial log output indicates that a Compute Server job was created, that the Compute Server cluster had capacity available to run that job, and that an unencrypted HTTP connection was established with a server in that cluster. The log concludes with statistics about the communication performed between the client machine and the Compute Server. Note that the result file solution.sol is also retrieved.

This is an interactive optimization task because the connection with the job must be kept alive and the progress messages are displayed in real time. Stopping or killing the command also terminates the job.
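When gurobi_cl is launched from a script, it can be handy to capture the job ID from its output so that later grbcluster commands can refer to the job. The following is a minimal sketch, assuming the log line has the "Compute Server job ID: ..." form shown in the example above; the function name extract_job_id is hypothetical and not part of any Gurobi tool.

```python
import re

# Pattern assumed from the example log line "Compute Server job ID: <uuid>".
JOB_ID_PATTERN = re.compile(
    r"^Compute Server job ID:\s*([0-9a-f-]{36})\s*$", re.MULTILINE
)


def extract_job_id(log_text):
    """Return the full job ID found in gurobi_cl output, or None if absent."""
    match = JOB_ID_PATTERN.search(log_text)
    return match.group(1) if match else None


# Excerpt of the gurobi_cl output shown above.
SAMPLE_LOG = """\
Set parameter CSManager to value "http://server1:61080"
Set parameter LogFile to value "gurobi.log"
Compute Server job ID: 1e9c304c-a5f2-4573-affa-ab924d992f7e
Capacity available on 'server1:61000' - connecting...
"""

print(extract_job_id(SAMPLE_LOG))
```

In a real script, the text would come from the captured standard output of the gurobi_cl process rather than from a literal string.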

Listing Jobs

The optimization jobs running on a Compute Server cluster can be listed using the jobs command:

> grbcluster jobs
JOBID    ADDRESS  STATUS  #Q  STIME               USER  PRIO API
58780a22 server1  RUNNING     2019-04-07 14:36:49 jones 0    gurobi_cl

The jobs command is actually a shortcut for the job list command.

> grbcluster job list
JOBID    ADDRESS  STATUS  #Q  STIME               USER  PRIO API
58780a22 server1  RUNNING     2019-04-07 14:36:49 jones 0    gurobi_cl
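If you want to process this listing in a script, one option is to split each row at the column offsets of the header line. The sketch below assumes the rows are aligned with the header, as in the short listing above; it may not hold for every view or flag. The helper name parse_table is hypothetical.

```python
import re


def parse_table(text):
    """Parse fixed-width grbcluster table output into a list of dicts.

    Assumes every row is aligned with the header columns; empty cells
    (such as #Q in the sample) become empty strings.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0]
    # Start offset and name of each column, taken from the header line.
    columns = [(m.start(), m.group()) for m in re.finditer(r"\S+", header)]
    rows = []
    for line in lines[1:]:
        row = {}
        for i, (start, name) in enumerate(columns):
            end = columns[i + 1][0] if i + 1 < len(columns) else None
            row[name] = line[start:end].strip()
        rows.append(row)
    return rows


# Sample output copied from the listing above.
SAMPLE = (
    "JOBID    ADDRESS  STATUS  #Q  STIME               USER  PRIO API\n"
    "58780a22 server1  RUNNING     2019-04-07 14:36:49 jones 0    gurobi_cl\n"
)

for job in parse_table(SAMPLE):
    print(job["JOBID"], job["STATUS"], job["STIME"], job["USER"])
```

Offsets are used instead of a plain whitespace split because STIME contains a space and #Q can be empty.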

You can get more information by using the --long flag. This flag also displays the complete, unique job ID instead of the short ID:

> grbcluster jobs --long
JOBID        ADDRESS  STATUS  #Q  STIME               USER  PRIO API       RUNTIME PID   HOST   IP
58780a22-... server1  RUNNING     2019-04-07 14:36:49 jones 0    gurobi_cl 9.5.0   20656 machine1 [::1]

The jobs command only shows jobs that are currently running. To obtain information on jobs that were processed recently, run the job recent command:

> grbcluster job recent
JOBID    ADDRESS  STATUS    STIME               USER  OPT         API
58780a22 server1  COMPLETED 2019-04-07 14:36:54 jones OPTIMAL     gurobi_cl

The information displayed by the jobs and job recent commands can be changed using the --view flag. The default view for the two commands is the status view. Alternatives are:

status   - List all jobs and their statuses
model    - List all jobs, and include information about the models solved
simplex  - List jobs that used the SIMPLEX algorithm
barrier  - List jobs that used the BARRIER algorithm
mip      - List jobs that used the MIP algorithm

For example, the model view gives details about the model, including the number of rows, columns and non-zeros in the constraint matrix:

> grbcluster job recent --view=model
JOBID    STATUS    STIME               ROWS COLS NONZ ALG OBJ  DURATION
58780a22 COMPLETED 2019-04-07 14:36:54 331  45   1034 MIP 30   4.901s

To get an explanation of the meanings of the different fields within a view, add the --describe flag. For example:

> grbcluster job recent --view=model --describe
JOBID     - Unique job ID, use --long to display full ID
STATUS    - Job status
STIME     - Job status updated time
ROWS      - Number of rows
COLS      - Number of columns
NONZ      - Number of non zero
ALG       - Algorithm MIP, SIMPLEX or BARRIER
OBJ       - Best objective
DURATION  - Solve duration

For a Mixed-Integer Program (MIP), the mip view provides progress information for the branch-and-cut tree. For example:

> grbcluster job recent --view=mip
JOBID    STATUS    STIME               OBJBST OBJBND NODCNT SOLCNT CUTCNT NODLFT
58780a22 COMPLETED 2019-04-07 14:36:54 30     30     54868  4      19     0

Again, --describe explains the meanings of the different fields:

> grbcluster job recent --view mip --describe
JOBID     - Unique job ID, use --long to display full ID
STATUS    - Job status
STIME     - Job status updated time
OBJBST    - Current best objective
OBJBND    - Current best objective bound
NODCNT    - Current explored node count
SOLCNT    - Current count of feasible solutions found
CUTCNT    - Current count of cutting planes applied
NODLFT    - Current unexplored node count
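The --describe output is easy to turn into a lookup table, for instance to label columns in a report. This is a minimal sketch, assuming one "NAME - description" entry per line as in the output above; parse_describe is a hypothetical helper name.

```python
def parse_describe(text):
    """Map each field name to its description.

    Assumes one "NAME - description" entry per line; lines without
    the " - " separator are ignored.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, description = line.partition(" - ")
        if sep:
            fields[name.strip()] = description.strip()
    return fields


# Excerpt of the --describe output shown above.
SAMPLE_DESCRIBE = """\
JOBID     - Unique job ID, use --long to display full ID
OBJBST    - Current best objective
OBJBND    - Current best objective bound
NODCNT    - Current explored node count
"""

for name, description in parse_describe(SAMPLE_DESCRIBE).items():
    print(f"{name}: {description}")
```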

Note that the jobs command provides live status information, so you can, for example, see current MIP progress while a solve is in progress.

The other views (simplex and barrier) are similar, although of course they provide slightly different information.

Accessing Job Logs

Gurobi Optimizer log output from a previous or currently running job can be retrieved by using the job log command. For example:

> grbcluster job log 58780a22
info  : Found matching job is 58780a22-8acc-499e-b73c-da6f2df59669
  Flow cover: 20
  Zero half: 4

Explored 54868 nodes (381866 simplex iterations) in 4.27 seconds
Thread count was 4 (of 4 available processors)

Solution count 4: 30 31 32 33

Optimal solution found (tolerance 1.00e-04)
Best objective 3.000000000000e+01, best bound 3.000000000000e+01, gap 0.0000%

The argument to this command is the JOBID for the job of interest (which can be retrieved using the jobs command). You can use the full ID or the short ID. If you don’t specify a JOBID, the command will display the log for the last job submitted.
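Once a log has been retrieved, the final summary line can be parsed to recover the objective, bound, and gap. The sketch below assumes the line has the exact form shown in the example log; logs from jobs that ended without a solution may look different, in which case the hypothetical helper parse_result_line returns None.

```python
import re

# Pattern assumed from the summary line in the example log above.
RESULT_PATTERN = re.compile(
    r"Best objective ([-+0-9.eE]+), best bound ([-+0-9.eE]+), gap ([0-9.]+)%"
)


def parse_result_line(log_text):
    """Return objective, bound and gap (in percent) as floats, or None."""
    match = RESULT_PATTERN.search(log_text)
    if match is None:
        return None
    objective, bound, gap = (float(group) for group in match.groups())
    return {"objective": objective, "bound": bound, "gap_percent": gap}


# Tail of the job log shown above.
SAMPLE_LOG_TAIL = """\
Optimal solution found (tolerance 1.00e-04)
Best objective 3.000000000000e+01, best bound 3.000000000000e+01, gap 0.0000%
"""

print(parse_result_line(SAMPLE_LOG_TAIL))
```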

The job log command accepts the following arguments:

Usage:
  grbcluster job log <JOBID> [flags]

Flags can be set using --flag=value or the short form -f=value if defined.
A boolean flag can be enabled using --flag or the short form -f if defined.

Flags:
  -b, --begin        Display log from the beginning
  -f, --continuous   Display log continuously until job completion
  -h, --help         help for log
  -n, --lines int    Display only the last n lines (default 10)

For example, to get the entire log, from the beginning of the job, use the -b (or --begin) flag:

> grbcluster job log 58780a22 -b

You can get a continuous feed of the log for a running optimization job with the -f (or --continuous) flag.
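When calling job log from a script, the argument list can be assembled from the flags in the usage text above. This is a sketch only: job_log_args is a hypothetical helper, and it uses just the long flag names listed in the usage output (--begin, --continuous, --lines).

```python
import shlex


def job_log_args(job_id=None, begin=False, continuous=False, lines=None):
    """Build the argument list for a "grbcluster job log" call.

    Only flags listed in the usage text are emitted; when job_id is
    omitted, the command falls back to the last job submitted.
    """
    args = ["grbcluster", "job", "log"]
    if job_id:
        args.append(job_id)
    if begin:
        args.append("--begin")
    if continuous:
        args.append("--continuous")
    if lines is not None:
        args.append(f"--lines={lines}")
    return args


print(shlex.join(job_log_args("58780a22", begin=True)))
print(shlex.join(job_log_args("58780a22", lines=20)))
```

The returned list can be passed to subprocess.run on a machine where grbcluster is installed and logged in.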

Accessing Job Parameters

The Gurobi Optimizer provides a number of parameters that can be modified by the user. The job params command allows you to inspect the values of these parameters in a Compute Server job:

> grbcluster job params 58780a22
info  : Found matching job is 58780a22-8acc-499e-b73c-da6f2df59669
ComputeServer=server1
TimeLimit=60

The argument to this command is the JOBID for the job of interest (which can be retrieved using the jobs command). You can use the full ID or the short ID. If you don’t specify a JOBID, the command will display the changed parameters of the last job submitted.

The following example illustrates how the grbcluster job params command can be used in practice. The first step is to start an optimization job on a Compute Server cluster with one modified parameter:

> gurobi_cl TimeLimit=120 glass4.mps

Once the job starts, you can use the grbcluster jobs command to retrieve the associated JOBID (or you can read it off from the output of gurobi_cl). For jobs that have already been processed, you would run the job recent command instead.

> grbcluster jobs
JOBID    ADDRESS STATUS  #Q  STIME               USER  PRIO API
7c51bf74 server1 RUNNING     2019-04-07 14:50:56 jones 0    gurobi_cl

Once you obtain the JOBID, the job params command shows the modified parameter settings for the job:

> grbcluster job params 7c51bf74
info  : Found matching job is 7c51bf74-ba02-4239-875e-c8ea388f9427
ComputeServer=server1
TimeLimit=120
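The NAME=value lines printed by job params are straightforward to load into a dictionary, for example to check that a job ran with the intended settings. This is a minimal sketch, assuming the output format shown above, with an "info" message followed by one parameter per line; parse_job_params is a hypothetical helper, and all values are kept as strings.

```python
def parse_job_params(text):
    """Collect NAME=value lines into a dict, leaving values as strings."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and status messages such as "info  : ...".
        if not line or line.startswith("info"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            params[name] = value
    return params


# Output copied from the example above.
SAMPLE_PARAMS = """\
info  : Found matching job is 7c51bf74-ba02-4239-875e-c8ea388f9427
ComputeServer=server1
TimeLimit=120
"""

params = parse_job_params(SAMPLE_PARAMS)
print(params)
```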

The full list of Gurobi parameters can be found in the Parameters section of the Gurobi Reference Manual.

Aborting Jobs

Jobs that are running on a Compute Server can be aborted by using the job abort command. For example:

> grbcluster job abort e7022667

The following steps illustrate how you would start and subsequently abort a job. First, use the Gurobi command-line tool (gurobi_cl) to start a long-running optimization job on your Compute Server:

> gurobi_cl glass4.mps

Once the job starts, you can use the grbcluster jobs command to retrieve the associated JOBID (or you can read it off from the output of gurobi_cl):

> grbcluster jobs
JOBID    ADDRESS STATUS  #Q  STIME               PRIO
8f9b15d9 server1 RUNNING     2017-10-10 17:30:33 0

The full or short JOBID can be used to abort the job as follows:

> grbcluster job abort 8f9b15d9

If no JOBID is specified, the most recently started job will be aborted.

After the abort command is issued, the status of the job can be retrieved using the job recent command:

> grbcluster job recent
JOBID    ADDRESS  STATUS  STIME               USER  OPT
8f9b15d9 server1  ABORTED 2017-10-10 17:41:33 user1 OPTIMAL

As you can see, the status of the job has changed to ABORTED.
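A script can perform the same check by looking up the job's row in the job recent output. The sketch below assumes that JOBID, ADDRESS, and STATUS are the first three whitespace-separated columns, as in the listing above; job_status is a hypothetical helper name.

```python
def job_status(table_text, job_id):
    """Return the STATUS of job_id, or None if the job is not listed.

    Accepts a short or full job ID, and relies on JOBID, ADDRESS and
    STATUS being the first three whitespace-separated columns.
    """
    for line in table_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 3:
            continue
        listed_id = fields[0]
        if listed_id.startswith(job_id) or job_id.startswith(listed_id):
            return fields[2]
    return None


# Output copied from the example above.
SAMPLE_RECENT = (
    "JOBID    ADDRESS  STATUS  STIME               USER  OPT\n"
    "8f9b15d9 server1  ABORTED 2017-10-10 17:41:33 user1 OPTIMAL\n"
)

print(job_status(SAMPLE_RECENT, "8f9b15d9"))
```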

Accessing the Job History

The job history is only supported by the Cluster Manager. While the nodes have a limited recent history of jobs, the Cluster Manager is able to record the jobs and the logs over a longer period of time. The exact lifespan is a configuration parameter that can be set by the system administrator. History information is persistent and can be accessed even if the cluster nodes are not available.

By default, the job history command displays the most recent jobs in your cluster:

> grbcluster job history
JOBID    BATCHID  ADDRESS       STATUS    STIME               USER   OPT     API
910878b9 4aba4ad3 server1:61000 COMPLETED 2019-09-22 15:55:24 jones  OPTIMAL grbcluster
594d82fc 9bc34333 server1:61000 ABORTED   2019-09-22 15:52:36 admin          grbcluster
ce7ab3a4 2e05810c server1:61000 COMPLETED 2019-09-22 14:20:14 jones  OPTIMAL grbcluster
66d4783b ada0a345 server1:61000 COMPLETED 2019-09-22 14:17:58 admin  OPTIMAL grbcluster

In addition, flags are available to query the history by status, username, time period, application name, and more. For example, the following command lists the last two ABORTED jobs of the user gurobi:

> grbcluster job history --status=ABORTED --length=2 --username=gurobi
JOBID    BATCHID  ADDRESS       STATUS  STIME               USER   OPT API
80334e25 5a4764cc server1:61000 ABORTED 2019-09-20 16:53:23 gurobi     Python
0d6d140b f94228b6 server1:61000 ABORTED 2019-09-16 22:04:09 gurobi     Python

The job history command gives you access to the same views as the job recent command, but with more filters. For example, the following command shows the model characteristics of the last two COMPLETED jobs of the application app1:

> grbcluster job history --status=COMPLETED --length=2  --view=model --app=app1
JOBID    STATUS    STIME               ROWS COLS NONZ ALG OBJ DURATION
b9063f12 COMPLETED 2019-09-19 11:22:18 13   9    45   MIP 5   319ms
964a9405 COMPLETED 2019-09-19 11:22:17 13   9    45   MIP 5   126ms
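History queries like the ones above can be assembled programmatically. This sketch uses only the filters that appear in the examples (--status, --length, --username, --app, --view); other filters may exist but are not covered here. The helper name job_history_args is hypothetical.

```python
import shlex


def job_history_args(status=None, length=None, username=None, app=None, view=None):
    """Build the argument list for a "grbcluster job history" call.

    Only filters that were actually given are added, in --flag=value form.
    """
    args = ["grbcluster", "job", "history"]
    options = [
        ("status", status),
        ("length", length),
        ("username", username),
        ("app", app),
        ("view", view),
    ]
    for name, value in options:
        if value is not None:
            args.append(f"--{name}={value}")
    return args


print(shlex.join(job_history_args(status="ABORTED", length=2, username="gurobi")))
print(shlex.join(job_history_args(status="COMPLETED", length=2, view="model", app="app1")))
```

As with the other sketches, the list is meant to be handed to subprocess.run on a machine that is logged in to the Cluster Manager.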