Installing a Cluster Node¶
Once the Remote Services package is installed, you will need to set up a license, if necessary. Then, the Remote Services agent must be configured and started as a standard process or as a service. Finally, you should verify your installation.
Licensing¶
You will need to download and install a license file on all Compute Server nodes (no license file is required for a Distributed Worker node). You will find detailed instructions for downloading a license in the section How do I retrieve and set up a Gurobi license of the Getting Started Knowledge Base article.
We will just provide a quick summary of the process here. Your first step is to locate and download your license file from the Gurobi License Center. When you download the license file, we strongly recommend that you place it in the default location:
- C:\gurobi\ on Windows
- /opt/gurobi/ on Linux
- /Library/gurobi/ on macOS
- The user’s home directory
You can also set the environment variable GRB_LICENSE_FILE to point to this file.
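As an illustration only, the following Python sketch lists the places where a license file could be looked for, based on the locations above. It is not Gurobi code: the function name is hypothetical, the file name gurobi.lic is assumed, and the search order (environment variable first, then the shared directory, then the home directory) is an assumption rather than documented behavior.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Default shared locations named in the text, keyed by platform.
DEFAULT_DIRS = {
    "windows": PureWindowsPath("C:/gurobi"),
    "linux": PurePosixPath("/opt/gurobi"),
    "macos": PurePosixPath("/Library/gurobi"),
}


def candidate_license_paths(platform_name, home_dir, environ):
    """Return candidate license file paths (hypothetical helper, assumed order)."""
    candidates = []
    explicit = environ.get("GRB_LICENSE_FILE")
    if explicit:
        # An explicit GRB_LICENSE_FILE setting is listed first.
        candidates.append(explicit)
    shared_dir = DEFAULT_DIRS[platform_name]
    candidates.append(str(shared_dir / "gurobi.lic"))
    # Build the home-directory path with the same path flavor as the platform.
    path_type = type(shared_dir)
    candidates.append(str(path_type(home_dir) / "gurobi.lic"))
    return candidates
```

Passing the environment as a plain dictionary keeps the helper easy to try out without changing your real environment.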
In order to use the Cluster Manager, you will need to connect at least one Compute Server node to the cluster. When certain operations are requested such as submitting a job or a batch, the Cluster Manager will check the licenses available on the nodes. If none of the nodes have a valid Compute Server license, the operation will not be authorized.
Remote Services Agent (grb_rs)¶
To form a Remote Services cluster, you need to run the Remote Services
agent (grb_rs) on all of the nodes that make up the cluster. These
agents communicate amongst themselves, and also with the Cluster Manager
or the clients.
The primary task of the Remote Services agents is to collectively manage the queueing and the execution of jobs. The agents work together to balance the load by assigning a new job to the node with the fewest running jobs whenever possible. If all nodes are at capacity, newly submitted jobs will be queued, and the first node with available capacity will later execute the job. If a new node is added to the cluster, it will immediately start processing queued jobs.
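The queueing behavior described above can be pictured with a small Python model. This is only a sketch under simplifying assumptions: the class and method names are hypothetical, each node has a single fixed job limit, and ties between equally loaded nodes are broken by node name. It does not reflect how the agents are actually implemented.

```python
from collections import deque


class Cluster:
    """Toy model: send each job to the least-loaded node, otherwise queue it."""

    def __init__(self):
        self.nodes = {}       # node name -> {"limit": int, "running": set of job ids}
        self.queue = deque()  # jobs waiting for capacity, oldest first

    def add_node(self, name, job_limit):
        self.nodes[name] = {"limit": job_limit, "running": set()}
        self._drain_queue()   # a new node immediately picks up queued jobs

    def submit(self, job_id):
        self.queue.append(job_id)
        self._drain_queue()

    def finish(self, name, job_id):
        self.nodes[name]["running"].discard(job_id)
        self._drain_queue()   # freed capacity goes to the oldest queued job

    def _drain_queue(self):
        while self.queue:
            free = [
                (len(node["running"]), name)
                for name, node in self.nodes.items()
                if len(node["running"]) < node["limit"]
            ]
            if not free:
                return        # all nodes are at capacity; jobs stay queued
            _, name = min(free)  # fewest running jobs; ties broken by name
            self.nodes[name]["running"].add(self.queue.popleft())
```

For example, with one node limited to a single job, a second submitted job waits in the queue until either the first job finishes or another node joins the cluster.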
The grb_rs executable provides several commands and flags to help in
the configuration and execution of the agent. We will review these
commands step by step in the following sections. You can see the full
list of commands in the reference section or by using the command-line
help:
> grb_rs --help
Configuring a Cluster Node¶
The Remote Services agent has a number of configuration properties that
affect its behavior. These can be controlled using a grb_rs.cnf
configuration file. The installation package includes a predefined
configuration file that can be used as a starting point
(<installdir>/bin/grb_rs.cnf).
The simplest way to modify the parameters is to edit the default
configuration file. Other options are available, though. The grb_rs
process uses the following precedence rules:
- First priority: command-line flag --config
- Second priority: a configuration file in the current directory
- Third priority: a configuration file in a shared directory (C:\gurobi, /opt/gurobi, /Library/gurobi for Windows, Linux and macOS platforms, respectively)
- Fourth priority: a configuration file in the directory where grb_rs is located
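The precedence rules can be sketched in a few lines of Python. This is an illustration, not the grb_rs implementation: the function name and its parameters are hypothetical, and it assumes that the file is named grb_rs.cnf in each searched directory and that a path given with --config is used without further checks.

```python
import os


def resolve_config(cli_config, current_dir, shared_dir, executable_dir,
                   is_file=os.path.isfile):
    """Return the configuration file chosen by the precedence rules, or None."""
    if cli_config:
        return cli_config  # first priority: the --config flag
    # Second to fourth priority: current, shared, then executable directory.
    for directory in (current_dir, shared_dir, executable_dir):
        candidate = os.path.join(directory, "grb_rs.cnf")
        if is_file(candidate):
            return candidate
    return None
```

The is_file parameter exists only so that the lookup can be tried against a made-up set of paths instead of the real file system.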
Most of the properties that are configured through this file are related
to communication options and job processing options. The configuration
file is only read once, when grb_rs first starts. Subsequent changes
to the file won’t affect parameter values on a running server.
Configuration file format¶
The configuration file contains a list of properties of the form
PROPERTY=value. Lines that begin with the # symbol are treated as
comments and are ignored. Here is an example:
# grb_rs.cnf configuration file
PORT=61000
MANAGER=http://mymanager:61080
While you could create this file from scratch, we recommend you start with the version of this file that is included with the product and modify it instead.
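To make the format concrete, here is a minimal Python sketch that reads such a file into a dictionary. It is not the parser used by grb_rs; the function name is hypothetical, and the handling of blank lines and surrounding whitespace is an assumption.

```python
def parse_config(text):
    """Parse PROPERTY=value lines, skipping blank lines and # comments."""
    properties = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        name, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"not a PROPERTY=value line: {raw_line!r}")
        properties[name.strip()] = value.strip()
    return properties
```

Applied to the example above, this yields a dictionary with the two keys PORT and MANAGER, both with string values.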
The command grb_rs properties lists all of the available properties,
along with their default values and documentation for each. Some
properties can be overridden on the command-line of grb_rs; in that
case, the name of the corresponding command-line flag is listed as well.
Some properties are important and must be changed for a production
deployment. However, we need to distinguish between deployment with a
Cluster Manager and without.
Important Properties with a Cluster Manager¶
When deploying a node with a Cluster Manager, the configuration is simpler, and you need to review the following properties:
- MANAGER
This is the URL of the manager.
- HOSTNAME
This must be the DNS name of the node that can be resolved from the other nodes or from the Cluster Manager. grb_rs tries to get a reasonable default value, but this value may still not be resolved by other nodes and could generate connection errors. In this case, you need to override this name in the configuration file with a fully qualified name of your node, for example:
HOSTNAME=server1
Note that you do not need to give addresses that can be resolved by clients because all communication is routed through the Cluster Manager. The nodes are never accessed directly by the clients.
- CLUSTER_TOKEN
The token is a private key that enables different nodes to join the same cluster. All nodes of a cluster and the Cluster Manager must have the same token. We recommend that you generate a brand new token when you set up your cluster. The grb_rs token command will generate a random token, which you can copy into the configuration file.
- JOBLIMIT
This property sets the maximum number of jobs that can run concurrently when using Compute Server on a specific node. The limit can be changed on a running cluster using the grbcluster node config --job-limit <new_limit> command, in which case the new value will persist and the value in the configuration file will be ignored from that point on (even if you stop and restart the cluster).
- HARDJOBLIMIT
Certain jobs (those with priority 100) are allowed to ignore the JOBLIMIT, but they aren’t allowed to ignore this limit. Client requests beyond this limit are queued, irrespective of their priority. This limit is set to 0 by default, which means that it is disabled and jobs with priority 100 will not be queued.
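The interaction between JOBLIMIT and HARDJOBLIMIT can be summarized with a short Python sketch. It is only an interpretation of the description above for a single node: the function name and its parameters are hypothetical, and the real agent may take other factors into account.

```python
def admission(running_jobs, priority, job_limit, hard_job_limit=0):
    """Decide whether a new job would run or be queued on one node (sketch)."""
    # A hard limit of 0 means the hard limit is disabled.
    if hard_job_limit > 0 and running_jobs >= hard_job_limit:
        return "queue"  # nothing bypasses the hard limit
    if priority == 100:
        return "run"    # priority 100 jobs ignore JOBLIMIT
    return "run" if running_jobs < job_limit else "queue"
```

For instance, with JOBLIMIT=10 and the default HARDJOBLIMIT=0, an eleventh job of ordinary priority would be queued, while an eleventh job of priority 100 would start right away.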
Important Properties without a Cluster Manager¶
When installing a node that will not be connected to a Cluster Manager, authentication of clients uses predefined passwords that must be stored in the configuration file. The default configuration file must be reviewed and the following properties must be changed for a production deployment:
- HOSTNAME
This must be the DNS name of the node that can be resolved from the other nodes or the clients in your network. grb_rs tries to get a reasonable default value, but this value may still not be resolved by clients and could generate connection errors. In this case, you need to override this name in the configuration file with a fully qualified name of your node, for example:
HOSTNAME=server1
If the names cannot be resolved by clients, another option is to use IP addresses directly; in this case, set this property to the IP address of the node.
- CLUSTER_TOKEN
The token is a private key that enables different nodes to join the same cluster. All nodes of a cluster must have the same token. We recommend that you generate a brand new token when you set up your cluster. The grb_rs token command will generate a random token, which you can copy into the configuration file.
- PASSWORD
This is the password that clients must supply in order to access the cluster. It can be stored in clear text or hashed. We recommend that you create your own password, and that you store it in hashed form. You can use the grb_rs hash command to compute the hashed value for your chosen password. Note that clients must provide the original password (not hashed); it is encrypted in transit if HTTPS is used.
> grb_rs hash newpass
$$ppEieKZExlBR-pCSUMlmc4oWlG8nZsUOE2IM0hJbzsmV_Yjj
Then copy and paste the value in the configuration file:
PASSWORD=$$ppEieKZExlBR-pCSUMlmc4oWlG8nZsUOE2IM0hJbzsmV_Yjj
The default password is pass.
- ADMINPASSWORD
This is the password that clients must supply in order to run restricted administrative job commands. It can be stored in clear text or hashed. We recommend that you create your own password, and that you store it in hashed form. You can use the grb_rs hash command to compute the hashed value for your chosen password. Note that clients must provide the original password (not hashed); it is encrypted in transit if HTTPS is used. The default password is admin.
- CLUSTER_ADMINPASSWORD
This is the password that clients must supply in order to run restricted administrative cluster commands. It can be stored in clear text or hashed. We recommend that you create your own password, and that you store it in hashed form. You can use the grb_rs hash command to compute the hashed value for your chosen password. Note that clients must provide the original password (not hashed); it is encrypted in transit if HTTPS is used. The default password is cluster.
- JOBLIMIT
This property sets the maximum number of jobs that can run concurrently when using Compute Server on a specific node. The limit can be changed on a running cluster using the grbcluster node config --job-limit <new_limit> command, in which case the new value will persist and the value in the configuration file will be ignored from that point on (even if you stop and restart the cluster).
- HARDJOBLIMIT
Certain jobs (those with priority 100) are allowed to ignore the JOBLIMIT, but they aren’t allowed to ignore this limit. Client requests beyond this limit are queued, irrespective of their priority. This limit is set to 0 by default, which means that it is disabled and jobs with priority 100 will not be queued.
Starting a Cluster Node as a Process¶
Once you have installed the Remote Services package (including
retrieving and installing your license file and, for Linux users,
setting your PATH variable), starting grb_rs as a standard process is
quite straightforward. From a terminal window with administrator
privileges, simply issue the following command:
> grb_rs
If you are using a Cluster Manager and you did not set the MANAGER
configuration property, you can specify it on the command-line:
> grb_rs --manager=http://mymanager:61080
Both commands will start the Remote Services agent on the default port (port 80), and you should see output like the following:
info : Reading configuration file: /home/jones/gurobi_server1200/linux64/bin/grb_rs.cnf
info : Gurobi Remote Services starting...
info : Platform is linux
info : Version is 12.0.0 (build v12.0.0rc0)
info : Variable GRB_LICENSE_FILE is not set
info : License file found at /home/jones/gurobi.lic
info : Node address is server1
info : Node FQN is server1
info : Node has 8 cores
info : Using data directory /home/jones/gurobi_server1200/linux64/bin/data
info : Data store created
info : Available runtimes: [... 12.0.0]
info : Public root is /home/jones/gurobi_server1200/linux64/resources/grb_rs/public
info : Starting API server (HTTP) on port 80...
If you do not have administrator privileges or if the default port is already in use, you will see an error about opening the port. For example, on Linux you might see an error like this:
fatal : Gurobi Remote Services terminated, listen tcp :80: bind: permission denied
or
fatal : Gurobi Remote Services terminated, listen tcp :80: bind: address already in use
Note that grb_rs does not have to be run with elevated privileges,
but it does need elevated privileges to use the default port 80.
If you would like to run grb_rs on a non-default port, use the
--port flag or set the PORT property in the configuration file.
For example:
> grb_rs --manager=http://mymanager:61080 --port=61000
The Remote Services agent (grb_rs) needs a directory to store
various files, including the runtimes, job metadata, job log files, etc.
The default location is a directory named data, located in the same
directory as the grb_rs executable (<installdir>/bin/data). If
you have a directory named data in your current directory, it will
use that location instead.
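The choice of data directory described above amounts to a two-step lookup, sketched here in Python. The function name and parameters are hypothetical and only illustrate the rule as stated; grb_rs itself may apply additional checks.

```python
import os


def resolve_data_dir(current_dir, executable_dir, is_dir=os.path.isdir):
    """Pick the data directory: ./data if it exists, else <executable dir>/data."""
    local = os.path.join(current_dir, "data")
    if is_dir(local):
        return local  # a data directory in the current directory wins
    return os.path.join(executable_dir, "data")
```

A practical consequence is that starting grb_rs from different working directories can lead to different data directories being used.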
If starting grb_rs produces an error message that indicates that
there was a problem creating the storage service (as shown below), a
likely cause is that another grb_rs process is already running.
fatal : Error creating storage service: Error opening data store: timeout
Starting a Cluster Node as a Service¶
While you always have the option of running grb_rs from a terminal
and leaving the process running in the background, we recommend that
you start it as a service instead, especially in a production
deployment. The advantage of a service is that it will automatically
restart itself if the computer is restarted or if the process terminates
unexpectedly.
grb_rs provides several commands that help you to set it up as a
service. These must be executed with administrator privileges:
- grb_rs install
Install the service. The details of exactly what this involves depend on the host operating system type and version: this uses systemd or upstart on Linux, launchd on macOS, and Windows services on Windows.
- grb_rs start
Start the service (and install it if it hasn’t already been installed).
- grb_rs stop
Stop the service.
- grb_rs restart
Stop and then start the service.
- grb_rs uninstall
Uninstall the service.
Note that the install command installs the service using default
settings. If you don’t need to modify any of these, you can use the
start command to both install and start the service. Otherwise, run
install to register the service, then modify the configuration (the
details are platform-dependent and are touched on below), and then run
start to start the service.
Note that you only need to start the service once; grb_rs will keep
running until you execute the grb_rs stop command. In particular, it
will start again automatically if you restart the machine.
Note also that the start command does not take any flags or
additional parameters. All of the configuration properties must be set
in the grb_rs.cnf configuration file. If you need to make a change,
edit the configuration file, then use the stop command followed by
the start command to restart grb_rs with the updated configuration.
The one exception is the JOBLIMIT property, which can be changed on
a live server using grbcluster. If you change this property and
later restart the server, the new value will persist and the value in
the configuration file will be ignored.
The exact behavior of these commands varies depending on the host operating system and version:
Linux¶
On Linux, grb_rs supports two major service managers: systemd and
upstart. The install command will detect the service manager available
on your system and will generate a service configuration file located in
/etc/systemd/system/grb_rs.service or /etc/init/grb_rs.conf for systemd
and upstart, respectively. Once the file is generated, you can edit it
to set advanced properties. Please refer to the documentation of systemd
or upstart to learn more about service configuration.
Use the start and stop commands to start and stop the service. When the
service is running, log messages are sent to the Linux syslog and to a
rotating log file, service.log, located in the same directory as grb_rs.
The uninstall command will delete the generated file.
macOS¶
On macOS, the system manager is called launchd, and the install
command will generate a service file in
/Library/LaunchDaemons/grb_rs.plist. Once the file is generated, you
can edit it to set advanced properties. Please refer to the launchd
documentation to learn more about service configuration.
Use the start and stop commands to start and stop the service. When the
service is running, log messages are sent to the macOS syslog and to a
rotating log file, service.log, located in the same directory as grb_rs.
The uninstall command will delete the generated file.
Windows¶
On Windows, the install command will declare the service to the
operating system. If you wish to set advanced properties for the service
configuration, you will need to start the Services configuration
application. Please refer to the Windows Operating System documentation
for more details.
Use the start and stop commands to start and stop the service. When the
service is running, log messages are sent to the Windows event log and
to a rotating log file, service.log, located in the same directory as
grb_rs. Note that the service must run as a user that has write
permissions to this directory; otherwise, no log file will be generated.
The uninstall command will delete the service from the registry.
Verification¶
Once you have grb_rs running, you can check to make sure that you
will be able to submit jobs to it.
Log In with a Cluster Manager¶
As we have explained earlier, the Cluster Manager initially creates three default users with predefined passwords:
- standard user: gurobi / pass
- administrator: admin / admin
- system administrator: sysadmin / cluster
These default accounts are provided to simplify installation; you should change the passwords or delete the accounts before actually using the cluster.
You can check that you can log in using the sysadmin account with
the grbcluster command-line tool:
> grbcluster login --manager=http://mymanager:61080 --username=sysadmin
info : Using client license file '/home/jones/gurobi.lic'
Password for sysadmin:
info : User gurobi connected to http://mymanager:61080, session will expire on...
Log In without a Cluster Manager¶
With a self-managed cluster, there are no user accounts, and the access level is determined by the password used. Here are the default passwords (which can be changed in the configuration file):
- standard user: pass
- administrator: admin
- system administrator: cluster
In this case, you need to log in to one of the nodes and provide the system administrator password:
> grbcluster login --server=http://server1:61000
info : Using client license file '/home/jones/gurobi.lic'
Enter password (return to use default):
info : Connected to https://server1:61000
Note that the password you provide is stored in clear text in the license file (for future use by other commands). With this in mind, make sure that access to the license file is restricted.
Accessing the Cluster¶
Once you have verified that you can log in, you should also check the list of nodes with the command:
> grbcluster nodes
ID ADDRESS STATUS TYPE LICENSE PROCESSING #Q #R JL IDLE %MEM %CPU
b7d037db server1:61000 ALIVE COMPUTE VALID ACCEPTING 0 0 10 <1s 10.89 4.99
You are ready to submit jobs if both of the following are true:
- the STATUS column indicates that one or more servers are ALIVE
- the LICENSE column indicates that the license is VALID
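If you want to automate this check, the following Python sketch inspects the text printed by grbcluster nodes. It is an assumption-laden illustration: the function name is hypothetical, it expects only the table (header line plus node lines) as input, it relies on every column being filled in so that splitting on whitespace is safe, and it interprets the two conditions as at least one node being both ALIVE and VALID.

```python
def ready_to_submit(nodes_output):
    """Return True if some node is ALIVE with a VALID license (sketch)."""
    lines = [line for line in nodes_output.splitlines() if line.strip()]
    if not lines:
        return False
    header = lines[0].split()
    status_col = header.index("STATUS")
    license_col = header.index("LICENSE")
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= max(status_col, license_col):
            continue  # skip malformed or truncated rows
        if fields[status_col] == "ALIVE" and fields[license_col] == "VALID":
            return True
    return False
```

Looking up the columns by header name rather than by fixed position keeps the check working if columns are added or reordered.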
If grbcluster is unable to connect or if it does not show any live
nodes, then check your network and the log of the grb_rs nodes (the
console output or <installdir>/bin/service.log if started as a
service).
If a node has an INVALID license, the ERROR field will provide
more information about the error. For example:
> grbcluster node licenses
ID ADDRESS STATUS TYPE KEY EXP ORG USER APP VER CS DL ERROR
b7d037db server1:61000 INVALID NODE false 0 No Gurobi license found...
You may also want to verify that it is possible to submit a job to your cluster. To do so, identify a machine from which users will typically submit jobs and install the Gurobi client package on it. Then, you can submit a job with a command like the following:
> gurobi_cl misc07.mps
For more information on how to install the client and run gurobi_cl,
please refer to the section about using Remote Services.