Installing a Cluster Node¶
Once the Remote Services package is installed, you will need to set up a license, if necessary. Then, the Remote Services agent must be configured and started as a standard process or as a service. Finally, you should verify your installation.
Licensing¶
You will need to download and install a license file on all Compute Server nodes (no license file is required for a Distributed Worker node). You will find detailed instructions for downloading a license in the section How do I retrieve and set up a Gurobi license of the Getting Started Knowledge Base article.
We will just provide a quick summary of the process here. Your first step is to locate and download your license file from the Gurobi License Center. When you download the license file, we strongly recommend that you place it in the default location:
- C:\gurobi\ on Windows
- /opt/gurobi/ on Linux
- /Library/gurobi/ on macOS
- The user’s home directory
You can also set the environment variable GRB_LICENSE_FILE to point
to this file.
In order to use the Cluster Manager, you will need to connect at least one Compute Server node to the cluster. When certain operations are requested such as submitting a job or a batch, the Cluster Manager will check the licenses available on the nodes. If none of the nodes have a valid Compute Server license, the operation will not be authorized.
Remote Services Agent (grb_rs)¶
To form a Remote Services cluster, you need to run the Remote Services
agent (grb_rs) on all of the nodes that make up the cluster. These
agents communicate amongst themselves, and also with the Cluster Manager
or the clients.
The primary task of the Remote Services agents is to collectively manage the queueing and the execution of jobs. The agents work together to balance the load by assigning a new job to the node with the fewest running jobs whenever possible. If all nodes are at capacity, newly submitted jobs will be queued, and the first node with available capacity will later execute the job. If a new node is added to the cluster, it will immediately start processing queued jobs.
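The scheduling behavior described above can be illustrated with a small, self-contained model. This is only a sketch of the stated policy (fewest running jobs first, queue when every node is full, new nodes drain the queue); the `Node` and `Cluster` classes are hypothetical names invented for this example and are not part of Remote Services.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for one grb_rs node.
    name: str
    job_limit: int  # maximum number of concurrent jobs on this node
    running: list = field(default_factory=list)

    def has_capacity(self) -> bool:
        return len(self.running) < self.job_limit


class Cluster:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.queue = deque()  # jobs waiting for a node with capacity

    def submit(self, job):
        """Run the job on the least-loaded node, or queue it if all are full."""
        candidates = [n for n in self.nodes if n.has_capacity()]
        if not candidates:
            self.queue.append(job)
            return None
        target = min(candidates, key=lambda n: len(n.running))
        target.running.append(job)
        return target.name

    def finish(self, node, job):
        """Free a slot; the node then picks up the oldest queued job, if any."""
        node.running.remove(job)
        if self.queue:
            node.running.append(self.queue.popleft())

    def add_node(self, node):
        """A newly added node immediately starts processing queued jobs."""
        self.nodes.append(node)
        while self.queue and node.has_capacity():
            node.running.append(self.queue.popleft())
```

In this model, submitting four jobs to two nodes with limits 1 and 2 leaves the fourth job queued until a slot frees up or a third node joins.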
The grb_rs executable provides several commands and flags to help in
the configuration and execution of the agent. We will review these
commands step by step in the following sections. You can see the full
list of commands in the reference section or
by using the command-line help:
> grb_rs --help
Configuring a Cluster Node¶
The Remote Services agent has a number of configuration properties that
affect its behavior. These can be controlled using a grb_rs.cnf
configuration file. The installation package includes a predefined
configuration file that can be used as a starting point
(<installdir>/bin/grb_rs.cnf).
The simplest way to modify the parameters is to edit the default
configuration file. Other options are available, though. The grb_rs
process uses the following precedence rules:
First priority: command-line flag --config
Second priority: a configuration file in the current directory
Third priority: a configuration file in a shared directory (C:\gurobi, /opt/gurobi, /Library/gurobi for Windows, Linux and macOS platforms, respectively)
Fourth priority: a configuration file in the directory where grb_rs is located
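The precedence rules above amount to a first-match search. The following sketch shows that lookup order under some assumptions: the function name `find_config` and its parameters are invented for illustration, and whether grb_rs validates the path given on the command line is not stated here, so the sketch simply returns it unchanged.

```python
from pathlib import Path
from typing import Optional

CONFIG_NAME = "grb_rs.cnf"


def find_config(explicit, cwd, shared_dir, exe_dir) -> Optional[Path]:
    """Return the configuration file that would win under the precedence rules.

    explicit   -- path given with the command-line flag, or None
    cwd        -- current directory
    shared_dir -- shared directory (for example /opt/gurobi on Linux)
    exe_dir    -- directory where the grb_rs executable is located
    """
    # First priority: a path given explicitly on the command line.
    if explicit is not None:
        return Path(explicit)
    # Second to fourth priority: first directory that contains the file.
    for directory in (cwd, shared_dir, exe_dir):
        candidate = Path(directory) / CONFIG_NAME
        if candidate.is_file():
            return candidate
    return None
```

A practical consequence is that a stray grb_rs.cnf in the directory you launch from silently takes precedence over the one shipped next to the executable.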
Most of the properties that are configured through this file are related
to communication options and job processing options. The configuration
file is only read once, when grb_rs first starts. Subsequent changes
to the file won’t affect parameter values on a running server.
Configuration file format¶
The configuration file contains a list of properties of the form
PROPERTY=value. Lines that begin with the # symbol are treated as
comments and are ignored. Here is an example:
# grb_rs.cnf configuration file
PORT=61000
MANAGER=http://mymanager:61080
While you could create this file from scratch, we recommend you start with the version of this file that is included with the product and modify it instead.
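If you want to inspect or validate a configuration file from a script, the format is simple enough to read in a few lines. The sketch below is not how grb_rs itself parses the file; `parse_properties` is a hypothetical helper, and skipping blank lines and splitting on the first `=` only (so that values may themselves contain `=`) are assumptions.

```python
def parse_properties(text: str) -> dict:
    """Parse PROPERTY=value lines, ignoring comments that start with '#'."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line (assumed to be allowed) or comment
        # Split on the first '=' only, so the value may contain '=' itself.
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a PROPERTY=value line: {raw!r}")
        props[name.strip()] = value.strip()
    return props


example = """\
# grb_rs.cnf configuration file
PORT=61000
MANAGER=http://mymanager:61080
"""

print(parse_properties(example))
# {'PORT': '61000', 'MANAGER': 'http://mymanager:61080'}
```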
The command grb_rs properties lists all of the available properties
with their default values and provides documentation for each. Some can be
overridden on the command-line of grb_rs; the name of the
command-line flag you would use to do so is provided as well. Some
properties are important and must be changed for a production
deployment. However, we need to distinguish between deployment with a
Cluster Manager and without.
Important Properties with a Cluster Manager¶
When deploying a node with a Cluster Manager, the configuration is simpler. You need to review the following properties:
- MANAGER
This is the URL of the manager.
- HOSTNAME
This must be the DNS name of the node that can be resolved from the other nodes or from the Cluster Manager.
grb_rs tries to get a reasonable default value, but this value may still not be resolvable by other nodes and could generate connection errors. In this case, you need to override this name in the configuration file with a fully qualified name of your node, for example:
HOSTNAME=server1
Note that you do not need to give addresses that can be resolved by clients because all communication is routed through the Cluster Manager. The nodes are never accessed directly by the clients.
- CLUSTER_TOKEN
The token is a private key that enables different nodes to join the same cluster. All nodes of a cluster and the Cluster Manager must have the same token. We recommend that you generate a brand new token when you set up your cluster. The grb_rs token command will generate a random token, which you can copy into the configuration file.
- JOBLIMIT
This property sets the maximum number of jobs that can run concurrently when using Compute Server on a specific node. The limit can be changed on a running cluster using the grbcluster node config --job-limit <new_limit> command, in which case the new value will persist and the value in the configuration file will be ignored from that point on (even if you stop and restart the cluster).
- HARDJOBLIMIT
Certain jobs (those with priority 100) are allowed to ignore the JOBLIMIT, but they aren’t allowed to ignore this limit. Client requests beyond this limit are queued, irrespective of their priority. This limit is set to 0 by default, which means that it is disabled and jobs with priority 100 will not be queued.
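The interaction between JOBLIMIT and HARDJOBLIMIT can be summarized as a small decision function. This is a sketch of the rules as described in this section, for a single node; the function name `admit` and its parameters are invented for illustration, and the real scheduler also takes other nodes and other factors into account.

```python
def admit(running: int, priority: int, job_limit: int, hard_job_limit: int = 0) -> str:
    """Return "run" or "queue" for a new job arriving at one node.

    running        -- number of jobs currently running on the node
    priority       -- priority of the new job (100 is the special value)
    job_limit      -- value of JOBLIMIT
    hard_job_limit -- value of HARDJOBLIMIT (0 means the limit is disabled)
    """
    # The hard limit applies to every job, whatever its priority.
    if hard_job_limit > 0 and running >= hard_job_limit:
        return "queue"
    # Priority-100 jobs are allowed to ignore JOBLIMIT.
    if priority == 100:
        return "run"
    return "run" if running < job_limit else "queue"
```

With the default HARDJOBLIMIT of 0, a priority-100 job is never queued in this model, which matches the behavior described above.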
Important Properties without a Cluster Manager¶
When installing a node that will not be connected to a Cluster Manager, authentication of clients uses predefined passwords that must be stored in the configuration file. The default configuration files must be reviewed and the following properties must be changed for a production deployment:
- HOSTNAME
This must be the DNS name of the node that can be resolved from the other nodes or the clients in your network.
grb_rs tries to get a reasonable default value, but this value may still not be resolvable by clients and could generate connection errors. In this case, you need to override this name in the configuration file with a fully qualified name of your node, for example:
HOSTNAME=server1
If the names cannot be resolved by clients, another option is to use IP addresses directly. In this case, set this property to the IP address of the node.
- CLUSTER_TOKEN
The token is a private key that enables different nodes to join the same cluster. All nodes of a cluster must have the same token. We recommend that you generate a brand new token when you set up your cluster. The grb_rs token command will generate a random token, which you can copy into the configuration file.
- PASSWORD
This is the password that clients must supply in order to access the cluster. It can be stored in clear text or hashed. We recommend that you create your own password, and that you store it in hashed form. You can use the grb_rs hash command to compute the hashed value for your chosen password. Note that clients must provide the original password (not hashed) and it will be exchanged encrypted if HTTPS is used.
> grb_rs hash newpass
$$ppEieKZExlBR-pCSUMlmc4oWlG8nZsUOE2IM0hJbzsmV_Yjj
Then copy and paste the value in the configuration file:
PASSWORD=$$ppEieKZExlBR-pCSUMlmc4oWlG8nZsUOE2IM0hJbzsmV_Yjj
The default password is pass.
- ADMINPASSWORD
This is the password that clients must supply in order to run restricted administrative job commands. It can be stored in clear text or hashed. We recommend that you create your own password, and that you store it in hashed form. You can use the grb_rs hash command to compute the hashed value for your chosen password. Note that clients must provide the original password (not hashed) and it will be exchanged encrypted if HTTPS is used. The default password is admin.
- CLUSTER_ADMINPASSWORD
This is the password that clients must supply in order to run restricted administrative cluster commands. It can be stored in clear text or hashed. We recommend that you create your own password, and that you store it in hashed form. You can use the grb_rs hash command to compute the hashed value for your chosen password. Note that clients must provide the original password (not hashed) and it will be exchanged encrypted if HTTPS is used. The default password is cluster.
- JOBLIMIT
This property sets the maximum number of jobs that can run concurrently when using Compute Server on a specific node. The limit can be changed on a running cluster using the grbcluster node config --job-limit <new_limit> command, in which case the new value will persist and the value in the configuration file will be ignored from that point on (even if you stop and restart the cluster).
- HARDJOBLIMIT
Certain jobs (those with priority 100) are allowed to ignore the JOBLIMIT, but they aren’t allowed to ignore this limit. Client requests beyond this limit are queued, irrespective of their priority. This limit is set to 0 by default, which means that it is disabled and jobs with priority 100 will not be queued.
Starting a Cluster Node as a Process¶
Once you have installed the Remote Services package (including
retrieving and installing your license file and, for Linux users,
setting your PATH variable), starting grb_rs as a standard
process is quite straightforward. From a terminal window with
administrator privileges, simply issue the following command:
> grb_rs
If you are using a Cluster Manager and you did not set the MANAGER
configuration property you can specify it on the command-line:
> grb_rs --manager=http://mymanager:61080
Both commands will start the Remote Services agent on the default port (port 80), and you should see output like the following:
info : Reading configuration file: /home/jones/gurobi_server1200/linux64/bin/grb_rs.cnf
info : Gurobi Remote Services starting...
info : Platform is linux
info : Version is 12.0.0 (build v12.0.0rc0)
info : Variable GRB_LICENSE_FILE is not set
info : License file found at /home/jones/gurobi.lic
info : Node address is server1
info : Node FQN is server1
info : Node has 8 cores
info : Using data directory /home/jones/gurobi_server1200/linux64/bin/data
info : Data store created
info : Available runtimes: [... 12.0.0]
info : Public root is /home/jones/gurobi_server1200/linux64/resources/grb_rs/public
info : Starting API server (HTTP) on port 80...
If you do not have administrator privileges or if the default port is already in use, you will see an error about opening the port. For example, on Linux you might see an error like this:
fatal : Gurobi Remote Services terminated, listen tcp :80: bind: permission denied
or
fatal : Gurobi Remote Services terminated, listen tcp :80: bind: address already in use
Note that grb_rs does not have to be run with elevated privileges,
but it does need elevated privileges to use the default port 80.
If you would like to run grb_rs on a non-default port, use the
--port flag or set the PORT property in the configuration file.
For example:
> grb_rs --manager=http://mymanager:61080 --port=61000
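Before choosing a port, it can be handy to check whether the current user is able to open it at all. The sketch below is a rough pre-check using only the Python standard library; `can_bind` is a hypothetical helper, it only tests IPv4 TCP, and a port that is free now may of course be taken by the time grb_rs starts.

```python
import socket


def can_bind(port: int, host: str = "") -> bool:
    """Return True if a TCP socket can currently be bound to (host, port).

    An empty host means "all IPv4 interfaces".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Covers both "permission denied" (privileged port without
            # elevated privileges) and "address already in use".
            return False
        return True
```

For example, `can_bind(80)` is expected to return False for an unprivileged user on most Linux systems, while `can_bind(61000)` returns True unless another process already holds that port.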
The Remote Services agent (grb_rs) needs a directory to store
various files, including the runtimes, job metadata, job log files, etc.
The default location is a directory named data, located in the same
directory as the grb_rs executable (<installdir>/bin/data). If
you have a directory named data in your current directory, it will
use that location instead.
If starting grb_rs produces an error message that indicates that
there was a problem creating the storage service (as shown below), a
likely cause is that another grb_rs process is already running.
fatal : Error creating storage service: Error opening data store: timeout
Starting a Cluster Node as a Service¶
While you always have the option of running grb_rs from a terminal
and leaving the process running in the background, we recommend that
you start it as a service instead, especially in a production
deployment. The advantage of a service is that it will automatically
restart itself if the computer is restarted or if the process terminates
unexpectedly.
grb_rs provides several commands that help you to set it up as a
service. These must be executed with administrator privileges:
- grb_rs install
Install the service. The details of exactly what this involves depend on the host operating system type and version: this uses systemd or upstart on Linux, launchd on macOS, and Windows services on Windows.
- grb_rs start
Start the service (and install it if it hasn’t already been installed).
- grb_rs stop
Stop the service.
- grb_rs restart
Stop and then start the service.
- grb_rs uninstall
Uninstall the service.
Note that the install command installs the service using default
settings. If you don’t need to modify any of these, you can use the
start command to both install and start the service. Otherwise, run
install to register the service, then modify the configuration (the
details are platform-dependent and are touched on below), and then run
start to start the service.
Note that you only need to start the service once; grb_rs will keep
running until you execute the grb_rs stop command. In particular, it
will start again automatically if you restart the machine.
Note also that the start command does not take any flags or
additional parameters. All of the configuration properties must be set
in the grb_rs.cnf configuration file. If you need to make a change,
edit the configuration file, then use the stop command followed by
the start command to restart grb_rs with the updated
configuration.
The one exception is the JOBLIMIT property, which can be changed on
a live server using grbcluster. If you change this property and
later restart the server, the new value will persist and the value in
the configuration file will be ignored.
The exact behavior of these commands varies depending on the host operating system and version:
Linux¶
On Linux, grb_rs supports two major service managers: systemd
and upstart. The install command will detect the service manager
available on your system and will generate a service configuration file
located in /etc/systemd/system/grb_rs.service or
/etc/init/grb_rs.conf for systemd and upstart, respectively.
Once the file is generated, you can edit it to set advanced properties.
Please refer to the documentation of systemd or upstart to learn
more about service configuration.
Use the start and stop commands to start and stop the service.
When the service is running, log messages are sent to the Linux
syslog and to a rotating log file, service.log, located in the
same directory as grb_rs.
The uninstall command will delete the generated file.
macOS¶
On macOS, the system manager is called launchd, and the install
command will generate a service file in
/Library/LaunchDaemons/grb_rs.plist. Once the file is generated, you
can edit it to set advanced properties. Please refer to the launchd
documentation to learn more about service configuration.
Use the start and stop commands to start and stop the service.
When the service is running, log messages are sent to the macOS
syslog and to a rotating log file, service.log, located in the
same directory as grb_rs.
The uninstall command will delete the generated file.
Windows¶
On Windows, the install command will declare the service to the
operating system. If you wish to set advanced properties for the service
configuration, you will need to start the Services configuration
application. Please refer to the Windows Operating System documentation
for more details.
Use the start and stop commands to start and stop the service.
When the service is running, log messages are sent to the Windows event
log and to a rotating log file, service.log, located in the same
directory as grb_rs. Note that the service must run as a user that
has write permissions to this directory; otherwise, no log file will be
generated.
The uninstall command will delete the service from the registry.
Verification¶
Once you have grb_rs running, you can check to make sure that you
will be able to submit jobs to it.
Log In with a Cluster Manager¶
As we have explained earlier, the Cluster Manager initially creates three default users with predefined passwords:
standard user: gurobi/pass
administrator: admin/admin
system administrator: sysadmin/cluster
These default accounts are provided to simplify installation; you should change the passwords or delete the accounts before actually using the cluster.
You can check that you can log in using the sysadmin account with
the grbcluster command-line tool:
> grbcluster login --manager=http://mymanager:61080 --username=sysadmin
info : Using client license file '/home/jones/gurobi.lic'
Password for sysadmin:
info : User gurobi connected to http://mymanager:61080, session will expire on...
Log In without a Cluster Manager¶
With a self-managed cluster, there are no user accounts, and the access level is determined by the password used. Here are the default passwords (which can be changed in the configuration file):
standard user: pass
administrator: admin
system administrator: cluster
In this case, you need to log in to one of the nodes and provide the system administrator password:
> grbcluster login --server=http://server1:61000
info : Using client license file '/home/jones/gurobi.lic'
Enter password (return to use default):
info : Connected to https://server1:61000
Note that the password you provide is stored in clear text in the license file (for future use by other commands). With this in mind, make sure that access to the license file is restricted.
Accessing the Cluster¶
Once you have verified that you can log in, you should also check the list of nodes with the command:
> grbcluster nodes
ID ADDRESS STATUS TYPE LICENSE PROCESSING #Q #R JL IDLE %MEM %CPU
b7d037db server1:61000 ALIVE COMPUTE VALID ACCEPTING 0 0 10 <1s 10.89 4.99
You are ready to submit jobs if both of the following are true:
the STATUS column indicates that one or more servers are ALIVE
the LICENSE column indicates that the license is VALID.
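If you want to automate this check, the two conditions can be tested against the text printed by grbcluster nodes. The sketch below assumes the whitespace-separated layout shown in the sample output above, which may differ between versions, and it interprets the conditions as "at least one node is both ALIVE and VALID". The helper name `ready_to_submit` is invented for this example.

```python
def ready_to_submit(nodes_output: str) -> bool:
    """Check the text output of 'grbcluster nodes' for a usable node."""
    lines = [ln for ln in nodes_output.splitlines() if ln.strip()]
    if not lines:
        return False
    # Locate the columns by name rather than by fixed position.
    header = lines[0].split()
    status_col = header.index("STATUS")
    license_col = header.index("LICENSE")
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= max(status_col, license_col):
            continue  # malformed or truncated row
        if fields[status_col] == "ALIVE" and fields[license_col] == "VALID":
            return True
    return False


sample = """\
ID ADDRESS STATUS TYPE LICENSE PROCESSING #Q #R JL IDLE %MEM %CPU
b7d037db server1:61000 ALIVE COMPUTE VALID ACCEPTING 0 0 10 <1s 10.89 4.99
"""
print(ready_to_submit(sample))  # True
```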
If grbcluster is unable to connect or if it does not show any live
nodes, then check your network and the log of the grb_rs nodes (the
console output or <installdir>/bin/service.log if started as a
service).
If a node has an INVALID license, the ERROR field will provide
more information about the error. For example:
> grbcluster node licenses
ID ADDRESS STATUS TYPE KEY EXP ORG USER APP VER CS DL ERROR
b7d037db server1:61000 INVALID NODE false 0 No Gurobi license found...
You may also want to verify that it is possible to submit a job to your cluster. To this end, you may want to identify a machine from which the users will typically submit jobs and install the gurobi client package. Then, you can submit a job with a command like the following:
> gurobi_cl misc07.mps
For more information on how to install the client and run gurobi_cl,
please refer to the section about
using Remote Services.