Programs for executing remote functions and remote methods are written using the GridRPC API. The creation of these programs is described in section 4.6.
(If the application program is written in C, there is no particular need for the NG_COMPILER settings.)
The NG_COMPILER environment variable is used to specify the compiler to be used to compile the Ninf-G Client. The default compiler for ng_cc is cc. Required options can be specified in NG_COMPILER in addition to the compiler name (path). When g++ is used, for example, NG_COMPILER is set in the following way.
'g++ -Wall -g'
(If the application program is written in C, there is no particular need for the NG_LINKER settings.)
The NG_LINKER environment variable is used to specify the linker to be used to link the Ninf-G Client. The default linker for ng_cc is cc. Required options can be specified in NG_LINKER in addition to the linker name (path). When g++ is used, for example, NG_LINKER is set in the following way.
'g++ -Wall -g'
The application program created earlier is compiled using the ng_cc command, thus creating a Ninf-G Client program. An example of using the ng_cc command is shown below.
% ${NG_DIR}/bin/ng_cc -g -o test_client test_client.c
test_client.c: Ninf-G Client source program
test_client: Ninf-G Client executable program
(The results of compiling the source code.)
This is the configuration file for settings concerning servers, clients, and the MDS.
(The configuration file specifications are described in section 4.3.)
Prepare the following files as specified in the configuration file.
Note: The local LDIF file is generated when the Ninf-G Executable is created on the server. When used, the local LDIF file that is generated on the server that executes the remote functions or remote methods must be copied to a place where it can be used from the Ninf-G Client.
Note: The environment variable LD_LIBRARY_PATH must be specified appropriately in the client configuration file if the Ninf-G Executable file will be staged to a system whose software environment is different from the build environment of the Ninf-G Executable.
The Ninf-G Client configuration file is a text file that contains the settings required for the operation of a Ninf-G Client. It consists of eight sections: INCLUDE, CLIENT, LOCAL_LDIF, FUNCTION_INFO, MDS_SERVER, INVOKE_SERVER, SERVER, and SERVER_DEFAULT.
Examples of section and attribute descriptions are shown below.

    # comment
    <section>
        attribute value   # comment
        attribute value   # comment
        attribute value   # comment
    </section>
    ...
Examples of errors in description are shown below.

    <section>Attribute Value          # Section and attribute on the same line
    AttributeValue                    # No delimiter between the attribute and attribute value
    Attribute Value Attribute Value   # Multiple attributes on the same line
    Attribute                         # Attribute and attribute value
    Value                             # extend across more than one line.
    Attribute \
    Value                             # Line continued with a backslash (\)
    Attribute                         # No attribute value
    </section>

    <CLIENT>                          # Multiple definitions in a section where
    ...                               #
    </CLIENT>                         #
    <CLIENT>                          # multiple definitions are not allowed
    ...                               #
    </CLIENT>                         #

    <MDS_SERVER>                      # Attribute value redundancy within a section
    hostname example.org              # where multiple definitions are allowed
    </MDS_SERVER>
    <MDS_SERVER>
    hostname example.org              #
    </MDS_SERVER>                     #

    <LOCAL_LDIF>                      # Attribute value redundancy
    filename example.ngdef            # for an attribute that allows
    filename example.ngdef            # multiple attribute values
    </LOCAL_LDIF>

    <SERVER>
    HostName example.com              # Upper and lower case letters
    hostname EXAMPLE.COM              # Upper and lower case letters
    </SERVER>
When a time value is specified as an attribute value (for time-out or other such attributes), a unit of time can be specified (e.g., 30 s, 30 sec or 30 seconds).
It is also possible to specify 'minute' or 'hour' as the time unit. Character strings such as 'se' and 'seco' are also interpreted to mean second, but strings that are not contained in 'second', such as 'set' will cause an error.
When specifying the number of bytes as an attribute value for attributes such as log file size, units (*) such as 1 K or 1 Kilo can be specified. Mega and Giga can also be specified as a unit for number of bytes.
Character strings such as Me, Meg, etc. are also interpreted to mean Mega, but strings that are not contained in Mega, such as Ma cause an error.
(*) 1 K = 1024 bytes, 1 M = 1024 Kbytes, 1 G = 1024 Mbytes
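As an illustration of the unit notation, the fragment below is a minimal sketch rather than a recommended setup. The host name, log path, and chosen values are hypothetical; only the attribute names and the unit spellings are taken from this document.

```
<CLIENT>
    log_filePath      /tmp/ninfg-client.log   # hypothetical path
    log_nFiles        4
    log_maxFileSize   2 Mega                  # 2 M = 2 * 1024 Kbytes
</CLIENT>

<SERVER>
    hostname          server.example.org      # hypothetical host
    job_startTimeout  90 sec                  # time value with a unit of seconds
    heartbeat         1 minute                # 'minute' and 'hour' are also accepted units
</SERVER>
```

A value written without a unit, such as `heartbeat 60`, is read in the unit listed for that attribute in its table.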
Each section of the configuration file is described below.
The INCLUDE section allows multiple definitions.
An example of an INCLUDE section description is shown below.

    <INCLUDE>
        filename   file name
        filename   file name
    </INCLUDE>
The attributes and attribute values of the INCLUDE section are shown below.
Attribute | Attribute value | Default value | Multiple | Explanation |
---|---|---|---|---|
filename | File name | None | Yes | File to be included |
The file name of the configuration file to be read is specified. The file to be read must be in the configuration file format.
The CLIENT section does not allow multiple definitions.
An example of a CLIENT section description is shown below.

    <CLIENT>
        hostname                 Host name
        save_sessionInfo         Session information count
        loglevel                 [0-5]
        loglevel_globusToolkit   [0-5]
        loglevel_ninfgProtocol   [0-5]
        loglevel_ninfgInternal   [0-5]
        loglevel_ninfgGrpc       [0-5]
        log_filePath             File name
        log_suffix               Suffix
        log_nFiles               Number of files
        log_maxFileSize          Number of bytes
        log_overwriteDirectory   [true/false]
        tmp_dir                  Directory
        refresh_credential       Seconds
        invoke_server_log        log name
        fortran_compatible       [true/false]
        handling_signals         Signals ...
        listen_port              Port number
        listen_port_authonly     Port number
        listen_port_GSI          Port number
        listen_port_SSL          Port number
        tcp_nodelay              [true/false]
    </CLIENT>
The attributes and attribute values of the CLIENT section are shown below.
Attribute | Attribute value | Default value | Multiple | Explanation |
---|---|---|---|---|
hostname | Host name | globus_libc_gethostname() | No | The host name of the client |
save_sessionInfo | Session information count | 256 | No | The number of session information units to be saved |
loglevel | [0-5] | 2 | No | Overall log level |
loglevel_globusToolkit | [0-5] | 2 | No | Globus Toolkit API error log level |
loglevel_ninfgProtocol | [0-5] | 2 | No | Ninf-G protocol error log level |
loglevel_ninfgInternal | [0-5] | 2 | No | Ninf-G internal error log level |
loglevel_ninfgGrpc | [0-5] | 2 | No | Grid RPC API error log level |
log_filePath | File name | stderr | No | The log file name |
log_suffix | Suffix | Sequence number | No | The log file suffix |
log_nFiles | Number of files | 1 | No | The number of files for log |
log_maxFileSize | Number of bytes | 1M/unlimited | No | Maximum number of bytes for log file |
log_overwriteDirectory | [true/false] | false | No | Over-write permission for log directory |
tmp_dir | Directory | /tmp | No | The directory in which temporary files are placed |
refresh_credential | Seconds | 0 | No | Refreshing proxy credential interval |
invoke_server_log | log name | None | No | Invoke Server log filename |
fortran_compatible | [true/false] | false | No | Fortran compatible mode |
handling_signals | Signal names/numbers | SIGHUP SIGINT SIGTERM | No | Handling signals |
listen_port | Port number | 0 | No | The port number for listening requests for unencrypted connections |
listen_port_authonly | Port number | 0 | No | The port number for listening requests for connections by authentication only |
listen_port_GSI | Port number | 0 | No | The port number for listening requests for connections encrypted by GSI |
listen_port_SSL | Port number | 0 | No | The port number for listening requests for connections encrypted by SSL. |
tcp_nodelay | [true/false] | false | No | TCP_NODELAY socket option |
The log level can be specified by using the strings listed in the 'Meaning' column in the table below as well as by using a number value.
The meanings of the log level values are described below.
Value | Meaning | Explanation |
---|---|---|
0 | Off | Nothing is output. |
1 | Fatal | A fatal error is output. |
2 | Error | A nonfatal error is output. |
3 | Warning | A warning error is output. |
4 | Information | Guidance or other such information is output. |
5 | Debug | Debugging information is output. |
This host name is the host name of the client machine on which the Ninf-G Client is running. It is used by the Ninf-G Executable when connecting to a Ninf-G Client.
The host name can be specified by an IP address such as 192.168.0.1 as well as by the ordinary host name.
If omitted, the hostname returned by the Globus Toolkit API globus_libc_gethostname() function is used. The hostname is equivalent to the output of the globus-hostname command.
The main purposes for which this is used are described below.
Note: If the hostname attribute is set, the environment variable GLOBUS_HOSTNAME is overwritten by its value.
Note: The hostname attribute may not be set appropriately when globus_libc_gethostname() is called before grpc_initialize().
This is the number of session information units stored internally by Ninf-G.
If the number defined here is exceeded, the session entries are discarded beginning with the oldest first.
If omitted, the value 256 is used.
The log level is specified for all log categories by loglevel and for each category individually by loglevel_*.
When the log level for each category has not been specified, the log level for all categories is applied.
When omitted, the value of the NG_LOG_LEVEL environment variable is used.
If the NG_LOG_LEVEL environment variable is not set, 2 (Error) is used as the default value of loglevel.
The name of the file to which the log is output is specified in the log file name.
The file name may include a path that includes a directory (e.g., "/home/myHome/var/logFile").
The file and directory names can include the following specifiers.
"%t" is replaced with the date as year, month and day, and the time in hours, minutes, seconds and milliseconds ("yyyymmdd-hhmmss-MMM") (e.g., "/home/myHome/var/logDir%t/logFile" is replaced by "/home/myHome/var/logDir20030101-154801-123/logFile").
"%h" is replaced with the Ninf-G Client hostname.
"%p" is replaced with the process id of the Ninf-G Client.
When omitted, the log is output to standard error. If the log file name is omitted, the log_suffix, log_nFiles, and log_maxFileSize are ignored.
When a log file is specified, this specifies the suffix used when the log file is created.
If a suffix is specified, the generated file name will be from "filename[000].suffix" to "filename[nnn].suffix". If omitted, the generated file name will be from "filename.[000]" to "filename.[nnn]". The number of files minus 1 is "nnn."
The number of digits in "nnn" is the same as the number of digits in number of files minus 1. For example, if the number of files is set to 100, then the number will range from "00" to "99."
This is the number of files created for log output.
0 indicates that an unlimited number of files can be output. A negative value results in an error.
If omitted, the value 1 is used.
This is the maximum number of bytes for the log file.
If omitted, the value will be unlimited if the number of files is one, or 1Mbyte if the number of files is two or more.
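The log file attributes described above can be combined as in the following sketch. The directory and file names are hypothetical, and the values are chosen only to show how the attributes interact.

```
<CLIENT>
    log_filePath     /home/myHome/var/logDir%t/client-%h-%p   # %t, %h and %p are expanded as described above
    log_suffix       log                                      # suffix used for the generated files
    log_nFiles       100                                      # sequence numbers run from 00 to 99
    log_maxFileSize  1 Mega                                   # upper limit for each log file
</CLIENT>
```

If `log_filePath` were removed from this fragment, the log would go to standard error and the other three attributes would be ignored.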
This establishes overwrite permission for the directory. If the specified directory exists, this specifies whether the creation of log files in that directory is enabled or disabled. Operation in the case that the directory exists is shown below.
This specifies the directory in which temporary files are placed.
When omitted, the TMPDIR environment variable is used if it is defined. Otherwise, "/tmp" is used.
This specifies the interval for refreshing the proxy credential.
If the value 0 is specified, Ninf-G Client will not refresh the proxy credential. If a value of 1 or greater is specified, Ninf-G Client will refresh the proxy credential and send it to Job Manager. If a negative value is specified, an error results.
If omitted, the value 0 is used.
Note: Refreshing the proxy credential from a client program is supported only in the pthread version of the client. The Ninf-G Executable can be built with either a pthread or a non-thread Globus Toolkit flavor.
Note: Ninf-G Java Client does not support this feature.
This specifies the Invoke Server log file name. If this attribute is specified, a suffix made up of the Invoke Server name and the Invoke Server number is appended to the log file name, and the log is output to that file.
If omitted, no values are used.
Note: This attribute appeared in Ninf-G Version 4.0.0.
This specifies a mode of argument passing.
By default, a scalar argument for remote functions and remote methods is passed as an immediate value. If this attribute is set to true, a scalar argument is passed as a pointer to the value.
If omitted, the default value 'false' is used.
Note: This attribute has been added in Ninf-G Version 4.1.0.
Note: Ninf-G Java Client does not support this feature.
This attribute specifies signals which will be caught by Ninf-G Client.
When the Ninf-G Client catches the signal, Ninf-G cleans up all temporary files, cancels all jobs, and exits. This clean up process is performed only for signals which are specified in this attribute.
The signals are specified by either signal name or signal number. Multiple signals can be specified by space-delimited enumeration. The value "none" can be specified if no signals need to be caught.
If omitted, SIGINT, SIGTERM and SIGHUP will be caught by the Ninf-G Client.
Note: This attribute is available for Ninf-G Version 4.2.0 or later.
Note: Ninf-G Java Client does not support this feature.
This specifies the client port number for listening requests for unencrypted connections. If the 0 value is specified, an arbitrary port number is used.
If omitted, the default value '0' is used.
Note: This attribute has been added in Ninf-G Version 4.2.0.
This specifies the client port number for listening requests for connections by authentication only. If the 0 value is specified, an arbitrary port number is used.
If omitted, the default value '0' is used.
Note: This attribute has been added in Ninf-G Version 4.2.0.
This specifies the client port number for listening requests for connections encrypted by GSI. If the 0 value is specified, an arbitrary port number is used.
If omitted, the default value '0' is used.
Note: This attribute has been added in Ninf-G Version 4.2.0.
This specifies the client port number for listening requests for connections encrypted by SSL. If the 0 value is specified, an arbitrary port number is used.
If omitted, the default value '0' is used.
Note: This attribute has been added in Ninf-G Version 4.2.0.
This specifies whether or not to set TCP_NODELAY for both ends of connections between a Ninf-G Client and Ninf-G Executables.
If omitted, "false" is used.
Note: This option is available for Ninf-G Version 4.2.3 or later.
Note: The following is a report from users.
If the size of transferred arguments or results is less than 1.5KB, performance of data transfer is improved by setting tcp_nodelay to true.
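Several of the CLIENT attributes described above are put together in the following sketch. The host name and the values are hypothetical and are not recommended settings. Note that handling_signals and the listen_port attributes require Ninf-G Version 4.2.0 or later, and tcp_nodelay requires Version 4.2.3 or later.

```
<CLIENT>
    hostname                client.example.org   # hypothetical; an IP address can also be given
    save_sessionInfo        512                  # oldest session entries are discarded beyond this count
    loglevel                Error                # overall level, same as 2
    loglevel_ninfgProtocol  Debug                # per-category setting, same as 5
    tmp_dir                 /var/tmp
    handling_signals        SIGINT SIGTERM       # clean-up is performed only for these signals
    listen_port             0                    # 0 means an arbitrary port number is used
    tcp_nodelay             true
</CLIENT>
```

Because the CLIENT section does not allow multiple definitions, all client-wide settings have to be placed in a single section like this one.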
The LOCAL_LDIF section allows multiple definitions. The LOCAL_LDIF section may be omitted.
An example of a LOCAL_LDIF section description is shown below.

    <LOCAL_LDIF>
        filename   file name
        filename   file name
    </LOCAL_LDIF>
The attributes and attribute values of the LOCAL_LDIF section are shown below.
Attribute | Attribute value | Default value | Multiple | Explanation |
---|---|---|---|---|
filename | File name | None | Yes | Local LDIF filename |
This specifies the local LDIF file that contains the Ninf-G Executable information. The same file name cannot be specified more than once. The local LDIF file is generated by the ng_gen command.
The FUNCTION_INFO section allows multiple definitions. The FUNCTION_INFO section may be omitted.
An example of a FUNCTION_INFO section description is shown below.

    <FUNCTION_INFO>
        hostname          Host name
        funcname          Function name
        path              Path
        staging           [true/false]
        backend           Backend
        session_timeout   Seconds
    </FUNCTION_INFO>
The attributes and attribute values of the FUNCTION_INFO section are shown below.
Attribute | Attribute value | Default value | Multiple | Explanation |
---|---|---|---|---|
hostname | Host name | None | No | Server machine host name |
funcname | Function name | None | No | The function name of the remote function |
path | Path | None | No | The path to the Ninf-G Executable |
staging | [true/false] | false | No | Staging enabled or disabled |
backend | Backend | normal | No | Backend software by which Ninf-G Executable is launched |
session_timeout | Seconds | 0 | No | RPC execution timeout |
This specifies the host name of the server machine.
It cannot be omitted.
This specifies the function name of the remote function.
It cannot be omitted.
This specifies the path to the Ninf-G Executable. If staging is set to true, the path on the client machine is specified. If staging is set to false, the path on the server machine is specified.
It cannot be omitted.
This specifies whether or not staging (*) is to be executed. If 'true' is specified, staging is executed.
If omitted, the value is taken to be 'false.'
(*) A function for starting up the Ninf-G Executable located on the client machine after transfer to the server machine.
Note: Invoke Server GT4py requires some steps in advance. Details are described in 4.4.1.5 Using staging function on Invoke Server GT4py.
This specifies the backend software by which the Ninf-G Executable is launched. The backend must be one of normal, mpi or blacs. If the backend is normal, the Ninf-G Executable is launched directly by GRAM. If the backend is mpi, GRAM uses the mpirun command to launch the Ninf-G Executable as MPI processes. If the backend is blacs, the Ninf-G Executable is launched by blacs.
Backend should be specified if neither MDS nor local LDIF is used for execution, and users intend to use mpi or blacs for launching the Ninf-G Executable.
If omitted, the value is taken to be 'normal.'
This specifies the RPC execution timeout value. If the RPC execution time exceeds the timeout value, the outstanding RPC is terminated and returns a timeout error. The handle that was associated with the RPC becomes inoperative and can no longer be used for any RPCs.
Measurement of the execution time of an RPC starts when a session invocation API such as grpc_call() is called. The execution time of an RPC includes not only the computation time of the remote library but also any other time spent before the RPC completes. For example, a timeout error may occur when the job is not invoked for some unknown reason.
The session_timeout attribute can be used for avoiding unexpected freezes of the Ninf-G Client caused by rare-case accidents on Ninf-G Executables.
If 0 is specified, then the timeout feature is disabled. The default value of session_timeout is 0.
Note: The session_timeout feature is supported only for pthreads flavors.
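The following FUNCTION_INFO fragment is a sketch of how these attributes fit together when neither MDS nor a local LDIF file supplies the function information. The host name, function name, and path are hypothetical.

```
<FUNCTION_INFO>
    hostname         server.example.org                  # hypothetical server machine
    funcname         sample/pi_trial                     # hypothetical remote function name
    path             /home/myHome/ninfg/_stub_pi_trial   # hypothetical; a client-side path because staging is true
    staging          true                                # the executable is transferred to the server before start-up
    backend          normal                              # launched directly by GRAM
    session_timeout  600                                 # seconds; 0 would disable the timeout
</FUNCTION_INFO>
```

With `staging false`, the same `path` attribute would instead have to name the location of the Ninf-G Executable on the server machine.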
The MDS_SERVER section allows multiple definitions. The MDS_SERVER section can be omitted.
When an MDS request for information is made, the request is issued to the MDS server specified by the MDS_SERVER section defined in the configuration file. The search is executed repeatedly in order, beginning with the first definition, until the information has been found or the last MDS_SERVER defined in the section has been searched for information.
An example of an MDS_SERVER section description is shown below.

    <MDS_SERVER>
        hostname         Host name
        tag              Tag name
        type             Type
        port             Port number
        protocol         protocol name
        path             service path
        subject          "Subject"
        vo_name          VONAME
        client_timeout   Seconds
        server_timeout   Seconds
    </MDS_SERVER>
The attributes and attribute values of the MDS_SERVER section are shown below.
Attribute | Attribute value | Default value | Multiple | Explanation |
---|---|---|---|---|
hostname | Host name | None | No | MDS server host name |
tag | Tag name | None | No | MDS Server tag name |
type | Type | MDS2 | No | MDS type |
port | Port number | 2135 | No | MDS server port number |
protocol | protocol name | https | No | MDS protocol |
path | service path | default service path | No | MDS path |
subject | Subject | None | No | MDS subject |
vo_name | VONAME | Local | No | GIIS vo name |
client_timeout | Seconds | 0 | No | The client time-out time |
server_timeout | Seconds | 0 | No | The server time-out time |
This specifies the host name of the MDS server.
The MDS server host name cannot be omitted.
This specifies the tag name of the MDS Server setting. This tag name is referenced by the mds_tag attribute of a <SERVER> section.
The tag attribute of the <MDS_SERVER> section was introduced so that multiple <MDS_SERVER> sections can be defined for the same MDS server. Every tag name in a configuration file must be unique.
If omitted, no values are used.
Note: This attribute appeared in Ninf-G Version 4.0.0.
This specifies the type of MDS.
MDS has 2 types.
If omitted, MDS2 is used.
Note: This attribute appeared in Ninf-G Version 4.0.0.
This specifies the port number of the MDS server.
If omitted, the default port (2135 on MDS2, 8443 on MDS4) is used.
This specifies the protocol part of the MDS4 URL. (http or https)
If omitted, https is used.
Note: This attribute appeared in Ninf-G Version 4.0.0.
This specifies the path part of the MDS4 URL.
If omitted, the following default service URL path is used: /wsrf/services/org/apgrid/ninf/ng4/grpcinfo/GrpcInfoIndexService
Note: This attribute appeared in Ninf-G Version 4.0.0.
This specifies the subject used for authentication of MDS4 access to Web Service containers. This is useful for Web Service containers invoked by unprivileged users.
In order to include space characters, use double quote characters ("). Example: "/C=JP/O=EXAMPLE/OU=GRID/CN=Example of Subject".
If omitted, this value is not used.
This specifies the vo name of GIIS.
If omitted, "local" is used.
The client time-out value specifies the time-out time for connection between client and server.
If the value 0 is specified, there is no time-out in waiting for a response.
If omitted, the default value 0 is used.
The server time-out value specifies the time-out time for connection between servers.
If the value 0 is specified, there is no time-out in waiting for a response.
If omitted, the default value 0 is used.
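As a sketch of how multiple MDS_SERVER sections might be written, the fragment below defines two entries for the same host that are distinguished by tag. The host name and tag names are hypothetical; the type names and port numbers are the ones mentioned in the descriptions above.

```
<MDS_SERVER>
    hostname        mds.example.org    # hypothetical MDS host
    tag             mds2-main          # hypothetical tag; must be unique in the configuration file
    type            MDS2
    port            2135               # default port for MDS2
    client_timeout  30                 # seconds; 0 means no time-out
</MDS_SERVER>

<MDS_SERVER>
    hostname        mds.example.org    # same host, distinguished from the entry above by its tag
    tag             mds4-main
    type            MDS4
    port            8443               # default port for MDS4
    protocol        https
    subject         "/C=JP/O=EXAMPLE/OU=GRID/CN=Example of Subject"
</MDS_SERVER>
```

Since the search proceeds in definition order, the MDS2 entry would be queried before the MDS4 entry in this arrangement.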
The INVOKE_SERVER section allows multiple definitions. The INVOKE_SERVER section can be omitted.
Note: This section appeared in Ninf-G Version 4.0.0.
Note: The Invoke Server feature is supported only for pthreads flavors.
An example of an INVOKE_SERVER section description is shown below.

    <INVOKE_SERVER>
        type             type name
        path             path name
        max_jobs         Number of jobs
        log_filePath     path of logfile
        status_polling   Seconds
        option           "option string"
    </INVOKE_SERVER>
The attributes and attribute values of the INVOKE_SERVER section are shown below.
Attribute | Attribute value | Default value | Multiple | Explanation |
---|---|---|---|---|
type | type name | None | No | Invoke Server type |
path | path name | $NG_DIR/bin/ng_invoke_server.[type] | No | Invoke Server executable file path |
max_jobs | Number of jobs | 0 | No | max jobs for one Invoke Server |
log_filePath | path | None | No | log file path |
status_polling | Seconds | 0 | No | Polling interval |
option | "option string" | None | Yes | option |
This specifies the type of Invoke Server.
The default Invoke Server executable file path is $NG_DIR/bin/ng_invoke_server.[type]
The following four types are available.
The following three types are available, but unsupported.
Each Invoke Server has its own usage and prerequisites. See section 4.4, Invoke Server setup, for details.
This cannot be omitted.
Note: The Invoke Server feature is supported only for pthreads flavors.
This specifies the executable file of Invoke Server.
The default executable file path is "$NG_DIR/bin/ng_invoke_server.[type]".
If omitted, the default executable file path is used.
This specifies the maximum number of jobs for one Invoke Server. When the number of jobs handled reaches the value of max_jobs, the next Invoke Server is launched and subsequent jobs are handled by it.
If 0 is specified, all jobs are handled by one Invoke Server.
If omitted, 0 is used.
This specifies the log file name for the Invoke Server. No suffix is added to the file name.
If omitted, <CLIENT> section invoke_server_log setting is used.
This specifies the status polling interval of the Invoke Server. If the Invoke Server implementation uses polling in getting job status, this polling interval is used.
If omitted, 0 is used.
This specifies the options to pass to Invoke Server, when a function handle is created. Each Invoke Server implementation can define this option for any reason.
If the value is enclosed by double-quote characters ("), the value can include the space character.
This attribute can be specified multiple times. If omitted, the Invoke Server option is not used.
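A possible INVOKE_SERVER section is sketched below. The type name GT4py is an assumption based on the Invoke Server GT4py mentioned in the staging note, the log path is hypothetical, and the numeric values are for illustration only.

```
<INVOKE_SERVER>
    type            GT4py                                # assumed type name; path then defaults to $NG_DIR/bin/ng_invoke_server.GT4py
    max_jobs        100                                  # the next Invoke Server is launched once 100 jobs are handled
    status_polling  10                                   # seconds; used only if the implementation polls for job status
    log_filePath    /home/myHome/var/invoke_server.log   # hypothetical; no suffix is added to this name
</INVOKE_SERVER>
```

The option attribute is left out here because its meaning is defined by each Invoke Server implementation.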
The SERVER section allows multiple definitions.
When the SERVER section contains multiple definitions, the following API checks to see if remote function information is registered in the first-defined SERVER. If it is, that server is used. If it is not, a check is made for registered remote function information in the second SERVER. This is repeated until remote function information is found.
An example of a SERVER section description is shown below.

    <SERVER>
        hostname                           Host name
        hostname                           Host name host name host name ...
        tag                                Tag name
        port                               Port number
        mds_hostname                       Host name
        mds_tag                            Tag name
        invoke_server                      Type
        invoke_server_option               "option string"
        mpi_runNoOfCPUs                    [function name=]number of CPUs
        gass_scheme                        [http/https]
        crypt                              [false/authonly/SSL/GSI]
        protocol                           [XML/binary]
        force_xdr                          [true/false]
        jobmanager                         JOBMANAGER
        subject                            "Subject"
        client_hostname                    Host name
        job_startTimeout                   Seconds
        job_stopTimeout                    Seconds
        job_maxTime                        Minutes
        job_maxWallTime                    Minutes
        job_maxCpuTime                     Minutes
        job_queue                          Queue name
        job_project                        Project name
        job_hostCount                      Number of nodes
        job_minMemory                      Size
        job_maxMemory                      Size
        job_rslExtensions                  "extension string"
        heartbeat                          Seconds
        heartbeat_timeoutCount             Times
        heartbeat_timeoutCountOnTransfer   Times
        redirect_outerr                    [true/false]
        tcp_connect_retryCount             Counts
        tcp_connect_retryBaseInterval      Seconds
        tcp_connect_retryIncreaseRatio     Ratio
        tcp_connect_retryRandom            [true/false]
        argument_transfer                  [wait/nowait/copy]
        compress                           [raw/zlib]
        compress_threshold                 Number of bytes
        argument_blockSize                 Number of bytes
        workDirectory                      Directory name
        coreDumpSize                       Size
        commLog_enable                     [true/false]
        commLog_filePath                   File name
        commLog_suffix                     Suffix
        commLog_nFiles                     Number of files
        commLog_maxFileSize                Number of bytes
        commLog_overwriteDirectory         [true/false]
        debug                              [true/false]
        debug_display                      DISPLAY
        debug_terminal                     Command path name
        debug_debugger                     Command path name
        debug_busyLoop                     [true/false]
        environment                        Variable name
        environment                        Variable name = value
    </SERVER>
The attributes and attribute values of the SERVER section are shown below.
Attribute | Attribute value | Default value | Multiple | Explanation |
---|---|---|---|---|
hostname | Host name | None | Yes | Server machine host name |
tag | Tag name | None | No | Server tag name |
port | Port number | 2119 | No | The server port number on which the Globus gatekeeper is listening |
mds_hostname | Host name | None | No | MDS server host name |
mds_tag | Tag name | None | No | MDS tag name |
invoke_server | type | None | No | Invoke Server type |
invoke_server_option | option string | None | Yes | Invoke Server option |
mpi_runNoOfCPUs | Function name, Number of CPUs | None | Yes | The number of CPUs used by MPI function |
gass_scheme | [http/https] | http | No | GASS server scheme |
crypt | [false/authonly/SSL/GSI] | false | No | Method of authentication and encryption for communication paths |
protocol | [XML/binary] | XML | No | Specifies the protocol. |
force_xdr | [true/false] | false | No | Makes XDR compulsory. |
jobmanager | JOBMANAGER | None | No | The job manager used on the server machine |
subject | Subject | None | No | Subject of resource manager contact |
client_hostname | Host name | hostname of CLIENT section | No | Client machine host name |
job_startTimeout | Seconds | 0 | No | The time-out at job startup |
job_stopTimeout | Seconds | -1 | No | The time-out for when the job stops |
job_maxTime | Minutes | None | No | The maximum job execution time |
job_maxWallTime | Minutes | None | No | The maximum job execution wall clock time |
job_maxCpuTime | Minutes | None | No | The maximum job execution cpu time |
job_queue | queue name | None | No | A remote queue name |
job_project | project name | None | No | A remote project name |
job_hostCount | Number of nodes | None | No | Number of nodes (for SMP clusters) |
job_minMemory | Size | None | No | Minimum amount of memory, in Megabytes |
job_maxMemory | Size | None | No | Maximum amount of memory, in Megabytes |
job_rslExtensions | extension string | None | No | RSL extensions |
heartbeat | Seconds | 60 | No | The heart-beat interval |
heartbeat_timeoutCount | Times | 5 | No | The heart-beat time-out times |
heartbeat_timeoutCountOnTransfer | Times | 5 | No | The heart-beat time-out times on transfer |
redirect_outerr | [true/false] | true | No | Ninf-G Executable output redirect |
tcp_connect_retryCount | Counts | 4 | No | The maximum number of retries for a TCP connection |
tcp_connect_retryBaseInterval | Seconds | 1 | No | The base interval time for the first retry |
tcp_connect_retryIncreaseRatio | Ratio | 2.0 | No | The increase ratio for calculating the maximum interval time between retries |
tcp_connect_retryRandom | [true/false] | true | No | A flag that specifies whether the random value is used or not for the interval time |
argument_transfer | [wait/nowait/copy] | wait | No | Timing at which an asynchronous function call returns (wait or do not wait for completion of argument transfer) |
compress | [raw/zlib] | raw | No | Compression method |
compress_threshold | Number of bytes | 64KBytes | No | Threshold for performing compression |
argument_blockSize | Number of bytes | 16KBytes | No | The block size of transferred arguments |
workDirectory | Directory name | The path to the Ninf-G Executable | No | The working directory for Ninf-G Executable |
coreDumpSize | Size | Undefined | No | Core dump size for Ninf-G Executable |
commLog_enable | [true/false] | false | No | Whether the communication log output is enabled or disabled |
commLog_filePath | File name | stderr | No | Communication log file name |
commLog_suffix | Suffix | Sequence number | No | The communication log file suffix |
commLog_nFiles | Number of files | 1 | No | The number of files for communication log output |
commLog_maxFileSize | Number of bytes | 1M/unlimited | No | Maximum number of bytes for the communication log file |
commLog_overwriteDirectory | [true/false] | false | No | Overwrite permission for the communication log directory |
debug | [true/false] | false | No | Whether the debugging function is enabled or not |
debug_display | DISPLAY | Environment variable | No | Debugging display |
debug_terminal | Command path name | Environment variable | No | Path to the debugging terminal emulator |
debug_debugger | Command path name | Environment variable | No | Debugger path |
debug_busyLoop | [true/false] | false | No | Wait for attach from debugger or not |
environment | Character string | None | Yes | Environment variable |
This specifies the host name of the server machine.
Multiple hostname attributes can be defined.
It is possible for multiple host names to be defined on one line.
This value cannot be omitted.
This specifies the tag name of a <SERVER> section.
The "tag" attribute of the <SERVER> section was introduced so that multiple <SERVER> sections can be defined for the same server. APIs that create function handles or object handles accept a tag name as well as a hostname to identify the server. Every tag name in a configuration file must be unique.
The tag name can include neither the '/' nor the ':' character, since those characters are reserved in the Resource Manager Contact.
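The restriction on tag names can be checked before a configuration file is written. The following is a minimal sketch, not part of Ninf-G: the function name is_valid_server_tag is hypothetical, and rejecting an empty tag is an assumption made here for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not a Ninf-G API): returns true when `tag` contains
 * neither '/' nor ':', the two characters reserved in the Resource Manager
 * Contact. */
bool is_valid_server_tag(const char *tag)
{
    assert(tag != NULL);
    if (tag[0] == '\0') {
        return false;               /* assumption: an empty tag is rejected */
    }
    return strpbrk(tag, "/:") == NULL;
}
```

For example, is_valid_server_tag("cluster-a") is true, while "host:2119" and "a/b" are rejected. Uniqueness across the configuration file still has to be checked separately.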
If omitted, no values are used.
Note: This attribute is available for Ninf-G Version 4.0.0 or later.
This specifies the port number on which the Globus gatekeeper is listening.
If omitted, the default '2119' value is used.
This specifies the host name of the MDS server that is queried first for MPI information about the server machine and for remote function information.
If the mds_hostname attribute is specified, the mds_tag attribute cannot be specified.
If omitted, no values are used.
This specifies the MDS tag name for the MDS server setting. If the mds_tag attribute is specified, the mds_hostname attribute cannot be specified.
If omitted, no values are used.
Note: This attribute appeared in Ninf-G Version 4.0.0.
This specifies the Invoke Server type to use for the server.
The attribute arguments are described in the <INVOKE_SERVER> section type attribute.
If omitted, the Invoke Server is not used.
Note: This attribute appeared in Ninf-G Version 4.0.0.
Note: The Invoke Server feature is supported only for pthreads flavors.
This specifies the options to pass to Invoke Server, when a function handle is created. Each Invoke Server implementation can define this option for any reason.
If the value is enclosed by double-quote characters ("), the value can include the space character.
This attribute can be specified multiple times. If omitted, the Invoke Server option is not used.
Note: This attribute appeared in Ninf-G Version 4.0.0.
This specifies the number of CPUs to be used when MPI is used on a server machine.
The number of CPUs for executing particular functions can be specified with the format "function name = number of CPUs."
If the function name is omitted, the default value for the number of CPUs for MPI on that server machine is used.
This specifies the scheme for the GASS server.
If omitted, http is used.
This specifies the method of authentication and encryption for communication paths. Choices are no authentication and no encryption (false), authentication only (authonly), authentication and encryption using SSL (SSL), and authentication and encryption using GSI (GSI).
If omitted, the default value 'false' is used.
Note: Encryption of transferred data is implemented using GSI, so a proxy certificate must be delegated to the Ninf-G Executable. Encryption is therefore basically available only for Pre-WS GRAM and WS GRAM.
Note: 'authonly' is available in Ninf-G Version 4.2.0 or later.
This specifies the protocol to be used between a Ninf-G Client and Ninf-G Executable. Either XML or binary can be specified.
If omitted, XML is used.
This specifies whether or not to force the use of XDR in the protocol between a Ninf-G Client and Ninf-G Executable.
If XML is used as the protocol, this setting has no effect. The main purpose of using this is for measurement of processing speed when XDR is used.
If omitted, the default value 'false' is used.
This specifies the job manager to be used on the server machine. Any of jobmanager-fork, jobmanager-pbs, jobmanager-gdr, or jobmanager-lsf can be specified, depending on the server machine settings.
If omitted, the default job manager on the server machine is used.
This specifies the subject part of the Globus Toolkit GRAM resource manager contact. The subject is usually used for the Globus personal gatekeeper.
If the value is enclosed by double-quote characters ("), the value can include the space character, as in "/C=JP/O=EXAMPLE/OU=GRID/CN=Example of Subject".
If omitted, no value is used.
This specifies the host name of the client machine.
The Ninf-G Executable on the server connects back to the client machine specified by this attribute. The attribute lets each server use a different name for the client machine, according to the network configuration of the client and the servers.
If omitted, the hostname of the CLIENT section is used.
Note: This attribute is not available if you use PreWS GRAM and enable redirect_outerr or executable staging.
This specifies the time-out time for job startup.
When grpc_call(), grpc_invoke_np() or another such RPC is executed, if the job has not started after this time has passed since the job start request was issued, a time-out occurs; each API ends and returns an error.
If the 0 value is specified, there is no time-out and the process waits until the job starts. If a value of 1 or greater is specified, the process waits that amount of time for the job to start. If a negative value is specified, an error results.
If omitted, the 0 value is used.
When grpc_function_handle_destruct(), grpc_object_handle_destruct() or other such job stop request is issued by the API, if the job has not stopped after this time elapses, a time-out occurs; each API ends and returns an error.
If a negative value is specified, there is no time-out and the process waits until the job stops. If the 0 value is specified, the process doesn't wait for the job to stop. If a value of 1 or greater is specified, the process waits that amount of time for the job to stop.
If omitted, the -1 value is used.
Note: The behavior of this attribute changed in Ninf-G Version 2.4.0. In Ninf-G Version 2.3.0 or earlier, if the 0 value is specified, there is no time-out and the process waits until the job stops; if a negative value is specified, an error results.
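The job start and job stop time-outs interpret 0 and negative values differently, which is easy to get wrong. The following sketch, with hypothetical type and function names that are not part of Ninf-G, encodes the rules stated above for Ninf-G Version 2.4.0 or later.

```c
/* How the client waits for a job state change (illustrative names only). */
typedef enum {
    WAIT_FOREVER,   /* no time-out: wait until the job starts or stops */
    WAIT_NONE,      /* do not wait at all */
    WAIT_TIMED,     /* wait for the configured amount of time */
    WAIT_INVALID    /* the value is a configuration error */
} wait_policy_t;

/* Job start time-out: 0 = wait forever, 1 or greater = timed wait,
 * negative = error. */
wait_policy_t start_timeout_policy(int value)
{
    if (value < 0) {
        return WAIT_INVALID;
    }
    return value == 0 ? WAIT_FOREVER : WAIT_TIMED;
}

/* Job stop time-out (Version 2.4.0 or later): negative = wait forever,
 * 0 = do not wait, 1 or greater = timed wait. */
wait_policy_t stop_timeout_policy(int value)
{
    if (value < 0) {
        return WAIT_FOREVER;
    }
    return value == 0 ? WAIT_NONE : WAIT_TIMED;
}
```

With the defaults (0 for job start, -1 for job stop), both functions return WAIT_FOREVER.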
This specifies the maximum job execution time. The value specified is used to pass the Globus GRAM RSL attribute "maxTime." The units are in minutes.
If omitted, no values are used.
This specifies the maximum job execution wall clock time. The value specified is used to pass the Globus GRAM RSL attribute "maxWallTime." The units are in minutes.
If omitted, no values are used.
This specifies the maximum job execution cpu time. The value specified is used to pass the Globus GRAM RSL attribute "maxCpuTime." The units are in minutes.
If omitted, no values are used.
Target the GRAM job to a queue (class) name as defined by the scheduler at the defined (remote) resource.
If omitted, no values are used.
Target the GRAM job to be allocated to a project account as defined by the scheduler at the defined (remote) resource.
If omitted, no values are used.
Defines the number of nodes (hosts) across which the Ninf-G Executable processes created by the handle array init API are distributed. This attribute applies only to clusters of SMP computers.
If omitted, no values are used.
Note: Because of a bug in jobmanager-pbs, this attribute does not work with jobmanager-pbs.
Specify the minimum amount of memory required for a Ninf-G Executable process. Units are in Megabytes.
If omitted, no values are used.
Specify the maximum amount of memory required for a Ninf-G Executable process. Units are in Megabytes.
If omitted, no values are used.
This specifies the WS GRAM RSL extensions. This attribute is available for Invoke Server GT4py and for PreWS GRAM.
WS GRAM RSL extensions are currently used only to specify client-specific data which the client wishes to associate with the job it is controlling.
WS GRAM RSL extensions can be processed by user defined WS GRAM jobmanager scripts. For Globus Toolkit 4.0.1, calling $description->extensions() subroutine in the file $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/fork.pm implements accessibility to the given RSL extensions. (See Globus Toolkit WS GRAM Users Guide for details.)
In addition, this attribute is also used to specify user defined PreWS GRAM attributes. Attribute values will just be added to the end of the RSL.
If the attribute value is enclosed by double-quote characters ("), the value can include the space and other characters.
In the string enclosed by double-quote characters, some characters are considered as escape characters.
Here is an example of valid usage.
|
This attribute can be specified multiple times. If omitted, job_rslExtensions is not used.
Note: This attribute is available for Ninf-G Version 4.0.0 or later.
This specifies the interval for sending the heart-beat from Ninf-G Executable to Ninf-G Client.
If the value 0 is specified, the heart-beat is not sent. If a value of 1 or greater is specified, the heartbeat is sent at that interval. If a negative value is specified, an error results.
If omitted, the value 60 is used.
Note: Heartbeat checking in a client program is supported only in the pthread version. Heartbeat sending by a Ninf-G Executable is supported by both the pthread version and the non-thread version of the Globus Toolkit flavor.
Note: If you are debugging a Ninf-G Executable or client, we suggest that you disable the heartbeat feature. This suppresses periodic heartbeat overhead and unexpected heartbeat timeouts.
This specifies the number of consecutive missed heartbeats after which a time-out occurs.
When no heartbeat has arrived for a time equal to the heartbeat interval multiplied by this time-out count, the Ninf-G Client assumes that the Ninf-G Executable is no longer operating.
If omitted, the value 5 is used.
This specifies the number of consecutive lost heartbeat messages after which a heartbeat time-out error is detected during data transfer between the Ninf-G Client and the Ninf-G Executable.
In the current Ninf-G implementation, heartbeat messages are not sent while data (the input/output arguments of an RPC) is being transferred. If the data is large and the transfer takes a long time, no heartbeat is sent for that period and a heartbeat time-out error may occur. To avoid this problem, this attribute sets a heartbeat time-out count specific to data transfer. Set 0 to ignore heartbeat time-outs during data transfer, or set a value large enough to avoid unexpected heartbeat time-out errors.
If omitted, the same value as the heartbeat_timeoutCount attribute is used.
Note: This attribute is available for Ninf-G Version 4.2.0 or later.
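The relationship between the heartbeat interval and the two time-out counts can be sketched as follows. This is an illustration only: the function names are hypothetical, a NULL pointer stands for "transfer-specific count omitted", and a result of 0 is used here to mean that no time-out applies.

```c
#include <assert.h>
#include <stddef.h>

/* Length of heartbeat silence after which the client gives up, in the same
 * unit as the heartbeat interval. Returns 0 when no time-out applies
 * (heartbeat disabled with interval 0, or a count of 0 during transfer). */
long heartbeat_silence_limit(int interval, int timeout_count)
{
    assert(interval >= 0 && timeout_count >= 0);
    return (long)interval * (long)timeout_count;
}

/* Chooses the count in effect. `transfer_count` is NULL when the
 * transfer-specific count was omitted; the general count then applies. */
int effective_timeout_count(int timeout_count, const int *transfer_count,
                            int transferring)
{
    if (transferring && transfer_count != NULL) {
        return *transfer_count;
    }
    return timeout_count;
}
```

With the default values (interval 60, count 5), the limit is 300. A long argument transfer with the transfer-specific count set to 0 yields 0, so no heartbeat time-out is raised during that transfer.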
This specifies redirection of the standard error or standard output of a Ninf-G Executable to a Ninf-G Client.
If omitted, the value 'true' is used.
Note: If the save_stdout or the save_stderr attribute on the server side configuration file is set, stdout or stderr is not delivered to the Ninf-G Client regardless of the value of redirect_outerr.
This specifies the maximum number of retries for establishing a TCP connection. This attribute is used for the following cases.
The default value of this attribute is 4.
Note: Ninf-G 2.3.0 and prior versions do not support this attribute. In order to disable this attribute, set tcp_connect_retryCount to 0 if the version of Ninf-G on the server is 2.3.0 or prior.
This specifies the base interval time for the first retry. The value is in seconds and must be a non-negative integer. This value is used as the maximum interval time for the first retry.
The default value of this attribute is 1.
This specifies the increase ratio which is used to calculate the maximum interval time between retries. The maximum interval time is calculated by multiplying this value and the maximum interval time for the last retry. For the first retry, the value of tcp_connect_retryBaseInterval is used as the maximum interval time.
The value must be greater than 1.0 and the default value of this attribute is 2.0.
This flag specifies whether or not a random value is used for the interval time. If the value is true, the interval time between retries is set randomly between 0.0 seconds and the maximum interval time. If the value is false, the maximum interval time is used as the interval time.
The default value of this attribute is true.
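The retry timing described above amounts to an exponential back-off with optional randomization. The sketch below is one reading of those rules and is not taken from the Ninf-G source; the function and parameter names are hypothetical, and rand() stands in for whatever random source the library actually uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Maximum interval in seconds before retry number `retry` (1-based).
 * The first retry uses the base interval; each later retry multiplies
 * the previous maximum by the increase ratio. */
double max_retry_interval(double base_interval, double increase_ratio,
                          int retry)
{
    assert(retry >= 1);
    double interval = base_interval;
    for (int i = 1; i < retry; i++) {
        interval *= increase_ratio;
    }
    return interval;
}

/* Actual wait before a retry: a random value in [0, max_interval] when
 * randomization is enabled, otherwise the maximum itself. */
double retry_wait(double max_interval, int use_random)
{
    if (!use_random) {
        return max_interval;
    }
    return max_interval * ((double)rand() / (double)RAND_MAX);
}
```

With the default values (base interval 1, ratio 2.0, retry count 4), the maximum intervals for the four retries are 1, 2, 4, and 8 seconds.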
When an asynchronous call function is used, this specifies the timing for that function's return.
The values that can be specified are 'wait' (wait until argument transfer is completed), 'nowait' (do not wait until argument transfer is completed), and 'copy' (without waiting for the completion of argument transfer, the values of the arguments passed to the asynchronous function are copied on the client side, and the argument transfer is done in the background).
If omitted, 'wait' is used.
This specifies the method for compressing the argument information. Either 'raw' or 'zlib' can be specified.
If omitted, 'raw' is used.
This specifies the threshold value when compression is performed. If the argument information size equals or exceeds the specified value, the information is compressed.
If omitted, the value of 64 kilobytes is used.
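The compression decision combines the method and the threshold. The following sketch shows that rule under two assumptions: the helper name should_compress is hypothetical, and an omitted method (passed as NULL) behaves like 'raw'.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true when argument data of `arg_size` bytes would be compressed:
 * the method must be "zlib" and the size must equal or exceed the
 * threshold. */
bool should_compress(const char *method, size_t arg_size, size_t threshold)
{
    if (method == NULL || strcmp(method, "zlib") != 0) {
        return false;               /* 'raw' or omitted: never compress */
    }
    return arg_size >= threshold;
}
```

Taking 64 kilobytes as 65536 bytes (an assumption about the unit), an argument of exactly 65536 bytes is compressed with 'zlib', while one of 65535 bytes is not.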
Arguments and results are divided into a specified block size when they are transferred between a Ninf-G Client and a Ninf-G Executable.
The value of this attribute affects the performance of data transfer and an appropriate value should be specified according to the size of the transferred data and network performance.
If 0 is specified, arguments and results will not be divided. If a positive integer is specified, they are divided into blocks with the specified value. An error occurs if a negative value is specified.
If omitted, the default value 16Kbytes is used.
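To get a feel for how argument_blockSize affects a transfer, the number of blocks for a given data size can be estimated as below. This is a sketch with a hypothetical function name; it assumes the last block may be shorter than the block size.

```c
#include <stddef.h>

/* Number of blocks used to transfer `data_size` bytes.
 * A block size of 0 means the data is not divided. */
size_t transfer_block_count(size_t data_size, size_t block_size)
{
    if (data_size == 0) {
        return 0;
    }
    if (block_size == 0) {
        return 1;                   /* sent as a single undivided piece */
    }
    /* Round up without risking overflow in data_size + block_size. */
    return data_size / block_size + (data_size % block_size != 0 ? 1 : 0);
}
```

For example, 40000 bytes with a 16384-byte block size is transferred as three blocks.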
This specifies the working directory for the Ninf-G Executable.
If omitted, the working directory is not changed when the staging function is used; in all other cases, the Ninf-G Executable path is used as the working directory.
This specifies the core dump file size for the Ninf-G Executable. The size is in 1024-byte increments.
If 0 is specified, no core dump file is created. If -1 is specified, the core dump file size is unlimited.
If omitted, no setup for core dump file size is performed.
This specifies whether the communication log output function is enabled or disabled.
If 'true' is specified, the communication log is output.
If not specified, the default value is false.
This specifies the name of the file to which the communication log is output.
The file name may include a path that includes a directory (e.g., "/home/myHome/var/logFile").
The file and directory name can include the following specifiers.
"%t" is replaced with the date as year, month and day, and the time in hours, minutes, seconds and milliseconds ("yyyymmdd-hhmmss-MMM") (e.g., "/home/myHome/var/logDir%t/logFile" is replaced by "/home/myHome/var/logDir20030101-154801-123/logFile").
"%h" is replaced with the Ninf-G Client hostname.
"%p" is replaced with the process id of the Ninf-G Client.
The Ninf-G Executable id number is added to the end of the file name.
When omitted, the log is output to standard error. If the communication log file name is omitted, commLog_suffix, commLog_nFiles, and commLog_maxFileSize are ignored.
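The specifier expansion can be illustrated with a small stand-alone sketch. It is not the Ninf-G implementation: the function names are hypothetical, the timestamp, host name, and process id are passed in by the caller, and the Ninf-G Executable id number that is appended to the file name is not modeled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Formats the "%t" replacement as "yyyymmdd-hhmmss-MMM".
 * Returns 0 on success, -1 on bad input or a too-small buffer. */
int format_log_timestamp(const struct tm *when, int millis,
                         char *out, size_t out_len)
{
    char date[32];
    assert(when != NULL && out != NULL);
    if (millis < 0 || millis > 999) {
        return -1;
    }
    if (strftime(date, sizeof date, "%Y%m%d-%H%M%S", when) == 0) {
        return -1;
    }
    int n = snprintf(out, out_len, "%s-%03d", date, millis);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Expands "%t", "%h" and "%p" in `pattern`; every other character is
 * copied unchanged. Returns 0 on success, -1 if `out` is too small. */
int expand_log_path(const char *pattern, const char *timestamp,
                    const char *host, long pid, char *out, size_t out_len)
{
    size_t used = 0;
    assert(pattern != NULL && timestamp != NULL && host != NULL && out != NULL);
    if (out_len == 0) {
        return -1;
    }
    out[0] = '\0';
    for (const char *p = pattern; *p != '\0'; p++) {
        char piece[64];
        const char *text = piece;
        if (p[0] == '%' && p[1] == 't') {
            text = timestamp;
            p++;
        } else if (p[0] == '%' && p[1] == 'h') {
            text = host;
            p++;
        } else if (p[0] == '%' && p[1] == 'p') {
            snprintf(piece, sizeof piece, "%ld", pid);
            p++;
        } else {
            piece[0] = *p;          /* ordinary character */
            piece[1] = '\0';
        }
        size_t len = strlen(text);
        if (used + len >= out_len) {
            return -1;
        }
        memcpy(out + used, text, len + 1);   /* copies the terminator too */
        used += len;
    }
    return 0;
}
```

Given 1 January 2003, 15:48:01 and 123 milliseconds, the pattern "/home/myHome/var/logDir%t/logFile" expands to "/home/myHome/var/logDir20030101-154801-123/logFile", matching the example above.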
When the communication log file is specified, this specifies the suffix used when the log file is created.
If a suffix is specified, the generated file name will be from "filename[000].suffix" to "filename[nnn].suffix". If omitted, the generated file name will be from "filename.[000]" to "filename.[nnn]". The number of files minus 1 is "nnn." The number of digits in "nnn" is the same as the number of digits in the number of files minus 1. For example, if the number of files is set to 100, then the number will range from "00" to "99."
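The width of the sequence number follows from commLog_nFiles. The sketch below computes it and formats one zero-padded sequence number; the function names are hypothetical, and the case of 0 files (an unlimited number of files) is deliberately left out because the text above does not define a width for it.

```c
#include <assert.h>
#include <stdio.h>

/* Number of digits in the sequence number: the digit count of
 * (n_files - 1). Assumes n_files >= 1. */
int log_index_width(int n_files)
{
    assert(n_files >= 1);
    int width = 1;
    for (int last = n_files - 1; last >= 10; last /= 10) {
        width++;
    }
    return width;
}

/* Writes `index` zero-padded to the width implied by `n_files`.
 * Returns 0 on success, -1 on a bad index or a too-small buffer. */
int format_log_index(int n_files, int index, char *out, size_t out_len)
{
    assert(out != NULL);
    if (index < 0 || index >= n_files) {
        return -1;
    }
    int n = snprintf(out, out_len, "%0*d", log_index_width(n_files), index);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

With 100 files the width is 2, so index 7 is written as "07", consistent with the "00" to "99" range given above.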
This is the number of files created for communication log output.
0 indicates an unlimited number of files can be output. A negative value results in an error.
If omitted, the value 1 is used.
This specifies the maximum number of bytes for the communication log file.
If omitted, the value will be unlimited if the number of files is one, or 1Mbyte if the number of files is two or more.
This establishes overwrite permission for the directory. If the specified directory exists, this specifies whether creation of log files in that directory is enabled or disabled. Operation in the case that the directory exists is shown below.
This specifies whether the debugging function is enabled or disabled.
If 'true' is specified, the debugger will be started up when the Ninf-G Executable starts up, allowing debugging of the Ninf-G Executable. If 'false' is specified, the Ninf-G Executable starts up without starting the debugger.
If omitted, the default 'false' value is used.
This specifies an X11 display for displaying the debugging terminal emulator.
To use the debugger, start up the terminal emulator on the server machine, and run the debugger on that terminal. This defines the value for the environment variable DISPLAY that is passed to the terminal emulator.
This specifies the path to the terminal emulator command.
If omitted, the value 'xterm' is used. The Ninf-G Executable searches for terminal emulator command in PATH that is set in the Ninf-G operating environment on the server machine that is used.
This specifies the path to the debugger command.
If omitted, the value 'gdb' is used. The Ninf-G Executable searches for the debugger command in PATH that is set in the Ninf-G operating environment on the server machine that is used.
This specifies whether or not the Ninf-G Executable waits for a debugger to attach.
If 'true' is specified, the Ninf-G Executable waits for a debugger to attach just after its invocation.
The user needs to invoke the debugger and attach it to that Ninf-G Executable. The user must then change the variable that controls the wait (debugBusyLoop) and continue execution. (With gdb, try "set var debugBusyLoop=0" followed by "continue".)
If omitted, the default 'false' value is used.
Note: It is very helpful to specify this attribute together with "environment NG_LOG_LEVEL=4" in the SERVER section, which displays the id of the process to attach to.
This specifies an environment variable that is passed to the Ninf-G Executable. It can be written either as 'variable name' only or in the 'variable name = value' style.
If omitted, no environment variable is passed.
The SERVER_DEFAULT section does not allow multiple definitions. This section may be omitted.
The SERVER_DEFAULT section defines the default values for attributes which are used when attributes are omitted in the SERVER section.
The SERVER_DEFAULT section is written in the same way as the SERVER section, except that the "hostname" attribute is not described.
The SERVER_DEFAULT section may be placed anywhere in the configuration file. (*)
(*) For example, even if the SERVER_DEFAULT section is written later than the SERVER section, if attributes are omitted in the previously described SERVER section, the attributes defined in the SERVER_DEFAULT section are used.
Ninf-G4 implements the mechanisms for remote process invocation as a separate module called the Ninf-G Invoke Server. This architecture makes it possible to support any job submission interface by implementing a Ninf-G Invoke Server for that interface.
Users must specify the Invoke Server for each server in the Ninf-G Client Configuration file, except for Pre-WS GRAM. The RPC mechanisms for Pre-WS GRAM are embedded in the Ninf-G Library, so it is not necessary to use an Invoke Server for Pre-WS GRAM.
Here is an example of a <SERVER> section in the Ninf-G Client Configuration file that specifies WS GRAM as the job submission interface.
|
Invoke Server can be set and configured in the Ninf-G Client Configuration file as described above. The details of the configuration of Invoke Server are described in sections 4.3.9 and 4.3.10.
Each Invoke Server may have its own options. In order to specify such options, the following attributes are provided in the Ninf-G Client Configuration file.
This attribute is used to specify Invoke Server options for a specific server.
This attribute is used to specify Invoke Server options for all servers.
Example:
|
Some attributes in the <SERVER> section are interpreted by each Invoke Server. For example, Invoke Server GT4py interprets the "port" attribute as the port number of WS GRAM, and Invoke Server SSH interprets it as the port number of sshd.
Note: The Invoke Server feature is supported only for pthreads flavors.
Invoke Server GT4py invokes Ninf-G Executable via WS GRAM.
GT4 must be installed on both client and server. The globusrun-ws command must be available on the client, and the remote server must be able to accept WS GRAM access.
Invoke Server GT4py is automatically installed through the Ninf-G installation processes described in section 2 of this manual.
Invoke Server GT4py accepts the following extra options.
This attribute specifies the type of the delegated proxy certificate. If delegate_full_proxy is set to "true", a full proxy certificate is delegated to the server. Otherwise, a limited proxy certificate is delegated. This option is provided to enable cascading RPC, since a limited proxy certificate does not allow subordinate GRAM accesses.
If omitted, "false" is used.
Note: If this option is set to true, an extra command (globus-credential-delegate) is executed internally, which may take 10 to 20 seconds if Globus Toolkit Version 4.0.3 or prior is used.
Note: This option is available for Ninf-G Version 4.2.0 or later.
This attribute specifies the protocol used to access WS GRAM.
If WS GRAM is running in non-secure mode (started by globus-start-container -nosec), "protocol http" must be set to access the WS GRAM.
If omitted, "https" is used.
WS GRAM RSL has an <extensions> tag, which enables the user to pass extra information to the WS GRAM server.
Invoke Server handles this feature through the job_rslExtensions attribute in the <SERVER> section.
Executable staging on WS GRAM server via Invoke Server GT4py requires the following steps in advance.
Invoke Server GT4py requires GridFTP servers on both remote and local hosts. The GridFTP server should be invoked either directly or via inetd/xinetd daemon. The port for the GridFTP server is not limited to the default port 2811.
If the client-side GridFTP server does not use the default port (2811), the port number of the GridFTP server must be specified in client configuration file. The port number can be specified by gsiftp_port option in invoke_server_option attribute in <SERVER> section.
|
The subject names used for mutual authentication between the WS GRAM container and the client-side GridFTP server depend on the owner of those daemons.
If they are invoked by the system, subject name of the host certificate is used. If they are invoked by a user, subject name of the user certificate is used.
According to the combination of the owners of the WS GRAM container and the client-side GridFTP server, some attributes need to be specified in the client configuration file.
It is not necessary to specify the subject name.
The subject name of the user must be specified by staging_source_subject attribute in <SERVER> section.
|
The subject name of the user must be specified by subject attribute in <SERVER> section.
|
The subject name of the user must be specified by subject attribute in <SERVER> section. The subject name of the client-side host must be specified by staging_source_subject attribute in <SERVER> section. The subject name of the user must be specified by staging_destination_subject and deletion_subject attributes in <SERVER> section.
|
For each server host, the scratch directory must be created for staging in advance.
The staging directory is "$HOME/.globus/scratch" ($GLOBUS_SCRATCH_DIR variable in GT4 GRAM RSL).
Please create the directory as follows:
|
Invoke Server SSH invokes Ninf-G Executable via SSH.
The user must be able to execute commands on the server using the ssh command. In addition, it is recommended to configure the user's ssh environment so that no user input (e.g. a password) is required, to avoid repeated prompts while a Ninf-G application is executed. The "ssh-agent" and "ssh-add" commands are usually used for this purpose.
The following commands are required by Invoke Server SSH and must be available on the server.
/bin/sh, /bin/echo, /bin/grep, /bin/chmod, /bin/mkdir, /bin/cat, /bin/rm, /bin/kill
Invoke Server SSH is automatically installed through the Ninf-G installation processes described in section 2 of this manual.
Like Globus GRAM, Invoke Server SSH is able to launch remote processes via a backend queuing system including SGE and PBS(*1). The backend queuing system is specified by "jobmanager" attribute in <SERVER> section in the Ninf-G Client Configuration file. The value of "jobmanager" attribute can be either "jobmanager-sge" for SGE or "jobmanager-pbs" for PBS.
Example:
|
It should be noted that although the values "jobmanager-sge" and "jobmanager-pbs" are also used for Invoke Servers for Globus GRAM (e.g. GT4py), the jobmanager programs used by Invoke Server SSH are implemented by the Ninf-G development team and are therefore completely different from the jobmanager programs provided by the Globus Toolkit.
The jobmanager program assumes that user's home directory is shared between front (master) node and compute nodes.
Invoke Server SSH uses the qsub, qstat, and qdel commands in jobmanager-sge and jobmanager-pbs. The path of these commands should therefore be included in the PATH environment variable. Otherwise, the path of each command must be passed with the options described below.
Command | Option |
---|---|
qsub | ssh_submitCommand |
qstat | ssh_statusCommand |
qdel | ssh_deleteCommand |
These options are described in detail in section 4.4.2.4 of this manual.
(*1) Invoke Server SSH is tested with PBS Pro and Torque.
Invoke Server SSH accepts the following extra options.
This option specifies the path of the "ssh" command. Invoke Server SSH connects to the remote host using the command specified by this option.
If omitted, /usr/bin/ssh is used.
This option specifies the path of the shell command used to invoke a shell on the remote host. If a backend queuing system is used, the specified shell is also used in the script for the backend queuing system.
If omitted, /bin/sh is used.
This option specifies the user name on the remote host. This value is passed to the "ssh" command as the "-l" argument.
If omitted, the "-l" option is not passed.
This option specifies any options to be passed to the "ssh" command.
Multiple ssh_option options can be specified.
This option specifies the directory in which temporary files are created on remote host.
If omitted, home directory is used.
This option specifies the command for submitting jobs on the remote host. This option is available only when a backend queuing system is used.
If omitted, qsub is used.
This option specifies the command for querying the status of jobs on the remote host. This option is available only when a backend queuing system is used.
If omitted, qstat is used.
This option specifies the command for deleting jobs on the remote host. This option is available only when a backend queuing system is used.
If omitted, qdel is used.
This option specifies the command for launching an MPI program on the remote host. This command is used when Invoke Server SSH invokes MPI jobs.
If omitted, "mpirun" is used.
This option specifies the command line options to be passed to the "mpirun" command on the remote host. This is used when Invoke Server SSH invokes MPI jobs.
Multiple ssh_MPIoption options can be defined.
This option specifies the command line option of the "mpirun" command for specifying the number of processors. This option is used when Invoke Server SSH invokes MPI jobs. The value of this option must include "%d", which Invoke Server SSH replaces with the actual number of processors.
If omitted, "-np %d" is used.
This option specifies the command line option of the "mpirun" command for specifying the machinefile. This option is used when Invoke Server SSH invokes MPI jobs using a backend queuing system. The value of this option must include "%s", which Invoke Server SSH replaces with the name of the actual machinefile.
If omitted, "-machinefile %s" is used.
This option specifies the parallel environment of SGE. It is used when Invoke Server SSH invokes MPI jobs or array jobs using SGE.
If omitted, *mpi* is used.
This option specifies the number of processors per node. It is used when Invoke Server SSH invokes MPI jobs or array jobs using PBS.
If omitted, 1 is used.
This specifies the RSH command used on the remote host when Invoke Server SSH invokes array jobs using PBS.
If omitted, /usr/bin/ssh is used.
Invoke Server Condor invokes Ninf-G Executable via Condor(*1).
*1 Condor Project: http://www.cs.wisc.edu/condor/
Condor must be installed on both client and server machines.
Invoke Server Condor is not installed by the default Ninf-G installation; it must be installed manually according to the following steps.
csh.
% setenv NG_DIR /path/to/ninf-g
sh.
$ NG_DIR=/path/to/ninf-g ; export NG_DIR
% cd ng-4.x.x # expanded Ninf-G package
% cd utility/invoke_server/condor
Execute the "make" command to compile Invoke Server Condor.
% make
Execute the "make install" command to install Invoke Server Condor.
% make install
This command copies the following files under the ${NG_DIR} directory.
${NG_DIR}/lib/
classad.jar - Log analysis library for Condor Job
condorAPI.jar - Condor Java API Library
condorIS.jar - Invoke Server Condor
${NG_DIR}/bin/
ng_invoke_server.Condor - Startup script for Invoke Server Condor
none
Invoke Server Condor automatically creates the Condor job cluster log when it invokes jobs. The name of the log file is "ninfg-invoke-server-condor-log".
Invoke Server NAREGISS invokes Ninf-G Executable via NAREGI Super Scheduler.
NAREGI Middleware V1.1 or later and Java 1.5.0 or later are required.
Invoke Server NAREGISS can be installed as a part of the Ninf-G installation steps. Invoke Server NAREGISS is installed if --with-naregi is specified as a Ninf-G configure script option.
Example:
% ./configure --with-naregi
NOTE: If NAREGI Middleware is not installed in the default directory (/usr/naregi), it is necessary to specify the directory with the configure option "--with-naregidir".
Details of Ninf-G configure script are described in 2.4 Configure command options.
Invoke Server NAREGISS assumes that the Ninf-G Client is invoked as a job via NAREGI SS, and expects the following.
Invoke Server NAREGISS accepts the following extra options.
This option specifies the directory for the temporary files used by Invoke Server NAREGISS on remote host.
If the remote host is a PC cluster, it is recommended to set this option to a directory which is shared by all cluster nodes.
If omitted, user's home directory is used.
This option specifies the system on which Ninf-G Executable will run. This is specified by the hostname of the head node of the system.
Multiple CandidateHost options can be specified.
This option specifies the name of the operating system of the computing resources. It is required by NAREGI Super Scheduler.
This option specifies CPU architecture of the computing resources. It is required by NAREGI Super Scheduler.
This option specifies the minimum number of CPUs per computing node. This is required by NAREGI Super Scheduler.
This option specifies the maximum size of physical memory that the Ninf-G Executable will use. This is required by NAREGI Super Scheduler.
This option controls the output of logs of Invoke Server NAREGISS.
The following values can be specified.
IS_COMMAND | : | Output logs about communication between Ninf-G Client and Invoke Server |
SS_COMMAND | : | Output logs of XML document related NAREGI SS |
SS_WF_ID | : | Output logs of EPR of NAREGI SS job |
ALL | : | Output all logs |
Multiple values can be specified by delimiting them by spaces.
If omitted, Invoke Server NAREGISS outputs the minimum logs.
Note: If log file is not specified using invoke_server_log attribute in <CLIENT> section or log_filePath in <INVOKE_SERVER> option, this option is ignored.
Note: Whenever a function/object handle is created, Invoke Server receives the Invoke Server options. But this option is effective only at the first time of a handle creation. Therefore, this option must be specified not in invoke_server_option attribute in <SERVER> section but in option attribute in <INVOKE_SERVER> section.
This option specifies the maximum job execution wall clock time in seconds. The job_maxWallTime attribute of the <SERVER> section also specifies the maximum job execution wall clock time, but it is specified in minutes.
If both this option and the job_maxWallTime attribute are specified, the value of this option is used. If neither this option nor the job_maxWallTime attribute is specified, "1000 seconds" is used as the default maximum job execution wall clock time.
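The precedence rule above can be sketched in C as follows. This is only an illustration of the rule, not Ninf-G's implementation: the function name resolve_max_wall_time_seconds and the convention that a negative value means "not specified" are assumptions made for this example.

```c
/* Hypothetical helper: not part of Ninf-G. A negative argument means
 * "not specified" (a convention chosen only for this sketch). */
#define DEFAULT_MAX_WALL_TIME_SECONDS 1000L

long resolve_max_wall_time_seconds(long option_seconds,
                                   long job_max_wall_time_minutes)
{
    if (option_seconds >= 0) {
        /* The Invoke Server option (in seconds) takes precedence. */
        return option_seconds;
    }
    if (job_max_wall_time_minutes >= 0) {
        /* job_maxWallTime is given in minutes, so convert it. */
        return job_max_wall_time_minutes * 60L;
    }
    /* Neither was specified: fall back to the documented default. */
    return DEFAULT_MAX_WALL_TIME_SECONDS;
}
```

For instance, when only job_maxWallTime is set to 10 (minutes), the sketch yields 600 seconds.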
This option specifies the MPI type.
If omitted, "GridMPI" is used.
This option specifies the number of processes per host.
If omitted, "1" is used.
Invoke Server NAREGISS has some known problems. Details are described in section 11.5, "Problems related to NAREGI SS."
% grid-proxy-init
Note: This operation is required if PreWS GRAM or WS GRAM is used.
% ./test_client [args ...]
Ninf-G supports the GridRPC API for C and Java.
In this section, the flow of an application program (written in C) for using GridRPC is described and a few typical GridRPC API functions are introduced.
Of the functions described here, those whose names end in _np are not included in the GridRPC API standard (i.e., they are specific to Ninf-G).
A full list of the GridRPC APIs and a detailed explanation of each API can be found in chapter 7, "API Reference."
The typical flow of an application program for using GridRPC is as follows.
The functions used in the above processes are described below.
The following function is used for initialization.
grpc_error_t grpc_initialize(char *configFile)
This function accepts the name of the configuration file as an argument, reads that file, analyzes its content, and saves the values.
If the argument value is NULL, the file specified by the NG_CONFIG_FILE environment variable is taken to be the configuration file.
The return value is an error status code that reports a failure to read the configuration file or to save the values that were read.
An example of using grpc_initialize() is given below. (In this example, the configuration file name is taken from the command line argument and that value is used as the argument.)
|
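A minimal sketch of such initialization is given here. It is not the manual's original example. The helper name init_from_args and the USE_NINFG macro are hypothetical, and the header name grpc.h is assumed; when the macro is not defined, a stand-in with the signature quoted above is compiled instead, so that the sketch builds without Ninf-G.

```c
#include <stdio.h>

#ifdef USE_NINFG   /* hypothetical macro: define it to build against Ninf-G */
#include "grpc.h"  /* assumed name of the Ninf-G client header */
#else
/* Stand-ins that mimic only the signature quoted above, so that the
 * sketch compiles without Ninf-G. They do not read any file. */
typedef int grpc_error_t;
enum { GRPC_NO_ERROR = 0 };
static grpc_error_t grpc_initialize(char *configFile)
{
    (void)configFile;
    return GRPC_NO_ERROR;
}
#endif

/* Hypothetical helper: initializes GridRPC from the command line.
 * Returns 0 on success and 1 on failure. */
int init_from_args(int argc, char *argv[])
{
    /* With no argument, pass NULL so that NG_CONFIG_FILE is consulted. */
    char *config_file = (argc > 1) ? argv[1] : NULL;
    grpc_error_t result = grpc_initialize(config_file);

    if (result != GRPC_NO_ERROR) {
        fprintf(stderr, "grpc_initialize() failed (error code %d)\n",
                (int)result);
        return 1;
    }
    return 0;
}
```

A client's main() would typically call such a helper first and exit if it reports a failure.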
In GridRPC, "handles" are used when performing operations such as executing remote functions and remote methods. A handle must be created before executing a remote function or remote method, but the type of handle created differs with the type of Ninf-G Executable used.
If only one remote function is defined for the Ninf-G Executable used, a "function handle" is used; if multiple remote methods are defined, an "object handle" is used.
Functions for creating both kinds of handles are shown below.
grpc_error_t
grpc_function_handle_init(
grpc_function_handle_t *handle,
char *server_name,
char *func_name)
grpc_error_t
grpc_object_handle_init_np(
grpc_object_handle_t_np *handle,
char *server_name,
char *class_name)
These functions accept a 'server name' and a 'function or class name,' and create a handle for operating the specified Ninf-G Executable on the specified server.
The return value is an error code that reports a failure to create the handle.
For example, a function handle is created as follows.
|
The following functions for creating multiple handles at one time are also provided by Ninf-G. (See chapter 7 for details.)
grpc_function_handle_array_init_np()
grpc_object_handle_array_init_np()
The handle just created can be used to call the specified remote function or remote method on the server. When the call is made, values for the arguments defined in the Ninf-G IDL must be passed.
The functions used for calling a function differ for a function handle and an object handle. When calling a remote method with an object handle, the name of the remote method must be specified.
Remote functions and remote methods can be called in two ways, with a 'synchronous call' and with an 'asynchronous call.'
The synchronous call does not return until the execution of the remote function or remote method is completed.
The asynchronous call returns either at the beginning or at the completion of the sending of the arguments to the remote function or remote method; it then waits for the completion of the remote function or remote method to obtain the result. (The return timing of the function that makes the asynchronous call can be specified in the configuration file.)
Functions for making synchronous calls to remote functions and remote methods are shown below.
grpc_error_t
grpc_call(
grpc_function_handle_t *handle, ...)
grpc_error_t
grpc_invoke_np(
grpc_object_handle_t_np *handle,
char *method_name, ...)
These functions accept the handle and the parameter values to be passed to the remote function or remote method (the remote method name also, in the case of the grpc_invoke_np() function), execute the computation by the specified remote function or remote method, and return as soon as the computation is completed.
As the return value, an error status code is returned to inform the user when the execution of the remote function or remote method fails.
For example, a call to a remote function or remote method defined in the IDL file below is made in the form of grpc_call() below that.
|
|
Functions for making asynchronous calls to remote functions and remote methods are shown below.
grpc_error_t
grpc_call_async(
grpc_function_handle_t *handle,
grpc_sessionid_t *session_id, ...)
grpc_error_t
grpc_invoke_async_np(
grpc_object_handle_t_np *handle,
char *method_name,
grpc_sessionid_t *session_id, ...)
These functions accept the handle and the parameter values to be passed to the remote function or remote method (and also the remote method name, in the case of the grpc_invoke_async_np() function), issue a request for computation to the specified remote function or remote method, and return when the transmission of arguments begins or when it ends (which can be set in the configuration file).
If successful, GRPC_NO_ERROR is returned. In the case of an error, an error code is returned.
The returned session ID is used when waiting for the execution results or for other such purposes.
Functions for waiting for the completion of the computation for an asynchronous call are shown below. All of these functions return an error status code that reports cases in which execution of the session fails.
grpc_error_t grpc_wait(grpc_sessionid_t session_id)
This waits for completion of the session specified by the session ID passed in the argument and returns when the session ends.
grpc_error_t grpc_wait_any(grpc_sessionid_t *id)
This waits for completion of any of the current sessions and returns when the session ends.
grpc_error_t grpc_wait_and(
grpc_sessionid_t *sessions, size_t length)
Waits for completion of all of the sessions specified by the array of session IDs and returns when they end.
grpc_error_t grpc_wait_or(
grpc_sessionid_t *sessions, size_t length, grpc_sessionid_t *id)
Waits for completion of any of the sessions specified by the array of session IDs and returns when one of them ends.
grpc_error_t grpc_wait_all()
This waits for completion of all of the current sessions and returns when they all have ended.
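The asynchronous call and wait functions can be combined as in the following sketch. It is an illustration under stated assumptions rather than an excerpt from Ninf-G: the helper name run_two_sessions, the USE_NINFG macro, and the remote function's argument list (int n, double in[n], double out[n]) are hypothetical, and the header name grpc.h is assumed. Without the macro, stand-ins with the signatures quoted above are compiled, so no remote computation takes place.

```c
#include <stdio.h>

#ifdef USE_NINFG   /* hypothetical macro: define it to build against Ninf-G */
#include "grpc.h"  /* assumed name of the Ninf-G client header */
#else
/* Stand-ins that mimic only the signatures quoted above, so that the
 * sketch compiles without Ninf-G. No remote computation takes place. */
typedef int grpc_error_t;
typedef int grpc_sessionid_t;
typedef struct { int unused; } grpc_function_handle_t;
enum { GRPC_NO_ERROR = 0 };
static grpc_error_t grpc_call_async(grpc_function_handle_t *handle,
                                    grpc_sessionid_t *session_id, ...)
{
    static grpc_sessionid_t next_id = 1;
    (void)handle;
    *session_id = next_id++;
    return GRPC_NO_ERROR;
}
static grpc_error_t grpc_wait_all(void) { return GRPC_NO_ERROR; }
#endif

#define N 4   /* array length used by the hypothetical remote function */

/* Hypothetical helper: starts one session on each of two handles that
 * have already been created, then waits for both.
 * Returns 0 on success and 1 on failure. */
int run_two_sessions(grpc_function_handle_t *first,
                     grpc_function_handle_t *second)
{
    double in_a[N] = {1.0, 2.0, 3.0, 4.0}, out_a[N];
    double in_b[N] = {5.0, 6.0, 7.0, 8.0}, out_b[N];
    grpc_sessionid_t id_a, id_b;
    grpc_error_t err_a, err_b, err_wait;

    /* Arguments after the session ID must follow the order in the IDL;
     * (int n, double in[n], double out[n]) is assumed here. */
    err_a = grpc_call_async(first, &id_a, N, in_a, out_a);
    err_b = grpc_call_async(second, &id_b, N, in_b, out_b);

    /* Always wait, so that no session can still be using the local
     * arrays when this function returns. */
    err_wait = grpc_wait_all();

    if (err_a != GRPC_NO_ERROR || err_b != GRPC_NO_ERROR ||
        err_wait != GRPC_NO_ERROR) {
        fprintf(stderr, "asynchronous call or wait failed\n");
        return 1;
    }
    return 0;
}
```

To wait for one specific session instead, grpc_wait() could be called with id_a or id_b in place of grpc_wait_all().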
To release resources, handles that are no longer needed must be destructed. The function used to destruct a handle differs with the type of handle.
Functions for destructing handles are shown below.
grpc_error_t
grpc_function_handle_destruct(
grpc_function_handle_t *handle)
grpc_error_t
grpc_object_handle_destruct_np(
grpc_object_handle_t_np *handle)
These functions destruct the specified handle.
The return value is an error status code that reports a failure to destruct the handle.
If two or more handles were created at once, then the following functions for destructing multiple handles at one time must be used.
grpc_function_handle_array_destruct_np()
grpc_object_handle_array_destruct_np()
The following function is used to perform termination processing.
grpc_error_t grpc_finalize()
This function executes the processing when the Ninf-G Client is terminated.
The return value is an error status code to inform the user when termination processing fails.
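The whole flow, from initialization to termination, can be sketched with a synchronous call as follows. This is a hedged illustration, not the manual's own sample program: the helper name run_client, the USE_NINFG macro, the function name "sample/copy" (including its "module/function" form), and the argument list (int n, double in[n], double out[n]) are all assumptions, and the header name grpc.h is assumed. Without the macro, stand-ins with the signatures quoted in this section are compiled.

```c
#include <stdio.h>

#ifdef USE_NINFG   /* hypothetical macro: define it to build against Ninf-G */
#include "grpc.h"  /* assumed name of the Ninf-G client header */
#else
/* Stand-ins that mimic only the signatures quoted in this section, so
 * that the sketch compiles without Ninf-G. Nothing runs remotely. */
typedef int grpc_error_t;
typedef struct { int unused; } grpc_function_handle_t;
enum { GRPC_NO_ERROR = 0 };
static grpc_error_t grpc_initialize(char *configFile)
{ (void)configFile; return GRPC_NO_ERROR; }
static grpc_error_t grpc_function_handle_init(grpc_function_handle_t *handle,
                                              char *server_name,
                                              char *func_name)
{ (void)handle; (void)server_name; (void)func_name; return GRPC_NO_ERROR; }
static grpc_error_t grpc_call(grpc_function_handle_t *handle, ...)
{ (void)handle; return GRPC_NO_ERROR; }
static grpc_error_t grpc_function_handle_destruct(grpc_function_handle_t *handle)
{ (void)handle; return GRPC_NO_ERROR; }
static grpc_error_t grpc_finalize(void) { return GRPC_NO_ERROR; }
#endif

#define N 4   /* array length used by the hypothetical remote function */

/* Hypothetical helper: runs one synchronous remote call.
 * Returns 0 on success and 1 on failure. */
int run_client(char *config_file, char *server_name)
{
    static char func_name[] = "sample/copy";  /* hypothetical name */
    double in[N] = {1.0, 2.0, 3.0, 4.0};
    double out[N] = {0.0};
    grpc_function_handle_t handle;
    int status = 1;

    /* 1. Initialization */
    if (grpc_initialize(config_file) != GRPC_NO_ERROR) {
        fprintf(stderr, "grpc_initialize() failed\n");
        return 1;
    }

    /* 2. Handle creation */
    if (grpc_function_handle_init(&handle, server_name, func_name)
            == GRPC_NO_ERROR) {
        /* 3. Synchronous call: returns after the remote function ends */
        if (grpc_call(&handle, N, in, out) == GRPC_NO_ERROR) {
            status = 0;
        }
        /* 4. Handle destruction */
        if (grpc_function_handle_destruct(&handle) != GRPC_NO_ERROR) {
            status = 1;
        }
    }

    /* 5. Termination processing */
    if (grpc_finalize() != GRPC_NO_ERROR) {
        status = 1;
    }
    return status;
}
```

The handle is destructed whether or not the call succeeded, and grpc_finalize() is reached on every path after a successful grpc_initialize().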
The API that provides capabilities that have been added in Ninf-G v2 is described below.
When a callback is used, a function that has the same name and the same arguments as the callback-type argument described in the Ninf-G IDL must be defined and implemented in the application program.
Below is an application program that corresponds to the callback example that appears in chapter 3, "Creating and setting up server-side programs" (section 3.1). (The Ninf-G Executable and the Ninf-G Client exchange status values.)
Note: The maximum number of parameters that can be defined for a callback function is 32.
|
A function for checking the status of a session is shown below.
grpc_error_t
grpc_session_info_get_np(
grpc_sessionid_t session_id,
grpc_session_info_t_np *info,
int *status)
This checks the status of the session that corresponds to the session ID specified in the argument.
If the heartbeat is not received normally, GRPC_SESSION_DOWN is stored in the third argument (status) of this function. If an error occurs, an error code is returned as the return value.
A function for canceling a session is shown below.
grpc_error_t grpc_cancel(grpc_sessionid_t session_id)
This cancels the session that corresponds to the session ID specified in the argument.
An error code is returned as the return value to inform the user that an error has occurred.
The Ninf-G Client can use multiple user proxy certificates. This capability, which is enabled by Invoke Server, is useful for using different user proxy certificates depending on the security configuration (accepted CAs) of each server.
This section describes how to use multiple certificates.
Copy the script from template ($NG_DIR/etc/ng_invoke_server.GTtempl).
% cp $NG_DIR/etc/ng_invoke_server.GTtempl $NG_DIR/bin/ng_invoke_server.GT4cert1
% chmod u+x $NG_DIR/bin/ng_invoke_server.GT4cert1
Modify the copied script to specify the user proxy certificate (*1) and the Invoke Server script (*2) that you will use.
#! /bin/sh
X509_USER_PROXY=/path/to/x509up_xxxx    <- (*1)
export X509_USER_PROXY
exec $NG_DIR/bin/ng_invoke_server.GT4py    <- (*2)
Modify the client configuration file and specify the Invoke Server that you created at a.1.1.
<SERVER>
    hostname example.org
        :
    invoke_server GT4cert1
</SERVER>
Ninf-G4 supports cascading RPC, which enables a Ninf-G Executable to call the GridRPC API. Cascading RPC is realized by (1) implementing remote functions that call the GridRPC API (server-side implementation) and (2) configuring the Ninf-G Client to enable delegation of full proxy certificates (client-side configuration).
In the IDL file, specify ng_cc as the Compiler and Linker.
Example:
|
Example:
|
It is implemented by embedding GridRPC APIs such as grpc_initialize(), grpc_function_handle_init(), and grpc_call() in the body of the remote function.
Specify the "delegate_full_proxy true" option for Invoke Server GT4py in the <SERVER> section of the Ninf-G Client configuration file.
Example:
|
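As a hedged sketch of such a setting, the fragment below assumes that the option is passed through the invoke_server_option attribute of the <SERVER> section, which is mentioned earlier in this chapter. The hostname example.org is a placeholder, and the exact quoting of the option value is an assumption.

```
<SERVER>
    hostname example.org
    invoke_server GT4py
    invoke_server_option "delegate_full_proxy true"
</SERVER>
```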
Example:
|
The Ninf-G Executable searches for the following files in the current working directory.
If staging is off, the Ninf-G Executable runs in the directory in which the Ninf-G Executable exists.
If staging is on, the Ninf-G Executable always runs in the user's home directory unless a working directory is explicitly specified by the user with the "workDirectory" attribute in the Client Configuration file.
Example:
|
Note: Cascading RPC is available for Ninf-G Version 4.2.0 or later.