Ninf-G uses the Globus Toolkit to provide an operating environment for GridRPC.
GridRPC is middleware that provides a model for access to remote libraries and parallel programming for tasks on a grid. Typical GridRPC middleware includes Ninf and NetSolve.
GridRPC is considered effective for use in the following cases.
Commercial programs or libraries are sometimes provided only in binary format for particular computers on the grid and cannot be executed on other computers. There are also problems concerning licensing and source code compatibility. Furthermore, resources that are attached to particular machines, such as video cameras, electron microscopes, telescopes and sensors, can only be used through processing that runs on those machines.
In such cases, an environment that allows the resources (including software) to be used on a particular computer is needed.
Many programs contain routines that perform a large amount of computation, and running just those parts of the program takes a lot of time.
The time required to run the program can be shortened by off-loading such program parts to a high-performance server on the grid.
When the demands on memory and disk space on the client machine are so strong that large-scale computation cannot be done there, it is desirable to be able to off-load the work in an easily understood way, without having to consider argument marshalling.
Execution of Parameter Sweep by multiple servers on the grid
Parameter Sweep is a program that enables execution of computation on multiple servers in parallel, using some subset of the parameters. The respective servers run independently using different parameters, with virtually no dependence on other servers.
There are surprisingly many programs like Parameter Sweep.
The Monte Carlo method program is one of them.
Although Parameter Sweep can also be implemented with a Message Passing Interface (MPI), programming is rather simple with GridRPC and Parameter Sweep can be executed to match the (dynamically changing) scale of the grid (execution by multiple clusters, taking resource management, security, etc., into account).
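As a rough illustration of the Parameter Sweep pattern, the following C sketch splits a range of parameter indices into independent contiguous chunks, one per server. It is a local stand-in only: the names sweep_chunk_t and sweep_partition are hypothetical, and no GridRPC call is made. In a real Ninf-G client, each chunk would be handed to a remote call on a separate handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: the half-open range [begin, end) of parameter
 * indices assigned to one server. */
typedef struct {
    size_t begin;
    size_t end;
} sweep_chunk_t;

/* Split n_params parameter indices into n_servers contiguous chunks whose
 * sizes differ by at most one. The chunks do not depend on each other,
 * which is what makes a parameter sweep easy to run in parallel. */
void sweep_partition(size_t n_params, size_t n_servers, sweep_chunk_t *chunks)
{
    assert(n_servers > 0 && chunks != NULL);

    size_t base = n_params / n_servers;   /* minimum chunk size */
    size_t extra = n_params % n_servers;  /* first 'extra' chunks get one more */
    size_t next = 0;

    for (size_t i = 0; i < n_servers; i++) {
        size_t len = base + (i < extra ? 1 : 0);
        chunks[i].begin = next;
        chunks[i].end = next + len;
        next += len;
    }
}
```

If the number of available servers changes between runs, only n_servers needs to change; the per-chunk work stays the same.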
Ordinary or large-scale task parallel programs on a grid
Task arrangement programs are easy to write with GridRPC. An API that supports the synchronization of various task arrangements with mixed exchange among multiple clients and servers can be used.
GridRPC not only provides an interface for easy mathematical computation and scheduling of tasks for parallel execution, but also makes it possible to execute processing that matches the (dynamically changing) scale of the grid, as in the case of Parameter Sweep.
New features and functions have been added to Ninf-G Version 4 (Ninf-G4).
Globus Toolkit Version 4 (GT4) provides a new framework and a new mechanism to provide job invocation (WS GRAM) and information services (Information Services: MDS4). Ninf-G4 supports the WS GRAM and MDS4 functions. The Ninf-G configuration file provides attributes to use these functions.
Ninf-G4 has a new module called Invoke Server. This module enables Ninf-G to support many types of job invocation (WS GRAM, Pre-WS GRAM, UNICORE, ...).
Any job submission interface can be used for remote process invocation by implementing a Ninf-G Invoke Server for that interface. Detailed information on how to develop a Ninf-G Invoke Server is given in the Invoke Server Developer's Manual.
Ninf-G4 supports cascading RPC, which enables a Ninf-G Executable to call the GridRPC API. Cascading RPC implements hierarchical RPC, which is an implementation technique for making applications scalable and achieving high performance for fine-grained task parallel applications. Cascading RPC is available for Invoke Server GT4py by delegating full proxy certificates. Detailed information on this feature is available in Section 4.4.1. Other Invoke Servers, such as Invoke Server SSH and Invoke Server Condor, may also be able to run cascading RPC; however, they are not officially supported for it.
Ninf-G4 is source code compatible with Ninf-G Version 2 (Ninf-G2). The source code of IDL files and client programs is compatible between Ninf-G2 and Ninf-G4. The client configuration file format in Ninf-G4 is an extension of the Ninf-G2 format and is upward compatible with it.
Ninf-G4 uses the same protocol as Ninf-G2 for communication between Ninf-G Client and Ninf-G Executable, so mixed use of Ninf-G2 and Ninf-G4 is supported; i.e., a Ninf-G2 client is able to call a Ninf-G4 server and vice versa.
In addition, the GT2 functions (Pre-WS GRAM, Pre-WS MDS: MDS2) that are used by Ninf-G2 can be used by Ninf-G4 as well. Basically, Ninf-G2 users do not need to worry about compatibility problems. Only users who are interested in the new Ninf-G4 features need to learn about the new capabilities of Ninf-G.
Ninf-G is a set of library functions that provide an RPC capability in a Grid environment, based on the GridRPC API specifications.
Ninf-G and the application programs that use Ninf-G consist of Ninf-G Executables that execute computation on server machines, and Ninf-G Clients that issue requests for computation to the Ninf-G Executables from client machines.
The Ninf-G Executables consist of functions that perform calculations (calculation functions) and a Ninf-G stub program that calls the calculation functions. Communication between clients and servers is accomplished by TCP/IP using a proprietary Ninf-G protocol.
The relationships between clients and servers are illustrated in Fig. 1.
Figure 1: Clients and servers
Ninf-G employs the capabilities provided by the Globus Toolkit (http://www.globus.org/) for server machine authentication, information search, job start-up, communication and file transfer. The relations among applications, Ninf-G, the Globus Toolkit and the OS are illustrated in Fig. 2.
Figure 2: Program hierarchy
Ninf-G Clients are comprised of the following elements.
Ninf-G Executables are comprised of the following elements.
Ninf-G is supplied to the user as a source package, which includes the library functions (API) and utility commands. The operating environment required for the library functions and utility commands is shown in Table 1.
The usage of GT2 (implying the use of Pre-WS GRAM or MDS2) requires a GT2, GT3 or GT4 installation. Every Globus Toolkit version is compatible with GT2.
The usage of GT4 (implying the use of WS GRAM or MDS4) requires a GT4 installation.
Globus Toolkit | 2.2 or later (2.4, 3.2, 4.0) |
Python | 2.3 or later |
- | - |
Target machine | SPARC |
Operating system | Solaris 9 (SunOS 5.9) |
Compiler | Sun Compiler or gcc 2.95 |
Globus Toolkit flavor | vendorcc32dbg, vendorcc32dbgpthr, gcc32dbg, gcc32dbgpthr |
- | - |
Target machine | PC-AT compatible (x86, Opteron) |
Operating system | Linux(*1) |
Compiler | gcc 2.95, gcc 3.0, 3.1, 3.2, 3.3, 3.4(*2) |
Globus Toolkit flavor | gcc32dbg, gcc32dbgpthr, gcc64dbg, gcc64dbgpthr |
- | - |
Target machine | IBM Power4 |
Operating system | AIX 5.2 |
Compiler | C for AIX Compiler, Version 6 |
Globus Toolkit flavor | vendorcc32dbg or vendorcc32dbgpthr |
- | - |
Target machine | Apple Mac (PowerPC) |
Operating system | MacOS X |
Compiler | gcc 4.0.0 |
Globus Toolkit flavor | gcc32dbg or gcc32dbgpthr |
(*1) We are checking operation with the following distributions.
(*2) There are problems with gcc 2.96, so we recommend you use gcc 2.95.x or gcc 3.0, 3.1, 3.2, 3.3, 3.4.
Ninf-G allows the definition of a single computation function (1) or multiple computation functions (2) for a Ninf-G Executable running on a server machine. The execution schemes for these are shown in Fig. 3. In either case, it is possible to execute just one computation function at a time on the Ninf-G Executable. To execute multiple computation functions at the same time, it is necessary to run multiple Ninf-G Executables. This is illustrated in Fig. 4.
In Ninf-G, the second scheme (2) is referred to as "Ninf-G Executable objectification" and the calling of the computation is referred to as a "method call."
Figure 3: Overview of operation
Figure 4: Parallel execution
Ninf-G provides handles for manipulating a Ninf-G Executable. Different handles are used for the two schemes, (1) and (2), described above. As shown in Table 2, two types of handles are provided, function handles and object handles.
Function handle | Used for manipulation of a Ninf-G Executable for which a single function is defined |
Object handle | Used for manipulation of a Ninf-G Executable for which multiple functions are defined |
Ninf-G Executables that run on server machines are started up from Ninf-G Clients, which run on client machines. A Ninf-G Executable is started up by performing the following procedure using the job control method provided by the Globus Toolkit or Invoke Server.
When running a Ninf-G Client program, however, there is no particular need for the user to be aware of this mechanism.
For example, if the Invoke Server for Globus Toolkit WS-GRAM is selected for use, the Invoke Server requests the remote WS-GRAM to perform the invocation. The requested remote WS-GRAM invokes the jobmanager, and the jobmanager invokes the Ninf-G Executable.
This process is shown in Fig. 5.
Figure 5: Starting up a Ninf-G Executable
Starting up a Ninf-G Executable requires path information that specifies the location of the Ninf-G Executable on that server machine. Information on the functions that are called by the Ninf-G Executable is also required. That information is collectively referred to as the Ninf-G Executable information. Ninf-G provides the following methods of registering and accessing Ninf-G Executable information.
When running a Ninf-G Client program, however, there is no particular need for the user to be aware of this mechanism.
Figure 6: Local LDIF file
Figure 7: Ninf-G Executable
(*) The information search function provided by the Globus Toolkit.
Figure 8: MDS
This is a program written by a user for the purpose of controlling the execution of computation. It is obtained by linking a user-written application program to the Ninf-G Client Library (and Globus Toolkit).
The Ninf-G Client Library puts together the API used by application programs that run on client machines (Ninf-G Client API).
This is a program written for the execution of user requests for computation to be performed on a remote computer. It is obtained by linking a user-written computation function to stub code and the Ninf-G Executable Library (and Globus Toolkit). The stub code is produced by the stub generator according to the interface specifications of the user-defined computation function. The interface specifications are written in the Ninf-G IDL (Interface Description Language) specified by Ninf-G.
The Ninf-G Executable Library puts together the API (Ninf-G Executable API) used by a Ninf-G Executable.
A machine that is running a Ninf-G Client.
A machine that is running a Ninf-G Executable.
A function handle is a data item whose type is grpc_function_handle_t. The function handle represents a mapping from a function name to an instance of that function on a particular server.
An object handle is a data item whose type is grpc_object_handle_t_np. The object handle represents a mapping from a class name to an instance of that class on a particular server. The instance is called a Ninf-G remote object, and it is able to contain multiple methods.
A computational function written by the user. (It might be only a single computation function for a Ninf-G Executable)
A computational function written by the user. (It might be multiple computation functions for a Ninf-G Executable)
A session extends from the time an RPC is made to the time its execution is completed.
In Ninf-G, a session extends
This is the standard API that systems implementing GridRPC should have. Standardization of the GridRPC C language API by the GGF WG is still in progress.
IDL is the acronym for Interface Description Language. It is a language for writing interfaces for the remote functions and remote methods defined by Ninf-G Executables.
This is the identifier for Ninf-G Executables. The user may specify any character string in the Ninf-G IDL.
Ninf-G provides the following functionalities for reducing overhead for initialization of function handles.
A single GRAM call usually takes several seconds for GSI authentication and a process invocation via the Globus jobmanager. This indicates that it will take more than several minutes to tens of minutes for hundreds of GRAM calls on a large-scale cluster. Also, many Globus jobmanager processes which will be launched on the front-end node will increase the load on the front-end node and cause the creation of additional overhead.
Ninf-G implements a functionality which enables the creation of multiple function handles via a single GRAM call and provides an API for utilizing this functionality. For example, grpc_function_handle_array_default_np() takes three arguments: a pointer to an array of function handles, the number of function handles, and the name of the remote executable. When grpc_function_handle_array_default_np() is invoked, Ninf-G will construct an RSL in which the count attribute is specified as the number of function handles, and pass the RSL to the Globus GRAM. This allows invocation of multiple remote executables, i.e. initialization of multiple function handles, via a single GRAM call.
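To make the role of the count attribute concrete, the following C sketch builds a minimal RSL string that requests several copies of one executable in a single job request. This is only an illustration under stated assumptions: build_rsl is a hypothetical helper, the executable path is a placeholder, and the RSL that Ninf-G actually constructs is not shown in this document and will contain further attributes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build an illustrative RSL string that asks GRAM to start 'count' copies
 * of one executable. Returns 0 on success, -1 if the buffer is too small. */
int build_rsl(char *buf, size_t size, const char *executable, int count)
{
    assert(buf != NULL && executable != NULL && count > 0);

    int n = snprintf(buf, size, "&(executable=%s)(count=%d)",
                     executable, count);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With 16 function handles, one request carrying count=16 replaces 16 separate GRAM calls, each of which would otherwise pay the authentication and jobmanager start-up cost described above.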
Querying an MDS server for getting information on remote executables is a more difficult problem from a performance point of view, since it takes several minutes if the MDS server contains a large MDS tree. Although a useful resource discovery mechanism is essential for the acceptance of grid computing, we need to provide a practical scheme for information retrieval. Several approaches could be candidates for the implementation of information retrieval. For example, in CORBA, both a client and servers generate stubs and share information statically. Although this approach is straightforward and reduces the overhead for information retrieval, client programmers need to prepare IDL files for stub generation which constitutes a burden on client programmers. Ninf-G implements a functionality which enables it to retrieve the necessary information not from an MDS server, but from Local LDIF files which are placed on the client machine in advance. When Ninf-G Executables are generated on the server machine, the LDIF files are generated by the Ninf-G IDL compiler as well. The LDIF files should be copied to the client machine and can be specified in the client configuration file which is passed to the application as the first argument.
Ninf-G provides the following functionalities for efficient data transfers and elimination of redundant data transfers.
Although the semantics of a remote executable is "stateless," it is desirable to provide a "stateful" remote executable since typical applications repeat computation for large data sets with different parameters. In the case of "stateless" executables, the client needs to send the data in every remote library call, which would be a severe problem in a Grid environment. Ninf-G provides a "stateful" remote executable as a "Ninf-G remote object." A Ninf-G remote object can hold a "state" and be used to eliminate redundant data transfers between a client and servers. Ninf-G provides API functions such as grpc_object_handle_init_np() and grpc_invoke_np() for utilizing Ninf-G remote objects.
grpc_object_handle_init_np() initializes a Ninf-G remote object and creates an object handle which represents a connection between the client and the Ninf-G remote object. grpc_invoke_np() calls methods of the Ninf-G remote object as described in the Ninf-G IDL.
A Ninf-G remote object is an instance of a class which is defined in an IDL file using the DefClass statement on the server side. Multiple methods, which can be invoked by a client using a client API such as grpc_invoke_np(), can be defined in a class using the DefMethod statement.
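The saving from a stateful remote object can be sketched locally in C. The example below does not use the Ninf-G API: remote_object_t, object_init and object_scaled_sum are hypothetical stand-ins that imitate an object holding a data set as state and count the bytes a client would have had to send.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical local stand-in for a remote object: it keeps the data set
 * as state and counts the bytes a client would have had to send. */
typedef struct {
    const double *data;     /* state held between method calls */
    size_t n;               /* number of elements in data */
    size_t bytes_received;  /* simulated client-to-server traffic */
} remote_object_t;

/* Plays the role of handle initialization plus a one-time upload. */
void object_init(remote_object_t *obj, const double *data, size_t n)
{
    assert(obj != NULL && data != NULL);
    obj->data = data;
    obj->n = n;
    obj->bytes_received = n * sizeof(double);
}

/* Plays the role of a method call: only the parameter crosses the
 * network, because the data set is already held by the object. */
double object_scaled_sum(remote_object_t *obj, double factor)
{
    assert(obj != NULL);
    obj->bytes_received += sizeof(double);

    double sum = 0.0;
    for (size_t i = 0; i < obj->n; i++)
        sum += obj->data[i];
    return factor * sum;
}
```

After k method calls the simulated traffic is one copy of the data set plus k parameters, whereas a stateless design would resend the data set k times.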
Ninf-G enables data transfers with compression. A flag which specifies whether to enable or disable data compression, and a data size as the threshold for compressing data can be specified in the client configuration file.
In order to compensate for the heterogeneity and unreliability of a Grid environment, Ninf-G provides the following functionalities:
The GridRPC API specifies that the first argument of a client program must be a "client configuration file" in which information required for running applications is described. In order to compensate for the heterogeneity and unreliability of a Grid environment, Ninf-G provides client configuration formats for detailed description of server attributes such as the Globus jobmanager, and a protocol for data transfers, etc.
If a server machine is fully utilized, requests for initialization of function handles and remote library calls may be stuck in the queue and will not be launched for a long time, and this may cause deadlock of applications. Ninf-G provides a functionality to specify a timeout value for initialization of function handles as well as remote library calls. The timeout values can be specified in the client configuration file.
A remote executable reports a heartbeat message to the client at a pre-specified interval. Ninf-G provides an API function for checking the heartbeat from the remote executable. The interval can be specified in the client configuration file.
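A heartbeat check of this kind can be sketched in C as a comparison between the time of the last heartbeat and the configured interval. The function heartbeat_alive and its missed_allowed tolerance are assumptions made for illustration; they are not the Ninf-G API function mentioned above, whose name and signature this section does not give.

```c
#include <assert.h>
#include <time.h>

/* Returns 1 if a heartbeat has been seen recently enough, 0 otherwise.
 * 'missed_allowed' is an assumed tolerance: how many consecutive
 * intervals may pass without a heartbeat before the remote executable
 * is considered unresponsive. */
int heartbeat_alive(time_t last_seen, time_t now,
                    int interval_sec, int missed_allowed)
{
    assert(interval_sec > 0 && missed_allowed >= 0);

    double silent = difftime(now, last_seen);  /* seconds without a beat */
    return silent <= (double)interval_sec * (missed_allowed + 1);
}
```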
Ninf-G provides a functionality called "client callbacks" by which a remote executable calls a function on the client machine. The client callback can be used for sharing status between the server and the client. For example, the client callback can be used for showing the interim status of computation at the client machine and in interactive processing.
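The callback idea can be illustrated with plain C function pointers. In the sketch below everything runs in one process: progress_callback_t, compute_with_progress and count_reports are hypothetical names, and the call that would cross the network in Ninf-G is an ordinary function call here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives the number of completed steps. */
typedef void (*progress_callback_t)(int done, int total, void *user_data);

/* Stand-in for a computation running on the server. It reports interim
 * status through the callback every 'report_every' steps. */
long compute_with_progress(int total, int report_every,
                           progress_callback_t cb, void *user_data)
{
    assert(total >= 0 && report_every > 0);

    long acc = 0;
    for (int i = 1; i <= total; i++) {
        acc += i;  /* placeholder for one step of real work */
        if (cb != NULL && i % report_every == 0)
            cb(i, total, user_data);
    }
    return acc;
}

/* Example client-side callback: counts how often it was invoked. */
void count_reports(int done, int total, void *user_data)
{
    (void)done;
    (void)total;
    (*(int *)user_data)++;
}
```

A client that wants to display interim status would replace count_reports with a function that prints or plots the values it receives.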
Ninf-G provides a server-side API function named grpc_is_canceled() for checking the arrival of cancel requests from the client. If the client calls the grpc_cancel() function, grpc_is_canceled() returns 1. In order to implement cancellation of a session, a remote executable is required to call grpc_is_canceled() at an appropriate interval and to return on its own if grpc_is_canceled() returns 1.
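The cooperative cancellation pattern described here can be sketched in C as a work loop that polls a cancel check at a fixed interval. The sketch is self-contained and does not call Ninf-G: the cancel_check_t function pointer stands in for the server-side check, and run_until_canceled and cancel_on_third_poll are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the server-side cancel check:
 * returns 1 when cancellation has been requested, 0 otherwise. */
typedef int (*cancel_check_t)(void *ctx);

/* Run up to n_steps units of work, polling for cancellation every
 * check_every steps. Returns the number of steps completed. */
long run_until_canceled(long n_steps, long check_every,
                        cancel_check_t is_canceled, void *ctx)
{
    assert(check_every > 0 && is_canceled != NULL);

    long done = 0;
    while (done < n_steps) {
        if (done % check_every == 0 && is_canceled(ctx))
            break;  /* stop and return on our own */
        done++;     /* placeholder for one unit of real work */
    }
    return done;
}

/* Example check: reports cancellation from the third poll onward. */
int cancel_on_third_poll(void *ctx)
{
    int *polls = (int *)ctx;
    return ++(*polls) >= 3;
}
```

The choice of check_every is a trade-off: a small value reacts to a cancel request quickly, while a large value keeps the polling overhead low.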
Ninf-G provides functionalities which are useful for debugging. Ninf-G enables redirection of stdout and stderr of remote executables to the client machine. Log messages generated by Ninf-G and the Globus Toolkit can also be stored on the client machine. Furthermore, Ninf-G enables the launch of "gdb" on the server machine when a remote executable is launched on the server. These functionalities are made available by turning on the flags in the client configuration file.
The versions are source code compatible. Client-side application programs and server-side remote function programs that are used with Ninf-G2 can be used without modification.
The environment variable names, utility command names, and configuration file attribute names are compatible with Ninf-G2.