Abstract
The rapid increase in the speed and availability of networks of
supercomputers is making high-performance global computing possible,
as in our Ninf system. However, critical issues regarding system performance
characteristics in global computing have been little investigated,
especially under multi-client, multi-site WAN settings. In order to
investigate the feasibility of Ninf and similar systems, we conducted
benchmarks under various LAN and WAN environments, and observed the
following results: 1) given sufficient communication bandwidth, Ninf
performance quickly overtakes client local performance, 2) current
supercomputers are sufficient platforms for supporting Ninf and
similar systems in terms of performance and OS fault resiliency, 3)
for a vector-parallel machine (Cray J90), employing an optimized
data-parallel library is a better choice than the conventional
task-parallel execution employed for non-numerical data servers, 4)
computationally intensive tasks such as EP can readily be supported
under the current Ninf infrastructure, and 5) for
communication-intensive applications such as Linpack, server CPU
utilization dominates LAN performance, while communication bandwidth
dominates WAN performance, and furthermore, aggregate bandwidth could
be sustained for multiple clients located at different Internet sites;
as a result, distribution of multiple tasks to computing servers on
different networks would be essential for achieving higher
client-observed performance. Our results are not necessarily
restricted to the Ninf system, but should also be applicable to
other similar global computing systems.