Performance benchmark results for MS Windows

Fortran 90 compilers

April 8, 1998

Daniel R. Fuka

dan@quetzalcoatl.com

Quetzal Computational Associates, Inc.

Durango, CO 81301

(970)382-8979

Introduction

This article is the fourth in a series comparing the speed of commercially available Fortran 90 compilers. The previous three benchmark reports (references 1, 2, and 3) gave results for F90 compilers running on various Sun Unix workstation and Cray platforms. This study focuses on compilers that run on the Microsoft Windows NT Workstation platform and compares their runtimes with results from two RISC-based workstations.

 

The F90 compilers

The compilers evaluated are Fujitsu Fortran Version 1.3 beta 2, Lahey Fortran 90 (LF90) Version 3.50 patch 3.50f, and the Digital Visual Fortran Optimizing Compiler Version V5.0. The Fujitsu compiler is in beta release, while the other two compilers are commercially available from the manufacturers or from the Fortran Store.

The Fujitsu F90 compiler came complete with a C compiler, debugger, and development manager. This compiler is very similar to their F90 compiler for Solaris Sparc workstations, with the same compiler flags and Makefile format.

The Lahey F90 compiler was bundled with a DOS window debugging tool as well as a Windows GUI development kit. This version of the Lahey compiler was still inherently a DOS compiler. It required source files to be in MS-DOS text format and could not compile POSIX-formatted source files. The executables also required that any files they accessed (OPEN, READ, WRITE) follow the DOS 8.3 naming convention.

The Digital F90 compiler is bundled with the Microsoft Developer Studio which includes a GUI builder and full development/debugging environment. It is currently the compiler that is recommended by Microsoft and is fully compatible with Microsoft's GUI tools, libraries, and other compilers.

All of the compilers came with the libraries and debugging utilities that are deemed necessary for basic program development. Because of the complexity of the added features bundled with each compiler and the wide range of intended users of the products, only performance results are evaluated in this study. For a better description of the added tools and features that are bundled with the compilers, it is recommended that you visit the websites of the compiler manufacturers, which are given at the end of this report. All compilations were performed from the MS-DOS window command line, using each compiler's commands and flags with the highest available compile-time optimization selected.

 

Benchmark Suite

The benchmark suite developed for this study uses nine production scientific computer codes that rigorously test the floating point capability of the machines and the optimization abilities of the compilers. Each of the benchmark codes is briefly described below. Except where noted otherwise, the benchmark codes were written by John K. Prentice of Quetzal Computational Associates, Inc.

fatigue

This code uses a continuum mechanics cumulative damage model to predict metal fatigue in thin wires undergoing cyclic bending. The code solves the relevant differential equations in time using a simple time-stepping algorithm.

 

inductance

This code discretizes a thin metal box using rectangular grid elements. It then calculates the self inductance of each element, the mutual inductance between all the elements, the mutual inductance between various external and internal cylindrical coils, and the mutual inductance between these coils and each grid element on the metal box. The integrations required for the inductance calculations are done either analytically or numerically using Gaussian quadrature techniques.

 

monte_carlo

This code numerically integrates

using a simple Monte Carlo integration scheme

where the vectors x_n are 10-dimensional random numbers. The code repeats the integration for different values of n. The multidimensional random numbers are generated using a function GLEICH, developed by Tony Warnock at Los Alamos National Laboratory.
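As a rough illustration of this kind of integration scheme, the sketch below averages an integrand over random points in the 10-dimensional unit hypercube. Neither the benchmark's integrand nor the GLEICH generator is reproduced in this report, so the integrand here is a hypothetical stand-in (a sum of squares, whose exact integral is 10/3) and the random numbers come from Python's standard `random` module. The function names are illustrative only.

```python
import random

DIM = 10  # dimension of each random vector, as in the benchmark


def integrand(x):
    # Hypothetical integrand (NOT the benchmark's): sum of squares.
    # Its exact integral over the unit hypercube is DIM / 3.
    return sum(xi * xi for xi in x)


def monte_carlo_integrate(f, n, rng):
    """Estimate the integral of f over [0, 1]^DIM as the mean of n samples."""
    total = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(DIM)]
        total += f(x)
    return total / n


if __name__ == "__main__":
    rng = random.Random(12345)
    # Repeat the integration for several sample counts, as the benchmark does.
    for n in (1000, 10000, 100000):
        estimate = monte_carlo_integrate(integrand, n, rng)
        print(n, estimate, abs(estimate - DIM / 3.0))
```

The error of such an estimate is expected to shrink roughly as one over the square root of the number of samples, which is why the integration is repeated for increasing sample counts.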

 

rnflow

This code was contributed by Michel Olagnon of IFREMER. It is a code for verifying the validity of some theoretical methods for analysis of fatigue loading of materials. It reads some standard fatigue data, extracts "turning points", computes the transition probability matrix, calculates the theoretical rainflow matrix, simulates 16 random sequences, performs rainflow count (using 2 different methods) and compares the simulated average rainflow transition matrix with the theoretical one. It uses some LAPACK routines, which are provided with (minimal) conversion to F90.
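The first step in that pipeline, extracting turning points from a load history, might be sketched as follows. This is an illustrative Python version written for this report, not the benchmark's Fortran routine; the function name and the handling of plateaus are assumptions.

```python
def turning_points(series):
    """Return the local extrema (turning points) of a load history.

    The first and last samples are kept; interior samples are kept only
    where the direction of the signal reverses. Repeated equal values
    are skipped so that plateaus do not produce spurious points.
    """
    points = []
    for value in series:
        if points and value == points[-1]:
            continue  # ignore plateaus
        if len(points) >= 2 and (points[-1] - points[-2]) * (value - points[-1]) > 0:
            # Still moving in the same direction: extend the current ramp.
            points[-1] = value
        else:
            points.append(value)
    return points
```

For example, `turning_points([0, 1, 2, 1, 0, 3, 3, 5, 2])` returns `[0, 2, 0, 5, 2]`. Rainflow counting then operates on this reduced sequence rather than on the raw samples.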

 

channel

This code was contributed by John McCalpin, formerly of the College of Marine Studies at the University of Delaware. It implements a simple A-grid shallow water model for a meridional channel.

 

gas_dynamics

This benchmark code is a one-and-a-half dimensional explicit finite difference code for modeling the fluid dynamics of a gas jetting out of a combustion chamber into an evacuated cavity with an expanding radius. At the far end of the cavity are screens which impede the flow. A real gas equation of state is used and the drag caused by the screens is explicitly calculated. The expanding gas front is moved by free molecular flow, whereas the flow behind the front is calculated by numerically solving the nonlinear partial differential equations for conservation of mass, momentum, and energy using a finite difference scheme.

 

kepler

This code numerically solves the equations of motion for a planet orbiting the sun, using a 12th order explicit symmetric multistep method.
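The coefficients of the 12th order method are not given in this report. To show the general shape of an explicit symmetric multistep integration, the sketch below swaps in the much simpler second-order Stormer two-step scheme, the lowest-order method of this type for equations of the form x'' = a(x). The scaled units (GM = 1), the starting procedure, and all names are assumptions made for illustration.

```python
import math

GM = 1.0  # gravitational parameter in scaled units (assumption)


def acceleration(x, y):
    """Inverse-square acceleration toward the sun at the origin."""
    r3 = (x * x + y * y) ** 1.5
    return -GM * x / r3, -GM * y / r3


def integrate_orbit(x0, y0, vx0, vy0, h, steps):
    """Advance the orbit `steps` steps of size h with the Stormer recurrence
    x[n+1] = 2 x[n] - x[n-1] + h^2 a(x[n])."""
    ax, ay = acceleration(x0, y0)
    # A two-step method needs two starting positions; the second one
    # comes from a second-order Taylor expansion about the initial state.
    xp, yp = x0, y0
    x = x0 + h * vx0 + 0.5 * h * h * ax
    y = y0 + h * vy0 + 0.5 * h * h * ay
    for _ in range(steps - 1):
        ax, ay = acceleration(x, y)
        xn = 2.0 * x - xp + h * h * ax
        yn = 2.0 * y - yp + h * h * ay
        xp, yp, x, y = x, y, xn, yn
    return x, y


if __name__ == "__main__":
    steps = 10000
    h = 2.0 * math.pi / steps  # one period of a circular orbit of radius 1
    x, y = integrate_orbit(1.0, 0.0, 0.0, 1.0, h, steps)
    print("position after one period:", x, y)
```

With these initial conditions the exact orbit is a unit circle, so the printed position should lie close to the starting point (1, 0). A higher-order symmetric method uses more back values in the recurrence to reduce the error per step.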

 

protein

This code exhaustively calculates the density of states and the canonical partition function for an 8 amino acid protein-like model polypeptide adsorbed onto a heterogeneous polymeric-like surface. The calculation is done on a two-dimensional lattice.

 

scattering

This code builds a dense, complex-valued linear system of order 1,818 which arises in quantum mechanical calculations of the scattering probability for low energy atoms scattering from crystal surfaces. The main calculational effort in this code involves computing the Fourier transform of atom/surface molecular potentials and some integrals arising from the numerical method for solving the Lippmann-Schwinger equation.

Benchmark Protocol

The benchmarks were run in an MS-DOS shell, with runtimes monitored by an internal timing wrapper placed around each benchmark. The wrapper read the system clock at the beginning and end of each benchmark and reported the elapsed time when the benchmark was complete. The best time from three separate runs of each benchmark is reported below in the Benchmark Results section.
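A timing protocol of this kind can be sketched in a few lines. The version below is a Python illustration, not the wrapper used in the study: `best_of` and `sample_workload` are hypothetical names, and `time.perf_counter` stands in for whatever clock call the original wrapper made.

```python
import time


def best_of(benchmark, runs=3):
    """Run benchmark() `runs` times and return the smallest elapsed time in seconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()  # read the clock before the run
        benchmark()
        times.append(time.perf_counter() - start)  # and again after it
    return min(times)


def sample_workload():
    # Hypothetical stand-in for one of the benchmark codes.
    total = 0.0
    for i in range(1, 200001):
        total += 1.0 / (i * i)
    return total


if __name__ == "__main__":
    print("best of 3: %.4f s" % best_of(sample_workload))
```

Taking the minimum rather than the mean discards runs that were slowed by unrelated system activity. `time.perf_counter` measures elapsed wall-clock time; if CPU time is wanted instead, `time.process_time` could be substituted.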

 

Benchmark Results

The Windows NT F90 benchmark study was performed on a dual-processor 200 MHz Pentium Pro workstation with 128 Mbytes of memory. The workstation was running Microsoft Windows NT 4.0 Workstation with Service Pack 3 installed. The command line compilation flags used for each compiler were either the highest level of optimization or the flags recommended for the fastest runtime results. The compilation commands and flags used for each compiler are as follows:

Fujitsu F90: frt -kfast <code.f90>

Lahey Fortran 90: lf90 -nwrap -nwin -O3 <code.f90>

Digital Visual Fortran: f90 /Optimize:3 <code.f90>

Table 1 shows the best CPU time for each benchmark code on the Windows NT workstation.

Benchmark        Fujitsu   Digital     Lahey
channel              467       482       529
fatigue              429       790      1346
gas_dynamics         252       304       279
inductance            13        27       189
kepler               112        71       160
monte_carlo           25       N/R        17
protein               86        68       973
rnflow               115       103        99
scattering           376       222       140

Table 1. Best CPU time in seconds required for the benchmark codes to run on the NT workstation. Each code was run three times and the best of the three times is reported. No processes other than those required by the system were running at the time of execution. N/R in a column indicates that the code compiled but aborted during execution.

Comparison RISC Machine Results

It is interesting to see how the floating point performance of a 200 MHz Pentium Pro system compares with that of two RISC-based workstations. For comparison with the PC workstation, the benchmark suite was also run on a 200 MHz Fujitsu Halstation 375 and a 300 MHz Rave Sparc Ultra 2. The Fujitsu Halstation was running Solaris 2.5 and the Rave Sparc Ultra 2 was running Solaris 2.6. The benchmark suite was compiled with the Fujitsu F90 Solaris 64-bit Sparc V2 compiler for both RISC systems. The -Kfast optimization was the only compiler option used when compiling the codes for both RISC systems.

Table 2 shows the best CPU time for each benchmark code run on the Sparc based workstations.

Benchmark        Hal SparcStation 375   Rave Ultra 2
channel                           119            139
fatigue                           508            401
gas_dynamics                      168             78
inductance                         16              6
kepler                             92             61
monte_carlo                         8             11
protein                           157            108
rnflow                            103             60
scattering                        450            216

Table 2. Best CPU time in seconds required for the benchmark codes to run on the Sparc Solaris based workstations using the Fujitsu F90 Sparc compiler. The benchmarks were compiled to a 64-bit executable on the Hal Sparcstation and to a 32-bit executable on the Rave Ultra 2. Each code was run three times and the best of the three times is reported. No processes other than those required by the system were running at the time of execution.
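One way to put Table 1 and Table 2 side by side is to divide each Pentium Pro time by the corresponding RISC time. The sketch below does this for the Fujitsu column of Table 1, chosen here because the same vendor's compiler was used on the RISC systems. That choice, and the variable and function names, are assumptions of this illustration; the times are copied from the two tables.

```python
# Times in seconds copied from Table 1 (Fujitsu column) and Table 2.
nt_fujitsu = {"channel": 467, "fatigue": 429, "gas_dynamics": 252,
              "inductance": 13, "kepler": 112, "monte_carlo": 25,
              "protein": 86, "rnflow": 115, "scattering": 376}
hal_375 = {"channel": 119, "fatigue": 508, "gas_dynamics": 168,
           "inductance": 16, "kepler": 92, "monte_carlo": 8,
           "protein": 157, "rnflow": 103, "scattering": 450}
ultra_2 = {"channel": 139, "fatigue": 401, "gas_dynamics": 78,
           "inductance": 6, "kepler": 61, "monte_carlo": 11,
           "protein": 108, "rnflow": 60, "scattering": 216}


def ratios(reference, other):
    """Return reference/other per benchmark; values above 1 mean `other` is faster."""
    return {name: reference[name] / other[name] for name in reference}


if __name__ == "__main__":
    for label, table in (("Hal 375", hal_375), ("Ultra 2", ultra_2)):
        r = ratios(nt_fujitsu, table)
        faster = [name for name, value in r.items() if value > 1.0]
        print(label, "faster than the Pentium Pro on", len(faster), "of", len(r))
        for name in sorted(r):
            print("  %-13s %.2f" % (name, r[name]))
```

On these numbers the Hal system comes out ahead of the Fujitsu-compiled Pentium Pro runs on 5 of the 9 codes and the Ultra 2 on 8 of the 9, with protein the one code on which the Pentium Pro beats both.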

Conclusions

We are pleased that the compilers were able to compile all of the benchmarks with no problems. Except for the Digital compiled monte_carlo benchmark, all of the executables ran without problems and gave fairly good performance.

Comparing the benchmark results for the three NT workstation compilers shows the Fujitsu compiler generating the fastest code for four of the benchmarks (channel, fatigue, gas_dynamics, and inductance), and the Lahey compiler generating the fastest code for three of the benchmarks (monte_carlo, rnflow, and scattering). The Digital compiler generated the fastest code for the remaining two (kepler and protein), and it was faster than the Lahey compiler on 5 of the 8 benchmarks that both compilers completed. Typically, the Fujitsu compiler was the fastest, with the Digital compiler being only slightly slower.
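The win counts quoted above can be rechecked directly from Table 1. The short Python sketch below does so; the data are copied from the table, while the function names and the use of `None` for the N/R entry are choices made for this illustration.

```python
# Times in seconds copied from Table 1; None marks the N/R entry.
table1 = {
    "channel":      {"Fujitsu": 467, "Digital": 482,  "Lahey": 529},
    "fatigue":      {"Fujitsu": 429, "Digital": 790,  "Lahey": 1346},
    "gas_dynamics": {"Fujitsu": 252, "Digital": 304,  "Lahey": 279},
    "inductance":   {"Fujitsu": 13,  "Digital": 27,   "Lahey": 189},
    "kepler":       {"Fujitsu": 112, "Digital": 71,   "Lahey": 160},
    "monte_carlo":  {"Fujitsu": 25,  "Digital": None, "Lahey": 17},
    "protein":      {"Fujitsu": 86,  "Digital": 68,   "Lahey": 973},
    "rnflow":       {"Fujitsu": 115, "Digital": 103,  "Lahey": 99},
    "scattering":   {"Fujitsu": 376, "Digital": 222,  "Lahey": 140},
}


def fastest(row):
    """Return the compiler with the smallest time, ignoring failed runs."""
    valid = {name: t for name, t in row.items() if t is not None}
    return min(valid, key=valid.get)


def wins(table):
    """Count how many benchmarks each compiler won."""
    counts = {}
    for row in table.values():
        winner = fastest(row)
        counts[winner] = counts.get(winner, 0) + 1
    return counts


def head_to_head(table, a, b):
    """Return (benchmarks where a beat b, benchmarks both completed)."""
    both = [r for r in table.values() if r[a] is not None and r[b] is not None]
    return sum(1 for r in both if r[a] < r[b]), len(both)


if __name__ == "__main__":
    print(wins(table1))
    print(head_to_head(table1, "Digital", "Lahey"))
```

Run as a script, this reports four wins for Fujitsu, three for Lahey, and two for Digital, and shows Digital ahead of Lahey on 5 of the 8 codes that both completed.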

Compiler Manufacturer Links

Digital Equipment Corp.

Fujitsu Software Corp.

Lahey Computer Systems

References

1 J. K. Prentice, "Performance Benchmarks for Fortran 90 Compilers", Mathematech, vol. 1, no. 1, Spring 1994, pp. 66-73

2 J. K. Prentice, "A Performance Benchmark Study of Fortran 90 Compilers", Fortran Journal, vol. 5, no. 3, May/June 1993, pp. 2-7

3 J. K. Prentice and Agbeli K. Ameko, "Performance Benchmark Results for Selected Fortran 90 Compilers", Fortran Journal, vol. 6, no. 6, Nov/Dec 1994