
Import Benchmark

Mark Hoemmen edited this page May 14, 2018 · 5 revisions

What this benchmark measures:

  • Map creation time, for noncontiguous, possibly overlapping Maps
  • Import creation time, for the usual two-argument Import constructor, with noncontiguous source and target Maps
  • Vector creation time, for the usual constructor that takes a Map
  • doImport (apply Import) time, between two Vectors, using the above Maps

If you enable Epetra as well as Tpetra, you may use this benchmark to compare Epetra and Tpetra performance.

CMake flags needed to build the benchmark:

  • -D Trilinos_ENABLE_Tpetra:BOOL=ON
  • -D Tpetra_ENABLE_EXAMPLES:BOOL=ON
  • Optionally, -D Trilinos_ENABLE_Epetra:BOOL=ON (if you want to compare Epetra and Tpetra performance)

Path to benchmark source:

  • Trilinos/packages/tpetra/core/example/advanced/Benchmarks/import.cpp

Path to benchmark executable:

  • $BUILD/packages/tpetra/core/example/advanced/Benchmarks/TpetraCore_import.exe

Command-line arguments:

  • numEltsPerProc (default 100,000): Number of global indices owned by each MPI process
  • numTrials (default 100): Number of times to repeat each operation in a timing loop, to smooth out performance variation and deal with any timer granularity issues
  • runEpetra (default true if Epetra package is enabled, else false): Whether to run the benchmark with Epetra objects
  • runTpetra (default true): Whether to run the benchmark with Tpetra objects

Command line for small test:

  • mpirun -np 4 TpetraCore_import.exe [optional command-line arguments]

Suggested scaling study:

  • Weak scaling: Fix numEltsPerProc (see above) and vary the number of MPI processes
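  • Strong scaling: For a study from 1 to 2^k MPI processes, pick numEltsPerProc = 2^{m+k}/P, where m is a nonnegative integer and P is the number of MPI processes (assumed to be a power of 2). This keeps the global number of elements fixed at 2^{m+k}, so the largest run (P = 2^k) has 2^m elements per process.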

Preliminary results:

  • Platform used:
  • Summary or screenshot: