Tom

tomw requested to merge tom into main

Move the data initialization in the gemver methods inside the timed region. This lets each process initialize its own data, which greatly reduces the Scatter calls that account for most of the time in gemver. As a result, gemver_mpi_2_new is the fastest method for input sizes > 5000, but only without -O3; otherwise the 3 FLOPS are computed faster than the data can be sent and received via MPI. OpenMP is slow in this plot because it ran on only one thread.

Figure_1_Performance
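To illustrate the idea of initializing data per process instead of scattering it, here is a minimal Python sketch. It is not the code from this merge request: `block_range`, `init_element`, and `init_local_rows` are hypothetical names, the init formula is made up, and the ranks are simulated with a plain loop so that the sketch runs without MPI. It only shows that, if every rank can compute its own row block from the global indices, the blocks together equal the matrix that a root process would otherwise have to build and scatter.

```python
def block_range(n, size, rank):
    """Return the half-open row range [start, stop) owned by `rank`.

    Rows are split as evenly as possible; the first `n % size` ranks
    get one extra row.
    """
    base, extra = divmod(n, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def init_element(i, j, n):
    # Hypothetical deterministic init formula (an assumption, not the
    # formula used in the merge request). It depends only on the global
    # indices, so any rank can evaluate it without communication.
    return ((i * j) % n) / n


def init_local_rows(n, size, rank):
    """Build only the rows of the n x n matrix that `rank` owns."""
    start, stop = block_range(n, size, rank)
    return [[init_element(i, j, n) for j in range(n)]
            for i in range(start, stop)]


def init_full(n):
    """Build the whole matrix, as a root process would before a scatter."""
    return [[init_element(i, j, n) for j in range(n)] for i in range(n)]


# Simulate 4 ranks on a 10 x 10 matrix.
n, size = 10, 4
blocks = [init_local_rows(n, size, rank) for rank in range(size)]

# Concatenating the per-rank blocks reproduces the full matrix.
reassembled = [row for block in blocks for row in block]
assert reassembled == init_full(n)
```

In an actual MPI program, `rank` and `size` would come from `MPI_Comm_rank` and `MPI_Comm_size`, and each process would call the equivalent of `init_local_rows` once inside the timed region.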

Edited by tomw