...
We may also need a local repository (OpenHPC allows some of that, too) for experimental packages that haven't made it into an upstream release yet.
Benchmarking
The final piece of the puzzle is how to measure performance in a CI loop.
When creating the packages in the task above, we should take care to enable them to run in two modes: validation and benchmark.
The validation mode will run just a small subset that hopefully encompasses most (if not all) of the functionality, so that we get a quick pass/fail status for each feature.
The benchmark mode will run a subset of those features in larger loops and with internal timers, so that we can print the run-time (or specific counters per second) into the test output.
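As a rough illustration, a per-package driver could look something like the sketch below; the --mode flag, the run_validation()/run_benchmark() hooks and the workload itself are placeholders, not an existing interface:

    #!/usr/bin/env python3
    # Sketch of a dual-mode test driver (validation vs. benchmark).
    import argparse
    import time

    def run_workload(size):
        # Placeholder for the feature under test.
        return sum(i * i for i in range(size))

    def run_validation():
        # Small subset: exercise the feature once and report pass/fail.
        n = 1000
        assert run_workload(n) == (n - 1) * n * (2 * n - 1) // 6
        print("validation: PASS")

    def run_benchmark(iterations=100):
        # Larger loop with internal timers; print counters per second.
        start = time.perf_counter()
        for _ in range(iterations):
            run_workload(100000)
        elapsed = time.perf_counter() - start
        print(f"benchmark: {iterations / elapsed:.2f} iterations/s "
              f"({elapsed:.3f} s total)")

    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument("--mode", choices=["validation", "benchmark"],
                            default="validation")
        args = parser.parse_args()
        if args.mode == "validation":
            run_validation()
        else:
            run_benchmark()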
Not all programs can take such an intrusive change, so we should also allow for simple "execution time" measurements and cope with the noise by running them multiple times.
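For those black-box cases, a minimal sketch of the external timing loop could look like this (the command, the repeat count and the use of the median are assumptions, not a fixed policy):

    #!/usr/bin/env python3
    # Black-box timing for programs we cannot instrument: run the command
    # several times and report median and spread to absorb one-off noise.
    import statistics
    import subprocess
    import time

    def time_command(cmd, repeats=5):
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            subprocess.run(cmd, check=True, capture_output=True)
            samples.append(time.perf_counter() - start)
        return statistics.median(samples), statistics.stdev(samples)

    if __name__ == "__main__":
        # Placeholder command; in practice this would be the packaged benchmark.
        median, spread = time_command(["/bin/true"], repeats=5)
        print(f"median {median:.4f} s, stdev {spread:.4f} s")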
The second part of this task involves aggregating all the data into a database, so that we can track performance regressions.
Benchmark databases are generally large, NoSQL-based and, most of the time, hand-made. Other teams (e.g. the toolchain team) already have extensive experience with benchmarking and tracking, so we should leverage their knowledge and existing tools.
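As a sketch of the kind of record we would push per run and the naive regression check on top of it (the field names, the JSON-lines file standing in for a real store, and the 5% threshold are all placeholders; in practice we would reuse whatever the toolchain team already runs):

    #!/usr/bin/env python3
    # Append one record per benchmark run and flag regressions against the
    # previous result for the same benchmark/machine pair.
    import json
    from pathlib import Path

    STORE = Path("results.jsonl")

    def record_result(commit, benchmark, machine, score):
        entry = {"commit": commit, "benchmark": benchmark,
                 "machine": machine, "score": score}
        with STORE.open("a") as fh:
            fh.write(json.dumps(entry) + "\n")

    def check_regression(benchmark, machine, threshold=0.05):
        history = []
        for line in STORE.open():
            entry = json.loads(line)
            if entry["benchmark"] == benchmark and entry["machine"] == machine:
                history.append(entry["score"])
        if len(history) < 2:
            return None
        previous, latest = history[-2], history[-1]
        # Regression if the latest score dropped by more than the threshold.
        return latest < previous * (1.0 - threshold)

    if __name__ == "__main__":
        # Hypothetical values for illustration only.
        record_result("abc123", "stream-triad", "aarch64-node-1", 1234.5)
        print(check_regression("stream-triad", "aarch64-node-1"))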