Attendance
Committee Members
Name | Present |
---|---|
Kanta Vekaria (HPC Lead, OCTO Linaro) | |
Martin Stadtler (Director of LEG, Linaro) | |
Andrew Wafaa (ARM) | |
Pak Lui (Huawei) | |
Larry Wikelius (Cavium) | |
Steve Geist (Qualcomm) | |
Hongbo Zhang (HXT) | |
Jon Masters (Red Hat) | |
Takeharu Kato (Fujitsu) | |
Koichi Hirai (Fujitsu) | |
Kevin Pedretti (Sandia) | |
Guests
Name | Present |
---|---|
David Rusling (CTO, Linaro) | |
Andrea Gallo (VP of Segment Groups, Linaro) | |
Anoop Saxena (Project Manager, Segments) | |
Renato Golin (Linaro, Tech Lead) | |
Ashwin Shekar, Joel Jones (Cavium) | |
Eric VanHensbergen, David Lecomber (Arm) | |
Masakazu Queno (Fujitsu) | |
Agenda
- Upcoming SIG meetings
- Operational Update (Renato)
Attached file: HPC SC - 16 Oct 2018.pdf
Minutes
Upcoming SIG meetings
- SIG meeting 6th Nov - Astra bring-up (Kevin Pedretti, Sandia)
- SC18, 12-15th Nov
- SIG meeting 20th Nov - Operational update on engineering (Renato) (NB: Thanksgiving)
- SIG meeting 4th Dec - Karl, OpenHPC
- SIG meeting 18th Dec - Cancelled
Links from the Chat
Renato Golin:
- http://llvm.org/devmtg/2018-04/talks.html#Talk_11
- http://llvm.org/devmtg/2018-04/slides/Yatsina-LLVM Greedy Register Allocator.pdf
- https://gitlab.com/arm-hpc/packages/wikis/home
- https://github.com/jratcliff63367/sse2neon
David Lecomber:
- https://github.com/arm-hpc/porting-advisor
Questions
Is Linaro using Inbox or Mellanox OFED drivers?
- We're using both and we'll add them both to the OpenHPC Ansible recipes
- Mellanox's own driver supports Socket Direct, which most of us need
- But we have to promote the open source drivers, so we're implementing the Sub-net Manager in the switch
What are the problems that we're seeing in the register allocation in LLVM?
- Mostly pathological cases where the inner loop of an HPC kernel is too large or competes with other parts of the outer loop
- This is due to the cost of region splitting and recombining, as identified by Intel (slides, video)
OpenHPC default packages offer really poor base performance on AArch64
- Most striking case is OpenBLAS, which has no NEON support at all and can be 6x slower than if compiled for ThunderX2
- Joel confirms that Cavium has done this work for AArch64 in general, so their build would benefit most other Arm vendors
- We need to identify what the problem is (OpenHPC flags, package Makefiles, source IFDEFs) and fix it in the right place
How do we collaborate on the ones that cannot be fixed generically?
- Different vendors have different levels of the architecture (v8.0, v8.1, v8.2 etc) and may need different features for different packages
- Not to mention SVE vs NEON (which is similar to AVX2 vs AVX-512 on Intel)
- How do we manage the combinatorial problem?
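To give a feel for how quickly the build matrix grows, here is a minimal sketch that enumerates package variants across architecture levels and SIMD options. It is an illustration only: the package names other than OpenBLAS are hypothetical, the `-march` strings are GCC-style values chosen for the example, and the rule that SVE builds are only generated for v8.2 is an assumption made to prune the matrix, not something decided in the meeting.

```python
import itertools

# Architecture levels named in the minutes (v8.0, v8.1, v8.2), as GCC-style names.
ARCH_LEVELS = ["armv8-a", "armv8.1-a", "armv8.2-a"]
SIMD_OPTIONS = ["neon", "sve"]
# Hypothetical package list; only OpenBLAS is mentioned in the minutes.
PACKAGES = ["openblas", "pkg-b", "pkg-c"]


def march_flag(arch, simd):
    """Return a -march value for the combination, or None if it is skipped.

    Assumption for this sketch: SVE variants are only built for armv8.2-a.
    """
    if simd == "sve":
        if arch != "armv8.2-a":
            return None
        return f"-march={arch}+sve"
    return f"-march={arch}"


def build_matrix(packages, archs, simds):
    """List every (package, flag) pair that would need its own build."""
    variants = []
    for pkg, arch, simd in itertools.product(packages, archs, simds):
        flag = march_flag(arch, simd)
        if flag is not None:
            variants.append((pkg, flag))
    return variants


if __name__ == "__main__":
    matrix = build_matrix(PACKAGES, ARCH_LEVELS, SIMD_OPTIONS)
    for pkg, flag in matrix:
        print(f"{pkg}: {flag}")
    print(len(matrix), "variants")
```

Even with three packages and this pruning, every extra architecture level or vendor-specific tuning adds a build per package, which is why restricting the set of packages and architectures matters.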
A few options:
- Propose a set of rules for a very limited number of packages for a very limited number of architectures to OpenHPC
- Use Spack for the specialist builds as long as we have good enough base packages for AArch64
- Use Linaro's ERP-style overlay for specialised packages (will it play well with OpenHPC? What about version upgrades?)
- Update pages like Arm's HPC packages list with options and comments on each package
AVX/SSE-to-NEON/SVE tools can be used to identify code that uses Intel's intrinsics and, in some cases, to re-implement it automatically.
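As a rough illustration of the "identify" step, the sketch below scans C/C++ source text for Intel intrinsic headers and `_mm_`, `_mm256_` and `_mm512_` calls. It is not how porting-advisor or sse2neon work internally; the function name `scan_source` and the report layout are made up for this example, and a regex scan like this will miss intrinsics hidden behind macros.

```python
import re
from collections import Counter

# Matches Intel intrinsic calls such as _mm_add_ps, _mm256_add_pd, _mm512_add_ps.
INTRINSIC_RE = re.compile(r"\b_mm(?:256|512)?_\w+")
# Matches includes of the *mmintrin.h family (xmmintrin.h, emmintrin.h, immintrin.h, ...).
HEADER_RE = re.compile(r"#\s*include\s*<(\w*mmintrin\.h)>")


def scan_source(text):
    """Report which Intel intrinsic headers and calls appear in the source text."""
    return {
        "headers": HEADER_RE.findall(text),
        "intrinsics": Counter(INTRINSIC_RE.findall(text)),
    }


if __name__ == "__main__":
    sample = (
        "#include <emmintrin.h>\n"
        "__m128 c = _mm_add_ps(a, b);\n"
        "__m256d d = _mm256_add_pd(x, y);\n"
        "c = _mm_add_ps(c, c);\n"
    )
    report = scan_source(sample)
    print("headers:", report["headers"])
    for name, count in report["intrinsics"].most_common():
        print(f"{name}: {count}")
```

A list like this only flags candidates for porting; whether each call has a direct NEON or SVE counterpart still has to be checked per intrinsic.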