SFO17 LEG SC Meeting Agenda/Minutes
Agenda
New SC representatives
Meetings Schedule
Committee Evening
Major Topics
OpenBMC
Xen
RPK
DPDK
Prioritization Strategy
Slides
Documents
Minutes
LEG-SC Linaro Connect SFO17 MoM
Session 1: LEG-SC Sep 25, 2017 @SFO17
Attendees
Martin, David, Kanta, Andrea, Mark, George, Christoffer, Ganesh, Gema, Anoop, Julien, Stuart
Gary Yurcak, Elsie Whalig, Qualcomm
Hirai-san, , Fujitsu
MarkH, Jeff Underhill, Thomas, Matt Spencer, ARM
Wade Tresgaskis, Google
Jon Masters, Red Hat
Larry Wikelius, Cavium
Boazie, Kylin
?
HongBo, HXT
LEG Breakfast will not be attended by Linaro
Key topics for the next sessions
Xen - currently hosted in LEG because MarkH (ARM) felt this is the place with the most interested stakeholders. Marc Zyngier (ARM) and Christoffer Dall (Linaro) are the maintainers for KVM on ARMv8 and both are involved in the background. Julien has a dotted-line report to Christoffer.
DPDK - it is very important for LEG and the DTS is in bad shape, not just for ARM but for x86 as well. LEG has ideas to help improve this.
CCIX
OpenBMC - several members are raising this topic, but LEG is not aligned on it yet. The Google and IBM teams seem more open to collaboration than the Facebook OpenBMC team. It would be easier to work with Google and IBM but we cannot just ignore OpenBMC. Martin will present a proposal closely aligned with what Dong Wei is working on at ARM, now part of the SBSA/SBBR specs.
RPK: released three times a year. Would like to convene a group at least once a year; it is a technical call focusing on bugs, … It also includes ERP.
Review Engineering Prioritization. Will share the clean document and will use a zero-based budget sheet
Health Check
Shared with the members and looking for feedback
Jeff: can you provide more information on the OpenStack CI?
Gema: the Kolla Project in OpenStack volunteered to run the CI on ARM servers as well, and we are reserving a few nodes to run these tests. First we need to stabilize the dev cloud, then allocate the required number of nodes to Kolla.
Jon: have been looking at Gema’s Ansible scripts. LEG has a nicely deployable infrastructure. Shall we also explore accessing additional capacity via Packet?
Martin: yes, this is doable. Part of the Kolla project is about containerization as well as Kolla Ansible. Once our cloud is stable, the lead from the Kolla project has already volunteered to visit Linaro in Cambridge and work for one week to upgrade Linaro’s developer cloud to Kolla and containers.
MS: 50% of what we are doing is CI and QA to ensure the code works
Gema: the next step is scenario testing, which needs more resources
MarkH: raised the resourcing of this project (Kolla project) and whether it is rightly staffed. Feels more resources should contribute to the Kolla project
MS: Huawei contributed a significant amount of HW resources, and switches were also received from members
Jon: OpenStack on ARM does not just work; a few fixes are needed and these are not yet captured in a single place. For example, when you download the upstream OpenStack, by default it looks for a different resource related to VNC; it is just one entry in the configuration.
Martin: yes it is well known and documented in Jira, there is a patch in flight
BigData: 4 components in Hadoop with 1.5 resources
MarkH: encourages adding more engineers to OpenJDK
Larry: there have been changes going on in Java and the industry over the last month and we need to understand the role that Linaro wants to play here
Martin: Stuart (OpenJDK tech leader) will have an official talk on Friday, we will also have Azul and Oracle on Thursday
Associated Program
MS: discussed the Linaro baseline vs the LEG offering. Looking for a vote on the LEG offering
Session 2: LEG-SC Sep 26, 2017 @SFO17
Attendees
Cavium: Larry, Zi
Huawei: Kangkang, Kenneth, +1 (lady)
Fujitsu: Hirai-san, +1
Google: Wade
Red Hat: Jon
Qualcomm: Elsie, Gary, Sean
HXT: HongBo
ARM: Jeff, Mark, Matt, Thomas
Linaro: Martin, Christoffer, Anoop, Andrea, Gema, Julien, Leif, Stuart, Renato
Xen update
Proposed backlog:
CI
PCI passthrough
Jon: Do we care about GICv2m? Can we just work on GICv3? MarkH: please consider the wider scope, not just LEG. Christoffer: GICv2m is a tiny piece of work
ACPI
NUMA
Migration Support
Start with the simpler case of dead migration - e.g. a VM that is in suspend state
Kangkang: is the goal to achieve migration with PCI passthrough? Christoffer/Jon: no, technically migration is not supported when PCI passthrough is enabled. Elsie: this is why we are talking about dead migration instead of live migration
Kangkang: it is a requirement to have migration (at least dead migration) with PCI passthrough
Kangkang: Xen is a good match for Arm EL2 and EL3, can we have Xen run in EL2? Christoffer: it is already running in EL2
Elsie: any idea of the sizing of the team? Martin: we are creating these cards in Jira and we already have an estimate of the required resources for each
Larry: what is the timeline for the backlog?
Sean: can we align with the ERP CI testing?
Kangkang: what is the original motivation to do Xen? We are all doing KVM. Martin/Christoffer: we had Xen efforts in Linaro when Citrix joined, it went down when Citrix stopped their commercial support for Xen on Arm and quit Linaro. This is now picking up again.
MarkH: we now have 4 FTE engineers and the backlog is very big for 4 engineers only, we need many more and this is the reason for hosting Xen in LEG. We need at least eight engineers for a healthy project.
Zi: the ERP is referenced as part of the CI. Does this mean that Xen will be part of the ERP? Martin: yes, will discuss soon
Larry: relative to resources, let’s make sure that this ties up with the Xen project in the Linux Foundation with Lars.
MarkO: from the TSC perspective, there was a bit of surprise in the TSC Operational call a few weeks ago. It is perfectly fine that the Xen work is in LEG, but we need to make sure that any requirements from outside LEG can be taken into account. We also need to make sure that there is no overlap with the virtualization team.
Jon: is xenRAS also considered? Martin: yes but too much to fit it into the first six-month backlog
DPDK
Gema: shared DPDK results with ERP comparing the pass rate on x86 vs ARM (ThunderX). We are running the DTS (DPDK Test Suite); it is not very stable and has a 69% fail rate even on x86.
The problem with the DTS is that it is very poorly written: lots of hard-coded values, and no configuration to select which tests apply to a given platform. DTS is the second test suite to use; the first is the DPDK unit tests, which are part of DPDK itself.
See the Gaps/findings list in the slides
The feedback from the DPDK community is that they do not use any CI
Jon: the DPDK team in Red Hat sees similar failures. Jon will share the recommended working configuration with Gema
Martin: the baseline is the test pass/fail rate on x86
MattS: the DPDK community is looking at setting up a CI lab; recommends donating hw to the DPDK lab instead of duplicating the CI in Linaro. Gema: planning to hook into the mailing list, grabbing the patches from it and submitting the results back
Larry: recommends not to fix/clean all issues in the x86 test suite itself on their behalf, need to figure out how to focus on improving the Arm architecture-specific code and performance
MattS: in the LF there is a wider collaborative project about xCI.
Martin: LEG needs to establish a baseline and possibly help fix the CI for the DTS run rate (both x86 and Arm if this happens) but then LEG will focus on fixing Arm-specific test failures only.
Larry: would like to have an update by end of October on the status of the collaboration with the DPDK / LF CI and progress on fixing the CI
Zi: wants to see alignment between the ERP release plan and the DPDK release plan. Gema: we pick up the patches from the DPDK project at code freeze, when testing and releasing the ERP.
Status for DPDK Performance. Gema: currently we are not doing DPDK performance testing
Kangkang: is this the right way to test DPDK or should we identify the functional use cases applicable for Arm and write these specific tests? Gema: once the CI is in place, we will be adding live use cases, e.g. using OVS in compute projects with Kolla OpenStack. Jon: we may also use direct passthrough into VMs and containers in VMs running VNF telco use cases. This needs vSMMU; Eric Auger at Red Hat and engineers in Arm are working on this. Agreed to come back with a use case analysis.
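Illustration (not presented in the meeting): the baseline approach discussed above, taking the x86 pass/fail rate as the reference and then focusing on Arm-specific failures, could be sketched roughly as follows. The test names and result values are hypothetical placeholders, not actual DTS output, and the sketch assumes results are available as a simple name-to-status mapping per platform.

```python
def fail_rate(results):
    """Return the fraction of tests that did not pass.

    `results` maps a test name to "pass" or "fail" (assumed format).
    """
    if not results:
        return 0.0
    failed = sum(1 for status in results.values() if status != "pass")
    return failed / len(results)


def arm_specific_failures(x86_results, arm_results):
    """Return tests that fail on Arm but pass on the x86 baseline."""
    return sorted(
        name
        for name, status in arm_results.items()
        if status != "pass" and x86_results.get(name) == "pass"
    )


# Hypothetical results for illustration only.
x86_results = {"test_a": "pass", "test_b": "fail", "test_c": "pass", "test_d": "fail"}
arm_results = {"test_a": "pass", "test_b": "fail", "test_c": "fail", "test_d": "fail"}

print("x86 fail rate:", fail_rate(x86_results))
print("Arm fail rate:", fail_rate(arm_results))
print("Arm-specific failures:", arm_specific_failures(x86_results, arm_results))
```

Tests failing on both platforms (test_b and test_d here) would count as test-suite issues, while only test_c would be treated as Arm-specific.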
CCIX
Refer to the CCIX consortium diagram
Several LEG members are involved in CCIX and there is a need for software enablement - not asking to staff resources in Linaro though
CCIX can be seen as plug-in memory for your machine with a coherency protocol
Red Hat, Cavium, Hisilicon and others are involved in CCIX
It includes devices with FPGA
It can be seen as point-to-point acceleration, up to 64 devices
Layering on PCIe for 1st gen devices
To enable the broader ecosystem, we need:
a QEMU reference model for the whole system incl. a PCI device providing an accelerator
a reference firmware (TianoCore)
a reference accelerator
Two resources for six months would be enough to kickstart this
Martin: will put it to a vote
Larry: we are not ready for a vote; once we are, we would need to be very clear on what we would be working on. The list in LEG is already very long, not in favour of adding more
Jon: there are members who need a home for the resources working on this, as they would not be able to work on this in their respective projects. The ask is NOT for Linaro to assign such resources, just to host it.
Session 3: LEG-SC Sep 26, 2017 @SFO17
Attendees
Same as Session 2
OpenBMC
Martin: shared the proposal
Will use the template and then go for the vote
OpenJDK
Stuart presented Linaro's OpenJDK contributions
Jon: will write up what was discussed and share it with the members