2021-01-21 Project Stratos Sync Meeting notes

Attendees

  • Diana (NXP)
  • Srivatsa (Qualcomm)
  • Arnd
  • Mathieu
  • Alex
  • Peter
  • Ilias
  • Many attendees had a conflict with the Linaro Members meeting

Goals

  • Get a view of the engineering support available to work on Stratos.

Discussion items

  • Alex: Overview of the Linaro plans for the Stratos project in the next cycle

Multimedia
==========

With virtio-video approaching standardisation:

  Subject: [RFC PATCH v5] virtio-video: Add virtio video device specification
  Date: Wed, 20 Jan 2021 17:31:43 +0900
  Message-Id: <20210120083143.766189-1-acourbot@chromium.org>

we think enabling this would be a good introduction to the challenges of
high bandwidth multimedia. We considered more advanced devices such as
cameras but thought that given the Linux kernel API is still evolving it
was too soon to try and stabilise a VirtIO specification - especially if
we want to avoid just making it ape the Linux API. virtio-gpu (including
virtio-wayland) already has so many implementations across a number of
VMMs and hypervisors that it doesn't make sense to add yet another one
to the mix. However virtio-video does share some similar problems
including needing to solve the management of memory across virtual
domains where the final location and alignment of memory are important.

Peter Griffin is leading this work and will be creating some cards shortly.
Broadly this will cover:

  - Helping get the Linux FE (from Google's ChromeOS) upstreamed
  - Implementing a standalone backend (vhost-user, via QEMU)
  - Architecture document for more complex deployments

The initial demo will involve terminating the backend on a KVM Host or
Xen Dom0. The architecture work will consider how the more complex
deployments would work (splitting domains, mapping to secure world etc)
and form the basis for future work.

Memory Isolation
================

We did a bunch of investigative work last cycle but generated rather
more questions than concrete answers. There are a number of avenues to
explore but currently there isn't a clear way forward for a general
purpose solution for the problem. There is ongoing work in the community
on solving the specific zero-copy problem for virtio-gpu and we hope to
learn more lessons with our virtio-video work. In the meantime a
potential copy-based solution was proposed that works for low performance
interfaces. Currently described as "Fat VirtQueues" (name subject to
change) this embeds all data inside the virtqueues themselves. The major
limitation is that any data frames passed this way must be fully self
contained and not reference memory outside the queue.

This makes the isolation problem more tractable as the queue itself will
be the only thing that needs to be shared between virtual domains.
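
As a rough illustration of the idea above, the sketch below models a queue
whose slots carry the payload inline, so a frame is copied in and copied
out and nothing outside the one shared buffer is ever referenced. This is
only a toy under stated assumptions: the FatQueue class, the slot size and
the 2-byte length header are all hypothetical and do not reflect the real
virtio ring layout or the design tracked in STR-25.

```python
import struct

SLOT_SIZE = 64                    # hypothetical fixed slot size in bytes
HEADER = struct.Struct("<H")      # hypothetical 2-byte payload length field
MAX_PAYLOAD = SLOT_SIZE - HEADER.size


class FatQueue:
    """Toy ring of fixed-size slots living in one shared buffer."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        # The only memory that would need sharing between domains.
        self.shared = bytearray(num_slots * SLOT_SIZE)
        self.head = 0   # count of frames pushed so far
        self.tail = 0   # count of frames popped so far

    def push(self, payload: bytes) -> bool:
        """Copy a self-contained frame into the queue; False if full."""
        if len(payload) > MAX_PAYLOAD:
            raise ValueError("frame must fit entirely inside one slot")
        if self.head - self.tail == self.num_slots:
            return False
        off = (self.head % self.num_slots) * SLOT_SIZE
        HEADER.pack_into(self.shared, off, len(payload))
        start = off + HEADER.size
        self.shared[start:start + len(payload)] = payload
        self.head += 1
        return True

    def pop(self):
        """Copy the oldest frame out of the queue; None if empty."""
        if self.head == self.tail:
            return None
        off = (self.tail % self.num_slots) * SLOT_SIZE
        (length,) = HEADER.unpack_from(self.shared, off)
        start = off + HEADER.size
        data = bytes(self.shared[start:start + length])
        self.tail += 1
        return data
```

The cost is visible in the sketch: every frame is copied twice and is
capped at the slot size, which is why the approach is only being
considered for low performance interfaces.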

Arnd Bergmann will be leading this work which is currently captured in
the STR-25 card:

  https://projects.linaro.org/browse/STR-25


Xen Work
========

We did a bit of work on Xen last cycle which was mostly housekeeping
work to fix regressions and issues booting up on ARM64 systems. We want
to continue the work here to make Xen our reference type-1 hypervisor
for VirtIO work. There is currently a patch series:

  Subject: [PATCH V4 00/24] IOREQ feature (+ virtio-mmio) on Arm
  Date: Tue, 12 Jan 2021 23:52:08 +0200
  Message-Id: <1610488352-18494-1-git-send-email-olekstysh@gmail.com>

which we are helping review and test. It currently comes with its own
virtio-block device backend which can replace the Xen block device
approach. We plan to build on this work and enable QEMU as a generic
virtio backend for Xen ioreq devices as a general proving ground for
virtio backends. While it won't allow for the fastest virtio it will
give access to a broad range of backends thanks to QEMU's general purpose
approach.

I'll be taking the lead on this work which is covered by STR-19:

  https://projects.linaro.org/browse/STR-19

We are also looking at implementing a Xen mediator for the Arm Firmware
Framework. This is a general purpose framework where a hypervisor can
communicate with the system firmware with a common API. This avoids the
need to have multiple firmware aware implementations in the hypervisor
for accessing secure services. As long as the firmware provides the
interface the hypervisor will be able to run on it.

Ruchika Gupta is leading this work under STR-23 which is part of the
broader trusted substrate initiative:

  https://projects.linaro.org/browse/STR-23

SCMI Server
===========

The System Control and Management Interface (SCMI) provides a mechanism
for clients (e.g. kernels needing resources) to request hardware
resources from the system. The server usually sits in the secure
firmware layer and responds to secure calls from the kernel to turn
resources on and off. It is key to efficient power management as you
might for example want to turn clock sources off between decoded video
frames.

In a multi-domain system you have to mediate between a number of
potential users of these resources. For the non-primary domain you can
use a virtio-scmi device:

  Subject: [PATCH v5] Add virtio SCMI device specification
  Date: Wed, 27 May 2020 19:43:25 +0200
  Message-ID: <20200527174325.9529-1-peter.hilber@opensynergy.com>

There is already a proposal for the kernel driver to go along with the
specification:

  Subject: [RFC PATCH v2 00/10] firmware: arm_scmi: Add virtio transport
  Date: Thu, 5 Nov 2020 22:21:06 +0100
  Message-ID: <20201105212116.411422-1-peter.hilber@opensynergy.com>

So our work would be focused on helping those get upstream and working
on an open source reference implementation of the server in the backend.
The question of where the SCMI server should be implemented is an open
one.

The simplest would be a proof of concept user-space server which extends
the existing testing build. This would demonstrate the connection but
wouldn't be usable in production as there isn't currently a method for
user space to access the resource hierarchy maintained by the kernel.

Another option would be to terminate virtio-scmi inside the host kernel
where it could then be merged with the host's own requests. However this
does seem like a horrific hack that embeds policy decisions in the
kernel.

The other two options are to enable the virtio backend for OP-TEE (where
the SCMI server can live) or to enable the SCMI server in a Zephyr RTOS,
which already has some experimental virtio support in preparation for a
Zephyr Dom0.

This work is being led by Vincent Guittot and can be followed from:

  https://projects.linaro.org/browse/STR-4
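
To make the mediation problem described in this section concrete, here is
a minimal sketch of one possible policy: a resource stays on for as long
as at least one domain still wants it. Everything here is an assumption
for illustration only. ResourceMediator, the domain names and the
"video_clk" resource are hypothetical, and the sketch does not model SCMI
messages, the virtio-scmi transport or any real server implementation.

```python
from collections import defaultdict


class ResourceMediator:
    """Toy arbiter: a resource stays on while any domain still wants it."""

    def __init__(self):
        # resource name -> set of domain ids currently requesting it
        self.users = defaultdict(set)

    def request(self, domain, resource, enable):
        """Record one domain's request; return the resulting hardware state."""
        if enable:
            self.users[resource].add(domain)
        else:
            self.users[resource].discard(domain)
        return self.is_on(resource)

    def is_on(self, resource):
        """True while at least one domain holds the resource on."""
        return bool(self.users[resource])


mediator = ResourceMediator()
mediator.request("host", "video_clk", True)
mediator.request("guest", "video_clk", True)
# The host releasing the clock must not turn it off under the guest.
still_on = mediator.request("host", "video_clk", False)
```

Wherever the server ends up living, it needs this kind of combined view
of all requesters, which is what makes its placement a policy question.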

VirtIO serial devices
=====================

There is a desire to implement another serial-like interface for virtio,
as such interfaces are commonly used to expose hardware on embedded and
mobile devices. There are several options available although currently
only virtio-i2c has a proposal for the standard:

  Subject: [virtio-comment] [PATCH v7] virtio-i2c: add the device specification
  Date: Fri, 8 Jan 2021 15:39:08 +0800
  Message-Id: <dfb21780647c69519f01fb0afbbd18f780963af9.1610091344.git.jie.deng@intel.com>

however there have been a number of alternative proposals including
using virtio-greybus or virtio-rpmsg as general purpose multiplexer
transports for these sorts of low bandwidth datagram services. Having a
virtio-i2c implementation would be useful for testing the fat virtqueue
concept, although both the existing virtio-rpmb and proposed virtio-scmi
daemons could also be pressed into service for this.
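
For readers unfamiliar with the multiplexer idea, the sketch below shows
the general shape: several low bandwidth datagram services share one
transport by tagging each datagram with a channel id and a length. The
3-byte header and the mux/demux helpers are hypothetical and are not the
rpmsg or greybus wire formats.

```python
import struct

# Hypothetical frame header: 1-byte channel id, 2-byte payload length.
FRAME = struct.Struct("<BH")


def mux(channel: int, payload: bytes) -> bytes:
    """Wrap one datagram with its channel id and length."""
    return FRAME.pack(channel, len(payload)) + payload


def demux(stream: bytes):
    """Split a byte stream of frames into (channel, payload) pairs."""
    frames = []
    off = 0
    while off < len(stream):
        channel, length = FRAME.unpack_from(stream, off)
        off += FRAME.size
        frames.append((channel, stream[off:off + length]))
        off += length
    return frames


# Two services (channels 1 and 2) interleaved on one shared transport.
stream = mux(1, b"i2c-read") + mux(2, b"") + mux(1, b"\x00\x01")
frames = demux(stream)
```

A single shared queue carrying frames like these is one way many small
devices could avoid each needing their own set of virtqueues.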

Currently we don't have anyone assigned to look at this so I think this
needs someone to step forward with a proposed use case to take this up.

Housekeeping
============

I'm planning on closing out STR-7 (Create a common virtio library for
use by programs implementing a backend) as I'm not sure what it would
achieve. We have implemented one C based backend using the libvhost code
inside the QEMU repository. Although not totally separate from the rest
of the source tree it could be made so with minimal effort if needed. In
the meantime Takahiro has enabled VirtIO inside Zephyr by adapting the
current Linux code into it.

The main contender for a common library comes from the rust-vmm project:

  https://github.com/rust-vmm

and specifically the vhost-user-backend crate:

  https://github.com/rust-vmm/vhost-user-backend/

There are a number of backends that have been implemented with it but it
probably requires someone with good Rust background to evaluate the
current state of the libraries. To my untrained eye there is still some
commonality in the handling that could be moved from the individual
daemons to make the core libraries easier to use. If we want to go
forward with Rust we should create a specific card for that, which a
Rust expert could work on.

Discussion Points 

  • Multimedia: Arnd Bergmann asked whether virtio-camera and virtio-video weren't the same thing; Alex Bennée pointed towards Peter Griffin and said "it's more complex than that".
  • Memory Isolation: Srivatsa Vaddagiri (?) asked about expanding virtqueue limits and testing with virtio-block. Arnd Bergmann said the initial implementation would probably be too constrained but he can look at it afterwards. Alex Bennée pointed out it would be a baseline reference so we know the performance range between non-isolated virtio guests and Fat VirtQueues. From there it's a matter of deciding which trade-offs to make.
  • SCMI Server: ? was keen for the SCMI server to be in userspace. Long discussion about the merits of the various approaches and the potential performance hits of userspace (+ custom non-upstream kernel API) vs the cost of a world switch to an RTOS like Zephyr or a unikernel approach.
  • Serial Device: Alex Bennée called for engineering resource if we want to take it forward this cycle. Arnd Bergmann declared the greybus solution dead. Mathieu Poirier talked about rpmsg being actively used as a generic multiplexer slotting into the virtio stack. It was pointed out that with large numbers of devices on memory constrained systems some sort of multiplexer is needed. Arnd Bergmann threw shade at virtio-serial and raised the idea of a generic virtio-on-virtio multiplexer service.
  • AoB: discussion about virtio-rpmb and eMMC passthrough.

Action items

  • Ilias Apalodimas, Arnd Bergmann, Alex Bennée: chase the kernel on the RPMB API and propose a spec update to handle eMMC passthrough.
      - 4 Feb 2021: need to talk to Ulf. Arnd says there is a hold-up: a proper patch for USFS is needed but the author never followed up (the series got up to v7), so there is no proper interface. Arnd / Ilias to continue the discussion with Ulf and perhaps Intel.
      - 18 Feb 2021: the API from Intel is too detailed and not useful for RPMB; proposed to the list this week. It will probably have to be respun with a higher-level API for RPMB. If this does not go in, OASIS might have to change. Alex: changing the driver mode of Linux is the best answer.
      - 19 Feb 2021: moved to Jira (card link unavailable).