About
Goal: Split the current monolithic wperf-driver (Kernel driver) into small, independent drivers with a well-defined Kernel ↔︎ User space protocol.
Relates to WPERF-709.
Introduction
Note: This is a draft. We may want to split the drivers in a different way, e.g. core PMU and uncore PMU as two separate drivers (not three, as stated below). The draft of ideas and possible solutions for the DT architecture follows. Further chapters should describe the architecture we actually want to implement.
[ON HOLD] Split current wperf-driver into three separate independent drivers (core, DSU and DMC driver).
Introduce “driver type” for WindowsPerf drivers in DT.
“core PMU” driver type to support vanilla Arm PMUs.
“uncore PMU” driver(s):
Introduce “DSU” driver type to support uncore PMU counting.
Introduce “DMC” driver type to support Dynamic Memory Controllers.
[ON HOLD] Introduce driver “template” for each type.
Note: We may want to have one template for “PMU like devices”
[ON HOLD] Move current wperf-driver core PMU and uncore PMU implementations under new driver types.
Add DT driver enumeration to user space.
Adapt wperf and drivers to communicate with N drivers installed.
Adapt wperf-devgen to support new driver types enumeration and installation / removal.
Design And Architecture
This chapter contains the design and architectural decisions we agree to implement with the DT feature.
All architectural changes below form a roadmap (they are listed in no particular implementation order).
DMC driver prototype
Kernel - user-space Interface
The interface between user space (the wperf application) and the Kernel driver, aka wperf-driver (and other wperf-driver-* drivers), is divided into logical sections and is defined by IOCTLs as follows:
Counting Interface
This supports generic Arm PMU hardware.
Counting model, for obtaining aggregate counts of occurrences of special events.
IOCTL | Description / Notes |
---|---|
IOCTL_PMU_CTL_START | |
IOCTL_PMU_CTL_STOP | |
IOCTL_PMU_CTL_RESET | |
IOCTL_PMU_CTL_QUERY_SUPP_EVENTS | |
IOCTL_PMU_CTL_ASSIGN_EVENTS | |
IOCTL_PMU_CTL_READ_COUNTING | |
IOCTL_DSU_CTL_INIT | |
IOCTL_DSU_CTL_READ_COUNTING | |
IOCTL_DMC_CTL_INIT | |
IOCTL_DMC_CTL_READ_COUNTING | |
Sampling Interface
This supports generic Arm PMU hardware and other devices (such as SPE?) which can perform sampling and deliver the same type of information to user space: PC values with hit counts.
Sampling model, for determining the frequencies of event occurrences produced by program locations at the function, basic block, and/or instruction levels.
IOCTL | Description / Notes |
---|---|
IOCTL_PMU_CTL_SAMPLE_SET_SRC | |
IOCTL_PMU_CTL_SAMPLE_START | |
IOCTL_PMU_CTL_SAMPLE_STOP | |
IOCTL_PMU_CTL_SAMPLE_GET | |
Interoperability Interface
This interface is common amongst all drivers. It is used to query, detect, and fetch hardware information from each driver.
Note: ALL DRIVERS MUST CORRECTLY AND IN THE SAME WAY (ABI) IMPLEMENT THIS INTERFACE.
IOCTL | Description / Notes |
---|---|
IOCTL_PMU_CTL_QUERY_HW_CFG | |
IOCTL_PMU_CTL_QUERY_VERSION | |
IOCTL_PMU_CTL_LOCK_ACQUIRE | |
IOCTL_PMU_CTL_LOCK_RELEASE | |
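
The requirement that every driver implements the same interoperability IOCTLs could be checked mechanically. The sketch below is illustrative only: it compares IOCTL names and says nothing about the ABI (payload layout) itself. `missing_interop_ioctls` and the example DMC-only driver set are hypothetical and not part of WindowsPerf.

```python
from typing import Iterable, List

# IOCTLs listed in the Interoperability Interface table above.
INTEROP_IOCTLS = frozenset({
    "IOCTL_PMU_CTL_QUERY_HW_CFG",
    "IOCTL_PMU_CTL_QUERY_VERSION",
    "IOCTL_PMU_CTL_LOCK_ACQUIRE",
    "IOCTL_PMU_CTL_LOCK_RELEASE",
})


def missing_interop_ioctls(driver_ioctls: Iterable[str]) -> List[str]:
    """Return the interoperability IOCTLs a driver does not declare (sorted)."""
    return sorted(INTEROP_IOCTLS - set(driver_ioctls))


# Hypothetical DMC-only driver that forgot the lock IOCTLs.
dmc_driver_ioctls = {
    "IOCTL_DMC_CTL_INIT",
    "IOCTL_DMC_CTL_READ_COUNTING",
    "IOCTL_PMU_CTL_QUERY_HW_CFG",
    "IOCTL_PMU_CTL_QUERY_VERSION",
}

missing = missing_interop_ioctls(dmc_driver_ioctls)
print(missing)
```

A check of this kind would only catch a missing IOCTL; agreeing on identical payload layouts across drivers still has to be enforced separately.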
Device_ID string pattern
See:
Implementation details:
Introduction
<dev_type>.<dev_func>=<event_prefix_list> - one entry for each supported capability, entries separated with a semicolon (;).
Where:
<dev_type> - supported device type, e.g. core (Arm PMU), dsu, dmc (Arm DDR controller), and spe (Arm Statistical Profiling Extension).
<dev_func> - stat for counting with wperf stat (also timelines); sample or record for sampling with wperf sample / wperf record.
<event_prefix_list> - list of event prefixes supported (with wperf ... -e <events>) that will indicate which device to select for the events.
Examples
Example Device_ID string for pristine wperf-driver:
core.stat=core;core.sample=core;dsu.stat=dsu;dmc.stat=dmc_clk,dmc_clkdiv2
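
A minimal sketch of how user space might parse such a Device_ID string. It assumes, based on the example above, that entries are separated by `;` and event prefixes by `,`. `parse_device_id` is a hypothetical helper, not an existing wperf function.

```python
from typing import Dict, List, Tuple


def parse_device_id(device_id: str) -> Dict[Tuple[str, str], List[str]]:
    """Parse '<dev_type>.<dev_func>=<event_prefix_list>;...' into a dict
    keyed by (dev_type, dev_func) with the list of event prefixes as value."""
    capabilities: Dict[Tuple[str, str], List[str]] = {}
    for entry in device_id.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon
        key, eq, prefixes = entry.partition("=")
        dev_type, dot, dev_func = key.partition(".")
        if not (eq and dot and dev_type and dev_func and prefixes):
            raise ValueError(f"malformed Device_ID entry: {entry!r}")
        capabilities[(dev_type, dev_func)] = prefixes.split(",")
    return capabilities


parsed = parse_device_id(
    "core.stat=core;core.sample=core;dsu.stat=dsu;dmc.stat=dmc_clk,dmc_clkdiv2"
)
for (dev_type, dev_func), prefixes in parsed.items():
    print(dev_type, dev_func, prefixes)
```

Keying the result by the (device type, function) pair keeps each capability separate, so one driver can report several capabilities in a single string.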
Example Device_ID for each device:
core.stat=core - events starting with /core/ (or no prefix!) selected with wperf stat -e <event> or wperf stat -e /core/<event> will be used to count events over time.
core.sample=core - events starting with /core/ (or no prefix!) selected with wperf sample -e <event> or wperf sample -e /core/<event> will be used to sample events over time.
dmc.stat=dmc_clk,dmc_clkdiv2 - events starting with /dmc_clk/ or /dmc_clkdiv2/ selected with wperf stat -e /dmc_clk/<event> or wperf stat -e /dmc_clkdiv2/<event> will be used to count events over time.
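
The prefix-based device selection described above could be sketched as follows. `CAPABILITIES` is a hypothetical, already-parsed form of the example Device_ID string, `split_event` and `select_device` are illustrative names, and `some_event` is a placeholder event name. The sketch assumes that events without a prefix default to `core`, as stated above.

```python
from typing import Dict, List, Tuple

# Hypothetical parsed form of the example Device_ID string.
CAPABILITIES: Dict[Tuple[str, str], List[str]] = {
    ("core", "stat"): ["core"],
    ("core", "sample"): ["core"],
    ("dsu", "stat"): ["dsu"],
    ("dmc", "stat"): ["dmc_clk", "dmc_clkdiv2"],
}
DEFAULT_PREFIX = "core"  # assumption: events without a prefix go to core


def split_event(event: str) -> Tuple[str, str]:
    """Split '/prefix/name' into (prefix, name); bare names use the default."""
    if event.startswith("/"):
        prefix, sep, name = event[1:].partition("/")
        if not (sep and prefix and name):
            raise ValueError(f"malformed event: {event!r}")
        return prefix, name
    return DEFAULT_PREFIX, event


def select_device(event: str, dev_func: str) -> str:
    """Return the dev_type that handles `event` for `dev_func` (stat/sample)."""
    prefix, _name = split_event(event)
    for (dev_type, func), prefixes in CAPABILITIES.items():
        if func == dev_func and prefix in prefixes:
            return dev_type
    raise LookupError(f"no device supports {dev_func!r} for prefix {prefix!r}")


print(select_device("some_event", "stat"))
print(select_device("/dmc_clkdiv2/some_event", "stat"))
```

Requesting a function a device does not advertise, such as sampling on a `/dsu/` event with this table, raises `LookupError` instead of silently falling back to another device.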
Assumptions:
One driver can have many capabilities in its Device_ID string.
A driver yields data for each capability listed in its Device_ID string.
Notes
Przemyslaw Wirkus: Changes that we may have to implement:
New IOCTL with WindowsPerf-like detection information, with payload like:
Capability string (ASCIIZ) e.g. "spe", "core;dsu;dmc", where "core" or "spe" are devices defined by us which can e.g. count and sample. Can be semicolon (;) separated.
Note: "core;spe" - we would allow one driver to have more than one capability aka.
Functionality string (ASCIIZ) e.g. "counting" or "sampling", where we e.g. state that the "pmu" capability comes with counting and sampling (like wperf-driver does). Can be semicolon (;) separated for each capability.
Example: for capability string "core;dsu;dmc" (which would be the wperf-driver capability string) we can generate the functionality string "counting,sampling;counting;counting", which maps like this:
core → counting,sampling - wperf’s stat and sample / record work with events specified with -e and /core/<event> (or -e <event>). Note: /core/ will also be the default (as it is now) for events without a prefix.
dsu → counting - wperf’s stat works with events specified with -e and /dsu/<events>.
dmc → counting - wperf’s stat works with events specified with -e and /dmc/<events>.
IOCTL_PMU_CTL_QUERY_HW_CFG - we may have to split this IOCTL into at least two parts: one would provide hardware information (like register values) as it does now; the second might be interface specific.
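
The capability and functionality strings proposed in these notes pair up positionally. A minimal sketch of that mapping, assuming `;` separates capabilities and `,` separates the functionalities of a single capability, as in the example in the notes (`map_functionality` is a hypothetical helper name):

```python
from typing import Dict, List


def map_functionality(capability: str, functionality: str) -> Dict[str, List[str]]:
    """Pair each capability with its functionality list by position."""
    caps = capability.split(";")
    funcs = functionality.split(";")
    if len(caps) != len(funcs):
        raise ValueError(
            f"{len(caps)} capabilities but {len(funcs)} functionality groups"
        )
    return {cap: group.split(",") for cap, group in zip(caps, funcs)}


mapping = map_functionality(
    "core;dsu;dmc", "counting,sampling;counting;counting"
)
for cap, funcs in mapping.items():
    print(cap, "->", ",".join(funcs))
```

Rejecting strings of different lengths makes a mismatch between the two payload fields visible early, rather than leaving a capability without any functionality.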