Device Tree Feature (aka DT) Support
About
Goal: Split the current monolithic wperf-driver (Kernel driver) into small, independent drivers with a well-defined Kernel ↔︎ User space protocol.
Relates to WPERF-709: Device Tree Support (Closed).
Introduction
Note: This is a draft. We may want to split the drivers in a different way, e.g. core PMU and uncore PMU as two separate drivers (not three as stated below). What follows is a draft of ideas and possible solutions for the DT architecture. Further chapters should describe the architecture we actually want to implement.
[ON HOLD] Split current wperf-driver into three separate independent drivers (core, DSU and DMC driver).
Introduce “driver type” for WindowsPerf drivers in DT.
“core PMU” driver type to support vanilla Arm PMUs.
“uncore PMU” driver(s):
Introduce “DSU” driver type to support uncore PMU counting.
Introduce “DMC” driver type to support Dynamic Memory Controllers.
[ON HOLD] Introduce driver “template” for each type.
Note: We may want to have one template for “PMU like devices”
[ON HOLD] Move current wperf-driver core PMU and uncore PMU implementations under new driver types.
Add DT driver enumeration to user space.
Adapt wperf and drivers to communicate with N drivers installed.
Adapt wperf-devgen to support new driver types enumeration and installation / removal.
Design And Architecture
This chapter contains the design and architectural decisions we agree to implement with the DT feature.
All architectural changes below form a roadmap; they are listed in no particular order of implementation.
DMC driver prototype
Prototype source: branch feature-WPERF-729 of linaro / WindowsPerf / WindowsPerf on GitLab.
Comments from @Matthew Sykes that I got over Slack (archived here):
This is the IOCTL handler. The concept of 'queue' is a bit old fashioned to be honest, I don't know why MSFT used it in KMDF, but they have, and the 'queue init' code sets the IOCTL handler entry point.
Device create sets the File Open and Close entry points, so those handlers are in device.c
WDM drivers are much cleaner in this respect: open, close, ioctl are just Irp requests, all set in the same place and often handled in the same file.
Then you have power and plug and play handlers, they also have their own files.
IOCTL handlers by their nature can be huge.
Especially so since often the Irp needs handling on the way back up from a lower device, so there is twice as much code: one section sends it down, the other handles it on the way up, as the lower driver, usually a PDO, sets some values in the Irp.
But yeah, it's a bit of a moot point whether to put queue and ioctl into two files or one, as they are actually the same thing in KMDF.
Linaro Dummy is the driver, and the yellow bang is saying:
"The device cannot find enough free resources", yet clearly in the previous screenshot there is nothing else using this range. So it's almost as if the system has reserved it and won't let the DMC have it.
Kernel - user-space Interface
The interface between the user-space wperf application and the Kernel driver, aka wperf-driver (and other wperf-driver-* drivers), is divided into logical sections and is defined by IOCTLs as follows:
Counting Interface
This supports generic Arm PMU hardware.
Counting model, for obtaining aggregate counts of occurrences of special events.
IOCTL | Description / Notes |
---|---|
IOCTL_PMU_CTL_START | |
IOCTL_PMU_CTL_STOP | |
IOCTL_PMU_CTL_RESET | |
IOCTL_PMU_CTL_QUERY_SUPP_EVENTS | |
IOCTL_PMU_CTL_ASSIGN_EVENTS | |
IOCTL_PMU_CTL_READ_COUNTING | |
IOCTL_DSU_CTL_INIT | |
IOCTL_DSU_CTL_READ_COUNTING | |
IOCTL_DMC_CTL_INIT | |
IOCTL_DMC_CTL_READ_COUNTING | |
Sampling Interface
This supports generic Arm PMU hardware and other devices (such as SPE?) which can perform sampling and deliver the same type of information to user space: PC values with hit counts.
Sampling model, for determining the frequencies of event occurrences produced by program locations at the function, basic block, and/or instruction levels.
IOCTL | Description / Notes |
---|---|
IOCTL_PMU_CTL_SAMPLE_SET_SRC | |
IOCTL_PMU_CTL_SAMPLE_START | |
IOCTL_PMU_CTL_SAMPLE_STOP | |
IOCTL_PMU_CTL_SAMPLE_GET | |
Interoperability Interface
This interface is common amongst all drivers. It is used to query, detect and fetch hardware information from each driver.
Note: ALL DRIVERS MUST IMPLEMENT THIS INTERFACE CORRECTLY AND IN THE SAME WAY (SAME ABI).
IOCTL | Description / Notes |
---|---|
IOCTL_PMU_CTL_QUERY_HW_CFG | |
IOCTL_PMU_CTL_QUERY_VERSION | |
IOCTL_PMU_CTL_LOCK_ACQUIRE | |
IOCTL_PMU_CTL_LOCK_RELEASE | |
Device_ID string pattern
See:
Implementation details:
Introduction
<dev_type>.<dev_func>=<event_prefix_list> - one entry for each supported "capability", entries separated with a semicolon (;).
Where:
<dev_type> - supported device type, e.g. core (Arm PMU), dsu, dmc (Arm DDR controller), and spe (Arm Statistical Profiling Extension).
<dev_func> - stat for counting with wperf stat (also timelines); sample or record for sampling with wperf sample / wperf record.
<event_prefix_list> - comma-separated list of supported event prefixes (used with wperf ... -e <events>) that indicate which device to select for the events.
Examples
Example Device_ID string for pristine wperf-driver:
core.stat=core;core.sample=core;dsu.stat=dsu;dmc.stat=dmc_clk,dmc_clkdiv2
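As a minimal sketch of how user space might split such a string into its capabilities, the snippet below parses the example above. The function name `parse_device_id` and the returned dictionary layout are assumptions made for illustration; they are not taken from the wperf sources.

```python
from typing import Dict, List, Tuple


def parse_device_id(device_id: str) -> Dict[Tuple[str, str], List[str]]:
    """Parse '<dev_type>.<dev_func>=<event_prefix_list>' entries separated by ';'.

    Hypothetical helper: returns {(dev_type, dev_func): [event prefixes]}.
    """
    capabilities: Dict[Tuple[str, str], List[str]] = {}
    for entry in device_id.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty entries, e.g. a trailing semicolon
        key, eq, prefixes = entry.partition("=")
        dev_type, dot, dev_func = key.partition(".")
        if not eq or not dot or not dev_type or not dev_func:
            raise ValueError(f"malformed Device_ID entry: {entry!r}")
        # The event prefix list is comma separated, e.g. 'dmc_clk,dmc_clkdiv2'.
        capabilities[(dev_type, dev_func)] = [p for p in prefixes.split(",") if p]
    return capabilities


# Example Device_ID string of the pristine wperf-driver (from this page).
caps = parse_device_id(
    "core.stat=core;core.sample=core;dsu.stat=dsu;dmc.stat=dmc_clk,dmc_clkdiv2"
)
for (dev_type, dev_func), prefixes in caps.items():
    print(dev_type, dev_func, prefixes)
```

Keying the result by the (dev_type, dev_func) pair keeps `core.stat` and `core.sample` as separate capabilities, which matches the one-entry-per-capability pattern described above.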
Example Device_ID for each device
core.stat=core - events starting with /core/ (or no prefix!) selected with wperf stat -e <event> or wperf stat -e /core/<event> will be used to count events over time.
core.sample=core - events starting with /core/ (or no prefix!) selected with wperf sample -e <event> or wperf sample -e /core/<event> will be used to sample events over time.
dmc.stat=dmc_clk,dmc_clkdiv2 - events starting with /dmc_clk/ or /dmc_clkdiv2/ selected with wperf stat -e /dmc_clk/<event> or wperf stat -e /dmc_clkdiv2/<event> will be used to count events over time.
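The prefix-based selection described above could be sketched as follows. This is an illustration under assumptions, not the wperf implementation: the helper names `split_event` and `select_device` and the event name `some_event` are hypothetical, and the only behaviour taken from this page is that an event without a prefix is treated as a /core/ event.

```python
from typing import Optional, Tuple

DEFAULT_PREFIX = "core"  # per this page: events with no prefix belong to /core/


def split_event(event: str) -> Tuple[str, str]:
    """Split '/prefix/name' into (prefix, name); bare names get the default prefix."""
    if event.startswith("/"):
        prefix, slash, name = event[1:].partition("/")
        if not slash or not prefix or not name:
            raise ValueError(f"malformed event: {event!r}")
        return prefix, name
    return DEFAULT_PREFIX, event


def select_device(device_id: str, dev_func: str, event: str) -> Optional[str]:
    """Return the <dev_type> whose <dev_func> entry lists the event's prefix."""
    prefix, _name = split_event(event)
    for entry in filter(None, device_id.split(";")):
        key, _eq, prefixes = entry.partition("=")
        dev_type, _dot, func = key.partition(".")
        if func == dev_func and prefix in prefixes.split(","):
            return dev_type
    return None  # no installed capability handles this event


DEVICE_ID = "core.stat=core;core.sample=core;dsu.stat=dsu;dmc.stat=dmc_clk,dmc_clkdiv2"

# 'some_event' is a placeholder event name, not a real PMU event.
print(select_device(DEVICE_ID, "stat", "/dmc_clkdiv2/some_event"))
print(select_device(DEVICE_ID, "sample", "some_event"))
print(select_device(DEVICE_ID, "sample", "/dsu/some_event"))
```

With the example string, the first two calls resolve to the `dmc` and `core` device types, while the last one finds nothing because `dsu` only advertises `stat`.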
Assumptions:
One driver can have many capabilities in its Device_ID string.
A driver yields data for each capability listed in its Device_ID string.
Notes
@Przemyslaw Wirkus: Changes that we may have to implement:
New IOCTL with WindowsPerf-like detection information, with a payload like:
Capability string (ASCIIZ), e.g. "spe" or "core;dsu;dmc", where "core" or "spe" are devices defined by us which can e.g. count and sample. Can be semicolon (;) separated.
Note: "core;spe" - we would allow one driver to have more than one capability.
Functionality string (ASCIIZ), e.g. "counting" or "sampling", where we state e.g. that the "pmu" capability comes with counting and sampling (like wperf-driver does). Can be semicolon (;) separated, with one group for each capability. Example: for the capability string "core;dsu;dmc" (which would be the wperf-driver capability string) we can generate the functionality string "counting,sampling;counting;counting", which maps like this:
core → counting,sampling - wperf’s stat and sample / record work with events specified with -e and /core/<event> (or -e <event>). Note: /core/ will also be the default (as it is now) for events without a prefix.
dsu → counting - wperf’s stat works with events specified with -e and /dsu/<events>.
dmc → counting - wperf’s stat works with events specified with -e and /dmc/<events>.
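The pairing of the two strings can be sketched as below. This is only an illustration of the proposal: the helper names, the NUL-terminated byte payloads and the command table are assumptions layered on the description above, and no such IOCTL exists yet.

```python
from typing import Dict, List

# Which wperf commands each functionality enables, as described on this page.
COMMANDS = {"counting": ["stat"], "sampling": ["sample", "record"]}


def decode_asciiz(raw: bytes) -> str:
    """Decode an ASCIIZ payload: ASCII text up to the first NUL byte."""
    return raw.split(b"\0", 1)[0].decode("ascii")


def map_functionality(capability: str, functionality: str) -> Dict[str, List[str]]:
    """Pair each ';'-separated capability with its ','-separated functionalities."""
    capabilities = capability.split(";")
    groups = functionality.split(";")
    if len(capabilities) != len(groups):
        raise ValueError("capability and functionality strings differ in length")
    return {cap: group.split(",") for cap, group in zip(capabilities, groups)}


# Hypothetical payloads, as the proposed IOCTL might return them for wperf-driver.
capability = decode_asciiz(b"core;dsu;dmc\0")
functionality = decode_asciiz(b"counting,sampling;counting;counting\0")

mapping = map_functionality(capability, functionality)
for cap, funcs in mapping.items():
    commands = [cmd for func in funcs for cmd in COMMANDS[func]]
    print(f"{cap} -> {','.join(funcs)} (wperf {' / '.join(commands)})")
```

Rejecting strings whose group counts differ is a design choice of this sketch; it makes a mismatch between the two payloads visible instead of silently dropping a capability.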
IOCTL_PMU_CTL_QUERY_HW_CFG - we may have to split this IOCTL into at least two parts: one would provide hardware information (like register values) as it does now, the second would perhaps be interface specific.