Improve mem and string function performance

Description

Rationale

The glibc implementations of the mem and string functions (as defined by POSIX) on ARM were largely contributed to glibc from Linaro's cortex-strings library.

The cortex-strings library is released under a modified BSD license, with copyright held by Linaro Limited. Glibc is released under the LGPLv2.1 and requires copyright on all contributions to be assigned to the Free Software Foundation.

This means that to improve the performance of any of these routines, we should first address performance in cortex-strings; any changes should then be merged into glibc under copyright assignment and permissive relicensing.

Most applications spend at least some computation time in mem and string routines, and many spend a significant portion of their time there. These are some of the most highly optimized functions in operating system C libraries. Performance of mem and string routines affects almost all real-world workloads and shows up significantly in major benchmarks such as SPEC2k.

Deliverables

  • Perform analysis of the current state of mem and string optimization routines for ARMv7 and ARMv8.

    • ARMv8-A string routines for Cortex-A53 and Cortex-A57 are state of the art.

    • memcpy, strlen, and strcmp are all best in class for ARMv7.

    • Other functions for ARMv7, such as memset and strchr, are weaker.

  • Identify preconditions for each function to be optimized.

    • Optimal data lengths

    • Aligned vs. unaligned (which do we optimize for?)

    • Vectorization/NEON (expense vs benefit)

  • Provide optimized versions of functions that are performing below their potential.

    • strchr

    • memmove

    • memset

    • strcmp
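
The preconditions listed above (data length, alignment, and whether wide stores pay off) can be made concrete with a small portable C sketch. It is only an illustration under stated assumptions: `sketch_memset` and `sketch_memset_mismatches` are hypothetical names, not cortex-strings or glibc code, and the 8-byte word size is an assumed choice. A tuned ARM routine would be written in assembly and might use NEON. The sketch handles an unaligned head with byte stores, an aligned body with 8-byte stores, and a short tail, and it states its preconditions in the implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical illustration, not taken from cortex-strings or glibc.
 * Preconditions:
 *   - dst points to at least n writable bytes (dst may be NULL only if n == 0).
 *   - No alignment is required; the head loop aligns the pointer.
 */
void *sketch_memset(void *dst, int c, size_t n)
{
    assert(dst != NULL || n == 0);

    unsigned char *p = dst;
    const unsigned char byte = (unsigned char)c;

    /* Head: byte stores until p is 8-byte aligned or the data runs out. */
    while (n > 0 && ((uintptr_t)p & 7u) != 0) {
        *p++ = byte;
        n--;
    }

    /* Body: replicate the byte into a 64-bit pattern and store 8 bytes at
     * a time. memcpy of a fixed size avoids aliasing problems; compilers
     * typically lower it to a single store. */
    const uint64_t pattern = (uint64_t)byte * UINT64_C(0x0101010101010101);
    while (n >= 8) {
        memcpy(p, &pattern, sizeof pattern);
        p += 8;
        n -= 8;
    }

    /* Tail: fewer than 8 bytes remain. */
    while (n > 0) {
        *p++ = byte;
        n--;
    }
    return dst;
}

/* Compare the sketch against memset for every start offset 0..15 and
 * length 0..64. Returns the number of combinations that differ, including
 * any write outside the requested range. */
int sketch_memset_mismatches(void)
{
    enum { MAX_OFFSET = 16, MAX_LEN = 64, GUARD = 8 };
    unsigned char got[MAX_OFFSET + MAX_LEN + GUARD];
    unsigned char want[sizeof got];
    int mismatches = 0;

    for (size_t off = 0; off < MAX_OFFSET; off++) {
        for (size_t len = 0; len <= MAX_LEN; len++) {
            memset(got, 0xAA, sizeof got);
            memset(want, 0xAA, sizeof want);
            sketch_memset(got + off, 0x5C, len);
            memset(want + off, 0x5C, len);
            if (memcmp(got, want, sizeof got) != 0)
                mismatches++;
        }
    }
    return mismatches;
}
```

Calling `sketch_memset_mismatches()` sweeps the offset and length combinations where head and tail handling usually goes wrong. Which lengths and alignments deserve the fast path is the question the analysis deliverable is meant to answer.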

Staffing

Each function optimization takes about 1m of effort; we are aiming for four optimizations, for a total expense of 4m.

External Dependencies

Does not apply

Risks and Assumptions

  • Risk Matrix (Optional section): Impact rating 1 (little impact) to 3 (severe impact)

Risk Description | Impact (1-3) | Risk Mitigation
Specific optimizations take longer than 1m | 2 | We will deliver one fewer optimization than planned.

Acceptance Criteria and Closeout

Criteria | Status | Closeout Notes/Links
cortex-strings and glibc make check pass cleanly | |
Preconditions are documented in implementations | |

Legend:

Done, Not Done, Doesn't apply (note the reason)
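
As a rough illustration of what a clean check means for one of the target functions, the sketch below compares a candidate strchr against the C library's strchr over a range of start offsets, string lengths, and search characters. It is an assumption-laden example, not the cortex-strings or glibc test suite: `strchr_fn`, `candidate_strchr`, and `strchr_mismatches` are hypothetical names, and the byte-at-a-time candidate merely stands in for an optimized routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature shared by the reference strchr and any candidate. */
typedef char *(*strchr_fn)(const char *, int);

/* Hypothetical stand-in for an optimized routine: plain byte-at-a-time.
 * Precondition: s points to a NUL-terminated string. */
char *candidate_strchr(const char *s, int c)
{
    const char ch = (char)c;
    for (;; s++) {
        if (*s == ch)
            return (char *)s;   /* also matches the terminator when ch == 0 */
        if (*s == '\0')
            return NULL;
    }
}

/* Count the cases where fn disagrees with the C library's strchr, sweeping
 * start offsets 0..15, lengths 0..48, and four search characters. */
int strchr_mismatches(strchr_fn fn)
{
    enum { MAX_OFFSET = 16, MAX_LEN = 48 };
    char buf[MAX_OFFSET + MAX_LEN + 1];
    const int needles[] = { 'a', 'q', 'z', '\0' };
    int mismatches = 0;

    assert(fn != NULL);

    for (size_t off = 0; off < MAX_OFFSET; off++) {
        for (size_t len = 0; len <= MAX_LEN; len++) {
            char *s = buf + off;
            for (size_t i = 0; i < len; i++)
                s[i] = (char)('a' + i % 25);   /* cycles 'a'..'y'; 'z' never occurs */
            s[len] = '\0';
            for (size_t k = 0; k < sizeof needles / sizeof needles[0]; k++) {
                if (fn(s, needles[k]) != strchr(s, needles[k]))
                    mismatches++;
            }
        }
    }
    return mismatches;
}
```

A result of zero from `strchr_mismatches(candidate_strchr)` means the candidate agreed with the reference on every swept case, including the absent character and the search for the terminator. It does not replace the projects' own make check runs.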

Activity

Maxim Kuvyrkov October 16, 2017 at 5:05 PM

TCWG has no plans to actively work on memory and string functions. Propagating cortex-strings contributions to C libraries is handled in other epics.

Serge Broslavsky September 4, 2015 at 7:25 AM

Migrated comment originally posted by will.newton@linaro.org on 2014-05-27 20:05:36 +0000.

Previous routines have been upstreamed to glibc, bionic and newlib by the TCWG.

I believe some may have found their way into the kernel as well, but there may be a different set of tradeoffs there, e.g. less appetite for using FPU/SIMD, and specialized routines for clearing pages, etc.

Serge Broslavsky September 4, 2015 at 7:25 AM

Migrated comment originally posted by kate.stewart@linaro.org on 2014-06-15 03:51:16 +0000.

Reviewed in OPSCOM-1406, moving from drafting to upstream development. This has been in process under the ‘iteration’ methodology used previously. This is now exposed at the roadmap card level.

Serge Broslavsky September 4, 2015 at 7:25 AM

Migrated comment originally posted by will.newton@linaro.org on 2014-05-23 07:34:46 +0000.

I missed a function out for AArch64 - we are also missing a strcpy implementation.

Serge Broslavsky September 4, 2015 at 7:25 AM

Migrated comment originally posted by kate.stewart@linaro.org on 2014-05-27 19:52:11 +0000.

Which upstreams are the target for this work? (Will these routines be able to go into bionic? or should a separate CARD be created for this?)

Duplicate

Details

Created September 4, 2015 at 7:25 AM
Updated July 15, 2021 at 4:56 PM
Resolved October 16, 2017 at 5:05 PM