Missing x64 Intrinsics and Windows API for ARM64

This page lists x64 intrinsics and Windows APIs that are used in many popular codebases but have neither an ARM64 implementation nor an equivalent MSVC intrinsic.

For each entry, an alternative ARM64 implementation in C is provided to help developers port code to Windows ARM64 targets.

Intrinsics / Windows API

Description

ARM64 alternative implementation

1

_umul128

128-bit unsigned multiplication

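#include <intrin.h>
#include <stdint.h>

#pragma intrinsic(__umulh)

uint64_t _umul128(uint64_t a, uint64_t b, uint64_t *high)
{
    *high = __umulh(a, b);  // upper 64 bits of the 128-bit product
    return a * b;           // lower 64 bits (unsigned multiply wraps)
}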
2

UnsignedMultiply128
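The 128-bit multiply contract used by the entries above (low half returned, high half stored through a pointer) can be cross-checked on any C99 compiler with a portable sketch. This is not the MSVC intrinsic: `umulh_portable` and `umul128_portable` are hypothetical names introduced here for illustration, and the high half is rebuilt from 32-bit partial products instead of a hardware instruction.

```c
#include <stdint.h>

/* Hypothetical helper (not an MSVC intrinsic): high 64 bits of a * b,
   built from 32-bit halves so it runs on any C99 compiler. */
static uint64_t umulh_portable(uint64_t a, uint64_t b)
{
    const uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    const uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    const uint64_t lo_lo = a_lo * b_lo;
    const uint64_t hi_lo = a_hi * b_lo;
    const uint64_t lo_hi = a_lo * b_hi;
    const uint64_t hi_hi = a_hi * b_hi;

    /* Terms that overlap bits 32..63 of the product; the sum fits in 64 bits. */
    const uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi;

    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* Same contract as _umul128: returns the low half, stores the high half. */
static uint64_t umul128_portable(uint64_t a, uint64_t b, uint64_t *high)
{
    *high = umulh_portable(a, b);
    return a * b; /* unsigned multiply wraps, leaving the low 64 bits */
}
```

A sketch like this is mainly useful as a reference when unit-testing ported code on a host that lacks the ARM64 intrinsic.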

3

_addcarry_u64

64-bit add with carry returning the carry flag

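uint64_t addcarry(uint64_t x, uint64_t y, uint64_t carry_in, uint64_t *sum)
{
    const uint64_t partial = x + y;             // may wrap around
    *sum = partial + (carry_in != 0 ? 1 : 0);
    // Carry out if either addition wrapped past UINT64_MAX.
    return (partial < x) || (*sum < partial);
}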
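A typical use of an add-with-carry primitive is chaining limbs for multi-word arithmetic. The following is a minimal sketch of a 128-bit addition built that way. The names `addcarry_sketch`, `u128_sketch`, and `add128_sketch` are hypothetical and introduced only for this example. The helper is repeated so the block compiles on its own, and it derives the carry-out from wraparound of each partial addition.

```c
#include <stdint.h>

/* Hypothetical stand-in for an add-with-carry helper.
   Stores x + y + carry_in (mod 2^64) in *sum and returns the carry out (0 or 1). */
static uint64_t addcarry_sketch(uint64_t x, uint64_t y, uint64_t carry_in, uint64_t *sum)
{
    const uint64_t partial = x + y;
    *sum = partial + (carry_in != 0 ? 1 : 0);
    return (partial < x) || (*sum < partial);
}

/* Hypothetical 128-bit value stored as two 64-bit limbs. */
typedef struct { uint64_t lo, hi; } u128_sketch;

/* Adds a and b limb by limb and returns the carry out of bit 127. */
static uint64_t add128_sketch(u128_sketch a, u128_sketch b, u128_sketch *out)
{
    const uint64_t carry = addcarry_sketch(a.lo, b.lo, 0, &out->lo);
    return addcarry_sketch(a.hi, b.hi, carry, &out->hi);
}
```

The carry returned by the low-limb addition feeds the high-limb addition, which is the same pattern x64 code usually expresses with two `_addcarry_u64` calls.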
4

__rdtsc

Processor time stamp - Number of clock cycles since reset

int64_t processor_cycle_count()
{
    const int64_t pmccntr_el0 = (((3 & 1) << 14) |   // op0
                                 ((3 & 7) << 11) |   // op1
                                 ((9 & 15) << 7) |   // crn
                                 ((13 & 15) << 3) |  // crm
                                 ((0 & 7) << 0));    // op2
    return _ReadStatusReg(pmccntr_el0);
}
5

__popcnt and variants

Counts the number of 1 bits (population count) in a 16-, 32-, or 64-bit unsigned integer.
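As a portable point of reference, population count can be written with plain integer arithmetic. The sketch below uses the well-known bit-parallel (SWAR) technique and is not an ARM64-specific or MSVC-specific implementation. The function names are hypothetical and chosen for this example only.

```c
#include <stdint.h>

/* Hypothetical helper: bit-parallel (SWAR) population count of a 64-bit value. */
static unsigned popcount64_sketch(uint64_t x)
{
    /* Each 2-bit field now holds the count of its two original bits. */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    /* Sum adjacent 2-bit counts into 4-bit fields. */
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    /* Sum adjacent 4-bit counts into bytes. */
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    /* The multiply accumulates all byte counts into the top byte. */
    return (unsigned)((x * 0x0101010101010101ULL) >> 56);
}

/* Narrower variants zero-extend and reuse the 64-bit routine. */
static unsigned popcount32_sketch(uint32_t x) { return popcount64_sketch(x); }
static unsigned popcount16_sketch(uint16_t x) { return popcount64_sketch(x); }
```

Zero-extending the 16- and 32-bit inputs keeps a single code path at the cost of a few redundant operations on the upper bits.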