Math.Round performance optimization in .NET 7

Preface

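While browsing the .NET source I came across an issue about optimizing the performance of Math.Round, mainly for calls to Round that specify a rounding mode (MidpointRounding.ToEven).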
  1. Optimize Math.Round(x, MidpointRounding.AwayFromZero/ToEven) (#64016)
Math.Round with a rounding mode
This optimization involves no JIT-related code changes.

Test code

using System;
using BenchmarkDotNet.Attributes;

namespace net6perf.Maths
{
    [DisassemblyDiagnoser(printSource: true, maxDepth: 3)]
    [MemoryDiagnoser]
    public class RoundTest
    {
        [Params(1.5, 2.5, 3.5)]
        public double Value { get; set; }

        [Benchmark]
        public double ToEvenTest() => Math.Round(Value, MidpointRounding.ToEven);


        [Benchmark]
        public double AwayFromZeroTest() => Math.Round(Value, MidpointRounding.AwayFromZero);

    }
}

Test results:

Math.Round with a rounding mode: benchmark results

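The ToEven mode shows a clear performance improvement, while the AwayFromZero mode changes very little. Looking at the source, the ToEven mode calls the Round(double) method, which carries the [Intrinsic] attribute so that the call can be replaced with a dedicated instruction.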

[Intrinsic]
public static double Round(double a)
{
    // ************************************************************************************
    // IMPORTANT: Do not change this implementation without also updating MathF.Round(float),
    //            FloatingPointUtils::round(double), and FloatingPointUtils::round(float)
    // ************************************************************************************

    // This is based on the 'Berkeley SoftFloat Release 3e' algorithm

    ulong bits = BitConverter.DoubleToUInt64Bits(a);
    int exponent = double.ExtractExponentFromBits(bits);

    if (exponent <= 0x03FE)
    {
        if ((bits << 1) == 0)
        {
            // Exactly +/- zero should return the original value
            return a;
        }

        // Any value less than or equal to 0.5 will always round to exactly zero
        // and any value greater than 0.5 will always round to exactly one. However,
        // we need to preserve the original sign for IEEE compliance.

        double result = ((exponent == 0x03FE) && (double.ExtractSignificandFromBits(bits) != 0)) ? 1.0 : 0.0;
        return CopySign(result, a);
    }

    if (exponent >= 0x0433)
    {
        // Any value greater than or equal to 2^52 cannot have a fractional part,
        // So it will always round to exactly itself.

        return a;
    }

    // The absolute value should be greater than or equal to 1.0 and less than 2^52
    Debug.Assert((0x03FF <= exponent) && (exponent <= 0x0432));

    // Determine the last bit that represents the integral portion of the value
    // and the bits representing the fractional portion

    ulong lastBitMask = 1UL << (0x0433 - exponent);
    ulong roundBitsMask = lastBitMask - 1;

    // Increment the first fractional bit, which represents the midpoint between
    // two integral values in the current window.

    bits += lastBitMask >> 1;

    if ((bits & roundBitsMask) == 0)
    {
        // If that overflowed and the rest of the fractional bits are zero
        // then we were exactly x.5 and we want to round to the even result

        bits &= ~lastBitMask;
    }
    else
    {
        // Otherwise, we just want to strip the fractional bits off, truncating
        // to the current integer value.

        bits &= ~roundBitsMask;
    }

    return BitConverter.UInt64BitsToDouble(bits);
}
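To make the mask arithmetic in the listing above easier to follow, here is a minimal sketch that reproduces only its middle path (1.0 <= |a| < 2^52). The class name `RoundBitsSketch` and the method `RoundMidRange` are made up for illustration, and the exponent is extracted with a plain shift and mask because `double.ExtractExponentFromBits` is internal to the runtime. It is not the runtime's implementation, only a trace of the same idea.

```csharp
using System;

public static class RoundBitsSketch
{
    // Hypothetical helper: rounds half to even for 1.0 <= |a| < 2^52 only,
    // mirroring the mask logic of the listing above.
    public static double RoundMidRange(double a)
    {
        ulong bits = unchecked((ulong)BitConverter.DoubleToInt64Bits(a));
        int exponent = (int)((bits >> 52) & 0x7FF); // biased 11-bit exponent

        if (exponent < 0x03FF || exponent > 0x0432)
        {
            throw new ArgumentOutOfRangeException(nameof(a), "Sketch covers 1.0 <= |a| < 2^52 only.");
        }

        ulong lastBitMask = 1UL << (0x0433 - exponent); // bit holding the last integral digit
        ulong roundBitsMask = lastBitMask - 1;          // all fractional bits

        bits += lastBitMask >> 1; // add one half, expressed in the value's own bit layout

        if ((bits & roundBitsMask) == 0)
        {
            bits &= ~lastBitMask;   // exact midpoint: clear the last integral bit to land on the even value
        }
        else
        {
            bits &= ~roundBitsMask; // otherwise just drop the fractional bits
        }

        return BitConverter.Int64BitsToDouble(unchecked((long)bits));
    }

    public static void Main()
    {
        foreach (double value in new[] { 1.5, 2.5, 3.5, 2.4, 2.6, -2.5 })
        {
            Console.WriteLine(FormattableString.Invariant($"{value} -> {RoundMidRange(value)}"));
        }
    }
}
```

For 2.5 the bits are 0x4004000000000000; adding the half bit gives 0x4008000000000000, all fractional bits are zero, and clearing the last integral bit yields 0x4000000000000000, which is 2.0.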

With that in mind, the test code is adjusted as follows:

using System;
using BenchmarkDotNet.Attributes;

namespace net6perf.Maths
{
    [DisassemblyDiagnoser(printSource: true, maxDepth: 3)]
    [MemoryDiagnoser]
    public class RoundTest
    {
        [Params(1.5, 2.5, 3.5)]
        public double Value { get; set; }

        [Benchmark]
        public double ToEvenTest() => Math.Round(Value, MidpointRounding.ToEven);


        [Benchmark]
        public double AwayFromZeroTest() => Math.Round(Value, MidpointRounding.AwayFromZero);


        [Benchmark(Baseline = true)]
        public double RoundDefault() => Math.Round(Value);
    }
}
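Using Math.Round(Value) as the baseline is reasonable because the single-argument overload rounds midpoints to the nearest even value by default, so it should return the same results as the MidpointRounding.ToEven overload. A small sketch to check this on the benchmark inputs (the class name `RoundModesDemo` and the helper `Describe` are made up for illustration):

```csharp
using System;

public static class RoundModesDemo
{
    // Hypothetical helper: formats the three results for one input
    // so they can be compared side by side (culture-invariant formatting).
    public static string Describe(double value)
    {
        double byDefault = Math.Round(value);
        double toEven = Math.Round(value, MidpointRounding.ToEven);
        double awayFromZero = Math.Round(value, MidpointRounding.AwayFromZero);
        return FormattableString.Invariant(
            $"{value}: default={byDefault}, ToEven={toEven}, AwayFromZero={awayFromZero}");
    }

    public static void Main()
    {
        // Same inputs as the [Params] attribute in the benchmark.
        foreach (double value in new[] { 1.5, 2.5, 3.5 })
        {
            Console.WriteLine(Describe(value));
        }
    }
}
```

Of the three inputs, only 2.5 separates the modes: the default and ToEven give 2, while AwayFromZero gives 3.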

Now look at the generated assembly:

; net6perf.Maths.RoundTest.ToEvenTest()
       vzeroupper
       vroundsd  xmm0,xmm0,qword ptr [rcx+8],4
       ret
; Total bytes of code 11


; net6perf.Maths.RoundTest.RoundDefault()
       vzeroupper
       vroundsd  xmm0,xmm0,qword ptr [rcx+8],4
       ret
; Total bytes of code 11
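The immediate 4 in `vroundsd` tells the instruction to use the current rounding direction from the MXCSR register, which defaults to round-to-nearest-even. The same instruction can be reached explicitly through the hardware intrinsics API; the sketch below assumes the process has not changed the MXCSR rounding control, and the class name `RoundSdSketch` and method `RoundCurrentDirection` are made up for illustration.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

public static class RoundSdSketch
{
    // Hypothetical helper: rounds with ROUNDSD imm8(4) when SSE4.1 is available,
    // otherwise falls back to Math.Round, which also rounds midpoints to even.
    public static double RoundCurrentDirection(double value)
    {
        if (Sse41.IsSupported)
        {
            Vector128<double> input = Vector128.CreateScalarUnsafe(value);
            return Sse41.RoundCurrentDirectionScalar(input).ToScalar();
        }

        return Math.Round(value);
    }

    public static void Main()
    {
        foreach (double value in new[] { 1.5, 2.5, 3.5 })
        {
            Console.WriteLine(FormattableString.Invariant($"{value} -> {RoundCurrentDirection(value)}"));
        }
    }
}
```

This is only meant to show what the two benchmark methods compile down to; in normal code, calling Math.Round directly is simpler and lets the runtime choose the instruction.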

AwayFromZero is not reduced to a single instruction in the same way, and its generated assembly is too long, so it is not shown here.

秋风 2022-03-28