Advertisement

scriptMath and closeTo()

Started by February 16, 2014 08:22 PM
6 comments, last by iraxef 10 years, 9 months ago

The article you cited (in rev 1832) looked familiar. For reference, here is what we use:


namespace RelativeFltCmp
{
    template <size_t bytes>
    struct TypeWithSize
    {
        typedef void UInt;
    };

    template <>
    struct TypeWithSize<4>
    {
        typedef uint32_t UInt;
    };

    template <>
    struct TypeWithSize<8>
    {
        typedef uint64_t UInt;
    };

    template <typename RawType>
    class FloatingPoint
    {
    public:
        typedef typename TypeWithSize<sizeof(RawType)>::UInt Bits;

        static const size_t kBitCount = 8*sizeof(RawType);
        static const size_t kFracBitCount = std::numeric_limits<RawType>::digits - 1;
        static const size_t kExpBitCount = kBitCount - 1 - kFracBitCount;

        static const Bits kSignBitMask = static_cast<Bits>(1) << (kBitCount - 1);
        static const Bits kFracBitMask = ~static_cast<Bits>(0) >> (kExpBitCount + 1);
        static const Bits kExpBitMask = ~(kSignBitMask | kFracBitMask);

        static const size_t kMaxUlps = 4;

        explicit FloatingPoint(
            const RawType& x )
            : value_(x)
        {}

        inline bool AlmostEquals(
            const FloatingPoint& rhs ) const
        {
            // check for NaN to match == behavior
            if (is_nan() || rhs.is_nan())
                return false;

            return ULP_diff(bits_, rhs.bits_) <= kMaxUlps;
        }

    private:
        inline bool is_nan() const
        {
            return ((kExpBitMask & bits_) == kExpBitMask) &&
                ((kFracBitMask & bits_) != 0);
        }

        inline Bits SignAndMagnitudeToBiased(
            const Bits& sam ) const
        {
            if (kSignBitMask & sam)
                return ~sam + 1;  // two's complement
            else
                return kSignBitMask | sam;  // * 2
        }

        inline Bits ULP_diff(
            const Bits& sam1,
            const Bits& sam2 ) const
        {
            const Bits biased1 = SignAndMagnitudeToBiased(sam1);
            const Bits biased2 = SignAndMagnitudeToBiased(sam2);

            return (biased1 >= biased2) ? (biased1 - biased2) : (biased2 - biased1);
        }

        union
        {
            RawType value_;
            Bits bits_;
        };
    };
}

static inline bool FltCmp(
    float lhs,
    float rhs )
{
    return RelativeFltCmp::FloatingPoint<float>(lhs).AlmostEquals(RelativeFltCmp::FloatingPoint<float>(rhs));
}


static inline bool FltCmp(
    double lhs,
    double rhs )
{
    return RelativeFltCmp::FloatingPoint<double>(lhs).AlmostEquals(RelativeFltCmp::FloatingPoint<double>(rhs));
}

Thanks for sharing.

AngelCode.com - game development and more - Reference DB - game developer references
AngelScript - free scripting library - BMFont - free bitmap font generator - Tower - free puzzle game

Advertisement

Actually that link (http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm) mentions that it's obsolete and points to: http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

Testing both closeTo (implemented in r1832) and the FltCmp posted above, both fail his so-called 'catastrophic cancellation' (comparing the result of a subtraction that's very close to 0 against 0; e.g. comparing 67329.242f - 67329.234f [they're 1 ulp apart] to 0.f).

His AlmostEqualUlpsAndAbs() works (see below and the attached test file), though it's not practical for all (most?) situations since you'd have to know the original numbers (the 67329.242f and 67329.234f, not just the end-result of the subtraction).


FltCmp(diff,0) = 0
FltCmp(sin(PI),0) = 0

closeTo(diff,0) = 0
closeTo(sin(PI),0) = 0

AlmostEqualUlpsAndAbs(diff,0) = 1
AlmostEqualUlpsAndAbs(sin(PI),0) = 1

That being said, closeTo() has (in the worst case) 3 fabs() and a divide, whereas FltCmp is "more likely to be efficient on architectures such as SSE which encourage the reinterpreting of floats as integers". But that's neither here nor there.

Is anyone (reading this forum) using something like AlmostEqualUlpsAndAbs() in production code?

I don't worry too much about performance. I doubt closeTo() is going to be called that many times from a script that it is going to be a causing a significant impact on the response time of the overall script execution.

I am also a bit wary about doing operations directly on the bits of the float values. Though for sure almost all platforms today use the IEEE734 standard for the float values, I shouldn't make that assumption in the library code. It would be fine to do so in application code where you are in control of on what platforms the code will be used, but with AngelScript I'm trying to be as portable as possible.

I'll definitely look into the case you mentioned above. The code must not fail to recognize these numbers are being close to each other.

AngelCode.com - game development and more - Reference DB - game developer references
AngelScript - free scripting library - BMFont - free bitmap font generator - Tower - free puzzle game

You could use the AlmostEqualRelativeAndAbs() variant from the article (which uses a relative epsilon for the away-from-0 case and not ulps, and therefore doesn't cast the float to ints). See updated attachment.


FltCmp(diff,0) = 0
FltCmp(sin(PI),0) = 0

AlmostEqualUlpsAndAbs(diff,0) = 1
AlmostEqualUlpsAndAbs(sin(PI),0) = 1

AlmostEqualRelativeAndAbs(diff,0) = 1
AlmostEqualRelativeAndAbs(sin(PI),0) = 1

closeTo(diff,0) = 0
closeTo(sin(PI),0) = 0

You've still got the issue of choosing the right close-to-0 and max-relative diffs.

My understanding so far from reading the article is that the issue lies in the fact that there is a ton of float precision near 0 and significantly less far away from 0. His example is 67329.234 and 67329.242. There are no valid floats between those two numbers, so they're 1 'ulp' away [if you interpret the floats as ints and subtract the ints]. However, if you subtract those two floats, you get .0078125000. Because of the high precision around 0, that's actually 1,006,632,960 ulps away from 0 [there are 1,006,632,960 valid floats between 0 and .0078125000]. 0.007 is much larger than any typical epsilon value [e.g. closeTo defaults of 0.0001f]. To be able to properly detect that the subtraction of those two large floats is close to 0, you need to have the floats themselves available so that you can determine the proper close-to-0 epsilon to use. Which is still depending on the user to do the right thing, but at least if they do, AlmostEqualRelativeAndAbs() does the correct thing (unlike FltCmp() and closeTo()).

Per the author's suggestion in one of the article comments, I'd default 'maxDiff' to the value of FLT_EPSILON and 'maxRelDiff' to something like 3*FLT_EPSILON or 4*FLT_EPSILON (typical epsilons). Along with a comment at the top of the function, maybe, explaining how to provide the correct 'maxDiff' for the close-to-0 case if you have the numbers available.

A float only has precision for about 7 digits. It doesn't matter what the value of the exponent is. In comparison a double has 15 digits.

Internally the binary representation for 67329.234 and 0.673929234 is much the same, the difference is in the exponent part.

The comparison for closeness must be aware of this, which is why the absolute difference cannot be used directly (unless both values are close to zero).

Another way of thinking about it, is to have the epsilon value change according to the magnitude of the numbers. If the numbers are in the order of 10000, then the epsilon should be around 0.1. If the numbers are in the order of 1, then epsilon should be around 0.00001, and so on.

AngelCode.com - game development and more - Reference DB - game developer references
AngelScript - free scripting library - BMFont - free bitmap font generator - Tower - free puzzle game

Advertisement

The idea of choosing an epsilon based on the magnitude of the original numbers is precisely what the article suggests (their version is multiplying the bigger number by FLT_EPSILON, as I do in the attached example code). The problem is when you're comparing e.g. 0.007 with 0 and at that point you don't have the original numbers (which were subtracted), which were on the order of 60000. If you looked at the magnitude of 0.007 and 0, you'd conclude that you want a tiny epsilon.. but that doesn't work for the subtraction which was done. So again, it's up to the user to choose the epsilon and if they had the original numbers available (to judge their magnitude), something like max(A,B)*FLT_EPSILON is more precise.

To follow up on my previous code posting, we're now trying something like this:


/**
 * Check whether two floats are close enough to be considered equal.
 *
 * @param    lhs                left hand size of comparison
 * @param    lhs                right hand size of comparison
 * @param    nearZeroEpsilon    absolute epsilon to use for numbers that should be
 *                              at or near 0. Default of FLT_MIN is fine for ensuring
 *                              that subnormals/denormals (dividing by which would produce
 *                              infinity) are considered equal to 0. For 'catastrophic cancellation'
 *                              situations, like comparing sin(pi) or (67329.242f - 67329.234f)
 *                              against 0, maxInput*FLT_EPSILON may be a good epsilon
 *                              (e.g. pi*FLT_EPSILON or 67329.242f*FLT_EPSILON).
 *                              See http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition
 * @param    ulpDiff            Used when numbers are not being compared against 0.
 *
 * @return   True if values are considered equal.
 */
static inline bool FltCmp(
    float lhs,
    float rhs,
    float nearZeroEpsilon = FLT_MIN,
    float ulpDiff = 4 )
{
    if ( (lhs == rhs) || ((rhs == 0.f || lhs == 0.f) && fabsf(lhs-rhs) <= nearZeroEpsilon) )
        return true;

    return RelativeFltCmp::FloatingPoint<float>(lhs).AlmostEquals(RelativeFltCmp::FloatingPoint<float>(rhs), ulpDiff);
}

And the binding for that function is now:


RegisterGlobalFunction("bool FltCmp(float, float, float = 1.175494351e-38, uint = 4)", asFUNCTIONPR(FltCmp, (float, float, float, uint), bool), asCALL_CDECL)

This topic is closed to new replies.

Advertisement