0

I'm trying to port the Accidental Noise library from C# to Lua. I encounter an issue when trying to port the FNV-1A algorithm. The result of the multiplication with the prime doesn't match when using same input values.

First I'd like to show the C# code of the algorithm:

// The "new" FNV-1A hashing
private const UInt32 FNV_32_PRIME = 0x01000193;
private const UInt32 FNV_32_INIT = 2166136261;

public static UInt32 FNV32Buffer(Int32[] uintBuffer, UInt32 len)
{
    //NOTE: Completely untested.
    var buffer = new byte[len];
    Buffer.BlockCopy(uintBuffer, 0, buffer, 0, buffer.Length);

    var hval = FNV_32_INIT;    
    for (var i = 0; i < len; i++)
    {
        hval ^= buffer[i];
        hval *= FNV_32_PRIME;
    }

    return hval;
}

This function is called as such (simplified) elsewhere in the codebase:

public static UInt32 HashCoordinates(Int32 x, Int32 y, Int32 seed)
{
    Int32[] d = { x, y, seed };
    return FNV32Buffer(d, sizeof(Int32) * 3);
}

I noticed the sizeof(Int32) result is always multiplied by the number of elements in the Int32[] array. In this case (on my machine) the result is 12, which causes the buffer size in the FNV32Buffer function to be an array of 12 bytes.

Inside the for loop we see the following:

  1. A bitwise XOR operation is performed on hval
  2. hval is multiplied by a prime number

The result of the multiply operation doesn't match with the result of my Lua implementation.

My Lua implementation is as such:

local FNV_32_PRIME = 0x01000193
local FNV_32_INIT = 0x811C9DC5

local function FNV32Buffer(buffer)
    local bytes = {}

    for _, v in ipairs(buffer) do
        local b = toBits(v, 32)
        for i = 1, 32, 8 do
            bytes[#bytes + 1] = string.sub(b, i, i + 7)
        end
    end

    local hash = FNV_32_INIT
    for i, v in ipairs(bytes) do
        hash = bit.bxor(hash, v)
        hash = hash * FNV_32_PRIME
    end

    return hash
end 

I don't supply the buffer length in my implementation as Lua's Bitwise operators always work on 32-bit signed integers.

In my implementation I create a bytes array and for each number in the buffer table I extract the bytes. When comparing the C# and Lua byte arrays I get mostly similar results:

byte # C# Lua
1 00000000 00000000
2 00000000 00000000
3 00000000 00000000
4 00000000 00000000
5 00000000 00000000
6 00000000 00000000
7 00000000 00000000
8 00000000 00000000
9 00101100 00000000
10 00000001 00000000
11 00000000 00000001
12 00000000 00101100

It seems due to endianness the byte ordering is different, but this I can change. I don't believe this has anything to do with my issue right now.

For the C# and Lua byte arrays, I loop through each byte and perform the FNV-1A algorithm on each byte.

When using the values {0, 0, 300} (x, y, seed) as input for the C# and Lua functions I get the following results after the first iteration of the FNV hashing loop is finished:

C#: 00000101_00001100_01011101_00011111 (84696351)

Lua: 01111110_10111100_11101000_10111000 (2126309560)

As can be seen the result after just the first hashing loop are very different. From debugging I can see the numbers diverge when multiplying with the prime. I believe the cause could be that Lua uses signed numbers by default, whereas the C# implementation works on unsigned integers. Or perhaps the results are different due to differences in endianness?

I did read that Lua uses unsigned integers when working with hex literals. Since FNV_32_PRIME is a hex literal, I guess it should work the same as the C# implementation, yet the end result differs.

How can I make sure the Lua implementation matches the results of the C# implementation?

Wolfgang Schreurs
  • 11,779
  • 7
  • 51
  • 92
  • 1) You are using the `bit` library. Do you run your code on LuaJIT? Or is it an external library loaded into vanilla Lua? 2) Your `bytes` array contains strings of digits 0/1. When you apply arithmetic on them, Lua treats them as decimal numbers, not binary. – Egor Skriptunoff Oct 11 '21 at 09:16
  • I believe my code runs indeed on LuaJIT (LÖVE 2D engine). Yes, you are right the arithmetic on the bytes array treats the values as decimal numbers. I could in theory issue `bit.tobit(n)` to normalise the number in 32-bit range, but it seems this still generate different results compared to C# implementation. Maybe due to endianness, maybe something else. – Wolfgang Schreurs Oct 11 '21 at 14:50

3 Answers3

2

LuaJIT supports CPU native datatypes.
64-bit values (suffixed with LL) are used to avoid precision loss of multiplication result.

-- LuaJIT 2.1 required
local ffi = require'ffi'

-- The "new" FNV-1A hashing
local function FNV32Buffer(data, size_in_bytes)
   data = ffi.cast("uint8_t*", data)
   local hval = 0x811C9DC5LL
   for j = 0, size_in_bytes - 1 do
      hval = bit.bxor(hval, data[j]) * 0x01000193LL
   end
   return tonumber(bit.band(2^32-1, hval))
end

local function HashCoordinates(x, y, seed)
   local d = ffi.new("int32_t[?]", 3, x, y, seed)
   return FNV32Buffer(d, ffi.sizeof(d))
end

print(HashCoordinates(0, 0, 300))  --> 3732851086
Egor Skriptunoff
  • 23,359
  • 2
  • 34
  • 64
  • Thanks man, what a wizardry - I would not have figured this out myself. I did see some loss of precision, but I figured I must have done something else wrong. From what I read your approach results in exact same value as my C# test program, so this should be everything I need. – Wolfgang Schreurs Oct 12 '21 at 16:39
  • 1
    If people encounter a similar issue and are using the current version of LÖVE (11.3), know the LuaJIT version (`2.0.x`) provided in 11.3 doesn't work properly with this version of LÖVE. I downloaded the current in development version of LÖVE (12.0) and build it on my Mac. LÖVE 12.0 has LuaJIT `2.1.x` support, so the above solution will work. – Wolfgang Schreurs Oct 14 '21 at 04:49
1

Arithmetic on 32 bit unsigned numbers does not necessarily produce a 32 bit number.

Not tested, but I think the result of the multiplication with the prime number should be normalized using bit.toBit() as stated in the reference you provide.

Emile
  • 2,200
  • 27
  • 36
  • That would make sense. And I do believe the result is not a normalised number. However, the problem then might be the following: the result of `bit.tobit()` will probably return the "wrong part" of the number, since it seems the endianness os Lua and C# don't match. I will try to get the "right part" of the number for further iterations and see if that works out well. – Wolfgang Schreurs Oct 11 '21 at 07:45
0

I use Lua 5.4.2, I don't know which version the bit operations were added.

Here is one implementation of 32 bit FNV-1a algorithm that seem to produce the same output as an online hash function. I had initially problems matching strings like Bärry with Utf-8 characters but this was due to the terminal mangling the input before sending it to the program. If I do FNV1A32("Bärry") then I get the same results as the website.

I couldn't use the currently accepted answer because the system I am deploying for doesn't have/allow ffi.

hash.lua

---Computes a 32 bit hash of a given string.
---See http://www.isthe.com/chongo/tech/comp/fnv/index.html#FNV-1a for details.
---@param str string
---@return integer hash32
local function FNV1A32(str)
    local octets = table.pack(str:byte(1, #str))
    local hash = 2166136261 -- 32 bit offset_basis
    for _, octet in ipairs(octets) do
        hash = hash ~ octet
        hash = hash * 16777619 -- 32 bit FNV_prime 
        hash = hash & 0xffffffff -- emulate uint32 overflow
    end
    return hash
end

-- Test it with command line 
-- >lua ./hash.lua "Hello FNV1A32"
-- c8c0c51a Hello FNV1A32
--
-- Cross checking website https://md5calc.com/hash/fnv1a32/Hello+FNV1A32
assert(#arg == 1, "Program requires one argument")
local argHash = FNV1A32(arg[1])
local hexHash = string.format("%x", argHash)
print(hexHash .. " " .. arg[1])

A little more condensed version:

local function FNV1A32(str)
    local octets = table.pack(str:byte(1, #str))
    local hash = 2166136261
    for _, octet in ipairs(octets) do
        hash = (hash ~ octet) * 16777619 & 0xffffffff
    end
    return hash
end
Statement
  • 3,888
  • 3
  • 36
  • 45