A special edition of the Rakudo Weekly News to inform you all of an exciting development in the world of the Raku Programming Language.
The implementation of hashes in MoarVM has been replaced! Nicholas Clark wrote a new hash implementation for MoarVM over the past months, one that uses less memory and less CPU time.
MoarVM had been using the open source uthash library since July 2012. At the time it was the go-to hashing library for C – full of features and flexible – so it was the logical choice. Over the years it was modified to remove the features MoarVM did not need, and with them the associated code and overhead. But the basic design necessarily remained unchanged – it was still the “classic” approach to hash tables, “collision chaining”. The internal structure is an array of linked lists – just as Perl has used since Perl 1.
This structure is simple to implement, but ultimately dates back to an era when CPUs didn’t even have caches, so it doesn’t fare too well now that minimising cache misses is the key to performance.
The basic problem that hash tables need to solve is mapping arbitrary keys, not known in advance, to something that is efficient for a CPU to manipulate – integers. Ideally you would store each key (and its associated value data) in an array, but you can’t make the array infinitely big, so you always end up with a trade-off – sometimes more than one key maps to the same integer, and you need code to handle this.
Collision chaining is the “obvious” way – a memory-efficient way to store a variable number of keys that all need to sit at the same index in the array. Operations must walk all the keys at a given index linearly, but keep the lists short (growing the array as needed) and performance is still good.
However, the downside is that the keys you walk linearly are not adjacent in memory, so you likely incur cache misses. You could replace the linked lists with arrays (at the cost of more complexity and more expensive deletes), but you still have one pointer hop and a potential cache miss.
With the increasing importance of staying CPU cache-friendly, a different approach to collision resolution is now better: open address hashing.
This is roughly equivalent to using a larger array than collision chaining would, and eliminating the linked lists. The larger array reduces collisions but doesn’t eliminate them, so you still need some strategy to handle them. Usually this means storing each key/value pair at an array index close to its ideal location, and on lookup searching all the locations where the key might be until it is either found or the possible locations are exhausted.
Currently one of the most efficient approaches to this is “Robin Hood” hashing. On updates, it moves entries around to minimise the distance of all keys from their ideal locations, not just the key being added or deleted. Usefully, this increases the chance of CPU cache hits during lookups.
Regarding hash iterators: one fun “gotcha” is that both the Raku Programming Language and Perl specify that it is acceptable to delete the key at the current hash iterator position without problems. Fortunately it was possible to adapt the insertion and iteration strategies to permit this without any extra complications or overhead.
All of this is now available in the main branch of the Rakudo compiler. If you have hash-intensive applications, please try it out with this (as yet unreleased) version of Rakudo, and let us know your findings. Reports so far indicate 10% to 15% improvements in core setting compilation as well as in roast testing. But of course, Your Mileage May Vary!