Map hashing
Rust also has another development option that allows you to make the hashing of the maps faster. This comes from the idea that when storing information in a HashMap, for example, the key gets hashed or a faster lookup. This is great, since it allows using arbitrary long and complex keys, but adds overhead when retrieving a value or inserting a new value, since the hash must be calculated.
Rust allows you to change the hashing method for a HashMap, and even create your own. Of course, usually, the best thing is to use the default hashing algorithm, since it has been thoroughly tested and avoids collisions (different keys having the same hash and overwriting one another). The default hasher for Rust is a very efficient hasher, but if you need performance and you are working with a really small HashMap or even a somehow predictable HashMap, it could make sense to use your own function or even a faster function included in Rust.
But beware—it's very risky to use one of these functions in an environment where a user can provide (or manipulate) keys. They could generate a collision and modify the value of a key they should not have access to. They could even create a denial of service attack using it.
Using a different hashing is as simple as using the with_hasher() function when creating the HashMap:
use std::collections::HashMap;
use std::collections::hash_map::RandomState;
// <u8, u8> as an example, just to make the type inference happy.
let map: HashMap<u8, u8> = HashMap::with_hasher(RandomState::new());
Currently, only RandomState is available in the standard library; the rest have been deprecated. But you can create your own by implementing the Hasher trait:
use std::hash::{BuildHasher, Hasher};
#[derive(Clone)]
struct MyHasher {
count: u64,
}
impl Hasher for MyHasher {
fn finish(&self) -> u64 {
self.count
}
fn write(&mut self, bytes: &[u8]) {
for byte in bytes {
self.count = self.count.wrapping_add(*byte as u64);
}
}
}
impl BuildHasher for MyHasher {
type Hasher = Self;
fn build_hasher(&self) -> Self::Hasher {
self.clone()
}
}
This creates the MyHasher structure, which contains a count that can be initialized as you wish. The hash function is really simple; it just adds all the bytes of the key and returns a u64 with the sum result. Generating a collision here is pretty easy: you just need to make your bytes sum the same. So [45, 23] will have the same hash as [23, 45]. But it works as an example of a hasher. The BuildHasher trait is also required, and it only needs to return an instance of a Hasher. I derived the Clone trait and just cloned it.
This can be easily used, as we saw before:
use std::collections::HashMap;
let mut map = HashMap::with_hasher(MyHasher { count: 12345 });
map.insert("Hello", "World");
This will probably be faster than the default hasher, but it will also be much, much less secure. So be careful about what hash function you use.