I have this signature for a mutable set:
open Base
open Hashable
module type MutableSet = sig
type 'a t
val contains : 'a t -> 'a -> bool
end
I want to implement the signature with HashSet using the Hashable module from the Base library.
module HashSet(H : Hashable) : MutableSet = struct
let num_buckets = 16
type 'a t = { mutable buckets : ('a list) array }
let contains s e =
let bucket_index = (H.hash e) % num_buckets in
let bucket = s.buckets.(bucket_index) in
List.exists ~f:(fun e' -> H.equal e e') bucket
end
I am getting the error
Error: Signature mismatch:
Modules do not match:
sig
type 'a t = { mutable buckets : 'a list array; }
val contains : 'a H.t t -> 'a H.t -> bool
end
is not included in
MutableSet
Values do not match:
val contains : 'a H.t t -> 'a H.t -> bool
is not included in
val contains : 'a t -> 'a -> Base.bool
I think the issue is that the type of the Hashable key is not constrained to be the same as 'a, the type of the elements that are in the set. How do I constrain the types to be the same?
The crux of the problem is the
H.equalfunction, which has type'a t -> 'a t -> bool, cf., it withH.hashwhich has type'a -> int. I think that everything went wrong, because of your wrong assumptions on what the hashable means in Base. The typeHashable.tis a record of three functions, and is defined as follows1:Therefore, any type that wants to be hashable must provide an implementation of these three functions. And although there is a module type of module Hashable it is not designed to be used as a parameter to a functor. There is only one module Hashable, that defines the interface (type class if you want) of a hashable value.
Therefore, if you need a monomorphic MutableSet for a key that is hashable, you shall write a functor, that takes a module of type
Hashable.Key.If you want to implement a polymorphic MutableSet, then you do not need to write a functor (if it is polymoprhic then it is already defined for all possible types.). You can even use the polymorphic functions from the Hashable module itself, e.g.,
Answers to the follow-up questions
1) When you need to ensure that two hashtables are using the same comparison function. For example, if you would like to merge two tables or intersect two tables, they should use the same comparison/hash functions, otherwise, the results are undefined.
2) When you need to compare two hashtables for equality.
If by "built-in" you mean primitives provided by OCaml, then the answer is, no, such hashtable have to use the polymorphic comparison primitive from the OCaml standard library.
You don't have to use the
Hashablemodule from the base library, to access to them. They are also available viaCamlorPolymorphic_comparemodules inBase. Or, if you're not using the Base or Core libraries, then thecomparefunction fromStdlibby default is polymorphic and has type'a -> 'a -> int.With all that said, I think some clarification is needed on what we say by polymorphic version. The Base's Hash_set, as well as Hashtbl, are also polymorphic data structures, as they have types
'a tand('k,'a) tcorrespondingly, which are both polymorphic in their keys. They, however, do not rely on the polymorphic comparison function but require a user to provide a comparison function during the construction. In fact, they require an implementation of thehashableinterface. Therefore, to create an empty hash table you need to pass it a module which implements it, e.g.,Where the passed module must implement the
Hashable.Keyinterface, which beyond others provide the implementation ofhashablevia theHashable.of_keyfunction. And the hashtable implementation is just storing the comparison functions in itself, e.g., roughly,I think, that given this implementation it is now more obvious when we need to compare to hashable records.
First of all, we actually have three versions. Functor, polymorphic, and one that uses the polymorphic comparison function (let's name it universal). The latter is of the least preference and should be avoided if possible.
Concerning the former two, both are good, but a polymorphic version is more versatile without involving too many compromises. Theoretically, a functor version opens more opportunities for compiler optimizations (as the comparison function could be inlined), but it comes with a price that for every key you will have a different module/type.
You can also benefit from both approaches and provide both polymorphic and monomorphic implementations (with the latter being a specialization of the former), e.g., this is how maps and sets are implemented in JS Base/Core. There is a polymorphic type for a set,
which is a binary tree coupled with a compare function, which is reflected in the set type by the
'comparator_witnesstype, so that for each comparison function a fresh new type is generated, thus preventingSet.unionet al operations between two sets that have different compare function stored in them.There is also, at the same time a
Set.Make(K : Key)functor, which creates a module that providestype t = (K.t,K.comparator_witness) settype, which can, theoretically, benefit from inlining. Moreover, each module that implementsCore.Comparable.Sand below, will also provide a.Map,.Set, etc modules, e.g.,Int.Set. Those modules are usually created via the correspondingMakefunctors (i.e.,Map.Make,Set.Make), but they open opportunities for manual specializations.1) Therefore
Hashable.equalfunction actually compares functions not values. It basically compares two type classes. And, I believe, theHashable.hashfunction is typed'a -> intaccidentally, and the type that was intended was also'a t -> int.