Use Polymorphic Variant Type in Ocaml
Intro
I have been learning and using OCaml for a while. However, I’m still quite confused about the Polymorphic variant feature. Recently, while exploring the Yojson library, I came across this feature again. After some research, I couldn’t find any comprehensive articles about polymorphic variants (the official introduction is not easy to grasp this feature IMHO), only a few related answers12. So I decided to write an article, hoping it will be helpful to you :)
Syntax
Use bracket []
to enclose all constructors, and tag each constructor with a backtick ( ` ). For example,
type either = [
| `A of int
| `B
]
When creating a value of such type, we also need to add the backtick
utop # let v = `A 1;;
val v : [> `A of int/2 ] = `A 1
Lower Bound and Upper Bound
int/2
can be found in this StackOverflow linkPreviously we defined a value v
, you may find that the type of v
is not either
but [> ...]
. What does this mean? Based on the syntax of the polymorphic variant, we know that the [ ... ]
refers to the polymorphic variant type, then what about the >
symbol?
It has a similar meaning with the >
in mathematics, that is, the polymorphic variant type can have more constructors. For example,
[`A of int | `B]
[`A of int | `B | `C]
...
Thus, the polymorphic variant is also called open variant.
Similarly, there exists [< ...]
, which means it has these constructors at most. Take the following type as an example,
[< `A of int | `B]
There can be the following three scenarios.
[`A of int]
[`B]
[`A of int | `B]
Pattern Matching
We can try pattern matching on the polymorphic variant argument of a function. For example,
utop # let foo = function
| `A -> "A"
| `B -> "B" ;;
val foo : [< `A | `B ] -> string/2 = <fun>
We can see that the OCaml infers the parameter type as [< ...]
, which is reasonable because the function cannot handle other constructors if we provide them. For example,
utop # foo `C;;
Error: This expression has type [> `C ] but an expression was expected of type
[< `A | `B ]
The second variant type does not allow tag(s) `C
Now let’s replace the case B
with case C
in the foo
function.
utop # let bar = function
| `A -> "A"
| `C -> "C"
val bar : [< `A | `C ] -> string/2 = <fun>
Then we can try to put the foo
and bar
functions in the same list. What type should this list of functions have? Let’s try inferring the type ourselves. Given that:
- The
foo
function can only handle caseA
or caseB
- The
bar
function can only handle caseA
or caseC
- Each item in the same list should have the same type
We can conclude that the list should have the ([< A] -> string) list
type, and the output from utop
confirms this conclusion.
utop # let funcs = [ foo; bar ];;
val funcs : ([< `A ] -> string/2) list/2 = [<fun>; <fun>]
[> ...]
and [< ...]
, determining the matching type.Additionally, when performing pattern matching, you can use the #some_variant
syntax to represent the entire polymorphic variant. For example,
type my_type =
[ `A
| `B
]
let process_a_b_c = function
| #my_type -> "A or B"
| `C -> "C"
Use cases
Using without declaring it first
The most intuitive use case in my opinion is that we can use the polymorphic variant type without declaring it first.
The following example demonstrates this idea.
type return_type =
| A of int
| B
val f : int -> return_val
By using the polymorphic variant, we can write the code like this.
val f : int -> [ `A of int | `B ]
Share datatype
When using a library, you may need to use its defined data types. In this case, you can list this library as a dependency. For example,
(* ./mycolor.ml *)
type rgb = Red | Green | Blue
(* ./main.ml *)
let process_color = function
| Mycolor.Red -> "Red" (* the Main module depends on Mycolor *)
| Mycolor.Green -> "Green"
| Mycolor.Blue -> "Blue"
However, if the data type is defined using a polymorphic variant, this dependency can be eliminated.
(* ./mycolor.ml *)
type rgb = [ `Red | `Green | `Blue ]
(* ./main.ml *)
let process_color = function
| `Red -> "Red" (* the dependence is gone *)
| `Green -> "Green"
| `Blue -> "Blue"
Subtype
Polymorphic variant type is particularly suitable for cases where different variant types share common parts.
For example, let’s say the common
type is the common part of foo
type and bar
type.
type common = A | B
type foo =
| Common of common (* it's kind of redundant *)
| C
| D
type bar =
| Common of common
| E
| F
Now, suppose that there is a show
function to print the above three types. In this case, you have to write three separate functions, as shown below.
let show = function
| A -> "A"
| B -> "B"
let show_foo arg =
match (arg : foo) with
| Common v -> show v
| C -> "C"
| D -> "D"
let show_bar arg =
match (arg : bar) with
| Common v -> show v
| E -> "E"
| F -> "F"
In addition, we also need to add the :foo
and :bar
to distinguish different types. However, if we use a polymorphic variant, the above code can be simplified to
type common = [ `A | `B ]
type foo = [ common | `C | `D ]
type bar = [ common | `E | `F ]
let show = function
| `A -> "A"
| `B -> "B"
| `C -> "C"
| `D -> "D"
| `E -> "E"
| `F -> "F"
Drawbacks
The polymorphic variant is quite flexible but comes with a cost3
- Complexity: The typing rule is more complex compared to the nominal variant type. You will find that the error message is more dense.
- Error-finding: It may be harder to catch the bugs.
- Efficiency: The polymorphic variant is less efficient.