RTypes Data Generators
This is a continuation of the previous post about the RTypes library.
A few days ago I found myself with a few free hours and thought it was a great chance to improve the library. I've been "dogfooding" the library in a few projects for a while now: first as a sanity checker during active development, then as a data validation layer. But since its inception I have had an itch to close the loop by being able to automatically derive not only data validators but also data generators for use with property-based testing frameworks [1].
Here is a quick demo of the feature:
    iex(1)> require RTypes.Generator, as: G
    iex(2)> g = G.make(:inet.port_number(), G.StreamData)
    iex(3)> Enum.take(g, 10)
    => [53289, 49615, 25526, 14765, 2391, 57424, 1399, 48755, 23668, 57176]
    iex(4)> g2 = G.make(:inet.ip4_address(), G.StreamData)
    iex(5)> Enum.take(g2, 10)
    => [
      {237, 57, 3, 204},
      {93, 132, 242, 86},
      {254, 226, 96, 62},
      {61, 141, 84, 51},
      {66, 182, 79, 220},
      {21, 168, 172, 155},
      {121, 240, 20, 76},
      {60, 188, 81, 14},
      {169, 112, 182, 234},
      {14, 5, 193, 161}
    ]
Two generators were constructed with the RTypes.Generator.make macro by supplying the required type; each was then asked to generate ten random values: port numbers from the first, IPv4 addresses from the second. Isn't it cool?
How it works
The idea is to walk the AST that corresponds to the provided type expression and build a data generator using existing facilities from a property-based testing framework. The code looks very similar to a "compile-to-closures" interpreter (see the previous post for some context), with the only difference being that instead of producing a chain of closures it produces a chain of data generators.
    import StreamData

    ## primitive types
    def derive({:type, _line, :integer, _args}), do: integer()
    # ...

    ## literals
    def derive({:atom, _line, term}), do: constant(term)

    ## ranges
    def derive({:type, _, :range, [{:integer, _, l}, {:integer, _, u}]}) do
      integer(l..u)
    end

    ## compound types
    def derive({:type, _line, :list, [typ]}) do
      list_of(derive(typ))
    end
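To make the "chain of generators" idea concrete without pulling in any testing framework, here is a minimal, dependency-free sketch. It is not the RTypes implementation: the module name TinyGen, the size limits, and the integer window are all assumptions made for illustration. Each clause turns a type AST node into a zero-arity function that produces a fresh random value when called.

```elixir
defmodule TinyGen do
  # Hypothetical illustration only; not part of RTypes.

  # literal atoms always produce themselves
  def derive({:atom, _line, term}), do: fn -> term end

  # integer ranges: pick uniformly between the bounds
  def derive({:type, _line, :range, [{:integer, _, l}, {:integer, _, u}]}) do
    fn -> Enum.random(l..u) end
  end

  # unbounded integers are approximated by a fixed window (an assumption)
  def derive({:type, _line, :integer, _args}) do
    fn -> Enum.random(-1_000..1_000) end
  end

  # lists: derive the element generator once, then call it 0 to 5 times
  def derive({:type, _line, :list, [typ]}) do
    elem_gen = derive(typ)
    fn -> elem_gen |> Stream.repeatedly() |> Enum.take(Enum.random(0..5)) end
  end

  # tuples: derive one generator per position
  def derive({:type, _line, :tuple, typs}) when is_list(typs) do
    gens = Enum.map(typs, &derive/1)
    fn -> gens |> Enum.map(fn g -> g.() end) |> List.to_tuple() end
  end
end

# Hand-written AST for the type `[0..255]`
byte = {:type, 0, :range, [{:integer, 0, 0}, {:integer, 0, 255}]}
gen = TinyGen.derive({:type, 0, :list, [byte]})

# Every call yields a new random list of at most five bytes
IO.inspect(gen.())
```

A real backend gets much more from the framework than random values (shrinking and size control, for instance), which is why the library delegates to StreamData or PropCheck instead of rolling its own.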
For the PropCheck backend the code looks even more like the "compile-to-closures" implementation
    import PropCheck, only: [let: 2]
    import PropCheck.BasicTypes

    # primitive types
    def derive({:type, _line, :any, _args}), do: &any/0
    def derive({:type, _line, :atom, _args}), do: &atom/0
    # ...

    # compound types
    def derive({:type, _line, :maybe_improper_list, [typ1, typ2]}) do
      g1 = derive(typ1)
      g2 = derive(typ2)

      fn ->
        let {h, t} <- {g1.(), g2.()} do
          oneof([[], [h | t]])
        end
      end
    end
    # ...
because it produces a chain of functions (closures), each returning a proper generator. Compare it with the "compile-to-closures" code
    def build({:type, _line, :atom, _args}), do: &is_atom/1
    def build({:type, _line, :integer, _args}), do: &is_integer/1
    # ...

    def build({:type, _line, :maybe_improper_list, [typ1, typ2]}) do
      typ1? = build(typ1)
      typ2? = build(typ2)

      fn
        [] -> true
        [car | cdr] -> typ1?.(car) and typ2?.(cdr)
        _ -> false
      end
    end
Note that the derive and build functions above accept an AST that corresponds to a type expression. The RTypes.make_* and RTypes.Generator.make macros do some magic to allow literal type expressions. For that magic to work, the type must be either a primitive or defined in a module that has a .beam file somewhere reachable.
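To see why a reachable .beam file matters, here is one way a type's AST can be read back from a compiled module using Erlang's :beam_lib. This is only a sketch of the general mechanism and not necessarily how RTypes does it; the TypeLookup module and fetch_type/2 are hypothetical names. It returns :error when the module cannot be found or was compiled without debug information.

```elixir
defmodule TypeLookup do
  # Hypothetical helper; not part of RTypes.
  # Returns {:ok, type_ast} for a type defined in `module`, or :error.
  def fetch_type(module, name) do
    with path when is_list(path) <- :code.which(module),
         {:ok, {^module, [abstract_code: {:raw_abstract_v1, forms}]}} <-
           :beam_lib.chunks(path, [:abstract_code]) do
      # Type definitions are stored as attributes in the abstract code
      Enum.find_value(forms, :error, fn
        {:attribute, _anno, kind, {^name, type_ast, _vars}} when kind in [:type, :opaque] ->
          {:ok, type_ast}

        _other ->
          nil
      end)
    else
      _ -> :error
    end
  end
end

# Whether this prints {:ok, ast} or :error depends on whether the
# OTP installation ships its .beam files with debug information.
IO.inspect(TypeLookup.fetch_type(:inet, :port_number))
```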
How to use it
I see the feature primarily as a testing tool. The StreamData backend makes it handy to use in fuzz testing or similar applications which require an infinite stream of random data.
    require RTypes.Generator, as: G

    packet_gen = G.make(MyModule.packet(), G.StreamData)

    G.make(:inet.port_number(), G.StreamData)
    |> Stream.map(fn port_number ->
      {:ok, sock} = open_udp_socket(some_host, port_number)
      sock
    end)
    |> Enum.each(fn sock ->
      # generate 1000 random packets and send them to socket
      Stream.map(packet_gen, &send_udp_packet(sock, &1))
      |> Enum.take(1000)

      close_socket(sock)
    end)
And here is an example of how to use it with PropCheck, e.g. to test a function for totality. Let's suppose you have a function f defined in module M
    defmodule M do
      @type f_input_type :: list()
      @type f_result_type :: pos_integer()

      @spec f(f_input_type()) :: f_result_type()
      def f(xs) do
        # ...
      end
    end
If f returns a value that belongs to f_result_type for every possible input that belongs to f_input_type, we say that the function f is total.
    defmodule MTest do
      use PropCheck
      require RTypes
      require RTypes.Generator, as: G

      property "f is total" do
        input_generator = G.make(M.f_input_type(), G.PropCheck)
        result_value? = RTypes.make_predicate(M.f_result_type())

        forall input <- input_generator.() do
          result_value?.(M.f(input))
        end
      end
    end
Conversely, if we were to test a non-total function, it should fail on some input. For instance, the hd/1 function from the standard library almost immediately fails on the empty list [].
    defmodule MTest do
      use PropCheck
      require RTypes
      require RTypes.Generator, as: G

      property "hd is total" do
        gen = G.make(list(integer()), G.PropCheck)
        int_value? = RTypes.make_predicate(integer())

        forall val <- gen.() do
          int_value?.(hd(val))
        end
      end
    end
    1) property hd is total (RTypesPropCheckTest)
       test/rtypes_propcheck_test.exs:10
       Property Elixir.RTypesPropCheckTest.property hd is total() failed. Counter-Example is:
       [[]]
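The essence of such a totality check can also be sketched without any framework. The snippet below is a rough, dependency-free approximation and not what PropCheck does internally (there is no shrinking, for example); TotalityCheck and its check/4 function are hypothetical names. It calls the function under test on generated inputs and reports the first input for which the call raises or the result fails the predicate.

```elixir
defmodule TotalityCheck do
  # Hypothetical helper; not part of RTypes or PropCheck.
  # Returns :ok, or {:counterexample, input} for the first failing input.
  def check(gen, fun, valid?, runs \\ 100) do
    Enum.find_value(1..runs, :ok, fn _ ->
      input = gen.()

      ok? =
        try do
          valid?.(fun.(input))
        rescue
          # a raised exception means the function is not defined for this input
          _ -> false
        end

      if ok?, do: nil, else: {:counterexample, input}
    end)
  end
end

# Hand-rolled generator for list(integer()): zero to three small integers
int_list = fn ->
  Stream.repeatedly(fn -> Enum.random(-10..10) end)
  |> Enum.take(Enum.random(0..3))
end

# length/1 is defined for every list, so no counterexample is found
IO.inspect(TotalityCheck.check(int_list, &length/1, &is_integer/1))

# hd/1 raises on [], which the generator is very likely to produce
IO.inspect(TotalityCheck.check(int_list, &hd/1, &is_integer/1))
```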
License issue
The PropCheck library is released under the GPL 3.0 license (as is the PropEr library on which it is based). It's totally fine to use PropCheck for testing in non-GPL projects because tests are usually not shipped with the final product. RTypes, however, is released under the Apache 2.0 license because I wanted to use it in projects where managers would be like "yeah, nah…" as soon as they hear the acronym "GPL". And because RTypes is not only a testing library but can also be used as a data-validation layer, I couldn't just use PropCheck in the release builds and keep it under the Apache 2.0 license (or could I?). The solution I came up with was to introduce a plug-in system and release a separate rtypes_propcheck library under GPL 3.0. I believe it's a fine solution because it still allows using RTypes as a run-time dependency and rtypes_propcheck as a test-only dependency.
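In a project's mix.exs that split might look roughly like the fragment below. The package names are assumed to match the library names, and the version requirements are placeholders, not real release numbers.

```elixir
# mix.exs (fragment)
defp deps do
  [
    # Apache 2.0, available at run time (placeholder version requirement)
    {:rtypes, ">= 0.0.0"},
    # GPL 3.0 plug-in, only compiled for the test environment
    {:rtypes_propcheck, ">= 0.0.0", only: :test}
  ]
end
```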
Yet maintaining both libraries simultaneously is a bit of a pain, so if anyone reading this knows how I can combine both in one package, please reach out to me on Twitter @plrants or send an email to hello@pl-rants.net. Thanks!
Footnotes:
[1] I cannot help but mention the excellent book "Property-Based Testing with PropEr, Erlang, and Elixir" by Fred Hebert, a deep dive into the topic.