Usain Boltz¶
Usain Boltz is a Python/Cython library meant to automate the random generation of tree-like structures.
The library was primarily designed to be used with the Sagemath mathematics software systems but Sagemath is no longer a dependency and the library is now perfectly usable both within and without a Sagemath environment.
Principle (non-technical)¶
The use of Usain Boltz starts with the specification of the structures we want
to sample using grammar
module. Then, optionally, we can
tell generator (Generator
) how to build the
objects, thanks to the builders mechanism.
For example, imagine that we want to sample tree-like structures
Tree := Leaf | Tree * Tree
These structures may be represented as XML strings:
<tree>
<tree>
<leaf/>
<leaf/>
</tree>
<tree>
<leaf/>
</tree>
</tree>
but also as JSON objects:
{ 'tree' : [
{ 'tree' : [ 'leaf', 'leaf' ] },
{ 'tree' : [ 'leaf' ]}
]
}
However the inherent tree structure is the same. Thus the same generator may generate both objects using different builders.
Principle (technical)¶
Usain Boltz is based on the principles of the Boltzmann method introduced in the article [DuFlLoSc04], whose goal is to provide fast and uniform random sampler for a large variety of combinatorial structures.
The starting point is to describe the structures we want to sample by a
Grammar
. Each rule of that grammar
defines a combinatorial class: a set where objects have a finite
size given by their number of
Atom
. Then the idea is to sample
objects \(\gamma\) from a class \(\mathcal{A}\) with respect
to the Boltzmann model of parameter \(x\):
where \(|\gamma|\) is the size of \(\gamma\) and \(A\) is the generating series of \(\mathcal{A}\) defined by \(A(z)=\sum_{\gamma \in \mathcal{A}} z^{|\gamma|}\).
Given that model, one can compute that the expected size of a sampled object is given by:
So a good choice of \(x\) allows to target a specific size for the
sampled objects. This is the work of the oracle
.
References
[DuFlLoSc04] | Philippe Duchon, Philippe Flajolet, Guy Louchard, and Gilles Schaeffer. 2004. Boltzmann Samplers for the Random Generation of Combinatorial Structures. Comb. Probab. Comput. 13, 4-5 (July 2004), 577-625. DOI=http://dx.doi.org/10.1017/S0963548304006315 |
Supported features¶
Usainboltz is still a work in progress and not all the combinators from the
grammar
module are currently fully supported.
This table describes what should be expected to work and what is still under
development.
Grammar | Oracle | Generator | |
---|---|---|---|
z, ε, +, × | ✅ | ✅ | ✅ |
Seq | ✅ | ✅ | ✅ |
Set | ✅ | partial (see 1 below) | ✅ |
Cycle | ✅ | partial (see 1 below) | ❌ |
MSet | ✅ | partial (see 1 below) | partial (see 3 below) |
UCycle | ✅ | partial (see 2 below) | ❌ |
Limitations:
- Paganini (our default oracle) does not support constrained sets, multisets
and cycles (that is iterated rules with
leq=n
and/orgeq=m)
) unlessn = m
. This can be worked around by providing your own oracle, e.g. by passing a dictionary of float values tobuild_oracle()
. - Paganini (our default oracle) only implements unlabelled cycles
(
UCycle
) of fixed size (that is cycles of the formUCycle(<expr), leq=n, geq=n)
). This can be worked around by providing your own oracle, e.g. by passing a dictionary of float values tobuild_oracle()
. - At the moment, the kinds of multisets we support are exactly those that are supported by paganini (see point 1 above).
Modules¶
usainboltz.grammar |
Context-free grammars for Boltzmann generation. |
usainboltz.generator |
Boltzmann generator for Context-free grammars. |
usainboltz.oracle |
Various oracle implementations for Boltzmann sampling. |
Examples¶
usainboltz.examples.expnums |
A Boltzmann sampler for Set partitions (Bell numbers). |
usainboltz.examples.knr19 |
|
usainboltz.examples.mondrian |
Mondrian paintings generation. |
usainboltz.examples.polya_trees |
A Boltzmann sampler for Pólya n-ary trees. |
usainboltz.examples.alcohols |
A Boltzmann sampler for alcohols |
Sage Examples¶
usainboltz.sage_examples.binary_tree |
Binary trees |
usainboltz.sage_examples.cayley_tree |
Cayley trees |
usainboltz.sage_examples.composition |
Integer compositions |
usainboltz.sage_examples.dyck_word |
Dyck words. |