usainboltz.generator

Boltzmann generator for Context-free grammars.

This module provides functions for generating combinatorial objects (i.e. objects described by a combinatorial specification, see usainboltz.grammar) according to the Boltzmann distribution.

Given an unlabelled combinatorial class A, the Boltzmann distribution of parameter \(x\) is such that an object of size n is drawn with the probability \(\frac{x^n}{A(x)}\) where \(A(x)\) denotes the ordinary generating function of A. For labelled classes, this probability is set to \(\frac{x^n}{n!A(x)}\) where \(A(x)\) denotes the exponential generating function of A. See [DuFlLoSc04] for details.

By default, the objects produced by the generator are nested tuples of strings (the atoms). For instance ('z', ('z', 'epsilon', 'epsilon'), ('z', 'epsilon', 'epsilon')) is a balanced binary tree with 3 internal nodes (z) and 4 leaves (e). To alter this behaviour and generate other types of objects, you can specify a builder function for each type of symbol in the grammar. The behaviour of the builders is that they are applied in bottom up order to the structure “on the fly” during generation. For instance, in order to generate Dyck words using the grammar for binary trees, one can use a builder that return "" for each leaf "(" + left child + ")" + right child for each node. The builders will receive a tuple for each product, a string for each atom and builders for unions should be computed using the union_builder() helper. See the example below for the case of Dyck words.

Examples

>>> from usainboltz import *
>>> from usainboltz.generator import rng_seed
>>> rng_seed(0xDEADBEEFCA7B0172)

Both binary trees and Dyck words can be generated from the following grammar

>>> epsilon, z, B = Epsilon(), Atom(), RuleName("B")
>>> grammar = Grammar({B: epsilon + z * B * B})
>>> generator = Generator(grammar, B, singular=True)

In order to build Dyck words, one can interpret epsilon as the empty word and nodes as the derivation: D -> ( D ) D

>>> def leaf_builder(_):
...     return ""
>>> def node_builder(tuple):
...     _, left, right = tuple
...     return "(" + left + ")" + right
>>> generator.set_builder(B, union_builder(leaf_builder, node_builder))
>>> res = generator.sample((10, 20))
>>> dyck_word = res.obj
>>> dyck_word
'(())(((((()(((()())))))))())()'
>>> len(dyck_word) in range(20, 41)
True

If on the contrary we want to see trees as S-expressions, we can interpret epsilon as the leaf "leaf" and nodes as (node left_child right_child)

>>> def leaf_builder(_):
...     return "leaf"
>>> def node_builder(tuple):
...     _, left, right = tuple
...     return "(node {} {})".format(left, right)
>>> generator.set_builder(B, union_builder(leaf_builder, node_builder))
>>> res = generator.sample((10, 20))
>>> print(res.obj)
(node
  (node
    leaf
    (node
      leaf
      (node
        (node
          (node
            (node leaf leaf)
            leaf)
          (node
            (node
                leaf
                (node
                  (node
                    (node leaf leaf)
                    (node
                      (node
                        leaf
                        (node
                          (node leaf leaf)
                          leaf))
                      leaf))
                  leaf))
            leaf))
        (node
          (node leaf leaf)
          leaf))))
  leaf)

Note that the builders mechanism can also be used to compute some statistics on the fly without explicitly building the whole structure. For instance the following example illustrates how to generate the height of a tree following the Boltzmann distribution without building the tree.

>>> def leaf_height(_):
...     return 0
>>> def node_height(tuple):
...     _, left, right = tuple
...     return 1 + max(left, right)
>>> generator.set_builder(B, union_builder(leaf_height, node_height))
>>> res = generator.sample((10, 20))
>>> res.obj
13
>>> res.sizes[Atom()]
16

For labelled grammar, the procedure is the same. In this case, the default builders also output the labelling, they do not only generate the structure.

Example

>>> z = Atom()
>>> e, M = Epsilon(), RuleName()
>>> g = Grammar({M: e + z * M + z * M * M}, labelled=True)
>>> generator = Generator(g, M, singular=True)
>>> res = generator.sample((5,10))  # random
>>> tree = res.obj
>>> tree
(4, (5, (0, (6, (3, (7, (1, 'epsilon', 'epsilon'), 'epsilon'), 'epsilon')), (2, 'epsilon', 'epsilon')), 'epsilon'))

Functions

rng_seed
union_builder Factory for generating builders for union rules.

Classes

Generator High level interface for Boltzmann samplers.
Mapping

Exceptions

GeneratorConfigError
class usainboltz.generator.Generator

High level interface for Boltzmann samplers.

__init__

Make a Generator out of a grammar.

Parameters:
  • grammar (Grammar) – a combinatorial grammar
  • rule_name (RuleName) – the rule name of the symbol to generate.
  • singular (bool) – if set, do singular sampling (use the singularity of the generating functions as parameter).
  • expectations (Dict[Symbol,number]) – this is passed to the oracle to skew the distribution in order to target some specific values for the expected number of the specified symbols. See OracleFromPaganini.tuning() for more details.
  • oracle (Oracle) – an oracle for computing the values of some generating functions. If not supplied, a default generic oracle that should work for most use-cases is automatically generated.

Examples:

>>> from usainboltz import *

Some examples using the grammar for binary trees >>> z, B = Atom(), RuleName() >>> grammar = Grammar({B: Epsilon() + z * B * B})

Typical use: >>> generator = Generator(grammar, B, singular=True)

Since their is only one symbol in the grammar, the second argument can be omitted: >>> generator = Generator(grammar, singular=True)

get_builder

Retrieve the current builder for a non-terminal symbol.

Parameters:non_terminal (RuleName) – the name of the non-terminal symbol.
Returns:
the current builder currently bound to the rule name
passed as argument.
Return type:function
sample

Generate a term of the grammar given the prescribed sizes.

Parameters:sizes_windows (Tuple[int,int]] | Dict[(Atom|Marker),Tuple[int,int]]) – for atoms (resp. markers) for which a size window is specified, the generator will only generate objects whose number of those atoms (resp. markers) are in the specified size windows. For instance a generator.sample({z: (100, 200)}) will only generate objects with a number of z belonging to the interval [100, 200].
set_builder

Set the builder for a non-terminal symbol.

Parameters:
  • non_terminal (RuleName) – the name of the non-terminal symbol.
  • func (function) – the builder.
exception usainboltz.generator.GeneratorConfigError
__init__

Initialize self. See help(type(self)) for accurate signature.

args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class usainboltz.generator.Mapping
__init__
nb_markers
nb_rules
usainboltz.generator.rng_seed()
usainboltz.generator.union_builder()

Factory for generating builders for union rules.

The recommended way to write a builder for a union rule is to use this helper: write a auxilliary builder for each component of the union and compose them with union_builder.

Parameters:builders (List[function]) – builder functions, one for each component of the union.
Returns:
a builder function for a disjoint union. The resulting
function applies one of the functions passed as arguments to its input depending of the component of the union its argument belongs to.
Return type:function

Examples

Assume the symbol D is defined by Union(A, B, C), then defining a builder for D looks like:

>>> def build_A(args):
...     # Do something
...     pass
>>> def build_B(args):
...     # Do something
...     pass
>>> def build_C(args):
...     # Do something
...     pass
>>> build_D = union_builder(build_A, build_B, build_C)

For instance, for binary trees defined by B = Union(leaf, Product(z, B, B)), this could be:

>>> def build_leaf(_):
...     return BinaryTree()
>>> def build_node(args):
...     z, left, right = args
...     return BinaryTree([left, right])
>>> build_binarytree = union_builder(build_leaf, build_node)