Usain Boltz =========== Usain Boltz is a Python/Cython library meant to automate the random generation of tree-like structures. The library was primarily designed to be used with the `Sagemath `_ mathematics software systems but Sagemath is no longer a dependency and the library is now perfectly usable both within and without a Sagemath environment. Principle (non-technical) ------------------------- The use of Usain Boltz starts with the specification of the structures we want to sample using :py:mod:`~usainboltz.grammar` module. Then, optionally, we can tell generator (:py:class:`~usainboltz.generator.Generator`) how to build the objects, thanks to the builders mechanism. For example, imagine that we want to sample tree-like structures :: Tree := Leaf | Tree * Tree These structures may be represented as XML strings: :: but also as JSON objects: :: { 'tree' : [ { 'tree' : [ 'leaf', 'leaf' ] }, { 'tree' : [ 'leaf' ]} ] } However the inherent tree structure is the same. Thus the *same* generator may generate both objects using different builders. Principle (technical) --------------------- Usain Boltz is based on the principles of the *Boltzmann method* introduced in the article [DuFlLoSc04]_, whose goal is to provide fast and uniform random sampler for a large variety of combinatorial structures. The starting point is to describe the structures we want to sample by a :py:class:`~usainboltz.grammar.Grammar`. Each rule of that grammar defines a *combinatorial class*: a set where objects have a finite size given by their number of :py:class:`~usainboltz.grammar.Atom`. Then the idea is to sample objects :math:`\gamma` from a class :math:`\mathcal{A}` with respect to the *Boltzmann model of parameter* :math:`x`: .. math:: \mathbb{P}(\gamma) = \frac{x^{|\gamma|}}{A(x)} where :math:`|\gamma|` is the size of :math:`\gamma` and :math:`A` is the generating series of :math:`\mathcal{A}` defined by :math:`A(z)=\sum_{\gamma \in \mathcal{A}} z^{|\gamma|}`. Given that model, one can compute that the expected size of a sampled object is given by: .. math:: \mathbb{E}[|\gamma|] = x \cdot \frac{A'(x)}{A(x)} So a good choice of :math:`x` allows to target a specific size for the sampled objects. This is the work of the :py:mod:`~usainboltz.oracle`. .. rubric:: References .. [DuFlLoSc04] Philippe Duchon, Philippe Flajolet, Guy Louchard, and Gilles Schaeffer. 2004. Boltzmann Samplers for the Random Generation of Combinatorial Structures. Comb. Probab. Comput. 13, 4-5 (July 2004), 577-625. DOI=http://dx.doi.org/10.1017/S0963548304006315 Supported features ------------------ Usainboltz is still a work in progress and not all the combinators from the :py:mod:`~usainboltz.grammar` module are currently fully supported. This table describes what should be expected to work and what is still under development. .. csv-table:: List of currently supported features :header: "", "Grammar", "Oracle", "Generator" "z, ε, +, ×", "✅", "✅", "✅" "Seq", "✅", "✅", "✅" "Set", "✅", "partial (see 1 below)", "✅" "Cycle", "✅", "partial (see 1 below)", "❌" "MSet", "✅", "partial (see 1 below)", "partial (see 3 below)" "UCycle", "✅", "partial (see 2 below)", "❌" Limitations: 1. Paganini (our default oracle) does not support constrained sets, multisets and cycles (that is iterated rules with :code:`leq=n` and/or :code:`geq=m)`) unless :code:`n = m`. This can be worked around by providing your own oracle, e.g. by passing a dictionary of float values to :py:func:`~usainboltz.oracle.build_oracle`. 2. Paganini (our default oracle) only implements unlabelled cycles (:code:`UCycle`) of fixed size (that is cycles of the form :code:`UCycle(