Workflow

The following figure illustrates the mumott workflow. Here, classes are shown in blue, input parameters and data in orange, and output data in green.

# dot -Tsvg workflow.dot -o workflow.svg
digraph g {
  graph [ fontname = "helvetica", fontsize = 12.0, rankdir = "TB", bgcolor = "transparent" ];
  edge [ fontname = "helvetica", fontsize = 12.0, penwidth = 1.5 ]
  node [ fontname = "helvetica", fontsize = 12.0, fontcolor = black, shape = ellipse, color = "#a0c9e5", style = filled];

  Data [ color="#ffca9c", label="Measured data and metadata\nincluding geometry information", shape=box];
  Resolution [ color="#ffca9c", label="Bandwidth or resolution", shape=box, target="_top"];
  DataContainer [ label="DataContainer", href="../moduleref/data_handling.html#mumott.data_handling.DataContainer", target="_top" ];
  Geometry [ label="Geometry", href="../moduleref/data_handling.html#mumott.data_handling.Geometry", target="_top" ];
  Projector [ label="Projector", href="../moduleref/projectors.html", target="_top" ];
  ResidualCalculator [ label="ResidualCalculator", href="../moduleref/residual_calculators.html", target="_top"];
  Regularizer [ label="Regularizer", href="../moduleref/regularizers.html", target="_top"];
  BasisSet [ label="BasisSet", href="../moduleref/basis_sets.html", target="_top" ];
  LossFunction [ label="LossFunction", href="../moduleref/loss_functions.html", target="_top" ];
  Optimizer [ label="Optimizer", href="../moduleref/optimizers.html", target="_top" ]
  Output [ shape=rectangle, color="#a2daa2", label="Tensor field properties\n(anisotropy, orientation ...)", fontcolor=black, href="../tutorial/reconstruct_and_visualizer.html", target="_top"];

  Data -> DataContainer
  DataContainer -> ResidualCalculator
  DataContainer -> Geometry
  Geometry -> Projector
  Resolution -> BasisSet
  Projector -> ResidualCalculator
  BasisSet -> ResidualCalculator
  Regularizer -> LossFunction [label="Attached"]
  ResidualCalculator -> LossFunction
  LossFunction -> Optimizer
  Optimizer -> Output [label="Processed via\n BasisSet"]
}

A typical workflow involves the following steps:

  1. First, the measured data and its metadata are loaded into a DataContainer object, which allows one to access, inspect, and modify the data in various ways, as shown in the tutorial on loading and inspecting data. Note that it is possible to skip the full data when instantiating a DataContainer object. In that case only geometry and diode data are read, which is much faster and sufficient for alignment.

  2. The DataContainer object also holds the geometry information of the data. This information is stored as a Geometry object in the geometry property of the DataContainer object.

  3. The geometry information is then used to set up a projector object, e.g., SAXSProjector. Projector objects allow one to transform tensor fields from three-dimensional space to projection space.

  4. Next, a basis set object, e.g., SphericalHarmonics, is set up.

  5. One can then combine the projector object, the basis set, and the data from the DataContainer object to set up a residual calculator object. Residual calculator objects hold the coefficients that need to be optimized and allow one to compute the residuals of the current representation.

  6. To find the optimal coefficients, a loss function object is set up using, e.g., the SquaredLoss or HuberLoss classes. The loss function can include one or several regularization terms, which are defined by regularizer objects such as L1Norm, L2Norm, or TotalVariation.

  7. The loss function object is then handed over to an optimizer object, such as LBFGS or GradientDescent, which updates the coefficients of the residual calculator object.

  8. The optimized coefficients can then be processed via the basis set object to generate tensor field properties such as the anisotropy or the orientation distribution, returned as a dict.

  9. The function dict_to_h5 can be used to convert this dictionary of properties into an h5 file, to be further processed or visualized.
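The way the objects in steps 3 to 7 depend on one another can be illustrated with a toy sketch in plain Python. All Toy* classes and their method names below are hypothetical stand-ins written for this illustration. They are not the mumott classes and should not be read as the library's API. The sketch assumes a tiny linear problem in which the "projector" is a fixed matrix, and it only mirrors the structure: a residual calculator holds the coefficients, a loss function wraps it and carries a regularizer, and an optimizer updates the coefficients.

```python
# Toy stand-ins only: none of these classes belong to mumott.


class ToyProjector:
    """Maps coefficients to projection space via a fixed matrix."""

    def __init__(self, matrix):
        self.matrix = matrix

    def forward(self, coefficients):
        return [sum(a * c for a, c in zip(row, coefficients))
                for row in self.matrix]

    def adjoint(self, residuals):
        n_columns = len(self.matrix[0])
        return [sum(row[j] * r for row, r in zip(self.matrix, residuals))
                for j in range(n_columns)]


class ToyResidualCalculator:
    """Holds the coefficients and compares their projection with the data."""

    def __init__(self, projector, data, n_coefficients):
        self.projector = projector
        self.data = data
        self.coefficients = [0.0] * n_coefficients

    def get_residuals(self):
        projected = self.projector.forward(self.coefficients)
        return [p - d for p, d in zip(projected, self.data)]

    def get_gradient(self):
        return self.projector.adjoint(self.get_residuals())


class ToyL2Regularizer:
    """Penalizes large coefficients: 0.5 * sum(c**2)."""

    def get_norm_and_gradient(self, coefficients):
        return 0.5 * sum(c * c for c in coefficients), list(coefficients)


class ToySquaredLoss:
    """Squared residual loss plus weighted regularization terms."""

    def __init__(self, residual_calculator):
        self.residual_calculator = residual_calculator
        self.regularizers = []

    def add_regularizer(self, regularizer, weight):
        self.regularizers.append((regularizer, weight))

    def get_loss_and_gradient(self):
        calculator = self.residual_calculator
        loss = 0.5 * sum(r * r for r in calculator.get_residuals())
        gradient = calculator.get_gradient()
        for regularizer, weight in self.regularizers:
            norm, reg_gradient = regularizer.get_norm_and_gradient(
                calculator.coefficients)
            loss += weight * norm
            gradient = [g + weight * rg
                        for g, rg in zip(gradient, reg_gradient)]
        return loss, gradient


class ToyGradientDescent:
    """Plain fixed-step gradient descent on the loss function."""

    def __init__(self, loss_function, step_size=0.1, maxiter=500):
        self.loss_function = loss_function
        self.step_size = step_size
        self.maxiter = maxiter

    def optimize(self):
        calculator = self.loss_function.residual_calculator
        for _ in range(self.maxiter):
            _, gradient = self.loss_function.get_loss_and_gradient()
            calculator.coefficients = [
                c - self.step_size * g
                for c, g in zip(calculator.coefficients, gradient)]
        loss, _ = self.loss_function.get_loss_and_gradient()
        return dict(x=list(calculator.coefficients), loss=loss)


# Data generated from the coefficients (1, 2) without noise.
projector = ToyProjector([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
residual_calculator = ToyResidualCalculator(projector, [1.0, 2.0, 3.0], 2)
loss_function = ToySquaredLoss(residual_calculator)
loss_function.add_regularizer(ToyL2Regularizer(), weight=1e-3)
result = ToyGradientDescent(loss_function).optimize()
print(result)
```

With a small regularization weight, the recovered coefficients stay close to the values used to generate the data. Increasing the weight pulls them toward zero, which is the trade-off a regularizer introduces.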

Pipelines

Reconstruction pipelines wrap most of a reconstruction workflow in a single function call. A pipeline links together a typical series of objects, and individual components can be replaced with others preferred by the user.

# dot -Tsvg workflow.dot -o workflow.svg
digraph g {
  graph [ fontname = "helvetica", fontsize = 12.0, rankdir = "TB", bgcolor = "transparent" ];
  edge [ fontname = "helvetica", fontsize = 12.0, penwidth = 1.5 ]
  node [ fontname = "helvetica", fontsize = 12.0, fontcolor = black, shape = ellipse, color = "#a0c9e5", style = filled];

  Data [ color="#ffca9c", label="Measured data and metadata\nincluding geometry information", shape=box];
  UserParams [ color="#ffca9c", label="User preferences", shape=box, target="_top"];
  DataContainer [ label="DataContainer", href="../moduleref/data_handling.html#mumott.data_handling.DataContainer", target="_top" ];
  Pipeline [ label="Pipeline", href="../moduleref/pipelines.html", target="_top"];
  Components [label="Pipeline component objects", shape=box, href="../moduleref/methods.html", target="_top"];
  Result [shape=box, label="Reconstruction", target="_top"];
  Output [ shape=rectangle, color="#a2daa2", label="Tensor field properties\n(anisotropy, orientation ...)", fontcolor=black, href="../tutorial/reconstruct_and_visualizer.html", target="_top"];

  Data -> DataContainer
  DataContainer -> Pipeline
  UserParams -> Pipeline
  Pipeline -> Components
  Pipeline -> Result
  Components -> Output
  Result -> Output [label="Processed via\n BasisSet"]
}

The user interaction with the pipeline can be understood as follows:

  1. A DataContainer instance is created from the input data, as in a standard workflow.

  2. The DataContainer is passed to a pipeline function, e.g., the SIGTT pipeline function, along with user-specified parameters as keyword arguments.

  3. For example, one might want to set the regularization weight for the Laplacian regularizer (using the regularization_weight keyword argument), or one might want to replace the default SAXSProjector with the GPU-based SAXSProjectorCUDA (using the Projector keyword argument).

  4. The SIGTT pipeline executes and returns a dict that contains the entry 'result' with the optimized coefficients. In addition, it contains the entries optimizer, loss_function, residual_calculator, basis_set, and projector, each holding the instance of the respective object used in the pipeline.

  5. The get_output method of the basis set can then be used to generate tensor field properties, as in the standard workflow.
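The pattern described in these steps (defaults that can be overridden through keyword arguments, and a returned dict that exposes both the result and the objects used) can be sketched in plain Python. The function toy_pipeline, the Toy* classes, the dict used in place of a DataContainer, and the 'mean_coefficient' output are all hypothetical and exist only for this illustration. Only the keyword names Projector and regularization_weight and the returned entries 'result', projector, and basis_set are taken from the description above; the actual signatures should be checked in the pipeline module reference.

```python
# Toy stand-ins only: this is not the SIGTT pipeline or any mumott API.


class ToyProjector:
    """Scales each coefficient by a per-element geometry factor."""

    def __init__(self, geometry):
        self.geometry = list(geometry)

    def forward(self, coefficients):
        return [g * c for g, c in zip(self.geometry, coefficients)]

    def adjoint(self, residuals):
        return [g * r for g, r in zip(self.geometry, residuals)]


class ToyAlternativeProjector(ToyProjector):
    """Drop-in replacement with the same interface (think: a GPU variant)."""


class ToyBasisSet:
    """Turns coefficients into a dict of derived properties."""

    def get_output(self, coefficients):
        return dict(mean_coefficient=sum(coefficients) / len(coefficients))


def toy_pipeline(data_container, **kwargs):
    # Defaults are used unless the caller overrides them by keyword.
    Projector = kwargs.get('Projector', ToyProjector)
    weight = kwargs.get('regularization_weight', 1e-2)
    maxiter = kwargs.get('maxiter', 300)
    step_size = kwargs.get('step_size', 0.2)

    projector = Projector(data_container['geometry'])
    basis_set = ToyBasisSet()
    data = data_container['data']

    # Fixed-step gradient descent on a regularized squared loss.
    x = [0.0] * len(data)
    for _ in range(maxiter):
        residuals = [p - d for p, d in zip(projector.forward(x), data)]
        gradient = [g + weight * c
                    for g, c in zip(projector.adjoint(residuals), x)]
        x = [c - step_size * g for c, g in zip(x, gradient)]

    # Return the result together with the component instances that were used.
    return dict(result=dict(x=x), projector=projector, basis_set=basis_set)


# A plain dict stands in for the data container in this sketch.
data_container = dict(geometry=[1.0, 2.0], data=[1.0, 4.0])
output = toy_pipeline(data_container,
                      regularization_weight=0.0,
                      Projector=ToyAlternativeProjector)
coefficients = output['result']['x']
properties = output['basis_set'].get_output(coefficients)
print(coefficients, properties)
```

Returning the component instances alongside the result lets the caller inspect or reuse them, for example to post-process the coefficients with the same basis set that was used during the optimization.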