Loading and inspecting data

The first step in the reconstruction stage is to load the aligned data from file. For this purpose mumott provides the DataContainer class. Objects of this class are initialized by reading data from a file, and then hold all relevant data in one place. This tutorial illustrates this functionality and demonstrates how the data can be queried and transformed after loading.

The simulated data used in this tutorial can be obtained using, e.g., wget with

wget https://zenodo.org/records/7326784/files/saxstt_dataset_M.h5
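
If wget is not available, the same file can be fetched from Python. The following is a minimal sketch using only the standard library; the helper name fetch_dataset and its target_dir parameter are illustrative assumptions and not part of mumott.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URL taken from the wget command above.
DATA_URL = "https://zenodo.org/records/7326784/files/saxstt_dataset_M.h5"


def fetch_dataset(url: str = DATA_URL, target_dir: str = ".") -> Path:
    """Download the file at `url` into `target_dir` unless it already exists.

    Hypothetical helper for this tutorial; returns the local path.
    """
    target = Path(target_dir) / url.rsplit("/", 1)[-1]
    if not target.exists():
        urlretrieve(url, str(target))
    return target
```

Calling fetch_dataset() once before creating the DataContainer would place saxstt_dataset_M.h5 in the working directory; repeated calls skip the download.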

We start by loading the data into a DataContainer object.

[1]:
from mumott.data_handling import DataContainer
from mumott import Geometry
INFO:Setting the number of threads to 8
INFO:Setting numba log level to WARNING.

The DataContainer object

We can load the data file into a data container as follows.

[2]:
dc = DataContainer('saxstt_dataset_M.h5')
INFO:Rotation matrix generated from inner and outer angles, along with inner and outer rotation axis vectors. Rotation and tilt angles assumed to be in radians.
mumott/data_handling/data_container.py:227: DeprecationWarning: Entry name rotations is deprecated. Use inner_angle instead.
  _deprecated_key_warning('rotations')
mumott/data_handling/data_container.py:236: DeprecationWarning: Entry name tilts is deprecated. Use outer_angle instead.
  _deprecated_key_warning('tilts')
mumott/data_handling/data_container.py:268: DeprecationWarning: Entry name offset_j is deprecated. Use j_offset instead.
  _deprecated_key_warning('offset_j')
mumott/data_handling/data_container.py:278: DeprecationWarning: Entry name offset_k is deprecated. Use k_offset instead.
  _deprecated_key_warning('offset_k')
INFO:No sample geometry information was found. Default mumott geometry assumed.
INFO:No detector geometry information was found. Default mumott geometry assumed.

There are now various options to query the data in the container. We can for example print the data container, which produces a string representation. The only data shown here is whether a transmission correction has been applied or not. In many cases, the transmission correction has already been performed at an earlier stage and is therefore not necessary here.

[3]:
print(dc)
==========================================================================
                              DataContainer
--------------------------------------------------------------------------
Corrected for transmission : False
==========================================================================

In a Jupyter environment one can also generate a more nicely formatted version of an object using the display command.

Tip: If an object is “invoked” directly on the last line of a cell, the display command is implied, i.e., display(obj) and obj lead to the same output. Below we use this shortcut to display objects.

In this example, the data container contains 417 projections. At this point, the data has not been corrected for transmission, which should be done at some point for experimental data. In this example, however, we use simulated data and therefore the correction is not required.

We can get a more detailed view of the data by inspecting the projections property.

[4]:
dc.projections
[4]:

ProjectionStack

Field               Size              Data
data                (417, 50, 50, 8)  4385fb (hash)
diode               (417, 50, 50)     430284 (hash)
weights             (417, 50, 50, 8)  0e62ec (hash)
Number of pixels j  1                 50
Number of pixels k  1                 50

In the present example, each projection consists of \(50\times50\) pixels. Each pixel is recorded at 8 different detector angles, such that each projection comprises \(50\times50\times8=20,000\) data points.
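
The shape arithmetic can be checked with a plain NumPy array of the same shape as the data field shown above. This is only a stand-in filled with zeros, not the mumott ProjectionStack itself.

```python
import numpy as np

# Dimensions read off the ProjectionStack summary above.
n_projections, n_j, n_k, n_angles = 417, 50, 50, 8

# Stand-in array with the same layout as the `data` field.
stack = np.zeros((n_projections, n_j, n_k, n_angles), dtype=np.float32)

single = stack[0]     # one projection
print(single.shape)   # (50, 50, 8)
print(single.size)    # 20000 data points per projection
print(stack.size)     # 8340000 data points in total
```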

Inspecting individual projections

The projections member provides a list-like view of the data that allows us to inspect individual projections directly. We can, e.g., check the projection at index 10.

[5]:
projections = dc.projections
projections[10]
[5]:

Projection

Field        Size         Data
data         (50, 50, 8)  315ec3 (hash)
diode        (50, 50)     67a88d (hash)
weights      (50, 50, 8)  68e177 (hash)
rotation     (3, 3)       [[ 0.54064 0.84125 0. ] [-0.84125 0.54064 0. ] [ 0. 0. 1. ]]
j_offset     1            0.00
k_offset     1            0.00
inner_angle  1            1.00
outer_angle  1            0.00
inner_axis   (3,)         [ 0. 0. -1.]
outer_axis   (3,)         [1. 0. 0.]

Here, we can see the rotation matrix of this projection, and we can tell that it does not have any j_offset or k_offset, since the simulated data is already aligned.
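
The displayed rotation matrix is consistent with a rotation by the inner angle about the inner axis [0, 0, -1], with the outer angle being zero. The sketch below checks this with the standard Rodrigues formula. It assumes a right-handed rotation about the given axis and does not reproduce how mumott composes inner and outer rotations internally; the function name rotation_about_axis is illustrative.

```python
import numpy as np


def rotation_about_axis(axis, angle):
    """Rodrigues formula: rotation by `angle` (radians) about `axis`."""
    axis = np.asarray(axis, dtype=float)
    x, y, z = axis / np.linalg.norm(axis)
    # Cross-product (skew-symmetric) matrix of the unit axis.
    K = np.array([[0.0, -z, y],
                  [z, 0.0, -x],
                  [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


# Matrix as displayed for the projection above (rounded to 5 digits).
displayed = np.array([[0.54064, 0.84125, 0.0],
                      [-0.84125, 0.54064, 0.0],
                      [0.0, 0.0, 1.0]])

# Recover the angle from the matrix entries rather than from the
# rounded inner_angle field.
angle = np.arctan2(displayed[0, 1], displayed[0, 0])
R = rotation_about_axis([0.0, 0.0, -1.0], angle)

print(np.round(angle, 2))                    # 1.0, matching inner_angle
print(np.allclose(R, displayed, atol=1e-4))  # True
```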

Projection attributes can be modified directly.

[6]:
projections[10].j_offset = 1.0
display(projections[10])

Projection

Field        Size         Data
data         (50, 50, 8)  315ec3 (hash)
diode        (50, 50)     67a88d (hash)
weights      (50, 50, 8)  68e177 (hash)
rotation     (3, 3)       [[ 0.54064 0.84125 0. ] [-0.84125 0.54064 0. ] [ 0. 0. 1. ]]
j_offset     1            1.00
k_offset     1            0.00
inner_angle  1            1.00
outer_angle  1            0.00
inner_axis   (3,)         [ 0. 0. -1.]
outer_axis   (3,)         [1. 0. 0.]

If we want, we can remove this particular projection from the stack using the del statement. It is possible to keep a reference to the original projection object, as shown in the following cell.

[7]:
f = projections[10]
del projections[10]
display(f)

Projection

Field        Size         Data
data         (50, 50, 8)  315ec3 (hash)
diode        (50, 50)     67a88d (hash)
weights      (50, 50, 8)  68e177 (hash)
rotation     (3, 3)       [[ 0.54064 0.84125 0. ] [-0.84125 0.54064 0. ] [ 0. 0. 1. ]]
j_offset     1            1.00
k_offset     1            0.00
inner_angle  1            1.00
outer_angle  1            0.00
inner_axis   (3,)         [ 0. 0. -1.]
outer_axis   (3,)         [1. 0. 0.]

After the deletion, index 10 of the stack refers to a different projection, which can be recognized by the hashes of its data and its rotation matrix. Because we kept a reference to the removed projection, we are still able to display it for comparison, as done above. The diode and weights hashes do not change from one projection to the next, because these arrays only contain the value 1. and therefore generate the same hash.
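
The remark about identical hashes can be illustrated with a content hash over array bytes. The sketch below uses SHA-1 from hashlib purely as a stand-in; it is an assumption for illustration and not the hash function mumott uses, so the printed values will not match those in the tables.

```python
import hashlib

import numpy as np


def short_hash(array):
    """First six hex digits of a SHA-1 digest of the array's raw bytes."""
    raw = np.ascontiguousarray(array).tobytes()
    return hashlib.sha1(raw).hexdigest()[:6]


# Two separate all-ones arrays, standing in for the diode of two projections.
diode_a = np.ones((50, 50))
diode_b = np.ones((50, 50))

# Two arrays with different content, standing in for measured data.
data_a = np.random.default_rng(0).random((50, 50, 8))
data_b = np.random.default_rng(1).random((50, 50, 8))

print(short_hash(diode_a) == short_hash(diode_b))  # True: same content
print(short_hash(data_a) == short_hash(data_b))    # False: different content
```

Note that arrays of different shape, such as diode (50, 50) and weights (50, 50, 8), hash differently even if every element is 1., since the number of bytes differs.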

The projections object behaves like a list in general, with methods like append() and insert().
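
The list semantics used above can be reproduced with a plain Python list. The strings below are placeholders for projection objects; this only illustrates the indexing behaviour and makes no claim about mumott internals.

```python
# Placeholder "projections", labeled by their original index.
stack = [f"projection_{i}" for i in range(417)]

kept = stack[10]   # keep a reference before deleting
del stack[10]      # later entries shift down by one

print(len(stack))  # 416
print(kept)        # projection_10
print(stack[10])   # projection_11

# insert() and append() work as for any list.
stack.insert(10, kept)
stack.append("projection_extra")
print(len(stack))  # 418
```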

Geometry

DataContainer has a geometry property, which is attached to the projections. It contains information about the geometry of each projection as well as the overall experimental geometry. Removing a projection automatically removes the corresponding offsets and rotations, so the geometry shown below describes the 416 projections that remain after the deletion above.
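
One way to picture this coupling is a container whose deletion also removes the matching per-projection geometry entries. The classes ToyGeometry and ToyStack below are hypothetical and written only for this illustration; they are not how mumott implements ProjectionStack or Geometry.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGeometry:
    """Hypothetical per-projection geometry kept as parallel lists."""
    rotations: list = field(default_factory=list)
    j_offsets: list = field(default_factory=list)
    k_offsets: list = field(default_factory=list)


class ToyStack:
    """Hypothetical stack that keeps data and geometry in sync."""

    def __init__(self):
        self._data = []
        self.geometry = ToyGeometry()

    def append(self, data, rotation, j_offset=0.0, k_offset=0.0):
        self._data.append(data)
        self.geometry.rotations.append(rotation)
        self.geometry.j_offsets.append(j_offset)
        self.geometry.k_offsets.append(k_offset)

    def __delitem__(self, index):
        # Remove the data and the matching geometry entries together.
        del self._data[index]
        del self.geometry.rotations[index]
        del self.geometry.j_offsets[index]
        del self.geometry.k_offsets[index]

    def __len__(self):
        return len(self._data)


toy = ToyStack()
for i in range(417):
    toy.append(data=f"data_{i}", rotation=f"rotation_{i}")

del toy[10]
print(len(toy), len(toy.geometry.rotations))  # 416 416
print(toy.geometry.rotations[10])             # rotation_11
```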

[8]:
geo = dc.geometry
print(geo)
display(geo)
--------------------------------------------------------------------------
                                 Geometry
--------------------------------------------------------------------------
hash_rotations     : c89ad7
hash_j_offsets     : e60e9b
hash_k_offsets     : e60e9b
p_direction_0      : [0. 1. 0.]
j_direction_0      : [1. 0. 0.]
k_direction_0      : [0. 0. 1.]
inner_axis         : [ 0.  0. -1.]
outer_axis         : [1. 0. 0.]
hash_inner_angles  : 82fbc3
hash_outer_angles  : 86aa3c
hash_inner_axes    : 46623b
hash_outer_axes    : 66971c
detector_direction_origin : [1. 0. 0.]
detector_direction_positive_90 : [0. 0. 1.]
two_theta          : 0.00°
projection_shape   : [50 50]
volume_shape       : [50 50 50]
detector_angles    : [0.    ... 2.749]
--------------------------------------------------------------------------

Geometry

Field                           Size  Data
rotations                       416   c89ad7 (hash)
j_offsets                       416   e60e9b (hash)
k_offsets                       416   e60e9b (hash)
p_direction_0                   3     [0. 1. 0.]
j_direction_0                   3     [1. 0. 0.]
k_direction_0                   3     [0. 0. 1.]
inner_axis                      3     [ 0. 0. -1.]
outer_axis                      3     [1. 0. 0.]
inner_angles                    416   82fbc3 (hash)
outer_angles                    416   86aa3c (hash)
inner_axes                      416   46623b (hash)
outer_axes                      416   66971c (hash)
detector_direction_origin       3     [1. 0. 0.]
detector_direction_positive_90  3     [0. 0. 1.]
two_theta                       1     0.0°
projection_shape                2     [50 50]
volume_shape                    3     [50 50 50]
detector_angles                 8     [0. ... 2.75]
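
The detector_angles entry lists 8 values running from 0 to about 2.749. These end points are consistent with 8 angles spaced evenly over the half-open interval from 0 to π, although the intermediate values are elided in the output, so the even spacing is an assumption here. A short NumPy check:

```python
import numpy as np

# Assumption: 8 evenly spaced angles on [0, pi), in radians.
detector_angles = np.linspace(0.0, np.pi, 8, endpoint=False)

print(len(detector_angles))           # 8
print(detector_angles[0])             # 0.0
print(np.round(detector_angles[-1], 3))  # 2.749
```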

There are two equivalent ways to access rotations, j_offsets and k_offsets per projection in a geometry object.

[9]:
print(geo.rotations[10])
print(geo[10].rotation)
[[ 0.45822652  0.88883545  0.        ]
 [-0.88883545  0.45822652  0.        ]
 [ 0.          0.          1.        ]]
[[ 0.45822652  0.88883545  0.        ]
 [-0.88883545  0.45822652  0.        ]
 [ 0.          0.          1.        ]]

The Geometry object also has read and write methods, which allow for the complete recreation of a geometry object. We can verify this by comparing the hash values of the original and the recreated object, which derive only from their member data. As an alternative to calling read, one can simply pass the path to the file when instantiating a Geometry object.

[10]:
geo.write('test.geo')
new_geo = Geometry()
new_geo.read('test.geo')
print(hex(hash(geo))[2:8], hex(hash(new_geo))[2:8])
new_geo = Geometry('test.geo')
print(hex(hash(geo))[2:8], hex(hash(new_geo))[2:8])
d28ef4 d28ef4
d28ef4 d28ef4
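
The idea of verifying a write/read round trip through a content-derived hash can be sketched independently of mumott. The example below stores a small dictionary as JSON and uses SHA-1; both the file format and the hash are assumptions made for illustration and do not correspond to the format written by Geometry.write or to the hash mumott computes.

```python
import hashlib
import json
import os
import tempfile


def content_hash(geometry: dict) -> str:
    """Hash that depends only on the stored values, not on object identity."""
    payload = json.dumps(geometry, sort_keys=True).encode()
    return hashlib.sha1(payload).hexdigest()[:6]


# A few fields copied from the Geometry output above.
geometry = {
    "inner_axis": [0.0, 0.0, -1.0],
    "outer_axis": [1.0, 0.0, 0.0],
    "projection_shape": [50, 50],
    "volume_shape": [50, 50, 50],
}

# Write to a temporary file and read it back.
path = os.path.join(tempfile.mkdtemp(), "toy_geometry.json")
with open(path, "w") as handle:
    json.dump(geometry, handle)
with open(path) as handle:
    restored = json.load(handle)

print(content_hash(geometry) == content_hash(restored))  # True
```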

Skipping data

It is possible to skip the data when loading data files. This will create a projection stack without data, but with a full Geometry object.

[11]:
dc = DataContainer(data_path='saxstt_dataset_M.h5', skip_data=True)
display(dc)
INFO:Rotation matrix generated from inner and outer angles, along with inner and outer rotation axis vectors. Rotation and tilt angles assumed to be in radians.
INFO:No sample geometry information was found. Default mumott geometry assumed.
INFO:No detector geometry information was found. Default mumott geometry assumed.

DataContainer

Field                       Size
Number of projections       417
Corrected for transmission  False