# Loading and inspecting data

The first step in the reconstruction stage is to load the aligned data from file. For this purpose `mumott` provides the `DataContainer` class. Objects of this class are initialized by reading data from a file, and then hold all relevant data in one place. This tutorial illustrates this functionality and demonstrates how the data can be queried and transformed after loading.

The simulated data used in this tutorial can be obtained using, e.g., `wget` with

```
wget https://zenodo.org/record/7326784/files/saxstt_dataset_M.h5
```

We start by loading the data into a `DataContainer` object.

```
from mumott.data_handling import DataContainer
from mumott import Geometry
```

```
INFO:Setting the number of threads to 8
INFO:Setting numba log level to WARNING.
```

## The `DataContainer` object

We can load the data file into a data container as follows.

```
dc = DataContainer('saxstt_dataset_M.h5')
```

```
INFO:Rotation matrix generated from inner and outer angles, along with inner and outer rotation axis vectors. Rotation and tilt angles assumed to be in radians.
```

```
mumott/data_handling/data_container.py:227: DeprecationWarning: Entry name rotations is deprecated. Use inner_angle instead.
_deprecated_key_warning('rotations')
mumott/data_handling/data_container.py:236: DeprecationWarning: Entry name tilts is deprecated. Use outer_angle instead.
_deprecated_key_warning('tilts')
mumott/data_handling/data_container.py:268: DeprecationWarning: Entry name offset_j is deprecated. Use j_offset instead.
_deprecated_key_warning('offset_j')
mumott/data_handling/data_container.py:278: DeprecationWarning: Entry name offset_k is deprecated. Use k_offset instead.
_deprecated_key_warning('offset_k')
```

```
INFO:No sample geometry information was found. Default mumott geometry assumed.
INFO:No detector geometry information was found. Default mumott geometry assumed.
```

There are now various options to query the data in the container. We can for example print the data container, which produces a string representation. The only data shown here is whether a transmission correction has been applied or not. In many cases, the transmission correction has already been performed at an earlier stage and is therefore not necessary here.

```
print(dc)
```

```
==========================================================================
DataContainer
--------------------------------------------------------------------------
Corrected for transmission : False
==========================================================================
```

In a Jupyter environment one can also generate a more nicely formatted version using the `display` command.

Tip: If an object is “invoked” directly and on the last line of a cell, the `display` command is implied, i.e., `display(obj)` and `obj` lead to the same output. Below we use this shortcut to display objects.

In this example, the data container contains 417 projections. At this point, the data has not been corrected for transmission, which should be done at some point for experimental data. In this example, however, we use simulated data and therefore the correction is not required.

We can get a more detailed view of the data by inspecting the `projections` property.

```
dc.projections
```

### ProjectionStack

| Field | Size | Data |
|---|---|---|
| data | (417, 50, 50, 8) | 4385fb (hash) |
| diode | (417, 50, 50) | 430284 (hash) |
| weights | (417, 50, 50, 8) | 0e62ec (hash) |
| Number of pixels j | 1 | 50 |
| Number of pixels k | 1 | 50 |

In the present example, each projection consists of \(50\times50\) pixels. For each pixel there are 8 different detector angles, such that each projection comprises \(50\times50\times8=20,000\) data points.
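The size arithmetic can be checked with a small NumPy sketch. The zero-filled array below is only a stand-in with the shape reported for `data` in the table above; it is not read from the data file, and the axis interpretation in the comment is an assumption based on the text.

```python
import numpy as np

# Stand-in array with the shape reported for `data`
# (assumed axis order: projection, pixel j, pixel k, detector angle).
data_shape = (417, 50, 50, 8)
stack = np.zeros(data_shape, dtype=np.float32)

# A single projection is one slice along the first axis.
single = stack[0]
print(single.shape)  # (50, 50, 8)
print(single.size)   # 20000 data points per projection
print(stack.size)    # 8340000 data points in total
```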

## Inspecting individual projections

The `projections` member provides a list-like view of the data that allows us to inspect individual projections directly. We can, e.g., check the projection at index 10.

```
projections = dc.projections
projections[10]
```

### Projection

| Field | Size | Data |
|---|---|---|
| data | (50, 50, 8) | 315ec3 (hash) |
| diode | (50, 50) | 67a88d (hash) |
| weights | (50, 50, 8) | 68e177 (hash) |
| rotation | (3, 3) | [[ 0.54064 0.84125 0. ] [-0.84125 0.54064 0. ] [ 0. 0. 1. ]] |
| j_offset | 1 | 0.00 |
| k_offset | 1 | 0.00 |
| inner_angle | 1 | 1.00 |
| outer_angle | 1 | 0.00 |
| inner_axis | (3,) | [ 0. 0. -1.] |
| outer_axis | (3,) | [1. 0. 0.] |

Here, we can see the rotation matrix of this projection, and we can tell that it does not have any `j_offset` or `k_offset`, since simulated data is already aligned.
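The displayed rotation matrix is consistent with a rotation by `inner_angle` about `inner_axis`. The following is a minimal sketch using Rodrigues' formula in plain NumPy. The helper `axis_angle_matrix` is hypothetical and not part of `mumott`, and the sketch only covers the case shown here, where `outer_angle` is zero; how `mumott` combines inner and outer rotations in general is not addressed. Since the displayed angle is rounded to two decimals, the result agrees with the table only to about three decimals.

```python
import numpy as np


def axis_angle_matrix(axis, angle):
    """Rotation matrix for a rotation by `angle` (radians) about `axis`.

    Hypothetical helper based on Rodrigues' formula; not a mumott function.
    """
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    # Cross-product (skew-symmetric) matrix of the unit axis.
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


inner_axis = [0.0, 0.0, -1.0]  # value shown in the table
inner_angle = 1.0              # value shown in the table, rounded
R = axis_angle_matrix(inner_axis, inner_angle)
print(np.round(R, 3))
```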

Projection attributes can be modified directly.

```
projections[10].j_offset = 1.0
display(projections[10])
```

### Projection

| Field | Size | Data |
|---|---|---|
| data | (50, 50, 8) | 315ec3 (hash) |
| diode | (50, 50) | 67a88d (hash) |
| weights | (50, 50, 8) | 68e177 (hash) |
| rotation | (3, 3) | [[ 0.54064 0.84125 0. ] [-0.84125 0.54064 0. ] [ 0. 0. 1. ]] |
| j_offset | 1 | 1.00 |
| k_offset | 1 | 0.00 |
| inner_angle | 1 | 1.00 |
| outer_angle | 1 | 0.00 |
| inner_axis | (3,) | [ 0. 0. -1.] |
| outer_axis | (3,) | [1. 0. 0.] |

If we want, we can remove this particular projection from `projections` using the `del` statement. It is possible to keep a reference to the original `Projection` object, as shown by the following cell.

```
f = projections[10]
del projections[10]
display(f)
```

### Projection

| Field | Size | Data |
|---|---|---|
| data | (50, 50, 8) | 315ec3 (hash) |
| diode | (50, 50) | 67a88d (hash) |
| weights | (50, 50, 8) | 68e177 (hash) |
| rotation | (3, 3) | [[ 0.54064 0.84125 0. ] [-0.84125 0.54064 0. ] [ 0. 0. 1. ]] |
| j_offset | 1 | 1.00 |
| k_offset | 1 | 0.00 |
| inner_angle | 1 | 1.00 |
| outer_angle | 1 | 0.00 |
| inner_axis | (3,) | [ 0. 0. -1.] |
| outer_axis | (3,) | [1. 0. 0.] |

We can tell from the hashes of the data and the rotation matrices that index 10 now holds a different projection. We kept a reference to the removed projection, so we are able to display it again for comparison. The diode and weights have the same hashes because these arrays only contain the value `1.` and therefore generate the same hash.

In general, `projections` behaves like a list, with methods like `append()` and `insert()`.
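The list-like behaviour described above can be illustrated with an ordinary Python list. This is only an analogy: `FakeProjection` is a hypothetical stand-in class, the list is not a `mumott` projection stack, and it is an assumption that the stack's `insert()` takes the same arguments as `list.insert()`.

```python
from dataclasses import dataclass


@dataclass
class FakeProjection:
    """Hypothetical stand-in for a projection; it only carries an offset."""
    j_offset: float = 0.0


# Plain list used as an analogy for the projection stack.
stack = [FakeProjection() for _ in range(12)]

stack[10].j_offset = 1.0   # modify an element in place
kept = stack[10]           # keep a reference before deleting
del stack[10]              # later elements shift down by one index

print(len(stack))          # 11
print(kept.j_offset)       # 1.0, the removed object is still alive
print(stack[10].j_offset)  # 0.0, index 10 now holds a different object

stack.insert(10, kept)     # put the removed object back at its old index
print(stack[10] is kept)   # True
```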

## Geometry

`DataContainer` has a `geometry` property, which is attached to `projections`. It contains information about the geometry of each projection as well as the overall experimental geometry, and removal of a projection automatically removes the corresponding offsets and rotations. Thus, the geometry now describes the 416 projections that remain after the deletion above.

```
geo = dc.geometry
print(geo)
display(geo)
```

```
--------------------------------------------------------------------------
Geometry
--------------------------------------------------------------------------
hash_rotations : c89ad7
hash_j_offsets : e60e9b
hash_k_offsets : e60e9b
p_direction_0 : [0. 1. 0.]
j_direction_0 : [1. 0. 0.]
k_direction_0 : [0. 0. 1.]
inner_axis : [ 0. 0. -1.]
outer_axis : [1. 0. 0.]
hash_inner_angles : 82fbc3
hash_outer_angles : 86aa3c
hash_inner_axes : 46623b
hash_outer_axes : 66971c
detector_direction_origin : [1. 0. 0.]
detector_direction_positive_90 : [0. 0. 1.]
two_theta : 0.00°
projection_shape : [50 50]
volume_shape : [50 50 50]
detector_angles : [0. ... 2.749]
--------------------------------------------------------------------------
```

### Geometry

| Field | Size | Data |
|---|---|---|
| rotations | 416 | c89ad7 (hash) |
| j_offsets | 416 | e60e9b (hash) |
| k_offsets | 416 | e60e9b (hash) |
| p_direction_0 | 3 | [0. 1. 0.] |
| j_direction_0 | 3 | [1. 0. 0.] |
| k_direction_0 | 3 | [0. 0. 1.] |
| inner_axis | 3 | [ 0. 0. -1.] |
| outer_axis | 3 | [1. 0. 0.] |
| inner_angles | 416 | 82fbc3 (hash) |
| outer_angles | 416 | 86aa3c (hash) |
| inner_axes | 416 | 46623b (hash) |
| outer_axes | 416 | 66971c (hash) |
| detector_direction_origin | 3 | [1. 0. 0.] |
| detector_direction_positive_90 | 3 | [0. 0. 1.] |
| two_theta | 1 | $0.0^{\circ}$ |
| projection_shape | 2 | [50 50] |
| volume_shape | 3 | [50 50 50] |
| detector_angles | 8 | [0. ... 2.75] |

There are two equivalent ways to access `rotations`, `j_offsets` and `k_offsets` per projection in a `geometry` object.

```
print(geo.rotations[10])
print(geo[10].rotation)
```

```
[[ 0.45822652 0.88883545 0. ]
[-0.88883545 0.45822652 0. ]
[ 0. 0. 1. ]]
[[ 0.45822652 0.88883545 0. ]
[-0.88883545 0.45822652 0. ]
[ 0. 0. 1. ]]
```

The `Geometry` object also has `read` and `write` methods, which allow for the complete recreation of a `geometry` object. We can verify this by comparing their `hash` values, which derive only from their member data. Equivalently to `read`, one can simply pass the path to the file when instantiating a `Geometry` object.

```
geo.write('test.geo')
new_geo = Geometry()
new_geo.read('test.geo')
print(hex(hash(geo))[2:8], hex(hash(new_geo))[2:8])
new_geo = Geometry('test.geo')
print(hex(hash(geo))[2:8], hex(hash(new_geo))[2:8])
```

```
d28ef4 d28ef4
d28ef4 d28ef4
```
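The expression `hex(hash(obj))[2:8]` used above shortens a hash to six hexadecimal digits for printing. The sketch below shows what the slice does on an arbitrary integer, which is chosen for illustration and is not a real `mumott` hash. Note that Python's `hash` may return a negative integer, in which case the slice starts inside the `-0x` prefix; the comparison remains meaningful because both objects are treated identically.

```python
value = 0xabcdef123456     # arbitrary integer, not a real mumott hash
print(hex(value))          # 0xabcdef123456
print(hex(value)[2:8])     # abcdef, the first six digits after the 0x prefix

negative = -value
print(hex(negative))       # -0xabcdef123456
print(hex(negative)[2:8])  # xabcde, the slice now includes part of the prefix
```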

## Skipping data

It is possible to skip data when loading data files. This creates a `projections` object without data, but with a full `Geometry` object.

```
dc = DataContainer(data_path='saxstt_dataset_M.h5', skip_data=True)
display(dc)
```

```
INFO:Rotation matrix generated from inner and outer angles, along with inner and outer rotation axis vectors. Rotation and tilt angles assumed to be in radians.
INFO:No sample geometry information was found. Default mumott geometry assumed.
INFO:No detector geometry information was found. Default mumott geometry assumed.
```

### DataContainer

| Field | Size |
|---|---|
| Number of projections | 417 |
| Corrected for transmission | False |
