European XFEL Python data tools

karabo_data is a Python library for accessing and working with data produced at European XFEL.


karabo_data has been renamed to EXtra-data (for data access) and EXtra-geom (for detector geometry and image assembly). Please use those packages for new code.


karabo_data is available on our Anaconda installation on the Maxwell cluster:

module load exfel exfel_anaconda3

You can also install it from PyPI to use in other environments with Python 3.5 or later:

pip install karabo_data

If you get a permissions error, add the --user flag to that command.


Open a run or a file - see Opening files for more:

from karabo_data import open_run, RunDirectory, H5File

# Find a run on the Maxwell cluster
run = open_run(proposal=700000, run=1)

# Open a run with a directory path
run = RunDirectory("/gpfs/exfel/exp/XMPL/201750/p700000/raw/r0001")

# Open an individual file
file = H5File("RAW-R0017-DA01-S00000.h5")

After this step, you’ll use the same methods to get data whether you opened a run or a file.
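The directory path passed to RunDirectory above appears to encode the proposal and run numbers with zero padding (p700000, r0001). As a minimal sketch, assuming that layout holds in general, a path can be built from its parts as shown below. The helper run_directory_path and its parameter names are hypothetical and not part of karabo_data; the layout is inferred only from the single example path shown here. On the Maxwell cluster, open_run is the simpler choice.

```python
from pathlib import PurePosixPath


def run_directory_path(instrument, cycle, proposal, run,
                       data="raw", root="/gpfs/exfel/exp"):
    """Hypothetical helper: build a run directory path.

    Assumes the layout <root>/<instrument>/<cycle>/p<proposal>/<data>/r<run>,
    with the proposal padded to 6 digits and the run padded to 4 digits,
    as in the example path in the text.
    """
    return str(
        PurePosixPath(root)
        / instrument
        / str(cycle)
        / "p{:06d}".format(proposal)
        / data
        / "r{:04d}".format(run)
    )


# Rebuild the example path from its components
path = run_directory_path("XMPL", 201750, 700000, 1)
print(path)
```

The resulting string can then be passed to RunDirectory in the same way as the literal path above.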

Load data into memory - see Getting data by source & key for more:

# Get a labelled array
arr = run.get_array("SA3_XTD10_PES/ADC/1:network", "digitizers.channel_4_A.raw.samples")

# Get a pandas dataframe of 1D fields
df = run.get_dataframe(fields=[
    ("*_XGM/*", "*.i[xy]Pos"),
    ("*_XGM/*", "*.photonFlux")
])

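The field patterns above use shell-style wildcards: * matches any run of characters and [xy] matches a single x or y. The sketch below shows how such patterns select names, using Python's standard fnmatch module as a stand-in. The source and key names in it are made up for illustration, and karabo_data's own matching may differ in details.

```python
from fnmatch import fnmatchcase

# Made-up source and key names, for illustration only
sources = [
    "SA1_XTD2_XGM/DOOCS/MAIN",
    "SPB_XTD9_XGM/DOOCS/MAIN",
    "SA3_XTD10_PES/ADC/1:network",
]
keys = [
    "beamPosition.ixPos",
    "beamPosition.iyPos",
    "pulseEnergy.photonFlux",
    "pulseEnergy.wavelengthUsed",
]


def match(names, pattern):
    """Return the names that match a shell-style glob pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


xgm_sources = match(sources, "*_XGM/*")   # the two XGM sources
pos_keys = match(keys, "*.i[xy]Pos")      # ixPos and iyPos
flux_keys = match(keys, "*.photonFlux")   # photonFlux only

print(xgm_sources, pos_keys, flux_keys)
```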
Iterate through data for each pulse train - see Getting data by train for more:

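# Select the detector sources, then loop over the trains one at a time
for train_id, data in run.select("*/DET/*", "image.data").trains():
    mod0 = data["FXE_DET_LPD1M-1/DET/0CH0:xtdf"]["image.data"]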
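Each iteration yields a train ID together with a nested dictionary, indexed first by source name and then by key. The sketch below imitates that structure with hand-built placeholder values, so it runs without any data files. It assumes that a source can be absent from a given train's dictionary and therefore checks before indexing. The list mock_trains, the helper module_data, the key name image.data and all values are illustrative assumptions, not real karabo_data output.

```python
# Stand-in for the (train_id, data) pairs produced by iterating over trains.
# All train IDs and values here are placeholders, not real detector data.
mock_trains = [
    (10000, {"FXE_DET_LPD1M-1/DET/0CH0:xtdf": {"image.data": [[0, 1], [2, 3]]}}),
    (10001, {}),  # a train in which this module recorded nothing
    (10002, {"FXE_DET_LPD1M-1/DET/0CH0:xtdf": {"image.data": [[4, 5], [6, 7]]}}),
]


def module_data(trains, source, key):
    """Yield (train_id, value) for trains that contain the given source and key."""
    for train_id, data in trains:
        if source in data and key in data[source]:
            yield train_id, data[source][key]


found = list(module_data(mock_trains, "FXE_DET_LPD1M-1/DET/0CH0:xtdf", "image.data"))

for train_id, value in found:
    print(train_id, value)
```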

These are not the only ways to get data: Reading data files describes various other options. karabo_data also has classes to work with detector geometry, described in AGIPD, LPD & DSSC Geometry.
