arviz_base.dict_to_dataset

arviz_base.dict_to_dataset(data, *, attrs=None, inference_library=None, coords=None, dims=None, sample_dims=None, index_origin=None, skip_event_dims=False, check_conventions=True)

Convert a dictionary of numpy arrays to an xarray.Dataset.

The conversion considers some ArviZ conventions and adds extra attributes, so it is similar to initializing an xarray.Dataset but not equivalent.

Parameters:
data : dict of {hashable: array_like}

Data to convert. Keys are variable names.

attrs : dict, optional

JSON-like arbitrary metadata to attach to the dataset, in addition to default attributes added by make_attrs.

Note

This function does not check that the attributes are serializable, so the resulting Dataset might fail to serialize, or might serialize only to some backends.

inference_library : module, optional

Library used for performing inference. Will be included in the xarray.Dataset attributes.

coords : dict of {hashable: array_like}, optional

Coordinates for the dataset.

dims : dict of {hashable: iterable of hashable}, optional

Dimensions of each variable. Keys are variable names; values are lists of dimension names.

sample_dims : iterable of hashable, optional

Dimensions that should be assumed to be present in all variables. If a variable's dimensions do not include them, they are added as the dimensions corresponding to its leading axes.

index_origin : int, optional

Passed to generate_dims_coords.

skip_event_dims : bool, optional

Passed to generate_dims_coords.

check_conventions : bool, optional

Check ArviZ conventions. Per the ArviZ schema, some dimension names have a specific meaning, so the dimension naming step can catch inconsistencies in how they are used.

Returns:
xarray.Dataset

Examples

Generate a Dataset with two variables using sample_dims:

import arviz_base as az
import numpy as np
rng = np.random.default_rng(2)
az.dict_to_dataset(
    {"a": rng.normal(size=(4, 100)), "b": rng.normal(size=(4, 100))},
    sample_dims=["chain", "draw"],
)
<xarray.Dataset> Size: 7kB
Dimensions:  (chain: 4, draw: 100)
Coordinates:
  * chain    (chain) int64 32B 0 1 2 3
  * draw     (draw) int64 800B 0 1 2 3 4 5 6 7 8 ... 91 92 93 94 95 96 97 98 99
Data variables:
    a        (chain, draw) float64 3kB 0.1891 -0.5227 -0.4131 ... 1.264 0.8073
    b        (chain, draw) float64 3kB -1.239 1.781 0.2446 ... -1.577 -0.4203
Attributes:
    created_at:                 2024-04-02T17:06:16.572067+00:00
    creation_library:           ArviZ
    creation_library_version:   0.1.0
    creation_library_language:  Python

Generate a Dataset with the chain and draw dimensions in different positions. The dimensions for a are set to “group” and “chain”, so sample_dims only prepends the “draw” dimension, because “chain” is already there.

az.dict_to_dataset(
    {"a": rng.normal(size=(10, 5, 4)), "b": rng.normal(size=(10, 4))},
    dims={"a": ["group", "chain"]},
    sample_dims=["draw", "chain"],
)
<xarray.Dataset> Size: 2kB
Dimensions:  (draw: 10, group: 5, chain: 4)
Coordinates:
  * draw     (draw) int64 80B 0 1 2 3 4 5 6 7 8 9
  * group    (group) int64 40B 0 1 2 3 4
  * chain    (chain) int64 32B 0 1 2 3
Data variables:
    a        (draw, group, chain) float64 2kB -1.002 0.1678 ... -1.669 -0.4702
    b        (draw, chain) float64 320B -0.3802 -2.451 -0.4693 ... 0.7267 -1.639
Attributes:
    created_at:                 2024-04-02T17:06:16.602827+00:00
    creation_library:           ArviZ
    creation_library_version:   0.1.0
    creation_library_language:  Python
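
The dimension order in both examples is consistent with a simple rule: any sample dimension not already listed for a variable is prepended, in the order given by sample_dims. The pure-Python sketch below illustrates that rule only. resolve_dims is a hypothetical helper, not part of arviz_base, and the library's actual implementation may differ.

```python
def resolve_dims(var_dims, sample_dims):
    """Hypothetical helper: prepend the sample dims missing from var_dims."""
    # Keep the order of sample_dims for the dimensions that have to be added
    missing = [dim for dim in sample_dims if dim not in var_dims]
    return missing + list(var_dims)


# Variable "a" from the second example: "chain" is already present
print(resolve_dims(["group", "chain"], ["draw", "chain"]))
# -> ['draw', 'group', 'chain']

# Variable "b" from the second example: no dims given, so both are added
print(resolve_dims([], ["draw", "chain"]))
# -> ['draw', 'chain']
```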