causalcompass.datasets.missing.generate_missing_var

causalcompass.datasets.missing.generate_missing_var(p, T, lag=3, sparsity=0.2, beta_value=1.0, sd=0.1, burn_in=100, seed=0, missing_config=None, interp='zoh')

Generate data from a vector autoregressive (VAR) model, mask entries according to a missingness pattern, and fill the masked entries by interpolation.

References

https://github.com/jarrycyx/UNN

Parameters:
  • p (int) – Number of variables

  • T (int) – Number of time points

  • lag (int, default 3) – Number of lags in the VAR model

  • sparsity (float, default 0.2) – Sparsity of the causal graph

  • beta_value (float, default 1.0) – Coefficient value

  • sd (float, default 0.1) – Noise standard deviation

  • burn_in (int, default 100) – Number of initial time points discarded as burn-in

  • seed (int, default 0) – Random seed

  • missing_config (dict, default None) –

    Configuration for the missing pattern.

    Example

    missing_config = {"random_missing": {"missing_prob": 0.2, "missing_var": "all"}} means each entry is missing with probability 0.2.

    The random_missing pattern represents completely random missingness. Here, missing_prob is the missing probability and missing_var specifies which variables are allowed to be masked. Setting missing_var to "all" applies random missingness to all variables; alternatively, pass a list of variable indices such as [0, 2, 4] to mask only the selected variables.

  • interp (str, default 'zoh') – Interpolation method. Supported values are 'zoh' (zero-order hold), 'linear', and 'GP' (Gaussian process).
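To make the random_missing pattern and the 'zoh' option concrete, here is a minimal NumPy sketch. It is not the library's implementation: the helper names random_missing_mask and zoh_fill are hypothetical, and the handling of leading gaps (filled with the first observed value) is an assumption.

```python
import numpy as np


def random_missing_mask(T, p, missing_prob, missing_var="all", seed=0):
    """Hypothetical helper: (T, p) mask with 1 = observed, 0 = missing."""
    rng = np.random.default_rng(seed)
    cols = range(p) if missing_var == "all" else missing_var
    mask = np.ones((T, p), dtype=int)
    for j in cols:
        # Each entry of a selected variable is dropped independently.
        mask[:, j] = (rng.random(T) >= missing_prob).astype(int)
    return mask


def zoh_fill(data, mask):
    """Hypothetical helper: zero-order hold (carry the last observation forward).

    Assumption: a gap at the start of a series is filled with the first
    observed value, since there is nothing earlier to hold.
    """
    T, p = data.shape
    filled = data.astype(float).copy()
    for j in range(p):
        observed = np.flatnonzero(mask[:, j])
        if observed.size == 0:
            continue  # nothing observed in this column; leave it unchanged
        # Position (within `observed`) of the latest observation at or before t.
        idx = np.searchsorted(observed, np.arange(T), side="right") - 1
        idx = np.clip(idx, 0, None)  # leading gap -> first observation
        filled[:, j] = data[observed[idx], j]
    return filled


# Small deterministic demo with a hand-written mask.
data = np.arange(12, dtype=float).reshape(6, 2)
mask = np.array([[1, 1], [0, 1], [0, 0], [1, 0], [0, 1], [1, 1]])
filled = zoh_fill(data, mask)

# Mask only variables 0 and 2; variable 1 stays fully observed.
partial_mask = random_missing_mask(100, 3, 0.2, missing_var=[0, 2])
```

In this sketch a trailing axis can be added with filled[..., None] to obtain an array of shape (T, p, 1).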

Returns:

(data_interp, data_masked, mask, GC, original_data), where:

  • data_interp – interpolated time series of shape (T, p, 1)

  • data_masked – masked time series of shape (T, p)

  • mask – missing data mask of shape (T, p), where 1 indicates observed and 0 indicates missing

  • GC – ground-truth causal graph of shape (p, p)

  • original_data – original complete time series of shape (T, p)

Return type:

tuple
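
For intuition about original_data and GC and about the roles of lag, sparsity, beta_value, sd, and burn_in, the following is a rough sketch of a sparse VAR simulator. It is an illustration under stated assumptions, not the library's code: the function name simulate_var_sketch, the way sparsity is turned into a number of parents, the convention that GC[i, j] = 1 means variable j drives variable i, and the 0.95 rescaling used to enforce stationarity are all assumptions.

```python
import numpy as np


def _spectral_radius(beta, p, lag):
    """Largest eigenvalue modulus of the VAR companion matrix."""
    companion = np.zeros((p * lag, p * lag))
    companion[:p] = beta
    companion[p:, :-p] = np.eye(p * (lag - 1))
    return np.max(np.abs(np.linalg.eigvals(companion)))


def simulate_var_sketch(p, T, lag=3, sparsity=0.2, beta_value=1.0,
                        sd=0.1, burn_in=100, seed=0):
    """Hypothetical sketch: returns (data of shape (T, p), GC of shape (p, p))."""
    rng = np.random.default_rng(seed)

    # Assumed convention: GC[i, j] = 1 means variable j drives variable i.
    # Every variable depends on itself plus a few randomly chosen parents.
    GC = np.eye(p, dtype=int)
    n_extra = max(int(p * sparsity) - 1, 0)
    for i in range(p):
        others = np.delete(np.arange(p), i)
        parents = rng.choice(others, size=n_extra, replace=False)
        GC[i, parents] = 1

    # Same coefficient on every lag, laid out as [lag 1 | lag 2 | ...].
    beta = np.hstack([beta_value * GC for _ in range(lag)]).astype(float)

    # Assumed stabilisation: shrink coefficients until the process is stationary.
    while _spectral_radius(beta, p, lag) >= 1.0:
        beta *= 0.95

    X = np.zeros((T + burn_in, p))
    X[:lag] = rng.normal(scale=sd, size=(lag, p))
    for t in range(lag, T + burn_in):
        # Stack the last `lag` observations, most recent first.
        past = X[t - lag:t][::-1].reshape(-1)
        X[t] = beta @ past + rng.normal(scale=sd, size=p)

    # Drop the burn-in samples so the returned series starts near stationarity.
    return X[burn_in:], GC


sim_data, sim_GC = simulate_var_sketch(p=10, T=200)
```

With p=10 and sparsity=0.2, each row of sim_GC in this sketch has two nonzero entries: the self-dependency and one additional parent.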