| Title: | R Bindings for the Nutpie NUTS Sampler |
|---|---|
| Description: | Lightweight R interface to the nutpie No-U-Turn Sampler (NUTS) implemented in Rust (nuts-rs), using BridgeStan for model gradients. |
| Authors: | Andy Timm [aut, cre] |
| Maintainer: | Andy Timm <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.7.5 |
| Built: | 2026-06-06 15:18:47 UTC |
| Source: | https://github.com/andytimm/nutpieR |
Returns the directory under which nutpie_compile_model() stores its
content-hashed compile artifacts (one subdirectory per unique
source + flags + BridgeStan version). Useful for inspection,
troubleshooting, or unlink()-ing a single entry.
    nutpie_cache_dir()
A character string with the path to the cache root.
    nutpie_cache_dir()
Removes the currently resolved compile cache tree under
nutpie_cache_dir(). Cached compiled models will
be recompiled on next use.
    nutpie_clear_cache()
Invisibly NULL.
This deletes the underlying _model.so files. If you hold a
nutpie_model object whose library hasn't been opened yet (no prior
nutpie_sample() call on it), its lib_path will
point at a deleted file and subsequent use will fail. Models that
were already opened in the current session keep working — once
loaded, the OS retains the mapped library independently of the file
on disk.
Only the active cache root is cleared. If R_USER_CACHE_DIR was
previously unset (or pointed somewhere else) and a different root
was resolved earlier in the session, that older directory is left
alone so models still backed by it remain valid.
    nutpie_clear_cache()
Compiles a Stan model to a shared library using BridgeStan. Downloads BridgeStan sources on first use (this is slow).
    nutpie_compile_model(
      stan_file = NULL,
      code = NULL,
      stanc_args = character(),
      compile_args = character(),
      verbose = 1L,
      cache = TRUE
    )
| Argument | Description |
|---|---|
| stan_file | Path to a |
| code | A string containing Stan model code. |
| stanc_args | Character vector of extra arguments passed to the |
| compile_args | Character vector of extra arguments passed to |
| verbose | Integer controlling compilation output. |
| cache | Logical, default |
An object of class "nutpie_model" containing the path to the
compiled shared library.
Compiled artifacts are stored in a content-hashed cache under
nutpie_cache_dir() (one subdirectory per unique
source + flags + BridgeStan version), regardless of whether the model
was passed as stan_file = ... or code = "...". A subsequent call
with identical inputs is a near-instant cache hit.
For stan_file = ..., the transitive #include set is hashed
together with the main file, so editing an included file (or the main
file itself) busts the cache and triggers a recompile.
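The idea of a content-hashed cache key can be sketched in base R. This is only an illustration: `cache_key()` is a hypothetical helper, the MD5 digest from `tools::md5sum()` is an assumption, and the `"x.y.z"` version string and `"--example-flag"` are placeholders. The hash function and key layout that nutpieR actually uses are not described here.

```r
# Hypothetical helper (not part of nutpieR): derive a key from everything
# that should influence the compiled artifact.
cache_key <- function(source, flags = character(), bridgestan_version = "x.y.z") {
  tmp <- tempfile(fileext = ".txt")
  on.exit(unlink(tmp))
  # Write source, flags, and version into one file and hash its content.
  writeLines(c(source, "--flags--", flags, "--version--", bridgestan_version), tmp)
  unname(tools::md5sum(tmp))
}

src <- "parameters { real mu; } model { mu ~ normal(0, 1); }"
key_a <- cache_key(src)                              # first call
key_b <- cache_key(src)                              # identical inputs
key_c <- cache_key(paste(src, "// edited"))          # changed source
key_d <- cache_key(src, flags = "--example-flag")    # changed flags
```

Identical inputs give the same key (a cache hit), while any change to the source or flags gives a different key and therefore a different cache subdirectory.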
The cache is bounded by nutpie_prune_cache(),
which runs automatically at the end of every successful compile
(cap: 16 entries, min age before eviction: 14 days).
Cache controls:
- cache = FALSE on a single call: compile to a fresh tempdir for
  this call only, without touching the persistent cache.
- Sys.setenv(NUTPIER_DISABLE_COMPILE_CACHE = "1"): same effect
  process-wide.
- nutpie_clear_cache(): wipes the cache.
Prior nutpieR versions wrote <basename>_model.so next to the
source .stan file, matching cmdstanr's convention. nutpieR now uses
a content-hashed cache directory instead. This change is required for
correctness: when the same .stan path is reloaded after a recompile,
the OS dynamic linker (dlopen) returns the previously loaded library
rather than the new one, so edits silently had no effect (see GitHub
issue #23). Distinct content → distinct path → fresh dlopen. Any
stale <basename>_model.so and <basename>_model.cache_meta files
left over from earlier versions can be deleted; nutpieR no longer
reads or writes them.
    ## Not run:
    # From a .stan file
    model <- nutpie_compile_model(stan_file = "my_model.stan")

    # From an inline code string
    model <- nutpie_compile_model(code = "
      data {
        int<lower=0> N;
        array[N] int<lower=0,upper=1> y;
      }
      parameters {
        real<lower=0,upper=1> theta;
      }
      model {
        theta ~ beta(1, 1);
        y ~ bernoulli(theta);
      }
    ")
    ## End(Not run)
Diagnostics are extracted directly from the nuts-rs sample-stats schema, so
the exact set of fields depends on the installed nuts-rs version and the
sampling options used. Count fields (depth, n_steps, chain, draw,
index_in_trajectory) are returned as R integer when every value fits
in i32, else as numeric; floating-point fields (logp, energy,
step_size, etc.) are always numeric. NAs use the matching sentinel
(NA_integer_ / NA_real_).
    nutpie_diagnostics(draws)
| Argument | Description |
|---|---|
| draws | A |
A nutpie_diagnostics object (a named list with a print method).
Commonly available scalar fields: diverging, tuning, maxdepth_reached
(logical); depth, n_steps, chain, draw, index_in_trajectory
(integer when fits in i32, else numeric); logp, energy,
energy_error, step_size, step_size_bar, mean_tree_accept,
mean_tree_accept_sym (numeric).
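The integer-or-numeric rule for count fields can be illustrated with a small base-R sketch. `as_count()` is a hypothetical helper written for this illustration, not a nutpieR function, and the real conversion happens on the Rust side.

```r
# Hypothetical helper: return integer when every non-missing value fits in a
# signed 32-bit integer, otherwise fall back to numeric (double).
as_count <- function(x) {
  x <- as.numeric(x)
  fits <- all(is.na(x) | abs(x) <= .Machine$integer.max)
  if (fits) as.integer(x) else x
}

small <- as_count(c(3, 7, NA))      # all values fit: integer, NA_integer_
large <- as_count(c(3, 2^31, NA))   # 2^31 overflows i32: stays numeric
```

Code that consumes these fields should therefore not assume `is.integer()` holds for very long runs.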
Wide fields (one row per draw):
- unconstrained_draw, when store_unconstrained = TRUE (NA rows where
  unrecorded).
- gradient, when store_gradient = TRUE (NA rows where unrecorded).
- mass_matrix_inv (and mass_matrix_eigvals / mass_matrix_stds when
  reported), when store_mass_matrix = TRUE and save_warmup = TRUE. The
  most recently recorded value is carried forward into draws between
  updates, because the inverse mass matrix is piecewise-constant
  between adapter steps, not undefined. save_warmup = TRUE is required
  because adaptation (the only time the matrix changes) occurs during
  warmup; without warmup draws in the trace there is no value to carry
  forward.
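The carry-forward behaviour described for the mass matrix fields can be sketched as follows. `carry_forward()` is a hypothetical helper and the recorded values are made up; it shows the idea of filling unrecorded draws with the last recorded value, not nutpieR's actual code.

```r
# Hypothetical helper: fill unrecorded (NULL) entries with the most recently
# recorded value; entries before the first recording stay NULL.
carry_forward <- function(rows) {
  last <- NULL
  for (i in seq_along(rows)) {
    if (is.null(rows[[i]])) {
      if (!is.null(last)) rows[[i]] <- last
    } else {
      last <- rows[[i]]
    }
  }
  rows
}

# Made-up diagonal values recorded at draws 1 and 4 only.
recorded <- list(c(1, 1), NULL, NULL, c(0.5, 2), NULL)
filled <- carry_forward(recorded)
```

Draws 2 and 3 inherit the value from draw 1, and draw 5 inherits the update made at draw 4.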
These surface as (n_draws * n_chains, ndim_unc) numeric matrices when
every recorded row has the same width; mixed-width columns fall back
to a list of numeric vectors (one per draw, NULL when not recorded).
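A rough base-R sketch of the matrix-or-list rule, assuming per-draw rows arrive as a list with NULL for unrecorded draws. `stack_rows()` is a hypothetical name used only for this illustration.

```r
# Hypothetical helper: stack per-draw rows into a numeric matrix when every
# recorded row has the same width, otherwise keep the list as-is.
stack_rows <- function(rows) {
  recorded <- !vapply(rows, is.null, logical(1))
  widths <- unique(lengths(rows[recorded]))
  if (length(widths) != 1L) return(rows)   # mixed widths: list fallback
  out <- matrix(NA_real_, nrow = length(rows), ncol = widths)
  for (i in which(recorded)) out[i, ] <- rows[[i]]
  out                                      # unrecorded draws stay as NA rows
}

same  <- stack_rows(list(c(1, 2), NULL, c(3, 4)))   # 3 x 2 matrix, row 2 is NA
mixed <- stack_rows(list(c(1, 2), c(3, 4, 5)))      # stays a list
```

Downstream code can branch on `is.matrix()` to handle both shapes.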
With store_divergences = TRUE: divergence_start, divergence_end,
divergence_momentum, divergence_start_gradient (lists, only
present when at least one draw diverged).
chain is 1-indexed in 1:num_chains. draw is 1-indexed in
1:num_draws for post-warmup diagnostics and 1:num_warmup for warmup
diagnostics (returned via nutpie_warmup_diagnostics()). Matches
posterior::draws_array conventions, so a data.frame of diagnostics
joins cleanly against draws indexed by (chain, iteration).
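As an illustration of the join described above, the following base-R sketch merges a small diagnostics table with draws keyed by (chain, iteration). All values are made up, and the data frames are built by hand rather than taken from nutpie_sample() output.

```r
num_chains <- 2L
num_draws <- 3L

# Diagnostics keyed by 1-indexed (chain, draw); values are illustrative.
diag_df <- data.frame(
  chain = rep(seq_len(num_chains), each = num_draws),
  draw = rep(seq_len(num_draws), times = num_chains),
  diverging = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE)
)

# Draws of one parameter in long form, keyed by (chain, iteration).
draws_df <- expand.grid(
  iteration = seq_len(num_draws),
  chain = seq_len(num_chains)
)
draws_df$theta <- seq(0.1, 0.6, by = 0.1)

# Because both sides are 1-indexed within chain, a plain merge lines them up.
joined <- merge(
  draws_df, diag_df,
  by.x = c("chain", "iteration"), by.y = c("chain", "draw")
)
divergent_theta <- joined$theta[joined$diverging]
```

Here `divergent_theta` holds the parameter values at the two draws flagged as divergent.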
    ## Not run:
    draws <- nutpie_sample(model, data = dat, num_draws = 1000, num_chains = 4)
    diag <- nutpie_diagnostics(draws)
    diag                 # printed summary
    sum(diag$diverging)  # divergence count
    max(diag$depth)      # peak treedepth across all draws
    ## End(Not run)
Reshapes the diagnostics from nutpie_diagnostics() into the four-column
long-format data.frame that bayesplot's NUTS plotting helpers (e.g.
bayesplot::mcmc_pairs(np = ...), bayesplot::mcmc_nuts_energy())
expect. Names match Stan's CSV convention (accept_stat__,
divergent__, treedepth__, n_leapfrog__, stepsize__,
energy__). Other bayesplot NUTS helpers (e.g.
mcmc_nuts_divergence(), mcmc_nuts_acceptance()) additionally
need a per-draw lp data frame, which this helper does not produce.
    nutpie_nuts_params(draws)
| Argument | Description |
|---|---|
| draws | A |
A data.frame with columns:
| Column | Description |
|---|---|
| Chain | Integer chain index (1-indexed). |
| Iteration | Integer post-warmup draw index (1-indexed within chain, in 1:num_draws). |
| Parameter | Character; one of "accept_stat__", "divergent__", "treedepth__", "n_leapfrog__", "stepsize__", "energy__". |
| Value | Numeric value of the corresponding diagnostic. |
The data frame has num_draws * num_chains * 6 rows.
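The long format and its row count can be illustrated with base R. The sketch below starts from a hand-made wide table that already uses the Stan-style column names; the values are invented, and how nutpieR maps its own diagnostic fields onto these names is not shown here.

```r
num_chains <- 2L
num_draws <- 3L

# Wide per-draw diagnostics with made-up values (scalars are recycled).
wide <- data.frame(
  Chain = rep(seq_len(num_chains), each = num_draws),
  Iteration = rep(seq_len(num_draws), times = num_chains),
  accept_stat__ = 0.9, divergent__ = 0, treedepth__ = 3,
  n_leapfrog__ = 7, stepsize__ = 0.5, energy__ = 10
)

# Stack one block of rows per diagnostic to get the four-column long format.
params <- setdiff(names(wide), c("Chain", "Iteration"))
long <- do.call(rbind, lapply(params, function(p) {
  data.frame(
    Chain = wide$Chain, Iteration = wide$Iteration,
    Parameter = p, Value = wide[[p]],
    stringsAsFactors = FALSE
  )
}))
```

With 3 draws, 2 chains, and 6 diagnostics, `long` has 36 rows, matching the num_draws * num_chains * 6 rule.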
See also: bayesplot::mcmc_pairs(), the most common consumer of this
format; nutpie_diagnostics() for the raw diagnostics.
    ## Not run:
    draws <- nutpie_sample(model, data = dat, num_chains = 4)
    np <- nutpie_nuts_params(draws)
    bayesplot::mcmc_pairs(draws, np = np, pars = c("mu", "tau"))
    ## End(Not run)
Returns the parameter names of a compiled Stan model, converted from
BridgeStan's dot-indexed form ("beta.1.2") to Stan's bracket convention
("beta[1,2]").
    nutpie_param_names(
      model,
      data = NULL,
      which = c("block", "unconstrained", "full"),
      unconstrained = NULL
    )
| Argument | Description |
|---|---|
| model | A |
| data | Model data (same format as |
| which | One of |
| unconstrained | Deprecated. If non- |
A character vector of parameter names.
    ## Not run:
    model <- nutpie_compile_model(stan_file = "my_model.stan")
    nutpie_param_names(model, data = dat)                           # block (default)
    nutpie_param_names(model, data = dat, which = "unconstrained")
    nutpie_param_names(model, data = dat, which = "full")           # incl. TP + GQ
    ## End(Not run)
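The dot-to-bracket name conversion that nutpie_param_names() performs can be approximated in a few lines of base R. `dot_to_bracket()` is a hypothetical helper for illustration; it assumes indices appear only as a trailing run of `.<integer>` components and is not the package's own implementation.

```r
# Hypothetical helper: "beta.1.2" -> "beta[1,2]"; names without trailing
# integer indices are returned unchanged.
dot_to_bracket <- function(x) {
  base <- sub("(\\.[0-9]+)+$", "", x)        # strip trailing ".<int>" run
  idx <- substring(x, nchar(base) + 2L)      # index part without leading dot
  ifelse(
    nzchar(idx),
    paste0(base, "[", gsub(".", ",", idx, fixed = TRUE), "]"),
    x
  )
}

converted <- dot_to_bracket(c("beta.1.2", "mu", "x.3"))
```

Scalars such as `"mu"` pass through untouched, which keeps the helper safe to apply to a whole name vector.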
Evicts older entries from the nutpieR compile cache so it stays
bounded. Mirrors Python nutpie's policy: cap the cache at
max_entries, but only evict entries at least min_age_days old
(oldest first). Called automatically at the end of every successful
compile – you usually don't need to invoke it directly, but it's
here for one-off manual cleanup or scripted maintenance.
    nutpie_prune_cache(max_entries = 16L, min_age_days = 14L)
| Argument | Description |
|---|---|
| max_entries | Maximum number of valid (fully compiled) cache entries to retain. Defaults to 16. |
| min_age_days | Minimum age (in days, by |
Invisibly, the number of entries removed.
    nutpie_prune_cache()
    nutpie_prune_cache(max_entries = 8, min_age_days = 7)
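The eviction policy described for nutpie_prune_cache() (cap at max_entries, evict only entries at least min_age_days old, oldest first) can be sketched on a plain data frame. `select_evictions()` and the `id` / `age_days` columns are hypothetical; how nutpieR measures entry age on disk is not modelled here.

```r
# Hypothetical helper: choose which cache entries to evict.
select_evictions <- function(entries, max_entries = 16L, min_age_days = 14L) {
  excess <- nrow(entries) - max_entries
  if (excess <= 0) return(character())
  # Only sufficiently old entries are eligible, oldest first.
  eligible <- entries[entries$age_days >= min_age_days, , drop = FALSE]
  eligible <- eligible[order(eligible$age_days, decreasing = TRUE), , drop = FALSE]
  head(eligible$id, excess)
}

# 18 made-up entries: two over the default cap of 16.
entries <- data.frame(
  id = sprintf("entry%02d", 1:18),
  age_days = c(30, 20, 1:16),
  stringsAsFactors = FALSE
)
evicted <- select_evictions(entries)

# When every entry is younger than min_age_days, nothing is evicted.
young <- data.frame(id = letters[1:20], age_days = 1, stringsAsFactors = FALSE)
evicted_young <- select_evictions(young)
```

Under this reading of the policy, the cache can temporarily exceed the cap when the surplus entries are all younger than min_age_days.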
Runs the nuts-rs NUTS sampler on a compiled Stan model.
    nutpie_sample(
      model,
      data = NULL,
      num_draws = 1000L,
      num_warmup = 400L,
      num_chains = 4L,
      seed = NULL,
      max_treedepth = NULL,
      mindepth = NULL,
      target_accept = NULL,
      max_energy_error = NULL,
      extra_doublings = NULL,
      refresh = 100L,
      init = NULL,
      init_mean = NULL,
      save_warmup = FALSE,
      cores = NULL,
      pars = NULL,
      include = TRUE,
      store_divergences = FALSE,
      store_mass_matrix = FALSE,
      store_unconstrained = FALSE,
      store_gradient = FALSE,
      adaptation = c("diag", "low_rank", "low-rank"),
      low_rank_modified_mass_matrix = FALSE,
      mass_matrix_gamma = NULL,
      mass_matrix_eigval_cutoff = NULL
    )
| Argument | Description |
|---|---|
| model | A |
| data | Model data. Can be: |
| num_draws | Number of post-warmup draws per chain. |
| num_warmup | Number of warmup (tuning) draws per chain. |
| num_chains | Number of parallel chains. |
| seed | Random seed for reproducibility. |
| max_treedepth | Maximum tree depth for NUTS. The number of leapfrog steps per draw is at most |
| mindepth | Minimum tree depth for NUTS. The number of leapfrog steps per draw is at least |
| target_accept | Target acceptance probability for step size adaptation. Default |
| max_energy_error | Energy-error threshold above which a leapfrog step is treated as a divergence. Default |
| extra_doublings | Number of additional tree doublings allowed after a turning point is reached. Default |
| refresh | How often to print progress updates, in draws per chain. Set to |
| init | Initial values for each chain. Single entry point that dispatches on the input shape: No jitter is applied; starting points are used exactly (after any random fill for partial constrained inits). To work on the unconstrained scale, see Chain assignment is unspecified. When supplying per-chain starts (list-of-lists or |
| init_mean | Deprecated. Scalar or numeric vector on the unconstrained scale, with ±0.5 uniform jitter per chain. Use |
| save_warmup | If |
| cores | Number of CPU cores to use for parallel sampling. Defaults to |
| pars | An optional character vector of block-level parameter names (e.g. |
| include | Logical (default |
| store_divergences | If |
| store_mass_matrix | If |
| store_unconstrained | If |
| store_gradient | If |
| adaptation | Mass matrix adaptation strategy. One of: Matches the Python nutpie |
| low_rank_modified_mass_matrix | Deprecated. If |
| mass_matrix_gamma | Regularisation parameter for low-rank mass matrix adaptation. Only used when |
| mass_matrix_eigval_cutoff | Eigenvalue cutoff for low-rank mass matrix. Eigenvalues outside |
A posterior::draws_array with dimensions
(num_draws, num_chains, n_params). Sampler diagnostics are attached
as an attribute and can be retrieved with nutpie_diagnostics(); see
?nutpie_diagnostics for the chain / draw indexing convention (1-indexed,
phase-relative). The attributes "num_warmup" and "num_draws" record
the sampling configuration (accessible via attr(draws, "num_warmup")
etc.). The "sampler_config" attribute is a JSON string capturing the
effective nuts-rs settings used (including any defaults that were
left unspecified by the caller, and exposing num_warmup to match the
function argument name); parse with jsonlite::fromJSON().
    ## Not run:
    model <- nutpie_compile_model(code = "
      data {
        int<lower=0> N;
        array[N] int<lower=0,upper=1> y;
      }
      parameters {
        real<lower=0,upper=1> theta;
      }
      model {
        theta ~ beta(1, 1);
        y ~ bernoulli(theta);
      }
    ")

    draws <- nutpie_sample(
      model,
      data = list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1)),
      num_draws = 1000,
      num_chains = 4,
      seed = 604
    )

    dim(draws)  # (num_draws, num_chains, n_params)
    posterior::variables(draws)
    posterior::summarize_draws(draws)
    nutpie_diagnostics(draws)
    ## End(Not run)
Introspection / debugging helper. Takes a named list of parameter values in the constrained (user-facing) space and returns the corresponding unconstrained vector that BridgeStan uses internally. Useful when inspecting how Stan's unconstraining transform maps your values, or when sanity-checking a model's parameter order.
    nutpie_unconstrain(model, params, data = NULL)
| Argument | Description |
|---|---|
| model | A |
| params | A named list of parameter values. Names must match the parameter names in the Stan program. Values may be scalars, vectors, matrices, or arrays, matching the declared shape. |
| data | Model data (same format as |
For setting sampler starting points, pass constrained values directly to
nutpie_sample()'s init argument (e.g. init = list(mu = 0, sigma = 1))
— this function is not part of the normal init workflow.
All parameters declared in the parameters block must be supplied.
A named numeric vector whose names are the unconstrained parameter names (in BridgeStan's internal order).
    ## Not run:
    model <- nutpie_compile_model(stan_file = "my_model.stan")
    nutpie_unconstrain(model, params = list(mu = 0, sigma = 1), data = dat)
    ## End(Not run)
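To give a feel for what an unconstraining transform does, the sketch below applies the two scalar transforms that Stan documents for bounded reals: a log transform for a lower bound and a scaled logit for an interval. The helper names are hypothetical, and nutpie_unconstrain() obtains its result from BridgeStan rather than from R code like this.

```r
# Hypothetical helpers mirroring Stan's scalar transforms for bounded reals.
unconstrain_lower <- function(x, lower = 0) {
  log(x - lower)                               # real<lower=a>
}
unconstrain_interval <- function(x, lower = 0, upper = 1) {
  qlogis((x - lower) / (upper - lower))        # real<lower=a, upper=b>
}

sigma_unc <- unconstrain_lower(1)              # sigma = 1 on (0, Inf)
theta_unc <- unconstrain_interval(0.5)         # theta = 0.5 on (0, 1)

# The inverse (constraining) map recovers the original value.
theta_back <- plogis(unconstrain_interval(0.2))
```

Both `sigma_unc` and `theta_unc` come out as 0, which is why a constrained start of sigma = 1 corresponds to the origin on the unconstrained scale. An unbounded parameter such as mu is left unchanged.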
Extract warmup diagnostics from nutpie output
    nutpie_warmup_diagnostics(draws)
| Argument | Description |
|---|---|
| draws | A |
A named list of diagnostic vectors for the warmup phase. chain
is 1-indexed and draw is 1-indexed in 1:num_warmup; see
nutpie_diagnostics() for the full indexing convention.
    ## Not run:
    draws <- nutpie_sample(model, data = dat, save_warmup = TRUE)
    wd <- nutpie_warmup_diagnostics(draws)
    max(wd$depth)  # warmup treedepth peak
    ## End(Not run)
Extract warmup draws from nutpie output
    nutpie_warmup_draws(draws)
| Argument | Description |
|---|---|
| draws | A |
A posterior::draws_array containing the warmup draws, or NULL
if warmup draws were not saved.
    ## Not run:
    draws <- nutpie_sample(model, data = dat, save_warmup = TRUE)
    warmup <- nutpie_warmup_draws(draws)
    posterior::summarize_draws(warmup)
    ## End(Not run)