ParallelMCMC.jl

ParallelMCMC logo

ParallelMCMC.jl is a Julia package for parallel-across-the-sequence MCMC — algorithms that solve an entire trajectory of $T$ correlated steps simultaneously rather than one at a time.

DEER trajectory estimates improving on a Julia-logo-shaped posterior

DEER iterates on a synthetic Julia-logo-shaped posterior: orange trajectory estimates move toward the taped MALA path over repeated trajectory solves.

The flagship algorithm is DEER (Deterministic Equivalent-Expectation Recursion), which reformulates a chain of $T$ MALA steps as a fixed-point problem and solves it via Newton iterations, each costing $O(\log T)$ parallel work via an associative prefix scan. The result is that wall-clock time per sample is sublinear in chain length on multi-core CPUs and GPUs.

The algorithm is described in:

Zoltowski, D. M., Wu, S., Gonzalez, X., Kozachkov, L., & Linderman, S. W. (2025). Parallelizing MCMC Across the Sequence Length. NeurIPS 2025. arXiv:2508.18413

The included MALASampler and AdaptiveMALASampler are sequential MALA baselines — useful for correctness checks and step-size tuning, but not the primary focus of the package.

Samplers

SamplerRole
ParallelMALASamplerPrimary — parallel-across-sequence MALA via DEER; O(log T) per solve
MALASamplerBaseline — sequential MALA with a fixed step size
AdaptiveMALASamplerBaseline — sequential MALA with dual-averaging step-size adaptation

All samplers implement the AbstractMCMC interface and return MCMCChains.Chains objects.

Installation

To install the package into your current environment:

pkg> add ParallelMCMC

Quick start

Define a model

The simplest entry point is DensityModel, which wraps a log-density and its gradient:

using ParallelMCMC, MCMCChains
using ADTypes, Enzyme

# Example: 2-D standard normal
logp(x)      = -0.5 * sum(abs2, x)
grad_logp(x) = -x

model = DensityModel(logp, grad_logp, 2;
                     param_names=[:x1, :x2])

DEER — parallel-across-sequence (primary algorithm)

sampler = ParallelMALASampler(0.1; T=64, jacobian=:stoch_diag,
                              backend=AutoEnzyme())

chain = sample(model, sampler, 500;
               chain_type=MCMCChains.Chains)

Each call to sample draws 500 samples by solving DEER trajectories of length T=64 in parallel, re-solving from the last state when each trajectory is exhausted.

Sequential MALA baseline

sampler = AdaptiveMALASampler(0.1; n_warmup=500)

chain = sample(model, sampler, 2_000;
               chain_type=MCMCChains.Chains,
               discard_warmup=true,
               progress=true)

Turing.jl integration

When DynamicPPL (part of Turing.jl) is loaded, a one-argument DensityModel constructor is available that wraps a @model directly. Parameter names are automatically extracted, and values transformed back to the original model space:

using Turing, ParallelMCMC, MCMCChains

@model function normal_model(y)
    μ ~ Normal(0.0, 1.0)
    y ~ Normal(μ, 0.5)
end

model   = DensityModel(normal_model(1.5))
sampler = AdaptiveMALASampler(0.3; n_warmup=500)

chain = sample(model, sampler, 2_000;
               chain_type=MCMCChains.Chains,
               discard_warmup=true)

See Getting Started for worked examples and guidance on choosing samplers, and Algorithm Details for the mathematics behind DEER.

Contributors

Ryan Senne
Ryan Senne

💻 🚧 ⚠️ 🤔 👀 📖
Penelope Yong
Penelope Yong

💻 ⚠️ 🤔 👀 📖
Guillaume Dalle
Guillaume Dalle

👀 🤔
William Moses
William Moses

👀