Overview¶
import matplotlib as mpl
import matplotlib.pyplot as plt
#from ipywidgets import interact, widgets
mpl.rcParams['axes.spines.right'] = False
mpl.rcParams['axes.spines.top'] = False
import altair as alt
import arviz as az
import numpyro
import pandas as pd
from jax import grad, jit
from jax import numpy as jnp
from jax import random, vmap
from numpyro import distributions as dist
from numpyro.infer import MCMC, NUTS
from lqg import LQG, Actor, Dynamics, System, xcorr
# We have to use DataFrames that are larger than recommended -> turn off the error
alt.data_transformers.disable_max_rows()
DataTransformerRegistry.enable('default')
Quick intro to jax¶
1. NumPy-like API: jax.numpy¶
JAX is a library that enables transformations of array-manipulating programs written with a NumPy-like API. You can think of JAX as differentiable NumPy that runs on accelerators. Many NumPy programs would run just as well in JAX if you substitute np for jnp.
a = jnp.array([[1., 2.],
[3., 4.]])
key = random.PRNGKey(1)
b = random.normal(key, shape=(2, 3))
a @ b
Array([[ 1.6896877, -0.6240323,  1.5889112],
       [ 4.3366823, -2.2179935,  4.184889 ]], dtype=float32)
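One difference from NumPy worth noting: JAX arrays are immutable, so in-place updates are written functionally with the indexed-update syntax .at[...].set(...). A brief illustration using the array a from above:
a.at[0, 0].set(5.)  # returns an updated copy; `a` itself is unchanged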
2. Automatic differentiation: grad¶
You can think of jax.grad by analogy to the $\nabla$ operator from vector calculus. Given a function $f(x)$, $\nabla f$ represents the function that computes $f$'s gradient, i.e.
$$ (\nabla f)(x)_i = \frac{\partial f}{\partial x_i} (x). $$
Analogously, jax.grad(f) is the function that computes the gradient, so jax.grad(f)(x) is the gradient of f at x.
def f(x):
return jnp.sin(x)
grad(f)(jnp.pi)
Array(-1., dtype=float32, weak_type=True)
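The same works for functions with vector-valued inputs, matching the componentwise definition of $\nabla f$ above. A small sketch (the function g is made up for this example):
def g(x):
    return jnp.sum(jnp.sin(x))

# gradient components: cos(x_i)
grad(g)(jnp.array([0., jnp.pi]))  # approximately [1., -1.]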
3. Easy vectorization: vmap¶
In JAX, the jax.vmap transformation is designed to generate a vectorized implementation of a function automatically.
x = jnp.linspace(0, 2 * jnp.pi)
plt.plot(x, f(x))
plt.plot(x, vmap(grad(f))(x))
[<matplotlib.lines.Line2D at 0x7feb443178e0>]
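Here f is already vectorized, since jnp.sin broadcasts over arrays; vmap becomes essential for functions written for single examples. A minimal sketch (sq_dist is made up for this example):
def sq_dist(a, b):
    # written for scalar inputs
    return (a - b) ** 2

# maps sq_dist over the leading axis of both arguments, equivalent to
# jnp.array([sq_dist(ai, bi) for ai, bi in zip(a, b)])
vmap(sq_dist)(jnp.arange(5.), jnp.ones(5))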
4. Compilation: jit¶
You can use the XLA (accelerated linear algebra) compiler to compile your functions with jax.jit.
def f(x):
return x * x + x * 2.0
x = jnp.ones((5000, 5000))
%timeit f(x)
63 ms ± 1.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
jit_f = jit(f)
%timeit jit_f(x)
22.8 ms ± 1.91 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
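Note that JAX dispatches computations asynchronously, so %timeit can partly measure dispatch overhead rather than the computation itself, especially on GPU/TPU. For a more reliable measurement, block on the result:
%timeit jit_f(x).block_until_ready()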
Putting perception into action: Inverse optimal control for continuous psychophysics¶
Modeling a tracking task with LQG control¶
The LQG control problem is defined by a linear-Gaussian stochastic dynamical system
$$ \mathbf x_{t+1} = A \mathbf x_t + B \mathbf u_t + V \boldsymbol\epsilon_t, \; \boldsymbol\epsilon_t \sim \mathcal{N}(0, I), $$
a linear-Gaussian observation model
$$ \mathbf y_t = C \mathbf x_t + W \boldsymbol\eta_t, \; \boldsymbol\eta_t \sim \mathcal{N}(0, I), $$
and a quadratic cost function
$$ J(\mathbf u_{1:T}) = \sum_{t=1}^T \mathbf x_t^T Q \mathbf x_t + \mathbf u_t^T R \mathbf u_t. $$
We assume that the actor solves the linear-quadratic Gaussian problem, i.e. computes the Kalman filter gain $K$ and the LQR control law $L$, which together are the optimal solution under this cost function.
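For intuition, here is a minimal sketch of the two Riccati recursions behind $L$ and $K$ (an illustration only, not the lqg library's implementation; the steady-state Kalman gain is obtained by a simple fixed-point iteration):
def lqr_control_law(A, B, Q, R, T):
    # backward Riccati recursion for the time-varying control gains L_t
    S = Q
    gains = []
    for _ in range(T):
        L = jnp.linalg.solve(R + B.T @ S @ B, B.T @ S @ A)
        S = Q + A.T @ S @ (A - B @ L)
        gains.append(L)
    return gains[::-1]

def kalman_gain(A, C, V, W, n_iter=1000):
    # forward Riccati iteration for the steady-state Kalman gain K
    P = jnp.eye(A.shape[0])
    for _ in range(n_iter):
        K = P @ C.T @ jnp.linalg.inv(C @ P @ C.T + W @ W.T)
        P = A @ (P - K @ C @ P) @ A.T + V @ V.T
    return K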
We start by defining the matrices $A, B, C, V, W, Q, R$ as jax.numpy.arrays according to our simple model of the continuous psychophysics tracking task:
$$ A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \; B = \begin{bmatrix} 0 \\ dt \end{bmatrix}, \; C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \\ V = \begin{bmatrix} \sigma_\text{rw} & 0 \\ 0 & \sigma_\text{act} \end{bmatrix}, \; W = \begin{bmatrix} \sigma_\text{target} & 0 \\ 0 & \sigma_\text{cursor} \end{bmatrix}, \\ Q = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, \; R = \begin{bmatrix} c \end{bmatrix}. $$
action_variability = 0.5
sigma_target = 6.
sigma_cursor = 1.
action_cost = .05
dt = 1. / 60.
# dynamical system
A = jnp.eye(2)
B = jnp.array([[0.],
[dt]])
# noise
V = jnp.diag(jnp.array([1., action_variability]))
# observation model
C = jnp.eye(2)
W = jnp.diag(jnp.array([sigma_target, sigma_cursor]))
# cost function
Q = jnp.array([[1., -1.],
[-1., 1]])
R = jnp.eye(1) * action_cost
T = 500
model = LQG(A, B, C, V, W, Q, R, T=T)
The System class, which LQG extends, can be pretty-printed. When calling display(model), the matrices of your model will be displayed as $\LaTeX$ math formulas. The same happens when the output of a cell is printed. Standard (i.e., not so pretty) printing can still be achieved through print(model).
model # equivalent to `display(model)`
Let's simulate some tracking data by applying the Kalman filter and linear-quadratic regulator. This is implemented in the method simulate(rng_key, n, T). Since jax does not have a global random number generator state, we need to pass a PRNGKey object. n is the number of trials and T is the number of time steps.
x = model.simulate(random.PRNGKey(0), n=100)
plt.plot(jnp.arange(T) * dt, x[0, :, 0], label="target")
plt.plot(jnp.arange(T) * dt, x[0, :, 1], label="cursor")
plt.legend()
plt.xlabel("Time [s]")
plt.ylabel("Position [arcmin]")
plt.show()
x.shape
(100, 500, 2)
Cross-correlograms¶
We can also look at the correlation between the velocities of the target and the cursor at different time lags. This analysis is known as a cross-correlogram (CCG; Mulligan et al., 2013) and computes the cross-correlation between target and response velocities, averaged across trials.
vels = jnp.diff(x, axis=1)
lags, correls = xcorr(vels[...,1], vels[...,0], maxlags=120)
plt.plot(lags * dt, correls.mean(axis=0))  # convert lags from frames to seconds
plt.xlabel("Lag [s]")
plt.ylabel("Cross-correlation")
Text(0, 0.5, 'Cross-correlation')
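For reference, such a lagged correlation could be sketched by hand as follows (an illustration only; the exact normalization and lag convention of lqg's xcorr may differ):
def cross_correlogram(x, y, maxlags):
    # Pearson correlation between x_t and y_{t+lag} for each lag
    def corr_at(lag):
        if lag >= 0:
            a, b = x[:x.size - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:y.size + lag]
        return jnp.corrcoef(a, b)[0, 1]
    lags = jnp.arange(-maxlags, maxlags + 1)
    return lags, jnp.array([corr_at(int(lag)) for lag in lags])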
Influence of model parameters¶
To look at the influence of the different model parameters, we define a class that inherits from the LQG base class and defines the matrices given the four parameters.
class BoundedActor(LQG):
    def __init__(self,
                 sigma_target,
                 action_variability,
                 action_cost,
                 sigma_cursor):
        dt = 1. / 60.
        # dynamical system
        A = jnp.eye(2)
        B = jnp.array([[0.],
                       [dt]])
        # noise
        V = jnp.diag(jnp.array([1., action_variability]))
        # observation model (the matrix C from the equations above)
        F = jnp.eye(2)
        W = jnp.diag(jnp.array([sigma_target, sigma_cursor]))
        # cost function
        Q = jnp.array([[1., -1.],
                       [-1., 1.]])
        R = jnp.eye(1) * action_cost
        super().__init__(A=A, B=B, F=F, V=V, W=W, Q=Q, R=R, T=T)
We can now simulate data from the model given the four parameters. To do this efficiently, we jit-compile the simulation function.
Some observations:
- An increase in action costs leads to an increased lag and decreased maximum correlation.
- An increase in perceptual uncertainty about the target leads to decreased correlation and increased lag, too, but the shape of the curves changes differently compared to the effect of the behavioral cost.
- Action variability does not change the lag, but decreases correlation overall.
- Perceptual uncertainty about the cursor does not change the shape of the CCGs at all, but does increase the mean squared error between target and response.
@jit # jit-compile the simulation to speed up the data generation
def simulate_trajectories(sigma_target, action_cost, action_variability, sigma_cursor):
model = BoundedActor(
sigma_target=sigma_target,
action_variability=action_variability,
action_cost=action_cost,
sigma_cursor=sigma_cursor,
)
x = model.simulate(random.PRNGKey(0), n=100)
return x
# Simulate data and store it in a DataFrame
data_trajectory = []
data_ccg = []
sigma_target_list = [1.0, 10.0, 100.0]
action_cost_list = [0.2, 1.0, 5.0]
action_variability_list = [0.25, 0.5, 1.0]
sigma_cursor_list = [1.0, 10.0, 100.0]
time_max = 500
time = jnp.arange(time_max) * dt
for sigma_target in sigma_target_list:
for action_cost in action_cost_list:
for action_variability in action_variability_list:
for sigma_cursor in sigma_cursor_list:
x = simulate_trajectories(
sigma_target, action_cost, action_variability, sigma_cursor
)
for i, step in enumerate(x[0]):
data_trajectory.append(
[
sigma_target,
action_cost,
action_variability,
sigma_cursor,
"target",
time[i].item(),
step[0].item(),
]
)
data_trajectory.append(
[
sigma_target,
action_cost,
action_variability,
sigma_cursor,
"cursor",
time[i].item(),
step[1].item(),
]
)
vels = jnp.diff(x, axis=1)
lags, correls = xcorr(vels[..., 1], vels[..., 0], maxlags=120)
lags = lags / 60
correls = correls.mean(axis=0)
for lag, correl in zip(lags, correls):
data_ccg.append(
[
sigma_target,
action_cost,
action_variability,
sigma_cursor,
lag.item(),
correl.item(),
]
)
df_trajectory = pd.DataFrame(
data_trajectory,
columns=[
"sigma_target",
"action_cost",
"action_variability",
"sigma_cursor",
"value",
"time",
"position",
],
)
df_ccg = pd.DataFrame(
data_ccg,
columns=["sigma_target", "action_cost", "action_variability", "sigma_cursor", "lag", "correl"],
)
# Plot the data as interactive Altair chart
radio1 = alt.binding_radio(
options=sigma_target_list,
name="Perceptual uncertainty (target): ",
)
selection1 = alt.selection_point(
value=10.0,
fields=["sigma_target"],
bind=radio1,
)
radio2 = alt.binding_radio(options=action_cost_list, name="Behavioral costs: ")
selection2 = alt.selection_point(
value=1.0,
fields=["action_cost"],
bind=radio2,
)
radio3 = alt.binding_radio(options=action_variability_list, name="Action variability: ")
selection3 = alt.selection_point(
value=0.5,
fields=["action_variability"],
bind=radio3,
)
radio4 = alt.binding_radio(options=sigma_cursor_list, name="Perceptual uncertainty (cursor): ")
selection4 = alt.selection_point(
value=10.0,
fields=["sigma_cursor"],
bind=radio4,
)
lines_trajectory = (
alt.Chart(df_trajectory)
.mark_line()
.encode(
x="time:Q",
y=alt.Y("position").scale(domain=(-30, 30)),
color=alt.Color("value").sort("descending"),
tooltip=["time", "position"],
)
.add_params(selection4, selection3, selection2, selection1)
.transform_filter(selection1 & selection2 & selection3 & selection4)
.properties(title="Trajectory")
)
lines_ccg = (
alt.Chart(df_ccg)
.mark_line()
.encode(
x="lag:Q",
y=alt.Y("correl:Q").scale(domain=(-0.02, 0.1)),
color=alt.value("#2CA02C"),
tooltip=["lag", "correl"],
)
.add_params(selection4, selection3, selection2, selection1)
.transform_filter(selection1 & selection2 & selection3 & selection4)
.properties(title="Cross-correlogram")
)
chart = lines_trajectory | lines_ccg
display(chart)