Bayesian Models

Gaussian Mixture Model

class pycave.bayes.GMM(num_components, num_features, covariance='diag')

The GMM represents a mixture of a fixed number of multivariate Gaussian distributions. This class may be used to find clusters whenever you expect the data to be generated from a fixed-size set of Gaussian distributions.

In addition to the methods documented below, the GMM inherits the following methods from the estimator mixin.

fit(…)

Optimizes the model’s parameters.

evaluate(…)

Computes the per-datapoint negative log-likelihood of the given data.

predict(…)

Computes the probability distribution over components for each datapoint. An argmax over the component dimension thus yields the most likely component for each datapoint.

The parameters that may be passed to these functions can be derived from the engine documentation. The data, however, need not be passed as a PyTorch data loader; all methods also accept the following instead:

  • A single tensor (interpreted as a single batch of datapoints)

  • A list of tensors (interpreted as batches of datapoints)

Additionally, the methods allow the following keyword arguments:

fit(…)
eps: float, default: 0.01

The minimum per-datapoint improvement in the negative log-likelihood required to consider a model “better”; improvements below this threshold indicate convergence.

reg: float, default: 1e-6

A non-negative regularization term added to the diagonal of the covariance matrix to ensure that it stays positive definite. If your data contains datapoints which are very close together (i.e. “singleton datapoints”), you may need to increase this regularization factor.

evaluate(…)
reduction: str, default: ‘mean’

The reduction performed for the negative log-likelihood as for common PyTorch metrics. Must be one of [‘mean’, ‘sum’, ‘none’].
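
For example, the following sketch fits a GMM on a single batch of datapoints and then scores and clusters them (synthetic data; the parameter values are illustrative):

    import torch
    from pycave.bayes import GMM

    data = torch.randn(10000, 3)  # illustrative synthetic datapoints

    gmm = GMM(num_components=5, num_features=3, covariance='diag')
    gmm.fit(data, eps=0.01, reg=1e-6)

    nll = gmm.evaluate(data, reduction='mean')  # mean per-datapoint NLL
    probs = gmm.predict(data)                   # [N, K] distribution over components
    clusters = probs.argmax(dim=1)              # most likely component per datapoint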

property engine

Returns the engine for this model.

Returns

The engine class initialized with this model.

Return type

pyblaze.nn.BaseEngine

forward(data)

Computes the distribution over components of all datapoints as well as the negative log-likelihood of the given data.

Parameters

data (torch.Tensor [N, D]) – The data to perform computations for (number of datapoints N, dimensionality D).

Returns

  • torch.Tensor [N, K] – The responsibilities for each datapoint and component (number of components K).

  • torch.Tensor [N] – The negative log-likelihood for all data samples.
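
A sketch of the call convention (calling the module invokes forward; shapes follow the Returns list above):

    import torch
    from pycave.bayes import GMM

    gmm = GMM(num_components=5, num_features=3)
    data = torch.randn(100, 3)

    responsibilities, neg_log_likelihood = gmm(data)
    print(responsibilities.shape)    # torch.Size([100, 5])
    print(neg_log_likelihood.shape)  # torch.Size([100])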

prepare_input(data)

Prepares the input for the engine. This enables passing other types of data instead of PyTorch data loaders when it is appropriate, making it easier to e.g. provide a Sklearn-like interface. By default, the data object is simply returned but subclasses may override this function as appropriate.

Parameters

data (object) – The data object passed to fit, evaluate or predict.

Returns

The iterable dataset to use for the engine’s data.

Return type

iterable

reset_parameters(data=None, max_iter=100, reg=1e-06)

Initializes the parameters of the GMM, optionally based on some data. If no data is given, means are initialized randomly from a Gaussian distribution, unit covariances are used, and prior probabilities are assigned randomly from a uniform distribution.

Parameters
  • data (torch.Tensor [N, D], default: None) – An optional set of datapoints to initialize the means and covariances of the Gaussian distributions from. K-Means is run to find the means, and the datapoints belonging to each cluster are used to estimate that cluster's covariance. Note that the given data may be a (small) subset of the actual data that the GMM should be fitted on.

  • max_iter (int, default: 100) – If data is given and K-Means is run, this defines the maximum number of iterations to run K-Means for.

  • reg (float, default: 1e-6) – A non-negative regularization term added to the diagonal of the covariance matrix to ensure that it stays positive definite. If your data contains datapoints which are very close together (i.e. “singleton datapoints”), you may need to increase this regularization factor. This parameter is ignored if no data is provided.
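
For example, a sketch that initializes the model from a small subset of (hypothetical) training data before fitting on all of it:

    import torch
    from pycave.bayes import GMM

    train_data = torch.randn(100000, 3)  # hypothetical training data

    gmm = GMM(num_components=5, num_features=3)
    # Run K-Means on a subset only so that initialization stays cheap.
    gmm.reset_parameters(data=train_data[:5000], max_iter=100, reg=1e-6)
    gmm.fit(train_data)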

sample(n, return_components=False)

Draws a given number of samples from the GMM.

Parameters
  • n (int) – The number of samples to generate.

  • return_components (bool, default: False) – Whether to return the indices of the components from which the samples were obtained.

Returns

  • torch.Tensor [N, D] – The samples with dimensionality D.

  • torch.Tensor [N] – Optionally, the indices of the components corresponding to the returned samples.
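
A sketch of the two return modes (in practice you would fit the model first):

    from pycave.bayes import GMM

    gmm = GMM(num_components=5, num_features=3)

    samples = gmm.sample(1000)                                      # [1000, 3]
    samples, components = gmm.sample(1000, return_components=True)
    print(samples.shape)     # torch.Size([1000, 3])
    print(components.shape)  # torch.Size([1000])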

Markov Model

class pycave.bayes.MarkovModel(num_states)

The MarkovModel models a simple Markov chain with a fixed set of states. You may use this class whenever the states are known and the transition probabilities are the only quantity of interest. If the states additionally emit outputs, consider using the HMM instead.

In addition to the methods documented below, the Markov model inherits the following methods from the estimator mixin.

fit(…)

Optimizes the model’s parameters.

evaluate(…)

Computes the per-datapoint negative log-likelihood of the given data.

predict(…)

Not available.

The parameters that may be passed to these functions can be derived from the engine documentation. The data, however, need not be passed as a PyTorch data loader; all methods also accept the following instead:

  • A single packed sequence

  • A single 2-D tensor (interpreted as batch of sequences)

  • A list of packed sequences

  • A list of 2-D tensors (interpreted as batches of sequences)

Additionally, the methods allow the following keyword arguments:

fit(…)
  • symmetric: bool, default: False

    Whether a symmetric transition matrix should be learnt from the data (e.g. useful when training on random walks from an undirected graph).

  • teleport_alpha: float, default: 0

    The probability of random teleportations from one state to a randomly selected other one upon every transition. Generally “spaces out” probabilities in the transition probability matrix.
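
For example, a sketch that learns a symmetric transition matrix from random walks given as a single 2-D tensor (synthetic walks; the keyword values are illustrative):

    import torch
    from pycave.bayes import MarkovModel

    walks = torch.randint(10, (500, 20))  # 500 synthetic walks of length 20 over 10 states

    markov = MarkovModel(num_states=10)
    markov.fit(walks, symmetric=True, teleport_alpha=0.1)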

property engine

Returns the engine for this model.

Returns

The engine class initialized with this model.

Return type

pyblaze.nn.BaseEngine

forward(data)

Runs inference for a single packed sequence, i.e. computes the negative log-likelihood of the given sequences.

Parameters

data (torch.PackedSequence [N]) – The sequences for which to compute the negative log-likelihood (number of items N).

Returns

The negative log-likelihood.

Return type

torch.Tensor [1]
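
A sketch of the call convention for variable-length sequences packed into a single packed sequence:

    import torch
    from torch.nn.utils.rnn import pack_sequence
    from pycave.bayes import MarkovModel

    markov = MarkovModel(num_states=10)
    walks = [torch.randint(10, (8,)), torch.randint(10, (5,))]
    packed = pack_sequence(walks, enforce_sorted=False)

    nll = markov(packed)  # torch.Tensor [1], joint NLL of both walks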

prepare_input(data)

Prepares the input for the engine. This enables passing other types of data instead of PyTorch data loaders when it is appropriate, making it easier to e.g. provide a Sklearn-like interface. By default, the data object is simply returned but subclasses may override this function as appropriate.

Parameters

data (object) – The data object passed to fit, evaluate or predict.

Returns

The iterable dataset to use for the engine’s data.

Return type

iterable

reset_parameters()

Resets the parameters of the model by sampling both the initial probabilities and the transition probabilities from a uniform distribution.

sample(num_sequences, sequence_length)

Samples the given number of sequences with the given length from the model’s underlying probability distribution.

Parameters
  • num_sequences (int) – The number of sequences to sample.

  • sequence_length (int) – The length of the sequences to sample. Generation tends to be much slower for longer sequences than for a larger number of sequences, since sequences must be generated iteratively over their length.

Returns

The state sequences (number of sequences N, sequence length S).

Return type

torch.Tensor [N, S]
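
A sketch (untrained model; in practice you would fit it first). Since sampling proceeds iteratively over time, many short sequences are cheaper than few long ones:

    from pycave.bayes import MarkovModel

    markov = MarkovModel(num_states=10)
    walks = markov.sample(num_sequences=100, sequence_length=10)
    print(walks.shape)  # torch.Size([100, 10])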

stationary_distribution(max_iterations=100)

Computes the stationary distribution of the Markov chain. This equals the eigenvector corresponding to the largest eigenvalue of the transposed transition matrix.

Parameters

max_iterations (int, default: 100) – The number of iterations to perform for the power iteration.

Returns

The probability of a random walker visiting each of the states after infinitely many steps.

Return type

torch.Tensor [N]
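
The following standalone sketch illustrates the power iteration the method performs; it is illustrative math, not PyCave's internal code:

    import torch

    def power_iteration(transitions: torch.Tensor, max_iterations: int = 100) -> torch.Tensor:
        # Start from the uniform distribution and repeatedly apply the
        # transposed transition matrix, renormalizing at every step.
        dist = torch.full((transitions.size(0),), 1 / transitions.size(0))
        for _ in range(max_iterations):
            dist = transitions.t() @ dist
            dist = dist / dist.sum()
        return dist

    # A simple two-state chain where state 0 is "stickier" than state 1.
    transitions = torch.tensor([[0.9, 0.1],
                                [0.5, 0.5]])
    print(power_iteration(transitions))  # tensor([0.8333, 0.1667])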

Hidden Markov Model

class pycave.bayes.HMM(num_states, output='gaussian', output_num_states=1, output_dim=1, output_covariance='diag')

The HMM represents a hidden Markov model with different kinds of emissions.

In addition to the methods documented below, the HMM inherits the following methods from the estimator mixin.

fit(…)

Optimizes the model’s parameters.

evaluate(…)

Computes the per-datapoint negative log-likelihood of the given data.

predict(…)

Performs filtering or smoothing based on the passed parameters, returning the distribution over hidden states at the last timestep of each sequence or at every timestep, respectively.

The parameters that may be passed to these functions can be derived from the engine documentation. The data, however, need not be passed as a PyTorch data loader; all methods also accept the following instead:

  • A single packed sequence

  • A single 2-D tensor (interpreted as batch of sequences)

  • A list of packed sequences

  • A list of 2-D tensors (interpreted as batches of sequences)

Additionally, the methods allow for the following keyword arguments:

fit(…)
  • epochs: int, default: 20

    The maximum number of iterations to run training for.

  • eps: float, default: 0.01

    The minimum per-datapoint improvement in the negative log-likelihood required to consider a model “better”.

  • patience: int, default: 0

    The number of epochs for which the negative log-likelihood may stay above the minimum achieved so far before training is aborted.

predict(…)
  • smooth: bool, default: False

    Whether to perform smoothing and return distributions over hidden states for all time steps of the sequences.
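
For example, a sketch that fits an HMM with one-dimensional Gaussian emissions on a batch of equal-length observation sequences and then runs filtering and smoothing (synthetic data; keyword values are illustrative):

    import torch
    from pycave.bayes import HMM

    sequences = torch.randn(200, 30)  # 200 synthetic sequences of length 30

    hmm = HMM(num_states=4, output='gaussian', output_dim=1)
    hmm.fit(sequences, epochs=20, eps=0.01, patience=2)

    last_filtered = hmm.predict(sequences)          # hidden-state distribution at the last timestep
    smoothed = hmm.predict(sequences, smooth=True)  # hidden-state distributions at every timestep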

property engine

Returns the engine for this model.

Returns

The engine class initialized with this model.

Return type

pyblaze.nn.BaseEngine

forward(data, smooth=False, return_emission=False)

Runs inference (filtering/smoothing) for a single packed sequence.

Parameters
  • data (torch.PackedSequence [N]) – The sequences for which to compute alpha/beta values (number of items N).

  • smooth (bool, default: False) – Whether to perform smoothing instead of filtering.

  • return_emission (bool, default: False) – Whether to additionally return the emission probabilities for all datapoints.

Returns

  • torch.Tensor [N, K] – The emission probabilities for all datapoints if return_emission is True.

  • torch.Tensor ([S, K] or [N, K]) – The (normalized) alpha values, either for each sequence or for all timesteps of each sequence if smooth is True. The alpha values represent the filtered distribution over hidden states (number of sequences S, number of hidden states K).

  • torch.Tensor [N, K] – The (normalized) beta values if smooth is True.

  • torch.Tensor [1] – The negative log-likelihood of the data under the model (not normalized for the number of datapoints).
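
A hypothetical call sketch for a discrete-output HMM; the tuple contents follow the Returns list above and depend on the flags:

    import torch
    from torch.nn.utils.rnn import pack_sequence
    from pycave.bayes import HMM

    hmm = HMM(num_states=3, output='discrete', output_num_states=5)
    observations = [torch.randint(5, (8,)), torch.randint(5, (6,))]
    packed = pack_sequence(observations, enforce_sorted=False)

    alpha, nll = hmm(packed)                     # filtering only (hypothetical unpacking)
    alpha, beta, nll = hmm(packed, smooth=True)  # smoothing additionally yields beta values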

prepare_input(data)

Prepares the input for the engine. This enables passing other types of data instead of PyTorch data loaders when it is appropriate, making it easier to e.g. provide a Sklearn-like interface. By default, the data object is simply returned but subclasses may override this function as appropriate.

Parameters

data (object) – The data object passed to fit, evaluate or predict.

Returns

The iterable dataset to use for the engine’s data.

Return type

iterable

reset_parameters(data=None, max_iter=100)

Resets the parameters of the model. The initial probabilities as well as the transition probabilities are initialized by drawing from the uniform distribution.

Depending on the output type, additional setup may be performed. If the output is Gaussian, an initial guess for the means of the state outputs can be made via K-Means clustering; otherwise, the means are set to values drawn from the (multivariate) standard normal distribution. The covariances are initialized according to the standard normal distribution.

Parameters

data (torch.Tensor [N, D], default: None) – (A subset of) the datapoints to initialize the output parameters of this model from (must not be shaped as a sequence). Ensure that this tensor is not too large, as K-Means will otherwise take very long (number of datapoints N, dimensionality D).

sample(num_sequences, sequence_length)

Samples the specified number of sequences of the specified length from the hidden Markov model.

Parameters
  • num_sequences (int) – The number of sequences to generate.

  • sequence_length (int) – The length of the sequences to generate. Generation of the hidden states is done via MarkovModel.sample; see its documentation for notes on the performance of long sequences.

Returns

Returns the sampled output whose shape depends on the type of output (number of sequences N, sequence length S):

  • gaussian: torch.Tensor [N, S, D] (dimensionality of Gaussians D).

  • discrete: torch.Tensor [N, S] where the values indicate the output state.

Return type

torch.Tensor [N, S, ?]
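
A sketch of the two output layouts, assuming the output configurations shown in the class signature:

    from pycave.bayes import HMM

    gaussian_hmm = HMM(num_states=4, output='gaussian', output_dim=3)
    print(gaussian_hmm.sample(10, 25).shape)  # torch.Size([10, 25, 3])

    discrete_hmm = HMM(num_states=4, output='discrete', output_num_states=6)
    print(discrete_hmm.sample(10, 25).shape)  # torch.Size([10, 25])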