tfp.glm.fit
Runs multiple Fisher scoring steps.
tfp.glm.fit(
    model_matrix,
    response,
    model,
    model_coefficients_start=None,
    predicted_linear_response_start=None,
    l2_regularizer=None,
    dispersion=None,
    offset=None,
    convergence_criteria_fn=None,
    learning_rate=None,
    fast_unsafe_numerics=True,
    maximum_iterations=None,
    l2_regularization_penalty_factor=None,
    name=None
)
Args

model_matrix: (Batch of) float-like, matrix-shaped Tensor where each row represents a sample's features.
response: (Batch of) vector-shaped Tensor where each element represents a sample's observed response (to the corresponding row of features). Must have the same dtype as model_matrix.
model: tfp.glm.ExponentialFamily-like instance which implicitly characterizes a negative log-likelihood loss by specifying the distribution's mean, gradient_mean, and variance.
model_coefficients_start: Optional (batch of) vector-shaped Tensor representing the initial model coefficients, one for each column in model_matrix. Must have the same dtype as model_matrix. Default value: zeros.
predicted_linear_response_start: Optional Tensor with shape and dtype matching response; represents the offset-shifted initial linear predictions based on model_coefficients_start. Default value: offset if model_coefficients_start is None, and tf.linalg.matvec(model_matrix, model_coefficients_start) + offset otherwise.
l2_regularizer: Optional scalar Tensor representing the L2 regularization penalty, i.e., loss(w) = sum{-log p(y[i]|x[i],w) : i=1..n} + l2_regularizer ||w||_2^2. Default value: None (i.e., no L2 regularization).
dispersion: Optional (batch of) Tensor representing response dispersion, i.e., as in p(y|theta) := exp((y theta - A(theta)) / dispersion). Must broadcast with rows of model_matrix. Default value: None (i.e., "no dispersion").
offset: Optional Tensor representing a constant shift applied to predicted_linear_response. Must broadcast to response. Default value: None (i.e., tf.zeros_like(response)). See the offset sketch following this argument list.
convergence_criteria_fn: Python callable taking is_converged_previous, iter_, model_coefficients_previous, predicted_linear_response_previous, model_coefficients_next, predicted_linear_response_next, response, model, dispersion and returning a bool Tensor indicating that Fisher scoring has converged. See convergence_criteria_small_relative_norm_weights_change as an example function, and the custom-criterion sketch following this argument list. Default value: None (i.e., convergence_criteria_small_relative_norm_weights_change).
learning_rate: Optional (batch of) scalar Tensor used to dampen iterative progress. Typically only needed if optimization diverges; it should be no larger than 1 and is typically very close to 1. Default value: None (i.e., 1).
fast_unsafe_numerics: Optional Python bool indicating whether faster, less numerically accurate methods can be employed for computing the weighted least-squares solution. Default value: True (i.e., "fast but possibly diminished accuracy").
maximum_iterations: Optional maximum number of iterations of Fisher scoring to run; "and-ed" with the result of convergence_criteria_fn. Default value: None (i.e., infinity).
l2_regularization_penalty_factor: Optional (batch of) vector-shaped Tensor representing a separate penalty factor to apply to each model coefficient, with length equal to the number of columns in model_matrix. Each penalty factor multiplies l2_regularizer to allow differential regularization; it can be 0 for some coefficients, which implies no regularization. The default is 1 for all coefficients: loss(w) = sum{-log p(y[i]|x[i],w) : i=1..n} + l2_regularizer ||w * l2_regularization_penalty_factor||_2^2. Default value: None (i.e., no per-coefficient regularization). See the regularization sketch following this argument list.
name: Python str used as a name prefix for ops created by this function. Default value: "fit".
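As a minimal sketch of the offset argument (placeholder tensors x, y, and exposure are assumptions, not part of this page's example), the offset is simply added to tf.linalg.matvec(model_matrix, model_coefficients) when forming the linear predictions, which is how a known log-exposure term is usually supplied:

# Sketch only: `x`, `y`, and `exposure` are assumed placeholder data.
import tensorflow as tf
import tensorflow_probability as tfp

x = tf.random.normal([100, 3])                    # design matrix
exposure = tf.random.uniform([100], 1., 10.)      # known per-sample exposure
y = tf.cast(tf.random.uniform([100], 0, 5, dtype=tf.int32), tf.float32)  # counts

# The offset is added to the linear predictor; here it carries the log-exposure.
coeffs, linear_response, converged, _ = tfp.glm.fit(
    model_matrix=x,
    response=y,
    model=tfp.glm.Poisson(),
    offset=tf.math.log(exposure))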
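As a sketch of the convergence_criteria_fn signature (the tolerance and the name small_coefficient_change are assumptions; the library default is convergence_criteria_small_relative_norm_weights_change):

import tensorflow as tf

def small_coefficient_change(is_converged_previous, iter_,
                             model_coefficients_previous,
                             predicted_linear_response_previous,
                             model_coefficients_next,
                             predicted_linear_response_next,
                             response, model, dispersion):
  # Declare convergence once no coefficient moved by more than 1e-6 (assumed
  # tolerance). Must return a `bool` Tensor.
  delta = tf.reduce_max(
      tf.abs(model_coefficients_next - model_coefficients_previous), axis=-1)
  return (iter_ > 0) & (delta < 1e-6)

# Passed as: tfp.glm.fit(..., convergence_criteria_fn=small_coefficient_change)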
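And a sketch of differential L2 regularization via l2_regularizer together with l2_regularization_penalty_factor (placeholder data; leaving the first, intercept-like column unpenalized is an assumption for illustration):

# Sketch only: the effective penalty on coefficient j is
# l2_regularizer * (penalty_factor[j] * w[j])**2, so a factor of 0. leaves
# that coefficient unregularized.
import tensorflow as tf
import tensorflow_probability as tfp

x = tf.concat([tf.ones([200, 1]), tf.random.normal([200, 4])], axis=-1)
y = tf.cast(tf.random.uniform([200]) > 0.5, tf.float32)
penalty_factor = tf.constant([0., 1., 1., 1., 1.])  # don't penalize column 0

coeffs, _, _, _ = tfp.glm.fit(
    model_matrix=x,
    response=y,
    model=tfp.glm.Bernoulli(),
    l2_regularizer=tf.constant(1e-2),
    l2_regularization_penalty_factor=penalty_factor)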
Returns

model_coefficients: (Batch of) vector-shaped Tensor representing the fitted model coefficients, one for each column in model_matrix.
predicted_linear_response: response-shaped Tensor representing the linear predictions based on the new model_coefficients, i.e., tf.linalg.matvec(model_matrix, model_coefficients) + offset.
is_converged: bool Tensor indicating whether the returned model_coefficients met the convergence_criteria_fn criteria within the maximum_iterations limit.
iter_: int32 Tensor indicating the number of iterations taken.
Example
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Synthesize a (model_matrix, response) dataset whose true coefficient vector
# has L2 norm sqrt(2), under the requested link.
def make_dataset(n, d, link, scale=1., dtype=np.float32):
  model_coefficients = tfd.Uniform(
      low=np.array(-1, dtype),
      high=np.array(1, dtype)).sample(d, seed=42)
  radius = np.sqrt(2.)
  model_coefficients *= radius / tf.linalg.norm(model_coefficients)
  model_matrix = tfd.Normal(
      loc=np.array(0, dtype),
      scale=np.array(1, dtype)).sample([n, d], seed=43)
  scale = tf.convert_to_tensor(scale, dtype)
  linear_response = tf.tensordot(
      model_matrix, model_coefficients, axes=[[1], [0]])
  if link == 'linear':
    response = tfd.Normal(loc=linear_response, scale=scale).sample(seed=44)
  elif link == 'probit':
    response = tf.cast(
        tfd.Normal(loc=linear_response, scale=scale).sample(seed=44) > 0,
        dtype)
  elif link == 'logit':
    response = tfd.Bernoulli(logits=linear_response).sample(seed=44)
  else:
    raise ValueError('unrecognized true link: {}'.format(link))
  return model_matrix, response, model_coefficients

X, Y, w_true = make_dataset(n=int(1e6), d=100, link='probit')

# Fit a Bernoulli-probit GLM by Fisher scoring.
w, linear_response, is_converged, num_iter = tfp.glm.fit(
    model_matrix=X,
    response=Y,
    model=tfp.glm.BernoulliNormalCDF())
log_likelihood = tfp.glm.BernoulliNormalCDF().log_prob(Y, linear_response)

print('is_converged: ', is_converged.numpy())
print('    num_iter: ', num_iter.numpy())
print('    accuracy: ', np.mean((linear_response > 0.) == tf.cast(Y, bool)))
print('    deviance: ', 2. * np.mean(log_likelihood))
print('||w0-w1||_2 / (1+||w0||_2): ', (np.linalg.norm(w_true - w, ord=2) /
                                       (1. + np.linalg.norm(w_true, ord=2))))

# ==>
# is_converged:  True
#     num_iter:  6
#     accuracy:  0.804382
#     deviance:  -0.820746600628
# ||w0-w1||_2 / (1+||w0||_2):  0.00619245105309
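As a follow-up sketch (an addition continuing the example above, not part of the original): the fitted coefficients can be applied to held-out data exactly as described for predicted_linear_response; X_new is an assumed new design matrix with the same 100 columns.

# Score held-out data with the fitted coefficients (sketch; X_new is assumed).
X_new = tf.random.normal([1000, 100])
new_linear_response = tf.linalg.matvec(X_new, w)
# For the Bernoulli-probit model, predicted P(Y=1) is the standard normal CDF
# of the linear response.
new_probs = tfd.Normal(loc=0., scale=1.).cdf(new_linear_response)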