Module: tf_agents.bandits.agents.neural_linucb_agent
Implements the Neural + LinUCB bandit algorithm.
Applies LinUCB on top of an encoding network.
Since LinUCB is a linear method, the encoding network is used to capture the
non-linear relationship between the context features and the expected rewards.
The encoding network may or may not be pre-trained; if it is not, the
method can optionally train it using epsilon-greedy exploration.
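The idea described above can be sketched in plain NumPy. This is a minimal illustration, not the TF-Agents implementation: `encode`, `LinUCBOnEncoding`, `epsilon_greedy_action`, and the parameters `alpha` and `ridge` are hypothetical names chosen for this example, and the single tanh layer merely stands in for an arbitrary encoding network. LinUCB statistics are kept per action over the encoded context rather than the raw context, and a separate epsilon-greedy rule shows how actions might be chosen while the encoder is still being trained.

```python
import numpy as np


def encode(context, weights):
    """Hypothetical stand-in for the encoding network: a single tanh layer."""
    return np.tanh(context @ weights)


class LinUCBOnEncoding:
    """Per-action LinUCB statistics kept over encoded contexts (illustrative)."""

    def __init__(self, num_actions, encoding_dim, alpha=1.0, ridge=1.0):
        self.alpha = alpha  # Scales the confidence bonus.
        # For each action a: A_a = ridge * I + sum(z z^T) and b_a = sum(r * z).
        self.cov = np.stack([ridge * np.eye(encoding_dim)] * num_actions)
        self.data_vec = np.zeros((num_actions, encoding_dim))

    def ucb_scores(self, z):
        """Returns estimated reward plus confidence bonus for every action."""
        scores = np.empty(len(self.cov))
        for a, (a_mat, b_vec) in enumerate(zip(self.cov, self.data_vec)):
            theta = np.linalg.solve(a_mat, b_vec)  # Ridge-regression weights.
            a_inv_z = np.linalg.solve(a_mat, z)
            scores[a] = theta @ z + self.alpha * np.sqrt(z @ a_inv_z)
        return scores

    def update(self, action, z, reward):
        """Adds one (encoded context, reward) observation for the chosen action."""
        self.cov[action] += np.outer(z, z)
        self.data_vec[action] += reward * z


def epsilon_greedy_action(predicted_rewards, epsilon, rng):
    """Action rule for the optional phase in which the encoder is still training."""
    if rng.random() < epsilon:
        return int(rng.integers(len(predicted_rewards)))
    return int(np.argmax(predicted_rewards))


# Usage: one interaction round with an encoder assumed to be already trained.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 2))
bandit = LinUCBOnEncoding(num_actions=3, encoding_dim=2)
z = encode(rng.normal(size=4), weights)
action = int(np.argmax(bandit.ucb_scores(z)))
bandit.update(action, z, reward=1.0)
```

Because the linear model only ever sees `z`, any non-linearity in the reward has to be captured by the encoder, which is the division of labour the module description outlines.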
Reference:
Carlos Riquelme, George Tucker, Jasper Snoek, "Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling", ICLR 2018.
Classes
class NeuralLinUCBAgent: An agent implementing the LinUCB algorithm on top of a neural network.
class NeuralLinUCBVariableCollection: A collection of variables used by NeuralLinUCBAgent.
Other Members
absolute_import: Instance of __future__._Feature
division: Instance of __future__._Feature
print_function: Instance of __future__._Feature
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-04-26 UTC.