tfr.extension.pipeline.RankingPipeline
Class to set up the input, train and eval processes for a TF Ranking model.
tfr.extension.pipeline.RankingPipeline(
    context_feature_columns,
    example_feature_columns,
    hparams,
    estimator,
    label_feature_name='relevance',
    label_feature_type=tf.int64,
    dataset_reader=tfr.keras.pipeline.DatasetHparams.dataset_reader,
    best_exporter_metric=None,
    best_exporter_metric_higher_better=True,
    size_feature_name=None
)
An example use case is provided below:
import tensorflow as tf
import tensorflow_ranking as tfr

context_feature_columns = {
    "c1": tf.feature_column.numeric_column("c1", shape=(1,))
}
example_feature_columns = {
    "e1": tf.feature_column.numeric_column("e1", shape=(1,))
}

hparams = dict(
    train_input_pattern="/path/to/train/files",
    eval_input_pattern="/path/to/eval/files",
    train_batch_size=8,
    eval_batch_size=8,
    checkpoint_secs=120,
    num_checkpoints=1000,
    num_train_steps=10000,
    num_eval_steps=100,
    loss="softmax_loss",
    list_size=10,
    listwise_inference=False,
    convert_labels_to_binary=False,
    model_dir="/path/to/your/model/directory")

# See `tensorflow_ranking.estimator` for details about creating an estimator.
estimator = <create your own estimator>

ranking_pipeline = tfr.ext.pipeline.RankingPipeline(
    context_feature_columns,
    example_feature_columns,
    hparams,
    estimator=estimator,
    label_feature_name="relevance",
    label_feature_type=tf.int64)
ranking_pipeline.train_and_eval()
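Because every setting reaches the pipeline through the hparams dict, a missing or misspelled key may only surface once training starts. As a minimal sketch, assuming the keys shown in the example above are the ones your pipeline needs (the library source is the authority on that), a pre-flight check could look like the following. The helper missing_hparams and the constant EXPECTED_HPARAMS are hypothetical names and are not part of tensorflow_ranking.

```python
# Hypothetical pre-flight check; not part of tensorflow_ranking.
# EXPECTED_HPARAMS simply mirrors the keys used in the example above.
EXPECTED_HPARAMS = (
    "train_input_pattern", "eval_input_pattern", "train_batch_size",
    "eval_batch_size", "checkpoint_secs", "num_checkpoints",
    "num_train_steps", "num_eval_steps", "loss", "list_size",
    "listwise_inference", "convert_labels_to_binary", "model_dir",
)


def missing_hparams(hparams, expected=EXPECTED_HPARAMS):
    """Returns the expected keys that are absent from `hparams`, sorted."""
    return sorted(key for key in expected if key not in hparams)


example_hparams = dict(
    train_input_pattern="/path/to/train/files",
    eval_input_pattern="/path/to/eval/files",
    train_batch_size=8,
    eval_batch_size=8)

# Lists every key from the example that still has to be filled in.
print(missing_hparams(example_hparams))
```

Running the check before constructing the pipeline keeps configuration mistakes separate from training failures.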
Note that you may:
- pass best_exporter_metric and best_exporter_metric_higher_better for different model export strategies.
- pass dataset_reader to read other tf.data.Dataset formats. We recommend using TFRecord files and storing your data in tfr.data.ELWC format.
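To make the two export options concrete: an evaluation result is a dict of metric values, and a best-model exporter keeps a new export only when a comparison says the current result beats the best one so far. The following is an illustrative sketch of that comparison in plain Python, not the library's implementation. The function make_compare_fn and the metric key "metric/ndcg_5" are hypothetical names chosen for the example.

```python
def make_compare_fn(metric_key=None, higher_better=True):
    """Builds a function that says whether a new eval result is better.

    With metric_key=None the comparison falls back to the "loss" entry,
    where smaller is better, mirroring the documented default behavior.
    """
    if metric_key is None:
        metric_key, higher_better = "loss", False

    def compare_fn(best_eval_result, current_eval_result):
        best = best_eval_result[metric_key]
        current = current_eval_result[metric_key]
        return current > best if higher_better else current < best

    return compare_fn


best = {"loss": 0.40, "metric/ndcg_5": 0.71}
current = {"loss": 0.42, "metric/ndcg_5": 0.74}

by_loss = make_compare_fn()                  # minimal loss wins
by_ndcg = make_compare_fn("metric/ndcg_5")   # higher metric wins

print(by_loss(best, current))  # False: the loss got worse
print(by_ndcg(best, current))  # True: the metric improved
```

The same pair of results is rejected under the loss criterion and accepted under the metric criterion, which is why the choice of best_exporter_metric changes which model gets exported.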
If you want to further customize certain RankingPipeline behaviors, create a subclass of RankingPipeline and override the relevant methods. We recommend overriding only the following methods:
- _make_dataset, which builds the tf.data.Dataset for a TF-Ranking model.
- _make_serving_input_fn, which defines the input function for serving.
- _export_strategies, if you have more advanced needs for model exporting.
For example, if you want to remove the best exporters, you may override _export_strategies:
class NoBestExporterRankingPipeline(tfr.ext.pipeline.RankingPipeline):
    def _export_strategies(self, event_file_pattern):
        # The event file pattern is only needed by the best exporters.
        del event_file_pattern
        latest_exporter = tf.estimator.LatestExporter(
            "latest_model",
            serving_input_receiver_fn=self._make_serving_input_fn())
        return [latest_exporter]

ranking_pipeline = NoBestExporterRankingPipeline(
    context_feature_columns,
    example_feature_columns,
    hparams,
    estimator=estimator)
ranking_pipeline.train_and_eval()
If you want to customize how the dataset is read, you may override _make_dataset:
class CustomizedDatasetRankingPipeline(tfr.ext.pipeline.RankingPipeline):
    def _make_dataset(self,
                      batch_size,
                      list_size,
                      input_pattern,
                      randomize_input=True,
                      num_epochs=None):
        # Creates your own dataset; please follow `tfr.data.build_ranking_dataset`.
        dataset = build_my_own_ranking_dataset(...)
        ...
        return dataset.map(self._features_and_labels)

ranking_pipeline = CustomizedDatasetRankingPipeline(
    context_feature_columns,
    example_feature_columns,
    hparams,
    estimator=estimator)
ranking_pipeline.train_and_eval()
Args
| Argument | Description |
|---|---|
| context_feature_columns | (dict) Context (aka, query) feature columns. |
| example_feature_columns | (dict) Example (aka, document) feature columns. |
| hparams | (dict) A dict containing model hyperparameters. |
| estimator | (Estimator) An Estimator instance for model train and eval. |
| label_feature_name | (str) The name of the label feature. |
| label_feature_type | (tf.dtype) The value type of the label feature. |
| dataset_reader | (tf.data.Dataset) The dataset format for the input files. |
| best_exporter_metric | (str) Metric key for exporting the best model. If None, exports the model with the minimal loss value. |
| best_exporter_metric_higher_better | (bool) If a higher metric is better. This is only used if best_exporter_metric is not None. |
| size_feature_name | (str) If set, populates the feature dictionary with this name, and the corresponding value is a tf.int32 Tensor of shape [batch_size] indicating the actual sizes of the example lists before padding and truncation. If None, which is the default, this feature is not generated. |
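The size_feature_name description is easiest to see on a toy batch. The sketch below uses plain Python lists rather than tensors, and pad_or_truncate is a hypothetical helper, not a library function. Following the description above, it only illustrates that the recorded size is the original list length, taken before any padding or truncation to list_size.

```python
def pad_or_truncate(example_lists, list_size, pad_value=0.0):
    """Returns (fixed_length_lists, original_sizes) for a batch of lists."""
    # Sizes are recorded first, so they reflect the untouched input lists.
    sizes = [len(examples) for examples in example_lists]
    fixed = [
        list(examples[:list_size])
        + [pad_value] * max(0, list_size - len(examples))
        for examples in example_lists
    ]
    return fixed, sizes


batch = [[0.3, 0.9], [0.1, 0.5, 0.7, 0.2]]
fixed, sizes = pad_or_truncate(batch, list_size=3)

print(fixed)  # [[0.3, 0.9, 0.0], [0.1, 0.5, 0.7]]
print(sizes)  # [2, 4]
```

A model can use such a size feature to tell padded positions apart from real examples, which the padded values alone cannot always reveal.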
Methods
train_and_eval
train_and_eval(
    local_training=True
)
Launches train and evaluation jobs locally.
Last updated 2023-08-18 UTC.