Decorator: creates a component from a typehint-annotated Python function.

    tfx.v1.dsl.components.component(
        func,
        /,
        *,
        component_annotation: Optional[Type[system_executions.SystemExecution]] = None,
        use_beam: bool = False
    ) -> Union[BaseFunctionalComponentFactory,
               Callable[[types.FunctionType], BaseFunctionalComponentFactory]]

This decorator creates a component based on the typehint annotations specified for the arguments and return value of a Python function. The decorator can be given a component_annotation parameter, which hints at the system execution type that this Python function-based component belongs to.
Specifically, function arguments can be annotated with the following types and associated semantics:

- Parameter[T], where T is int, float, str, or bool: indicates that a primitive type execution parameter, whose value is known at pipeline construction time, will be passed for this argument. These parameters will be recorded in ML Metadata as part of the component's execution record. Can be an optional argument.
- int, float, str, bytes, bool, Dict, List: indicates that a primitive type value will be passed for this argument. This value is tracked as an Integer, Float, String, Bytes, Boolean or JsonValue artifact (see tfx.types.standard_artifacts) whose value is read and passed into the given Python component function. Can be an optional argument.
- InputArtifact[ArtifactType]: indicates that an input artifact object of type ArtifactType (deriving from tfx.types.Artifact) will be passed for this argument. This artifact is intended to be consumed as an input by this component (possibly reading from the path specified by its .uri). Can be an optional argument by specifying a default value of None.
- OutputArtifact[ArtifactType]: indicates that an output artifact object of type ArtifactType (deriving from tfx.types.Artifact) will be passed for this argument. This artifact is intended to be emitted as an output by this component (and written to the path specified by its .uri). Cannot be an optional argument.

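The annotation rules above can be sketched with the standard library alone. The Parameter, InputArtifact, OutputArtifact, Examples and Model classes below are hypothetical stand-ins defined only so the sketch runs without TFX installed, and classify_arguments is an illustrative helper, not the logic TFX actually uses.

```python
import inspect
from typing import Generic, TypeVar, get_origin

T = TypeVar("T")


# Hypothetical stand-ins for the TFX annotation classes (NOT the real
# tfx.dsl.components types); they only let this sketch run without TFX.
class Parameter(Generic[T]): ...
class InputArtifact(Generic[T]): ...
class OutputArtifact(Generic[T]): ...


# Hypothetical stand-ins for artifact types.
class Examples: ...
class Model: ...


PRIMITIVES = (int, float, str, bytes, bool, dict, list)


def classify_arguments(func):
    """Map each argument name to the kind of input its typehint implies."""
    kinds = {}
    for name, param in inspect.signature(func).parameters.items():
        hint = param.annotation
        origin = get_origin(hint)  # e.g. Parameter for Parameter[int]
        if origin is Parameter:
            kinds[name] = "execution parameter"
        elif origin is InputArtifact:
            kinds[name] = "input artifact"
        elif origin is OutputArtifact:
            # Output artifacts cannot be optional, so reject any default.
            if param.default is not inspect.Parameter.empty:
                raise ValueError(f"output artifact {name!r} cannot be optional")
            kinds[name] = "output artifact"
        elif hint in PRIMITIVES or origin in (dict, list):
            kinds[name] = "primitive value artifact"
        else:
            raise ValueError(f"unsupported typehint for {name!r}: {hint!r}")
    return kinds


def my_trainer(
    training_data: InputArtifact[Examples],
    model: OutputArtifact[Model],
    dropout_hyperparameter: float,
    num_iterations: Parameter[int] = 10,
):
    pass


kinds = classify_arguments(my_trainer)
```

Here kinds maps training_data to an input artifact, model to an output artifact, dropout_hyperparameter to a primitive value artifact, and num_iterations to an execution parameter.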
The return value typehint should be either empty or None, in the case of a component function that has no return values, or a TypedDict of primitive value types (int, float, str, bytes, bool, dict or list; or Optional[T], where T is a primitive type value, in which case None can be returned), to indicate that the return value is a dictionary with specified keys and value types.
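As a small illustration of such a return typehint, the sketch below declares a TypedDict with one Optional value using only the standard library. TrainerOutput and train_stub are hypothetical names chosen for this example, and the function is a plain Python function rather than a decorated component.

```python
from typing import Optional, TypedDict, get_type_hints


class TrainerOutput(TypedDict):
    """Hypothetical return type: keys map to primitive value types."""
    loss: float
    accuracy: float
    note: Optional[str]  # Optional[T] means None may be returned for this key


def train_stub() -> TrainerOutput:
    # A real component would compute these values; they are hard-coded here.
    return {"loss": 0.25, "accuracy": 0.9, "note": None}


# The declared keys and value types can be read back from the TypedDict.
output_types = get_type_hints(TrainerOutput)
result = train_stub()
```

The returned dictionary carries exactly the declared keys, and note is allowed to be None because its declared type is Optional[str].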
Note that output artifacts should not be included in the return value typehint; they should be included as OutputArtifact annotations in the function inputs, as described above.
The function to which this decorator is applied must be at the top level of its Python module (it may not be defined within nested classes or function closures).
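One way to tell a top-level function from a nested one is to compare its __qualname__ with its __name__. The is_top_level helper below is a hypothetical illustration of the restriction, not necessarily the check TFX performs.

```python
def is_top_level(func) -> bool:
    """Return True if func looks like it is defined at module top level.

    Functions defined inside other functions carry '<locals>' in their
    __qualname__, and functions defined inside classes carry the class
    name, so in both cases __qualname__ differs from __name__.
    """
    return func.__qualname__ == func.__name__


def make_nested():
    # Defined inside a function closure: not allowed for a component.
    def nested_component():
        pass
    return nested_component


class Holder:
    # Defined inside a class: not allowed for a component either.
    def method_component(self):
        pass


nested_ok = is_top_level(make_nested())
method_ok = is_top_level(Holder.method_component)
```

Both nested_ok and method_ok are False here, because each qualified name includes an enclosing scope.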
Example usage of a component definition using this decorator:

    from typing import TypedDict

    from tfx import v1 as tfx

    component = tfx.dsl.components.component
    InputArtifact = tfx.dsl.components.InputArtifact
    OutputArtifact = tfx.dsl.components.OutputArtifact
    Parameter = tfx.dsl.components.Parameter
    Examples = tfx.types.standard_artifacts.Examples
    Model = tfx.types.standard_artifacts.Model

    class MyOutput(TypedDict):
      loss: float
      accuracy: float

    @component(component_annotation=tfx.dsl.standard_annotations.Train)
    def MyTrainerComponent(
        training_data: InputArtifact[Examples],
        model: OutputArtifact[Model],
        dropout_hyperparameter: float,
        num_iterations: Parameter[int] = 10
    ) -> MyOutput:
      '''My simple trainer component.'''
      # read_examples and train_model are user-supplied helpers, not part of TFX.
      records = read_examples(training_data.uri)
      model_obj = train_model(records, num_iterations, dropout_hyperparameter)
      model_obj.write_to(model.uri)
      return {
          'loss': model_obj.loss,
          'accuracy': model_obj.accuracy
      }

Example usage in a pipeline graph definition:

    # ...
    trainer = MyTrainerComponent(
        training_data=example_gen.outputs['examples'],
        dropout_hyperparameter=other_component.outputs['dropout'],
        num_iterations=1000)
    pusher = Pusher(model=trainer.outputs['model'])
    # ...

When the component_annotation parameter is not supplied, it defaults to None. Another example usage, with component_annotation left as None:

    standard_artifacts = tfx.types.standard_artifacts

    @component
    def MyTrainerComponent(
        training_data: InputArtifact[standard_artifacts.Examples],
        model: OutputArtifact[standard_artifacts.Model],
        dropout_hyperparameter: float,
        num_iterations: Parameter[int] = 10
    ) -> MyOutput:  # MyOutput is the TypedDict defined in the example above.
      '''My simple trainer component.'''
      records = read_examples(training_data.uri)
      model_obj = train_model(records, num_iterations, dropout_hyperparameter)
      model_obj.write_to(model.uri)
      return {
          'loss': model_obj.loss,
          'accuracy': model_obj.accuracy
      }

When the use_beam parameter is True, one of the parameters of the decorated function must be type-annotated with BeamComponentParameter[beam.Pipeline], and its default value can only be None. It will be replaced by a Beam Pipeline made with the TFX pipeline's beam_pipeline_args, which is shared with other Beam-based components:

    import apache_beam as beam

    BeamComponentParameter = tfx.dsl.components.BeamComponentParameter

    @component(use_beam=True)
    def DataProcessingComponent(
        input_examples: InputArtifact[standard_artifacts.Examples],
        output_examples: OutputArtifact[standard_artifacts.Examples],
        beam_pipeline: BeamComponentParameter[beam.Pipeline] = None,
    ) -> None:
      '''My simple data processing component.'''
      # read_examples is a user-supplied helper, not part of TFX.
      records = read_examples(input_examples.uri)
      with beam_pipeline as p:
        ...

Returns | |
---|---|
An object that:
|
Raises | |
---|---|
EnvironmentError | if the current Python interpreter is not Python 3. |