TFX - How to inspect the records from CsvExampleGen
Question
How to inspect the data loaded into TFX CsvExampleGen?
CSV
The top 3 rows of california_housing_train.csv look like this:
longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value |
---|---|---|---|---|---|---|---|---|
-122.05 | 37.37 | 27 | 3885 | 661 | 1537 | 606 | 6.6085 | 344700 |
-118.3 | 34.26 | 43 | 1510 | 310 | 809 | 277 | 3.599 | 176500 |
-117.81 | 33.78 | 27 | 3589 | 507 | 1484 | 495 | 5.7934 | 270500 |
CsvExampleGen
The CSV is loaded into CsvExampleGen. As I understand it, an XXXExampleGen component generates TFRecord records, so I wonder whether there is a way to iterate through the records produced by CsvExampleGen.
from tfx.components import (
CsvExampleGen
)
housing = CsvExampleGen("sample_data/california_housing_train.csv")
housing
----------
CsvExampleGen(
spec: <tfx.types.standard_component_specs.FileBasedExampleGenSpec object at 0x7fcd90435450>,
executor_spec: <tfx.dsl.components.base.executor_spec.BeamExecutorSpec object at 0x7fcd90435850>,
driver_class: <class 'tfx.components.example_gen.driver.FileBasedDriver'>,
component_id: CsvExampleGen,
inputs: {},
outputs: {
'examples': OutputChannel(artifact_type=Examples,
producer_component_id=CsvExampleGen,
output_key=examples,
additional_properties={},
additional_custom_properties={})
}
)
Experiment
for record in housing.outputs['examples']:
print(record)
TypeError Traceback (most recent call last)
in
----> 1 for record in housing.outputs['examples']:
2 print(record)
TypeError: 'OutputChannel' object is not iterable
Comments (1)
Have you had a chance to look at the section in the tutorials that explains how to display the artifacts of the ExampleGen component? You can modify the code below (Source: TFX Tutorial) to achieve the same.
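The code the answer refers to is not reproduced on this page. The following is only a sketch of the approach, assuming that the component has already been run (for example with an InteractiveContext) so that the output channel actually holds an artifact, that the artifact directory uses the `Split-train` layout of recent TFX releases, and that the files are GZIP-compressed TFRecords of `tf.train.Example`. The helper names `split_uri`, `list_record_files` and `print_examples` are made up for illustration and are not part of TFX.

```python
import os


def split_uri(examples_uri, split="train"):
    """Return the directory holding one split of an Examples artifact.

    Assumption: the 'Split-<name>' layout used by recent TFX releases
    (older releases may use a plain '<name>' directory instead).
    """
    return os.path.join(examples_uri, f"Split-{split}")


def list_record_files(directory):
    """Return the sorted full paths of all files in `directory`."""
    return sorted(os.path.join(directory, name) for name in os.listdir(directory))


def print_examples(examples_uri, split="train", limit=3):
    """Decode and print the first `limit` tf.train.Example records of a split."""
    # Imported lazily so the two helpers above can be used without TensorFlow.
    import tensorflow as tf

    files = list_record_files(split_uri(examples_uri, split))
    # Assumption: ExampleGen wrote GZIP-compressed TFRecord files.
    dataset = tf.data.TFRecordDataset(files, compression_type="GZIP")
    for raw in dataset.take(limit):
        example = tf.train.Example()
        example.ParseFromString(raw.numpy())
        print(example)


# Possible usage in a notebook, after the component has been executed:
#   context = InteractiveContext()
#   context.run(housing)
#   print_examples(housing.outputs['examples'].get()[0].uri)
```

The point of the sketch is that the `OutputChannel` is only a handle to the artifact produced when the component runs, which is why iterating over it fails. The records themselves live in files under the artifact's `uri`. Note also that, as far as I know, `input_base` of CsvExampleGen is expected to be a directory containing the CSV files rather than the path of a single file, so the constructor call in the question may need adjusting before the component can be run.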