AttributeError: 'MapDataset' object has no attribute 'preprocess' in TensorFlow Federated (TFF)
I am working through this tutorial on federated learning with a non-IID distribution: https://www.tensorflow.org/federated/tutorials/tff_for_federated_learning_research_compression
In a previously posted question, tff.simulation.datasets.build_single_label_dataset() was suggested as a way to generate a non-IID distribution of the dataset.
I tried applying that first (see the code below) and got an error!
emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data(
    only_digits=False)

emnist_train1 = tff.simulation.datasets.build_single_label_dataset(
    emnist_train.create_tf_dataset_from_all_clients(),
    label_key='label', desired_label=1)
print(emnist_train1.element_spec)
OrderedDict([('label', TensorSpec(shape=(), dtype=tf.int32, name=None)), ('pixels', TensorSpec(shape=(28, 28), dtype=tf.float32, name=None))])
print(next(iter(emnist_train1))['label'])
tf.Tensor(1, shape=(), dtype=int32)
MAX_CLIENT_DATASET_SIZE = 418
CLIENT_EPOCHS_PER_ROUND = 1
CLIENT_BATCH_SIZE = 20
TEST_BATCH_SIZE = 500
def reshape_emnist_element(element):
    return (tf.expand_dims(element['pixels'], axis=-1), element['label'])

def preprocess_train_dataset(dataset):
    return (dataset
            .shuffle(buffer_size=MAX_CLIENT_DATASET_SIZE)
            .repeat(CLIENT_EPOCHS_PER_ROUND)
            .batch(CLIENT_BATCH_SIZE, drop_remainder=False)
            .map(reshape_emnist_element))
emnist_train1 = emnist_train1.preprocess(preprocess_train_dataset)
>> ---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-17-cda96c33a0f6> in <module>()
15 .map(reshape_emnist_element))
16
---> 17 emnist_train1 = emnist_train1.preprocess(preprocess_train_dataset)
AttributeError: 'MapDataset' object has no attribute 'preprocess'
Since the dataset has been filtered, the preprocessing cannot be applied! So, in this case, which label is it being filtered by?
... label_key='label', desired_label=1)
Which label in EMNIST does desired_label=1 correspond to?
My question is:
How can I apply the function tff.simulation.datasets.build_single_label_dataset() to get a non-IID dataset (with a different number of samples per client) in this particular tutorial, https://www.tensorflow.org/federated/tutorials/tff_for_federated_learning_research_compression, without running into this error about the filtered dataset?
Any help is appreciated!
Thanks a lot!
Comments (1)
通常用于表示联合算法中的一个用户。为了创建更多的批次,类似以下批次可以起作用:Possibly there is some confusion between the
tff.simulation.datasets.ClientData
andtf.data.Dataset
APIs that would be useful to cover.tf.data.Dataset
does not have apreprocess
method, withtff.simulation.datasets.ClientData.preprocess
does exist.However,
tff.simulation.datasets.build_single_label_dataset
usestf.data.Dataset
instances: both the input argument and the output result astf.data.Dataset
instances. In this case,emnist_train1
is atf.data.Dataset
which does not have apreprocess
method.However, all is not lost! The
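To make the distinction concrete, a small sketch (an illustration reusing the objects from the question, not part of the original answer):

    # emnist_train is a tff.simulation.datasets.ClientData, so .preprocess exists:
    emnist_train = emnist_train.preprocess(preprocess_train_dataset)

    # emnist_train1 came out of build_single_label_dataset and is a plain
    # tf.data.Dataset, so this raises the AttributeError from the question:
    emnist_train1 = emnist_train1.preprocess(preprocess_train_dataset)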
However, all is not lost! The preprocess_train_dataset function takes a tf.data.Dataset argument and returns a tf.data.Dataset result. This should mean that replacing:

    emnist_train1 = emnist_train1.preprocess(preprocess_train_dataset)

with

    emnist_train1 = preprocess_train_dataset(emnist_train1)

will create a tf.data.Dataset with only a single label ("label non-IID") that is shuffled, repeated, batched, and reshaped. Note that a single tf.data.Dataset is generally used to represent one user in the federated algorithm. To create more, with a random number of batches, something like the following could work:
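A minimal sketch of that idea, assuming the fixed emnist_train1 from above; NUM_CLIENTS and MAX_BATCHES_PER_CLIENT are hypothetical constants, not from the original answer:

    import numpy as np

    NUM_CLIENTS = 10              # hypothetical number of simulated clients
    MAX_BATCHES_PER_CLIENT = 20   # hypothetical cap on batches per client

    # Give each simulated client a randomly chosen number of batches taken from
    # the preprocessed single-label dataset, so the amount of data per client
    # varies (non-IID in quantity as well as in label).
    federated_train_data = [
        emnist_train1.take(np.random.randint(1, MAX_BATCHES_PER_CLIENT))
        for _ in range(NUM_CLIENTS)
    ]

Each element of federated_train_data is then a tf.data.Dataset that can stand in for one client's data when calling the training process in the tutorial.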