TensorFlow Federated model stuck at 0.1 accuracy
I'm trying to train a federated model on the MNIST dataset. I am using the code available at https://www.tensorflow.org/federated/tutorials/simulations for the setup.
The dataset version being used is the one from Keras (not the federated version from LEAF that is used in TFF). I'm partitioning it, saving it in a dictionary, and implementing my ClientData instance with tff.simulation.datasets.TestClientData, as sketched below.
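A minimal check of what that wrapper yields per client (using the names defined in the full code at the end; the expected element structure is my reading of how TestClientData slices the dictionary, not something from the tutorial):

# Each client id maps to an OrderedDict of numpy arrays; TestClientData
# turns it into a per-client tf.data.Dataset via tensor slicing.
ds = mnistFedTrain.create_tf_dataset_for_client(mnistFedTrain.client_ids[0])
print(ds.element_spec)
# Expected: OrderedDict('label': (10,) float32, 'pixels': (28, 28, 1) float32)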
Applying this change works just fine. However, if I change the model used in the simulation, every round gives me ~0.1 accuracy.
The model in the tutorial is as simple as it gets: an input layer of 28*28 = 784 neurons stacked over an output layer of dimension 10 with softmax activation:
model = tf.keras.models.Sequential([
    tf.keras.layers.InputLayer(input_shape=(784,)),
    tf.keras.layers.Dense(units=10, kernel_initializer='zeros'),
    tf.keras.layers.Softmax(),
])
And the new model is a CNN (note that its last Dense layer outputs raw logits, so the loss in model_fn below uses from_logits=True):
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 8, strides=2, padding="same",
                           activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPool2D(2, 1),
    tf.keras.layers.Conv2D(32, 4, strides=2, padding="valid", activation="relu"),
    tf.keras.layers.MaxPool2D(2, 1),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(10),
])
In the first case, accuracy changed from round to round, increasing and reaching 0.94 quite fast.
In the second case, I ran it for about 240 rounds with 3 fixed clients (20k elements each), 10 epochs, batch size 32, and still couldn't get out of the ~0.1 accuracy and ~2.3 loss.
The model works fine for this dataset. I already tested it in a centralized version and in a federated version using the Flower framework, reaching 0.99 accuracy. But for some reason I can't make it work on TFF.
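For reference, the centralized baseline was essentially plain Keras training on the same preprocessed arrays (a minimal sketch; the exact optimizer and epoch count here are assumptions, and it reuses create_cnn_model, X_train, etc. from the full code below):

model = create_cnn_model()
model.compile(
    optimizer=tf.keras.optimizers.SGD(0.02),  # assumed; any standard optimizer works
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.CategoricalAccuracy()])
model.fit(X_train, y_train, batch_size=32, epochs=10,
          validation_data=(X_test, y_test))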
Environment:
macOS Big Sur
tensorflow==2.8.0
tensorflow-federated==0.22.0
I expect the metrics and loss to change more than this. Could there be a problem with using other models?
Full code:
import collections
import time

import numpy as np
import tensorflow as tf
import tensorflow_federated as tff
from tensorflow.keras.datasets import cifar10, mnist
EPOCHS = 10
BATCH_SIZE = 32
# ROUND_CLIENTS <= NUM_CLIENTS
ROUND_CLIENTS = 3
NUM_CLIENTS = 3
NUM_ROUNDS = 400
def make_client(num_clients, X, y):
    """Partitions (X, y) evenly across `num_clients` simulated clients."""
    total_image_count = len(X)
    image_per_set = int(np.floor(total_image_count / num_clients))
    client_train_dataset = collections.OrderedDict()
    for i in range(1, num_clients + 1):
        client_name = i - 1
        start = image_per_set * (i - 1)
        end = image_per_set * i
        print(f"Adding data from {start} to {end} for client : {client_name}")
        data = collections.OrderedDict((('label', y[start:end]), ('pixels', X[start:end])))
        client_train_dataset[client_name] = data
    train_dataset = tff.simulation.datasets.TestClientData(client_train_dataset)
    return train_dataset
def preprocess(X: np.ndarray, y: np.ndarray):
    """Basic preprocessing for MNIST dataset."""
    X = np.array(X, dtype=np.float32) / 255  # scale pixels to [0, 1]
    X = X.reshape((X.shape[0], 28, 28, 1))   # add channel dimension
    y = np.array(y, dtype=np.int32)
    y = tf.keras.utils.to_categorical(y, num_classes=10)  # one-hot labels, shape (N, 10)
    return X, y
(X_train, y_train), (X_test, y_test) = mnist.load_data()
(X_train, y_train) = preprocess(X_train, y_train)
(X_test, y_test) = preprocess(X_test, y_test)
mnistFedTrain = make_client(NUM_CLIENTS, X_train, y_train)
def map_fn(example):
    # Rename features to the (x, y) structure expected by tff.learning.
    return collections.OrderedDict(
        x=example['pixels'],
        y=example['label'])

def client_data(client_id):
    ds = mnistFedTrain.create_tf_dataset_for_client(mnistFedTrain.client_ids[client_id])
    return ds.repeat(EPOCHS).shuffle(500).batch(BATCH_SIZE).map(map_fn)
train_data = [client_data(n) for n in range(ROUND_CLIENTS)]
element_spec = train_data[0].element_spec
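# Optional sanity check: the batched spec should be
# OrderedDict(x=(None, 28, 28, 1) float32, y=(None, 10) float32),
# i.e. it should match the CNN's input_shape and the one-hot labels.
print(element_spec)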
def create_cnn_model() -> tf.keras.Model:
    """Returns a sequential Keras CNN model."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 8, strides=2, padding="same",
                               activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPool2D(2, 1),
        tf.keras.layers.Conv2D(32, 4, strides=2, padding="valid", activation="relu"),
        tf.keras.layers.MaxPool2D(2, 1),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
def model_fn():
    model = create_cnn_model()
    return tff.learning.from_keras_model(
        model,
        input_spec=element_spec,
        loss=tf.keras.losses.CategoricalCrossentropy(
            from_logits=True, reduction=tf.losses.Reduction.NONE),
        metrics=[tf.keras.metrics.CategoricalAccuracy()])
trainer = tff.learning.build_federated_averaging_process(
    model_fn, client_optimizer_fn=lambda: tf.keras.optimizers.SGD(0.02))
def evaluate(num_rounds=NUM_ROUNDS):
    """Runs `num_rounds` federated training rounds and returns the final state."""
    state = trainer.initialize()
    for i in range(num_rounds):
        t1 = time.time()
        state, metrics = trainer.next(state, train_data)
        t2 = time.time()
        print('\n Round {r}: metrics {m}, round time {t:.2f} seconds'.format(
            m=metrics['train'], r=i, t=t2 - t1))
    return state
t1 = time.time()
state = evaluate(NUM_ROUNDS)
t2 = time.time()
print('Seconds:', t2 - t1, ' = Minutes:', (t2 - t1) / 60)
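For completeness, a held-out evaluation could be bolted on along these lines (a sketch, assuming TFF 0.22's build_federated_evaluation and get_model_weights APIs; test_data is hypothetical and would be built from X_test / y_test the same way train_data is built above):

# Sketch: federated evaluation on held-out client datasets.
evaluation = tff.learning.build_federated_evaluation(model_fn)
eval_metrics = evaluation(trainer.get_model_weights(state), test_data)
print('Eval metrics:', eval_metrics)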
I've had a similar problem with other models as well, e.g. MobileNetV2 implemented in TF for CIFAR-10:
model = tf.keras.applications.MobileNetV2((32, 32, 3), classes=10, weights=None)
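The wiring for that experiment follows the same pattern as model_fn above (a sketch; cifar_element_spec and the CIFAR-10 client pipeline are hypothetical and would be built analogously to the MNIST ones):

def mobilenet_model_fn():
    # MobileNetV2's default classifier head ends in softmax, so the loss
    # keeps the Keras default from_logits=False here.
    model = tf.keras.applications.MobileNetV2(
        input_shape=(32, 32, 3), classes=10, weights=None)
    return tff.learning.from_keras_model(
        model,
        input_spec=cifar_element_spec,  # hypothetical CIFAR-10 element spec
        loss=tf.keras.losses.CategoricalCrossentropy(),
        metrics=[tf.keras.metrics.CategoricalAccuracy()])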