LSTM with changing batch size during training

I'm trying to build an LSTM on app-log data from different users. I have one big dataframe consisting of the users' stacked app records, so for example the first 1500 rows are for user 1, the following 500 for user 2, etc. I'm now wondering whether it is possible to train the LSTM in such a way that the weights are updated after each user, which would mean changing the batch size after each update. For a better understanding: I want the LSTM to first take all records of user 1, which are 1500 rows, and make a weight update after processing them; after that it should take the 500 rows of user 2 and make a weight update after processing them, and so on.

I'm building the LSTM with Keras.

Is there a possibility to do so?

Thanks!

Comments (1)

与酒说心事 2025-01-26 18:58:26

I don't know your specific application scenario, but I'm assuming it's time series forecasting.

Build the LSTM model:

import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


class LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size, batch_size):
        super().__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.output_size = output_size
        self.num_directions = 1  # unidirectional LSTM
        self.batch_size = batch_size
        self.lstm = nn.LSTM(self.input_size, self.hidden_size, self.num_layers, batch_first=True)
        self.linear = nn.Linear(self.hidden_size, self.output_size)

    def forward(self, input_seq):
        # initial hidden and cell states: (num_directions * num_layers, batch_size, hidden_size)
        h_0 = torch.randn(self.num_directions * self.num_layers, self.batch_size, self.hidden_size).to(device)
        c_0 = torch.randn(self.num_directions * self.num_layers, self.batch_size, self.hidden_size).to(device)
        seq_len = input_seq.shape[1]
        # input: (batch_size, seq_len, input_size)
        input_seq = input_seq.view(self.batch_size, seq_len, self.input_size)
        # output: (batch_size, seq_len, num_directions * hidden_size)
        output, _ = self.lstm(input_seq, (h_0, c_0))
        # flatten so the linear layer maps every time step: (batch_size * seq_len, hidden_size)
        output = output.contiguous().view(self.batch_size * seq_len, self.hidden_size)
        pred = self.linear(output)                      # (batch_size * seq_len, output_size)
        pred = pred.view(self.batch_size, seq_len, -1)  # back to (batch_size, seq_len, output_size)
        pred = pred[:, -1, :]                           # keep only the last time step's prediction
        return pred
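
To make the expected shapes concrete, here is a quick usage sketch; the sizes (sequence length 30, 8 features, and taking user 1's 1500 records as 1500 sequences) are only assumptions:

# hypothetical sizes: 1500 sequences of length 30 with 8 features for user 1
model = LSTM(input_size=8, hidden_size=64, num_layers=1, output_size=1, batch_size=1500).to(device)
x = torch.randn(1500, 30, 8).to(device)   # (batch_size, seq_len, input_size)
print(model(x).shape)                     # torch.Size([1500, 1]): one prediction per sequence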

You can wrap each user's data in a Dataset and build a separate DataLoader for it, so that every user gets their own batch size.

Like this:

from torch.utils.data import Dataset, DataLoader


class MyDataset(Dataset):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, item):
        return self.data[item]

    def __len__(self):
        return len(self.data)


# train / test hold the (seq, label) samples of one split, B is that loader's batch size
Dtr = DataLoader(dataset=train, batch_size=B, shuffle=False, num_workers=0)
Dte = DataLoader(dataset=test, batch_size=B, shuffle=False, num_workers=0)
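
The training loop below indexes into Dtrs and batchsizes, with one entry per user. One way to build them, assuming the big dataframe has already been split into a hypothetical user_datasets list (one MyDataset per user), is:

# hypothetical: user_datasets is a list of MyDataset objects, one per user
Dtrs, batchsizes = [], []
for user_data in user_datasets:
    b = len(user_data)   # one batch = all records of this user
    batchsizes.append(b)
    Dtrs.append(DataLoader(dataset=user_data, batch_size=b, shuffle=False, num_workers=0))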

Then, we start training:

# Dtrs and batchsizes hold one DataLoader and one batch size per user (see above)
loss_function = nn.MSELoss()  # assumed loss for a forecasting task
for t in range(len(users)):
    # change batch size: rebuild the model with the current user's batch size
    b = batchsizes[t]
    model = LSTM(input_size, hidden_size, num_layers, output_size, batch_size=b).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed optimizer and learning rate
    if t != 0:
        # continue from the weights (and optimizer state) learned on the previous users
        checkpoint = torch.load(LSTM_PATH)
        model.load_state_dict(checkpoint['model'])
        optimizer.load_state_dict(checkpoint['optimizer'])
    model.train()
    Dtr = Dtrs[t]
    for i in range(epochs):
        cnt = 0
        for (seq, label) in Dtr:
            cnt += 1
            seq = seq.to(device)
            label = label.to(device)
            y_pred = model(seq)
            loss = loss_function(y_pred, label)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            if cnt % 100 == 0:
                print('epoch', i, 'batch', cnt, 'loss', loss.item())
    # save the current user's model so the next user starts from these weights
    state = {'model': model.state_dict(), 'optimizer': optimizer.state_dict()}
    torch.save(state, LSTM_PATH)

Sorry that the code above won't run as-is; since I don't know your data, it is only meant as a general framework.
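
Since the question mentions Keras: the same per-user update idea can also be sketched there by treating each user's records as a single batch and calling train_on_batch once per user. The array names and shapes below are assumptions, not the asker's actual data:

from tensorflow import keras
from tensorflow.keras import layers

timesteps, n_features = 30, 8                      # assumed sequence shape
model = keras.Sequential([
    layers.LSTM(64, input_shape=(timesteps, n_features)),
    layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')

# hypothetical: user_slices is a list of (X_user, y_user) arrays, e.g. user 1's
# 1500 sequences, then user 2's 500 sequences, and so on
for X_user, y_user in user_slices:
    # one gradient update on the whole user's data; the batch size is simply
    # the number of sequences that user has
    loss = model.train_on_batch(X_user, y_user)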
