I have a CSV file with health measurement data for many patients. Each patient has a different number of measurements. (Some patients come in frequently, some don't.) I am trying to build a next-value prediction model to predict each patient's risk of specific incidents.
Since the values are all in time order, I've tried using an LSTM to make the predictions. I am also concatenating all the patients' health data into one long column. (Please see the attachment.)
[Image: what I am feeding into the LSTM]
My LSTM model then produces results that look like a stock price prediction.
[Image: kind of like my result]
But I wonder whether there are better ways. I think my current method of concatenating all my patients' data is strange. Since the patients all have a different number of measurements, I am not sure whether I can feed them to the LSTM model in parallel. Or maybe I should use a random forest, because each patient's data has its own distribution? Thank you!
Comments (1)
Regarding the different lengths of your data, you can use padding and masking to bring all sequences to the same length (see the description of padding/masking with TensorFlow). Predicting sequence-based data with LSTMs is generally a reasonable approach, but I would advise you to look into GRUs instead of LSTMs, and also into Transformer architectures, because by now they have many advantages over LSTMs.
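To make the padding and masking idea concrete, here is a minimal sketch in plain Python. It keeps each patient as a separate sequence instead of concatenating them, pads the shorter sequences, and builds a mask marking which time steps are real. The `(patient_id, measurement)` row layout, the helper names `group_by_patient` and `pad_and_mask`, and the choice of `0.0` as the padding value are all assumptions made for illustration, not details from the question.

```python
from collections import defaultdict

PAD_VALUE = 0.0  # assumption: 0.0 never occurs as a real measurement


def group_by_patient(rows):
    """Collect measurements per patient, preserving the input (time) order."""
    sequences = defaultdict(list)
    for patient_id, value in rows:
        sequences[patient_id].append(float(value))
    return dict(sequences)


def pad_and_mask(sequences, pad_value=PAD_VALUE):
    """Right-pad every sequence to the longest length and build a 0/1 mask."""
    patient_ids = sorted(sequences)
    max_len = max(len(sequences[p]) for p in patient_ids)
    padded, mask = [], []
    for p in patient_ids:
        seq = sequences[p]
        n_pad = max_len - len(seq)
        padded.append(seq + [pad_value] * n_pad)   # real values, then padding
        mask.append([1] * len(seq) + [0] * n_pad)  # 1 = real step, 0 = padding
    return patient_ids, padded, mask


# Hypothetical (patient_id, measurement) rows, already sorted by time.
rows = [("a", 5.1), ("a", 5.4), ("a", 5.0), ("b", 7.2), ("c", 6.3), ("c", 6.1)]
patient_ids, padded, mask = pad_and_mask(group_by_patient(rows))
print(padded)
print(mask)
```

Each row of `padded` is then one patient, so the batch can be fed to a recurrent model in parallel. In Keras, one option is to add a trailing feature dimension and place a `tf.keras.layers.Masking(mask_value=0.0)` layer before the LSTM or GRU so that padded steps are skipped; if `0.0` can be a genuine measurement, a different padding value would be needed.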