Which model can predict rare events? For example, when someone takes medicine

Posted on 2025-02-02 23:51:50

I'm trying to build a model that gives the probability of a person taking medicine under certain conditions. What I care about most is that the model predicts reasonably accurately when someone takes medicine. My dataframe has 1400 rows, and in about 134 of them the user takes medicine. It looks somewhat like the example below.

import pandas as pd

df = pd.DataFrame({'time_hour': ['6', '12', '18'],
                   'weekday': [6, 1, 3],
                   'previous_action': ['eat', 'sleep', 'eat'],
                   'take_medicine': [0, 1, 1]})

I've tried logistic regression and Bernoulli naive Bayes, but both only ever predict the most common outcome, which is the person not taking medicine. I've tried googling how to solve this, without success.
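To make the imbalance concrete, here is a minimal arithmetic sketch using only the counts stated above (1400 rows, about 134 positives). The always-predict-0 "model" is a hypothetical baseline, not one of the fitted models.

```python
import numpy as np

# Class counts taken from the question: 1400 rows, about 134 positives.
n_rows, n_pos = 1400, 134
y = np.zeros(n_rows, dtype=int)
y[:n_pos] = 1

# Hypothetical baseline that always predicts the majority class (0).
y_pred = np.zeros(n_rows, dtype=int)

accuracy = (y_pred == y).mean()              # 1266 / 1400
recall_pos = (y_pred[y == 1] == 1).mean()    # share of positives found

print(f"accuracy = {accuracy:.3f}")
print(f"recall on class 1 = {recall_pos:.3f}")
```

Such a baseline is right about 90% of the time while finding none of the medicine events, so accuracy alone says little here; recall on class 1 is closer to what matters in this question.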

I've looked at the data and the person takes medicine daily at 12 and 18, so I'm curious why the results are so bad. Is there another model that would suit this kind of problem better or should I be doing something differently?
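One thing that might be worth trying, sketched here under stated assumptions: one-hot encode the three predictors and let logistic regression reweight the classes with `class_weight="balanced"`. The data below is synthetic and purely illustrative (the 7 logged hours per day, the 0.335 rate and the helper `fit_and_score` are all assumptions, not taken from the real dataset).

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (hypothetical, not the real dataset):
# 200 days x 7 logged hours = 1400 rows, positives only at hours 12 and 18.
rng = np.random.default_rng(0)
n_days = 200
hours = np.tile([6, 9, 12, 15, 18, 21, 23], n_days)
weekday = np.repeat(np.arange(n_days) % 7, 7)
previous_action = rng.choice(["eat", "sleep"], size=hours.size)
taken = (np.isin(hours, [12, 18]) & (rng.random(hours.size) < 0.335)).astype(int)

df = pd.DataFrame({
    "time_hour": hours.astype(str),
    "weekday": weekday,
    "previous_action": previous_action,
    "take_medicine": taken,
})

predictors = ["time_hour", "weekday", "previous_action"]
X_train, X_test, y_train, y_test = train_test_split(
    df[predictors], df["take_medicine"],
    test_size=0.25, random_state=0, stratify=df["take_medicine"],
)

def fit_and_score(class_weight):
    """Fit one-hot + logistic regression and return recall on class 1."""
    model = make_pipeline(
        OneHotEncoder(handle_unknown="ignore"),
        LogisticRegression(class_weight=class_weight, max_iter=1000),
    )
    model.fit(X_train, y_train)
    return recall_score(y_test, model.predict(X_test))

recall_plain = fit_and_score(None)
recall_balanced = fit_and_score("balanced")
print("recall, unweighted:", recall_plain)
print("recall, balanced:  ", recall_balanced)
```

The one-hot step treats hour, weekday and previous action as categories rather than magnitudes, and the balanced weights make a missed positive cost more than a missed negative. Whether this helps on the real data depends on how strongly the positives actually cluster at particular hours.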

Here is an example of what I've done previously:

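from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

predictors = ['time_hour', 'weekday', 'previous_action']
X = df[predictors]
y = df['take_medicine']
# Hold out 25% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit Bernoulli naive Bayes and predict on the held-out rows
bert = BernoulliNB()
bert.fit(X_train, y_train)
y_pred = bert.predict(X_test)
y_pred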

Which returns

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
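One possible contributor to an all-zero output like this, assuming the predictors reached `BernoulliNB` as raw integer codes (the question does not say how they were encoded): with its default `binarize=0.0`, `BernoulliNB` maps every value above 0 to 1, so an hour column of 6, 12, 18 becomes a constant. The toy data below is hypothetical and only illustrates that effect.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: 200 days x 7 logged hours, medicine at 12 and 18.
hour = np.tile([6, 9, 12, 15, 18, 21, 23], 200).reshape(-1, 1)
y = np.isin(hour.ravel(), [12, 18]).astype(int)

# Raw integer codes: the default binarize=0.0 maps every hour > 0 to 1,
# so the feature is constant and only the class prior is left.
pred_raw = BernoulliNB().fit(hour, y).predict(hour)

# One-hot columns are already 0/1, so binarization keeps the information.
hour_onehot = OneHotEncoder().fit_transform(hour)
pred_onehot = BernoulliNB().fit(hour_onehot, y).predict(hour_onehot)

print("positives predicted, raw codes:", pred_raw.sum())
print("positives predicted, one-hot:  ", pred_onehot.sum())
```

On this toy data the raw-code model predicts no positives at all, while the one-hot model recovers all 400. If the real features were encoded as integers, this may be part of why the output is all zeros.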
