How do I get all values (ID, date, and calories) for the first and last dates of a date range?

Posted on 2025-01-25 03:00:57

First off, Stack Overflow keeps saying there are answers already, but I've been looking for 2.5 hours now and haven't found anything that works.

I'm trying to view values from a data frame with 940 rows. I would like to see the calories associated with each user ID on the first and last dates of the trial.

            Id ActivityDay Calories
1   1503960366  2016-04-12     1985
2   1624580081  2016-04-12     1432
3   1644430081  2016-04-12     3199
4   1844505072  2016-04-12     2030
5   1927972279  2016-04-12     2220
6   2022484408  2016-04-12     2390
7   2026352035  2016-04-12     1459
8   2320127002  2016-04-12     2124
9   2347167796  2016-04-12     2344
10  2873212765  2016-04-12     1982
11  3372868164  2016-04-12     1788
12  3977333714  2016-04-12     1450
13  4020332650  2016-04-12     3654
14  4057192912  2016-04-12     2286
15  4319703577  2016-04-12     2115
16  4388161847  2016-04-12     2955
17  4445114986  2016-04-12     2113
18  4558609924  2016-04-12     1909
19  4702921684  2016-04-12     2947
20  5553957443  2016-04-12     2026
21  5577150313  2016-04-12     3405
22  6117666160  2016-04-12     1496
23  6290855005  2016-04-12     2560
24  6775888955  2016-04-12     1841
25  6962181067  2016-04-12     1994
26  7007744171  2016-04-12     2937
27  7086361926  2016-04-12     2772
28  8053475328  2016-04-12     3186
29  8253242879  2016-04-12     2044
30  8378563200  2016-04-12     3635
31  8583815059  2016-04-12     2650
32  8792009665  2016-04-12     2044
33  8877689391  2016-04-12     3921
34  1503960366  2016-04-13     1797
35  1624580081  2016-04-13     1411
36  1644430081  2016-04-13     2902
37  1844505072  2016-04-13     1860
38  1927972279  2016-04-13     2151
39  2022484408  2016-04-13     2601
40  2026352035  2016-04-13     1521
41  2320127002  2016-04-13     2003
42  2347167796  2016-04-13     2038
43  2873212765  2016-04-13     2004
44  3372868164  2016-04-13     2093
45  3977333714  2016-04-13     1495
46  4020332650  2016-04-13     1981
47  4057192912  2016-04-13     2306
48  4319703577  2016-04-13     2135
49  4388161847  2016-04-13     3092
50  4445114986  2016-04-13     2095
51  4558609924  2016-04-13     1722
52  4702921684  2016-04-13     2898

This is the sample data, omitting the other nearly 900 rows.
I want to keep only the dates 2016-04-12 and 2016-05-12. That is the range over which the data was collected. I'd like to see the users' IDs and their calories for those two dates only.

I've tried about 50 different snippets. Here is where I am right now:

Daily_Calories %>% 
  group_by(Id, Calories) %>%
  arrange(ActivityDay) %>% 
  as.data.frame()

I have not saved all the code I've tried, as I'm new and RStudio gets messy and unorganized quickly, and then I get a bit lost.

I've also tried:

Daily_Calories %>% 
  group_by(Id, Calories) %>%
  group_by(min(ActivityDay), max(ActivityDay)) %>% 
  arrange(ActivityDay) %>%
  as.data.frame()

and got this:

            Id ActivityDay Calories min(ActivityDay) max(ActivityDay)
1   1503960366  2016-04-12     1985       2016-04-12       2016-05-12
2   1624580081  2016-04-12     1432       2016-04-12       2016-05-12
3   1644430081  2016-04-12     3199       2016-04-12       2016-05-12
4   1844505072  2016-04-12     2030       2016-04-12       2016-05-12
5   1927972279  2016-04-12     2220       2016-04-12       2016-05-12
6   2022484408  2016-04-12     2390       2016-04-12       2016-05-12
7   2026352035  2016-04-12     1459       2016-04-12       2016-05-12
8   2320127002  2016-04-12     2124       2016-04-12       2016-05-12
9   2347167796  2016-04-12     2344       2016-04-12       2016-05-12
10  2873212765  2016-04-12     1982       2016-04-12       2016-05-12
11  3372868164  2016-04-12     1788       2016-04-12       2016-05-12
12  3977333714  2016-04-12     1450       2016-04-12       2016-05-12

and then tried this:

Daily_Calories %>% 
  group_by(Id, Calories) %>%
  arrange(ActivityDay) %>%
  summarise(min(ActivityDay), max(ActivityDay)) %>% 
  as.data.frame()

and got this:

            Id Calories min(ActivityDay) max(ActivityDay)
1   1503960366        0       2016-05-12       2016-05-12
2   1503960366     1728       2016-04-17       2016-04-17
3   1503960366     1740       2016-05-08       2016-05-08
4   1503960366     1745       2016-04-15       2016-04-15
5   1503960366     1775       2016-04-21       2016-04-21
6   1503960366     1776       2016-04-14       2016-04-14
7   1503960366     1783       2016-05-11       2016-05-11
8   1503960366     1786       2016-04-20       2016-04-20
9   1503960366     1788       2016-04-24       2016-04-24

I'm not looking for the minimum and maximum calories, just the "minimum" and "maximum" dates, meaning 2016-04-12 and 2016-05-12.
All three of the attempts above had 700+ rows omitted from the results, which tells me they are wrong. There are 33 users and 2 dates, so the result should have 66 rows.

I hope this is explained well enough; I'm trying to get better at asking questions. I appreciate the time and help.

Almost forgot: I don't want to create a new data frame, just see the results. That's why my code starts with just the data frame. Does that make a difference? I'd prefer to view the results in the console. Cheers!


兰花执着 2025-02-01 03:00:57

If I understand you correctly, you want to keep all observations in the data frame where ActivityDay is either 2016-04-12 or 2016-05-12, correct? Or do you want to view all values in the range between them?

If so, try:

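# df stands in for your data frame (Daily_Calories in the question)
keeps <- c("2016-04-12", "2016-05-12")

# Keep only the rows that fall on those two dates
# (as.character() makes the comparison work whether the column is Date or text)
df[as.character(df$ActivityDay) %in% keeps, ]

# Keep every row in the range between the two dates (inclusive)
df[as.Date(df$ActivityDay) %in% seq(min(as.Date(keeps)), max(as.Date(keeps)), by = 1), ]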

This will show values for the dates that you want.
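If you would rather not type the two dates by hand, one possible variation is to let `range()` find the first and last dates in the column itself. The sketch below is only an illustration under a few assumptions: the tiny `Daily_Calories` data frame is rebuilt here from five rows shown in the question (it is not the full 940-row data), `ActivityDay` is assumed to be a `Date` column, and the name `first_last` is made up so the result can be inspected.

```r
# Miniature stand-in for Daily_Calories, built from rows shown in the question.
# The real data frame has 940 rows; this is only for illustration.
Daily_Calories <- data.frame(
  Id = c(1503960366, 1624580081, 1503960366, 1624580081, 1503960366),
  ActivityDay = as.Date(c("2016-04-12", "2016-04-12", "2016-04-13",
                          "2016-04-13", "2016-05-12")),
  Calories = c(1985, 1432, 1797, 1411, 0)
)

# range() returns the earliest and latest date in the column,
# so %in% keeps only the rows that fall on one of those two days.
on_first_or_last <- Daily_Calories$ActivityDay %in% range(Daily_Calories$ActivityDay)
first_last <- Daily_Calories[on_first_or_last, ]

# Sort by date, then by user, to make the output easier to read.
first_last <- first_last[order(first_last$ActivityDay, first_last$Id), ]

# Printing shows the result in the console; nothing else is changed.
print(first_last)
```

Assigning to `first_last` is optional. Running the subsetting expression on its own prints the rows to the console and leaves `Daily_Calories` untouched. Since the question uses dplyr, the same condition should also work inside `filter()`, for example `Daily_Calories %>% filter(ActivityDay %in% range(ActivityDay))`.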

I was unclear as to what your final data would look like - if I misunderstood, let me know and I will modify my answer. Good luck!
