Get a random sample from an API in R?
I have a group project for a class, and we need to create a Shiny app. The data we want to use comes from NYC OpenData and contains 6M records. We only want a random sample of it. My original thought was to pull some of the data and randomly sample from that, but the first 900k records only cover 12/2020, 01/2021, and 02/2021. If I want to get random months while pulling from the API, what can I do?
Here is my code:
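library(dplyr)     # glimpse(), sample_n()
library(jsonlite)  # fromJSON()
library(keyring)   # key_get()
# Socrata reads the app token from the `$$app_token` query parameter.
api_tokn <- paste0("$$app_token=", key_get("NYC_NINEONEONE"))
api_endpoint <- "https://data.cityofnewyork.us/resource/n2zq-pubd.json?"
api_limit <- "&$limit=900000"
#api_filter <- "&borough=BRONX"
# fromJSON() already returns a data frame, so the slice() wrapper (called with no row indices) is dropped.
nineoneone <- fromJSON(paste0(api_endpoint, api_tokn, api_limit))
class(nineoneone)
colnames(nineoneone)
glimpse(nineoneone)
sample_n(nineoneone, 10000)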
Thanks
Comments (1)
Please see this SO answer on getting the row count and random rows from a Socrata API (which NYC OpenData uses).
You can generate random row indices to iterate over in your API query like this:
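A minimal sketch of that idea in base R, not the linked answer's exact code. It assumes the dataset's total row count is already known (for example from a `$select=count(*)` query) and uses the SODA paging parameters `$limit`, `$offset`, and `$order`. The helper name `random_page_urls` and its arguments are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: build Socrata (SODA) request URLs for randomly chosen,
# non-overlapping pages of a dataset.
#   endpoint   resource URL without a trailing "?"
#   total_rows total number of rows in the dataset (assumed known)
#   n_pages    how many random pages to request
#   page_size  rows per page
#   token      optional app token, sent as `$$app_token`
random_page_urls <- function(endpoint, total_rows, n_pages, page_size,
                             token = NULL) {
  total_pages <- floor(total_rows / page_size)
  stopifnot(total_pages >= n_pages)
  # Pick distinct page numbers (0-based) and convert them to row offsets.
  pages <- sample.int(total_pages, n_pages) - 1L
  offsets <- pages * page_size
  # format(..., scientific = FALSE) keeps values such as 900000 from being
  # pasted into the URL as "9e+05".
  fmt <- function(x) format(x, scientific = FALSE, trim = TRUE)
  urls <- paste0(endpoint,
                 "?$order=:id",              # stable ordering while paging
                 "&$limit=", fmt(page_size),
                 "&$offset=", fmt(offsets))
  if (!is.null(token)) {
    urls <- paste0(urls, "&$$app_token=", token)
  }
  urls
}

# Usage sketch (makes network requests, so it is left commented out):
# endpoint <- "https://data.cityofnewyork.us/resource/n2zq-pubd.json"
# total <- as.numeric(jsonlite::fromJSON(paste0(endpoint, "?$select=count(*)"))[[1]])
# urls <- random_page_urls(endpoint, total, n_pages = 10, page_size = 1000)
# nineoneone_sample <- dplyr::bind_rows(lapply(urls, jsonlite::fromJSON))
```

Because the pages are drawn from across the whole dataset, the combined rows should span many months rather than only the first three. Rows within a page are still consecutive, so a smaller `page_size` with more pages gives a sample closer to truly random rows, at the cost of more requests.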