How to keep up to date on known bugs and bug fixes in R packages?
Is there a standard R community resource for keeping up to date on known bugs or bug fixes for packages? My current approach is rather manual. (NB: I'm restricting this to CRAN - see Note 1.)
My use case is basically bug surveillance and the management of package updates. I've been averaging a couple of bug discoveries each month for a while (which I duly report to the authors ;-)). Since a lot of my work is done with virtual machines, I tend to update the VM images when I have a good handle on the bug status for necessary packages. When a bunch of bugs are fixed, I can remove my workarounds, which is great, and I update the images. When I discover an outbreak of bugs, I don't create a new image.
Here are the sources I'm currently using:
- NEWS files: Many, but not all, packages have NEWS files. These are certainly a helpful place to start.
- Package home page: Some packages do not have a NEWS file on CRAN, but separately post a change log on the author's site.
- R project-hosted mailing lists
- Google Groups for packages
- Personal communication with package authors
- Bug tracking for packages (e.g. a developer may use Bugzilla)
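For the NEWS source in particular, part of the manual reading can be scripted. The following is a minimal sketch, assuming the NEWS entries are available as a data frame with `Version` and `Text` columns, which is the shape `utils::news()` returns for packages whose NEWS file it can parse. The helper name `find_fix_entries` and the demo entries are made up for illustration.

```r
# Return the NEWS entries whose text mentions a bug or a fix.
# `db` is expected to look like the result of utils::news(): a data frame
# with at least the columns `Version` and `Text` (NULL if no NEWS was found).
find_fix_entries <- function(db, pattern = "bug|fix") {
  if (is.null(db) || nrow(db) == 0L) {
    return(data.frame(Version = character(), Text = character(),
                      stringsAsFactors = FALSE))
  }
  hits <- grepl(pattern, db$Text, ignore.case = TRUE)
  data.frame(Version = as.character(db$Version[hits]),
             Text = as.character(db$Text[hits]),
             stringsAsFactors = FALSE)
}

# Made-up entries standing in for a real NEWS database.
demo_news <- data.frame(
  Version = c("1.2.0", "1.2.0", "1.1.0"),
  Text = c("Fixed a crash when the input is empty.",
           "New argument 'verbose'.",
           "BUG FIX: joins on factor columns now work."),
  stringsAsFactors = FALSE
)
fix_entries <- find_fix_entries(demo_news)
print(fix_entries)

# With an installed package (hypothetical usage, not run here):
# find_fix_entries(utils::news(package = "data.table"))
```

A plain keyword match like this will miss fixes that are described in other words, so it is a first filter rather than a substitute for reading the NEWS file.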
It's one thing to be the first to discover a bug (I grant that bugs happen to all of us); it's another to belatedly discover a bug that is either already known or, better yet, already fixed. Both slow down my own coding, but better bug surveillance (maybe we need a `cdc4R` package :)) would significantly reduce the impact. Without a standard update alerting system (e.g. an extension to `update.packages()` that reports on which packages could be updated and links to info on what's changed), it's the user's job to seek out this information.
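As a rough illustration of what such an extension might report, here is a minimal sketch built around the matrix that `utils::old.packages()` returns (columns `Package`, `Installed`, `ReposVer`, among others). The function name `update_report` and the version numbers in the demo are assumptions for illustration. The link is only the package's CRAN page, since not every package publishes a NEWS file.

```r
# Build a small report from a matrix shaped like the result of
# utils::old.packages(): one row per outdated package, with the columns
# "Package", "Installed" and "ReposVer". old.packages() returns NULL when
# nothing is outdated, so that case yields an empty report.
update_report <- function(old) {
  if (is.null(old) || nrow(old) == 0L) {
    return(data.frame(Package = character(), Installed = character(),
                      Available = character(), Info = character(),
                      stringsAsFactors = FALSE))
  }
  pkgs <- unname(old[, "Package"])
  data.frame(
    Package = pkgs,
    Installed = unname(old[, "Installed"]),
    Available = unname(old[, "ReposVer"]),
    # CRAN package page; NEWS or ChangeLog links, if any, are listed there.
    Info = paste0("https://CRAN.R-project.org/package=", pkgs),
    stringsAsFactors = FALSE
  )
}

# Made-up rows standing in for real old.packages() output.
demo_old <- matrix(
  c("data.table", "1.8.0", "1.8.2",
    "digest",     "0.5.1", "0.5.2"),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Package", "Installed", "ReposVer"))
)
report <- update_report(demo_old)
print(report)

# Against a live repository (needs network access, not run here):
# update_report(utils::old.packages())
```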
As such a user, trying to seek out this information, is there some standard resource I've overlooked in the list above? For instance, is there an R mailing list where it's common for developers to post their changes and bug fixes? Or is there a site that aggregates such posts, posts tests (CRAN posts `R CMD check` output, it seems), or that gives some other feedback?
A few additional notes on other resources, for others' benefit:
- I see that CRANberries has a terse `diff` summary on packages, which is new to me. (I am inspired to consider a grep for `bug` or `fix` in the diff output.)
- `bug.report()` in R is a good way to send a message to R Core or the email address of a package maintainer.
- Several testing packages worth consideration are: `testthat`, `RUnit`, and `svUnit`.
- My personal "quick test" is to simply use `digest` to verify that results match, without having to test equality of very large objects.
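The `digest` quick test from the last note can be sketched as follows, assuming the digest package is installed. The helper name `same_result` and the toy data are made up for illustration.

```r
# Compare a fresh result against a stored hash instead of a stored object.
# Assumes the digest package is installed.
same_result <- function(object, expected_hash, algo = "md5") {
  identical(digest::digest(object, algo = algo), expected_hash)
}

baseline <- cumsum(1:1000)        # stand-in for a very large result
baseline_hash <- digest::digest(baseline, algo = "md5")

rerun <- cumsum(1:1000)           # e.g. the same computation after an update
changed <- rev(rerun)             # a result that no longer matches

print(same_result(rerun, baseline_hash))
print(same_result(changed, baseline_hash))
```

Because the hash is computed on the serialized object, two results that are numerically equal but stored differently (integer versus double, or with different attributes) will not match, so this check is stricter than `all.equal()`.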
Note 1: I'm tagging this cran because it's impossible to manage the universe of all R packages. An individual package author can distribute a package wherever they'd like, use whatever mailing list or bug tracking system they like, etc. However, that's outside the "mainstream" for R. Were I to release a package and alert users to changes, bugs, and bug fixes, I'd go with CRAN + NEWS + Bugzilla + Google Groups + R-Forge (and/or RForge), etc., but is there another standard reporting mechanism that is missing from this list?
In some sense, this note also serves to ask if there's a mechanism that developers are encouraged to use. I suspect there is no standard, as packages by R Core members seem to do many different things regarding bug and change reporting.
Note 2: I'm also adding administration (though something else may be more apropos), since this also relates to administering R. For reproducibility, administration of packages is quite important; when there are multiple users or more moving pieces, keeping aware of bugs and fixes becomes an administrative task, as well as an important consideration for development that depends on the external packages. If another tag, e.g. system-administration is more appropriate, I'm open to a change.
Comments (1)
Not a complete answer but here are some thoughts.

In the case of `data.table` we track bugs (and feature requests) on R-Forge. I imagine you could query R-Forge's tracker (programmatically) for all packages hosted there. To add to your list anyway. That web tracker is where `bug.report(package="data.table")` points to (not just an email address to maintainer).

Also, anyone can subscribe to any `<pkgname>[email protected]` mailing list to receive a unified diff and commit message (at the time of commit) for each project on R-Forge. I'm not aware of a general mailing list spanning any commit to any R-Forge project, though.

At the top of `?data.table` there is a link to up to the minute NEWS. This is how we communicate to users what is in the latest version (and in development) if they upgrade. That link updates in real-time; i.e., "up to the minute" is meant literally. But, they do have to check there!
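To see in advance where `bug.report()` would send you for a given installed package, one option is to read the `BugReports` field of its DESCRIPTION, which is what `bug.report()` opens when the field is present. This is a minimal sketch; the helper name `tracker_url` is made up for illustration, and packages without the field simply yield `NA`.

```r
# Look up the BugReports field of an installed package's DESCRIPTION.
# Returns NA when the field is absent or the package is not installed.
tracker_url <- function(pkg) {
  url <- suppressWarnings(
    utils::packageDescription(pkg, fields = "BugReports")
  )
  if (length(url) != 1L || is.na(url)) NA_character_ else as.character(url)
}

# Trackers for a few installed packages (swap in the ones you depend on).
pkgs <- c("utils", "stats")
trackers <- vapply(pkgs, tracker_url, character(1))
print(trackers)
```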