Using pandoc to convert IPython notebook files (JSON-based) to other formats

Posted 2024-12-25 13:23:56

When I try to use pandoc to convert the JSON-based files (.ipynb) from IPython notebook (0.12), I receive an error stating "bad decodeArgs" for the JSON. I suspect it may be due to the Ubuntu-provided version of pandoc that I am using (1.8.1.1). It seems that getting the latest pandoc version requires setting up the Haskell Platform, which I did not manage to do because of dependency problems (and which I really don't want to do). I don't want to spend any more time trying to install Haskell if this is not my problem.
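Before rebuilding anything, it can help to confirm which pandoc version is actually on the PATH. The following is a minimal Python sketch, not part of pandoc itself: the helper names are hypothetical, and it assumes the first line of `pandoc --version` has the form `pandoc 1.8.1.1`.

```python
import re
import shutil
import subprocess


def parse_pandoc_version(version_output):
    """Extract a version tuple from the first line of `pandoc --version` output."""
    first_line = version_output.splitlines()[0] if version_output else ""
    # Assumption: the first line contains a dotted version number such as 1.8.1.1.
    match = re.search(r"(\d+(?:\.\d+)+)", first_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def installed_pandoc_version():
    """Return the installed pandoc version as a tuple, or None if pandoc is not on PATH."""
    if shutil.which("pandoc") is None:
        return None
    result = subprocess.run(["pandoc", "--version"], capture_output=True, text=True)
    return parse_pandoc_version(result.stdout)
```

Tuples compare element by element, so a check like `installed_pandoc_version() < (1, 9)` would tell you whether the packaged version predates 1.9 (guard against `None` first).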

Is there a way to get the latest pandoc binaries for Ubuntu without rebuilding it?

Given that IPython notebook is new (and very cool!!), it would be nice to hear about experiences with translating the JSON to other formats. Perhaps there is a way to accomplish this other than pandoc.

Comments (2)

云雾 2025-01-01 13:23:56

Regarding "keeping up to date with Pandoc": I'm afraid you do need Haskell installed. The best way to do this is via the Haskell Platform ("HP") package; then, just as with Ruby, it's a lot more consistent to use the environment's own package manager for dependencies than your OS's. I've had no trouble getting it working, even in Windoze...

I'm sure questions to the Haskell mailing list would get quick help for a platform as mainstream as Debian/Ubuntu, but you might need to manually install a newer version of HP than what's available through the OS package manager.

Once you get HP up and running, the dev Pandoc is dead easy to compile, and git will keep you up to date with the latest. Specific instructions, currently maintained, are here:
https://github.com/jgm/pandoc/wiki/Installing-the-development-version-of-pandoc-1.9

Note that v1.9 has now been officially released, in case you really don't want to go to the trouble of keeping up with the dev cycle. Of course, you still won't get it in your OS package manager for quite some time after that (I assume, anyway).

==========================
Regarding your attempts to treat JSON as a document syntax:

The best input syntaxes for Pandoc at this point are its native markdown+extensions and reST (especially for Python people/environments). The two are basically maintained as functionally equivalent, although there may be features available in the former that aren't represented in the latter, since John can add extensions anytime he wants. AFAIK Pandoc hasn't begun to support the Sphinx extensions (yet?).

The JSON format used internally within Pandoc isn't documented (yet?), but it's the native Haskell data type. As Thomas K notes, there may be some similarity between how the two tools represent data, but probably not enough to treat either as "just another markup format".

However, if you're working on this, it's easy enough to see what Pandoc looks for in the way of JSON input.

pandoc -t json

compare this to

pandoc -t native

and it's easy to see the specs created by Text.Pandoc.Definition and Text.JSON.Generic
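
The comparison above can also be scripted. This is a rough Python sketch rather than anything official: `build_command` and `run_pandoc` are hypothetical helper names, the sample text is made up, and the only pandoc features assumed are the `-f` and `-t` options with the `markdown`, `json`, and `native` format names.

```python
import shutil
import subprocess


def build_command(to_format, from_format="markdown"):
    """Build the pandoc command line that converts stdin to `to_format`."""
    return ["pandoc", "-f", from_format, "-t", to_format]


def run_pandoc(text, to_format):
    """Pipe `text` through pandoc; return its output, or None if pandoc is not installed."""
    if shutil.which("pandoc") is None:
        return None
    result = subprocess.run(
        build_command(to_format),
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise if pandoc exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    # Made-up sample document, used only to compare the two serializations.
    sample = "# Title\n\nSome *emphasised* text.\n"
    for fmt in ("json", "native"):
        print("---", fmt, "---")
        print(run_pandoc(sample, fmt))
```

Printing both outputs for the same small input makes it easier to line up each JSON node with the corresponding constructor in the native output.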

Using Pandoc's internal data representation as input would obviously be more stable than a marked-up text stream. Others have expressed a desire for documentation on this, so it would be a great contribution to the community.

Please do inform the Pandoc mailing list of any work done in this area. The crew there is very responsive, and you can get quick feedback directly from John M (the lead developer) himself.

秋日私语 2025-01-01 13:23:56

I doubt pandoc or any other tool knows what to do with ipynb files yet (at the time of writing, the IPython notebook was released less than a month ago). JSON is just a generic data structure like XML, not a document format.

We (IPython) are working on tools to export notebooks to other formats, but they're not ready for a proper release yet. If you want to help develop that, see this mailing list thread. Hopefully it will be part of the next IPython release.
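
Until such tools exist, one stopgap is to read the notebook as plain JSON and write out Markdown, which pandoc does understand. The sketch below is not the IPython team's exporter. It assumes cells live either under a top-level "cells" list or under "worksheets" -> "cells", that each cell has a "cell_type", and that code sits under "input" or "source"; check these key names against your own .ipynb file. All function names are hypothetical, and other cell types and cell outputs are simply skipped.

```python
import json


def load_notebook(path):
    """Read an .ipynb file as ordinary JSON."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def _as_text(value):
    """Cell text may be stored as one string or as a list of line strings."""
    return "".join(value) if isinstance(value, list) else (value or "")


def iter_cells(notebook):
    """Yield cells from either assumed layout: "cells" or "worksheets" -> "cells"."""
    if "cells" in notebook:
        yield from notebook["cells"]
    for sheet in notebook.get("worksheets", []):
        yield from sheet.get("cells", [])


def notebook_to_markdown(notebook):
    """Render markdown cells as-is and code cells as indented code blocks."""
    parts = []
    for cell in iter_cells(notebook):
        kind = cell.get("cell_type")
        if kind == "markdown":
            parts.append(_as_text(cell.get("source")).strip("\n"))
        elif kind == "code":
            # Assumption: code is stored under "input" or, failing that, "source".
            code = _as_text(cell.get("input", cell.get("source"))).strip("\n")
            parts.append("\n".join("    " + line for line in code.splitlines()))
    return "\n\n".join(parts) + "\n"
```

The resulting text can be saved to a .md file and passed to pandoc like any other Markdown document, for example `notebook_to_markdown(load_notebook("example.ipynb"))` with a file name of your own.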
