Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 10 years ago.
Take another look at iText (I know that direct PDF -> RTF conversion isn't supported, but read on for spine-tingling possibilities!).
We added a PDF text parsing module to iText last year. Right now, this is somewhat rudimentary, but it does work, and is pretty easy to expand.
iText is good at generating RTF.
So... It should be relatively straightforward (not easy, but straightforward) to parse text from a PDF and create an RTF based on the parsing.
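To give a feel for the "create an RTF" half, here is a minimal sketch in plain Java. It assumes you already have one string of extracted text per page from whatever parser you use; it deliberately does not call iText's RTF classes, and the class name `RtfSketch`, the Helvetica default font, and the 12 pt size are all placeholders I made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hand-rolled sketch: wraps already-extracted page text in a minimal RTF document. */
final class RtfSketch {

    /** Escapes RTF's special characters and writes non-ASCII characters as \\uN? escapes. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' || c == '{' || c == '}') {
                out.append('\\').append(c);          // backslash and braces are RTF syntax
            } else if (c == '\n') {
                out.append("\\line ");               // keep line breaks from the extracted text
            } else if (c < 0x80) {
                out.append(c);
            } else {
                // \\u takes a signed 16-bit decimal; '?' is the fallback for old readers.
                out.append("\\u").append((int) (short) c).append('?');
            }
        }
        return out.toString();
    }

    /** Builds one RTF document, one paragraph per page, with a page break between pages. */
    static String toRtf(List<String> pages) {
        StringBuilder rtf = new StringBuilder(
                "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Helvetica;}}\\f0\\fs24 ");
        for (int i = 0; i < pages.size(); i++) {
            if (i > 0) {
                rtf.append("\\page ");
            }
            rtf.append(escape(pages.get(i))).append("\\par ");
        }
        return rtf.append('}').toString();
    }

    public static void main(String[] args) {
        // Stand-in for text pulled out of a PDF, one entry per page.
        System.out.println(toRtf(Arrays.asList("First page {text}", "Second page")));
    }
}
```

In a real converter you would probably swap the hand-built string for iText's RTF generator, but the escaping step shows the kind of glue code that sits between parsing and writing.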
If you have to preserve things like font, it will take a little bit more work (the PDF parser does provide font info, as well as page location for each piece of text), but I suspect that iText's RTF generator will simplify a lot of that.
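As a rough illustration of what "a little bit more work" might look like, the sketch below takes pieces of text tagged with a font name, size, and page position and turns them into RTF runs. The `Chunk` class is a hypothetical stand-in for whatever the PDF parser hands you, not an iText type, and grouping chunks into lines by exact `y` equality is a simplifying assumption; real PDFs would need a tolerance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Sketch: turns positioned, font-tagged text chunks into RTF runs. */
final class StyledRtfSketch {

    /** Hypothetical stand-in for the per-chunk info a PDF parser reports. */
    static final class Chunk {
        final String text;
        final String font;
        final float sizePt;
        final float x;
        final float y;

        Chunk(String text, String font, float sizePt, float x, float y) {
            this.text = text;
            this.font = font;
            this.sizePt = sizePt;
            this.x = x;
            this.y = y;
        }
    }

    /** Escapes the three characters that are RTF syntax (ASCII-only for brevity). */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}");
    }

    static String toRtf(List<Chunk> chunks) {
        List<Chunk> ordered = new ArrayList<>(chunks);
        // PDF y grows upward, so reading order is y descending, then x ascending.
        ordered.sort(Comparator.comparingDouble((Chunk c) -> -c.y)
                .thenComparingDouble(c -> c.x));

        // Font table: each distinct font name, in order of first appearance.
        List<String> fonts = new ArrayList<>();
        for (Chunk c : ordered) {
            if (!fonts.contains(c.font)) {
                fonts.add(c.font);
            }
        }

        StringBuilder rtf = new StringBuilder("{\\rtf1\\ansi\\deff0{\\fonttbl");
        for (int i = 0; i < fonts.size(); i++) {
            rtf.append("{\\f").append(i).append(' ')
               .append(escape(fonts.get(i))).append(";}");
        }
        rtf.append('}');

        float lastY = Float.NaN;
        for (Chunk c : ordered) {
            if (!Float.isNaN(lastY) && c.y != lastY) {
                rtf.append("\\par ");                 // new baseline, so start a new paragraph
            }
            rtf.append("\\f").append(fonts.indexOf(c.font))
               .append("\\fs").append(Math.round(c.sizePt * 2)) // \\fs is in half-points
               .append(' ')
               .append(escape(c.text));
            lastY = c.y;
        }
        return rtf.append("\\par }").toString();
    }

    public static void main(String[] args) {
        System.out.println(toRtf(Arrays.asList(
                new Chunk("world", "Times", 10f, 100f, 700f),
                new Chunk("Hello ", "Helvetica", 12f, 50f, 700f),
                new Chunk("Next", "Helvetica", 12f, 50f, 680f))));
    }
}
```

The chunks are passed in out of order on purpose: sorting by position is what recovers reading order before any font handling happens.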
If your PDFs contain embedded images that you need to bring over to the RTF, the current PDF parser doesn't really do much with that - but it has sufficient hooks to allow it to happen with a bit of elbow grease.
So I would say that iText can most likely do what you are looking for, and will help you achieve a local minimum of development effort, but I wouldn't put this in the class of super easy... Sounds like a nice challenge, actually.
If you do wind up implementing something like this, feel free to ping me with questions/thoughts after you've had a chance to play a bit. If you wind up with a decent bit of converting code, we might want to get it added to iText.
If you just want to get this out the door, and you have money to spend, I'm sure that there are a number of commercial converters that do what you are looking for. They probably won't be cheap, but may be cheaper than your development time.
You could try looking at iText, which is primarily a PDF library but has an RTF package available as an add-on.