How can I search a PDF in Acrobat Reader and jump to a specific page using open parameters?
We are using Lucene within a web application to search a large number of PDF documents.
The workflow is like this:
A user enters a search term.
A list of search results is presented to the user.
Each search result represents one PDF document and shows the user on which pages the search term was found. Each of these pages is represented as a hyperlink.
If the user clicks on such a hyperlink, they jump directly to that page.
The problem is that the search term isn't highlighted on the page, so the user has to scan the page to find it.
What we want is a way to highlight the search term on the specific page in the PDF.
The open parameters for Acrobat Reader allow for either searching a PDF document (with hit highlighting) OR jumping to a specific page. But the combination of both parameters, which is what we would need, doesn't work.
Does anyone have an idea how jumping to a page and highlighting a search term in a PDF document could work?
I had a look at the Acrobat SDK but don't see how we can use it (it's terribly documented).
Comments (3)
Acrobat uses a plugin to highlight terms, and requires an FDF stream to indicate the words to highlight.
See here for pointers:
support.dtsearch.com/dts0152.htm
Update:
Assuming you know the page number and the position of the word on the page to highlight, here is one way to do it.
On the web page, embed the PDF in a frame whose URL carries the open parameters.
The PDF will appear in the frame; it will show the toolbar, hide the navigation pane and status bar, and fit the page horizontally. Then it will query the web site to get the XFDF data for highlighting: http://example.com/hilite.aspx?hilite=8e3302ee-ff88-41ee-bdfb-9e8df87cc3ad
Here I used a GUID key under which I had previously saved the highlight XFDF value in the session.
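The frame markup itself is missing from this post, so the following is only a sketch of how such a URL could be assembled. `toolbar`, `navpanes`, `statusbar` and `view=FitH` are Acrobat open parameters matching the behavior described above. The `xml=` parameter for pointing at the highlight server, the helper names, and the choice to leave the highlight URL unencoded are assumptions rather than something the post confirms.

```python
import html


def build_viewer_url(pdf_url, highlight_url):
    """Append Acrobat open parameters to a PDF URL as a fragment.

    Assumption: 'xml=' is the parameter that points Acrobat at the
    external highlight server; the original post does not show it.
    """
    params = [
        f"xml={highlight_url}",  # where Acrobat fetches the highlight data (assumed)
        "toolbar=1",             # show the toolbar
        "navpanes=0",            # hide the navigation pane
        "statusbar=0",           # hide the status bar
        "view=FitH",             # fit the page width
    ]
    return pdf_url + "#" + "&".join(params)


def build_iframe(pdf_url, highlight_url):
    """Return iframe markup for the viewer URL (hypothetical helper)."""
    src = html.escape(build_viewer_url(pdf_url, highlight_url), quote=True)
    return f'<iframe src="{src}"></iframe>'


if __name__ == "__main__":
    print(build_iframe(
        "http://example.com/doc.pdf",
        "http://example.com/hilite.aspx?hilite=8e3302ee-ff88-41ee-bdfb-9e8df87cc3ad",
    ))
```

A highlight URL that itself contains `&` would collide with the parameter separator, so a single opaque key such as the GUID above keeps things simple.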
The hilite.aspx page returns the highlight data for the words in the document. For example, a response can tell Acrobat to highlight 5 characters on page 15 starting at position 3583. (Note: XFDF is not real "XML" despite the similarity.)
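The response body is not shown in this post. As a rough sketch, a server could generate it along these lines. The element and attribute names (`Body`, `Highlight`, `loc`, `pg`, `pos`, `len`) and the unquoted attribute style follow my recollection of Acrobat's highlight file format, and they are assumptions to check against the dtSearch page linked above. Whether `pg` is zero-based should also be verified; the function simply passes the value through.

```python
def build_highlight_response(hits, color="#ff00ff"):
    """Build a highlight body for Acrobat (format assumed, not confirmed).

    hits: iterable of (pg, pos, length) tuples, where pos and length are
    character offsets on the page. pg is written out unchanged.
    """
    lines = [
        "<XML>",
        # Attributes are deliberately unquoted: the format only resembles XML.
        f"<Body units=characters color={color} mode=active version=2>",
        "<Highlight>",
    ]
    for pg, pos, length in hits:
        lines.append(f"<loc pg={pg} pos={pos} len={length}>")
    lines += ["</Highlight>", "</Body>", "</XML>"]
    return "\n".join(lines)


if __name__ == "__main__":
    # pg=14 stands for page 15 only if page indices are zero-based (assumption).
    print(build_highlight_response([(14, 3583, 5)]))
```

Because the body is not well-formed XML, it is built here with plain string formatting rather than an XML library, which would quote the attribute values.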
Note that Acrobat Reader must have the "Enable search highlights from external highlight server" option checked in its preferences.
Sorry, this might not be an answer, but a workaround could be to convert the PDF to HTML and use the Lucene highlighter (similar to what Google does).
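To illustrate the idea only: the sketch below is not Lucene's Highlighter (which is a Java API) but a simple regex stand-in, assuming the page text has already been extracted from the PDF as plain text. The function name and the choice of `<mark>` as the wrapping tag are hypothetical.

```python
import html
import re


def highlight_terms(text, terms):
    """Return HTML-escaped text with each search term wrapped in <mark>."""
    terms = [t for t in terms if t]
    if not terms:
        return html.escape(text)
    # Longest terms first so a longer term wins over a shorter one it contains.
    pattern = re.compile(
        "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True)),
        re.IGNORECASE,
    )
    parts = []
    last = 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        parts.append("<mark>" + html.escape(match.group(0)) + "</mark>")
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(highlight_terms("Lucene & PDF search", ["pdf"]))
```

Unlike Lucene's highlighter, this matches raw substrings and ignores the analyzer used at index time, so stemmed or tokenized matches would not be highlighted.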
You'd have to write a snippet of JavaScript to get the behavior you are looking for.