Definition of Reverse Proxying and Screen Scraping in a Portal context
I'm not sure whether these terms are used correctly in a Portal context.
By Portal I mean a JSR-286 compliant portal framework such as Liferay or Jetspeed, and the question relates to this portlet available from Liferay:
http://www.liferay.com/community/wiki/-/wiki/Main/Web+Proxy+Portlet
Is "Reverse Proxying" the same as "Screen Scraping"? In both cases the Portal acts as an intermediary and (optionally) transforms the downstream response before returning it to the client.
Comments (1)
While the two behaviours (proxying and scraping) have features in common, the intent is different. Screen scraping typically means reading a page and extracting data or meaning from it so that the data can be used somewhere else. That may result in a page being displayed that includes the scraped information, but the data could equally feed any other process.
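To make the scraping side concrete, here is a minimal sketch in Java. It pulls a single piece of data (the page title) out of an HTML string and discards the rest of the page. The class name `TitleScraper` and the sample markup are made up for illustration, and a regular expression is only a stand-in: real scrapers normally use a proper HTML parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of screen scraping: extract one value from a page.
public class TitleScraper {
    // Matches the contents of the first <title> element, ignoring case and line breaks.
    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the trimmed title text, or an empty Optional if the page has no title. */
    public static Optional<String> extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed sample page; in practice the HTML would be fetched from a remote site.
        String page = "<html><head><title> Build status </title></head><body><p>OK</p></body></html>";
        System.out.println(extractTitle(page).orElse("(no title)"));
    }
}
```

The extracted value is plain data at this point: it could be shown in a page, stored, or passed to some other process, which is what separates scraping from proxying.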
If instead you want to fetch an external resource (such as a page) and plug it into a Liferay-generated page as portlet content, you would take the entire page content. However, the Portlet 1.0 and 2.0 specs place restrictions on what markup a portlet may emit (e.g. it cannot include html, head or body tags) and on other behaviour it must conform to. The easiest way to accomplish this is to include the page as an iframe. The portletbridge project, by contrast, aims to wrap the incoming content and massage it to be valid in a portlet context, and to manage other aspects of the remote page such as CSS, links and authentication, so that the resulting portlet integrates cleanly with the rest of the page and with the portal application as a whole.
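The proxying side can be sketched in the same way. The example below takes a whole remote page and reduces it to a fragment without doctype, html, head or body tags, which is the kind of markup restriction described above. This is an assumption-laden illustration, not how the Web Proxy Portlet or portletbridge is actually implemented: the class name `PortletFragment` is hypothetical, the regex approach is fragile on real-world HTML, and link rewriting, CSS handling and authentication are left out entirely.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of proxying: keep the page content, drop the document wrapper.
public class PortletFragment {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;

    // The doctype declaration, the whole head element, and the html/body open and close tags.
    private static final Pattern DOCTYPE = Pattern.compile("<!DOCTYPE[^>]*>", FLAGS);
    private static final Pattern HEAD = Pattern.compile("<head\\b[^>]*>.*?</head>", FLAGS);
    private static final Pattern WRAPPERS = Pattern.compile("</?(html|body)\\b[^>]*>", FLAGS);

    /** Returns the page markup with the document-level wrapper elements removed. */
    public static String toFragment(String html) {
        String s = DOCTYPE.matcher(html).replaceAll("");
        s = HEAD.matcher(s).replaceAll("");      // drops title, meta, styles and scripts in head
        s = WRAPPERS.matcher(s).replaceAll("");  // removes only the tags, keeps what they enclose
        return s.trim();
    }

    public static void main(String[] args) {
        // Assumed sample page standing in for the remote resource.
        String page = "<!DOCTYPE html><html><head><title>T</title></head>"
                + "<body class=\"x\"><div>Hi</div></body></html>";
        System.out.println(toFragment(page));
    }
}
```

Unlike the scraping case, nothing is interpreted here: the content is passed through largely intact and only adjusted so that it can sit inside a page the portal generates.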