How do I turn a dynamic site into a static site that can be demoed from a CD?
I need to find a way to crawl one of our company's web applications and create a static site from it that can be burned to a CD and used by traveling salespeople to demo the web site. The back-end data store is spread across many, many systems, so simply running the site on a VM on the salesperson's laptop won't work. And they won't have access to the internet while at some clients (no internet, cell phone... primitive, I know).
Does anyone have any good recommendations for crawlers that can handle things like link cleanup, Flash, a little AJAX, CSS, etc.? I know the odds are slim, but I figured I'd throw the question out here before I jump into writing my own tool.
Comments (5)
Just because nobody copy pasted a working command ... I am trying ... ten years later. :D
It worked like a charm for me.
By using a web crawler, e.g. one of these:
You're not going to be able to handle things like AJAX requests without burning a webserver to the CD, which I understand you have already said is impossible.
wget will download the site for you (use the -r parameter for "recursive"), but any dynamic content, such as reports, will of course not work properly; you'll just get a single snapshot.
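As a rough sketch of the recursive download described above, the Python helper below assembles a wget command line and runs it. The flags are standard GNU wget options; the function names, the URL and the output directory are placeholders for illustration, not anything taken from the original question.

```python
import subprocess


def build_wget_command(start_url, output_dir):
    """Return a wget argument list that mirrors start_url for offline viewing."""
    return [
        "wget",
        "--recursive",         # follow links (same as -r)
        "--level=inf",         # no depth limit; lower this for large sites
        "--page-requisites",   # also fetch the CSS, images and scripts pages need
        "--convert-links",     # rewrite links so they work when opened from disk
        "--adjust-extension",  # save HTML pages with an .html suffix where needed
        "--no-parent",         # never climb above the starting directory
        "--directory-prefix=" + output_dir,
        start_url,
    ]


def mirror_site(start_url, output_dir):
    """Run wget and return its exit status (non-zero if any request failed)."""
    completed = subprocess.run(build_wget_command(start_url, output_dir))
    return completed.returncode
```

Calling `mirror_site("http://intranet.example.com/app/", "demo_site")` (a made-up URL) would leave a browsable copy under `demo_site`. wget does not execute JavaScript, so anything loaded via AJAX is still missing from the snapshot.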
wget can recursively follow links and mirror an entire site, so that might be a good bet (curl fetches individual URLs but has no recursive mode of its own). You won't be able to use truly interactive parts of the site, like search engines, or anything that modifies the data, though.
Is it possible at all to create dummy backend services that can run from the sales folks' laptops, that the app can interface with?
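One way such a dummy backend could look, as a minimal sketch only: a small local HTTP server that answers a fixed set of paths with canned JSON. It uses just the Python standard library; the paths, payloads and names (`CANNED_RESPONSES`, `CannedHandler`, `make_server`) are hypothetical, and it assumes the app talks to its backend over plain HTTP GET and that Python is available on the laptop.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical canned responses keyed by request path. Real paths and payloads
# would have to be captured from the live application.
CANNED_RESPONSES = {
    "/api/customers": [{"id": 1, "name": "Example Customer"}],
    "/api/reports/summary": {"total_orders": 42, "open_tickets": 3},
}


class CannedHandler(BaseHTTPRequestHandler):
    """Serve fixed JSON for known paths and 404 for everything else."""

    def do_GET(self):
        path = self.path.split("?", 1)[0]  # ignore any query string
        payload = CANNED_RESPONSES.get(path)
        if payload is None:
            self.send_error(404, "No canned response for this path")
            return
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet during a demo


def make_server(port=8000):
    """Create a server bound to localhost only; port 0 picks a free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), CannedHandler)
```

Running `make_server().serve_forever()` would start it on port 8000. Anything that modifies data would still need extra handling, since this sketch only replays fixed answers.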
If you do end up having to run it off of a webserver, you might want to take a look at:
ServerToGo
It lets you run a WAMPP stack off of a CD, complete with MySQL/PHP/Apache support. The databases are copied to the current user's temp directory on launch, and it can be run entirely without the user installing anything!