Retrieving dynamic text from a website in VB.NET (VS2008)
I want to retrieve dynamic data (share prices) from a web page. I started out by retrieving the HTML source before I realised that, as the data is live, the HTML source is of little use. Although I ultimately want to capture specific data, all I need for now is to process a web page that I specify and get back the text of that page rather than the HTML code. Basically, a copy and paste of the entire page would be great.
Any ideas would be really appreciated!
Comments (3)
'Screen scraping' by parsing HTML is so early 2000s... What I would do is read up on Amazon's Mechanical Turk. You can develop a queued architecture where you submit URLs to the Mechanical Turk service. The service would automatically distribute these bits of work to users, who would then do the dirty task of copying and pasting out the valuable stock quote information you require. Users around the world would anxiously await delivery of the next URL to their Mechanical Turk inbox, pining for the opportunity to copy and paste another share price for your application. Sure, it might take a few minutes to update your prices, but hey, they would be HAND parsed by REAL people around the globe! Just think of the possibilities!
Well, the HTML contains the text of the website, so you "just" need to parse the HTML.
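As a rough illustration of what "just parsing" could look like, here is a minimal VB.NET sketch that downloads a page and strips the markup with regular expressions. The module name, the `HtmlToText` helper, and the URL are hypothetical placeholders, not anything from the question, and this only recovers text that is actually present in the HTML the server sends; values filled in later by script will not appear.

```vb
Imports System
Imports System.Net
Imports System.Text.RegularExpressions

Module PageTextSketch

    ' Hypothetical helper: reduce an HTML string to its visible text.
    ' Regex-based and deliberately rough; not a real HTML parser.
    Function HtmlToText(ByVal html As String) As String
        Dim opts As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        ' Drop script and style blocks together with their contents.
        Dim result As String = Regex.Replace(html, "<(script|style)\b[^>]*>.*?</\1\s*>", " ", opts)
        ' Replace every remaining tag with a space.
        result = Regex.Replace(result, "<[^>]+>", " ")
        ' Collapse runs of whitespace into single spaces.
        result = Regex.Replace(result, "\s+", " ")
        Return result.Trim()
    End Function

    Sub Main()
        ' Placeholder URL (an assumption); substitute the page you want to read.
        Dim url As String = "http://example.com/"
        Using client As New WebClient()
            Dim html As String = client.DownloadString(url)
            Console.WriteLine(HtmlToText(html))
        End Using
    End Sub

End Module
```

Note that this sketch leaves HTML entities such as `&amp;` undecoded and can be confused by unusual markup, so a proper HTML parsing library is likely the more robust choice for anything beyond a quick experiment.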
EDIT: If the data is not in the HTML but is loaded dynamically, the situation is different. As far as I can see, you have two options:
Is it possible to find this same data provided in a ready-to-consume format rather than scraping HTML for it? It seems likely that there are public web services for stock quotes.
For example: a quick search for "Stock price webservice" turned up http://www.webservicex.net/stockquote.asmx, an ASMX web service that is easy to consume in .NET.
In your Visual Studio project you should be able to add a reference to this service via the "Add Web Reference" command; the dialog you're given varies depending on whether your project targets .NET 2.0 or .NET 3.0/3.5.
I added a reference to the service named StockPriceProxy: