Third-party data transfer for large amounts of data

Posted 2024-07-07 20:04:51

Does anyone know how sites that provide a real-time feed of a lot of data work? I am referring to something like a stock site, where they can tell you prices in real time (well, mostly with a 20-minute delay, but still effectively real time, as I understand it).

They have thousands of pieces of data delivered to them every second, I would imagine: MSFT 25.00 +.23 VOL 12000 ???? for each stock that changed during some interval.

So, is there just a constant feed of small pushes going on? Or do you think a site would pull from the place that has the real data with a query like "give me all changes from 12:23:45 CST to now"?

I ask this because at work we might have a situation where we need real-time information like this at our application's fingertips, and it won't make sense to hit our third-party provider over and over again every second...
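For what it's worth, the pull variant described above can be sketched as a simple polling loop that asks only for changes since the last request. This is a minimal illustration only: the endpoint URL, the `since` parameter, and the response shape are all made up for the example.

```python
import time
from datetime import datetime, timezone

import requests  # third-party HTTP client

# Hypothetical endpoint that returns only the quotes changed since a timestamp.
FEED_URL = "https://example-provider.com/quotes/changes"

def poll_changes(session: requests.Session, since: datetime) -> list[dict]:
    """Ask the provider for everything that changed since `since`."""
    resp = session.get(FEED_URL, params={"since": since.isoformat()}, timeout=5)
    resp.raise_for_status()
    # Assumed shape: [{"symbol": "MSFT", "price": 25.00, "change": 0.23}, ...]
    return resp.json()

def run() -> None:
    session = requests.Session()  # reuse one connection across polls
    last_poll = datetime.now(timezone.utc)
    while True:
        changes = poll_changes(session, last_poll)
        last_poll = datetime.now(timezone.utc)
        for quote in changes:
            print(quote["symbol"], quote["price"], quote["change"])
        time.sleep(1)  # poll interval; tune to how fresh the data must be
```

Note that even at a one-second interval this is still polling; the answers below contrast it with push-based feeds.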

Comments (4)

有深☉意 2024-07-14 20:04:52

I've done this by attempting to retrieve the stock quote from the source, and falling back to a timestamped on-disk cache of the quote when the main source fails or times out.
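A rough sketch of that pattern, assuming a hypothetical HTTP quote endpoint and a local cache directory (both names are made up for illustration):

```python
import json
import time
from pathlib import Path

import requests

QUOTE_URL = "https://example-provider.com/quote"  # hypothetical endpoint
CACHE_DIR = Path("quote_cache")
CACHE_DIR.mkdir(exist_ok=True)

def get_quote(symbol: str, max_age_seconds: float = 300.0) -> dict:
    """Try the live source first; fall back to the most recent cached copy
    when the source fails or times out, unless the cache is too stale."""
    cache_file = CACHE_DIR / f"{symbol}.json"
    try:
        resp = requests.get(QUOTE_URL, params={"symbol": symbol}, timeout=2)
        resp.raise_for_status()
        quote = resp.json()
        # Persist the fresh quote with a timestamp for later fallback.
        cache_file.write_text(json.dumps({"ts": time.time(), "quote": quote}))
        return quote
    except requests.RequestException:
        # If no cache file exists yet, FileNotFoundError propagates:
        # there is nothing to fall back on.
        cached = json.loads(cache_file.read_text())
        if time.time() - cached["ts"] > max_age_seconds:
            raise  # the cached quote is too old to trust
        return cached["quote"]
```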

不如归去 2024-07-14 20:04:51

Generally there is a server/client protocol defined between the two parties. In the company I work for, the connection is maintained at all times.

Here is some info on real-time data feeds to go with your stock example:

NYSE, NASDAQ

It is common for data providers to also have FTP sites with (delayed) batched data. One that comes to mind is the NWS EMWIN.
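An always-on connection of that kind might look roughly like the following line-oriented TCP reader. The host, port, and message framing here are assumptions; a real vendor feed defines its own wire protocol with authentication, framing, and heartbeats.

```python
import socket

FEED_HOST = "feed.example-provider.com"  # placeholder host
FEED_PORT = 9000                         # placeholder port

def handle_message(line: str) -> None:
    # e.g. "MSFT 25.00 +.23 VOL 12000"
    print("received:", line)

def stream_quotes() -> None:
    """Hold one connection open and consume messages as they are pushed."""
    with socket.create_connection((FEED_HOST, FEED_PORT)) as sock:
        buffer = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break  # provider closed the connection; reconnect logic goes here
            buffer += chunk
            # Assume newline-delimited messages; split out each complete one.
            while b"\n" in buffer:
                line, buffer = buffer.split(b"\n", 1)
                handle_message(line.decode("ascii", errors="replace"))
```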

白云悠悠 2024-07-14 20:04:51

Sites like Twitter feed data to certain approved sites in real time via XMPP (Wiki link).
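As a rough illustration of consuming such an XMPP push feed, here is a minimal sketch using the third-party slixmpp library, assuming updates arrive as ordinary XMPP messages (the account credentials are placeholders):

```python
import slixmpp

class QuoteFeed(slixmpp.ClientXMPP):
    """Listens for updates pushed to this account as XMPP messages."""

    def __init__(self, jid: str, password: str):
        super().__init__(jid, password)
        self.add_event_handler("session_start", self.on_start)
        self.add_event_handler("message", self.on_message)

    async def on_start(self, event):
        self.send_presence()
        await self.get_roster()

    def on_message(self, msg):
        if msg["type"] in ("chat", "normal"):
            print("pushed update:", msg["body"])

if __name__ == "__main__":
    xmpp = QuoteFeed("consumer@example.com", "secret")  # placeholder credentials
    xmpp.connect()
    xmpp.process(forever=True)
```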

你没皮卡萌 2024-07-14 20:04:51

In the broadest terms, a push model is going to be the best way of achieving "real-time" transfer, particularly if you're talking about a large amount of data.

However, with a purely push model you always have the problem of how to recover from missed data.

Depending on the nature of your data, that may not be a problem (think of video delivery as an analogue: the amount of data is huge, but there is enough redundancy to recover from missing pieces). And if you have any control over the data, you may be able to build some redundancy in; for example, every change event can carry absolute values rather than changes, or both the previous value and the new value.
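That redundancy idea can be made concrete with a small sketch: each pushed event carries a per-symbol sequence number plus the absolute price (and the previous value), so a consumer can detect a gap and resynchronize immediately instead of replaying lost deltas. The message shape here is an assumption for illustration.

```python
from dataclasses import dataclass

@dataclass
class QuoteEvent:
    seq: int           # per-symbol sequence number for gap detection
    symbol: str
    prev_price: float  # the value the producer sent last time
    price: float       # absolute current value, not a delta

class Consumer:
    """Applies pushed events and recovers state across missed messages."""

    def __init__(self) -> None:
        self.last_seq: dict[str, int] = {}
        self.prices: dict[str, float] = {}

    def on_event(self, ev: QuoteEvent) -> None:
        expected = self.last_seq.get(ev.symbol, ev.seq - 1) + 1
        if ev.seq > expected:
            # Some events were dropped, but because each event carries the
            # absolute price we can resynchronize right away.
            print(f"missed {ev.seq - expected} event(s) for {ev.symbol}; state recovered")
        self.last_seq[ev.symbol] = ev.seq
        self.prices[ev.symbol] = ev.price
```

If events carried only deltas, the consumer would instead have to request a snapshot or a replay after any gap.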
