Charting millions of rows
I have a quick question about charting.
Need:
I need to implement charting for my client, and the client's dataset contains millions of rows. Data about the target is collected every 10 seconds or so, so it builds up quickly. I need to chart this data.
I looked at how Google Finance charts MSFT:
http://www.google.com/finance?q=msft
It looks like, at any given time, they are NOT plotting ALL the points.
Depending on the time range you select, the data selected and plotted varies.
I would like some input on how to massage the millions of rows of data into a form ready for a chart like Google's, and pointers on how to implement the charting with the massaged data.
thanks
Sean
Comments (3)
For stock charts, the standard way to do it is to select (or calculate) a number of options:
You can ignore chart type for now, but the other two are important.
For instance, say you have 1 million data points over a one-week period. Give the user the option to chart that week at 15-minute, 1-hour or 1-day intervals. Then you just pick the data points that represent the start and end of each interval.
For instance, if they pick 1 day, you pick the opening and closing price of each day.
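As a rough illustration of picking the start and end point of each interval, here is a minimal Python sketch. It assumes the data arrives as time-sorted `(unix_timestamp, value)` pairs; the function name `open_close_per_interval` and the sample data are made up for this example and are not from any charting library.

```python
def open_close_per_interval(points, interval_seconds):
    """Return (bucket_start, open, close) for every interval that has data.

    `points` is an iterable of (unix_timestamp, value) pairs sorted by time
    (an assumption of this sketch).
    """
    buckets = {}
    for ts, value in points:
        start = ts - (ts % interval_seconds)  # align to the interval boundary
        if start not in buckets:
            buckets[start] = [value, value]   # first sample is the "open"
        else:
            buckets[start][1] = value         # latest sample so far is the "close"
    return [(start, oc[0], oc[1]) for start, oc in sorted(buckets.items())]


# Hypothetical data: one sample every 10 seconds for one hour.
samples = [(i * 10, i) for i in range(360)]
fifteen_minute_bars = open_close_per_interval(samples, 15 * 60)
```

With this input, 360 raw points collapse to 4 rows, one per 15-minute interval. In practice the same grouping would usually be pushed into the database query rather than done in application code.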
On Google Finance, the magic happens in the data table coming from the DB: they fetch higher-resolution data for recent dates and lower-resolution data for older dates, and they draw it with a timeline chart (I know there are some good open source ones).
For instance: from the DB you get one-minute resolution for today, one-hour resolution for the last week, one-day resolution for the last 6 months, and so on.
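The age-dependent resolution described above could be sketched roughly like this in Python. The tier boundaries mirror the example (minute for the last day, hour for the last week, day for anything older), but the names `TIERS`, `bucket_size_for` and `thin_by_age` are hypothetical, and keeping the last sample of each bucket is just one possible choice. This is not how Google Finance is known to do it.

```python
MINUTE, HOUR, DAY = 60, 3600, 86400

# (maximum age in seconds, bucket size in seconds), newest tier first.
# Assumed tiers for illustration only.
TIERS = [
    (DAY, MINUTE),     # last day: one point per minute
    (7 * DAY, HOUR),   # last week: one point per hour
]
OLDEST_BUCKET = DAY    # anything older: one point per day


def bucket_size_for(age_seconds):
    """Pick the bucket size for a sample of the given age."""
    for max_age, size in TIERS:
        if age_seconds <= max_age:
            return size
    return OLDEST_BUCKET


def thin_by_age(points, now):
    """Keep the last sample of each bucket; buckets get coarser with age.

    `points` is an iterable of (unix_timestamp, value) pairs sorted by time.
    """
    kept = {}
    for ts, value in points:
        size = bucket_size_for(now - ts)
        kept[(size, ts - ts % size)] = (ts, value)  # later samples overwrite earlier ones
    return sorted(kept.values())


# Hypothetical data: one sample every 10 seconds for 10 days.
now = 10 * DAY
raw = [(ts, ts) for ts in range(0, now, 10)]
thinned = thin_by_age(raw, now)
```

Here 86,400 raw samples shrink to well under 2,000 plotted points while the most recent day keeps minute-level detail.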
I hope either you know what you want to provide to the client, or they know what they want (the WHAT could be formalized in a requirements spec).
Determining what you want to give can immensely change how you want to do it.
Let's take a hypothetical (but commonly used) scenario.
Let's say you want to show points on an XY chart (x = time, y = price).
1. Let's say the user chooses the granularity,
say 1 second.
Give the user the option to view by hour or by day (if by day, then for the last 3/5 days at most).
2. Let's say the user wants to see data for 1 day.
Now you know that you need to generate a query that will return 10 hr * 60 min * 60 sec = 36,000 ticks.
If the user wants to see a day's data as one tick, then you give them the option of seeing a week/month/year...
Again, you now only need to return 1 yr * 365 days = 365 tick points if the user is viewing a year.
If the user changes the resolution/granularity, change the window.
One more scenario could be a 10-millisecond tick over one day of data.
It is pointless to show 10-millisecond ticks on a chart spanning a week or more.
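The window/granularity arithmetic in this answer can be checked with a few lines of Python. This is only a sketch: `tick_count` and `pick_granularity` are hypothetical helper names, all durations are in milliseconds, and the 2,000-point budget is an arbitrary assumption standing in for "roughly how many points the chart can usefully show".

```python
MS = 1
SECOND = 1000 * MS
MINUTE = 60 * SECOND
HOUR = 60 * MINUTE
DAY = 24 * HOUR


def tick_count(window_ms, granularity_ms):
    """Number of ticks a query returns for this window and granularity."""
    return window_ms // granularity_ms


def pick_granularity(window_ms, max_points, granularities):
    """Return the finest granularity whose tick count fits within max_points.

    Falls back to the coarsest granularity if none fits.
    """
    for g in sorted(granularities):
        if tick_count(window_ms, g) <= max_points:
            return g
    return max(granularities)


one_trading_day = tick_count(10 * HOUR, SECOND)   # 10 hr of 1-second ticks
one_year_daily = tick_count(365 * DAY, DAY)       # a year of daily ticks
week_at_10ms = tick_count(7 * DAY, 10 * MS)       # why 10 ms ticks make no sense here

# Assumed point budget of 2,000 for a one-week window.
choice = pick_granularity(7 * DAY, 2000, [10 * MS, SECOND, MINUTE, HOUR, DAY])
```

A week of 10 ms ticks comes to over 60 million points, whereas the hourly granularity chosen for the same window yields 168.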