分析 Mnesia 查询

发布于 2024-08-13 10:54:59 字数 1213 浏览 11 评论 0原文

我们的 Mnesia DB 运行缓慢,我们认为它应该更快一些。

因此,我们需要对其进行分析并弄清楚发生了什么。

有许多选项可供选择:

  • 运行 fprof 并查看时间都花在哪里
  • 运行 cprof 并查看哪些函数被调用很多

但是,这些都是相当标准的性能监控风格工具。问题是我实际上如何进行查询分析 - 哪些查询花费的时间最长。如果我们是 Oracle 或 MySQL 商店,我们将只运行一个查询分析器,它将返回需要很长时间运行的查询类型。这似乎不是 Mnesia 可用的工具。

所以问题是:

  • 存在哪些技术来分析 Mnesia
  • 存在哪些工具来分析 Mnesia - 我认为没有,但证明我错了:)
  • 你是如何分析你的查询并优化你的 mnesia 数据库安装的

根据讨论展开的

问题之一使用 fprof 作为分析工具的原因是它仅告诉您正在查看的特定查询。所以 fprof 告诉我 X 很慢,我调整它以加快速度。然后,你瞧,操作 Y(本来就足够快)现在却慢得要命。所以我分析了 Y 并意识到让 Y 快的方法就是让 X 慢。因此,我最终进行了一系列双边权衡……

我实际上需要是一种管理多边权衡的方法。我现在记录了 2 个可以重播的实际用户活动指标负载。这些日志代表了我想要优化的内容。

SQL 数据库上的“正确”查询分析器将能够分析 SQL 语句的结构,例如具有以下形式的所有语句:

SELECT [fieldset] FROM [table] WHERE {field = *parameter*}, {field = *parameter*}

并说这种形式的 285 个查询平均运行时间为 0.37 毫秒

神奇的答案是这样的:这种形式的 17 个查询运行了 6.34 秒,并对表 X 进行了全表扫描,你应该在字段 Y 上放置索引

当我有这样的结果集时然后,通过一组有代表性的用户活动,我可以开始全面考虑权衡,并设计一个测试模式。

测试模式类似于:

  • 活动 X 将进行查询 A、C C 更快,但查询 E 和 F 较慢的
  • 测试和测量
  • 然后批准/不批准

我已经使用Erlang足够长的时间来“知道”没有这样的查询分析器,我想知道的是其他人(他们一定有这个问题)如何“推理”记忆优化。

Our Mnesia DB is running slowly and we think it should be somewhat faster.

So we need to profile it and work out what is happening.

There are a number of options that suggest themselves:

  • run fprof and see where the time is going
  • run cprof and see which functions are called a lot

However these are both fairly standard performance monitoring style tools. The question is how do I actually do query profiling - which queries are taking the longest times. If we were an Oracle or MySQL shop we would just run a query profiler which would return the sorts of queries that were taking a long time to run. This is not a tool that appears to be available for Mnesia.

So the question is:

  • what techniques exist to profile Mnesia
  • what tools exist to profile Mnesia - none I think, but prove me wrong :)
  • how did you profile your queries and optimise your mnesia database installation

Expanded In The Light Of Discussion

One of the problems with fprof as a profiling tool is that it only tells you about the particular query you are looking at. So fprof tells me that X is slow and I tweak it down to speed it up. Then, low and behold, operation Y (which was fast enough) is now dog slow. So I profile up Y and realise that the way to make Y quick is to make X slow. So I end up doing a series of bilateral trade-offs...

What I actually need is a way to manage multilateral trade-offs. I now have 2 metric shed-loads of actual user activities logged which I can replay. These logs represent what I would like to optimize.

A 'proper' query analyser on an SQL database would be able to profile the structure of SQL statements, eg all statements with the form:

SELECT [fieldset] FROM [table] WHERE {field = *parameter*}, {field = *parameter*}

and say 285 queries of this form took on average 0.37ms to run

They magic answers are when it says: 17 queries of this form took 6.34s to run and did a full table scan on table X, you should put an index on field Y

When I have a result set like this over a representative set of user-activities I can then start to reason about trade-offs in the round - and design a test pattern.

The test pattern would be something like:

  • activity X would make queries A, C
    and C faster but queries E and F
    slower
  • test and measure
  • then approve/disapprove

I have been using Erlang long enough to 'know' that there is no query analyser like this, what I would like to know is how other people (who must have had this problem) 'reason' about mnesia optimization.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

甜是你 2024-08-20 10:54:59

我犹豫了,因为我对 Erlang 或 Mnesia 不太了解,但我对性能调优了解很多,而且从到目前为止的讨论来看,这听起来很典型。

这些工具 fprof 等听起来像大多数从 gprof 获得基本方法的工具,即检测函数、计算调用、对程序计数器进行采样等。很少有人拥有长时间检查了该实践的基础。对于此类工具的用户来说,您的挫败感听起来很典型。

您可能会考虑一种不太为人所知的方法,此处概述。它基于随机抽取少量(10-20)个程序状态样本,并理解每个样本,而不是进行总结。通常,这意味着检查调用堆栈,但您可能还想检查其他信息。有不同的方法可以做到这一点,但我只是使用调试器中的暂停按钮。我并不是想获得精确的时间或调用计数。这些充其量只是间接线索。相反,我会询问每个样本“它在做什么以及为什么?”如果我发现它正在执行某些特定活动,例如执行 X 查询,其中它正在寻找 y 类型的答案,目的是 z ,并且它在多个样本上执行此操作,那么它执行此操作的样本比例是对其执行此操作的时间比例的粗略但可靠的估计。我很可能可以对此做点什么,并获得良好的加速。

以下是该方法的使用案例研究。

I hung back because I don't know much about either Erlang or Mnesia, but I know a lot about performance tuning, and from the discussion so far it sounds pretty typical.

These tools fprof etc. sound like most tools that get their fundamental approach from gprof, namely instrumenting functions, counting invocations, sampling the program counter, etc. Few people have examined the foundations of that practice for a long time. Your frustrations sound typical for users of tools like that.

There's a method that is less-known that you might consider, outlined here. It is based on taking a small number (10-20) of samples of the state of the program at random times, and understanding each one, rather than summarizing. Typically, this means examining the call stack, but you may want to examine other information as well. There are different ways to do this, but I just use the pause button in a debugger. I'm not trying to get precise timing or invocation counts. Those are indirect clues at best. Instead I ask of each sample "What is it doing and why?" If I find that it is doing some particular activity, such as performing the X query where it's looking for y type answer for the purpose z, and it's doing it on more than one sample, then the fraction of samples it's doing it on is a rough but reliable estimate of what fraction of the time it is doing that. Chances are good that it is something I can do something about, and get a good speedup.

Here's a case study of the use of the method.

青芜 2024-08-20 10:54:59

由于 Mnesia 查询只是 erlang 函数,我想您可以像分析自己的 erlang 代码一样分析它们。 http://www.erlang.org/doc/efficiency_guide/profiling.html# id2266192 提供了有关可用 erlang 分析工具的更多信息。

更新 作为一项测试,我在家里的测试 mnesia 实例上运行了此命令,并使用 fprof 跟踪 mnesia qlc 查询返回的输出,我在下面包含了该示例。因此它肯定包含比查询调用更多的信息。

....
{[{{erl_lint,pack_errors,1},                      2,    0.004,    0.004}],     
 { {lists,map,2},                                 2,    0.004,    0.004},     %
 [ ]}.

{[{{mnesia_tm,arrange,3},                         1,    0.004,    0.004}],     
 { {ets,first,1},                                 1,    0.004,    0.004},     %
 [ ]}.

{[{{erl_lint,check_remote_function,5},            2,    0.004,    0.004}],     
 { {erl_lint,check_qlc_hrl,5},                    2,    0.004,    0.004},     %
 [ ]}.

{[{{mnesia_tm,multi_commit,4},                    1,    0.003,    0.003}],     
 { {mnesia_locker,release_tid,1},                 1,    0.003,    0.003},     %
 [ ]}.

{[{{mnesia,add_written_match,4},                  1,    0.003,    0.003}],     
 { {mnesia,add_match,3},                          1,    0.003,    0.003},     %
 [ ]}.

{[{{mnesia_tm,execute_transaction,5},             1,    0.003,    0.003}],     
 { {erlang,erase,1},                              1,    0.003,    0.003},     %
 [ ]}.

{[{{mnesia_tm,intercept_friends,2},               1,    0.002,    0.002}],     
 { {mnesia_tm,intercept_best_friend,2},           1,    0.002,    0.002},     %
 [ ]}.

{[{{mnesia_tm,execute_transaction,5},             1,    0.002,    0.002}],     
 { {mnesia_tm,flush_downs,0},                     1,    0.002,    0.002},     %
 [ ]}.

{[{{mnesia_locker,rlock_get_reply,4},             1,    0.002,    0.002}],     
 { {mnesia_locker,opt_lookup_in_client,3},        1,    0.002,    0.002},     %
 [ ]}.

{[ ],
 { undefined,                                     0,    0.000,    0.000},     %
 [{{shell,eval_exprs,6},                          0,   18.531,    0.000},      
  {{shell,exprs,6},                               0,    0.102,    0.024},      
  {{fprof,just_call,2},                           0,    0.034,    0.027}]}.

Since Mnesia queries are just erlang functions I would imagine you can profile them the same way you would profile your own erlang code. http://www.erlang.org/doc/efficiency_guide/profiling.html#id2266192 has more information on the erlang profiling tools available.

Update As a test I ran this at home on a test mnesia instance and using fprof to trace an mnesia qlc query returned output a sample of which I'm including below. So it definitely includes more information than just the query call.

....
{[{{erl_lint,pack_errors,1},                      2,    0.004,    0.004}],     
 { {lists,map,2},                                 2,    0.004,    0.004},     %
 [ ]}.

{[{{mnesia_tm,arrange,3},                         1,    0.004,    0.004}],     
 { {ets,first,1},                                 1,    0.004,    0.004},     %
 [ ]}.

{[{{erl_lint,check_remote_function,5},            2,    0.004,    0.004}],     
 { {erl_lint,check_qlc_hrl,5},                    2,    0.004,    0.004},     %
 [ ]}.

{[{{mnesia_tm,multi_commit,4},                    1,    0.003,    0.003}],     
 { {mnesia_locker,release_tid,1},                 1,    0.003,    0.003},     %
 [ ]}.

{[{{mnesia,add_written_match,4},                  1,    0.003,    0.003}],     
 { {mnesia,add_match,3},                          1,    0.003,    0.003},     %
 [ ]}.

{[{{mnesia_tm,execute_transaction,5},             1,    0.003,    0.003}],     
 { {erlang,erase,1},                              1,    0.003,    0.003},     %
 [ ]}.

{[{{mnesia_tm,intercept_friends,2},               1,    0.002,    0.002}],     
 { {mnesia_tm,intercept_best_friend,2},           1,    0.002,    0.002},     %
 [ ]}.

{[{{mnesia_tm,execute_transaction,5},             1,    0.002,    0.002}],     
 { {mnesia_tm,flush_downs,0},                     1,    0.002,    0.002},     %
 [ ]}.

{[{{mnesia_locker,rlock_get_reply,4},             1,    0.002,    0.002}],     
 { {mnesia_locker,opt_lookup_in_client,3},        1,    0.002,    0.002},     %
 [ ]}.

{[ ],
 { undefined,                                     0,    0.000,    0.000},     %
 [{{shell,eval_exprs,6},                          0,   18.531,    0.000},      
  {{shell,exprs,6},                               0,    0.102,    0.024},      
  {{fprof,just_call,2},                           0,    0.034,    0.027}]}.
锦上情书 2024-08-20 10:54:59

Mike Dunlavey 的建议让我想起了 redbug允许您对生产系统中的调用进行采样。将其视为一个易于使用的 erlang:trace,但它并没有给您足够的绳索来悬挂您的生产系统。

使用类似这样的调用应该会为您提供大量堆栈跟踪,以识别从何处调用您的 mnesia 事务:

redbug:start(10000,100,{mnesia,transaction,[stack]}).

不过,不可能获取这些跟踪的调用持续时间。

如果您已将所有 mnesia 查找组织到导出 api 来执行它们的模块中,您还可以使用 redbug 仅获取特定查询的调用频率。

Mike Dunlavey's suggestion reminds me about redbug that allow you to sample calls in production systems. Think of it as an easy-to-use erlang:trace that doesnt give you enough rope to hang your production system.

Using something like this call should give you lots of stack traces to identify where your mnesia transactions are called from:

redbug:start(10000,100,{mnesia,transaction,[stack]}).

Its not possible to get call duration for these traces though.

If you have organized all mnesia lookups into modules that export an api to perform them, you could also use redbug to get a call-frequency on specific queries only.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文