I have a question related to this one.
I want to (programmatically) measure L3 hits (accesses) and misses on an AMD EPYC 7742 CPU (Zen2). I run Linux kernel 5.4.0-66-generic on Ubuntu Server 20.04.2 LTS. According to the question linked above, the events rFF04 (L3LookupState) and r0106 (L3CombClstrState) should represent the L3 accesses and misses, respectively. Furthermore, kernel 5.4 should support these events.
However, when measuring with perf, I run into issues. Similar to the question linked above, if I run
numactl -C 0 -m 0 perf stat -e instructions,cycles,r0106,rFF04 ./benchmark
I only measure 0 values. If I try to use
numactl -C 0 -m 0 perf stat -e instructions,cycles,amd_l3/r8001/,amd_l3/r0106/
perf complains about "unknown terms". If I use the perf event names, i.e.
numactl -C 0 -m 0 perf stat -e instructions,cycles,l3_request_g1.caching_l3_cache_accesses,l3_comb_clstr_state.request_miss
perf outputs <not supported> for these events.
Furthermore, I actually want to measure this using perf's C API. Currently, I dispatch a perf_event_attr with type PERF_TYPE_RAW and config set to, e.g., 0x8001. How do I get the amd_l3 PMU stuff into my perf_event_attr object? Otherwise, it would be equivalent to
numactl -C 0 -m 0 perf stat -e instructions,cycles,r0106,rFF04 ./benchmark
which is measuring undefined values.
Comments (1)
Short answer: try the -e rFF0F00000040FF04 parameter, which is given in your CPU's PPR doc.
Details:
I may be able to help with the first problem, the one described in your third paragraph. I can't help with the second one, described in your fourth paragraph. Sorry.
Since your CPU is 'Family 23 Model 49', I referred to the AMD PPR doc for '17h Model 31h'. It says to use L3Event[0xFF0F00000040FF04] for 'L3 Accesses' (0xFF0F00000040FF04 is 64 bits wide, the same width as the L3 Performance Event Select register shown in the AMD doc). Also, man perf-list shows that AMD uses this format, where it mentions bits '32-35'. Although the PPR doc doesn't give much information under L3PMCx04, it has some useful information under L3 Performance Event Select.
I tested on my own CPU, a Ryzen 7 4800H, which is a 17h_60h family Renoir processor. It is also Zen2, and judging from two pieces of kernel source code (the first of which lists some encodings for AMD CPUs), Zen2's config should be almost the same. My CPU doesn't have amd_l3 support, so I used ls_dc_accesses as a stand-in. 729 is the code of All DC Accesses in the AMD doc for my CPU family, where PMCx029 represents the EventCode and the 8 corresponding bits represent the UMask. It can also be found in the second piece of source code mentioned above (in your 17h_31h family PPR doc, p. 182, the number is 0x430729).
Not everyone has an EPYC CPU, so it may not be easy for others to see where your problem comes from. Maybe you can offer more information if possible.
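To make the two raw encodings above easier to read, here is a small sketch in C that splits them into bit fields. The field positions are my reading of the PPR register layouts (event select in bits 7:0 plus 35:32 for core PMCs, unit mask in bits 15:8, enable in bit 22, and, for the L3 register, a slice mask in bits 51:48 and a thread mask in bits 63:56), so treat them as assumptions to check against your own doc. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Extract bits [lo, hi] (inclusive) from a 64-bit raw event config. */
uint64_t bits(uint64_t value, unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi < 64);
    unsigned width = hi - lo + 1;
    uint64_t mask = (width >= 64) ? ~0ULL : ((1ULL << width) - 1);
    return (value >> lo) & mask;
}

/* Core PMC event select: low 8 bits in [7:0], high 4 bits in [35:32]
 * (assumed layout, see the lead-in). */
uint64_t core_event_code(uint64_t cfg)
{
    return bits(cfg, 0, 7) | (bits(cfg, 32, 35) << 8);
}

/* Print the fields of a core PMC config such as 0x430729. */
void print_core_config(uint64_t cfg)
{
    printf("core: event=0x%03llx umask=0x%02llx usr=%llu os=%llu en=%llu\n",
           (unsigned long long)core_event_code(cfg),
           (unsigned long long)bits(cfg, 8, 15),
           (unsigned long long)bits(cfg, 16, 16),
           (unsigned long long)bits(cfg, 17, 17),
           (unsigned long long)bits(cfg, 22, 22));
}

/* Print the fields of an L3 config such as 0xFF0F00000040FF04
 * (slice/thread mask positions are assumptions taken from the PPR). */
void print_l3_config(uint64_t cfg)
{
    printf("l3: event=0x%02llx umask=0x%02llx en=%llu slicemask=0x%llx threadmask=0x%02llx\n",
           (unsigned long long)bits(cfg, 0, 7),
           (unsigned long long)bits(cfg, 8, 15),
           (unsigned long long)bits(cfg, 22, 22),
           (unsigned long long)bits(cfg, 48, 51),
           (unsigned long long)bits(cfg, 56, 63));
}
```

Calling print_core_config(0x430729ULL) and print_l3_config(0xFF0F00000040FF04ULL) from a main function shows that the short forms r0729 and rFF04 carry only the event and unit mask, while the long forms also carry the enable bit and, for L3, the mask fields.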
Hope this can help you.
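On the C API part of the question, I have not been able to test this on an EPYC, so take it as a sketch only. As far as I know, a non-core PMU is selected by putting its dynamic type id into perf_event_attr.type instead of PERF_TYPE_RAW, and the kernel publishes that id in sysfs. The code below assumes the file /sys/bus/event_source/devices/amd_l3/type exists on your machine; the function names are hypothetical. Uncore events are counted per CPU rather than per task, so the event is opened with pid = -1 and an explicit CPU, which usually needs root or a relaxed perf_event_paranoid setting. Whether the slice and thread mask bits have to be supplied in config, or are filled in by the driver, may depend on the kernel version.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Read a dynamic PMU type id from a sysfs "type" file.
 * Returns -1 if the file is missing or cannot be parsed. */
int read_pmu_type(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    int type = -1;
    if (fscanf(f, "%d", &type) != 1)
        type = -1;
    fclose(f);
    return type;
}

/* Fill a perf_event_attr for a plain counting event on the given PMU.
 * Sampling and exclude_* fields stay zero. */
void init_pmu_attr(struct perf_event_attr *attr, int pmu_type, uint64_t config)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = (uint32_t)pmu_type;  /* dynamic id, not PERF_TYPE_RAW */
    attr->config = config;            /* e.g. 0xFF04: umask 0xFF, event 0x04 */
}

/* Open the event on one CPU for all tasks (pid = -1).
 * Returns a file descriptor, or -1 with errno set. */
int open_pmu_event(int pmu_type, uint64_t config, int cpu)
{
    struct perf_event_attr attr;
    init_pmu_attr(&attr, pmu_type, config);
    return (int)syscall(SYS_perf_event_open, &attr, -1, cpu, -1, 0UL);
}
```

A caller would first do read_pmu_type("/sys/bus/event_source/devices/amd_l3/type"), give up if that returns -1, then call open_pmu_event(type, 0xFF04, 0) and read() a 64-bit counter value from the returned descriptor after the benchmark has run.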