How do I use the RDPMC instruction on AMD (EPYC) processors?

Posted on 2025-01-25 06:56:56

This program displays the number of actual CPU core cycles executed by the current core (using the corresponding PMC, which I believe is UNHALTED_CORE_CYCLES):

#include <unistd.h>
#include <cstdio>

int main(int argc, char* argv[]){

    unsigned long a, d, c, result;

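    // ECX selects the counter: bit 30 picks Intel's fixed-function bank,
    // and index 1 within it is the unhalted core cycle counter.
    c = (1UL<<30)+1;
    __asm__ volatile("rdpmc" : "=a" (a), "=d" (d) : "c" (c));

    // RDPMC returns the low 32 bits in EAX and the high bits in EDX.
    result = (a | (d << 32));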
    printf("Current cycles  : %lu\n", result);

}

It works well on Intel processors, but crashes with a "Segmentation fault" on AMD ones (EPYC 7001 and 7002 series). My first guess was that I need a different c value corresponding to AMD's CPU_CLOCKS_UNHALTED event (0x76), but I have not found one so far.

  • I didn't do anything special on the Intel side. Is this PMC enabled by default?
  • How can I make it work on AMD?
    • I tried to enable the counter with the wrmsr commands listed here, but they also gave me a "Segmentation fault" right away.
    • I tried the following command: echo 2 | sudo tee /sys/devices/cpu/rdpmc # enable RDPMC always, not just when a perf event is open
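
The rdpmc sysfs file in the last bullet controls whether the kernel lets user space execute RDPMC at all. As a minimal sketch, the current setting can be checked from a program before issuing the instruction. The helper names read_rdpmc_mode and rdpmc_mode_name are hypothetical, and the path is the one from the question, which is assumed to exist only on x86 Linux systems where the core PMU is registered as "cpu".

```c
#include <assert.h>
#include <stdio.h>

/* Path taken from the question; assumed to exist only on x86 Linux. */
#define RDPMC_SYSFS_PATH "/sys/devices/cpu/rdpmc"

/* Returns the mode (0, 1 or 2) stored in the sysfs file, or -1 if the
 * file cannot be opened or does not start with an integer. */
static int read_rdpmc_mode(const char *path) {
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int mode = -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}

/* Human-readable meaning of each mode, as I understand the kernel's
 * description of this file. */
static const char *rdpmc_mode_name(int mode) {
    switch (mode) {
    case 0:  return "disabled for user space";
    case 1:  return "allowed while a perf event is mmapped (default)";
    case 2:  return "always allowed";
    default: return "unknown";
    }
}
```

Printing rdpmc_mode_name(read_rdpmc_mode(RDPMC_SYSFS_PATH)) shows which case applies; a result of -1 means the file was not found or could not be read.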

Comments (1)

茶色山野 2025-02-01 06:56:56

The number is wrong: AMD uses different RDPMC counter numbers than Intel. Depending on the processor, several events can be read directly through rdpmc; please refer to this AMD manual for further information (section on RDPMC).

In your case, the counter number for core cycles should be 0.
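
To make the difference concrete, here is a minimal sketch of how the ECX selector is formed for each vendor. The helper names are hypothetical. The encodings assume Intel's architectural fixed-function counters (bit 30 set, with index 1 being unhalted core cycles) and AMD's core performance counters, which are addressed by their plain index. AMD has no counter bank selected by bit 30, so the question's selector 0x40000001 most likely names an unimplemented counter there; the resulting general-protection fault is what Linux reports as a segmentation fault.

```c
#include <assert.h>
#include <stdint.h>

/* Intel: bit 30 of ECX selects the fixed-function counter bank. */
#define INTEL_FIXED_BANK (UINT32_C(1) << 30)

/* Selector for Intel fixed-function counter `index`. */
static inline uint32_t intel_fixed_selector(uint32_t index) {
    return INTEL_FIXED_BANK | index;
}

/* Selector for AMD core performance counter `index`: the index itself. */
static inline uint32_t amd_core_selector(uint32_t index) {
    return index;
}

/* RDPMC returns the low half in EAX and the high half in EDX. */
static inline uint64_t combine_rdpmc(uint32_t eax, uint32_t edx) {
    return ((uint64_t)edx << 32) | eax;
}

/* Sanity checks on the encodings described above. */
static void selector_self_check(void) {
    assert(intel_fixed_selector(1) == UINT32_C(0x40000001));
    assert(amd_core_selector(0) == 0);
    assert(combine_rdpmc(0xFFFFFFFFu, 0x1u) == UINT64_C(0x1FFFFFFFF));
}
```

Note that intel_fixed_selector(1) reproduces the (1UL<<30)+1 value from the question, while amd_core_selector(0) is the value suggested here for AMD.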

The following code works for me to count PERF_COUNT_HW_INSTRUCTIONS:

#include <asm/unistd.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

static long
perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
                int cpu, int group_fd, unsigned long flags) {
    int ret;

    ret = syscall(__NR_perf_event_open, hw_event, pid, cpu, group_fd, flags);
    return ret;
}

#define rdpmc(counter, low, high)                \
    __asm__ __volatile__("rdpmc"                 \
                         : "=a"(low), "=d"(high) \
                         : "c"(counter))


int main() {
    unsigned long values1, values2;
    unsigned int fixed0, low, high;
    struct perf_event_attr pe;
    int fd, i;

    // RDPMC counter selector (ECX) for PERF_COUNT_HW_INSTRUCTIONS:
    //   AMD:   1 (performance counter 1)
    //   Intel: 1 << 30 (fixed-function counter 0)
    fixed0 = 1;

    memset(&pe, 0, sizeof(struct perf_event_attr));
    pe.type = PERF_TYPE_HARDWARE;
    pe.size = sizeof(struct perf_event_attr);
    pe.config = PERF_COUNT_HW_INSTRUCTIONS;
    pe.disabled = 1;
    pe.exclude_kernel = 0;
    pe.exclude_hv = 0;
    pe.exclude_idle = 0;

    fd = perf_event_open(&pe, 0, -1, -1, 0);
    if (fd == -1) {
        fprintf(stderr, "Error opening leader %llx\n", pe.config);
        exit(EXIT_FAILURE);
    }
    for (i = 1; i <= 50; i++) {
        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

        rdpmc(fixed0, low, high);
        values1 = ((unsigned long)high << 32) + (unsigned long)low;
        asm volatile("lfence" ::: "memory");        // code under test goes here
        rdpmc(fixed0, low, high);
        values2 = ((unsigned long)high << 32) + (unsigned long)low;
        
        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
        printf(" %lu\n", values2-values1);
    }
    close(fd);
}

Tested on a Ryzen 9 7950X.
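
Hard-coding the counter number is fragile, because the kernel chooses which hardware counter backs a perf event and may change that choice. A more robust variant, sketched below, maps the event's metadata page and reads the selector from its index field, following the read loop documented in perf_event_open(2). This assumes a Linux kernel recent enough to expose cap_user_rdpmc. The helper names map_instruction_counter and read_counter are hypothetical, and exclude_kernel is set here (unlike the code above) so that the sketch also runs under a stricter perf_event_paranoid setting. As far as I understand the kernel's behaviour, mapping this page is also what permits RDPMC when the rdpmc sysfs file is left at its default value.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: opens a user-space instruction counter for the
 * calling thread and maps its metadata page. Returns NULL on failure. */
static struct perf_event_mmap_page *map_instruction_counter(int *fd_out) {
    struct perf_event_attr pe;
    memset(&pe, 0, sizeof(pe));
    pe.type = PERF_TYPE_HARDWARE;
    pe.size = sizeof(pe);
    pe.config = PERF_COUNT_HW_INSTRUCTIONS;
    pe.exclude_kernel = 1; /* assumption: count user-space instructions only */

    int fd = (int)syscall(SYS_perf_event_open, &pe, 0, -1, -1, 0UL);
    if (fd < 0)
        return NULL;
    void *page = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE), PROT_READ,
                      MAP_SHARED, fd, 0);
    if (page == MAP_FAILED) {
        close(fd);
        return NULL;
    }
    *fd_out = fd;
    return (struct perf_event_mmap_page *)page;
}

/* Reads the event count with RDPMC, using the selector the kernel publishes.
 * Returns 0 on success, -1 if user-space RDPMC is not usable right now. */
static int read_counter(volatile struct perf_event_mmap_page *pc,
                        uint64_t *out) {
    assert(pc != NULL && out != NULL);
#if defined(__x86_64__) || defined(__i386__)
    uint32_t seq, index, lo, hi, width;
    int64_t count, pmc;
    do {
        seq = pc->lock;                    /* sequence lock: retry if it moves */
        __asm__ volatile("" ::: "memory"); /* compiler barrier */
        index = pc->index;                 /* 0 means "not on a hardware counter" */
        width = pc->pmc_width;
        count = pc->offset;
        if (!pc->cap_user_rdpmc || index == 0 || width == 0 || width > 64)
            return -1;
        __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(index - 1));
        /* Sign-extend the raw value from `width` bits, then add the offset. */
        pmc = (int64_t)((((uint64_t)hi << 32) | lo) << (64 - width));
        count += pmc >> (64 - width);
        __asm__ volatile("" ::: "memory");
    } while (pc->lock != seq);
    *out = (uint64_t)count;
    return 0;
#else
    (void)pc;
    (void)out;
    return -1; /* RDPMC is an x86 instruction */
#endif
}
```

Call read_counter before and after the code being measured and subtract the two results. A return value of -1 means the event is not currently scheduled on a hardware counter or RDPMC is not permitted, in which case read(2) on the file descriptor remains available as a fallback.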
