How can this function be optimized? (Uses almost all of the processing power)

Posted on 2024-12-01 22:52:13

I'm in the process of writing a little game to teach myself OpenGL rendering as it's one of the things I haven't tackled yet. I used SDL before and this same function, while still performing badly, didn't go as over the top as it does now.

Basically, there is not much going on in my game yet, just some basic movement and background drawing. When I switched to OpenGL, it appears as if it's way too fast. My frames per second exceed 2000 and this function uses up most of the processing power.

What is interesting is that the program in its SDL version used 100% CPU but ran smoothly, while the OpenGL version uses only about 40% - 60% CPU but seems to tax my graphics card in such a way that my whole desktop becomes unresponsive. Bad.

It's not too complex a function: it renders a 1024x1024 background tile according to the player's X and Y coordinates to give the impression of movement while the player graphic itself stays locked in the center. Because it's a small tile for a bigger screen, I have to render it multiple times to stitch the tiles together for a full background. The two for loops in the code below iterate 12 times, combined, so I can see why this is inefficient when called 2000 times per second.

So to get to the point, this is the evil-doer:

void render_background(game_t *game)
{
    int bgw;
    int bgh;

    int x, y;

    glBindTexture(GL_TEXTURE_2D, game->art_background);
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_WIDTH,  &bgw);
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_HEIGHT, &bgh);

    glBegin(GL_QUADS);

    /*
     * Start one background tile too early and end one too late
     * so the player can not outrun the background
     */
    for (x = -bgw; x < root->w + bgw; x += bgw)
    {
        for (y = -bgh; y < root->h + bgh; y += bgh)
        {
            /* Offsets */
            int ox = x + (int)game->player->x % bgw;
            int oy = y + (int)game->player->y % bgh;

            /* Top Left */
            glTexCoord2f(0, 0);
            glVertex3f(ox, oy, 0);

            /* Top Right */
            glTexCoord2f(1, 0);
            glVertex3f(ox + bgw, oy, 0);

            /* Bottom Right */
            glTexCoord2f(1, 1);
            glVertex3f(ox + bgw, oy + bgh, 0);

            /* Bottom Left */
            glTexCoord2f(0, 1);
            glVertex3f(ox, oy + bgh, 0);
        }
    }

    glEnd();
}

If I artificially limit the speed by calling SDL_Delay(1) in the game loop, I cut the FPS down to ~660 ± 20 and get no "performance overkill". But I doubt that is the correct way to go about this.

For the sake of completeness, these are my general rendering and game loop functions:

void game_main()
{
    long current_ticks = 0;
    long elapsed_ticks;
    long last_ticks = SDL_GetTicks();

    game_t game;
    object_t player;

    if (init_game(&game) != 0)
        return;

    init_player(&player);
    game.player = &player;

    /* game_init() */
    while (!game.quit)
    {
        /* Update number of ticks since last loop */
        current_ticks = SDL_GetTicks();
        elapsed_ticks = current_ticks - last_ticks;

        last_ticks = current_ticks;

        game_handle_inputs(elapsed_ticks, &game);
        game_update(elapsed_ticks, &game);

        game_render(elapsed_ticks, &game);

        /* Lagging stops if I enable this */
        /* SDL_Delay(1); */
    }

    cleanup_game(&game);


    return;
}

void game_render(long elapsed_ticks, game_t *game)
{
    game->tick_counter += elapsed_ticks;

    if (game->tick_counter >= 1000)
    {
        game->fps = game->frame_counter;
        game->tick_counter = 0;
        game->frame_counter = 0;

        printf("FPS: %d\n", game->fps);
    }

    render_background(game);
    render_objects(game);

    SDL_GL_SwapBuffers();
    game->frame_counter++;

    return;
}

According to gprof profiling, even when I limit the execution with SDL_Delay(), it still spends about 50% of the time rendering my background.

Comments (3)

仙女 2024-12-08 22:52:13

Turn on VSYNC. That way you'll calculate graphics data exactly as fast as the display can present it to the user, and you won't waste CPU or GPU cycles calculating extra frames in between that will just be discarded because the monitor is still busy displaying a previous frame.
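
To put a rough number on the waste, here is a minimal sketch. The 60 Hz refresh rate is an assumption (the question does not state it) and `discarded_frames` is a hypothetical helper, not part of SDL or OpenGL. The comment also notes how VSYNC can be requested in SDL 1.2.10 and later, which is the SDL generation the question's `SDL_GL_SwapBuffers()` call suggests.

```c
/*
 * Frames rendered per second that the monitor never gets to show.
 * Hypothetical helper for illustration only.
 */
int discarded_frames(int render_fps, int refresh_hz)
{
    return render_fps > refresh_hz ? render_fps - refresh_hz : 0;
}

/*
 * With the question's ~2000 FPS and an assumed 60 Hz display,
 * discarded_frames(2000, 60) is 1940, i.e. about 97% of the work.
 *
 * In SDL 1.2.10+ VSYNC is requested before the video mode is set:
 *
 *     SDL_GL_SetAttribute(SDL_GL_SWAP_CONTROL, 1);
 *     SDL_SetVideoMode(...);   // with the SDL_OPENGL flag, as before
 *
 * The driver may still ignore the request, so a software frame cap
 * is a reasonable fallback.
 */
```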

逆光下的微笑 2024-12-08 22:52:13

First of all, you don't need to render the tile x*y times - you can render it once for the entire area it should cover and use GL_REPEAT to have OpenGL cover the entire area with it. All you need to do is to compute the proper texture coordinates once, so that the tile doesn't get distorted (stretched). To make it appear to be moving, increase the texture coordinates by a small margin every frame.
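
A minimal sketch of the texture-coordinate computation described above, under a few assumptions: `tex_rect_t` and `background_tex_rect` are hypothetical names, the scroll offset is given in whole pixels, and the sign convention for the scroll direction is left to the caller. The OpenGL calls appear only in the trailing comment, so the helper itself has no GL dependency.

```c
/* Texture-coordinate rectangle for one screen-sized quad. */
typedef struct {
    float u0, v0;   /* top-left corner */
    float u1, v1;   /* bottom-right corner */
} tex_rect_t;

/*
 * Map a scroll offset in pixels onto texture space. With GL_REPEAT,
 * coordinates outside [0, 1] wrap around, so a span of
 * screen_w / tile_w repeats the tile that many times without stretching.
 */
tex_rect_t background_tex_rect(int scroll_x, int scroll_y,
                               int screen_w, int screen_h,
                               int tile_w, int tile_h)
{
    tex_rect_t r;

    /* The modulo keeps the offset within one tile so floats stay precise. */
    r.u0 = (float)(scroll_x % tile_w) / (float)tile_w;
    r.v0 = (float)(scroll_y % tile_h) / (float)tile_h;
    r.u1 = r.u0 + (float)screen_w / (float)tile_w;
    r.v1 = r.v0 + (float)screen_h / (float)tile_h;
    return r;
}

/*
 * Possible use inside render_background(), replacing both loops:
 *
 *     glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
 *     glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
 *
 *     glBegin(GL_QUADS);
 *     glTexCoord2f(r.u0, r.v0); glVertex2f(0,        0);
 *     glTexCoord2f(r.u1, r.v0); glVertex2f(screen_w, 0);
 *     glTexCoord2f(r.u1, r.v1); glVertex2f(screen_w, screen_h);
 *     glTexCoord2f(r.u0, r.v1); glVertex2f(0,        screen_h);
 *     glEnd();
 */
```

This draws one quad per frame instead of twelve. The wrap mode only needs to be set once after the texture is created, and the tile size can be stored at load time rather than queried every frame.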

Now down to limiting the speed. What you want to do is not to just plug a sleep() call in there, but measure the time it takes to render one complete frame:

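#include "SDL.h"

/* Sleep away whatever is left of the frame budget; both times are in milliseconds. */
void FrameCap (Uint32 desiredFrameTime, Uint32 actualFrameTime)
{
   if (desiredFrameTime > actualFrameTime)
      SDL_Delay (desiredFrameTime - actualFrameTime); // there is a small imprecision here
}

// inside the game loop:
Uint32 desiredFrameTime = 1000 / 60; // budget per frame in ms; 60 FPS is only an example target
Uint32 startTime = SDL_GetTicks ();
// render frame
FrameCap (desiredFrameTime, SDL_GetTicks () - startTime);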

There are ways to make this more precise (e.g. by using the performance counter functions on Windows 7, or using microsecond resolution on Linux), but I think you get the general idea. This approach also has the advantage of being driver independent and - unlike coupling to V-Sync - allowing an arbitrary frame rate.

梓梦 2024-12-08 22:52:13

At 2000 FPS it only takes 0.5 ms to render the entire frame. If you want to get 60 FPS then each frame should take about 16 ms. To do this, first render your frame (about 0.5 ms), then use SDL_Delay() to use up the rest of the 16 ms.

Also, if you are interested in profiling your code (which isn't needed if you are getting 2000 FPS!) then you may want to use High Resolution Timers. That way you could tell exactly how long any block of code takes, not just how much time your program spends in it.
