How much overhead is there in calling a function in C++?

Posted 2024-07-06 19:58:54

A lot of literature talks about using inline functions to "avoid the overhead of a function call". However, I haven't seen quantifiable data. What is the actual overhead of a function call, i.e. what sort of performance increase do we achieve by inlining functions?

Comments (15)

你在我安 2024-07-13 19:58:55

Depending on how you structure your code, its division into units such as modules and libraries can matter profoundly in some cases.

  1. Using a dynamic library function with external linkage will, most of the time, impose full stack-frame processing.
    That is why using qsort from the C standard library is an order of magnitude (10 times) slower than using STL code when the comparison operation is as simple as an integer comparison (see the sketch after this list).
  2. Passing function pointers between modules will also be affected.
  3. The same penalty will most likely affect the use of C++ virtual functions, as well as any other functions whose code is defined in separate modules.

  4. The good news is that whole-program optimization might resolve the issue for dependencies between static libraries and modules.
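
A minimal sketch of the qsort-vs-STL contrast in point 1 (the data and sizes are hypothetical; the 10x figure is this answer's claim and will vary with compiler, flags, and libc):

#include <algorithm>
#include <cstdlib>
#include <vector>

// qsort must invoke this through a function pointer for every single
// comparison; the call cannot be inlined across the libc boundary.
static int cmp_int(const void* a, const void* b)
{
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

int main()
{
    std::vector<int> v1(1000000);
    for (std::size_t i = 0; i < v1.size(); ++i)
        v1[i] = std::rand();
    std::vector<int> v2 = v1;

    // One indirect call per comparison:
    std::qsort(v1.data(), v1.size(), sizeof(int), cmp_int);

    // The comparator is part of the template instantiation, so the
    // compiler can inline the comparison into the sort loop:
    std::sort(v2.begin(), v2.end(), [](int a, int b) { return a < b; });
    return 0;
}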

就此别过 2024-07-13 19:58:54

On most architectures, the cost consists of saving all (or some, or none) of the registers to the stack, pushing the function arguments to the stack (or putting them in registers), incrementing the stack pointer and jumping to the beginning of the new code. Then when the function is done, you have to restore the registers from the stack. This webpage has a description of what's involved in the various calling conventions.

Most C++ compilers are smart enough now to inline functions for you. The inline keyword is just a hint to the compiler. Some will even do inlining across translation units where they decide it's helpful.
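
Since inline is only a hint, compilers also offer stronger, non-standard knobs; here is a minimal sketch of the common spellings (GCC/Clang and MSVC; check your compiler's documentation). The cross-translation-unit inlining mentioned above is typically what link-time optimization (e.g. GCC's -flto or MSVC's /LTCG) enables.

// The standard keyword: a hint the optimizer is free to ignore.
inline int add_hint(int a, int b) { return a + b; }

#if defined(__GNUC__) || defined(__clang__)
// GCC/Clang: inlined even where the heuristics would decline.
__attribute__((always_inline)) inline int add_forced(int a, int b)
{
    return a + b;
}
#elif defined(_MSC_VER)
// MSVC equivalent.
__forceinline int add_forced(int a, int b) { return a + b; }
#endif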

一花一树开 2024-07-13 19:58:54

I made a simple benchmark against a simple increment function:

inc.c:

typedef unsigned long ulong;
ulong inc(ulong x){
    return x+1;
}

main.c

#include <stdio.h>
#include <stdlib.h>

typedef unsigned long ulong;
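
/* Presumably built two ways (an assumption; the build commands are not
   in the original answer): compiling main.c together with inc.c and
   -DEXTERN forces an out-of-line call, while the default build uses
   the static inline definition below. */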

#ifdef EXTERN 
ulong inc(ulong);
#else
static inline ulong inc(ulong x){
    return x+1;
}
#endif

int main(int argc, char** argv){
    if (argc < 1+1)
        return 1;
    ulong i, sum = 0, cnt;
    cnt = atoi(argv[1]);
    for(i=0;i<cnt;i++){
        sum+=inc(i);
    }
    printf("%lu\n", sum);
    return 0;
}

Running it with a billion iterations on my Intel(R) Core(TM) i5 CPU M 430 @ 2.27GHz gave me:

  • 1.4 seconds for the inlined version
  • 4.4 seconds for the regularly linked version

(It appears to fluctuate by up to 0.2, but I'm too lazy to calculate proper standard deviations, nor do I care about them.)

This suggests that the overhead of a function call on this computer is about 3 nanoseconds.

The fastest operation I measured was about 0.3 ns, so that would suggest a function call costs about 9 primitive ops, to put it very simplistically.

This overhead increases by about another 2 ns per call (total call time about 6 ns) for functions called through the PLT (functions in a shared library).

浅暮の光 2024-07-13 19:58:54

There's the technical answer and the practical answer. The practical answer is that it will almost never matter, and in the very rare case it does, the only way you'll know is through actual profiled tests.

The technical answer, which your literature refers to, is generally not relevant due to compiler optimizations. But if you're still interested, it is well described by Josh.

As far as a "percentage" goes, you'd have to know how expensive the function itself was. Outside of the cost of the called function there is no percentage, because you are comparing against a zero-cost operation. Inlined code has no call cost; the processor just moves to the next instruction. The downside to inlining is a larger code size, which manifests its costs in a different way than the stack construction/teardown costs do.

々眼睛长脚气 2024-07-13 19:58:54

Your question is one of those questions that has no answer one could call the "absolute truth". The overhead of a normal function call depends on three factors:

  1. The CPU. The overhead on x86, PPC, and ARM CPUs varies a lot, and even if you stay within one architecture, the overhead also varies quite a bit between an Intel Pentium 4, an Intel Core 2 Duo and an Intel Core i7. The overhead might even vary noticeably between an Intel and an AMD CPU, even if both run at the same clock speed, since factors like cache sizes, caching algorithms, memory access patterns and the actual hardware implementation of the call opcode itself can have a huge influence on the overhead.

  2. The ABI (Application Binary Interface). Even with the same CPU, there often exist different ABIs that specify how function calls pass parameters (via registers, via the stack, or via a combination of both) and where and how stack frame initialization and clean-up take place. All of this influences the overhead. Different operating systems may use different ABIs for the same CPU; e.g. Linux, Windows and Solaris may all three use a different ABI for the same CPU.

  3. The compiler. Strictly following the ABI is only important if functions are called between independent code units, e.g. if an application calls a function of a system library, or a user library calls a function of another user library. As long as functions are "private", not visible outside a certain library or binary, the compiler may "cheat". It may not strictly follow the ABI but instead use shortcuts that lead to faster function calls. E.g. it may pass parameters in registers instead of using the stack, or it may skip stack frame setup and clean-up completely if not really necessary.

If you want to know the overhead for a specific combination of the three factors above, e.g. an Intel Core i5 on Linux using GCC, your only way to get this information is to benchmark the difference between two implementations, one using function calls and one where you copy the code directly into the caller; this way you force inlining for sure, since the inline keyword is only a hint and does not always lead to inlining.

However, the real question here is: does the exact overhead really matter? One thing is for sure: a function call always has an overhead. It may be small, it may be big, but it certainly exists. And no matter how small it is, if a function is called often enough in a performance-critical section, the overhead will matter to some degree. Inlining rarely makes your code slower, unless you terribly overdo it; it will make the code bigger, though. Today's compilers are pretty good at deciding for themselves when to inline and when not to, so you hardly ever have to rack your brain about it.

Personally, I ignore inlining during development completely, until I have a more or less usable product that I can profile, and only if profiling tells me that a certain function is called really often, and within a performance-critical section of the application, will I consider "force-inlining" that function.

So far my answer is very generic; it applies to C as much as it applies to C++ and Objective-C. As a closing word, let me say something about C++ in particular: virtual methods are doubly indirect function calls, which means they have a higher function call overhead than normal function calls, and they cannot be inlined. Non-virtual methods may or may not be inlined by the compiler, but even if they are not inlined, they are still significantly faster than virtual ones, so you should not make methods virtual unless you really plan to override them or have them overridden (see the sketch below).
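
A minimal C++ sketch (hypothetical types, not from the original answer) of the double indirection described above: a virtual call must load the object's hidden vtable pointer, then the function pointer stored in the vtable, before it can jump, and the compiler cannot inline it unless it can prove the dynamic type (devirtualization):

struct Base {
    virtual int f() const { return 1; }  // dispatched through the vtable
    int g() const { return 2; }          // statically bound, inline-able
    virtual ~Base() = default;
};

struct Derived : Base {
    int f() const override { return 3; }
};

int call_virtual(const Base& b)
{
    return b.f();  // load vptr, load vtable slot, indirect call
}

int call_direct(const Base& b)
{
    return b.g();  // direct call; trivially inlined by most compilers
}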

三生一梦 2024-07-13 19:58:54

The amount of overhead will depend on the compiler, CPU, etc. The percentage overhead will depend on the code you're inlining. The only way to know is to take your code and profile it both ways - that's why there's no definitive answer.

狼亦尘 2024-07-13 19:58:54

For very small functions, inlining makes sense, because the (small) cost of the function call is significant relative to the (very small) cost of the function body. For most functions longer than a few lines, it's not a big win.

十雾 2024-07-13 19:58:54

It's worth pointing out that an inlined function increases the size of the calling function, and anything that increases the size of a function may have a negative effect on caching. If you're right at a boundary, "just one more wafer-thin mint" of inlined code might have a dramatically negative effect on performance.


If you're reading literature that's warning about "the cost of a function call," I'd suggest it may be older material that doesn't reflect modern processors. Unless you're in the embedded world, the era in which C is a "portable assembly language" has essentially passed. A large amount of the ingenuity of the chip designers in the past decade (say) has gone into all sorts of low-level complexities that can differ radically from the way things worked "back in the day."

椒妓 2024-07-13 19:58:54

There is a great concept called 'register shadowing', which allows values (up to 6?) to be passed through registers (on the CPU) instead of the stack (memory). Also, depending on the function and the variables used within it, the compiler may simply decide that frame management code is not required!

Also, even a C++ compiler may do a 'tail call optimization': if A() calls B(), and after calling B(), A just returns, the compiler will reuse the stack frame!

Of course, all of this can be done only if the program sticks to the semantics of the standard (see pointer aliasing and its effect on optimizations).
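
A minimal sketch (hypothetical functions, not from the original answer) of the pattern described above: B() is called in tail position, so an optimizing compiler may turn A()'s "call B; return" into a plain jump into B(), reusing A()'s stack frame:

int B(int x)
{
    return x * 2;
}

int A(int x)
{
    // Nothing happens after B() returns, so this is a tail call;
    // the compiler may emit "jmp B" instead of "call B; ret".
    return B(x + 1);
}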

坏尐絯℡ 2024-07-13 19:58:54

Modern CPUs are very fast (obviously!). Almost every operation involved in calls and argument passing is a full-speed instruction (indirect calls might be slightly more expensive, mostly the first time through a loop).

Function call overhead is so small that only loops that call functions can make it relevant.

Therefore, when we talk about (and measure) function call overhead today, we are usually really talking about the overhead of not being able to hoist common subexpressions out of loops. If a function has to do a bunch of (identical) work every time it is called, the compiler would be able to "hoist" it out of the loop and do it once if the function were inlined. When not inlined, the code will probably just go ahead and repeat the work you told it to!

Inlined functions seem impossibly faster, not because of call and argument overhead, but because of common subexpressions that can be hoisted out of the function.

Example:

Foo::result_type MakeMeFaster()
{
  Foo t = 0;
  for (auto i = 0; i < 1000; ++i)
    t += CheckOverhead(SomethingUnpredictable());
  return t.result();
}

Foo CheckOverhead(int i)
{
  auto n = CalculatePi_1000_digits();
  return i * n;
}

An optimizer can see through this foolishness and do:

Foo::result_type MakeMeFaster()
{
  Foo t;
  auto _hidden_optimizer_tmp = CalculatePi_1000_digits();
  for (auto i = 0; i < 1000; ++i)
    t += SomethingUnpredictable() * _hidden_optimizer_tmp;
  return t.result();
}

It seems like call overhead is impossibly reduced, because the optimizer really has hoisted a big chunk of the function out of the loop (the CalculatePi_1000_digits call). The compiler would need to be able to prove that CalculatePi_1000_digits always returns the same result, but good optimizers can do that.

近箐 2024-07-13 19:58:54

There are a few issues here.

  • If you have a smart enough compiler, it will do some automatic inlining for you even if you did not specify inline. On the other hand, there are many things that cannot be inlined.

  • If the function is virtual, then of course you are going to pay the price that it cannot be inlined because the target is determined at runtime. Conversely, in Java, you might be paying this price unless you indicate that the method is final.

  • Depending on how your code is organized in memory, you may pay a cost in cache misses and even page faults as the code is located elsewhere. That can end up having a huge impact in some applications.

南街女流氓 2024-07-13 19:58:54

There is not much overhead at all, especially with small (inline-able) functions or even classes.

The following example has three different tests that are each run many, many times and timed. The results always agree to within a few thousandths of a unit of time.

#include <boost/timer/timer.hpp>
#include <iostream>
#include <cmath>

double sum;
double a = 42, b = 53;

//#define ITERATIONS 1000000 // 1 million - for testing
//#define ITERATIONS 10000000000 // 10 billion ~ 10s per run
//#define WORK_UNIT sum += a + b
/* output
8.609619s wall, 8.611255s user + 0.000000s system = 8.611255s CPU(100.0%)
8.604478s wall, 8.611255s user + 0.000000s system = 8.611255s CPU(100.1%)
8.610679s wall, 8.595655s user + 0.000000s system = 8.595655s CPU(99.8%)
9.5e+011 9.5e+011 9.5e+011
*/

#define ITERATIONS 100000000 // 100 million ~ 10s per run
#define WORK_UNIT sum += std::sqrt(a*a + b*b + sum) + std::sin(sum) + std::cos(sum)
/* output
8.485689s wall, 8.486454s user + 0.000000s system = 8.486454s CPU (100.0%)
8.494153s wall, 8.486454s user + 0.000000s system = 8.486454s CPU (99.9%)
8.467291s wall, 8.470854s user + 0.000000s system = 8.470854s CPU (100.0%)
2.50001e+015 2.50001e+015 2.50001e+015
*/


// ------------------------------
double simple()
{
   sum = 0;
   boost::timer::auto_cpu_timer t;
   for (unsigned long long i = 0; i < ITERATIONS; i++)
   {
      WORK_UNIT;
   }
   return sum;
}

// ------------------------------
void call6()
{
   WORK_UNIT;
}
void call5(){ call6(); }
void call4(){ call5(); }
void call3(){ call4(); }
void call2(){ call3(); }
void call1(){ call2(); }

double calls()
{
   sum = 0;
   boost::timer::auto_cpu_timer t;

   for (unsigned long long i = 0; i < ITERATIONS; i++)
   {
      call1();
   }
   return sum;
}

// ------------------------------
class Obj3{
public:
   void runIt(){
      WORK_UNIT;
   }
};

class Obj2{
public:
   Obj2(){it = new Obj3();}
   ~Obj2(){delete it;}
   void runIt(){it->runIt();}
   Obj3* it;
};

class Obj1{
public:
   void runIt(){it.runIt();}
   Obj2 it;
};

double objects()
{
   sum = 0;
   Obj1 obj;

   boost::timer::auto_cpu_timer t;
   for (unsigned long long i = 0; i < ITERATIONS; i++)
   {
      obj.runIt();
   }
   return sum;
}
// ------------------------------


int main(int argc, char** argv)
{
   double ssum = 0;
   double csum = 0;
   double osum = 0;

   ssum = simple();
   csum = calls();
   osum = objects();

   std::cout << ssum << " " << csum << " " << osum << std::endl;
}
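
(Presumably this is built against Boost.Timer, e.g. linking with -lboost_timer; that's an assumption, as the original answer doesn't show the build line.)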

The output for running 100,000,000 iterations (of each type: simple, six function calls, three object calls) was, with this semi-convoluted work payload:

sum += std::sqrt(a*a + b*b + sum) + std::sin(sum) + std::cos(sum)

as follows:

8.485689s wall, 8.486454s user + 0.000000s system = 8.486454s CPU (100.0%)
8.494153s wall, 8.486454s user + 0.000000s system = 8.486454s CPU (99.9%)
8.467291s wall, 8.470854s user + 0.000000s system = 8.470854s CPU (100.0%)
2.50001e+015 2.50001e+015 2.50001e+015

Using a simple work payload of

sum += a + b

gives the same results, except a couple of orders of magnitude faster in each case.

七颜 2024-07-13 19:58:54

As others have said, you really don't have to worry too much about overhead, unless you're going for ultimate performance or something akin to it. When you call a function, the compiler has to write code to:

  • Save function parameters to the stack
  • Save the return address to the stack
  • Jump to the starting address of the function
  • Allocate space for the function's local variables (stack)
  • Run the body of the function
  • Save the return value (stack)
  • Free the space for the local variables (pop the stack frame)
  • Jump back to the saved return address
  • Free the space used for the parameters
    etc...
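
A hedged sketch mapping the list above onto a concrete call (the exact steps depend on the calling convention; on x86-64 System V, for example, the first several integer arguments travel in registers rather than on the stack):

// Callee: x arrives in a register or on the stack, per the ABI.
int square(int x)
{
    int result = x * x;  // local variable; may live in a register or
                         // in freshly reserved stack space
    return result;       // return value placed in a register (RAX on
                         // x86-64), then jump to the saved return address
}

int caller()
{
    // Argument passed, return address saved by the call instruction,
    // control jumps to square(); afterwards the caller reclaims any
    // stack space it used for arguments.
    return square(7);
}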

However, you have to account for the lowered readability of your code, as well as the impact on your testing strategies, maintenance plans, and the overall size of your src file.

梦行七里 2024-07-13 19:58:54

Each function call requires a new local stack frame to be set up. But the overhead of this would only be noticeable if you are calling a function on every iteration of a loop over a very large number of iterations.

后eg是否自 2024-07-13 19:58:54

For most functions, there is no additional overhead for calling them in C++ vs C (unless you count the "this" pointer as an unnecessary argument to every function; you have to pass state to a function somehow, though)...

For virtual functions, there is an additional level of indirection (equivalent to calling a function through a pointer in C)... But really, on today's hardware this is trivial.
