Celeron 2.8 GHz vs. Athlon 3000+: a puzzling result

Posted on 2022-09-29 16:24:54

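I recently read an article, "让你的软件飞起来" ("Make Your Software Fly", see attachment), about improving program performance, using the conversion of a 24-bit color bitmap to grayscale as its example.
I ran the tests described in the article on two machines: my work machine, a 2.8 GHz Celeron, and my own machine, a 64-bit Athlon 3000+ at 1.8 GHz.
I expected the Athlon to be faster, but the result was the opposite: the Athlon took about twice as long as the Celeron, which puzzles me.
I had heard that Intel CPUs are faster at floating-point arithmetic, but the result is the same even when no floating-point arithmetic is used.
Also, according to the article, floating-point arithmetic is especially slow, yet my tests show little difference.
I tested with a 1280*1024 image under Cygwin on Windows XP; the Celeron typically takes 60-80 ms, while the Athlon takes 160-200 ms.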

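I hope someone on the forum can help explain the reason.

The program is too long to fit in one post, so I have split it across the follow-up posts.

test.c: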

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

// One 24-bit pixel, in the byte order used by BMP files (blue first).
typedef struct RGB {
    uint8_t b;
    uint8_t g;
    uint8_t r;
} RGB;

// The 12 bytes of the BMP file header that follow the 2-byte signature.
typedef struct {
    uint32_t imageSize;
    uint32_t blank;
    uint32_t startPosition;
} BmpHead;

// The 40-byte BMP info header.
typedef struct {
    uint32_t Length;
    uint32_t width;
    uint32_t height;
    uint16_t colorPlane;
    uint16_t bitColor;
    uint32_t zipFormat;
    uint32_t realSize;
    uint32_t xPels;
    uint32_t yPels;
    uint32_t colorUse;
    uint32_t colorImportant;
} InfoHead;

typedef struct BMPImage {
    BmpHead bh;
    InfoHead ih;
    RGB *data;
} BMPImage;

// Lookup tables holding each channel's weighted contribution to the gray value.
uint8_t rt[256], gt[256], bt[256];
void initRgbTable() {
    int i = 0;
    for(i=0; i<256; i++) {
        rt[i] = (i*1225) >> 12;
        gt[i] = (i*2404) >> 12;
        bt[i] = (i*467) >> 12;
    }
}


浅沫记忆 2022-10-06 16:24:54

The program, continued:

const char *usage = "Usage: %s <multicolor BMP file> <out filename>\n";

bool loadBMPImage(const char *filename, BMPImage *img);
void rgb2gray(BMPImage *img, const char *outfile);

int main(int argc, char *argv[]) {
    if(argc < 3) {
        printf(usage, argv[0]);
        exit(1);
    }

    BMPImage img;
    if(!loadBMPImage(argv[1], &img)) {
        exit(1); // loadBMPImage has already printed the reason
    }

    // initRgbTable();

    // time_t start = time(NULL);

    // clock() counts in units of 1/CLOCKS_PER_SEC seconds
    clock_t start = clock();
    rgb2gray(&img, argv[2]);
    clock_t end = clock();
    printf("past: %ld\n", (long)(end - start));

    free(img.data);
    return 0;
}

bool loadBMPImage(const char *filename, BMPImage *img) {
    FILE *file;
    uint32_t size; // number of pixels in the image

    printf("Start reading info from BMP file: %s\n", filename);

    // make sure the file is there
    if(!(file=fopen(filename, "rb"))) {
        printf("File not found\n");
        return false;
    }

    // skip the 2-byte "BM" signature
    fseek(file, 2, SEEK_CUR);

    if(fread(&img->bh, 12, 1, file) != 1) {
        printf("Error reading BmpHead\n");
        fclose(file);
        return false;
    }

    if(fread(&img->ih, 40, 1, file) != 1) {
        printf("Error reading InfoHead\n");
        fclose(file);
        return false;
    }

    // number of planes in the image (must be 1)
    if(img->ih.colorPlane != 1) {
        printf("Planes is not 1: %u\n", (unsigned)img->ih.colorPlane);
        fclose(file);
        return false;
    }
    // number of bits per pixel (must be 24)
    if(img->ih.bitColor != 24) {
        printf("Bpp is not 24: %u\n", (unsigned)img->ih.bitColor);
        fclose(file);
        return false;
    }

    printf("width: %upx\n", (unsigned)img->ih.width);
    printf("height: %upx\n", (unsigned)img->ih.height);

    // calculate the size in pixels (assuming 24 bits, or 3 bytes, per pixel
    // and rows without padding, i.e. width*3 is a multiple of 4)
    size = img->ih.width * img->ih.height;

    // read the data
    img->data = malloc((size_t)size*3);
    if(!img->data) {
        printf("Error allocating memory for color-corrected image data\n");
        fclose(file);
        return false;
    }
    if(fread(img->data, (size_t)size*3, 1, file) != 1) {
        printf("Error reading image data\n");
        free(img->data);
        fclose(file);
        return false;
    }
    printf("Finish Reading info from BMP file: %s\n", filename);
    /*
    for(uint32_t i=0; i<size; i++) { // reverse all of the colors. (bgr -> rgb)
        img->data[i].r ^= img->data[i].b;
        img->data[i].b ^= img->data[i].r;
        img->data[i].r ^= img->data[i].b;
    }
    */
    // we're done
    fclose(file);
    return true;
}


迷鸟归林 2022-10-06 16:24:54

The program, continued:

void rgb2gray(BMPImage *img, const char *outfile) {
    size_t size = (size_t)img->ih.width * img->ih.height;
    size_t i;
    for(i=0; i<size; i++) {
        RGB *rgb = img->data+i;
        // int g = 0.299*rgb->r + 0.587*rgb->g + 0.114*rgb->b;       // efficiency: 1

        // int g = (299*rgb->r + 587*rgb->g + 114*rgb->b) / 1000; // efficiency: 1.41

        int g = (1225*rgb->r + 2404*rgb->g + 467*rgb->b) >> 12; // efficiency: 1.63

        // int g = rt[rgb->r] + gt[rgb->g] + bt[rgb->b]; // efficiency: 1.33

        rgb->r = rgb->g = rgb->b = g;
    }

    FILE *out = fopen(outfile,"wb");
    if (NULL == out) {
        printf("Can not write to file %s.\n", outfile);
        exit(1);
    }
    char hh[2] = {0x42, 0x4D}; // the "BM" signature
    fwrite(hh, 2, 1, out);
    fwrite(&(img->bh), 12, 1, out);
    fwrite(&(img->ih), 40, 1, out);
    fwrite(img->data, 3, size, out);
    fclose(out);
}


烟雨凡馨 2022-10-06 16:24:54

Also, I compiled with gcc.


固执像三岁 2022-10-06 16:24:54

Try removing the fwrite calls and the other I/O from rgb2gray, then see whether the timing comparison changes.
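The suggestion above can be sketched roughly as follows: keep the pixel loop in its own function and put the clock() calls around that function only, so that no file I/O falls inside the measured interval. The names Pixel, gray_shift, ticks_to_ms and time_conversion_ms are hypothetical and do not appear in the original test.c; only the shift formula is taken from the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Same 3-byte layout as the RGB struct in test.c (hypothetical name). */
typedef struct { uint8_t b, g, r; } Pixel;

/* In-place grayscale conversion using the shift variant from the thread. */
static void gray_shift(Pixel *px, size_t count) {
    for (size_t i = 0; i < count; i++) {
        uint8_t g = (uint8_t)((1225 * px[i].r + 2404 * px[i].g + 467 * px[i].b) >> 12);
        px[i].r = px[i].g = px[i].b = g;
    }
}

/* Convert a clock() interval to milliseconds. */
static double ticks_to_ms(clock_t start, clock_t end) {
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC;
}

/* Time only the conversion; reading and writing files stay outside. */
static double time_conversion_ms(Pixel *px, size_t count) {
    clock_t start = clock();
    gray_shift(px, count);
    clock_t end = clock();
    return ticks_to_ms(start, end);
}
```

A caller would load the image first, call time_conversion_ms on the pixel buffer, and write the output file afterwards, so that disk activity cannot affect the reported number.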


隱形的亼 2022-10-06 16:24:54

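I would think that my hard drive at home, a Seagate SATA drive with 8 MB of cache at 7200 rpm, is no slower than the one at work, so the gap should not come from there.
Even if there is a difference, my home machine should be the one that writes files faster. Of course this is only a guess; I don't yet understand how disk writes work, so the hard drive could still be the cause.

I'm at work now, so I can't test on the Athlon.
After the change, the Celeron takes 30 ms for 1280*1024 and 15 ms for 1024*736.
I'll test again at home, but I have no internet access there, so I'll post the results tomorrow.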


沐歌 2022-10-06 16:24:54

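I read "让你的软件飞起来" ("Make Your Software Fly"). It converts the floating-point arithmetic into integer arithmetic, so in the end it is just integer arithmetic.
Your test environments are different, so the results probably aren't comparable.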
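As a sketch of what that float-to-integer conversion amounts to: each weight is multiplied by 2^12 = 4096 and rounded, which gives the constants 1225, 2404 and 467 used in rgb2gray, and the final >> 12 divides by 4096 again. The helper names below (to_fixed, gray_float, gray_fixed, max_difference) are hypothetical and only illustrate the arithmetic.

```c
#include <stdint.h>
#include <stdlib.h>

#define SHIFT 12 /* the scale factor is 2^12 = 4096 */

/* Scale a floating-point weight to fixed point, rounding to nearest. */
static int to_fixed(double weight) {
    return (int)(weight * (1 << SHIFT) + 0.5);
}

/* Floating-point version, as in the first commented-out line of rgb2gray. */
static int gray_float(uint8_t r, uint8_t g, uint8_t b) {
    return (int)(0.299 * r + 0.587 * g + 0.114 * b);
}

/* Fixed-point version, as in the active line of rgb2gray. */
static int gray_fixed(uint8_t r, uint8_t g, uint8_t b) {
    return (1225 * r + 2404 * g + 467 * b) >> SHIFT;
}

/* Largest absolute difference between the two versions over all 2^24 colors. */
static int max_difference(void) {
    int worst = 0;
    for (int r = 0; r < 256; r++)
        for (int g = 0; g < 256; g++)
            for (int b = 0; b < 256; b++) {
                int d = abs(gray_float((uint8_t)r, (uint8_t)g, (uint8_t)b)
                          - gray_fixed((uint8_t)r, (uint8_t)g, (uint8_t)b));
                if (d > worst) worst = d;
            }
    return worst;
}
```

Because the three scaled weights sum to exactly 4096, pure white still maps to 255, and the two versions never differ by more than one gray level.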


姐不稀罕 2022-10-06 16:24:54

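I tested at home last night. Thanks to bluster for the hint: it suddenly occurred to me that my machine runs Google Desktop Search, which usually causes delayed responses when I move files around.
I now time only the BMP processing and no longer include the file write, and the results match the Celeron's: 30 ms and 15 ms for 1280*1024 and 1024*768 respectively.
The original timing included the file write because I had originally looped the BMP processing several hundred times, which made the write time negligible, and placing the timing outside the BMP processing function was more convenient.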
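The "loop it several hundred times" approach mentioned above might look like the sketch below. A single pass takes only tens of milliseconds, and clock() can be fairly coarse on some platforms, so timing many passes and dividing by the count tends to give a steadier figure. Pixel, gray_shift and average_ms are hypothetical names, and the sketch relies on the shift conversion leaving an already gray image unchanged, so every pass does the same amount of work.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Same 3-byte layout as the RGB struct in test.c (hypothetical name). */
typedef struct { uint8_t b, g, r; } Pixel;

/* In-place grayscale conversion using the shift variant from the thread. */
static void gray_shift(Pixel *px, size_t count) {
    for (size_t i = 0; i < count; i++) {
        uint8_t g = (uint8_t)((1225 * px[i].r + 2404 * px[i].g + 467 * px[i].b) >> 12);
        px[i].r = px[i].g = px[i].b = g;
    }
}

/* Average milliseconds per pass over `reps` passes; no file I/O is timed.
   Returns a negative value if reps is not positive. */
static double average_ms(Pixel *px, size_t count, int reps) {
    if (reps < 1)
        return -1.0;
    clock_t start = clock();
    for (int i = 0; i < reps; i++)
        gray_shift(px, count);
    clock_t end = clock();
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC / reps;
}
```

With a few hundred repetitions the total run lasts several seconds, so one timer tick more or less changes the per-pass average very little.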

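As for mik's guess that the results are "not comparable": my intuition is that a better-equipped machine should always run faster. And now that the two machines give similar results, I am even more puzzled. Does the Celeron's higher clock frequency offset the Athlon's advantages?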

I was too busy yesterday and only tested the shift version; I forgot the others. I'll test the floating-point version tonight.
