How to remove garbage characters from an image_to_base64 string

Posted on 2025-01-20 18:40:41

I am encoding cropped images as base64 strings.

Most of them encode correctly, but I've noticed that some strings have garbage appended at the end:
ex1) ~~~ PM3fhtGKOYZ/9k=
ex2) ~~~ f8KKKAP//Z

I also confirmed that once I remove the garbage, what remains is a valid base64 string.

I suspect the length of the allocated string is the cause, but I don't know exactly what the problem is or how to fix it, so I'm asking for help.

This is my code:

static char encoding_table[] = {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
                            'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
                            'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
                            'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f',
                            'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
                            'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
                            'w', 'x', 'y', 'z', '0', '1', '2', '3',
                            '4', '5', '6', '7', '8', '9', '+', '/'};
static char *decoding_table = NULL;
static int mod_table[] = {0, 2, 1};

void build_decoding_table() {

  decoding_table = (char*)malloc(256);

  for (int i = 0; i < 64; i++)
    decoding_table[(unsigned char) encoding_table[i]] = i;
}

char *base64_encode(const unsigned char *data, uint32_t input_length, size_t *output_length) {

  *output_length = 4 * ((input_length + 2) / 3);

  char *encoded_data = (char*)malloc(*output_length);
  if (encoded_data == NULL) return NULL;

  for (int i = 0, j = 0; i < input_length;) {

    uint32_t octet_a = i < input_length ? (unsigned char)data[i++] : 0;
    uint32_t octet_b = i < input_length ? (unsigned char)data[i++] : 0;
    uint32_t octet_c = i < input_length ? (unsigned char)data[i++] : 0;
 
    uint32_t triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;

    encoded_data[j++] = encoding_table[(triple >> 3 * 6) & 0x3F];
    encoded_data[j++] = encoding_table[(triple >> 2 * 6) & 0x3F];
    encoded_data[j++] = encoding_table[(triple >> 1 * 6) & 0x3F];
    encoded_data[j++] = encoding_table[(triple >> 0 * 6) & 0x3F];

  }

  for (int i = 0; i < mod_table[input_length % 3]; i++)
    encoded_data[*output_length - 1 - i] = '=';

  return encoded_data;
}

static char *rt_b64(){
            char *lobi;
            size_t output_length;
            lobi = base64_encode(enc_jpeg_image->outBuffer, enc_jpeg_image->outLen, &output_length); // "enc_jpeg_image" is a structure that holds information about objects in the pipeline
            return lobi;
}

Comments (1)

顾铮苏瑾 2025-01-27 18:40:41

You forgot to add the end-of-string marker '\0'.

As a result, your "string" has no defined end, and every string function applied to it keeps reading until it happens to find a '\0'. The bytes after the allocated space are used for other things, for example malloc()'s own memory management, so functions that handle your "string" read memory out of bounds. Interpreted as characters, those bytes look like the "garbage" you describe.
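
To make the difference concrete, here is a minimal sketch of the two safe ways to handle bytes that carry no terminator: copy them into a buffer that is one byte larger and terminate it, or pass the length explicitly. The helper names make_c_string and format_exact are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `len` unterminated bytes into a freshly
   allocated, NUL-terminated C string. The caller frees the result. */
char *make_c_string(const char *bytes, size_t len) {
    assert(bytes != NULL || len == 0);
    char *s = malloc(len + 1);      /* one extra byte for '\0' */
    if (s == NULL) {
        return NULL;
    }
    if (len > 0) {
        memcpy(s, bytes, len);
    }
    s[len] = '\0';                  /* defined end of the string */
    return s;
}

/* Hypothetical helper: format exactly `len` bytes into `out`.
   The "%.*s" precision stops the read after `len` bytes, so the
   source does not need a terminator. Returns what snprintf returns. */
int format_exact(char *out, size_t out_size, const char *bytes, size_t len) {
    return snprintf(out, out_size, "%.*s", (int)len, bytes);
}
```

Calling strlen() or printf("%s", ...) directly on the unterminated bytes would be the out-of-bounds read described above; both helpers avoid it because they never look past `len`.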


Fix this by allocating one more character and marking the end of the string:

char *base64_encode(const unsigned char *data, size_t input_length, size_t *output_length) {
    *output_length = 4 * ((input_length + 2) / 3);

    char *encoded_data = (char*)malloc(*output_length + 1); /* <-- here */
    if (encoded_data == NULL) {
        return NULL;
    }

    for (size_t i = 0, j = 0; i < input_length; ) {
        size_t octet_a = i < input_length ? (unsigned char)data[i++] : 0;
        size_t octet_b = i < input_length ? (unsigned char)data[i++] : 0;
        size_t octet_c = i < input_length ? (unsigned char)data[i++] : 0;
 
        size_t triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;

        encoded_data[j++] = encoding_table[(triple >> 3 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 2 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 1 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 0 * 6) & 0x3F];
    }

    for (size_t i = 0; i < mod_table[input_length % 3]; i++) {
        encoded_data[*output_length - 1 - i] = '=';
    }

    encoded_data[*output_length] = '\0'; /* <-- here */

    return encoded_data;
}

Note: I have adjusted some types to reduce warnings.
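
For checking the fix in isolation, the following is a self-contained sketch of a terminated encoder. It is not the code from the question: the name b64_encode_terminated and the string-literal alphabet are assumptions made here, and padding is written inside the loop instead of through mod_table. The output can be compared against the test vectors in RFC 4648, for example "foobar" encodes to "Zm9vYmFy" and "f" to "Zg==".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical name. Returns a malloc'ed, NUL-terminated Base64 string,
   or NULL if allocation fails. If out_len is not NULL, it receives the
   length without the terminator. The caller frees the result. */
char *b64_encode_terminated(const unsigned char *data, size_t len,
                            size_t *out_len) {
    assert(data != NULL || len == 0);
    size_t n = 4 * ((len + 2) / 3);
    char *out = malloc(n + 1);              /* +1 for the terminator */
    if (out == NULL) {
        return NULL;
    }

    size_t i = 0, j = 0;
    while (i < len) {
        size_t remaining = len - i;         /* bytes left, at least 1 */
        uint32_t a = data[i++];
        uint32_t b = remaining > 1 ? data[i++] : 0;
        uint32_t c = remaining > 2 ? data[i++] : 0;
        uint32_t triple = (a << 16) | (b << 8) | c;

        out[j++] = b64_alphabet[(triple >> 18) & 0x3F];
        out[j++] = b64_alphabet[(triple >> 12) & 0x3F];
        /* Pad with '=' where the input ran out before the group was full. */
        out[j++] = remaining > 1 ? b64_alphabet[(triple >> 6) & 0x3F] : '=';
        out[j++] = remaining > 2 ? b64_alphabet[triple & 0x3F] : '=';
    }

    out[j] = '\0';                          /* j == n at this point */
    if (out_len != NULL) {
        *out_len = j;
    }
    return out;
}
```

Returning the length as well as terminating the buffer lets a caller choose either convention; in the question's rt_b64() the length is discarded, so the terminator is the only thing that tells later code where the string ends.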
