How to get a fast file hashing algorithm for large files on a mobile device

Posted on 2025-01-05 09:29:59

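Prologue
One important discovery I made while testing MD5, Adler-32 and CRC-32 on a 100 MB file is that, strangely, they all take the same time. I guess this can only mean one of two things: either the filesystem on the Android device is the bottleneck and cannot feed the algorithm fast enough, or I made a fundamental error implementing JNI. I could live with the latter.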
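One way to tell these two explanations apart is to time a pass over the file that only reads it and does no hashing at all. The sketch below is an illustration under stated assumptions, not code from the original tests: `drain_file` and `drain_result` are hypothetical names, and it uses only standard C (`time` for wall-clock seconds, `clock` for processor time). If reading alone already takes about as long as reading plus hashing, or if the processor time is far below the wall-clock time, storage is the likelier bottleneck.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Result of one read-only pass over a file (hypothetical helper type). */
typedef struct {
    long long bytes;      /* total bytes read */
    double wall_seconds;  /* elapsed real time, 1 s resolution */
    double cpu_seconds;   /* processor time used by this process */
} drain_result;

/* Reads the whole file in buf_size pieces and discards the data.
   Returns 0 on success, -1 if the file cannot be opened or read. */
int drain_file(const char *path, size_t buf_size, drain_result *out)
{
    FILE *f;
    unsigned char *buf;
    size_t n;
    time_t wall_start;
    clock_t cpu_start;
    int failed;

    assert(out != NULL && buf_size > 0);

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    buf = malloc(buf_size);
    if (buf == NULL) {
        fclose(f);
        return -1;
    }

    out->bytes = 0;
    wall_start = time(NULL);
    cpu_start = clock();
    while ((n = fread(buf, 1, buf_size, f)) > 0)
        out->bytes += (long long)n;
    out->cpu_seconds = (double)(clock() - cpu_start) / CLOCKS_PER_SEC;
    out->wall_seconds = difftime(time(NULL), wall_start);

    failed = ferror(f);
    free(buf);
    fclose(f);
    return failed ? -1 : 0;
}
```

Calling it with a few different `buf_size` values (for example 1024 and 65536) also shows whether the read size matters. The file should not have just been read, since a copy served from the operating system's cache will look faster than the storage really is.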

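Hashing small files such as images, MP3s and other files under 10 MB takes only seconds with the MD5 algorithm.

My problem is that my files are between 100 MB and 700 MB in size.

My requirement is that downloaded files must match the original source file.

I ran some tests generating the MD5 hash of a 100 MB file.

On an HTC Desire running Android v2.2, I ran both a JNI native test and a Java MessageDigest.getInstance("MD5") test.

Both tests calculated the MD5 of the same file, and both ran for approximately the same length of time (1-2 minutes). I had debugging turned off.

It was my understanding that the native test would be faster.

How can I get the hashing time down to, say, 10-15 seconds for 100 MB on the above device?
The cost of this is of course collision accuracy, but I can live with the hash being wrong one time in a million.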
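As a rough sanity check on that target, using only the figures quoted above and ignoring the difference between MB and MiB, the observed and required throughputs work out as follows:

```latex
\begin{align*}
\text{observed:}\quad
  & \frac{100\ \text{MB}}{120\ \text{s}} \approx 0.83\ \text{MB/s}
    \quad\text{to}\quad
    \frac{100\ \text{MB}}{60\ \text{s}} \approx 1.67\ \text{MB/s} \\
\text{target:}\quad
  & \frac{100\ \text{MB}}{15\ \text{s}} \approx 6.67\ \text{MB/s}
    \quad\text{to}\quad
    \frac{100\ \text{MB}}{10\ \text{s}} = 10\ \text{MB/s}
\end{align*}
```

The target therefore corresponds to a speed-up of roughly 4x (1.67 to 6.67 MB/s) up to 12x (0.83 to 10 MB/s) over the measured runs.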

UPDATE: I'm no C guru, but here is my test C code for MD5. It was not much faster than the Java MessageDigest. It felt like I was running on the Android main UI thread.

#include <android/log.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>
#include <string.h>
#include <inttypes.h>
#include <jni.h>
#include <stdlib.h>
/* typedef a 32 bit type; unsigned long is 64 bits wide on LP64 targets,
   which would break ROTATE_LEFT and the digest */
typedef uint32_t UINT4;

/* Data structure for MD5 (Message Digest) computation */
typedef struct {
  UINT4 i[2];                   /* number of _bits_ handled mod 2^64 */
  UINT4 buf[4];                                    /* scratch buffer */
  unsigned char in[64];                              /* input buffer */
  unsigned char digest[16];     /* actual digest after MD5Final call */
} MD5_CTX;

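void MD5Init (MD5_CTX *mdContext);
void MD5Update (MD5_CTX *mdContext, unsigned char *inBuf, unsigned int inLen);
void MD5Final (MD5_CTX *mdContext);

/* forward declaration */
static void Transform (UINT4 *buf, UINT4 *in);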

static unsigned char PADDING[64] = {
  0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* F, G, H and I are the basic MD5 functions */
#define F(x, y, z) (((x) & (y)) | (~(x) & (z)))
#define G(x, y, z) (((x) & (z)) | ((y) & ~(z)))
#define H(x, y, z) ((x) ^ (y) ^ (z))
#define I(x, y, z) ((y) ^ ((x) | ~(z)))

/* ROTATE_LEFT rotates x left n bits */
#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32-(n))))

/* FF, GG, HH, and II transformations for rounds 1, 2, 3, and 4 */
/* Rotation is separate from addition to prevent recomputation */
#define FF(a, b, c, d, x, s, ac) \
  {(a) += F ((b), (c), (d)) + (x) + (UINT4)(ac); \
   (a) = ROTATE_LEFT ((a), (s)); \
   (a) += (b); \
  }
#define GG(a, b, c, d, x, s, ac) \
  {(a) += G ((b), (c), (d)) + (x) + (UINT4)(ac); \
   (a) = ROTATE_LEFT ((a), (s)); \
   (a) += (b); \
  }
#define HH(a, b, c, d, x, s, ac) \
  {(a) += H ((b), (c), (d)) + (x) + (UINT4)(ac); \
   (a) = ROTATE_LEFT ((a), (s)); \
   (a) += (b); \
  }
#define II(a, b, c, d, x, s, ac) \
  {(a) += I ((b), (c), (d)) + (x) + (UINT4)(ac); \
   (a) = ROTATE_LEFT ((a), (s)); \
   (a) += (b); \
  }

void MD5Init (MD5_CTX *mdContext)
{
  mdContext->i[0] = mdContext->i[1] = (UINT4)0;

  /* Load magic initialization constants.
   */
  mdContext->buf[0] = (UINT4)0x67452301;
  mdContext->buf[1] = (UINT4)0xefcdab89;
  mdContext->buf[2] = (UINT4)0x98badcfe;
  mdContext->buf[3] = (UINT4)0x10325476;
}

void MD5Update (MD5_CTX *mdContext, unsigned char *inBuf, unsigned int inLen)
{
  UINT4 in[16];
  int mdi;
  unsigned int i, ii;

  /* compute number of bytes mod 64 */
  mdi = (int)((mdContext->i[0] >> 3) & 0x3F);

  /* update number of bits */
  if ((mdContext->i[0] + ((UINT4)inLen << 3)) < mdContext->i[0])
    mdContext->i[1]++;
  mdContext->i[0] += ((UINT4)inLen << 3);
  mdContext->i[1] += ((UINT4)inLen >> 29);

  while (inLen--) {
    /* add new character to buffer, increment mdi */
    mdContext->in[mdi++] = *inBuf++;

    /* transform if necessary */
    if (mdi == 0x40) {
      for (i = 0, ii = 0; i < 16; i++, ii += 4)
        in[i] = (((UINT4)mdContext->in[ii+3]) << 24) |
                (((UINT4)mdContext->in[ii+2]) << 16) |
                (((UINT4)mdContext->in[ii+1]) << 8) |
                ((UINT4)mdContext->in[ii]);
      Transform (mdContext->buf, in);
      mdi = 0;
    }
  }
}

void MD5Final (MD5_CTX *mdContext)
{
  UINT4 in[16];
  int mdi;
  unsigned int i, ii;
  unsigned int padLen;

  /* save number of bits */
  in[14] = mdContext->i[0];
  in[15] = mdContext->i[1];

  /* compute number of bytes mod 64 */
  mdi = (int)((mdContext->i[0] >> 3) & 0x3F);

  /* pad out to 56 mod 64 */
  padLen = (mdi < 56) ? (56 - mdi) : (120 - mdi);
  MD5Update (mdContext, PADDING, padLen);

  /* append length in bits and transform */
  for (i = 0, ii = 0; i < 14; i++, ii += 4)
    in[i] = (((UINT4)mdContext->in[ii+3]) << 24) |
            (((UINT4)mdContext->in[ii+2]) << 16) |
            (((UINT4)mdContext->in[ii+1]) << 8) |
            ((UINT4)mdContext->in[ii]);
  Transform (mdContext->buf, in);

  /* store buffer in digest */
  for (i = 0, ii = 0; i < 4; i++, ii += 4) {
    mdContext->digest[ii] = (unsigned char)(mdContext->buf[i] & 0xFF);
    mdContext->digest[ii+1] =
      (unsigned char)((mdContext->buf[i] >> 8) & 0xFF);
    mdContext->digest[ii+2] =
      (unsigned char)((mdContext->buf[i] >> 16) & 0xFF);
    mdContext->digest[ii+3] =
      (unsigned char)((mdContext->buf[i] >> 24) & 0xFF);
  }
}

/* Basic MD5 step. Transform buf based on in.
 */
static void Transform (UINT4 *buf, UINT4 *in)
{
  UINT4 a = buf[0], b = buf[1], c = buf[2], d = buf[3];

  /* Round 1 */
#define S11 7
#define S12 12
#define S13 17
#define S14 22
  FF ( a, b, c, d, in[ 0], S11, 3614090360u); /* 1 */
  FF ( d, a, b, c, in[ 1], S12, 3905402710u); /* 2 */
  FF ( c, d, a, b, in[ 2], S13,  606105819u); /* 3 */
  FF ( b, c, d, a, in[ 3], S14, 3250441966u); /* 4 */
  FF ( a, b, c, d, in[ 4], S11, 4118548399u); /* 5 */
  FF ( d, a, b, c, in[ 5], S12, 1200080426u); /* 6 */
  FF ( c, d, a, b, in[ 6], S13, 2821735955u); /* 7 */
  FF ( b, c, d, a, in[ 7], S14, 4249261313u); /* 8 */
  FF ( a, b, c, d, in[ 8], S11, 1770035416u); /* 9 */
  FF ( d, a, b, c, in[ 9], S12, 2336552879u); /* 10 */
  FF ( c, d, a, b, in[10], S13, 4294925233u); /* 11 */
  FF ( b, c, d, a, in[11], S14, 2304563134u); /* 12 */
  FF ( a, b, c, d, in[12], S11, 1804603682u); /* 13 */
  FF ( d, a, b, c, in[13], S12, 4254626195u); /* 14 */
  FF ( c, d, a, b, in[14], S13, 2792965006u); /* 15 */
  FF ( b, c, d, a, in[15], S14, 1236535329u); /* 16 */

  /* Round 2 */
#define S21 5
#define S22 9
#define S23 14
#define S24 20
  GG ( a, b, c, d, in[ 1], S21, 4129170786u); /* 17 */
  GG ( d, a, b, c, in[ 6], S22, 3225465664u); /* 18 */
  GG ( c, d, a, b, in[11], S23,  643717713u); /* 19 */
  GG ( b, c, d, a, in[ 0], S24, 3921069994u); /* 20 */
  GG ( a, b, c, d, in[ 5], S21, 3593408605u); /* 21 */
  GG ( d, a, b, c, in[10], S22,   38016083u); /* 22 */
  GG ( c, d, a, b, in[15], S23, 3634488961u); /* 23 */
  GG ( b, c, d, a, in[ 4], S24, 3889429448u); /* 24 */
  GG ( a, b, c, d, in[ 9], S21,  568446438u); /* 25 */
  GG ( d, a, b, c, in[14], S22, 3275163606u); /* 26 */
  GG ( c, d, a, b, in[ 3], S23, 4107603335u); /* 27 */
  GG ( b, c, d, a, in[ 8], S24, 1163531501u); /* 28 */
  GG ( a, b, c, d, in[13], S21, 2850285829u); /* 29 */
  GG ( d, a, b, c, in[ 2], S22, 4243563512u); /* 30 */
  GG ( c, d, a, b, in[ 7], S23, 1735328473u); /* 31 */
  GG ( b, c, d, a, in[12], S24, 2368359562u); /* 32 */

  /* Round 3 */
#define S31 4
#define S32 11
#define S33 16
#define S34 23
  HH ( a, b, c, d, in[ 5], S31, 4294588738u); /* 33 */
  HH ( d, a, b, c, in[ 8], S32, 2272392833u); /* 34 */
  HH ( c, d, a, b, in[11], S33, 1839030562u); /* 35 */
  HH ( b, c, d, a, in[14], S34, 4259657740u); /* 36 */
  HH ( a, b, c, d, in[ 1], S31, 2763975236u); /* 37 */
  HH ( d, a, b, c, in[ 4], S32, 1272893353u); /* 38 */
  HH ( c, d, a, b, in[ 7], S33, 4139469664u); /* 39 */
  HH ( b, c, d, a, in[10], S34, 3200236656u); /* 40 */
  HH ( a, b, c, d, in[13], S31,  681279174u); /* 41 */
  HH ( d, a, b, c, in[ 0], S32, 3936430074u); /* 42 */
  HH ( c, d, a, b, in[ 3], S33, 3572445317u); /* 43 */
  HH ( b, c, d, a, in[ 6], S34,   76029189u); /* 44 */
  HH ( a, b, c, d, in[ 9], S31, 3654602809u); /* 45 */
  HH ( d, a, b, c, in[12], S32, 3873151461u); /* 46 */
  HH ( c, d, a, b, in[15], S33,  530742520u); /* 47 */
  HH ( b, c, d, a, in[ 2], S34, 3299628645u); /* 48 */

  /* Round 4 */
#define S41 6
#define S42 10
#define S43 15
#define S44 21
  II ( a, b, c, d, in[ 0], S41, 4096336452u); /* 49 */
  II ( d, a, b, c, in[ 7], S42, 1126891415u); /* 50 */
  II ( c, d, a, b, in[14], S43, 2878612391u); /* 51 */
  II ( b, c, d, a, in[ 5], S44, 4237533241u); /* 52 */
  II ( a, b, c, d, in[12], S41, 1700485571u); /* 53 */
  II ( d, a, b, c, in[ 3], S42, 2399980690u); /* 54 */
  II ( c, d, a, b, in[10], S43, 4293915773u); /* 55 */
  II ( b, c, d, a, in[ 1], S44, 2240044497u); /* 56 */
  II ( a, b, c, d, in[ 8], S41, 1873313359u); /* 57 */
  II ( d, a, b, c, in[15], S42, 4264355552u); /* 58 */
  II ( c, d, a, b, in[ 6], S43, 2734768916u); /* 59 */
  II ( b, c, d, a, in[13], S44, 1309151649u); /* 60 */
  II ( a, b, c, d, in[ 4], S41, 4149444226u); /* 61 */
  II ( d, a, b, c, in[11], S42, 3174756917u); /* 62 */
  II ( c, d, a, b, in[ 2], S43,  718787259u); /* 63 */
  II ( b, c, d, a, in[ 9], S44, 3951481745u); /* 64 */

  buf[0] += a;
  buf[1] += b;
  buf[2] += c;
  buf[3] += d;
}

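JNIEXPORT jstring JNICALL
Java_com_carlsberg_IntentServiceSendFiles_gethash (JNIEnv *env, jobject thiz,
                                                   jstring filename)
{
    enum { BUF_SIZE = 64 * 1024 };  /* fewer fread calls than with 1 KiB */
    MD5_CTX mdContext;
    unsigned char *data;
    size_t bytes;
    char tempValue[33];             /* 32 hex digits + 0-terminator */
    int i;
    FILE *inFile;

    const char *fi = (*env)->GetStringUTFChars (env, filename, NULL);
    if (fi == NULL)
        return NULL;                /* an exception is already pending */

    inFile = fopen (fi, "rb");
    if (inFile == NULL)
        /* printf output is not visible in logcat, so log through android/log.h */
        __android_log_print (ANDROID_LOG_ERROR, "gethash", "%s can't be opened.", fi);
    (*env)->ReleaseStringUTFChars (env, filename, fi);
    if (inFile == NULL)
        return NULL;                /* a bare "return;" is invalid in a jstring function */

    data = malloc (BUF_SIZE);
    if (data == NULL) {
        fclose (inFile);
        return NULL;
    }

    MD5Init (&mdContext);
    while ((bytes = fread (data, 1, BUF_SIZE, inFile)) != 0)
        MD5Update (&mdContext, data, (unsigned int)bytes);
    MD5Final (&mdContext);
    fclose (inFile);
    free (data);

    /* convert to hex */
    for (i = 0; i < 16; ++i)
        sprintf (tempValue + 2*i, "%02x", mdContext.digest[i]);

    return (*env)->NewStringUTF (env, tempValue);
}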

Comments (2)

心房敞 2025-01-12 09:29:59

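Android uses BouncyCastle for its crypto API, which implements all its digest algorithms in Java. So you are right: it should be faster when it is done entirely in native code. If you have the knowledge, the time and the need to use them in native code, it will be (according to your measurements) a little faster.

You should also use TCP or another protocol that ensures the data arrives correctly (I guess you already use TCP rather than UDP, since you use FTP).

What I would do in this case is the following:

I would create two new threads (besides the UI thread, which does some fancy progress-bar drawing): the first is responsible for the download and the second for the hashing.

The download thread would notify the hashing thread about newly arrived chunks. The chunks could be 10 MB or so. The hashing thread then processes only 10 MB chunks, which should be reasonably fast and also lets you notice a broken file early. With this approach you could also detect where the download broke and re-download the file starting from the first broken chunk. Of course, you would have to create a chunk list and transfer it to the client before this can work.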
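The chunk list idea can be sketched as follows. This is a minimal illustration under assumptions, not the answer's own code: the names `crc32_update`, `chunk_count`, `build_chunk_list` and `first_bad_chunk` are hypothetical, CRC-32 is picked here only as an example of a fast checksum, and the data sits in one memory buffer for brevity. On a device the hashing thread would instead feed each chunk to `crc32_update` as the download thread hands it over.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Streaming CRC-32 (reflected polynomial 0xEDB88320, the zlib/PNG variant).
   Start with crc = 0 and feed the data piece by piece. */
uint32_t crc32_update(uint32_t crc, const unsigned char *data, size_t len)
{
    size_t i;
    int bit;

    crc = ~crc;
    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Number of chunks needed to cover len bytes. */
size_t chunk_count(size_t len, size_t chunk_size)
{
    assert(chunk_size > 0);
    return (len + chunk_size - 1) / chunk_size;
}

/* Sender side: store one CRC-32 per chunk in list[], which must have
   room for chunk_count(len, chunk_size) entries. */
void build_chunk_list(const unsigned char *data, size_t len,
                      size_t chunk_size, uint32_t *list)
{
    size_t i, n = chunk_count(len, chunk_size);

    for (i = 0; i < n; i++) {
        size_t off = i * chunk_size;
        size_t sz = (len - off < chunk_size) ? len - off : chunk_size;
        list[i] = crc32_update(0, data + off, sz);
    }
}

/* Receiver side: index of the first chunk whose CRC-32 differs from the
   transferred list, or -1 if every chunk matches. */
long first_bad_chunk(const unsigned char *data, size_t len,
                     size_t chunk_size, const uint32_t *list)
{
    size_t i, n = chunk_count(len, chunk_size);

    for (i = 0; i < n; i++) {
        size_t off = i * chunk_size;
        size_t sz = (len - off < chunk_size) ? len - off : chunk_size;
        if (crc32_update(0, data + off, sz) != list[i])
            return (long)i;
    }
    return -1;
}
```

A non-negative result from `first_bad_chunk` tells the client where to resume the download. Because each chunk is checked as it arrives, no second full pass over the file is needed once the download finishes.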

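You can also use a very fast hashing algorithm here, one that is suitable for detecting transfer breaks (which should not occur, since TCP guarantees that the data arrives as it was sent).

After reading my text again, this feels somewhat like a torrent (chunk-based, hashed to check that everything is correct, able to retransmit...).

Bonus points: do it in native code so it's a little faster.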

暖伴 2025-01-12 09:29:59

You could try the rsync approach, i.e. initially use a fast hash such as Adler-32 or CRC-32, and only use the slower MD5 when the fast hashes collide.
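A minimal sketch of that two-tier check, assuming Adler-32 as the fast hash: `adler32_update`, `quick_verdict` and `quick_compare` are hypothetical names for illustration, and the MD5 step is left to whatever implementation is already in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u  /* largest prime below 2^16 */

/* Streaming Adler-32. Start with adler = 1 and feed the data piece by piece. */
uint32_t adler32_update(uint32_t adler, const unsigned char *data, size_t len)
{
    uint32_t a = adler & 0xFFFFu;
    uint32_t b = (adler >> 16) & 0xFFFFu;
    size_t i;

    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++) {
        a = (a + data[i]) % ADLER_MOD;
        b = (b + a) % ADLER_MOD;
    }
    return (b << 16) | a;
}

typedef enum {
    FILES_DIFFER,      /* fast hashes differ: the files cannot be identical */
    NEED_STRONG_HASH   /* fast hashes match: confirm with the slower hash */
} quick_verdict;

/* First tier: compare the fast hashes of the local and the remote file. */
quick_verdict quick_compare(uint32_t local_fast, uint32_t remote_fast)
{
    return (local_fast == remote_fast) ? NEED_STRONG_HASH : FILES_DIFFER;
}
```

Note the direction of the saving: a mismatch of the fast hashes is conclusive and skips MD5, while a match still calls for the slower hash. When most downloads arrive intact, the fast tier mainly shortens the failure case.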
