Finding a string in binary data

Posted on 2024-08-12 18:35:44

I have a binary file I've loaded using an NSData object. Is there a way to locate a sequence of characters, 'abcd' for example, within that binary data and return the offset without converting the entire file to a string? Seems like it should be a simple answer, but I'm not sure how to do it. Any ideas?

I'm doing this on iOS 3 so I don't have -rangeOfData:options:range: available.

I'm going to award this one to Sixteen Otto for suggesting strstr. I found the source code for the C function strstr and rewrote it to work on a fixed-length Byte array, which incidentally differs from a C string in that it is not null terminated. Here is the code I ended up with:

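- (Byte*)offsetOfBytes:(Byte*)bytes inBuffer:(const Byte*)buffer ofLength:(int)len
{
    // bytes is the pattern to find; it is assumed to end with a zero byte
    // (like "abcd"), so the pattern itself cannot contain zero bytes.
    // buffer is the data being searched; it is len bytes long and may
    // contain zero bytes anywhere.
    const Byte *cp = buffer;
    const Byte *s1, *s2;

    // an empty pattern matches at the start of the buffer
    if ( !*bytes )
        return (Byte*)buffer;

    for (int i = 0; i < len; ++i)
    {
        s1 = cp;
        s2 = bytes;
        int remaining = len - i;   // bytes left in buffer from cp onward

        // advance while the pattern still matches, never reading past the buffer
        while ( remaining > 0 && *s2 && *s1 == *s2 )
            s1++, s2++, remaining--;

        // reached the end of the pattern: cp is the start of a match
        if (!*s2)
            return (Byte*)cp;

        cp++;
    }

    return NULL;
}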

This returns a pointer to the first occurrence of bytes (the pattern I'm looking for) in buffer (the byte array being searched), or NULL if bytes does not occur.

I call it like this:

// data is the NSData object
const Byte *bytes = [data bytes];
Byte* index = [self offsetOfBytes:tag inBuffer:bytes ofLength:[data length]];
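
For comparison, here is a minimal plain C sketch that takes an explicit length for both the pattern and the buffer, so neither has to be null terminated, and that returns an offset rather than a pointer. The name find_bytes and its parameters are made up for this illustration; they are not part of any Apple API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the offset of the first occurrence of
   needle in haystack, or -1 if it does not occur. Both lengths are
   explicit, so zero bytes are allowed anywhere in either buffer. */
static ptrdiff_t find_bytes(const unsigned char *haystack, size_t hay_len,
                            const unsigned char *needle, size_t needle_len)
{
    assert(haystack != NULL && needle != NULL);

    /* An empty pattern matches at offset 0. */
    if (needle_len == 0)
        return 0;

    /* A pattern longer than the buffer can never match, and the
       unsigned subtraction in the loop bound would wrap around. */
    if (needle_len > hay_len)
        return -1;

    for (size_t i = 0; i <= hay_len - needle_len; i++) {
        /* Compare needle_len bytes starting at offset i. */
        if (memcmp(haystack + i, needle, needle_len) == 0)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

From Objective-C it could be called as find_bytes([data bytes], [data length], (const unsigned char *)"abcd", 4), assuming data is non-empty so that [data bytes] is not NULL.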

Comments (3)

空宴 2024-08-19 18:35:44

Convert your substring to an NSData object, and search for those bytes in the larger NSData using rangeOfData:options:range:. Make sure that the string encodings match!

On iPhone, where that isn't available, you may have to do this yourself. The C function strstr() will give you a pointer to the first occurrence of a pattern within the buffer (as long as neither contains null bytes!), but not the index. Here's a function that should do the job (but no promises, since I haven't tried actually running it...):

- (NSUInteger)indexOfData:(NSData*)needle inData:(NSData*)haystack
{
    // use byte pointers here: a void* can't be indexed
    const unsigned char* needleBytes = (const unsigned char*)[needle bytes];
    const unsigned char* haystackBytes = (const unsigned char*)[haystack bytes];
    NSUInteger needleLength = [needle length];
    NSUInteger haystackLength = [haystack length];

    // the lengths are unsigned, so rule this case out before the
    // subtraction in the loop bound can wrap around
    if (needleLength > haystackLength)
    {
        return NSNotFound;
    }

    // walk the length of the buffer, looking for a byte that matches the start
    // of the pattern; we can skip (|needle|-1) bytes at the end, since we can't
    // have a match that's shorter than needle itself
    for (NSUInteger i=0; i <= haystackLength-needleLength; i++)
    {
        // walk needle's bytes while they still match the bytes of haystack
        // starting at i; if we walk off the end of needle, we found a match
        NSUInteger j=0;
        while (j < needleLength && needleBytes[j] == haystackBytes[i+j])
        {
            j++;
        }
        if (j == needleLength)
        {
            return i;
        }
    }
    return NSNotFound;
}

This runs in something like O(nm), where n is the buffer length, and m is the size of the substring. It's written to work with NSData for two reasons: 1) that's what you seem to have in hand, and 2) those objects already encapsulate both the actual bytes, and the length of the buffer.
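
To illustrate the caveat about null bytes, here is a small C sketch showing strstr() missing a pattern that sits after an embedded zero byte. The names sample, search_with_strstr and matches_at are invented for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Six data bytes: 'x', NUL, 'a', 'b', 'c', 'd'
   (the string literal appends one more NUL at the end). */
static const char sample[] = "x\0abcd";

/* strstr() stops at the first NUL, so it only ever sees the
   one-character string "x" and never reaches the bytes after it. */
static const char *search_with_strstr(void)
{
    return strstr(sample, "abcd");
}

/* A length-based comparison at a given offset is not affected by
   embedded NUL bytes. Returns 1 if "abcd" starts at that offset. */
static int matches_at(size_t offset)
{
    /* Stay inside the six data bytes of sample. */
    assert(offset + 4 <= sizeof sample - 1);
    return memcmp(sample + offset, "abcd", 4) == 0;
}
```

Here search_with_strstr() comes back with NULL even though matches_at(2) reports a match, which is why a binary search has to carry the buffer length explicitly.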

当梦初醒 2024-08-19 18:35:44

If you're using Snow Leopard, a convenient way is the new -rangeOfData:options:range: method in NSData that returns the range of the first occurrence of a piece of data. Otherwise, you can access the NSData's contents yourself using its -bytes method to perform your own search.

苹果你个爱泡泡 2024-08-19 18:35:44

I had the same problem, and I solved it the other way round from the suggestions above.

First, I convert the data to a string (assuming your NSData is stored in a variable named rawFile):

NSString *ascii = [[NSString alloc] initWithData:rawFile encoding:NSASCIIStringEncoding];

Now you can easily search for strings like 'abcd', or whatever you want, by passing the ascii string to an NSScanner. This may not be very efficient, but it works until -rangeOfData:options:range: becomes available on iPhone as well.
