What is the fastest way to skip Unicode characters?
I am trying to get to certain characters in a file that is in UTF-16 format.
I know how many characters I want to skip. I am currently using the TextReader.ReadBlock command to read a temporary array of all of the characters I want to skip, but I believe that setting the position would be faster. I just do not know how to determine the new position.
Any idea what would be the fastest way to skip to a position in a Unicode file if you know how many characters you want to skip?
Comments (2)
It's not so easy to skip a block, since that requires relative positioning.
If you can calculate the beginning of the next block (as an offset from the start of the file), it is doable.
You may have to tweak your calculation because a UTF-16 file can have a BOM (2 leading bytes).
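The offset calculation described above might look roughly like the following C# sketch. The class and method names (`Utf16Skip`, `OpenAt`) are hypothetical, and the code assumes that a file without a BOM is little-endian and that the skip count is expressed in .NET `char` values (UTF-16 code units). It seeks the underlying `FileStream` first and only then creates the reader, so the reader has no buffered data from the old position.

```csharp
using System;
using System.IO;
using System.Text;

static class Utf16Skip
{
    // Hypothetical helper: opens a UTF-16 file and returns a reader positioned
    // after the first charsToSkip UTF-16 code units (.NET char values).
    public static StreamReader OpenAt(string path, long charsToSkip)
    {
        var stream = new FileStream(path, FileMode.Open, FileAccess.Read);

        // Detect a 2-byte BOM: FF FE = little-endian, FE FF = big-endian.
        var bom = new byte[2];
        int read = stream.Read(bom, 0, 2);
        bool littleBom = read == 2 && bom[0] == 0xFF && bom[1] == 0xFE;
        bool bigBom = read == 2 && bom[0] == 0xFE && bom[1] == 0xFF;
        long bomLength = (littleBom || bigBom) ? 2 : 0;

        // Assumption: a file without a BOM is treated as little-endian.
        Encoding encoding = new UnicodeEncoding(bigEndian: bigBom, byteOrderMark: false);

        // Each UTF-16 code unit is 2 bytes, so the offset is a plain multiplication
        // plus the BOM length.
        stream.Seek(bomLength + charsToSkip * 2, SeekOrigin.Begin);

        // Create the reader only after seeking; BOM detection is turned off
        // because the stream no longer starts at the beginning of the file.
        return new StreamReader(stream, encoding, detectEncodingFromByteOrderMarks: false);
    }
}
```

A call such as `using (var reader = Utf16Skip.OpenAt(path, 1000)) { ... }` would then start reading at the 1001st `char`. If a `StreamReader` already exists and has been read from, seeking its `BaseStream` alone is not enough; `DiscardBufferedData()` would also be needed.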
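Considering that this is UTF-16 and not UTF-8 (where character size can vary), you have 2 bytes per character. So to skip x characters you have to skip x*2 bytes. Strictly, this holds per UTF-16 code unit (one .NET char); characters outside the Basic Multilingual Plane are stored as surrogate pairs and take 4 bytes.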
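To illustrate the x*2 arithmetic, here is a small C# sketch with hypothetical helper names (`Utf16Offsets`, `ByteOffset`, `StartsInsideSurrogatePair`). It computes the byte offset for a given `char` index and, as an optional sanity check, reports whether that offset lands on the second half of a surrogate pair. It assumes the caller already knows the BOM length and byte order of the file.

```csharp
using System;
using System.IO;

static class Utf16Offsets
{
    // Byte offset of the char at index charIndex, given the BOM length (0 or 2).
    public static long ByteOffset(long charIndex, int bomLength)
    {
        return bomLength + charIndex * 2;
    }

    // Returns true when the code unit at byteOffset is a low surrogate, i.e. the
    // offset points into the middle of a 4-byte character. The stream position
    // is restored to byteOffset before returning.
    public static bool StartsInsideSurrogatePair(Stream stream, long byteOffset, bool bigEndian)
    {
        stream.Seek(byteOffset, SeekOrigin.Begin);
        int b0 = stream.ReadByte();
        int b1 = stream.ReadByte();
        stream.Seek(byteOffset, SeekOrigin.Begin);

        if (b0 < 0 || b1 < 0)
        {
            return false; // at or past the end of the stream
        }

        // Combine the two bytes into one UTF-16 code unit in the given byte order.
        char unit = (char)(bigEndian ? (b0 << 8) | b1 : (b1 << 8) | b0);
        return char.IsLowSurrogate(unit);
    }
}
```

If the skip count comes from `TextReader.ReadBlock` or `string.Length`, it is already measured in code units, so the check should never fire; it only matters when the count refers to user-visible characters.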