How to swap adjacent bytes in a string of hex bytes (with or without regular expressions)

Posted on 2024-08-18 11:33:01

Given the string (or a string of any length with an even number of word pairs):
"12345678"

How would I swap adjacent "words"?

The result I want is
"34127856"

Also, once that's done, I need to swap the longs.
The result I want is:
"78563412"

Comments (6)

似梦非梦 2024-08-25 11:33:01

A regex approach:

import re
twopairs = re.compile(r'(..)(..)')
stringwithswappedwords = twopairs.sub(r'\2\1', basestring)
twoquads = re.compile(r'(....)(....)')
stringwithswappedlongs = twoquads.sub(r'\2\1', stringwithswappedwords)
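
For reference, a self-contained version of the two substitutions might look like the sketch below. It assumes Python 3, and the names `swap_words` and `swap_longs` are illustrative, not part of the original answer.

```python
import re

# Two adjacent 2-character "words"; each such pair gets swapped.
TWO_PAIRS = re.compile(r'(..)(..)')
# Two adjacent 4-character "longs"; each such pair gets swapped.
TWO_QUADS = re.compile(r'(....)(....)')

def swap_words(hex_string):
    """Swap every adjacent pair of 2-character words."""
    return TWO_PAIRS.sub(r'\2\1', hex_string)

def swap_longs(hex_string):
    """Swap every adjacent pair of 4-character longs."""
    return TWO_QUADS.sub(r'\2\1', hex_string)

words_swapped = swap_words('12345678')
print(words_swapped)              # 34127856
print(swap_longs(words_swapped))  # 78563412
```

Trailing characters that do not complete a full group are left untouched by `re.sub`, so an input whose length is not a multiple of the group size is only partly swapped.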

Edit:
However, this is definitely not the fastest approach in Python -- here's how one finds out about such things: first, write all "competing" approaches into a module, here I'm calling it 'swa.py'...:

import re

twopairs = re.compile(r'(..)(..)')
twoquads = re.compile(r'(....)(....)')

def withre(basestring, twopairs=twopairs, twoquads=twoquads):
  stringwithswappedwords = twopairs.sub(r'\2\1', basestring)
  return twoquads.sub(r'\2\1', stringwithswappedwords)

def withoutre(basestring):
  asalist = list(basestring)
  asalist.reverse()
  for i in range(0, len(asalist), 2):
    asalist[i+1], asalist[i] = asalist[i], asalist[i+1]
  return ''.join(asalist)

s = '12345678'
print withre(s)
print withoutre(s)

Note that I set s and try out both approaches as a quick sanity check that they actually compute the same result -- good practice, in general, for this kind of head-to-head performance race!

Then, at the shell prompt, you use timeit, as follows:

$ python -mtimeit -s'import swa' 'swa.withre(swa.s)'
78563412
78563412
10000 loops, best of 3: 42.2 usec per loop
$ python -mtimeit -s'import swa' 'swa.withoutre(swa.s)'
78563412
78563412
100000 loops, best of 3: 9.84 usec per loop

...and you find that in this case the RE-less approach is about 4 times faster, a worthwhile optimization. Once you have such a "measurement harness" in place, it's also easy to experiment with further alternatives and tweaks, if this operation really needs "blazing speed", of course.
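
The same comparison can also be run from inside Python with the `timeit` module instead of the shell. The following is a minimal sketch assuming Python 3; the loop count and the output format are arbitrary choices, and the timings will differ from the figures quoted above.

```python
import re
import timeit

TWO_PAIRS = re.compile(r'(..)(..)')
TWO_QUADS = re.compile(r'(....)(....)')

def withre(s):
    # Swap adjacent 2-character words, then adjacent 4-character longs.
    return TWO_QUADS.sub(r'\2\1', TWO_PAIRS.sub(r'\2\1', s))

def withoutre(s):
    # Reverse all characters, then restore the order inside each pair.
    chars = list(s)
    chars.reverse()
    for i in range(0, len(chars), 2):
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return ''.join(chars)

s = '12345678'
# Sanity check before timing: both candidates must agree on the test input.
assert withre(s) == withoutre(s) == '78563412'

LOOPS = 100000
for func in (withre, withoutre):
    # timeit.timeit accepts a zero-argument callable and a loop count.
    seconds = timeit.timeit(lambda: func(s), number=LOOPS)
    print('%s: %.3f usec per loop' % (func.__name__, seconds / LOOPS * 1e6))
```

Note that the two functions agree only for 8-character input: on longer strings `withre` swaps within each 8-character group, while `withoutre` reverses the byte pairs of the whole string, so the sanity check should use inputs representative of the real data.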

Edit: for example, here's an even faster approach (add to the same swa.py, with a final line of print faster(s) of course;-):

def faster(basestring):
  asal = [basestring[i:i+2]
          for i in range(0, len(basestring), 2)]
  asal.reverse()
  return ''.join(asal)

This gives:

$ python -mtimeit -s'import swa' 'swa.faster(swa.s)'
78563412
78563412
78563412
100000 loops, best of 3: 5.58 usec per loop

About 5.6 microseconds, down from about 9.8 for the simplest RE-less approach, is another possibly-worthwhile micro-optimization.

And so on, of course -- there's an old folk (pseudo)theorem that says that any program can be made at least one byte shorter and at least one nanosecond faster...;-)

Edit: and to "prove" the pseudotheorem, here's a completely different approach (replace the end of swa.py)...:

import array
def witharray(basestring):
  a2 = array.array('H', basestring)
  a2.reverse()
  return a2.tostring()

s = '12345678'
# print withre(s)
# print withoutre(s)
print faster(s)
print witharray(s)

This gives:

$ python -mtimeit -s'import swa' 'swa.witharray(swa.s)'
78563412
78563412
100000 loops, best of 3: 3.01 usec per loop

for a further, possibly worthwhile speedup.
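
The array version above relies on Python 2 behavior: `array.array('H', ...)` accepted a `str` there, and `tostring()` was removed in Python 3.9. A rough Python 3 equivalent, assuming ASCII hex input and a 2-byte unsigned short (checked explicitly in the code), could look like this:

```python
import array

def witharray(hex_string):
    # 'H' is an unsigned short; assumed here to be 2 bytes wide, so each
    # array item holds exactly one 2-character pair.
    pairs = array.array('H')
    if pairs.itemsize != 2:
        raise RuntimeError('unsigned short is not 2 bytes on this platform')
    # frombytes raises ValueError if the length is not a multiple of 2.
    pairs.frombytes(hex_string.encode('ascii'))
    pairs.reverse()  # reverses the order of the pairs, not the characters in them
    return pairs.tobytes().decode('ascii')

print(witharray('12345678'))  # 78563412
```

The encode and decode steps add overhead that the Python 2 version did not have, so the timing above should not be assumed to carry over.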

梦中的蝴蝶 2024-08-25 11:33:01
import re
re.sub(r'(..)(..)', r'\2\1', '12345678')
re.sub(r'(....)(....)', r'\2\1', '34127856')
蓝海似她心 2024-08-25 11:33:01

Just for the string "12345678":

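from textwrap import wrap
s = "12345678"
# Split into two halves, then split each half into two words.
# (// keeps the width an integer on both Python 2 and Python 3.)
t = wrap(s, len(s) // 2)
a, b = wrap(t[0], len(t[0]) // 2)
c, d = wrap(t[1], len(t[1]) // 2)
# Swap the two words inside each half.
a, b = b, a
c, d = d, c
print(a + b + c + d)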

You can turn this into a generic function that handles variable-length strings.

Output:

$ ./python.py
34127856
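
One possible shape for such a generic function is sketched below. The name `swap_adjacent` and its `width` parameter are hypothetical, and the sketch uses plain slicing rather than `textwrap`.

```python
def swap_adjacent(s, width):
    """Swap each adjacent pair of width-character chunks in s."""
    if len(s) % (2 * width) != 0:
        raise ValueError('length must be a multiple of %d' % (2 * width))
    chunks = [s[i:i + width] for i in range(0, len(s), width)]
    # Exchange chunk 0 with 1, chunk 2 with 3, and so on.
    for i in range(0, len(chunks), 2):
        chunks[i], chunks[i + 1] = chunks[i + 1], chunks[i]
    return ''.join(chunks)

words_swapped = swap_adjacent('12345678', 2)
print(words_swapped)                    # 34127856
print(swap_adjacent(words_swapped, 4))  # 78563412
```

Calling it with a width of 2 swaps the words, and a second call with a width of 4 swaps the longs, which matches the two steps asked for in the question.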
呆° 2024-08-25 11:33:01
>>> import re
>>> re.sub("(..)(..)","\\2\\1","12345678")
'34127856'
>>> re.sub("(....)(....)","\\2\\1","34127856")
'78563412'
茶底世界 2024-08-25 11:33:01

If you want to do endianness conversion, use Python's struct module on the original binary data.

If that is not your goal, here's a simple example that rearranges one 8-character string:

def wordpairswapper(s):
    return s[6:8] + s[4:6] + s[2:4] + s[0:2]
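
For the endianness case, a minimal `struct` sketch might look like the following. It assumes Python 3 and a single 32-bit value; the hex string is first turned back into raw bytes, and the helper name `swap_uint32` is illustrative.

```python
import struct

def swap_uint32(hex_string):
    """Reverse the byte order of one 32-bit value given as 8 hex digits."""
    raw = bytes.fromhex(hex_string)        # the underlying 4 bytes
    (value,) = struct.unpack('<I', raw)    # read them as little-endian
    return struct.pack('>I', value).hex()  # write them back big-endian

print(swap_uint32('12345678'))  # 78563412
```

`struct.unpack` raises `struct.error` if the input does not decode to exactly 4 bytes, so longer data would need a different format string or a loop.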
画骨成沙 2024-08-25 11:33:01

I'm using the following approach:

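import socket

data = "deadbeef"
if len(data) == 4:    # 2 bytes, 4 characters
   value = socket.ntohs(int(data, 16))
elif len(data) == 8:  # 4 bytes, 8 characters; ntohl only accepts 32-bit values
   value = socket.ntohl(int(data, 16))
else:                 # any other length: parsed without byte swapping
   value = int(data, 16)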

Works for me!
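
One caveat: `socket.ntohs` and `socket.ntohl` convert from network to host byte order, so they swap bytes only on little-endian hosts, and they are limited to 16-bit and 32-bit values. A host-independent alternative, sketched here for Python 3 with a hypothetical helper name, is to let `int.from_bytes` do the reversal:

```python
def hex_to_swapped_int(data):
    """Interpret a hex string as an integer with its bytes reversed."""
    raw = bytes.fromhex(data)
    # 'little' treats the first byte as the least significant one, which is
    # the same as reversing the bytes and then parsing them as usual.
    return int.from_bytes(raw, 'little')

print(hex(hex_to_swapped_int('deadbeef')))  # 0xefbeadde
print(hex(hex_to_swapped_int('dead')))      # 0xadde
```

This handles any even number of hex digits, not just 4 or 8.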
