Converting weird characters from lsusb output to an stlink hla_serial in Ruby
I currently use stlink boards that have a 'weird' iSerial as reported by lsusb. Here is an example:
lsusb -v -d 0483:3748 | grep iSerial
iSerial 3 4ÿkVK607 C
iSerial 3 4ÿkVK60'7 C
I already have the corresponding hla_serial that is used in the openocd config files. It looks like "\x34\x3f\x6b\x06\x56\x4b\x36\x30\x27\x37\x20\x43" or, if we remove the hex prefixes, "343f6b06564b363027372043".
I want to be able to convert between those two formats, but it's not so easy.
I know that the following list of hla_serial values (ref), given to me by an external team, corresponds to the following weird strings from lsusb (dec); at least openocd works with those.
ref = ["343f6b06564b363027372043","343f6b06564b363011372043","343f6906564b363043372043","343f6c06564b363029372043","343f6b06564b363014362043","343f6d06564b363016372043","343f7106564b363016362043"]
dec = ["4ÿqVK606 C","4ÿmVK607 C","4ÿkVK606 C" ,"4ÿlVK60)7 C" ,"4ÿiVK60C7 C" ,"4ÿkVK607 C" ,"4ÿkVK60'7 C"]
I've tried lots of pack/unpack combinations, but I was not able to find the correct encoding/decoding. The closest I got was:
> "4ÿkVK60'7 C".unpack('H*')
=> ["34c3bf6b564b363027372043"]
So when trying to decode all the iSerial values:
> ref = "343f6b06564b363027372043"
=> "343f6b06564b363027372043"
> ["4ÿqVK606 C","4ÿmVK607 C","4ÿkVK606 C" ,"4ÿlVK60)7 C" ,"4ÿiVK60C7 C" ,"4ÿkVK607 C" ,"4ÿkVK60'7 C"].each {|t| $stdout << "ref:#{ref}\ndec:#{t.unpack('H*').first}\n\n"}
ref:343f6b06564b363027372043
dec:34c3bf71564b3630362043
ref:343f6b06564b363027372043
dec:34c3bf6d564b3630372043
ref:343f6b06564b363027372043
dec:34c3bf6b564b3630362043
ref:343f6b06564b363027372043
dec:34c3bf6c564b363029372043
ref:343f6b06564b363027372043
dec:34c3bf69564b363043372043
ref:343f6b06564b363027372043
dec:34c3bf6b564b3630372043
ref:343f6b06564b363027372043
dec:34c3bf6b564b363027372043
The last one in this list may be the closest, but it does not match the ref. Note also that some of the decoded values do not have the correct length.
I also tried to capture the output of the shell command directly in Ruby and post-process it:
> require 'open3'
=> true
> o,e,s = Open3.capture3('lsusb -v -d 0483:3748 | grep iSerial')
> dec = o.split("\n").map {|l| l.sub(/.*iSerial.* 3 /,'').unpack('H*').first}
34c3bf7106564b363016362043
34c3bf6d06564b363016372043
34c3bf6b06564b363014362043
34c3bf6c06564b363029372043
34c3bf6906564b363043372043
34c3bf6b06564b363011372043
34c3bf6b06564b363027372043
This is better: at least they all have the same length (26 characters instead of 24 for the ref).
What is strange is that all of them are really close to the strings in ref, and if we remove two characters like this:
dec: 34c3bf6b06564b363027372043
ref: 34_3_f6b06564b363027372043
Then I get exactly the same list as in the ref table.
Now I just want to know the reason for those 2 extra characters. Should I decode differently?
Comments (1)
What you need is "USB ECN: UNICODE UTF-16LE for String Descriptors" from 2005.
It turns out that the original spec just said "unicode string" (chuckle).
The ECN states that USB string descriptors should officially contain UTF-16LE strings, because that's what most programmers had been doing.
So if you do any decoding at all, it should be UTF-16LE, and it should be applied to the data you get from the USB subsystem.
You must also realize that what you see in your console isn't UTF-16; it's whatever your console has decided to render. The bytes you get on stdout might be UTF-16, but once you print them and copy-paste them, god only knows what you get. Your text editor is probably going to paste UTF-8, and I don't even want to think about what happens there.
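To make the rendering point concrete, here is a minimal sketch using one of the hex dumps from the question. It assumes that lsusb wrote the serial to stdout as UTF-8. That is consistent with the `c3bf` pair in the dump (the UTF-8 encoding of U+00FF, 'ÿ'), but it is an inference from the data, not something checked against the lsusb source.

```python
# Hex dump from the question: the bytes seen on lsusb's stdout.
lsusb_hex = "34c3bf6b06564b363027372043"
raw = bytes.fromhex(lsusb_hex)

# Assumption: lsusb printed the serial as UTF-8 text.
text = raw.decode("utf-8")
code_points = [ord(ch) for ch in text]

# 13 bytes on stdout but only 12 characters: U+00FF takes two bytes in UTF-8.
print(len(raw), len(text))
print("\u00ff".encode("utf-8").hex())
print(" ".join(f"{cp:02x}" for cp in code_points))
```

Under that assumption, a 26-character hex dump carries only 12 characters, because one of them takes two bytes once encoded as UTF-8.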
The question is about Ruby, but I'll be using python3 since it has the convenient bytes type and better unicode string handling.
It looks like the devs at ST decided to use the unicode code points between 0 and 255 to encode the bytes of their serial number. And what I mean by 'decided' is that they just did it that way and we're stuck with it now ;)
So the hex-encoded serial number you see, 343f6b06564b363027372043, is actually b"\x34\x00\x3F\x00\x6B\x00\x06\x00\x56\x00\x4B\x00\x36\x00\x30\x00\x27\x00\x37\x00\x20\x00\x43\x00" as stored in the USB string descriptor.
You can see that it's a valid UTF-16 little-endian encoded string.
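As a quick check of that claim, the sketch below decodes the byte string quoted above as UTF-16LE and hex-encodes the resulting code points. The variable names are illustrative, and the payload is copied from the text, not read from a device. A real string descriptor also starts with a two-byte header (bLength, bDescriptorType), which is omitted here.

```python
# Payload as quoted above (the two descriptor header bytes are omitted).
payload = (
    b"\x34\x00\x3F\x00\x6B\x00\x06\x00\x56\x00\x4B\x00"
    b"\x36\x00\x30\x00\x27\x00\x37\x00\x20\x00\x43\x00"
)

# Each pair of bytes is one little-endian 16-bit code unit.
text = payload.decode("utf-16-le")

# Every code point here is below 256, so two hex digits per character suffice.
serial_hex = "".join(f"{ord(ch):02x}" for ch in text)
print(serial_hex)
```

The printed value is the 24-character hex string used as hla_serial, one byte per code point.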
In python3, when you ask pyusb for the serial number you'll get an actual 'unicode string', i.e. the sequence of code points that the UTF-16 USB string represents. All you really need are the code points though, so:
Here's an example using python3
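A minimal sketch of that conversion follows; treat it as an illustration under stated assumptions, not a definitive implementation. `to_hla_serial` is a hypothetical helper name. To keep the sketch runnable without hardware, the serial strings are rebuilt from the lsusb hex dumps in the question (assuming UTF-8 output) and are not read through pyusb. One more assumption: the reference list in the question has 0x3f ('?') where lsusb shows U+00FF, so the helper replaces any code point above 0x7F with '?'. That fits all seven pairs shown in the question, but how the reference values were actually produced is not confirmed there.

```python
def to_hla_serial(serial: str) -> str:
    """Hex-encode the code points of a serial string (hypothetical helper).

    Assumption: code points above 0x7F are replaced with '?' (0x3f), which
    matches the reference values listed in the question.
    """
    out = []
    for ch in serial:
        cp = ord(ch)
        if cp > 0x7F:
            cp = ord("?")
        out.append(f"{cp:02x}")
    return "".join(out)


# Stand-in for a device query: the lsusb hex dumps from the question,
# decoded under the assumption that lsusb emitted UTF-8.
lsusb_dumps = [
    "34c3bf7106564b363016362043",
    "34c3bf6d06564b363016372043",
    "34c3bf6b06564b363014362043",
    "34c3bf6c06564b363029372043",
    "34c3bf6906564b363043372043",
    "34c3bf6b06564b363011372043",
    "34c3bf6b06564b363027372043",
]
serials = [bytes.fromhex(dump).decode("utf-8") for dump in lsusb_dumps]
converted = [to_hla_serial(s) for s in serials]

for value in converted:
    print(value)
```

With the sample data this yields the seven values from the ref list in the question, in the order of the lsusb output.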
Or you could use pyswd, which does that for us.