Code golf - hex to (raw) binary conversion
In response to this question asking about hex to (raw) binary conversion, a comment suggested that it could be solved in "5-10 lines of C, or any other language."
I'm sure that for (some) scripting languages that could be achieved, and would like to see how. Can we prove that comment true, for C, too?
NB: this doesn't mean hex to ASCII binary - specifically the output should be a raw octet stream corresponding to the input ASCII hex. Also, the input parser should skip/ignore white space.
edit (by Brian Campbell) May I propose the following rules, for consistency? Feel free to edit or delete these if you don't think these are helpful, but I think that since there has been some discussion of how certain cases should work, some clarification would be helpful.
- The program must read from stdin and write to stdout (we could also allow reading from and writing to files passed in on the command line, but I can't imagine that would be shorter in any language than stdin and stdout)
- The program must use only packages included with your base, standard language distribution. In the case of C/C++, this means their respective standard libraries, and not POSIX.
- The program must compile or run without any special options passed to the compiler or interpreter (so, 'gcc myprog.c' or 'python myprog.py' or 'ruby myprog.rb' are OK, while 'ruby -rscanf myprog.rb' is not allowed; requiring/importing modules counts against your character count).
- The program should read integer bytes represented by pairs of adjacent hexadecimal digits (upper, lower, or mixed case), optionally separated by whitespace, and write the corresponding bytes to output. Each pair of hexadecimal digits is written with most significant nibble first.
- The behavior of the program on invalid input (characters besides [0-9a-fA-F \t\r\n], spaces separating the two characters in an individual byte, an odd number of hex digits in the input) is undefined; any behavior (other than actively damaging the user's computer or something) on bad input is acceptable (throwing an error, stopping output, ignoring bad characters, treating a single character as the value of one byte, are all OK).
- The program may write no additional bytes to output.
- Code is scored by fewest total bytes in the source file. (Or, if we wanted to be more true to the original challenge, the score would be based on lowest number of lines of code; I would impose an 80 character limit per line in that case, since otherwise you'd get a bunch of ties for 1 line).
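As a point of reference for the rules above, here is a minimal, ungolfed sketch in Python 3. It is not taken from any answer in this thread; the name `hex_to_bytes` and the choice to raise `ValueError` on bad input are assumptions made for illustration (the rules leave bad-input behavior undefined).

```python
import string

HEX_DIGITS = set(string.hexdigits)  # '0'-'9', 'a'-'f', 'A'-'F'


def hex_to_bytes(text):
    """Convert ASCII hex text to raw bytes, skipping whitespace."""
    # str.split() with no arguments drops every run of whitespace.
    # Note: this also joins digits of one byte split by whitespace,
    # which the rules treat as undefined behavior anyway.
    digits = "".join(text.split())
    if len(digits) % 2 or not set(digits) <= HEX_DIGITS:
        # Raising is just one of the acceptable choices for bad input.
        raise ValueError("invalid hex input")
    # int(pair, 16) treats the first digit as the most significant nibble.
    return bytes(int(digits[i:i + 2], 16) for i in range(0, len(digits), 2))
```

A stdin-to-stdout wrapper would be `sys.stdout.buffer.write(hex_to_bytes(sys.stdin.read()))`, writing through the binary buffer so that no newline translation or extra bytes are added.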
Edit: Checkers has reduced my C solution to 46 bytes, which was then reduced to 44 bytes thanks to a tip from BillyONeal plus a bugfix on my part (no more infinite loop on bad input; now it just terminates the loop). Please give credit to Checkers for reducing this from 77 to 46 bytes:
And I have a much better Ruby solution than my last, in 38 bytes (down from 42; thanks to Joshua Swank for the regexp suggestion):
Original solutions
C, in 77 bytes, or two lines of code (would be 1 if you could put the #include on the same line). Note that this has an infinite loop on bad input; the 44 byte solution with the help of Checkers and BillyONeal fixes the bug, and simply stops on bad input.
It's even just 6 lines if you format it normally:
Ruby, 79 bytes (I'm sure this can be improved):
These both take input from STDIN and write to STDOUT
39 char perl oneliner
Edit: wasn't really accepting uppercase, fixed.
45 byte executable (base64 encoded):
(paste into a file with a .com extension)
EDIT: OK, here's the code. Open a Windows console, create a 45-byte file called 'hex.com', type "debug hex.com", then 'a' and Enter. Copy and paste these lines:
Press Enter, 'w' and then Enter again, 'q' and Enter. You can now run 'hex.com'.
EDIT2: Made it two bytes smaller!
That was tricky. I can't believe I spent time doing that.
Brian's 77-byte C solution can be improved to 44 bytes, thanks to leniency of C with regard to function prototypes.
In Python:
ONE LINE! (Yes, this is cheating.)
EDIT: This code was written a long time before the question edit which fleshed out the requirements.
Given that a single line of C can contain a huge number of statements, it's almost certainly true without being useful.
In C# I'd almost certainly write it in more than 10 lines, even though it would be feasible in 10. I'd separate out the "parse nybble" part from the "convert a string to a byte array" part.
Of course, if you don't care about spotting incorrect lengths etc, it becomes a bit easier. Your original text also contained spaces - should those be skipped, validated, etc? Are they part of the required input format?
I rather suspect that the comment was made without consideration as to what a pleasant, readable solution would look like.
Having said that, here's a hideous version in C#. For bonus points, it uses LINQ completely inappropriately in an effort to save a line or two of code. The lines could be longer, of course...
(This is avoiding "cheating" by using any built-in hex parsing code, such as Convert.ToByte(string, 16). Aside from anything else, that would mean losing the use of the word nybble, which is always a bonus.)
In Perl, of course, one (fairly short) line:
Haskell:
Gah.
You aren't allowed to call me on my off-the-cuff estimates! ;-P
Here's a 9 line C version with no odd formatting (Well, I'll grant you that the tohex array would be better split into 16 lines so you can see which character codes map to which values...), and only 2 shortcuts that I wouldn't deploy in anything other than a one-off script:
No combined lines (each statement is given its own line), it's perfectly readable, etc. An obfuscated version could undoubtedly be shorter, one could cheat and put the close braces on the same line as the preceding statement, etc, etc, etc.
The two things I don't like about it are that I don't have a close(fd) in there, and that main shouldn't be void and should return an int. Arguably they're not needed: the OS will release every resource the program used, the file will close without any problems, and the compiler will take care of the program exit value. Given that it's a one-time use script, it's acceptable, but don't deploy this.
It becomes eleven lines with both, so it's not a huge increase anyway, and a ten line version would include one or the other depending on which one might feel is the lesser of two evils.
It doesn't do any error checking, and it doesn't allow whitespace - assuming, again, that it's a one time program then it's faster to do search/replace and get rid of spaces and other whitespace before running the script, however it shouldn't need more than another few lines to eat whitespace as well.
There are, of course, ways to make it shorter but they would likely decrease readability significantly...
Hmph. Just read the comment about line length, so here's a newer version with an uglier hextonum macro, rather than the array:
It isn't horribly unreadable. I know many people have issues with the ternary operator, but the appropriate naming of the macro and some analysis should readily yield how it works to the average C programmer. Due to side effects in the macro I had to move to a for loop so I didn't have to have another line for i+=2 (hextonum(i++) will increment i by 5 each time it's called; macro side effects are not for the faint of heart!).
"Also, the input parser should skip/ignore white space."
grumble, grumble, grumble.
I had to add a few lines to take care of this requirement, now up to 14 lines for a reasonably formatted version. It will ignore everything that's not a hexadecimal character:
I didn't bother with the 80 character line length because the input isn't even less than 80 characters, but a 3 level ternary macro could replace the first 256 entry array. If one didn't mind a bit of "alternative formatting" then the following 10 line version isn't completely unreadable:
And, again, further obfuscation and bit twiddling could result in an even shorter example.
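The answer's C listings are not reproduced here, but the idea it describes (a 256-entry lookup table, with everything that is not a hex digit ignored) can be sketched in Python 3 roughly as follows. The names `TOHEX` and `decode_ignoring_junk` are hypothetical, and dropping a trailing unpaired digit is an assumption, not necessarily what the original code did.

```python
# 256-entry table: character code -> nibble value, or -1 if not a hex digit.
TOHEX = [-1] * 256
for value, ch in enumerate("0123456789abcdef"):
    TOHEX[ord(ch)] = value
    TOHEX[ord(ch.upper())] = value  # digits map to themselves under upper()


def decode_ignoring_junk(data):
    """Decode the hex digits in `data` (a bytes object), ignoring all other bytes."""
    out = bytearray()
    high = -1  # pending high nibble, or -1 when none is waiting
    for byte in data:
        nibble = TOHEX[byte]
        if nibble < 0:
            continue  # not a hex digit: skip it
        if high < 0:
            high = nibble  # first digit of a pair
        else:
            out.append(high << 4 | nibble)  # second digit completes the byte
            high = -1
    return bytes(out)  # assumption: a trailing unpaired digit is dropped
```

The table trades memory for a branch-free digit lookup, which is the same trade-off the array-versus-ternary-macro discussion above is about.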
It's a language called "Hex!". Its only use is to read hex data from stdin and output it to stdout.
Hex! is parsed by a simple Python script.
import sys
Fairly readable C solution (9 "real" lines):
To support 16-bit little endian goodness, replace main with:
A 31-character Perl solution:
s/\W//g,print(pack'H*',$_)for<>
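For readers who don't speak Perl: the one-liner strips every non-word character from each input line and then lets pack 'H*' turn the remaining hex string into bytes. A rough Python 3 analogue, offered as a sketch rather than an exact equivalent (the function name `unhex_line` is made up for this example), might look like this:

```python
import binascii
import re


def unhex_line(line):
    r"""Rough analogue of Perl's s/\W//g followed by pack 'H*'."""
    # Drop everything that is not a word character (letters, digits, underscore).
    cleaned = re.sub(r"\W", "", line)
    # unhexlify converts pairs of hex digits to bytes, high nibble first.
    return binascii.unhexlify(cleaned)
```

One known difference: binascii.unhexlify raises an error on an odd number of digits or on non-hex word characters, whereas Perl's pack is more forgiving. Since the rules leave bad input undefined, either behavior is acceptable.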
I can't code this off the top of my head, but for every two characters, output (byte)((AsciiValueChar1-(AsciiValueChar1>64?55:48))*16+(AsciiValueChar2-(AsciiValueChar2>64?55:48))) to get a hex string changed into raw binary. This would break horribly if your input string has anything other than 0 to 9 or A to F (uppercase only), so I can't say how useful it would be to you.
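The per-character arithmetic can be made concrete with a short Python sketch. It assumes uppercase input only, with digits '0' to '9' (codes 48 to 57) offset by 48 and letters 'A' to 'F' (codes 65 to 70) offset by 55; the helper names `nibble` and `pair_to_byte` are invented for illustration.

```python
def nibble(ch):
    """Value of one uppercase hex digit, using plain ASCII arithmetic."""
    code = ord(ch)
    # '0'..'9' are 48..57, so subtract 48; 'A'..'F' are 65..70, so subtract 55.
    return code - (55 if code > 64 else 48)


def pair_to_byte(char1, char2):
    """Combine two hex digits into one byte value, high nibble first."""
    return nibble(char1) * 16 + nibble(char2)
```

Lowercase letters would need a further offset (or an upper-casing step first), and any other character silently produces a wrong value, which is the fragility the answer warns about.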
I know Jon posted a (cleaner) LINQ solution already. But for once I am able to use a LINQ statement which modifies a string during its execution and abuses LINQ's deferred evaluation without getting yelled at by my co-workers. :p
1 statement formatted for readability.
Update
Support for spaces and an odd number of hex digits (89A is equal to 08 9A)
Still one statement. Could be made much shorter by running replace(" ", "") on the hex string at the start, but this would be a second statement.
Two interesting points with this one. The first is how to track the character count without the help of outside variables other than the source string itself. The second: while solving this I encountered the fact that char y.CompareTo(x) just returns "y - x" while int y.CompareTo(x) returns -1, 0 or 1. So char y.CompareTo(x).CompareTo(0) equals a char comparison which returns -1, 0 or 1.
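The odd-length rule mentioned in the update (89A is treated as 08 9A) amounts to left-padding the digit string with a single zero. A minimal Python 3 sketch of just that rule, using ordinary string handling rather than the answer's single LINQ statement (the function name `hex_to_bytes_padded` is an assumption), could be:

```python
def hex_to_bytes_padded(text):
    """Decode hex text, left-padding with '0' when the digit count is odd."""
    digits = text.replace(" ", "")  # spaces are allowed anywhere
    if len(digits) % 2:
        digits = "0" + digits  # "89A" becomes "089A"
    return bytes.fromhex(digits)
```

Padding on the left keeps the numeric value of the whole string unchanged, which is why 89A and 08 9A decode to the same two bytes.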
PHP, 28 symbols:
Late to the game, but here's a Python{2,3} one-liner (100 chars, needs import sys, re):