Unicode output in Clojure unit tests
While unit testing some code that translates ASCII sequences into Unicode characters, I ran into a problem with the output of Clojure tests.
I have checked that my terminal can output Unicode characters (by cat-ing the test files) and that works fine, so the problem seems to be related to Leiningen, Clojure or clojure.test somehow.
Here's an example test (using the Greek section of Unicode; I will also be using Greek Extended, but I assume the same problems will apply):
(deftest bc-string-w-comma
  (is (= "αβγ, ΑΒΓ" (parse "abg,*a*b*g"))))
It is meant to fail due to the missing space in the input. The output from lein test is the following:
Testing parse_perseus.test.betacode
FAIL in (bc-string-w-comma) (betacode.clj:15)
expected: (= "???, ???" (parse "abg,*a*b*g"))
actual: (not (= "???, ???" "???,???"))
Testing parse_perseus.test.core
Testing parse_perseus.test.pluralise
Ran 10 tests containing 59 assertions.
1 failures, 0 errors.
What am I doing wrong here? Is this a terminal emulation problem or something Clojure-related? I have the same problem running code in the REPL with Slime/swank/Emacs. The REPL in Emacs only outputs question marks for Unicode output (although Emacs is quite capable of understanding Unicode).
I have tried running this in Terminal and iTerm (OS X) with the same results.
Comments (2)
It turns out that you can pass options to java to force the output encoding of *out* so that Unicode works, like this:
As I'm using Leiningen, I added this property to my project.clj file:
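The exact option and project.clj entry are not preserved in this copy of the comment. As a hedged sketch only: one JVM setting commonly used for this purpose is the file.encoding system property, which Leiningen can pass to the JVM through :jvm-opts. The project name and version below are placeholders, and the choice of -Dfile.encoding=UTF-8 is an assumption rather than the commenter's confirmed fix.

```clojure
;; project.clj (illustrative sketch, not the commenter's original file)
;; "parse-perseus" and the version string are placeholder values.
(defproject parse-perseus "0.1.0-SNAPSHOT"
  ;; Assumption: ask the JVM to use UTF-8 as its default encoding,
  ;; which is what *out* falls back to when no charset is given.
  :jvm-opts ["-Dfile.encoding=UTF-8"])
```

Outside Leiningen, the same property would go on the java command line as -Dfile.encoding=UTF-8.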
Clojure itself seems in the clear (this is Ubuntu 10.10, gnome-terminal, OpenJDK):
But it does break with emacs/swank/clojure-maven-plugin/maven at the REPL in Emacs:
If I use maven, the simple pom file below, and mvn clojure:repl then it's ok:
but if I add the jline library using this snippet:
then I get:
Which looks awfully like your error. So it may be that the problem is in JLine, or in some other piece associated with JLine that Leiningen and Maven have in common.
Or of course, there may be two independent unicode-related failures.
Here is my maven pom.xml file in case anyone is trying to debug this.
I appreciate this is not an answer, but I thought it might be helpful.
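One way to narrow down which layer is replacing the characters is to compare the JVM's default encoding with output written through an explicitly UTF-8 writer. The following is a minimal sketch, not taken from either comment; println-utf8 is a hypothetical helper name introduced here for illustration.

```clojure
(import '(java.io OutputStream OutputStreamWriter PrintWriter)
        '(java.nio.charset Charset))

;; Hypothetical helper: print s plus a newline to os, always encoding as UTF-8
;; regardless of the JVM's default charset.
(defn println-utf8
  [^OutputStream os s]
  (binding [*out* (PrintWriter. (OutputStreamWriter. os "UTF-8") true)]
    (println s)))

;; What the JVM believes the default encoding is.
(println "file.encoding:  " (System/getProperty "file.encoding"))
(println "default charset:" (str (Charset/defaultCharset)))

;; Same text, once through the default *out* and once forced to UTF-8.
(println "αβγ, ΑΒΓ")
(println-utf8 System/out "αβγ, ΑΒΓ")
```

If only the second line shows Greek letters, the default encoding of the JVM is the likely culprit; if both show question marks, a later layer (such as the REPL wrapper or the terminal) is more likely involved. In some REPL setups, output written directly to System/out may appear somewhere other than the REPL buffer.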