Follow-up to "Simple String template replacement in Scala and Clojure"
In my previous post, I showed a simple (naive) algorithm for doing a String template replacement.
One of the solutions, provided by mikera, seems like a much better algorithm. I implemented it in Clojure (follows) and timed it against my previous algorithm. The new one is slower (41.475 msecs vs. 19.128 msecs) on 100 runs. I must be doing something stupid in my new implementation.
(defn replace-templates
  "Return a String with each occurrence of a substring of the form {key}
  replaced with the corresponding value from a map parameter.
  @param str the String in which to do the replacements
  @param m a map of keyword->value"
  [text m]
  (let [builder (StringBuilder.)
        text-length (.length text)]
    (loop [current-index 0]
      (if (>= current-index text-length)
        (.toString builder)
        (let [open-brace (.indexOf text "{" current-index)]
          (if (< open-brace 0)
            (.toString (.append builder (.substring text current-index)))
            (let [close-brace (.indexOf text "}" open-brace)]
              (if (< close-brace 0)
                (.toString (.append builder (.substring text current-index)))
                (do
                  (.append builder (.substring text current-index open-brace))
                  (.append builder (let [key (keyword (.substring text (inc open-brace) close-brace))
                                         replacement (m key)]
                                     (if (nil? replacement) "" replacement)))
                  (recur (inc close-brace)))))))))))
although it passes all test cases:
(use 'clojure.test)
(deftest test-replace-templates
  (is (= (replace-templates "this is a test" {:foo "FOO"})
         "this is a test"))
  (is (= (replace-templates "this is a {foo} test" {:foo "FOO"})
         "this is a FOO test"))
  (is (= (replace-templates "this is a {foo} test {bar}" {:foo "FOO" :bar "BAR"})
         "this is a FOO test BAR"))
  (is (= (replace-templates "this is a {foo} test {bar} 42" {:foo "FOO" :bar "BAR"})
         "this is a FOO test BAR 42"))
  (is (= (replace-templates "this is a {foo} test {bar" {:foo "FOO" :bar "BAR"})
         "this is a FOO test {bar")))
; user=> Ran 1 tests containing 5 assertions.
; user=> 0 failures, 0 errors.
; user=> {:type :summary, :test 1, :pass 5, :fail 0, :error 0}
Here is the test code:
(time (dotimes [n 100] (replace-templates
"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque
elit nisi, egestas et tincidunt eget, {foo} mattis non erat. Aenean ut
elit in odio vehicula facilisis. Vestibulum quis elit vel nulla
interdum facilisis ut eu sapien. Nullam cursus fermentum
sollicitudin. Donec non congue augue. {bar} Vestibulum et magna quis
arcu ultricies consectetur auctor vitae urna. Fusce hendrerit
facilisis volutpat. Ut lectus augue, mattis {baz} venenatis {foo}
lobortis sed, varius eu massa. Ut sit amet nunc quis velit hendrerit
bibendum in eget nibh. Cras blandit nibh in odio suscipit eget aliquet
tortor placerat. In tempor ullamcorper mi. Quisque egestas, metus eu
venenatis pulvinar, sem urna blandit mi, in lobortis augue sem ut
dolor. Sed in {bar} neque sapien, vitae lacinia arcu. Phasellus mollis
blandit commodo." {:foo "HELLO" :bar "GOODBYE" :baz "FORTY-TWO"})))
; user=> "Elapsed time: 41.475 msecs"
; user=> nil
I wonder if the problem is the continuous reallocation of the StringBuilder.
Comments (3)
I think you are being hit by reflection. *warn-on-reflection* is your friend. Here are some tests with criterium.

Here is replace-templates-very-new, a version I did myself for golf. :) It passes all tests, so it should work.

UPDATE: Support non-key brace-enclosed values ("this is a {not-a-key-{foo}-in-the-map} test" => "this is a {not-a-key-FOO-in-the-map} test"), allowing it to be used in a Java code generator where non-key brace-enclosed things are significant :-).
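To make the reflection point concrete, here is a minimal sketch of a type-hinted variant of the function from the question. It is not the answerer's code: the name `replace-templates-hinted` is hypothetical, and the only intended changes are a `^String` hint on `text`, a `str` call around the replacement value (so the compiler knows the argument type of `.append`), and an initial `StringBuilder` capacity. With `*warn-on-reflection*` enabled, the compiler prints a warning for every interop call it cannot resolve at compile time.

```clojure
;; Print a warning for each interop call that falls back to reflection.
(set! *warn-on-reflection* true)

(defn replace-templates-hinted
  "Hypothetical type-hinted variant of replace-templates from the question.
  Replaces each {key} in text with the value of :key in m, or with an empty
  string when the key is missing."
  ^String [^String text m]
  (let [text-length (.length text)
        ;; Assumption: starting at the input length avoids most early resizes.
        builder     (StringBuilder. text-length)]
    (loop [current-index 0]
      (if (>= current-index text-length)
        (.toString builder)
        (let [open-brace (.indexOf text "{" current-index)]
          (if (neg? open-brace)
            (.toString (.append builder (.substring text current-index)))
            (let [close-brace (.indexOf text "}" open-brace)]
              (if (neg? close-brace)
                (.toString (.append builder (.substring text current-index)))
                (let [k (keyword (.substring text (inc open-brace) close-brace))]
                  (.append builder (.substring text current-index open-brace))
                  ;; str returns a String, so this .append needs no reflection.
                  (.append builder (str (get m k "")))
                  (recur (inc close-brace)))))))))))
```

Loading the original definition with the same flag set should show which `.length`, `.indexOf`, `.substring` and `.append` calls were being resolved reflectively.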
I've written some Clojure code ( https://gist.github.com/3729307 ) that makes it possible to interpolate any map value into a template, in probably the fastest possible way (see below) IF the template is known at compile time.
It doesn't use the same template syntax (although it could be adapted for that), but I think it still can be used to solve the exact same problem.
With this solution, the code would have to be rewritten like...
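The code from the gist is not reproduced here. As a rough illustration of the general idea only, the sketch below parses a literal template at macro-expansion time and expands into a plain `str` call, so no scanning happens at run time. The names `parse-template`, `deftemplate` and `greeting` are hypothetical, and the sketch keeps the `{key}` syntax from the question rather than the gist's own syntax.

```clojure
(defn parse-template
  "Split a template into literal strings and keywords, e.g.
  \"a {foo} b\" becomes (\"a \" :foo \" b\"). A brace that does not open a
  complete {key} placeholder is kept as literal text."
  [template]
  (for [[whole k] (re-seq #"\{([^{}]+)\}|[^{]+|\{" template)]
    (if k (keyword k) whole)))

(defmacro deftemplate
  "Define fn-name as a one-argument function that renders template (which
  must be a string literal) against a map. Parsing happens once, when the
  macro is expanded."
  [fn-name template]
  (let [m (gensym "m")]
    `(defn ~fn-name [~m]
       (str ~@(for [part (parse-template template)]
                (if (keyword? part)
                  `(get ~m ~part "")
                  part))))))

;; Expands to roughly:
;; (defn greeting [m] (str "Hello " (get m :name "") ", welcome to " (get m :place "") "!"))
(deftemplate greeting "Hello {name}, welcome to {place}!")

(greeting {:name "Ann" :place "Oz"})
```

The trade-off is the one stated above: this only works when the template text is available to the compiler, not when it is read from a file at run time.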
To be honest, your solution looks more like Java in Clojure clothing. Clojure already has the quite flexible clojure.string/replace function, which is able to do what you need. Also, your docstring does not match the Clojure conventions. I would suggest something like this:

As one can imagine, replace is already quite optimized, so there is no real reason to roll your own implementation. It uses StringBuffer internally, while you're using StringBuilder, so your implementation might save a few microseconds, which is nothing worth talking about.

If you really care about every microsecond, you should probably look into the macro approach. If that is not possible, for example because you're loading the template from a file, then I/O will be a bigger concern anyway. In that case I would also suggest looking into the Selmer template system, which has a slightly different syntax (double instead of single curly braces for replacements) but is also much more flexible in what it can do.
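The answerer's suggested code did not survive on this page. One way the `clojure.string/replace` approach could look is sketched below; the function name `replace-templates-re` and the exact regex are assumptions, not the answerer's original. When the replacement argument is a function, `replace` calls it with the match (a vector of the whole match and its groups when the pattern has groups) and inserts the returned string literally.

```clojure
(require '[clojure.string :as str])

(defn replace-templates-re
  "Replace each {key} in text with the value of :key in m, or with an empty
  string when the key is missing. Unclosed braces are left untouched."
  [text m]
  (str/replace text
               #"\{([^{}]+)\}"
               ;; The match arrives as [whole-match key-name].
               (fn [[_ k]] (str (get m (keyword k) "")))))

(replace-templates-re "this is a {foo} test {bar" {:foo "FOO" :bar "BAR"})
```

Because the pattern excludes braces inside a key, nested input such as "{not-a-key-{foo}-in-the-map}" only has its inner `{foo}` replaced, which differs from the loop-based version in the question.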