Follow-up to "Simple String template replacement in Scala and Clojure"

Posted on 2024-11-09 02:57:43 · 3436 characters · 0 views · 0 comments

In my previous post, I showed a simple (naive) algorithm for doing a String template replacement.

One of the solutions, provided by mikera, seems like a much better algorithm. I implemented it in Clojure (see below) and timed it against my previous algorithm. The new one is slower over 100 runs (41.475 msecs vs. 19.128 msecs). I must be doing something stupid in my new implementation.

(defn replace-templates
  "Return a String with each occurrence of a substring of the form {key}
   replaced with the corresponding value from a map parameter.
   @param text the String in which to do the replacements
   @param m a map of keyword->value"
  [text m]
  (let [builder (StringBuilder.)
        text-length (.length text)]
    (loop [current-index 0]
      (if (>= current-index text-length)
        (.toString builder)
        (let [open-brace (.indexOf text "{" current-index)]
          (if (< open-brace 0)
            (.toString (.append builder (.substring text current-index)))
            (let [close-brace (.indexOf text "}" open-brace)]
              (if (< close-brace 0)
                (.toString (.append builder (.substring text current-index)))
                (do
                  (.append builder (.substring text current-index open-brace))
                  (.append builder (let [key (keyword (.substring text (inc open-brace) close-brace))
                                         replacement (m key)]
                                     (if (nil? replacement) "" replacement)))
                  (recur (inc close-brace)))))))))))

It is slower even though it passes all test cases:

(use 'clojure.test)

(deftest test-replace-templates
  (is (= (replace-templates "this is a test" {:foo "FOO"})
        "this is a test"))
  (is (= (replace-templates "this is a {foo} test" {:foo "FOO"})
        "this is a FOO test"))
  (is (= (replace-templates "this is a {foo} test {bar}" {:foo "FOO" :bar "BAR"})
        "this is a FOO test BAR"))
  (is (= (replace-templates "this is a {foo} test {bar} 42" {:foo "FOO" :bar "BAR"})
        "this is a FOO test BAR 42"))
  (is (= (replace-templates "this is a {foo} test {bar" {:foo "FOO" :bar "BAR"})
        "this is a FOO test {bar")))

; user=> Ran 1 tests containing 5 assertions.
; user=> 0 failures, 0 errors.
; user=> {:type :summary, :test 1, :pass 5, :fail 0, :error 0}

Here is the timing code:

(time (dotimes [n 100] (replace-templates
  "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque
elit nisi, egestas et tincidunt eget, {foo} mattis non erat. Aenean ut
elit in odio vehicula facilisis. Vestibulum quis elit vel nulla
interdum facilisis ut eu sapien. Nullam cursus fermentum
sollicitudin. Donec non congue augue. {bar} Vestibulum et magna quis
arcu ultricies consectetur auctor vitae urna. Fusce hendrerit
facilisis volutpat. Ut lectus augue, mattis {baz} venenatis {foo}
lobortis sed, varius eu massa. Ut sit amet nunc quis velit hendrerit
bibendum in eget nibh. Cras blandit nibh in odio suscipit eget aliquet
tortor placerat. In tempor ullamcorper mi. Quisque egestas, metus eu
venenatis pulvinar, sem urna blandit mi, in lobortis augue sem ut
dolor. Sed in {bar} neque sapien, vitae lacinia arcu. Phasellus mollis
blandit commodo." {:foo "HELLO" :bar "GOODBYE" :baz "FORTY-TWO"})))

; user=> "Elapsed time: 41.475 msecs"
; user=> nil

I wonder if the problem is the continuous reallocation of the StringBuilder.
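
One way to test the reallocation hypothesis is to give the StringBuilder an initial capacity so that it rarely has to grow. The sketch below is only illustrative: the helper name presized-builder and the slack of 64 characters are assumptions, not part of the original code.

```clojure
(defn presized-builder
  "Returns a StringBuilder whose initial capacity is the length of
  `text` plus `slack` extra characters for the replacement values."
  ^StringBuilder [^String text slack]
  ;; StringBuilder has a constructor taking an int capacity; the int
  ;; cast lets the compiler select it without reflection.
  (StringBuilder. (int (+ (.length text) slack))))

;; Hypothetical usage inside replace-templates:
;;   (let [builder (presized-builder text 64)] ...)
(.capacity (presized-builder "this is a {foo} test" 64))
```

If the timings barely change with a presized builder, reallocation is probably not the bottleneck.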

Comments (3)

贱贱哒 2024-11-16 02:57:43

I think you are being hit by reflection. *warn-on-reflection* is your friend. Here are some benchmarks run with criterium.

replace-templates-original:         56.4us
replace-templates-original-hinted:   9.4us
replace-templates-new:             131.4us
replace-templates-new-hinted:        6.3us
replace-templates-very-new:          7.3us
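
The hinted variants are not shown in the answer. As a guess at what replace-templates-new-hinted might look like, here is the asker's function with a ^String hint on text, so that .length, .indexOf and .substring resolve at compile time. The name replace-templates-hinted and the use of str around the map lookup (so that .append resolves to its String overload) are assumptions for this sketch.

```clojure
;; At the REPL, (set! *warn-on-reflection* true) makes the compiler
;; print a warning for every call it cannot resolve statically.

(defn replace-templates-hinted
  "Same algorithm as replace-templates, with type hints added."
  [^String text m]
  (let [builder (StringBuilder.)
        text-length (.length text)]
    (loop [current-index 0]
      (if (>= current-index text-length)
        (.toString builder)
        (let [open-brace (.indexOf text "{" current-index)]
          (if (neg? open-brace)
            ;; No more placeholders: copy the rest of the text.
            (.toString (.append builder (.substring text current-index)))
            (let [close-brace (.indexOf text "}" open-brace)]
              (if (neg? close-brace)
                ;; Unterminated placeholder: copy the rest verbatim.
                (.toString (.append builder (.substring text current-index)))
                (let [k (keyword (.substring text (inc open-brace) close-brace))
                      ;; str turns a missing (nil) value into "".
                      replacement (str (get m k))]
                  (.append builder (.substring text current-index open-brace))
                  (.append builder replacement)
                  (recur (inc close-brace)))))))))))

(replace-templates-hinted "this is a {foo} test {bar" {:foo "FOO" :bar "BAR"})
```

Without the hint, every method call on text goes through reflection on each loop iteration, which would be consistent with the gap between the hinted and unhinted timings above.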

Here is replace-templates-very-new, a version I wrote myself for golf. :)

(defn replace-templates-very-new
  [^String text m]
  (let [builder (StringBuilder.)]
    (loop [text text]
      (cond
        (zero? (count text))
        (.toString builder)

        (.startsWith text "{")
        (let [brace (.indexOf text "}")]
          (if (neg? brace)
            (.toString (.append builder text))
            (do
              (.append builder (get m (keyword (subs text 1 brace))))
              (recur (subs text (inc brace))))))

        :else
        (let [brace (.indexOf text "{")]
          (if (neg? brace)
            (.toString (.append builder text))
            (do
              (.append builder (subs text 0 brace))
              (recur (subs text brace)))))))))

It passes all tests, so it should work.

UPDATE: This version supports brace-enclosed text that is not a key ("this is a {not-a-key-{foo}-in-the-map} test" => "this is a {not-a-key-FOO-in-the-map} test"), which allows it to be used in a Java code generator, where brace-enclosed text that is not a key is significant :-).

(defn replace-templates-even-newer
  "Return a String with each occurrence of a substring of the form {key}
   replaced with the corresponding value from a map parameter.
   @param text the String in which to do the replacements
   @param m a map of keyword->value
   @thanks kotarak http://stackoverflow.com/questions/6112534/
     follow-up-to-simple-string-template-replacement-in-scala-and-clojure"
  [^String text m]
  (let [builder (StringBuilder.)]
    (loop [text text]
      (cond
        (zero? (count text))
        (.toString builder)

        (.startsWith text "{")
        (let [brace (.indexOf text "}")]
          (if (neg? brace)
            (.toString (.append builder text))
            (if-let [[_ replacement] (find m (keyword (subs text 1 brace)))]
              (do
                (.append builder replacement)
                (recur (subs text (inc brace))))
              (do
                (.append builder "{")
                (recur (subs text 1))))))

        :else
        (let [brace (.indexOf text "{")]
          (if (neg? brace)
            (.toString (.append builder text))
            (do
              (.append builder (subs text 0 brace))
              (recur (subs text brace)))))))))
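
The updated version switches from get to find. A small sketch of why that matters: get returns nil both for a missing key and for a key whose value is nil, while find returns the whole map entry, so if-let only falls through when the key is truly absent. The helper lookup-or-keep is a made-up name used only for this illustration.

```clojure
(defn lookup-or-keep
  "Returns the replacement for keyword `k` as a string, or the
  original {k} placeholder text when `k` is not a key of `m`."
  [m k]
  (if-let [[_ replacement] (find m k)]
    (str replacement)            ; key present, even if its value is nil
    (str "{" (name k) "}")))     ; key absent: keep the braces

(lookup-or-keep {:foo "FOO"} :foo)    ; key present
(lookup-or-keep {:foo "FOO"} :bar)    ; key absent
(lookup-or-keep {:empty nil} :empty)  ; key present with a nil value
```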
他不在意 2024-11-16 02:57:43

I've written some Clojure code ( https://gist.github.com/3729307 ) that interpolates any map value into a template, probably in the fastest possible way (see below), provided the template is known at compile time.

It doesn't use the same template syntax (although it could be adapted for that), but I think it still can be used to solve the exact same problem.

With this solution, the code would have to be rewritten like this:

; renderer-fn is defined in https://gist.github.com/3729307
(time (dotimes [n 100] ((renderer-fn
"Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque
elit nisi, egestas et tincidunt eget, " (:foo %) " mattis non erat. Aenean ut
elit in odio vehicula facilisis. Vestibulum quis elit vel nulla
interdum facilisis ut eu sapien. Nullam cursus fermentum
sollicitudin. Donec non congue augue. " (:bar %) " Vestibulum et magna quis
arcu ultricies consectetur auctor vitae urna. Fusce hendrerit
facilisis volutpat. Ut lectus augue, mattis " (:baz %) " venenatis " (:foo %) "
lobortis sed, varius eu massa. Ut sit amet nunc quis velit hendrerit
bibendum in eget nibh. Cras blandit nibh in odio suscipit eget aliquet
tortor placerat. In tempor ullamcorper mi. Quisque egestas, metus eu
venenatis pulvinar, sem urna blandit mi, in lobortis augue sem ut
dolor. Sed in " (:bar %) " neque sapien, vitae lacinia arcu. Phasellus mollis
blandit commodo.") {:foo "HELLO" :bar "GOODBYE" :baz "FORTY-TWO"})))

; => "Elapsed time: 1.371 msecs"
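
The gist itself is not reproduced here. As a rough illustration of the compile-time idea only (not the gist's actual implementation), a macro can expand the template parts into a function of one argument named %, so that no parsing of the template happens at run time. The macro name simple-renderer-fn is an assumption made for this sketch.

```clojure
(defmacro simple-renderer-fn
  "Expands to a function of one argument, bound to the symbol %,
  that concatenates `parts` with str. String literals are embedded
  as-is; other forms such as (:foo %) are evaluated on each call."
  [& parts]
  ;; ~'% keeps the symbol unqualified so the forms passed in by the
  ;; caller can refer to it.
  `(fn [~'%]
     (str ~@parts)))

(def render
  (simple-renderer-fn "this is a " (:foo %) " test " (:bar %)))

(render {:foo "FOO" :bar "BAR"})
```

The trade-off is the one the answer states: the template has to be available as code when the macro is expanded, so this does not help with templates loaded at run time.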
旧话新听 2024-11-16 02:57:43

To be honest, your solution looks more like Java in Clojure clothing. Clojure already has the quite flexible clojure.string/replace function, which can do what you need. Also, your docstring does not follow Clojure conventions. I would suggest something like this:

(defn replace-templates
  "Returns a string with each occurrence of the form `{key}` in a
  `template` replaced with the corresponding value from a map
  `m`. Keys of `m` are expected to be keywords."
  [template m]
  (clojure.string/replace template #"\{([^{]+?)\}"
    (fn [[orig key]] (or (get m (keyword key)) orig))))

As one can imagine, replace is already quite optimized, so there is no real reason to roll your own implementation. It uses a StringBuffer internally, while you are using a StringBuilder, so your implementation might save a few microseconds, which is nothing worth talking about.
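
To make the behaviour of the regex approach concrete, here is a small self-contained sketch. When the pattern has a capture group, the function passed to clojure.string/replace receives a vector of the whole match and the group. The name replace-templates-re is an assumption, chosen to avoid clashing with the other versions in this thread.

```clojure
(require '[clojure.string :as string])

(defn replace-templates-re
  "Replaces each {key} in `template` with the value of the keyword
  `key` in `m`. Placeholders without a matching key are left as-is."
  [template m]
  (string/replace template #"\{([^{]+?)\}"
                  ;; orig is the whole match, e.g. "{foo}";
                  ;; k is the captured name, e.g. "foo".
                  (fn [[orig k]] (or (get m (keyword k)) orig))))

(replace-templates-re "this is a {foo} test {nope}" {:foo "FOO"})
(replace-templates-re "this is a {not-a-key-{foo}-in-the-map} test" {:foo "FOO"})
```

Because the character class excludes the opening brace, a match can never start at an outer brace that contains another opening brace, which is why the nested case from the earlier update is handled as well.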

If you really care about every microsecond, you should probably look into the macro approach. If that is not possible, for example because you are loading the template from a file, then I/O will be a bigger concern anyway. In that case I would also suggest looking into the Selmer template system, which has a slightly different syntax (double instead of single curly braces for replacements) but is also much more flexible in what it can do.
