在 Clojure 中表示图表
我试图通过移植玩具 NFA 正则表达式匹配器。显然我的主要问题是表示和操作图表。我找到了一个可行的解决方案,但我的实现(基本上使用 gensym 来模拟指针)让我感到恶心。
愿意提出改进建议,无论是对于表示图形还是对于一般可读性和惯用语? (最初的命令式解决方案比我当前的解决方案更具可读性)。
干杯!
; This is a port of Matt Might's toy regexp library described in
; http://matt.might.net/articles/implementation-of-nfas-and-regular-expressions-in-java/
; char_transitions is a map that associates a character with a set
; of target state names
; empty_transitions is a set of target state names
(defrecord NfaState [final char_transitions empty_transitions])
; 'entry' and 'exit' are state names
; 'states' is a map that associates state names with states
(defrecord Nfa [entry exit states])
(defn- add-empty-edge [nfa curr_state_name next_state_name]
(let [curr_state (curr_state_name (:states nfa))]
(Nfa. (:entry nfa) (:exit nfa) (assoc (:states nfa) curr_state_name (NfaState. (:final curr_state) (:char_transitions curr_state) (conj (:empty_transitions curr_state) next_state_name))))))
(defn- nfa-matches?
([nfa nfa_state_name string] (nfa-matches? nfa nfa_state_name string #{}))
([nfa nfa_state_name string visited]
(let [nfa_state (nfa_state_name (:states nfa))
new_visited (conj visited nfa_state_name)]
(cond
(contains? visited nfa_state_name) false
(empty? string) (or (:final nfa_state)
(some #(nfa-matches? nfa % string new_visited)
(:empty_transitions nfa_state)))
(not (empty? string)) (or
(some #(nfa-matches? nfa % (.substring string 1)) (get (:char_transitions nfa_state) (.charAt string 0) #{}))
(some #(nfa-matches? nfa % string new_visited) (:empty_transitions nfa_state)))
:else false))))
(defn matches? [nfa string]
"Matches a string against an NFA"
(nfa-matches? nfa (:entry nfa) string))
(defn c [ch]
"Creates an NFA which matches the character 'c'"
(let [in (gensym)
out (gensym)]
(Nfa. in out {in (NfaState. false {ch #{out}} #{}) out (NfaState. true {} #{})})))
(defn e []
"Creates an NFA which matches the empty string"
(let [in (gensym)
out (gensym)]
(Nfa. in out {in (NfaState. false {} #{out}) out (NfaState. true {} #{})})))
(defn rep [nfa]
"Creates an NFA which matches zero or more repetitions of the given NFA"
(add-empty-edge (add-empty-edge nfa (:entry nfa) (:exit nfa)) (:exit nfa) (:entry nfa)))
(defn- update-final [nfastate new_final]
(NfaState. new_final (:char_transitions nfastate) (:empty_transitions nfastate)))
(defn s [first second]
"Creates an NFA which matches a sequence of two provided NFAs"
(add-empty-edge (Nfa. (:entry first) (:exit second) (merge (update-in (:states first) [(:exit first)] update-final false) (update-in (:states second) [(:exit second)] update-final true))) (:exit first) (:entry second)))
(defn ou [first second]
"Creates an NFA which matches either provided NFA"
(let [in (gensym)
out (gensym)]
(add-empty-edge (add-empty-edge (Nfa. in out (assoc (merge (update-in (:states first) [(:exit first)] update-final false) (update-in (:states second) [(:exit second)] update-final false)) in (NfaState. false {} #{(:entry first) (:entry second)}) out (NfaState. true {} #{}))) (:exit first) out) (:exit second) out)))
; sugar
(defn st [string]
"Creates an NFA which matches the provided string"
(if (empty? string)
(e)
(s (c (.charAt string 0)) (st (.substring string 1)))))
I am trying to learn a bit of clojure by porting a toy NFA regexp matcher. Obviously my main issue is representing and manipulating graphs. I hit a working solution, but my implementation (using gensym
to emulate pointers, basically) leaves me with a yucky taste in my mouth.
Care to suggest improvements, both for representing graphs and for general readability and idioms? (the original imperative solution is much more readable than my current one is).
Cheers!
; This is a port of Matt Might's toy regexp library described in
; http://matt.might.net/articles/implementation-of-nfas-and-regular-expressions-in-java/
; char_transitions is a map that associates a character with a set
; of target state names
; empty_transitions is a set of target state names
(defrecord NfaState [final char_transitions empty_transitions])
; 'entry' and 'exit' are state names
; 'states' is a map that associates state names with states
(defrecord Nfa [entry exit states])
(defn- add-empty-edge [nfa curr_state_name next_state_name]
(let [curr_state (curr_state_name (:states nfa))]
(Nfa. (:entry nfa) (:exit nfa) (assoc (:states nfa) curr_state_name (NfaState. (:final curr_state) (:char_transitions curr_state) (conj (:empty_transitions curr_state) next_state_name))))))
(defn- nfa-matches?
([nfa nfa_state_name string] (nfa-matches? nfa nfa_state_name string #{}))
([nfa nfa_state_name string visited]
(let [nfa_state (nfa_state_name (:states nfa))
new_visited (conj visited nfa_state_name)]
(cond
(contains? visited nfa_state_name) false
(empty? string) (or (:final nfa_state)
(some #(nfa-matches? nfa % string new_visited)
(:empty_transitions nfa_state)))
(not (empty? string)) (or
(some #(nfa-matches? nfa % (.substring string 1)) (get (:char_transitions nfa_state) (.charAt string 0) #{}))
(some #(nfa-matches? nfa % string new_visited) (:empty_transitions nfa_state)))
:else false))))
(defn matches? [nfa string]
"Matches a string against an NFA"
(nfa-matches? nfa (:entry nfa) string))
(defn c [ch]
"Creates an NFA which matches the character 'c'"
(let [in (gensym)
out (gensym)]
(Nfa. in out {in (NfaState. false {ch #{out}} #{}) out (NfaState. true {} #{})})))
(defn e []
"Creates an NFA which matches the empty string"
(let [in (gensym)
out (gensym)]
(Nfa. in out {in (NfaState. false {} #{out}) out (NfaState. true {} #{})})))
(defn rep [nfa]
"Creates an NFA which matches zero or more repetitions of the given NFA"
(add-empty-edge (add-empty-edge nfa (:entry nfa) (:exit nfa)) (:exit nfa) (:entry nfa)))
(defn- update-final [nfastate new_final]
(NfaState. new_final (:char_transitions nfastate) (:empty_transitions nfastate)))
(defn s [first second]
"Creates an NFA which matches a sequence of two provided NFAs"
(add-empty-edge (Nfa. (:entry first) (:exit second) (merge (update-in (:states first) [(:exit first)] update-final false) (update-in (:states second) [(:exit second)] update-final true))) (:exit first) (:entry second)))
(defn ou [first second]
"Creates an NFA which matches either provided NFA"
(let [in (gensym)
out (gensym)]
(add-empty-edge (add-empty-edge (Nfa. in out (assoc (merge (update-in (:states first) [(:exit first)] update-final false) (update-in (:states second) [(:exit second)] update-final false)) in (NfaState. false {} #{(:entry first) (:entry second)}) out (NfaState. true {} #{}))) (:exit first) out) (:exit second) out)))
; sugar
(defn st [string]
"Creates an NFA which matches the provided string"
(if (empty? string)
(e)
(s (c (.charAt string 0)) (st (.substring string 1)))))
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
地图和集合是一种优秀的数据结构,因为它们在开发过程中是可见的,并且可以使用 zipper 库 轻松操作,该库您可能会发现对于此类问题很有用。
Maps and sets are an excellent data such structures because they are visible durring development and can be easily manipulated using the zipper library which you may find useful for this type of problem.