Ternary comparison operators in Stata?
In my Stata do-files, I often have to compare dates which may be missing. Unfortunately, the missing value (.) is represented internally as larger than any nonmissing number, so the following holds:
5 < .
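To make the behaviour concrete, here is a minimal sketch that can be run as a do-file. It only uses `display` with literal values; the results in the comments follow from Stata treating missing as larger than any nonmissing number and as "true" (nonzero) in logical expressions.

```stata
* Missing sorts above every nonmissing number, so this prints 1
display (5 < .)
* ... and this prints 0
display (. < 5)
* Missing is nonzero, so it counts as true: prints 1
display (. & 1)
* Anything AND false is false: prints 0
display (. & 0)
```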
This can become quite annoying, e.g., when checking whether a date is within a certain range:
* start out missing everywhere
gen between_start_stop = . if d == .
* true only when all three dates are known and d is strictly inside
replace between_start_stop = 1 if ///
    !missing(d) & !missing(start) & !missing(stop) & ///
    start < d & d < stop
* false as soon as one known bound is violated
replace between_start_stop = 0 if ///
    (!missing(d) & !missing(start) & !(start < d)) | ///
    (!missing(d) & !missing(stop) & !(d < stop))
instead of the following:
gen between_start_stop = (start < d) & (d < stop)
Is there a way to use comparison operators that work with ternary logic?
I.e., I would like the following statements to be true:
(5 < .) == .
(. < .) == .
(. < 5) == .
(. & 1) == .
(. & 0) == 0
etc...
Comments (2)
A couple of suggestions:
- Use inrange() (also look at inlist()) to specify ranges instead of a series of < and > statements;
- You can specify multiple items in missing() or !missing() statements, like !missing(start, stop, d).

And it really sounds like you want to use cond(), which (using an example from the help file) can be used to specify multiple conditions in one function:
gen var = cond(missing(x), ., cond(x>2,50,70))
returns . if x is missing, returns 50 if x > 2, and returns 70 if x <= 2.
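Putting those pieces together for the date check in the question might look like the sketch below. The data are made up for illustration, and the variable names are taken from the question. Two things to be aware of: inrange() is inclusive (start <= d <= stop), unlike the strict < in the question, and this version returns missing whenever any of the three values is missing.

```stata
* Made-up example data; variable names follow the question
clear
input start d stop
1  5 10
6  5 10
1  5  .
.  . 10
end

* Missing if any of the three values is missing,
* otherwise 1/0 from the inclusive range check
gen byte between_start_stop = ///
    cond(missing(d, start, stop), ., inrange(d, start, stop))

list
```

The missing() guard is there because inrange() on its own never returns missing: it gives 0 or 1 even when an argument is missing.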
The analogy does not get you what you want. This formulation returns 'missing' when a known d is below a known start (even if stop is missing, which is irrelevant here) or when a known d is above a known stop (even if start is missing, which is equally irrelevant). The correct value in both cases is 'false'. I have a utility ('validly') which allows 'generate' to access three-valued logic and does what you want; see the discussion on my webpage http://www.nuffield.ox.ac.uk/People/sites/KIM/SitePages/Biography.aspx which has a link to a paper expanding further (but be warned: that paper has just been rejected by the Stata Journal as being "far too difficult to understand").
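For readers who want to see the three-valued rule spelled out with built-in functions only, here is a rough sketch. It is not the 'validly' utility and does not use its syntax; the helper variables lo_ok and hi_ok and the example data are hypothetical. Each comparison is missing when either operand is missing, and the AND is false as soon as one side is known to be false.

```stata
* Made-up example data; variable names follow the question
clear
input start d stop
1  5 10
6  5  .
. 12 10
1  5  .
.  . 10
end

* Three-valued comparisons: missing if either operand is missing
gen byte lo_ok = cond(missing(start, d), ., start < d)
gen byte hi_ok = cond(missing(d, stop),  ., d < stop)

* Three-valued AND: 0 if either side is known false,
* otherwise missing if either side is missing, otherwise 1
gen byte between_start_stop = ///
    cond((lo_ok == 0) | (hi_ok == 0), 0, ///
    cond(missing(lo_ok, hi_ok), ., 1))

list
```

With this data, the second and third rows come out as 0 (a known bound is violated even though the other bound is missing), while the last two rows stay missing.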