MatchIt question: how to access the distances between matched units when matching on Mahalanobis distance
Is it possible to get the distances between matched units using the `MatchIt::matchit()` function?
Here is a reproducible example. I can see the distances when I use `distance = "glm"` but not with `distance = "mahalanobis"`.
If you have a recommendation for a different package I am also happy to try that. I am only looking to match to another unit and not, for example, to calculate an ATT. Thank you!
library(MatchIt)

# df_example is the asker's data frame (not shown in the post);
# it needs the columns treat, age, and male.

# Run nearest neighbor with "mahalanobis" distance
res_matchitmahalanobis <- matchit(
  data = df_example,
  formula = treat ~ age + male,
  method = "nearest",
  distance = "mahalanobis",
  exact = ~ male,
  replace = TRUE
)
# Note: No `distance` column
get_matches(res_matchitmahalanobis)
# Note: `distance` element is missing
res_matchitmahalanobis$distance

# Run nearest neighbor with "glm" distance
res_glm <- matchit(
  data = df_example,
  formula = treat ~ age + male,
  method = "nearest",
  distance = "glm",
  exact = ~ male,
  replace = TRUE
)
# Note: There is now a `distance` column
get_matches(res_glm)
# Note: `distance` element is now present
res_glm$distance
Comments (2)
It looks like MatchIt doesn't give you the distances if you use Mahalanobis. It does calculate the results using that metric, though.
If you'd like to use Mahalanobis, you can use it along with another metric (like "glm"). Alternatively, you can collect the distances separately.
I ran the `matchit` function with both the glm and Mahalanobis distances. Then I collected the Mahalanobis distances separately. (Really, I wanted to see if the distances were Mahalanobis or glm... but as expected, they were glm.)
To collect the Mahalanobis distances (even with factors and no extra work) you can use the package `assertr` and the function `maha_dist`. The base R function, `mahalanobis()`, requires you to manually convert factors to values.
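As a rough illustration of collecting a distance separately with base R, the sketch below computes the Mahalanobis distance between two specific units. The toy data, the helper name `pair_maha`, and the use of the plain sample covariance of all units are assumptions made for illustration; the covariance matrix `matchit()` uses internally may differ, so the numbers need not agree with its matching.

```r
# Hypothetical toy covariates; `male` is already coded 0/1. A factor would
# first have to be converted to numeric columns, e.g. with model.matrix().
covs <- data.frame(
  age  = c(25, 32, 47, 51, 38, 29),
  male = c(1, 0, 1, 1, 0, 0)
)
X <- as.matrix(covs)

# Assumption: plain sample covariance across all units
S <- cov(X)

# Mahalanobis distance between unit i and unit j (hypothetical helper)
pair_maha <- function(i, j, X, S) {
  # stats::mahalanobis() returns the squared distance, hence sqrt()
  sqrt(mahalanobis(X[i, ], center = X[j, ], cov = S))
}

pair_maha(1, 2, X, S)
```

Passing one unit as `x` and the other as `center` is what turns the usual "distance from the mean" into a distance between a pair of units.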
As @Kat pointed out, `matchit()` does not return this value. It would be inappropriate to have this in the `distance` column; see here for why. The `distance` output in the `matchit` object is a misnomer; it refers to the propensity score, and each unit has one `distance` value. This is why it shows up with `distance = "glm"`; you are estimating a propensity score, which is then used to compute the distance between units. No methods in `matchit()` will actually return the distance between two paired units.
It would take a fair bit of work to extract this information. `matchit()` does not provide the Mahalanobis distance matrix used in the matching (because this would be way too big for big datasets!). However, you can compute a distance matrix outside `matchit()`, supply it to the `distance` argument, and then access the distance between units by extracting those distances from the matrix after doing the pairing. You can compute the Mahalanobis distance using, e.g., `optmatch::match_on()`, though it is not guaranteed to be identical to the Mahalanobis distance `matchit()` uses internally. Here is how you would do this:
Created on 2022-02-24 by the reprex package (v2.0.1)
This works with `replace = FALSE` as well but would take a bit more work with k:1 matching or full matching. Although you are not matching using `matchit()`'s Mahalanobis distance, the distances produced in the output above do correspond to the distances used to pair.
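A minimal sketch of the approach described in this answer, not the answerer's own code: compute a treated-by-control distance matrix outside `matchit()`, pass it as `distance`, then look up each pair's distance in that matrix. Two assumptions are made here. First, the `lalonde` data shipped with MatchIt and the covariates `age` and `educ` stand in for the asker's data. Second, the matrix is built with `MatchIt::mahalanobis_dist()` (present in recent MatchIt versions) instead of the `optmatch::match_on()` call the answer mentions.

```r
library(MatchIt)

data("lalonde", package = "MatchIt")

# Distance matrix computed outside matchit(): one row per treated unit,
# one column per control unit, named by the data's row names.
# Assumption: mahalanobis_dist() is swapped in for optmatch::match_on().
D <- mahalanobis_dist(treat ~ age + educ, data = lalonde)

# Nearest-neighbor matching with replacement on the supplied distances
m.out <- matchit(treat ~ age + educ, data = lalonde,
                 method = "nearest", distance = D, replace = TRUE)

# match.matrix: row names are treated units, entries are matched controls
mm <- m.out$match.matrix
pairs <- data.frame(treated = rownames(mm), control = mm[, 1])

# Look up each pair's distance by treated (row) and control (column) name
pairs$maha_distance <- D[cbind(pairs$treated, pairs$control)]

head(pairs)
```

With 1:1 matching, `match.matrix` has a single column, so one lookup per treated unit is enough. With k:1 matching, the same lookup would have to be repeated for each column.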