使用 SQL 查找重症监护婴儿的数量
我有一个BIGTABLE:
| name |surname| date_of_birth |hospital| op_id | op_date | medical_branch |
|++++++|+++++++|+++++++++++++++|++++++++|+++++++|+++++++++|++++++++++++++++|
| | | | | | | |
这是一张表,里面有大约500家医院一年的医疗手术记录。大约有 3 亿行。
我的问题是:
对于每个 X 医院,有多少婴儿出生,其中有多少婴儿在出生后立即(同一天)在同一家医院接受重症监护?
例如在A医院,100名婴儿出生,其中20名婴儿于同一天被送往同一家医院的重症监护室。
P619920是出生时的操作id。出生时记录的是母亲的id信息,而不是婴儿的id信息。
I552015是重症监护的手术id。记录了宝宝的id信息。
首先,我通过以下方式隔离医院 A:
create table hospital_A as ( select * from BIGTABLE where hospital = 'A' )
hospital_A 的模式如下:
| name |surname| date_of_birth|hospital| op_id | op_date | medical_branch |
|+++++++|+++++++|++++++++++++++|++++++++|+++++++|+++++++++++|+++++++++++++++++ |
| Mary | White | 01.02.1981 | A |P619920| 09.12.2011| Gynecology |
| John | White | 09.12.2011 | A |I552015| 09.12.2011|Infantile diseases|
我尝试匹配:
a)婴儿和母亲的姓氏
b)手术日期。
select distinct
t.name
,t.surname
,t.op_date
,t.op_id
,t.medical_branch
,v.name
,v.surname
--,v.op_id
,v.medical_branch
,v.date_of_birth
,v.hospital
from hospital_A t, hospital_A v
where t.op_date=v.date_of_birth
and t.surname=v.surname
and t.op_id = 'P619920'
and v.op_id = 'I552015'
这给了我 20 行。
但是,当我对 BIGTABLE 应用相同的查询时,A 医院有 81 行。
据我了解,这种差异源于在其他医院出生但在同一天被送往 A 医院接受重症监护的婴儿。
在 BIGTABLE 中,模式如下:
| name |surname|date_of_birth|hospital| op_id | op_date | medical_branch |
|+++++++|+++++++|+++++++++++++|++++++++|+++++++++|++++++++++|++++++++++++++++++|
| Mary | White | 01.02.1981 | A | P619920 |09.10.2011| Gynecology |
| John | White | 09.10.2011 | A | I552015 |09.10.2011|Infantile diseases|
| Ellen | Young | 04.09.1978 | B | P619920 |07.12.2011| Gynecology |
| Robert| Young | 07.12.2011 | A | I552015 |07.12.2011|Infantile diseases|
因此,BIGTABLE 的结果与孤立的 Hospital_A 表不同,因为它包含 Hospital B。
我对 BIGTABLE 的查询应该提取发生在同一家医院。
结果表:
| hospital | total births | intensive_care_babies |
|----------|--------------|-----------------------|
| A | 100 | 20 |
| B | ... | .. |
I have a BIGTABLE:
| name |surname| date_of_birth |hospital| op_id | op_date | medical_branch |
|++++++|+++++++|+++++++++++++++|++++++++|+++++++|+++++++++|++++++++++++++++|
| | | | | | | |
It is a table with records of medical operations in about 500 hospitals for a year. There are about 300 million rows.
My question is:
For every Hospital X, how many babies were born and how many of them got intensive care immediately (same day) after birth on the same hospital?
For example in hospital A, 100 babies were born and 20 of them were taken to intensive care on the very same day in the same hospital.
P619920 is the operation id for birth. Id info of the mother is recorded for birth, not the baby.
I552015 is the operation id for intensive care. Id info of the baby is recorded.
First, I isolated hospital A by:
create table hospital_A as ( select * from BIGTABLE where hospital = 'A' )
Pattern of hospital_A is like:
| name |surname| date_of_birth|hospital| op_id | op_date | medical_branch |
|+++++++|+++++++|++++++++++++++|++++++++|+++++++|+++++++++++|+++++++++++++++++ |
| Mary | White | 01.02.1981 | A |P619920| 09.12.2011| Gynecology |
| John | White | 09.12.2011 | A |I552015| 09.12.2011|Infantile diseases|
I tried to match :
a) surnames of the baby and mother
b) the operation dates.
select distinct
t.name
,t.surname
,t.op_date
,t.op_id
,t.medical_branch
,v.name
,v.surname
--,v.op_id
,v.medical_branch
,v.date_of_birth
,v.hospital
from hospital_A t, hospital_A v
where t.op_date=v.date_of_birth
and t.surname=v.surname
and t.op_id = 'P619920'
and v.op_id = 'I552015'
This gave me 20 rows.
Bu when I apply the same query to BIGTABLE, there are 81 rows for Hospital A.
I understand this difference stems from babies which were born in other hospitals but were taken to Hospital A on the same day for intensive care.
In the BIGTABLE, pattern is like:
| name |surname|date_of_birth|hospital| op_id | op_date | medical_branch |
|+++++++|+++++++|+++++++++++++|++++++++|+++++++++|++++++++++|++++++++++++++++++|
| Mary | White | 01.02.1981 | A | P619920 |09.10.2011| Gynecology |
| John | White | 09.10.2011 | A | I552015 |09.10.2011|Infantile diseases|
| Ellen | Young | 04.09.1978 | B | P619920 |07.12.2011| Gynecology |
| Robert| Young | 07.12.2011 | A | I552015 |07.12.2011|Infantile diseases|
So, the results of the BIGTABLE is different from the isolated hospital_A table, as it contains Hospital B.
My query for BIGTABLE should extract the same day birth-intensive care combination which happened in the same hospital.
The resulting table:
| hospital | total births | intensive_care_babies |
|----------|--------------|-----------------------|
| A | 100 | 20 |
| B | ... | .. |
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
只需附加两行必须来自同一家医院的条件:
顺便说一句,您的声明中还缺少一个条件。您正在检查母亲的手术日期是否与婴儿的出生日期相同。您正在使用它来将婴儿与其母亲相匹配。但您忘记检查婴儿的OP日期是否与其出生日期相同。只有在出生当天被送往重症监护室的婴儿才需要满足这种条件。
回答您评论中的问题:
Simply append the condition that both rows must be from the same hospital:
BTW, you are missing one more condition in your statement. You are checking that the OP date of the mother is the same as the birth date of the baby. You are using this to match the baby to its mother. But you forgot to check that the OP date of the baby is the same as its birth date. This condition is required to get only the babies that are sent to intensive care on the day they were born.
To answer the question in your comment:
除了医院之外,还可以添加姓名和姓氏,或者更好的是,如果您有个人 ID,也可以添加它。有了这么多行,几乎可以肯定会有同名同姓的人。
In addition to hospital, join on name in addition to surname, or better yet, on a person-id if you have it. With this many rows, there are almost guaranteed to be persons with identical names and surnames.