Just wanted to chime in on how to correct the decimal places in the codes. First, there are four broad points to consider:
Standard codes place the decimal point after the third character: XXX.XX
Some codes have nothing after the decimal point and stay as XXX
V codes also follow the XXX.XX format, e.g. V54.31
E codes follow XXXX.X, e.g. E850.9
Thus the general logic for fixing the codes is:
If first character = E:
    If 5th character = '':
        Ignore
    Else replace XXXXX with XXXX.X
Else if 4th-5th characters are not '': (XXXX or XXXXX)
    replace XXXXX with XXX + . + remainder (XXX.XX or XXX.X)
(All remaining codes are XXX)
I implemented this with two SQL UPDATE statements.
Number 1, for non-E codes:
USE MainDb;
-- Insert the decimal point after the 3rd character (XXX.X or XXX.XX)
UPDATE "dbo"."icd9cm_diagnosis_codes"
SET "DIAGNOSIS CODE" = SUBSTRING("DIAGNOSIS CODE",1,3)+'.'+SUBSTRING("DIAGNOSIS CODE",4,5)
WHERE
SUBSTRING("DIAGNOSIS CODE",4,5) != ''
AND
LEFT("DIAGNOSIS CODE",1) != 'E';
Number 2, for E codes:
-- Insert the decimal point after the 4th character (EXXX.X)
UPDATE "dbo"."icd9cm_diagnosis_codes"
SET "DIAGNOSIS CODE" = SUBSTRING("DIAGNOSIS CODE",1,4)+'.'+SUBSTRING("DIAGNOSIS CODE",5,5)
WHERE
LEFT("DIAGNOSIS CODE",1) = 'E'
AND
SUBSTRING("DIAGNOSIS CODE",5,5) != '';
That did the trick for me (using SQL Server 2008).
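If the codes are not in a database yet, the same rules can be applied while loading the file. Here is a minimal Python sketch of the logic above. The function name `add_icd9_decimal` is my own, and it assumes the input is an undotted code with no padding other than surrounding whitespace.

```python
def add_icd9_decimal(code: str) -> str:
    """Insert the implied decimal point into an undotted ICD-9-CM diagnosis code."""
    code = code.strip().upper()
    if "." in code:
        return code  # already formatted, leave it alone
    # E codes keep four characters before the point, all other codes keep three.
    head = 4 if code.startswith("E") else 3
    if len(code) <= head:
        return code  # nothing after the point, e.g. "250" or "E850"
    return code[:head] + "." + code[head:]


if __name__ == "__main__":
    for raw in ["25001", "4019", "V5431", "E8509", "250", "E850"]:
        print(raw, "->", add_icd9_decimal(raw))
```

Unlike the UPDATE statements, this version skips codes that already contain a dot, so running it twice over the same data should be harmless.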
I was able to use the helpful answers here to create a Groovy script that decimalizes the codes and combines the long and short descriptions into a tab-separated list. In case this helps anyone, I'm including my code here:
import org.apache.log4j.BasicConfigurator
import org.apache.log4j.Level
import org.apache.log4j.Logger
import java.util.regex.Matcher
import java.util.regex.Pattern

Logger log = Logger.getRootLogger()
BasicConfigurator.configure()
log.setLevel(Level.INFO)

// Map each undotted code to its short description
Map shortDescMap = [:]
new File('CMS31_DESC_SHORT_DX.txt').eachLine { String l ->
    int split = l.indexOf(' ')
    String code = l[0..split].trim()
    String desc = l[split+1..-1].trim()
    shortDescMap.put(code, desc)
}

int shortLenCheck = 40 // arbitrary lengths, but provide some sanity checking...
int longLenCheck = 300
File longDescFile = new File('CMS31_DESC_LONG_DX.txt')
Map cmsRows = [:]
// Group 1 is the undotted code, group 2 is the long description
Pattern p = Pattern.compile(/^(\w*)\s+(.*)$/)

new File('parsedICD9.csv').withWriter { out ->
    out.write('ICD9 Code\tShort Description\tLong Description\n')
    longDescFile.eachLine { String row ->
        Matcher m = row =~ p
        if (m.matches()) {
            String code = m.group(1)
            // Fall back to an empty string so a missing short description cannot cause a NullPointerException
            String shortDescription = shortDescMap.get(code) ?: ''
            String longDescription = m.group(2)
            if (shortDescription.size() > shortLenCheck) {
                log.info("Not short? $shortDescription")
            }
            if (longDescription.size() > longLenCheck) {
                log.info("${longDescription.size()} == Too long? $longDescription")
            }
            log.debug("Match 1:${code} -- 2:${longDescription} -- orig:$row")
            if (code.startsWith('V')) {
                // V codes: decimal point after the third character (V54.31)
                if (code.size() > 3) {
                    code = code[0..2] + '.' + code[3..-1]
                }
                log.info("Code: $code")
            } else if (code.startsWith('E')) {
                // E codes: decimal point after the fourth character (E850.9)
                if (code.size() > 4) {
                    code = code[0..3] + '.' + code[4..-1]
                }
                log.info("Code: $code")
            } else if (code.size() > 3) {
                // All other codes: decimal point after the third character
                code = code[0..2] + '.' + code[3..-1]
            }
            if (code) {
                cmsRows.put(code, ['longDesc': longDescription])
            }
            out.write("$code\t$shortDescription\t$longDescription\n")
        } else {
            log.warn "No match for row: $row"
        }
    }
}
I ran into this same issue a while back and ended up building my own solution from scratch. Recently, I put up an open API for the codes for others to use: http://aqua.io/codes/icd9/documentation
You can just download all codes in JSON (http://api.aqua.io/codes/beta/icd9.json) or pull an individual code (http://api.aqua.io/codes/beta/icd9/250-1.json). Pulling a single code not only gives you the ICD-10 "crosswalk" (equivalents), but also some extra goodies, like relevant Wikipedia links.
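For anyone scripting against those URLs, here is a rough Python sketch. The base URL and the `250-1` example come from the post. The rule that a dot in the code becomes a hyphen in the path is my assumption from that single example, the helper names are made up, and I have not verified what the response looks like or whether the service is still online.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "http://api.aqua.io/codes/beta/icd9"  # base URL as given in the post


def icd9_api_url(code: str) -> str:
    """Build the per-code JSON URL, assuming '.' in the code maps to '-' in the path."""
    slug = code.strip().replace(".", "-")
    return f"{API_BASE}/{quote(slug)}.json"


def fetch_icd9(code: str, timeout: float = 10.0):
    """Fetch and decode the JSON document for one code (performs a network call)."""
    with urlopen(icd9_api_url(code), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `icd9_api_url("250.1")` reproduces the single-code URL quoted above.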
I finally found the following:
"The field for the ICD-9-CM Principal and Other Diagnosis Codes is six characters in length, with the decimal point implied between the third and fourth digit for all diagnosis codes other than the V codes. The decimal is implied for V codes between the second and third digit."
So I was able to get a hold of a complete ICD-9 list and reformat as required.
You might find that the ICD-9 codes follow the following format:
Check this out: http://en.wikipedia.org/wiki/List_of_ICD-9_codes
I struggled with this issue myself for a long time as well. The best resources I have been able to find are the zip files here:
https://www.cms.gov/ICD9ProviderDiagnosticCodes/06_codes.asp
Unfortunately, the codes in those files are (oddly) missing the decimal points, but as several other posters have pointed out, adding them is fairly easy since the rules are known. I was able to use a regular-expression-based "find and replace" in my text editor to add them. One thing to watch out for if you go that route is that you can end up with codes that have a trailing "." with nothing after it. That's not valid, so you might need to do another find/replace pass to clean those up.
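To make the two-pass idea concrete, here is a small Python sketch of a regex-based replace followed by the trailing-dot cleanup. These are not the exact expressions I used in the editor. The patterns and the name `dot_code` are illustrative and assume undotted codes of the form digits, V plus digits, or E plus digits.

```python
import re

# E codes split after four characters, everything else after three.
_E_CODE = re.compile(r"^(E\d{3})(\d*)$")
_OTHER = re.compile(r"^([0-9V]\d{2})(\d*)$")


def dot_code(code: str) -> str:
    pattern = _E_CODE if code.startswith("E") else _OTHER
    # First pass: always write the dot. This is what leaves "250." behind
    # for codes that have no digits after the split point.
    dotted = pattern.sub(r"\1.\2", code)
    # Second pass: drop a dot that has nothing after it.
    return re.sub(r"\.$", "", dotted)


if __name__ == "__main__":
    for raw in ["25001", "250", "V5431", "E8509", "E850"]:
        print(raw, "->", dot_code(raw))
```

Input that matches neither pattern is returned unchanged, which makes odd rows easy to spot afterwards.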
The annoying thing about the data files in the link above is that they carry no relationship to categories, which you might need depending on your application. I ended up taking one of the RTF-based category files I found online and reformatting it to get the range of codes for each category. That was still doable in a text editor with some creative regular expressions.
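Once you have the ranges, mapping a code to its category is a simple lookup. The Python sketch below shows one way to do it with a sorted list and `bisect`. The `RANGES` table is only a partial, hand-typed sample of the numeric chapter ranges for illustration, not the contents of the RTF file, and `category_for` is a made-up name.

```python
from bisect import bisect_right

# Partial, illustrative sample of (start, end, title), sorted by start.
RANGES = [
    (1, 139, "Infectious and parasitic diseases"),
    (140, 239, "Neoplasms"),
    (240, 279, "Endocrine, nutritional and metabolic diseases, and immunity disorders"),
    (390, 459, "Diseases of the circulatory system"),
]
_STARTS = [r[0] for r in RANGES]


def category_for(code: str):
    """Return the range title for a numeric ICD-9 code, or None if it is not covered."""
    head = code.split(".")[0][:3]  # works for dotted and undotted codes
    if not head.isdigit():
        return None  # V and E codes are outside this numeric sample
    n = int(head)
    i = bisect_right(_STARTS, n) - 1  # last range starting at or before n
    if i >= 0 and RANGES[i][0] <= n <= RANGES[i][1]:
        return RANGES[i][2]
    return None
```

A real table would also need entries for the V and E ranges, which this sample leaves out.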
I hope this helps someone.
Sean