Regex for credit card track data
Are there any known regular expressions out there to validate credit card track 1 and track 2 data?
EDIT:
From Wikipedia:
The information on track 1 on financial cards is contained in several formats: A, which is reserved for proprietary use of the card issuer, B, which is described below, C-M, which are reserved for use by ANSI Subcommittee X3B10 and N-Z, which are available for use by individual card issuers:
Track 1, Format B:
- Start sentinel — one character (generally '%')
- Format code="B" — one character (alpha only)
- Primary account number (PAN) — up to 19 characters. Usually, but not always, matches the credit card number printed on the front of the card.
- Field Separator — one character (generally '^')
- Name — two to 26 characters
- Field Separator — one character (generally '^')
- Expiration date — four characters in the form YYMM.
- Service code — three characters
- Discretionary data — may include Pin Verification Key Indicator (PVKI, 1 character), PIN Verification Value (PVV, 4 characters), Card Verification Value or Card Verification Code (CVV or CVK, 3 characters)
- End sentinel — one character (generally '?')
- Longitudinal redundancy check (LRC) — it is one character and a validity character calculated from other data on the track. It should be noted that most reader devices do not return this value when the card is swiped to the presentation layer, and use it only to verify the input internally to the reader.
Track 2: This format was developed by the banking industry (ABA). This track is written with a 5-bit scheme (4 data bits + 1 parity), which allows for sixteen possible characters, which are the numbers 0-9, plus the six characters : ; < = > ? . The selection of six punctuation symbols may seem odd, but in fact the sixteen codes simply map to the ASCII range 0x30 through 0x3f, which defines ten digit characters plus those six symbols. The data format is as follows:
- Start sentinel — one character (generally ';')
- Primary account number (PAN) — up to 19 characters. Usually, but not always, matches the credit card number printed on the front of the card.
- Separator — one char (generally '=')
- Expiration date — four characters in the form YYMM.
- Service code — three characters
- Discretionary data — as in track one
- End sentinel — one character (generally '?')
- Longitudinal redundancy check (LRC) — it is one character and a validity character calculated from other data on the track. It should be noted that most reader devices do not return this value when the card is swiped to the presentation layer, and use it only to verify the input internally to the reader.
Note: Account number in track1 can contain spaces for American Express cards.
So :
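As a hedged sketch of this note, assuming the Track 1 layout quoted in the question, the PAN portion of a pattern can tolerate embedded spaces while still capping the field at 19 characters. The names `TRACK1_PAN` and `extract_pan` are hypothetical, and the sample number is a widely used test value, not real card data:

```python
import re

# Track 1 prefix: start sentinel '%', format code 'B', then a PAN of up to 19
# characters made of digits and spaces, terminated by the '^' field separator.
# Assumption: the PAN starts with a digit and spaces are the only non-digits.
TRACK1_PAN = re.compile(r'^%B(?P<pan>\d[\d ]{0,18})\^')

def extract_pan(track1):
    """Return the PAN with spaces removed, or None if the prefix does not match."""
    m = TRACK1_PAN.match(track1)
    if m is None:
        return None
    return m.group('pan').replace(' ', '')

print(extract_pan('%B3714 496353 98431^DOE/JOHN^25121010000?'))
```

Stripping the spaces after matching keeps the regex simple and gives a value that can be compared with the number read from other sources.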
import re
# Matches a standalone run of exactly 16 digits (a bare card number, not a full track).
pattern = re.compile(r'\b\d{16}\b')
Track 1, Format B translates to
with some assumptions as to what constitutes a valid character.
Of course there are no checks whether the data is actually meaningful, and the LRC (if present) also can't be validated.
Can you check this against some real data and see if it works?
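For illustration, here is a minimal sketch of a Track 1, Format B pattern built from the field list in the question. It is not the expression this answer originally showed. It assumes '%' and '?' sentinels, '^' separators, a PAN of digits and spaces, and any character other than '^' or '?' in the name and discretionary data. `TRACK1` and `is_track1` are made-up names:

```python
import re

# Sketch of a Track 1, Format B pattern (assumptions listed in the text above).
TRACK1 = re.compile(
    r'^%B'               # start sentinel and format code
    r'[\d ]{1,19}'       # primary account number, up to 19 characters
    r'\^[^\^?]{2,26}'    # field separator, then name (2 to 26 characters)
    r'\^\d{4}'           # field separator, then expiration date (YYMM)
    r'\d{3}'             # service code
    r'[^\^?]*'           # discretionary data
    r'\?'                # end sentinel
    r'.?$'               # optional LRC character, accepted but not validated
)

def is_track1(data):
    """True if data has the shape of a Track 1, Format B record."""
    return TRACK1.match(data) is not None
```

As the answer says, a match only shows the data has the right shape. Nothing here checks that the expiry month is sensible or that the LRC is correct.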
Track 2 translates to
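As a rough sketch, assuming ';' and '?' sentinels, an '=' separator, and digits-only fields (including the discretionary data), a Track 2 pattern could be written like this in Python. `TRACK2` and `is_track2` are illustrative names, not taken from the answer:

```python
import re

# Sketch of a Track 2 pattern (assumptions listed in the text above).
TRACK2 = re.compile(
    r'^;'         # start sentinel
    r'\d{1,19}'   # primary account number, up to 19 digits
    r'='          # separator
    r'\d{4}'      # expiration date (YYMM)
    r'\d{3}'      # service code
    r'\d*'        # discretionary data, assumed numeric here
    r'\?'         # end sentinel
    r'.?$'        # optional LRC character, accepted but not validated
)

def is_track2(data):
    """True if data has the shape of a Track 2 record."""
    return TRACK2.match(data) is not None
```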
I was about to post the same link on regular-expressions.info, for verifying the cc number part of the track.
Now comes the tricky part. Track data varies in format between card issuers and even between card readers. For example, the 'separator' characters aren't always the same. The same applies to the end 'sentinels'.
Wikipedia gives a good overview: http://en.wikipedia.org/wiki/Magnetic_stripe_card
With track2, the card number is followed by an '=' (or occasionally a 'D'). Then you have the expiry date as YYMM. After that, Track2 has 'discretionary data', which could be anything.
I wouldn't worry too much after this point. If it's track data, you'll be pretty sure by now. I guess it depends on what you are aiming to do with the data.
Anyway, for Track2 you could do a lot worse than adding [=D][0-9]{4} instead of the $ at the end of the cc regex:
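A minimal sketch of that suggestion in Python follows. The card-number part is an assumption: a generic 13 to 19 digit run stands in for whichever brand-specific card-number regex you start from, and the optional leading ';' and the names `CC_NUMBER`, `TRACK2_PREFIX`, and `looks_like_track2` are mine:

```python
import re

# Assumption: a generic 13-19 digit card number replaces the brand-specific
# card-number regex the answer refers to.
CC_NUMBER = r'[0-9]{13,19}'

# Optional ';' start sentinel, the card number, then '=' or 'D' followed by
# four date digits where the '$' anchor used to be.
TRACK2_PREFIX = re.compile(r'^;?' + CC_NUMBER + r'[=D][0-9]{4}')

def looks_like_track2(data):
    """True if data begins like Track 2: card number, separator, 4-digit date."""
    return TRACK2_PREFIX.match(data) is not None
```

Because nothing is anchored after the date, whatever discretionary data follows is ignored, which matches the advice not to worry much past this point.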
For track1, you could do something similar ... Track1 contains more variable data, so can be a touch more complicated.
Good luck!
The following two regular expressions seem to validate the track 1 and track 2 data. Note that these regular expressions make assumptions that the characters used are the ones that are "generally" used in the Wikipedia information above.
Assumes that % and ? are the sentinel characters and that ^ is used as the field separator character. Also assumes that the account number, date, and service code are digits.
Assumes that ; and ? are the sentinel characters and that = is the field separator character. Also assumes that the account number, date, and service code are digits.
I tested these expressions using track data read from a MagTek card reader. The following two sets of track data match what was read from the reader and validate against the two regular expressions above (the numbers have obviously been changed):
Here is a REGEX that works for me to pick both Track 1 and Track 2. Use this with the regex option "Dot does NOT match newline".
I tested with this data (my reader is reading both a Track 1 and Track 2 record, in this order, for the same card I tested with - numbers and name changed below.)
The above REGEX uses NAMED CAPTURE GROUPS (the "?<name>" that starts out each (group)) and I see the result (with RegexBuddy) as:
Note the second match does NOT identify FC (format code) and NM (name) in the Track 2 (match 2) since they are not used in track 2.
If your regex engine does not support NAMED GROUPS, just kill the "?<name>" part of each of the capturing groups. Then, use position to determine each group.
Also, my single SWIPE contains BOTH track 1 and track 2 (in that order, track 1, a crlf and then track 2). According to the Wikipedia link in the original question, cards can have up to 3 tracks and readers might read tracks 1 and 2 both (or one or the other) and rarely track 3.
For this reason, I think it's a safe bet to use a REGEX that looks for both track 1 and track 2 and if you get both, you can ignore track 2 (since track 1 has more data) or whatever you wish.
Because both tracks are present in my swipes, the REGEX engine will return 2 matches with my REGEX above (assuming no read error from the reader and a reader that supports both tracks). In my case, this does not bother me and I'll simply plan to use the "first match" and ignore the second.
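To make the two-match behaviour concrete, here is a hedged Python sketch of a single pattern that picks up either track from one swipe. It is not the REGEX from this answer. The group names FC and NM follow the text above, while PAN, ED, SC, and DD are hypothetical. Python spells named groups `(?P<name>...)`, and the pattern avoids '.' entirely, so the dot/newline option does not matter. The sample swipe is fictitious:

```python
import re

# One pattern for both tracks: FC and NM are optional so Track 2 can match.
# This is deliberately permissive, e.g. it does not tie '%' to the '^' layout.
TRACK = re.compile(
    r'^[%;]'                               # start sentinel of either track
    r'(?P<FC>[A-Z])?'                      # format code (Track 1 only)
    r'(?P<PAN>[\d ]{1,19})'                # primary account number
    r'(?:\^(?P<NM>[^\^?]{2,26})\^|=)'      # Track 1 name between '^', or '='
    r'(?P<ED>\d{4})'                       # expiration date (YYMM)
    r'(?P<SC>\d{3})'                       # service code
    r'(?P<DD>[^?\r\n]*)'                   # discretionary data
    r'\?',                                 # end sentinel
    re.MULTILINE,                          # let '^' match at each line start
)

def parse_swipe(swipe):
    """Return one dict of named groups per track found, in swipe order."""
    return [m.groupdict() for m in TRACK.finditer(swipe)]

swipe = ('%B4111111111111111^DOE/JOHN^25121010000000000?\r\n'
         ';4111111111111111=25121010000000000?')
tracks = parse_swipe(swipe)
first = tracks[0] if tracks else None   # use the first match, ignore the rest
```

In the second dict, `FC` and `NM` are `None`, which mirrors the note that those groups are not identified for Track 2.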
If you're interested only in track 1, use this regex:
If you're interested only in track 2, use the regex:
But I see no harm in checking for both and then using the first one you get, or perhaps comparing track 1 to track 2 as an additional error checking step.
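One way the comparison idea could look, as a sketch only: extract the PAN and expiry from each track and require them to agree. It assumes the "generally used" sentinels and separators, and that a Track 1 PAN with its spaces removed should equal the Track 2 PAN. `T1`, `T2`, and `tracks_agree` are hypothetical names:

```python
import re

# Just enough of each track to capture the PAN and the YYMM expiry.
T1 = re.compile(r'%B(?P<pan>[\d ]{1,19})\^[^\^?]{2,26}\^(?P<exp>\d{4})')
T2 = re.compile(r';(?P<pan>\d{1,19})=(?P<exp>\d{4})')

def tracks_agree(swipe):
    """True if both tracks are present and carry the same PAN and expiry."""
    m1 = T1.search(swipe)
    m2 = T2.search(swipe)
    if m1 is None or m2 is None:
        return False
    pan1 = m1.group('pan').replace(' ', '')  # assumption: spaces are cosmetic
    return pan1 == m2.group('pan') and m1.group('exp') == m2.group('exp')
```

A `False` result covers both a missing track and a mismatch, so a caller that wants to tell those apart would need to inspect the two matches separately.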
Sorry to answer what seems to be answered!