Bank intern problem statement
I saw an internship opportunity at a bank in Dubai. They have a defined problem statement to be solved in two months. They gave us just two lines:
"Basically the problem is about name matching logic.
There are two fields (variables); both are employer names, and each is a free text field. So we need to write a program to match these two variables."
Can anyone help me understand it? Is it just simple pattern matching?
Any help or comments would be appreciated.
Comments (2)
I think this is what they are asking for:
They have two sources of related data, for example one from an internal database and the other from name card input.
Because both fields are free text, there will be inconsistencies. For example, Nitin Garg, or Garg, Nitin, or Mr. Nitin Garg, etc. An extreme case is the name Gadaffi, which has many transliterated spellings.
What you are supposed to do is find a way to match all the names for a specific person together.
In short, match two pieces of data together by employer name, taking possible inconsistencies into account.
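To make the inconsistency point concrete, here is a minimal Python sketch of one possible normalization step, assuming the variations are limited to case, punctuation, word order, and honorifics. The TITLES set and the function names are hypothetical, chosen for illustration; real employer names would need their own list of ignorable words (for example company suffixes).

```python
import re

# Hypothetical list of honorifics to ignore; extend it for real data.
TITLES = {"mr", "mrs", "ms", "dr"}


def normalize_name(raw: str) -> str:
    """Lower-case, drop punctuation and titles, and sort the remaining tokens."""
    tokens = re.findall(r"[a-z0-9]+", raw.lower())
    kept = [t for t in tokens if t not in TITLES]
    # Sorting makes "Garg, Nitin" and "Nitin Garg" compare equal.
    return " ".join(sorted(kept))


def same_name(a: str, b: str) -> bool:
    """Treat two free-text names as equal if their normalized forms match."""
    return normalize_name(a) == normalize_name(b)


if __name__ == "__main__":
    for variant in ("Nitin Garg", "Garg, Nitin", "Mr. Nitin Garg"):
        print(repr(variant), "->", repr(normalize_name(variant)))
```

This only handles formatting differences. Misspellings and different transliterations need a fuzzy comparison on top of it.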
Once upon a time there was a nice simple answer to the problem of matching up names despite misspellings and different transliterations: Soundex. But people have put a lot of work into this problem, so now you should probably use the results of that work, which are built into databases and add-ons, some of them free. See Fuzzy matching using T-SQL and http://anastasiosyal.com/archive/2009/01/11/18.aspx and http://msdn.microsoft.com/en-us/magazine/cc163731.aspx
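For reference, this is roughly what Soundex does. The sketch below is my own Python version of the common American Soundex rules, not code taken from the linked articles or from any database, and it assumes plain ASCII letters.

```python
# Consonant groups used by American Soundex; vowels, y, h and w get no code.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex code of a word ("" if it has no letters)."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(c, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts
        if len(result) == 4:
            break
    return result.ljust(4, "0")


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Gadaffi", "Gaddafi"):
        print(name, soundex(name))
```

Because the first letter is kept as-is, spellings that start with different letters never match, which is one reason the newer fuzzy matching tools are usually a better choice.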