Matching financial transactions and merchants
I have an Elasticsearch index with billions of transactions.
An example tx:
{
"id": "54bfa9af-009a-437d-bd21-caaf651f7218",
"amount": 100.0,
"currency": "EUR",
"type": "expense",
...
"note": "CARD PAYMENT TO AZAMON.COM 100.0 EUR, RATE 0.86/GBP ON 05-05-2022"
}
I have a few million merchants (companies, etc.) in an RDBMS table:
id | Name
123 | Azamon Ltd
456 | Alple Inc.
789 | Goooogle
...
I can easily ingest them into another Elasticsearch index.
Both the transaction's note and the merchant's name are analyzed fields.
For every newly indexed transaction, I would like to enrich its content with a merchant ID+name. It doesn't have to be perfect, though. A score threshold could be fine-tuned once the solution works for most of the matching tx.
E.g. for the tx above, I would like to obtain "123+Azamon Ltd".
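To make the desired enrichment concrete, here is a minimal sketch of the step I have in mind. It assumes a search against the merchants index has already returned a standard Elasticsearch response (hits carrying `_score` and `_source`), and that merchant documents keep the `id` and `Name` columns from the table above. The function name, the `merchant` and `merchant_label` fields, and the example score and threshold are all hypothetical.

```python
def enrich_transaction(tx: dict, search_response: dict, min_score: float) -> dict:
    """Return a copy of tx, enriched with the best merchant hit if it clears min_score."""
    hits = search_response.get("hits", {}).get("hits", [])
    enriched = dict(tx)  # shallow copy, the input tx is left untouched
    if not hits:
        return enriched

    # Pick the highest-scoring hit; treat a missing score as 0.0.
    best = max(hits, key=lambda h: h.get("_score") or 0.0)
    if (best.get("_score") or 0.0) < min_score:
        return enriched  # below the threshold: leave the tx unenriched

    source = best["_source"]
    enriched["merchant"] = {"id": source["id"], "name": source["Name"]}
    enriched["merchant_label"] = f'{source["id"]}+{source["Name"]}'
    return enriched


# Made-up response for illustration only; the score is not a real measurement.
fake_response = {
    "hits": {"hits": [{"_score": 7.2, "_source": {"id": 123, "Name": "Azamon Ltd"}}]}
}
tx = {"id": "54bfa9af-009a-437d-bd21-caaf651f7218", "type": "expense"}
print(enrich_transaction(tx, fake_response, min_score=5.0)["merchant_label"])
```

Raising or lowering `min_score` is where the threshold tuning would happen.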
Should I just create a custom tokenizer and analyzer for that, and run a query against a "merchants" index using the tx note as the search term? What would be a good pipeline structure for single-language tx/merchant matching?
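As a rough sketch of the "note as search term" idea: the snippet below cleans the note on the client side and builds a `match` query body with `fuzziness` and a top-level `min_score`. The noise-word list, the domain-suffix list, and the function names are my own assumptions for illustration and would need tuning on real data; the `Name` field comes from the merchants table above. It only builds the request body and does not call Elasticsearch.

```python
import re

# Hypothetical noise words seen in card-payment notes; not an exhaustive list.
NOISE_WORDS = {"card", "payment", "to", "rate", "on", "eur", "gbp", "usd"}
# Hypothetical set of domain suffixes to strip from merchant names.
DOMAIN_SUFFIX = re.compile(r"\.(com|net|org|co\.uk)\b")


def normalize_note(note: str) -> str:
    """Reduce a raw tx note to the words most likely to name the merchant."""
    text = DOMAIN_SUFFIX.sub("", note.lower())
    tokens = re.findall(r"[a-z]+", text)  # drops amounts, rates and dates
    return " ".join(t for t in tokens if t not in NOISE_WORDS)


def build_merchant_query(note: str, min_score: float) -> dict:
    """Build a search body for the merchants index from a tx note."""
    return {
        "size": 1,               # only the best candidate is needed
        "min_score": min_score,  # threshold to be tuned later
        "query": {
            "match": {
                "Name": {
                    "query": normalize_note(note),
                    "fuzziness": "AUTO",  # tolerate small spelling differences
                }
            }
        },
    }


note = "CARD PAYMENT TO AZAMON.COM 100.0 EUR, RATE 0.86/GBP ON 05-05-2022"
print(normalize_note(note))  # azamon
```

The same cleanup could presumably live in a custom analyzer instead of client code, which is part of what I am asking about.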
Or is there a more efficient out-of-the-box solution for this problem?
I'm reading about NER, document similarity and other stuff, but I can't figure out what the best approach is in my (simple) case.
Pointing me to relevant and proven doc pages will be considered an acceptable answer.
TY