Bulk importing data into Elasticsearch

Posted on 2025-01-13 21:12:04

I have Elasticsearch data in JSON that I want to upload in one go via curl:

curl -s -H "Content-Type: application/json" -XPOST localhost:9200/_bulk --data-binary @C:\Users\adm\Desktop\test.json

but I get this error:

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Action/metadata line [1] contains an unknown parameter [_score]"}],"type":"illegal_argument_exception","reason":"Action/metadata line [1] contains an unknown parameter [_score]"},"status":400}

and the data (test.json) looks like this:

{"index" :{"_index":"variationdetails","_type":"_doc","_id":"e17bd50b-fe65-423c-a9f8-4d45ecf56559","_score":1,"_source":{"entityname":"Cislo_f","keyFieldObject":null,"keynotcolumn":false,"keyphrase":"daňový doklad č.","keypositionbottom":301,"keypositionleft":1482,"keypositionright":2000,"keypositiontop":251,"keytovaluedeltaleft":551,"keytovaluedeltatop":7,"userchanged":true,"valuepositionbottom":306,"valuepositionleft":2033,"valuepositionright":2387,"valuepositiontop":258,"variationguid":"a20e3d7a-bf38-4eae-9f23-fb100b539d08","vddid":"e17bd50b-fe65-423c-a9f8-4d45ecf56559"}}}
{"index" :{"_index":"variationdetails","_type":"_doc","_id":"c2a831f1-8156-434c-bd84-08db64c935a5","_score":1,"_source":{"entityname":"Datum_splatnosti","keyFieldObject":null,"keynotcolumn":false,"keyphrase":"Datum splatnosti:","keypositionbottom":1154,"keypositionleft":1706,"keypositionright":2015,"keypositiontop":1112,"keytovaluedeltaleft":421,"keytovaluedeltatop":11,"userchanged":true,"valuepositionbottom":1149,"valuepositionleft":2127,"valuepositionright":2298,"valuepositiontop":1123,"variationguid":"a20e3d7a-bf38-4eae-9f23-fb100b539d08","vddid":"c2a831f1-8156-434c-bd84-08db64c935a5"}}}

I tried changing _bulk to variationdetails/_doc, but that didn't help.
I can't use elasticdump on the target system (no internet access and no way to copy files onto it).

Comments (2)

寒江雪… 2025-01-20 21:12:04

The documentation for the bulk insert API gives an example and description of the required input.

For each record you want to create or update, you need two lines of JSON:

  • The first line specifies the action to take, and the document to take it on. Essentially, the details which would be in the URL and HTTP request method on a single-item action.
  • The second line specifies the data to use. Essentially, the details which would be in the body of a single-item action.

So for your example, it would look like this:

{"index" :{"_index":"variationdetails","_id":"e17bd50b-fe65-423c-a9f8-4d45ecf56559"}}
{"_type":"_doc","_score":1,"_source":{"entityname":"Cislo_f","keyFieldObject":null,"keynotcolumn":false,"keyphrase":"daňový doklad č.","keypositionbottom":301,"keypositionleft":1482,"keypositionright":2000,"keypositiontop":251,"keytovaluedeltaleft":551,"keytovaluedeltatop":7,"userchanged":true,"valuepositionbottom":306,"valuepositionleft":2033,"valuepositionright":2387,"valuepositiontop":258,"variationguid":"a20e3d7a-bf38-4eae-9f23-fb100b539d08","vddid":"e17bd50b-fe65-423c-a9f8-4d45ecf56559"}}
{"index" :{"_index":"variationdetails","_id":"c2a831f1-8156-434c-bd84-08db64c935a5"}}
{"_type":"_doc","_score":1,"_source":{"entityname":"Datum_splatnosti","keyFieldObject":null,"keynotcolumn":false,"keyphrase":"Datum splatnosti:","keypositionbottom":1154,"keypositionleft":1706,"keypositionright":2015,"keypositiontop":1112,"keytovaluedeltaleft":421,"keytovaluedeltatop":11,"userchanged":true,"valuepositionbottom":1149,"valuepositionleft":2127,"valuepositionright":2298,"valuepositiontop":1123,"variationguid":"a20e3d7a-bf38-4eae-9f23-fb100b539d08","vddid":"c2a831f1-8156-434c-bd84-08db64c935a5"}}

I'm not sure whether _source is meant to be part of the document. It looks like the wrapper from a search hit, in which case _type and _score are hit metadata too, and you probably want only the contents of _source as the document:

{"index" :{"_index":"variationdetails","_id":"e17bd50b-fe65-423c-a9f8-4d45ecf56559"}}
{"entityname":"Cislo_f","keyFieldObject":null,"keynotcolumn":false,"keyphrase":"daňový doklad č.","keypositionbottom":301,"keypositionleft":1482,"keypositionright":2000,"keypositiontop":251,"keytovaluedeltaleft":551,"keytovaluedeltatop":7,"userchanged":true,"valuepositionbottom":306,"valuepositionleft":2033,"valuepositionright":2387,"valuepositiontop":258,"variationguid":"a20e3d7a-bf38-4eae-9f23-fb100b539d08","vddid":"e17bd50b-fe65-423c-a9f8-4d45ecf56559"}
{"index" :{"_index":"variationdetails","_id":"c2a831f1-8156-434c-bd84-08db64c935a5"}}
{"entityname":"Datum_splatnosti","keyFieldObject":null,"keynotcolumn":false,"keyphrase":"Datum splatnosti:","keypositionbottom":1154,"keypositionleft":1706,"keypositionright":2015,"keypositiontop":1112,"keytovaluedeltaleft":421,"keytovaluedeltatop":11,"userchanged":true,"valuepositionbottom":1149,"valuepositionleft":2127,"valuepositionright":2298,"valuepositiontop":1123,"variationguid":"a20e3d7a-bf38-4eae-9f23-fb100b539d08","vddid":"c2a831f1-8156-434c-bd84-08db64c935a5"}
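
For a larger file, the restructuring described in this answer could be scripted rather than done by hand. The following is only a minimal sketch in Python, assuming every input line has the shape shown in the question (a single action key wrapping a search hit that carries _source) and that only _index and _id should remain on the action line. The names ACTION_KEYS, hit_to_bulk_lines and convert are made up for illustration.

```python
import json

# Keys kept on the bulk action line. This allow-list is an assumption;
# everything else from the search hit (e.g. _score, _type) is dropped.
ACTION_KEYS = ("_index", "_id")


def hit_to_bulk_lines(line):
    """Turn one line shaped like the question's test.json into two bulk lines."""
    wrapper = json.loads(line)
    # Each input line is assumed to hold exactly one action key, e.g. "index".
    action, hit = next(iter(wrapper.items()))
    meta = {key: hit[key] for key in ACTION_KEYS if key in hit}
    action_line = json.dumps({action: meta}, ensure_ascii=False)
    # Only the contents of _source become the document body.
    source_line = json.dumps(hit["_source"], ensure_ascii=False)
    return [action_line, source_line]


def convert(lines):
    """Build a bulk request body from an iterable of input lines."""
    out = []
    for line in lines:
        if line.strip():  # skip blank lines
            out.extend(hit_to_bulk_lines(line))
    # The last line of a bulk body must end with a newline.
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python convert.py test.json bulk.json
    if len(sys.argv) == 3:
        with open(sys.argv[1], encoding="utf-8-sig") as src:
            body = convert(src)
        with open(sys.argv[2], "w", encoding="utf-8", newline="\n") as dst:
            dst.write(body)
```

The converted file could then be sent with the same curl command as in the question, pointing --data-binary at the new file. The script has to run somewhere Python is available, which may not be the target system.
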

月下伊人醉 2025-01-20 21:12:04

If you want to transfer data from one cluster to another, the best option is Elasticsearch's Snapshot and Restore API.

If you want to use the _bulk API, the request body has to follow the bulk format shown below, so the file needs to be written as NDJSON (one JSON object per line):

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

For example:

{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }

You get the error about _score because it is not a field of your document: Elasticsearch computes it for each search hit as the relevance score for the query, so the bulk API does not accept it on the action line.
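
Before sending a file, the line pairing described in this answer can be checked locally. The sketch below is an assumption-laden illustration, not part of any Elasticsearch client: check_bulk_body is a hypothetical helper that verifies only the structure (one action key per action line, a source line after index, create and update, none after delete, and a trailing newline). It does not check which parameters are allowed on the action line, so it would not catch the _score problem from the question by itself.

```python
import json

# Bulk actions followed by a source line, and the one that is not.
ACTIONS_WITH_SOURCE = {"index", "create", "update"}
ACTIONS_WITHOUT_SOURCE = {"delete"}


def check_bulk_body(body):
    """Return a list of structural problems found in a bulk request body."""
    problems = []
    if not body.endswith("\n"):
        problems.append("body must end with a newline")
    lines = body.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    i = 0
    while i < len(lines):
        number = i + 1  # 1-based line number for messages
        try:
            action_obj = json.loads(lines[i])
        except json.JSONDecodeError:
            problems.append(f"line {number}: not valid JSON")
            i += 1
            continue
        if not isinstance(action_obj, dict) or len(action_obj) != 1:
            problems.append(f"line {number}: expected exactly one action key")
            i += 1
            continue
        action = next(iter(action_obj))
        if action in ACTIONS_WITHOUT_SOURCE:
            i += 1  # delete has no source line
        elif action in ACTIONS_WITH_SOURCE:
            if i + 1 >= len(lines):
                problems.append(f"line {number}: '{action}' has no source line")
            i += 2  # skip the action line and its source line
        else:
            problems.append(f"line {number}: unknown action '{action}'")
            i += 1
    return problems
```

Called on the seven-line example from this answer (with a trailing newline), the function should report no problems; a body that stops right after an index action line is flagged.
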
