Flattening JSON in Java - time complexity

Posted on 2025-01-28 08:06:50 · 1127 words · 2 views · 0 comments

{
  "id": "12345678",
  "data": {
    "address": {
      "street": "Address 1",
      "locality": "test loc",
      "region": "USA"
    },
    "country_of_residence": "USA",
    "date_of_birth": {
      "month": 2,
      "year": 1988
    },
    "links": {
      "self": "https://testurl"
    },
    "name": "John Doe",
    "nationality": "XY",
    "other": [
      { 
        "key1": "value1",
        "key2": "value2"
      },
      { 
        "key1": "value1",
        "key2": "value2"
      }
    ],
    "notified_on": "2016-04-06"
  }
}

I am trying to read data from a GraphQL API that returns a paginated JSON response, and I need to write it to a CSV file. I have been exploring Spring Batch for the implementation: I would read the JSON data in an ItemReader, flatten each JSON entry in an ItemProcessor, and then write the flattened data to a CSV in an ItemWriter. While I could use something like Jackson to flatten the JSON, I am concerned about the performance implications if the JSON data is heavily nested.

Expected output (CSV columns):

id, data.address.street, data.address.locality, data.address.region, data.country_of_residence, data.date_of_birth.month, data.date_of_birth.year, data.links.self, data.name, data.nationality, data.other (using jsonPath), data.notified_on
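To make the target format concrete, here is a minimal sketch of how one flattened record could be lined up against those columns. It assumes the record is already available as a `Map` keyed by dotted path; `CsvRowSketch`, `escape`, and `toCsvLine` are hypothetical names, and the quoting follows the usual RFC 4180 convention rather than any particular writer's behavior.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

final class CsvRowSketch {

    // Quote a field only when it contains a comma, a double quote, or a line break;
    // embedded double quotes are doubled. A null value becomes an empty cell.
    static String escape(Object value) {
        if (value == null) {
            return "";
        }
        String s = String.valueOf(value);
        boolean needsQuotes = s.contains(",") || s.contains("\"")
                || s.contains("\n") || s.contains("\r");
        return needsQuotes ? "\"" + s.replace("\"", "\"\"") + "\"" : s;
    }

    // Emit cells in the order given by `columns`; a path that is absent
    // from the record yields an empty cell, so every row has the same width.
    static String toCsvLine(List<String> columns, Map<String, ?> flatRecord) {
        StringJoiner line = new StringJoiner(",");
        for (String column : columns) {
            line.add(escape(flatRecord.get(column)));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Assumption: a subset of the columns from the question, for brevity.
        List<String> columns = List.of(
                "id", "data.address.street", "data.date_of_birth.year", "data.name");
        Map<String, Object> flat = Map.of(
                "id", "12345678",
                "data.address.street", "Address 1",
                "data.date_of_birth.year", 1988,
                "data.name", "John Doe");
        System.out.println(String.join(",", columns));
        System.out.println(toCsvLine(columns, flat));
    }
}
```

Fixing the column list up front keeps each row independent of the key order inside any single record, which matters when some records omit optional fields.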

I need to process more than a million records. While I believe flattening the JSON is a linear operation, O(n) in the number of nodes, I am still wondering whether there are other caveats if the JSON structure gets heavily nested.
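For reference, this is the kind of flattening step I have in mind, as a minimal sketch. It assumes each record has already been parsed into nested `java.util.Map` values (with Jackson, for example, by binding the JSON to a `Map`); `IterativeFlattener` and `flatten` are hypothetical names. Every entry is visited once, so the traversal itself is linear in the number of nodes, and the explicit stack keeps deep nesting off the call stack. One cost that is not strictly linear: each dotted key is built by string concatenation, so key construction takes time proportional to the path length for every entry. Arrays such as `data.other` are kept as a single leaf value here, matching the single `data.other` column; expanding them would be a separate design choice.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class IterativeFlattener {

    // Depth-first walk using two parallel stacks: the dotted prefix of each
    // open object and an iterator over that object's remaining entries.
    static Map<String, Object> flatten(Map<String, Object> root) {
        Map<String, Object> flat = new LinkedHashMap<>();
        Deque<String> prefixes = new ArrayDeque<>();
        Deque<Iterator<? extends Map.Entry<?, ?>>> iterators = new ArrayDeque<>();
        prefixes.push("");
        iterators.push(root.entrySet().iterator());

        while (!iterators.isEmpty()) {
            Iterator<? extends Map.Entry<?, ?>> current = iterators.peek();
            if (!current.hasNext()) {
                // This object is exhausted: return to its parent.
                iterators.pop();
                prefixes.pop();
                continue;
            }
            Map.Entry<?, ?> entry = current.next();
            String path = prefixes.peek() + entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                // Nested object: descend, extending the dotted prefix.
                prefixes.push(path + ".");
                iterators.push(((Map<?, ?>) value).entrySet().iterator());
            } else {
                // Scalars, nulls, and lists are stored as leaf values.
                flat.put(path, value);
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("street", "Address 1");
        address.put("region", "USA");
        Map<String, Object> data = new LinkedHashMap<>();
        data.put("address", address);
        data.put("name", "John Doe");
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("id", "12345678");
        root.put("data", data);
        // Keys come out in document order, e.g. id, data.address.street, ...
        flatten(root).forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Memory use during the walk is bounded by the nesting depth (one prefix and one iterator per open object) plus the flattened output for the current record.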
