0

I am having a nested json file like below:

test.json

{
  "resourceType": "test",
  "url": "/test/abc-oq-001",
  "version": "1.0",
  "title": "xyz test",
  "status": "active",
  "date": "1990-01-01T10:10:10+08:00",
  "description": "one",
  "code": [
    {
      "system": "http://aaa.com/ValueSet",
      "code": "test"
    }
  ],
  "item": [
    {
      "linkId": "twb",
      "text": "twb",
      "type": "group",
      "required": true,
      "item": [
        {
          "linkId": "twb.1",
          "text": "Input",
          "type": "group",
          "item": [
            {
              "linkId": "twb.1.1",
              "code": [
                {
                  "system": "http://aaa.com/ValueSete",
                  "code": "oq001"
                }
              ],
              "text": "sfd",
              "type": "quantity",
              "required": true
            },
            {
              "linkId": "twb.1.2",
              "code": [
                {
                  "system": "http://aaa.com/ValueSet",
                  "code": "oq002"
                }
              ],
              "text": "fsd",
              "type": "quantity",
              "required": true
            }
          ]
        }
      ]
    },
    {
      "linkId": "sfds",
      "text": "sfds",
      "type": "group",
      "required": true,
      "item": [
        {
          "linkId": "fds-history.1",
          "code": [
            {
              "system": "http://aaa.com/ValueSet",
              "code": "oq004"
            }
          ],
          "prefix": "1",
          "text": "Are you a dsfdf?",
          "type": "choice",
          "required": true,
          "answerValueSet": "http://aaa.com/ValueSet"
        }
      ]
    }
  ]
}

I need to normalize this file to have the respective columns and rows

file_0 = r'test.json'

with open(file_0) as json_file_0:
    json_data_0 = json.load(json_file_0)
    
# pd.json_normalize(json_data_0)
pd.json_normalize(json_data_0, record_path=['code'], meta=['resourceType', 'url', 'version', 'title', 'status', 'date',
       'description'])

However, there are more than one column that is having nested list of dictionaries. How should this be completely normalized. I understand flatten would completely flatten the data to the respective columns, but i would miss the hierarchy.

How should i completely normalize a json file having multipe objects having nested list of dicts.

blackfury
  • 675
  • 3
  • 11
  • 22
  • 1
    What are "the respective columns and rows" supposed to look like? Please describe your expected output. – Tranbi Jul 04 '23 at 08:25
  • I think this should help https://stackoverflow.com/questions/47242845/pandas-io-json-json-normalize-with-very-nested-json – Ali Massoud Jul 04 '23 at 08:38

0 Answers0