I am trying to load a YAML file into a dictionary, modify the dict, and then save the result as a YAML stream to a file, but I am facing the following issue.

The source y.yaml file includes some strings in double quotes:

---
lev1:
  lev2:
    - name: schema
      templates:
        - name:  temp1
          deploy_onpr: "yes"
          deploy_cl: "no"

What I got in dest.yaml:

---
lev1:
  lev2:
    - name: schema
      new_item: test
      templates:
        - deploy_cl: no
          deploy_onpr: yes
          name: temp1

We can see that the quotes have been removed from the yes and no strings.

What I expected in dest.yaml is that the double quotes are not removed from the strings:

---
lev1:
  lev2:
    - name: schema
      new_item: test
      templates:
        - deploy_cl: "no"
          deploy_onpr: "yes"
          name: temp1

The code:

import os
from fnmatch import fnmatch
from json import loads, dumps

import ruamel.yaml

yaml = ruamel.yaml.YAML(typ='safe', pure=True)
yaml.preserve_quotes = True

yaml.explicit_start = True
yaml.default_flow_style = False
yaml.indent(mapping=2, sequence=4, offset=2)

def to_dict(input_ordered_dict):
    """Convert the inventory data stream to dict"""
    print(dumps(input_ordered_dict))
    return loads(dumps(input_ordered_dict))

def _list_files():
    '''
    list all yaml files in the yaml directory recursive
    '''
    pattern = "*.yaml"
    in_folder="./"
    all_files = []
    for path, subdirs, files in os.walk(in_folder):
        for name in files:
            if fnmatch(name, pattern):
                f = os.path.join(path, name)
                all_files.append(f)
    # return only after os.walk has visited every directory
    return all_files

def load_files():
    '''
    load directory recursive and generate a dict tree
    data allocated in the structure under the key "file name"
    '''
    dict_tree = {}
    for sfile in _list_files():
        with open(sfile, 'r') as stream:
            print(type(yaml))
            data_loaded = yaml.load(stream)
            print(data_loaded)
            dict_tree[sfile] = data_loaded
    return dict_tree


load_yaml_files = load_files()
inventoty_data = to_dict(load_yaml_files)
inventoty_data['./y.yaml']["lev1"]["lev2"][0]["new_item"]="test"
new_dict=inventoty_data["./y.yaml"]

dst_file="dest.yaml"
with open(dst_file, 'w') as yaml_file:
    yaml.dump(new_dict, yaml_file)
Anthon
ovntatar
  • It looks like to get the behavior you want you will need to use `ruamel.yaml.round_trip_load` and `ruamel.yaml.round_trip_dump` as described in https://stackoverflow.com/a/42097202/147356 – larsks Apr 29 '22 at 14:38
  • Not really, because the example above does not include the JSON part, and if I'm not wrong the issue is caused by calling dumps on the ordered dict – ovntatar Apr 29 '22 at 14:53
  • @larsks those functions should no longer be used; use a `YAML()` instance instead. I updated that answer. – Anthon Apr 29 '22 at 14:57
  • @ovntatar I don't understand why you have a `to_dict` function, IMO it is not needed. You should also look at `pathlib.Path` which allows you to do something like `inventoty_data = {str(path): yaml.load(path) for path in Path('.').glob('**/*.yaml')}` – Anthon Apr 29 '22 at 15:12
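
The pathlib suggestion in the last comment can be sketched with the standard library alone. This is only an illustration: `collect_yaml_paths` is a hypothetical helper name, and the temporary directory with its files exists just to make the example self-contained.

```python
import tempfile
from pathlib import Path


def collect_yaml_paths(root):
    """Return every *.yaml file below root (recursively), sorted for stable output."""
    return sorted(Path(root).glob("**/*.yaml"))


# Build a throwaway directory tree so the sketch can run anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "y.yaml").write_text("lev1: {}\n")
    (root / "sub" / "z.yaml").write_text("lev1: {}\n")
    (root / "notes.txt").write_text("not YAML, so it is skipped\n")

    # Store paths relative to root so the result does not depend on the temp dir name.
    found = [p.relative_to(root).as_posix() for p in collect_yaml_paths(root)]

print(found)
```

Each returned path could then be passed straight to `yaml.load(path)`, which replaces both `os.walk` and `fnmatch`.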

2 Answers

This looks like YAML 1.1, where yes and no were supposed to be boolean values (and thus needed quoting if they were meant as scalar strings).

The preserve_quotes attribute only works when using the default round-trip loader and dumper (which use a subclass of the SafeLoader and are also safe to use on unknown YAML sources), so it has no effect on a YAML(typ='safe') instance:

import sys
import ruamel.yaml
from pathlib import Path

yaml = ruamel.yaml.YAML()
yaml.explicit_start = True
yaml.indent(sequence=4, offset=2)
yaml.preserve_quotes = True
data = yaml.load(Path('y.yaml'))
yaml.dump(data, sys.stdout)

which gives:

---
lev1:
  lev2:
    - name: schema
      templates:
        - name: temp1
          deploy_onpr: "yes"
          deploy_cl: "no"

Even if you assign data as a value to a key in a normal dict, the special versions of the string created during loading will be preserved and dumped out with quotes.

If you need to add/update some values in Python that need quotes for a string "abc" then do:

DQ = ruamel.yaml.scalarstring.DoubleQuotedScalarString

some_place_in_your_data_structure = DQ("abc")
Anthon
  • Thanks @Anthon for your feedback. After eliminating to_dict, it works when ruamel.yaml.round_trip_dump and ruamel.yaml.round_trip_load are used with the preserve_quotes = True option – ovntatar Apr 29 '22 at 15:26
  • Those functions are long deprecated, and will be gone in the next release, so don't use them. – Anthon Apr 29 '22 at 15:31
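
One plausible reason why dropping to_dict matters: a JSON round trip rebuilds every string as a plain `str`, so any subclass carrying extra information is lost. The sketch below uses only the standard library; `QuotedStr` is a hypothetical stand-in for the special string type a round-trip loader creates, not a ruamel.yaml class.

```python
import json


class QuotedStr(str):
    """Hypothetical str subclass standing in for a quote-preserving scalar."""


original = {"deploy_onpr": QuotedStr("yes")}

# Same operation as the question's to_dict(): serialize to JSON and parse it back.
round_tripped = json.loads(json.dumps(original))

print(type(original["deploy_onpr"]).__name__)       # QuotedStr
print(type(round_tripped["deploy_onpr"]).__name__)  # str
```

The values still compare equal, but the type that told the dumper to emit quotes is gone.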

The following code works as expected:

import os
from fnmatch import fnmatch

import ruamel.yaml

ruamel.yaml.representer.RoundTripRepresenter.ignore_aliases = lambda x, y: True
yaml = ruamel.yaml.YAML()
yaml.explicit_start = True
yaml.indent(sequence=4, offset=2)
yaml.preserve_quotes = True


def _list_files():
    '''
    list all yaml files in the yaml directory recursive
    '''
    pattern = "*.yaml"
    in_folder="./"
    all_files = []
    for path, subdirs, files in os.walk(in_folder):
        for name in files:
            if fnmatch(name, pattern):
                f = os.path.join(path, name)
                all_files.append(f)
    # return only after os.walk has visited every directory
    return all_files

def load_files():
    '''
    load directory recursive and generate a dict tree
    data allocated in the structure under the key "file name"
    '''
    dict_tree = {}
    for sfile in _list_files():
        with open(sfile, 'r') as stream:
            data_loaded = yaml.load(stream)
            print(data_loaded)
            dict_tree[sfile] = data_loaded
    return dict_tree


load_yaml_files = load_files()
inventoty_data = load_yaml_files
inventoty_data['./y.yaml']["lev1"]["lev2"][0]["new_item"]="test"
new_dict=inventoty_data["./y.yaml"]

dst_file="dest.yaml"
with open(dst_file, 'w') as yaml_file:
    yaml.dump(new_dict, yaml_file)

Output:

---
lev1:
  lev2:
    - name: schema
      templates:
        - name: temp1
          deploy_onpr: "yes"
          deploy_cl: "no"
      new_item: test
Anthon
ovntatar
  • You can accept your own answer if that solves the problem. Instead of writing 'thanks' in an answer (or a question), you should consider upvoting any comment or answer that helped you to show your appreciation. – Anthon Apr 30 '22 at 16:30