2

How do I create a list of strings in Python when the strings can contain commas, so that the lines aren't split on those commas prematurely?

I am reading in a template CSV file created by MS Excel. It has 2 columns: col1 = color, col2 = a sentence that can contain a comma. The content is as follows:

Light Green,"Matches type, instance, etc."
Red,Happens in BD10 but not BD20
Blue,"Instance and time matches, type doesn't match"
Yellow,Caught on replay

When I read the file in and print what was read, each input line is broken on every comma.

#my code snippet, please excuse any typos during sanitization
with open('outfile.csv','r') as infile:
   blah = infile.readlines()
   for i in blah:
      line = i.strip().split(",")
      print line

My output:
['Light Green', '"Matches type', ' instance', ' etc."']
['Red', 'Happens in BD10 but not BD20']
['Blue', '"Instance and time matches', ' type doesn\'t match"']
['Yellow', 'Caught on replay']

How do I tell Python to ignore commas when they're inside "" and break at commas all other times?

Henry Ecker
Classified

3 Answers

4

Python provides tools to parse CSV files; you should use them:

import csv
print(list(csv.reader(open("outfile.csv", "r"))))

maybe?

Joran Beasley
4

The csv module makes this easy as pie.

Given:

$ cat /tmp/so.csv
Light Green,"Matches type, instance, etc."
Red,Happens in BD10 but not BD20
Blue,"Instance and time matches, type doesn't match"
Yellow,Caught on replay

Try:

import csv
with open('/tmp/so.csv') as f:
    for line in csv.reader(f):
        print line    

Prints:

['Light Green', 'Matches type, instance, etc.']
['Red', 'Happens in BD10 but not BD20']
['Blue', "Instance and time matches, type doesn't match"]
['Yellow', 'Caught on replay']
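
On Python 3 the same approach might look like the sketch below. It assumes the sample rows from the question; `io.StringIO` stands in for the file so the snippet is self-contained, and `parse_rows` is a hypothetical helper name. With a real file on Python 3, the csv documentation recommends opening it with `newline=''`.

```python
import csv
import io

# Sample data from the question, held in memory instead of /tmp/so.csv.
SAMPLE = '''Light Green,"Matches type, instance, etc."
Red,Happens in BD10 but not BD20
Blue,"Instance and time matches, type doesn't match"
Yellow,Caught on replay
'''

def parse_rows(f):
    """Return every row of an open CSV file object as a list of strings."""
    return list(csv.reader(f))

rows = parse_rows(io.StringIO(SAMPLE))
for row in rows:
    print(row)

# With a file on disk (Python 3):
# with open('/tmp/so.csv', newline='') as f:
#     rows = parse_rows(f)
```

Each row comes back as a two-item list, with the quoted commas kept inside the second field and the surrounding quotes removed.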
dawg
2

Stop the split after the first comma:

line = i.strip().split(",", 1)
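
A quick sketch of what `maxsplit=1` does with two of the question's sample lines. Unlike `csv.reader`, a plain split leaves the surrounding double quotes in the second field, so this sketch strips them by hand. `split_first_comma` is a hypothetical helper name, and the code assumes every line contains at least one comma and no escaped quotes.

```python
def split_first_comma(line):
    """Split on the first comma only, then drop quotes wrapping the remainder."""
    color, sentence = line.strip().split(",", 1)
    return [color, sentence.strip('"')]

lines = [
    'Light Green,"Matches type, instance, etc."',
    'Red,Happens in BD10 but not BD20',
]
result = [split_first_comma(l) for l in lines]
for fields in result:
    print(fields)
```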
Brent Washburne
  • @brent washburne, this solution IS cool. Just to be comprehensive, what if the sentence is in col1 and color is in col2, how would I tell python to not break on commas inside the sentence since your trick wouldn't work due to the breaking comma not being in the first occurrence? – Classified Sep 10 '15 at 00:11
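
For the reversed layout asked about in the comment, here is a hedged sketch. `csv.reader` honours the quotes whichever column they appear in, and `rsplit(",", 1)` is the mirror image of the `split(",", 1)` trick, splitting at the last comma instead of the first. The reversed rows are made up for illustration from the question's data.

```python
import csv
import io

# Hypothetical reversed layout: sentence in col1, color in col2.
REVERSED = '''"Matches type, instance, etc.",Light Green
Happens in BD10 but not BD20,Red
'''

# csv.reader handles quoted commas in any column.
csv_rows = list(csv.reader(io.StringIO(REVERSED)))

# rsplit with maxsplit=1 splits at the last comma only;
# the wrapping quotes have to be stripped by hand.
rsplit_rows = []
for line in REVERSED.splitlines():
    sentence, color = line.rsplit(",", 1)
    rsplit_rows.append([sentence.strip('"'), color])

print(csv_rows)
print(rsplit_rows)
```

The `rsplit` version only works while the last column never contains a comma; `csv.reader` has no such restriction.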