-2

Possible Duplicate:
Split line with perl

I have a line:

regizor: Betty Thomas Distribuţia: Sandra Bullock (Gwen Cummings) Viggo Mortensen (Eddie Boone) Dominic West (Jasper) rendező: David Mamet, Robert Elswit szereplő(k): Chiwetel Ejiofor (Mike Terry) Alice Braga (Sondra Terry) Emily Mortimer (Laura Black)

I want to split it with Perl into:

regizor: Betty Thomas
Distribuţia: Sandra Bullock (Gwen Cummings) Viggo Mortensen (Eddie Boone) Dominic West (Jasper)
rendező: David Mamet, Robert Elswit
szereplő(k): Chiwetel Ejiofor (Mike Terry) Alice Braga (Sondra Terry) Emily Mortimer (Laura Black)
user935420

3 Answers

7

How about:

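my @splitBits = split /\s+(?=\S+: )/, $str;

This splits the string at the whitespace that precedes every "word" (a sequence of non-space characters) followed by a colon and a space, so no empty field is produced at the beginning. The \s+ is needed: a bare lookahead (?=\S+: ) also matches at every position inside a label, so split would break each label apart character by character.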
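To see what the split returns, here is a small self-contained sketch. It assumes the line is held in $str (shortened here to three labels), and it uses a variant of the pattern that consumes the whitespace before each label, so a label is never cut in the middle. The name @fields is only illustrative.

```perl
use strict;
use warnings;
use utf8;                              # the source contains non-ASCII literals

binmode(STDOUT, ':encoding(UTF-8)');   # print the decoded characters as UTF-8

# Shortened sample input (assumption: the real line has the same shape).
my $str = 'regizor: Betty Thomas Distribuţia: Sandra Bullock (Gwen Cummings) rendező: David Mamet, Robert Elswit';

# Split at whitespace that is followed by a label: non-space characters,
# then a colon and a space. The lookahead keeps the label in the next field.
my @fields = split /\s+(?=\S+: )/, $str;

print "$_\n" for @fields;
```

Because the whitespace itself is the separator, the fields come back without trailing spaces, and nothing is matched at position 0, so there is no empty first field.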

jwodder
5

You could use the following regex:

$line =~ s/(\S+:)/\n$1/sg;

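This says: "Find a run of one or more non-space characters (\S+) followed by a colon, and stick a newline in front of it." The /s modifier has no effect here, because the pattern contains no dot.

You'll get a leading newline, plus a trailing space on every line but the last; both are easy to strip.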

When I ran it on your line, I got:

regizor: Betty Thomas 
Distribuţia: Sandra Bullock (Gwen Cummings) Viggo Mortensen (Eddie Boone) Dominic West (Jasper) 
rendező: David Mamet, Robert Elswit 
szereplő(k): Chiwetel Ejiofor (Mike Terry) Alice Braga (Sondra Terry) Emily Mortimer (Laura Black)
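
The leading newline, and the space left at the end of each line, can be removed with two more substitutions. This is a minimal sketch, assuming the text is in $line as in the answer; the sample string is shortened, and the two cleanup patterns are one possible choice rather than part of the original answer.

```perl
use strict;
use warnings;
use utf8;                              # the source contains non-ASCII literals

binmode(STDOUT, ':encoding(UTF-8)');   # print the decoded characters as UTF-8

# Shortened sample input (assumption: the real line has the same shape).
my $line = 'regizor: Betty Thomas Distribuţia: Sandra Bullock (Gwen Cummings) rendező: David Mamet, Robert Elswit';

# The substitution from the answer: put a newline before every label.
$line =~ s/(\S+:)/\n$1/sg;

# Cleanup: drop the newline that was inserted before the first label...
$line =~ s/\A\n//;
# ...and the spaces or tabs left at the end of each line (/m lets $ match per line).
$line =~ s/[ \t]+$//mg;

print "$line\n";
```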
gatlin
2
perl -p -e 's/ ([^ ]*?:)/\n$1/g' <file.txt

Gives:

regizor: Betty Thomas
Distribuţia: Sandra Bullock (Gwen Cummings) Viggo Mortensen (Eddie Boone) Dominic West (Jasper)
rendező: David Mamet, Robert Elswit
szereplő(k): Chiwetel Ejiofor (Mike Terry) Alice Braga (Sondra Terry) Emily Mortimer (Laura Black)
Francisco R