This is a RegEx question.
Thanks for any help, and please be patient, as RegEx is definitely not my strength!
Purely as background: I want to use RegEx to parse strings similar to SVG path data segments. I’ve looked for previous answers that parse both the segments and their segment attributes, but found nothing that handles the attributes properly.
Here are some example strings like the ones I need to parse:
M-11.11,-22
L.33-44
ac55 66
h77
M88 .99
Z
I need to have the strings parsed into arrays like this:
["M", -11.11, -22]
["L", .33, -44]
["ac", 55, 66]
["h", 77]
["M", 88, .99]
["Z"]
So far I have found this code in this answer: Parsing SVG "path" elements with C# - are there libraries out there to do this? The post is about C#, but the regex was also usable in JavaScript:
var argsRX = /[\s,]|(?=-)/;
var args = segment.split(argsRX);
Here's what I get:
[ "M", -11.11, -22, <empty> ]
[ "L.33", -44, <empty>, <empty> ]
[ "ac55", <empty>, <empty>, <empty>, 66, <empty> ]
[ "h77", <empty>, <empty> ]
[ "M88", .99, <empty>, <empty> ]
[ "Z", <empty> ]
Problems when using this regex:
- An unwanted empty array element is being put at the end of each string's array.
- If multiple spaces are delimiters, an unwanted empty array element is being created for each extra space.
- If a number immediately follows the opening letters, that number is being attached to the letters, but should become a separate array element.
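Each of the three problems can be reproduced in isolation. The snippet below is only a minimal illustration; the three sample strings are picked for the demo and are not taken from real path data.

```javascript
// Same splitter as above: split on whitespace or a comma, or just before a minus sign.
var argsRX = /[\s,]|(?=-)/;

// 1. A trailing delimiter leaves an empty string at the end of the array.
var trailing = "M-11.11,-22 ".split(argsRX);

// 2. Every extra space produces one extra empty string.
var doubled = "ac55  66".split(argsRX);

// 3. Nothing in the regex separates the letters from a number that follows directly.
var attached = "h77".split(argsRX);

console.log(JSON.stringify(trailing)); // ["M","-11.11","-22",""]
console.log(JSON.stringify(doubled));  // ["ac55","","66"]
console.log(JSON.stringify(attached)); // ["h77"]
```

Note that split() always returns strings, so the numbers would still need converting afterwards.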
Here are more complete definitions of incoming strings:
- Each string starts with 1 or more letters (mixed case).
- Next are zero or more numbers.
- The numbers might have minus signs (always preceding).
- The numbers might have a decimal point anywhere in the number (except the end).
- Possible delimiters are: comma, space, spaces, the minus sign.
- A comma with space(s) before or after it is also a possible delimiter.
- Even though minus signs are delimiters, they must also remain with their number.
- A number might immediately follow the opening letters (no space) and that number should be separate.
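One way these rules might be expressed is to match the wanted tokens instead of splitting on delimiters. What follows is a rough sketch under those stated rules only, not a verified solution: parseSegment and tokenRX are hypothetical names, and it assumes the numbers never use exponent notation such as 1e5 (the "e" would be read as a letter).

```javascript
// Hypothetical helper: tokenize one segment by matching letters and numbers,
// so delimiters (spaces, commas) are simply never captured.
function parseSegment(segment) {
  // Either a run of letters, or an optionally negative number whose
  // decimal point (if present) is followed by at least one digit.
  var tokenRX = /[a-z]+|-?\d*\.?\d+/ig;
  var tokens = segment.match(tokenRX) || [];
  // Assumption: the first token is the letters; everything after is numeric.
  return tokens.map(function (token, index) {
    return index === 0 ? token : parseFloat(token);
  });
}

var pathData = "M-11.11,-22 L.33-44 ac55 66 h77 M88 .99 Z";
var segments = pathData.match(/[a-z]+[^a-z]*/ig) || [];
var parsed = segments.map(parseSegment);

console.log(JSON.stringify(parsed));
// [["M",-11.11,-22],["L",0.33,-44],["ac",55,66],["h",77],["M",88,0.99],["Z"]]
```

Because nothing is split, trailing or repeated delimiters cannot create empty elements, and a number directly after the letters becomes its own token.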
Here is test code I've been using:
<!doctype html>
<html>
<head>
<link rel="stylesheet" type="text/css" media="all" href="css/reset.css" /> <!-- reset css -->
<script type="text/javascript" src="http://code.jquery.com/jquery.min.js"></script>
<style>
body{ background-color: ivory; }
</style>
<script>
$(function(){
var pathData = "M-11.11,-22 L.33-44 ac55 66 h77 M88 .99 Z";
// separate pathData into segments
var segmentRX = /[a-z]+[^a-z]*/ig;
var segments = pathData.match(segmentRX);
for(var i=0;i<segments.length;i++){
var segment=segments[i];
//console.log(segment);
var argsRX = /[\s,]|(?=-)/;
var args = segment.split(argsRX);
for(var j=0;j<args.length;j++){
var arg=args[j];
console.log(arg.length+": "+arg);
}
}
}); // end $(function(){});
</script>
</head>
<body>
</body>
</html>