Regex Needed To Split Javascript String On "|" But Not "\|"
Solution 1:
Instead of a split, do a global match (the same way a lexical analyzer would):
- match anything other than a backslash (\) or a pipe (|)
- or match any escaped char
Something like this:
var str = "1|2|3\\|4|5";
var matches = str.match(/([^\\|]|\\.)+/g);
A quick explanation: ([^\\|]|\\.) matches either any character except '\' and '|' (pattern: [^\\|]) or (pattern: |) any escaped character (pattern: \\.). The + after it tells the engine to match the preceding group one or more times, so the pattern ([^\\|]|\\.) is matched once or more. The g at the end of the regex literal tells the JavaScript regex engine to match the pattern globally instead of matching it just once.
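To try the pattern out, here is a minimal sketch that wraps it in a function (the name splitByMatch is made up for illustration). One thing to be aware of: because the pattern needs at least one character per match, empty fields such as the one in "a||b" are dropped rather than returned as empty strings.

```javascript
// Hypothetical helper name; the pattern is the one from the answer above.
function splitByMatch(str) {
  // match() returns null when nothing matches, so fall back to an empty array.
  return str.match(/([^\\|]|\\.)+/g) || [];
}

console.log(splitByMatch("1|2|3\\|4|5")); // [ '1', '2', '3\\|4', '5' ]
console.log(splitByMatch("a||b"));        // [ 'a', 'b' ] (the empty field is lost)
console.log(splitByMatch(""));            // []
```

If empty fields matter for your data, this approach is probably not the right fit as written.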
Solution 2:
What you're looking for is a "negative look-behind" regular expression.
JavaScript engines older than ES2018 do not support look-behind, so a workaround is needed. This isn't pretty, but it should break the list up for you:
var output = input.replace(/(\\)?\|/g, function($0, $1) { return $1 ? $0 : '\n'; });
This will take your input string and replace every '|' character NOT immediately preceded by a '\' character with a '\n' character; escaped pipes are left as they are. You can then split the result on '\n'.
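Engines that implement ES2018 support look-behind assertions directly, so the split can be written as a single expression. A minimal sketch, assuming such an engine; the function name splitOnUnescapedPipe is made up for illustration:

```javascript
// Hypothetical helper name; requires look-behind support (ES2018 or later).
function splitOnUnescapedPipe(str) {
  // (?<!\\) asserts that the pipe is not immediately preceded by a backslash.
  return str.split(/(?<!\\)\|/);
}

console.log(splitOnUnescapedPipe("1|2|3\\|4|5")); // [ '1', '2', '3\\|4', '5' ]
console.log(splitOnUnescapedPipe("a||b"));        // [ 'a', '', 'b' ]
```

Note that this only looks one character back, so a pipe that follows an escaped backslash (\\|) is still treated as escaped.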
Solution 3:
A regex solution was posted as I was looking into this. So I just went ahead and wrote one without it. I did some simple benchmarks and it is -slightly- faster (I expected it to be slower...).
Without using Regex, if I understood what you desire, this should do the job:
function doSplit(input) {
    var output = [];
    // Start at -1 so that a '|' at index 0 is also found.
    var currPos = -1,
        prevPos = -1;
    while ((currPos = input.indexOf('|', currPos + 1)) != -1) {
        // Skip pipes that are escaped by a preceding backslash.
        if (input[currPos - 1] == "\\") continue;
        output.push(input.slice(prevPos + 1, currPos));
        prevPos = currPos;
    }
    // Whatever follows the last unescaped pipe is the final field.
    output.push(input.slice(prevPos + 1));
    return output;
}
doSplit('1|2|3\\|4|5'); //returns [ '1', '2', '3\\|4', '5' ]