Regular Expressions and exec()

The regular expression I wrote works. But I don’t know how to get the results out of the array returned by exec().

Take this text:

<p>paragraph 1</p>
<p>paragraph 2</p>
<p>paragraph 3</p>

What I’d like to get back from the array returned by exec() is the following:

paragraph 1
paragraph 2
paragraph 3

My code looks like this:

var text = '<p>paragraph 1</p>';
text += '<p>paragraph 2</p>';
text += '<p>paragraph 3</p>';

var re = /(?:<p[^>]*>)(.*?)(?:<\/p>)/gi;
var arrMatch = [];
var match;

while (match = re.exec(text)) {
    arrMatch.push(match[0]);
}

My hope for the code is this:

1) Match but don’t capture <p> and </p>. Capture whatever comes between them, however. Each capture forms a group.

2) Loop through matches and grab the 0th group (the stuff between the paragraph tags). In this case, there would be three matches, each having a group of one member (the 0th group).

However, what I’m getting back doesn’t give me this at all. I’m not really sure what it gives back, actually. The 0th group sometimes seems to be the entire match including paragraph tags, and sometimes it’s something else.

I can’t seem to formulate a Google search that would give me back an example of how to proceed.

Is there any concept of capture groups or match collections in JavaScript? Can someone help?!

Thanks.

–Brent

3 Comments

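@mrhoo (Nov 06, 2009): Element 0 of the array returned by exec() is always the entire matched text, tags included.

Each pair of capturing parentheses adds one more element, starting at index 1, so the text between the tags is match[1].

Also, it is safer to compare the result of exec() with null explicitly in the while condition.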

[CODE]var text = '<p>paragraph 1</p>';
text += '<p>paragraph 2</p>';
text += '<p>paragraph 3</p>';
// ([^<]+) captures the text after the opening tag, up to the next "<"
var match, arrMatch = [], rx = /<p[^>]*>([^<]+)/g;
while ((match = rx.exec(text)) != null) {
    arrMatch.push(match[1]); // index 1 is the first capture group
}

alert(arrMatch.join('\n'));[/CODE]
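
To make the difference between index 0 and index 1 concrete, here is a minimal sketch of what exec() hands back on each pass of the loop. It assumes the three-paragraph sample text from the question. The names `wholeMatches` and `captures` are made up for illustration, and the lazy `(.*?)` capture is just one possible pattern, not the only one that works.

```javascript
// Sample text from the question, joined into one string.
var text = '<p>paragraph 1</p><p>paragraph 2</p><p>paragraph 3</p>';

// With the g flag, each exec() call resumes from re.lastIndex.
var re = /<p[^>]*>(.*?)<\/p>/gi;

var wholeMatches = []; // hypothetical name: collects match[0]
var captures = [];     // hypothetical name: collects match[1]
var match;

while ((match = re.exec(text)) !== null) {
    wholeMatches.push(match[0]); // entire matched text, tags included
    captures.push(match[1]);     // first parenthesized group only
}

console.log(wholeMatches.join('\n'));
console.log(captures.join('\n'));
```

The first log prints the paragraphs with their tags and the second prints only the text between them, which is the result the question asks for.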

@rnd_me (Nov 07, 2009): i often find it simpler to cheat using String.replace().

replace accepts functions as the second argument, which really opens up a lot of potential.

it moves the focus from esoteric regexp methods and properties to generic function code, which i think is easier to use/edit.

it also simplifies the regexps themselves, which can improve performance in many cases.

It also allows more re-use: since the logic lives in a function rather than being hard-coded inline, it can be named and called from more than one place.



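[CODE]

var s = "<p>paragraph 1</p>\n" +
        "<p>paragraph 2</p>\n" +
        "<p>paragraph 3</p> ";

var cRay = []; // holder for replace hack

// replace() calls the function once per match: total is the whole match,
// sub is the first capture group. The string replace() returns is ignored.
s.replace(/<p>([\w\W]+?)<\/p>/g,
    function (total, sub) { cRay.push(sub); }
);

alert(cRay); // shows: "paragraph 1,paragraph 2,paragraph 3"[/CODE]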


the only downside is that you can't do the interactive-style step-through that a global RegExp allows with exec(), but i don't see that used often anyway...
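
The re-use point can be sketched as a small helper built on the same replace() trick. The function name `collectCaptures` is hypothetical, not something from this thread, and the sketch assumes the pattern passed in carries the g flag and has at least one capture group.

```javascript
// Hypothetical helper: returns the first capture group of every match.
// Assumes `pattern` has the g flag and at least one capture group.
function collectCaptures(input, pattern) {
    var found = [];
    // replace() invokes the callback once per match; its arguments are
    // the whole match followed by each captured group.
    input.replace(pattern, function (whole, firstGroup) {
        found.push(firstGroup);
        return whole; // leave the text unchanged; the result is discarded
    });
    return found;
}

var html = '<p>paragraph 1</p>\n<p>paragraph 2</p>\n<p>paragraph 3</p>';
var paragraphs = collectCaptures(html, /<p>([\w\W]+?)<\/p>/g);
console.log(paragraphs.join(', ')); // paragraph 1, paragraph 2, paragraph 3
```

Because the callback is wrapped in a named function, the same helper can be called with a different pattern elsewhere without repeating the collection logic.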

@BJBigler (author, Nov 12, 2009): Thanks for your help! I got it working using both suggestions, although the second one, which I don't quite understand yet, seems to work better in Firefox.