How to use Regex in JavaScript — Part 3
In this article we will learn how to use repetition in our regex patterns. If you are not yet comfortable with basic regex patterns, feel free to jump to my previous articles, in which we covered the basics and character sets.
Let’s dive in now…
Using Repetition in Regex patterns
We can use the following metacharacters (also called quantifiers) to check repetition in a pattern:
- “+” : Matches one or more occurrences
- “?” : Matches zero or one occurrence
- “*” : Matches zero or more occurrences
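To make the three metacharacters above concrete, here is a minimal sketch. The sample string and variable names are assumptions chosen only for illustration; each pattern differs only in the quantifier placed after the letter "u".

```javascript
// Hypothetical sample string, chosen only for illustration.
const sample = "color colour colouur";

// "+" : one or more "u" between "colo" and "r"
const oneOrMore = sample.match(/colou+r/g);

// "?" : zero or one "u" between "colo" and "r"
const zeroOrOne = sample.match(/colou?r/g);

// "*" : zero or more "u" between "colo" and "r"
const zeroOrMore = sample.match(/colou*r/g);

console.log(oneOrMore);  // ["colour", "colouur"]
console.log(zeroOrOne);  // ["color", "colour"]
console.log(zeroOrMore); // ["color", "colour", "colouur"]
```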
If we use “?” instead of “+” in a pattern like /[A-Z]+/g, we get a lot of results, since “?” matches zero occurrences as well. An empty match is possible at every position in the string, so we are basically matching everywhere.
The same is the case with “*”, which matches zero or more occurrences.
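The effect of matching zero occurrences is easier to see on a very short input. The sketch below assumes a hypothetical three-letter string and uses String.prototype.match to collect every match, including the empty ones.

```javascript
// Hypothetical short input with a single uppercase letter.
const word = "aBc";

// "+" needs at least one uppercase letter, so only "B" matches.
const plusMatches = word.match(/[A-Z]+/g);

// "?" and "*" also accept zero letters, so an empty match is
// reported at every position where no uppercase letter is found,
// including the very end of the string.
const optionalMatches = word.match(/[A-Z]?/g);
const starMatches = word.match(/[A-Z]*/g);

console.log(plusMatches);     // ["B"]
console.log(optionalMatches); // ["", "B", "", ""]
console.log(starMatches);     // ["", "B", "", ""]
```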
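However, if we run these patterns with test() and exec() in JavaScript, we get the following. Note that a regex with the g flag remembers where its last match ended (in its lastIndex property), so exec() continues from where test() stopped: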
let txt = `Betty Botter bought a bit of butter. "But," she said, "the butter's bitter. If I put it in my batter, it will make my batter bitter. But, a bit of better butter will make my batter better.`
let regex1 = /[A-Z]?/g // zero or one occurrence
let regex2 = /[A-Z]*/g // zero or more occurrences
// returns true since a match is found ("B" at index 0); lastIndex is now 1
console.log(regex1.test(txt)) // true
// returns the next match, searching from lastIndex: an empty match at index 1
console.log(regex1.exec(txt))
/* ['',
index: 1,
input: 'Betty Botter bought a bit of butter.
\"But,\" she said, \"the butter's bitter.
If I put it in my batter, it will make my
batter bitter. But, a bit of better butter
will make my batter better',
groups: undefined]
*/
// returns true since a match is found ("B" at index 0); lastIndex is now 1
console.log(regex2.test(txt)) // true
// returns the next match, searching from lastIndex: an empty match at index 1
console.log(regex2.exec(txt))
/* ['',
index: 1,
input: 'Betty Botter bought a bit of butter.
\"But,\" she said, \"the butter's bitter.
If I put it in my batter, it will make my
batter bitter. But, a bit of better butter
will make my batter better',
groups: undefined]
*/
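The index: 1 in both outputs comes from the g flag, not from the quantifier itself. Here is a small sketch of that behaviour, assuming a shortened version of the input string; resetting lastIndex to 0 makes exec() start from the beginning again.

```javascript
// Shortened, hypothetical input for illustration.
const text = "Betty Botter";
const re = /[A-Z]?/g;

// test() matches "B" at index 0 and moves lastIndex to 1.
const found = re.test(text);
const indexAfterTest = re.lastIndex;

// exec() resumes at lastIndex, where "e" is not uppercase,
// so it returns an empty match at index 1.
const resumed = re.exec(text);

// Resetting lastIndex makes the search start over.
re.lastIndex = 0;
const fromStart = re.exec(text);

console.log(found, indexAfterTest);           // true 1
console.log(resumed[0], resumed.index);       // "" 1
console.log(fromStart[0], fromStart.index);   // "B" 0
```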
A practical use of the “+” metacharacter is when we want each match to take in as many repeated characters as possible (greedy), for instance a whole run of the same letter instead of one letter at a time.
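As a rough illustration of “+” being greedy, the sketch below assumes a hypothetical sentence and compares matching a single "o" with matching a run of them.

```javascript
// Hypothetical sentence; "sooooo" contains five consecutive "o" characters.
const sentence = "Wow, that is sooooo good";

// Without a quantifier, every "o" is a separate match.
const singles = sentence.match(/o/g);

// With "+", each run of "o" is captured as one match,
// and the match extends as far as the run goes.
const runs = sentence.match(/o+/g);

console.log(singles.length); // 8
console.log(runs);           // ["o", "ooooo", "oo"]
```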
Consider a case where we want to find how many times a word is mentioned in a string, whether in its singular or plural form.
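A small sketch of such a count, assuming a hypothetical sentence and the word "orange" (both are illustrative choices, not taken from a real data set):

```javascript
// Hypothetical input string for illustration.
const basket = "I bought an orange, then two oranges, and later one more orange.";

// "s?" makes the trailing "s" optional, so both forms are found.
// match() returns null when nothing matches, hence the fallback to [].
const mentions = basket.match(/oranges?/g) || [];

console.log(mentions);        // ["orange", "oranges", "orange"]
console.log(mentions.length); // 3
```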
With a pattern such as /oranges?/g, we find matches for both “orange” and “oranges”, since the character “s” followed by “?” can have zero or one occurrence after the characters “orange”.
Quantifiers in regex are greedy by default: they try to match as much as possible. Let’s learn more about it in the next section.
Greediness and Laziness in Regex
Consider a case where we want to find matches for the content starting from <p> tag and ending with a </p> tag in an HTML string.
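A minimal sketch of this case, assuming a hypothetical HTML string with two paragraphs and a pattern built from “.*” between the tags:

```javascript
// Hypothetical HTML string with two paragraphs.
const html = "<p>First</p><p>Second</p>";

// ".*" is greedy: it runs to the end of the string and then
// backtracks to the LAST "</p>" it can find.
const greedy = html.match(/<p>.*<\/p>/g);

console.log(greedy.length); // 1
console.log(greedy[0]);     // "<p>First</p><p>Second</p>"
```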
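Since quantifiers are greedy by default, “.*” first matches everything up to the end of the string and then backtracks until it finds a closing </p> tag, which is the last one in the string. Thus, for a string with two paragraphs, we only get a single match containing two </p> tags instead of one match per paragraph.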
However, we can make this expression lazy by adding “?” right after the quantifier, for example “.*?”.
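Here is a sketch of the lazy variant, again assuming a hypothetical two-paragraph HTML string; the only change from a greedy pattern is the “?” after “.*”.

```javascript
// Hypothetical HTML string with two paragraphs.
const html = "<p>First</p><p>Second</p>";

// ".*?" is lazy: it matches as little as possible, so each match
// ends at the FIRST "</p>" that follows its opening "<p>".
const lazy = html.match(/<p>.*?<\/p>/g);

console.log(lazy); // ["<p>First</p>", "<p>Second</p>"]
```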
With the lazy version, the engine extends the match one character at a time, checking after each step whether </p> can follow. As soon as it encounters the first occurrence of </p>, it returns the match. The engine never has to run to the end of the string and backtrack from there, and we get the expected result: one match per paragraph.
Specifying Repetition Amount
Till now we have only seen how we can match zero, one or more occurrences. In addition to those, we can also specify the repetition amount within curly braces in a regex pattern in the following ways:
- {min,max} : We specify the minimum and the maximum number of occurrences. Note that there is no space after the comma.
- {min} : We specify exactly how many times the character should occur.
- {min,} : We specify the minimum number of times it should occur; it can occur more often than that, with no upper limit set.
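The three forms can be sketched as follows. The input string is a hypothetical list of numbers, and the \b word boundaries are an assumption added so that each pattern has to match a whole number rather than part of a longer one.

```javascript
// Hypothetical input: numbers with one to five digits.
const numbers = "1 12 123 1234 12345";

// {min,max} : between two and three digits
const twoToThree = numbers.match(/\b\d{2,3}\b/g);

// {min} : exactly four digits
const exactlyFour = numbers.match(/\b\d{4}\b/g);

// {min,} : three digits or more, no upper limit
const threeOrMore = numbers.match(/\b\d{3,}\b/g);

// With a space after the comma (and no "u" flag), the braces are
// treated as literal text instead of a quantifier.
const withSpace = /\d{2, 3}/.test("12");

console.log(twoToThree);  // ["12", "123"]
console.log(exactlyFour); // ["1234"]
console.log(threeOrMore); // ["123", "1234", "12345"]
console.log(withSpace);   // false
```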
That’s all for this article, hope you found this informative. In the next part we will learn how to use anchored expressions.
Thanks for giving this a read, stay hooked for the next part ;)