clubmate.fi

A good[ish] website

Web development blog, loads of UI and JavaScript topics

Match a string between two “markers” using regex

Filed under: JavaScript. Tagged with: regex

Here’s a short guide to lookahead and lookbehind assertions, and how to use them to match everything between two points.

Match between “markers”

I recently needed a regex that matches everything between two points: between a start marker and a stop marker. Something like the following (the lines between the two marker comments should be matched):

import React from 'react'

// start-match
const Foo = () => {
  return 'Everything between here should be matched'
}
// end-match

export default Foo

Here’s a helper function that uses lookbehind and lookahead assertions to accomplish that:

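const matcher = (string = '') => {
  const match = string.match(/(?<=\/\/ ?start-match\n)[\s\S]*(?=\/\/ ?end-match)/)
  // match() returns null when a marker is missing, so fall back to an empty string
  return match ? match[0] : ''
}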
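To see the helper in action, here is a small sketch that runs it against a copy of the sample file from above. The `source` string is just that sample inlined as a template literal, and the empty-string fallback for a missing marker is an assumption on my part, not something the regex itself requires.

```javascript
// Same regex as the helper above, repeated so the snippet runs on its own.
const matcher = (string = '') => {
  const found = string.match(/(?<=\/\/ ?start-match\n)[\s\S]*(?=\/\/ ?end-match)/)
  // match() returns null when a marker is missing; the '' fallback is an assumption
  return found ? found[0] : ''
}

// A copy of the sample file, inlined as a template literal
const source = `import React from 'react'

// start-match
const Foo = () => {
  return 'Everything between here should be matched'
}
// end-match

export default Foo
`

// Logs the three lines of the Foo component, without the marker comments
console.log(matcher(source))

// Without markers there is nothing to match
console.log(matcher('no markers here') === '') // true
```

Note that the match keeps the line break right before the end marker, so you may want to call trim() on the result.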

Detailed explanation

Lookbehind and lookahead assertions never include themselves in the match, which is exactly what we want.

                 A multiline match zero or
                        more times
  Lookbehind ─┐             │          ┌──Lookahead
   assertion  │             │          │  assertion
 ┌────────────┴─────────┐┌──┴──┐┌──────┴──────────┐
/(?<=\/\/ ?start-match\n)[\s\S]*(?=\/\/ ?end-match)/

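When I first tinkered with this, I wondered why the lookbehind comes first. Shouldn’t it come last? No, because [\s\S]* is the point of reference, not the assertion: the lookbehind checks what sits behind (before) the match, so it’s written before it, and the lookahead checks what lies ahead (after) the match, so it’s written after it.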
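A tiny sketch can make the ordering easier to see. The `text` string and the "price" example are made up for illustration; the point is only that the lookbehind is written before the pattern, the lookahead after it, and neither ends up in the result.

```javascript
// Hypothetical input, chosen only to show the order of the pieces
const text = 'price: 42 EUR'

// lookbehind, then the pattern that actually gets matched, then lookahead
const result = text.match(/(?<=price: )\d+(?= EUR)/)

console.log(result[0])    // '42'
console.log(result.index) // 7, the match starts right after "price: "
```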

  • [\s\S]*: a character class that combines the whitespace shorthand \s with the non-whitespace shorthand \S, so it matches any character at all, zero or more times. This is a way of doing multiline matches. You’d think you could do multiline matches with .*, but by default . doesn’t match line breaks (it only does so with the s flag).
  • (?<=\/\/ ?start-match\n): a lookbehind, so [\s\S]* only matches if it’s preceded by "// start-match" (the space is optional) and a line break.
  • (?=\/\/ ?end-match): a lookahead, so [\s\S]* only matches if it’s followed by "// end-match" (again with an optional space). Because * is greedy, the match runs up to the last end marker in the string.
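The difference between . and [\s\S] is easy to check in a console. This is a minimal sketch with a made-up two-line string; the s (dotAll) flag in the last line isn’t used in the regex above, it’s only there to show another option available in ES2018 and later.

```javascript
const twoLines = 'first line\nsecond line'

// . stops at the line break
const withDot = twoLines.match(/.*/)[0]
// [\s\S] matches any character, line breaks included
const withClass = twoLines.match(/[\s\S]*/)[0]
// with the s (dotAll) flag, . matches line breaks too
const withDotAll = twoLines.match(/.*/s)[0]

console.log(withDot === 'first line') // true
console.log(withClass === twoLines)   // true
console.log(withDotAll === twoLines)  // true
```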

In this Stack Overflow thread there is some interesting discussion about multiline matches. You could also use [^] (which works in JavaScript but not in most other regex flavors), or (.|\r|\n), or (.|[\r\n]).
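As a rough check that these alternatives behave the same here, the sketch below builds the marker regex once per variant and compares the results. The `sample` string is made up, and I’ve written the grouped variants as non-capturing groups, (?:…), which is my own tweak to keep the match result free of extra captures.

```javascript
const sample = '// start-match\nline one\nline two\n// end-match\n'

// Each of these stands in for "any character, line breaks included"
const anyChar = ['[\\s\\S]', '[^]', '(?:.|\\r|\\n)', '(?:.|[\\r\\n])']

const results = anyChar.map((any) => {
  // No need to escape the slashes when the regex is built from a string
  const re = new RegExp(`(?<=// ?start-match\\n)${any}*(?=// ?end-match)`)
  return sample.match(re)[0]
})

console.log(results.every((r) => r === 'line one\nline two\n')) // true
```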

Conclusions

There are also negative lookahead and negative lookbehind assertions, which succeed only when the given pattern is not there. See MDN for more details. If you’re new to regex, check my beginner-friendly article on the topic.
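For a quick taste of the negative variants, here is a small sketch. The `sizes` and `prices` strings are made up for illustration, and the patterns are just one way to write these checks.

```javascript
// Negative lookahead (?!...): numbers NOT followed by "px".
// The \d inside the lookahead stops the regex from matching just the "1" of "10px".
const sizes = '10px 25em 30px'
const notPx = sizes.match(/\d+(?!\d|px)/g)

// Negative lookbehind (?<!...): numbers NOT preceded by a dollar sign.
// The \b stops the regex from matching just the "0" of "$10".
const prices = '$10 €20 $30'
const notDollar = prices.match(/(?<!\$)\b\d+/g)

console.log(notPx)     // [ '25' ]
console.log(notDollar) // [ '20' ]
```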


  • © 2022 Antti Hiljá