Match URLs with regex

A regex that captures http and https URLs in free text. Covers query strings, fragments, and ports, with notes on punycode (IDN) hostnames.

# pattern

/https?:\/\/[\w./?=&%#:-]+/gi


# how it works

Matches `http` or `https` (the `s?` makes the s optional), then `://`, then a run of one or more URL-safe characters. The character class accepts letters, digits, and underscores (all via `\w`), plus dots, slashes, question marks, equals signs, ampersands, percent signs, hashes, colons, and hyphens, which covers query strings, fragments, and port numbers.

# sample input

See https://bytefork.tools and http://example.com/path?q=1#section or check https://docs.python.org:8080/3/library/re.html for details.
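
As a quick check, the pattern can be run over the sample input with `String.prototype.matchAll`. This is a minimal sketch; the names `URL_PATTERN` and `findUrls` are illustrative and not part of the original page.

```javascript
// The pattern from this page; the slashes after the scheme are escaped
// so the expression is a valid JavaScript regex literal.
const URL_PATTERN = /https?:\/\/[\w./?=&%#:-]+/gi;

// Hypothetical helper: returns every match in the text as an array of strings.
function findUrls(text) {
  return Array.from(text.matchAll(URL_PATTERN), (m) => m[0]);
}

const sample =
  "See https://bytefork.tools and http://example.com/path?q=1#section " +
  "or check https://docs.python.org:8080/3/library/re.html for details.";

console.log(findUrls(sample));
// → [ 'https://bytefork.tools',
//     'http://example.com/path?q=1#section',
//     'https://docs.python.org:8080/3/library/re.html' ]
```

Each match stops at the first character outside the class, which in this sample is always a space.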

# pitfalls

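Matches a trailing period when the URL ends a sentence, because `.` is in the character class. Append a negative lookbehind such as `(?<!\.)` to the pattern, or trim the punctuation after matching.
Does not validate that the URL is well-formed. For that, pass each match to `new URL(match)` and catch the exception it throws on invalid input.
Internationalized (IDN) domains match only in their punycode form (xn--...). Schemes and hostnames are case-insensitive, so keep the `i` flag on.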
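
The first two pitfalls can be handled together: trim likely sentence punctuation, then let the `URL` constructor reject anything malformed. The sketch below is one possible approach, not the page's own code; `extractValidUrls` and `URL_PATTERN_NO_TRAILING` are hypothetical names, and the set of trimmed characters (`.`, `?`, `:`) is an assumption about what usually ends a sentence.

```javascript
const URL_PATTERN = /https?:\/\/[\w./?=&%#:-]+/gi;

// Hypothetical helper: match, trim trailing punctuation, keep only parseable URLs.
function extractValidUrls(text) {
  const urls = [];
  for (const m of text.matchAll(URL_PATTERN)) {
    // Assumption: a trailing '.', '?' or ':' belongs to the sentence, not the URL.
    const candidate = m[0].replace(/[.?:]+$/, "");
    try {
      new URL(candidate); // throws a TypeError if the string cannot be parsed
      urls.push(candidate);
    } catch {
      // Malformed candidate: skip it.
    }
  }
  return urls;
}

// Lookbehind variant: the match itself may not end in '.', '?' or ':'.
const URL_PATTERN_NO_TRAILING = /https?:\/\/[\w./?=&%#:-]+(?<![.?:])/gi;

const text = "Docs at https://example.com/a.html. Broken: http://:80 end";
console.log(extractValidUrls(text));
// → [ 'https://example.com/a.html' ]
console.log(text.match(URL_PATTERN_NO_TRAILING));
// → [ 'https://example.com/a.html', 'http://:80' ]
```

The lookbehind variant fixes the trailing period only. It still returns `http://:80`, which has no host, so the `new URL` check remains useful when well-formedness matters.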
