This has come up a few times recently, so a little note:
`/./` will match any character except a newline. `/[\s\S]/`, `/[\w\W]/` and so forth are the only way to portably match every character including newlines.
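As a quick sanity check, here is a minimal sketch using Python's `re` module. Python is not one of the engines discussed in this note; it is only assumed here as a convenient stand-in for the usual Perl-style behaviour, and the sample string is made up.

```python
import re

text = "a\nb"

# With no flags, `.` skips the newline sitting between "a" and "b".
dot_matches = re.findall(r".", text)

# A class paired with its own negation covers every character, newline included.
portable_matches = re.findall(r"[\s\S]", text)

print(dot_matches)       # ['a', 'b']
print(portable_matches)  # ['a', '\n', 'b']
```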
what about `/./m`?
`/./m` does not match a newline (usually; see below). The `m` modifier does not alter `.` in any way (usually; see below).
`m` changes what `^` and `$` match: normally they’ll match only the start and end of the string, but `m` alters them to also match the start and end of any line, i.e. immediately after and immediately before a newline.
Mnemonic: `m` for “multiline”.
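To see the anchors move while the dot stays put, here is a small sketch in Python, assuming its `re.MULTILINE` flag as the equivalent of `m` (Python follows the Perl-style semantics described above, not Ruby's). The sample string is invented for illustration.

```python
import re

text = "b\na"

# Without the flag, ^ anchors only at the very start of the string,
# so the "a" on the second line is not found.
start_only = re.search(r"^a", text)

# With MULTILINE, ^ also matches immediately after a newline.
per_line = re.search(r"^a", text, re.MULTILINE)

# The flag leaves `.` alone: the newline is still skipped.
dot_with_m = re.findall(r".", text, re.MULTILINE)

print(start_only)        # None
print(per_line.start())  # 2
print(dot_with_m)        # ['b', 'a']
```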
what about `/./s`?
Yes, that will do what you want—unless you’re using Ruby or JavaScript.
In Ruby, the `s` modifier is inexplicably unused: in 1.8, it stays on the `Regexp` object but in no way affects matching, and in 1.9, it doesn’t even stay on the object; it just disappears. Note that these behaviours are different to using a complete nonsense modifier, like `f`, which causes a `SyntaxError` [1].
Even more inexplicable, this feature has been rolled into `m` instead! So in Ruby, you can use `/./m` to match a newline, and you can also use `/^a/m` to match an `a` at the beginning of the string, or after any newline in the string.
In JavaScript, the `s` modifier was long absent entirely, and it’s not rolled into anything else (ES2018 eventually added it as the `dotAll` flag). If you need to support older engines, use `/[\s\S]/`.
Mnemonic: `s` for “single line” (the string is treated as one line, in a twisted sense).
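For an engine where `s` behaves as described, a minimal sketch in Python follows, assuming `re.DOTALL` (or the inline `(?s)` form) as its spelling of the `s` modifier. The sample string is made up.

```python
import re

text = "a\nb"

# Default behaviour: the newline is skipped.
plain = re.findall(r".", text)

# DOTALL is Python's name for the `s` modifier: `.` now matches the newline too.
dotall = re.findall(r".", text, re.DOTALL)

# The same flag can be written inline at the start of the pattern.
inline = re.findall(r"(?s).", text)

# With the flag set, a pattern can span the line break.
spanning = re.search(r"a.b", text, re.DOTALL)

print(plain)             # ['a', 'b']
print(dotall)            # ['a', '\n', 'b']
print(inline == dotall)  # True
print(spanning is not None)  # True
```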
conclusion
Yay, regular expressions. Be sure what you mean.
To match any character including newlines:
- In JavaScript engines without the ES2018 `s` flag, or any other language lacking the `s` modifier, use `/[\s\S]/`.
- In Ruby, use `/./m`, but be aware that `^` and `$` also match at line boundaries there (in Ruby they do so with or without `m`). If unsure, use `/[\s\S]/`.
- If you have true PCREs, you may safely use `/./s`.
Note that this is often what you really want—rarely do you want to explicitly match every character except newlines. If you do that on purpose, at least leave a comment to that effect, otherwise your coworkers will just assume you didn’t know what you were doing.
[1] Or just gets thrown away if you use `Regexp.new` directly.