I’ve been trying to learn regular expressions for years, but never had a good use for them, because the tools that used them were so obscure. Now with TextMate, I have plenty of uses.
In the blogging bundle, the preferences format was:
Blog Name http://www.blogurl.com/
The regular expression for parsing that was:
^(.+?)\s+(https?:\/\/.+)
I’ll break this down, mainly as an exercise for myself (talking helps understanding), and others can chime in. Each line starts with the piece of the pattern I am commenting on.
- ^ – Start at the beginning of the line.
- .+? – Grab at least one character, reluctantly: match as few characters as possible, so the patterns that follow get the first chance to match.
- (.+?) – Set a variable $1 to whatever is found inside the parentheses.
- \s+ – Find any whitespace (space, tab), one or more characters of it. With 0, the pattern fails.
- http – followed by http
- s? – followed by an optional s (the ? means 0 or 1 times)
- : – followed by a colon
- \/ – followed by a / (the / delimits the pattern, so we need to escape it with \)
- \/ – followed by a second /
- .+ – followed by any characters, at least one of them
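To check the breakdown above, here is a minimal sketch in Python (not the language the bundle itself uses; the function name parse_blog_line is made up for illustration). It runs the original pattern against the sample preferences line and returns the two captured groups.

```python
import re

# The original pattern, without surrounding / / delimiters.
# Python does not need / escaped, but \/ is harmless, so it is kept as written.
LINE_RE = re.compile(r"^(.+?)\s+(https?:\/\/.+)")


def parse_blog_line(line):
    """Return (name, url) for a preferences line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    # group(1) is what the post calls $1, group(2) is $2.
    return match.group(1), match.group(2)


print(parse_blog_line("Blog Name http://www.blogurl.com/"))
# ('Blog Name', 'http://www.blogurl.com/')
```

The reluctant .+? first tries to stop at "Blog", but since "Name" is not a URL it keeps extending until the whitespace is followed by http, which is why the name can contain spaces.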
Whew! That is a lot of stuff. Ok, but I wanted to add an optional timeout value, spaces or tabs followed by numbers. Here is what I came up with:
/^(.+?)\s+(https?:\/\/\S+)\s*(\d+)?/
- \S+ – Here, I changed the .+, which was too greedy (it would swallow the timeout as well), to \S+, which means match any non-space characters, one or more
- \s* – followed by white space, 0 or more. Has to be 0 or more, or a line without a timeout would fail
- \d+ – followed by one or more digits
- (\d+) – capture those into a third variable
- ? – Specify that we can have either 0 or 1 of the digit patterns
That’s it. Now both of the following lines are valid blog entry lines:
Blog Name http://www.blogurl.com/
Blog Name http://www.blogurl.com/ 60
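As a quick check, here is a small Python sketch (again, not the bundle's own code; parse_entry is a hypothetical helper) that runs the new pattern against both lines. It also shows why the trailing .+ had to become \S+: the old pattern pulls the timeout into the URL.

```python
import re

# Old and new patterns, written without the surrounding / / delimiters.
OLD_RE = re.compile(r"^(.+?)\s+(https?:\/\/.+)")
NEW_RE = re.compile(r"^(.+?)\s+(https?:\/\/\S+)\s*(\d+)?")


def parse_entry(line):
    """Return (name, url, timeout); timeout is None when the line has none."""
    match = NEW_RE.match(line)
    if match is None:
        return None
    name, url, timeout = match.groups()
    # The optional third group is None when no digits were present.
    timeout_value = int(timeout) if timeout is not None else None
    return name, url, timeout_value


print(parse_entry("Blog Name http://www.blogurl.com/"))
# ('Blog Name', 'http://www.blogurl.com/', None)
print(parse_entry("Blog Name http://www.blogurl.com/ 60"))
# ('Blog Name', 'http://www.blogurl.com/', 60)

# With the old pattern, .+ runs to the end of the line and takes the timeout too.
print(OLD_RE.match("Blog Name http://www.blogurl.com/ 60").group(2))
# http://www.blogurl.com/ 60
```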
Thanks to Digi on #mac for the assistance.
Heh, I have a t-shirt reading /(bb|[^b]{2})/ and it’s surprising how many programmers can’t even decipher that …
I can’t heh.
Surprisingly, TextMate doesn’t seem to like the pattern. I just get a beep when searching for ‘bb’ or ‘as’.
heh
Oh, lose the / /s in TextMate’s find.
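For anyone else puzzling over the t-shirt, a rough Python sketch of what is going on. The assumption here is that a find field taking a bare pattern treats the slashes as literal characters, which would explain the beep; the variable names are made up for illustration.

```python
import re

# The t-shirt pattern without delimiters: "bb", or two characters that are not b.
TSHIRT_RE = re.compile(r"(bb|[^b]{2})")

# The same pattern with the / / kept in, so they must match literally.
WITH_SLASHES_RE = re.compile(r"/(bb|[^b]{2})/")

for text in ("bb", "as", "ab"):
    print(text, bool(TSHIRT_RE.search(text)), bool(WITH_SLASHES_RE.search(text)))
# bb True False
# as True False
# ab False False
```

Read aloud, the pattern is "2 b or not 2 b".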