The other day, I tweeted a link to a regex-based mute filter for Tweetbot. The full URL looks like:
It uses Tweetbot’s URL scheme to provide a predefined mute filter. This way, a filter can be shared without copying and pasting it, which is quite nice. In trying to work out how to do this, I ended up browsing around Twitter looking for an example of how to pass a regex along in the URL, as I had seen it done before. As I imagine I’ll do this again, I thought I’d write it up.
Before it is encoded, the regex looks like this:
It’s designed to match:
- “breaking bad”
- “Breaking Bad”
and similar variations, but not either word when used alone.
The first bit, (?i), sets the regex to be case insensitive. #? means it may or may not start with a hash (matching hashtags or not). Then, the rest consists of a lookahead assertion.
A lookahead assertion tests whether a given set of characters is followed by another set. It’s considered an assertion because it doesn’t consume the characters it looks ahead at; it only checks that they are there. For testing regular expressions, I use Patterns, so you get something that looks like this:
You will see that whilst this matches the appropriate lines, it doesn’t match the whole term. In this situation, this is fine (Tweetbot filters any tweet that contains a match).
breaking(?= ?bad) looks for the word “breaking” followed by “bad”, with or without a space between them. The lookahead assertion is the (?= ?bad) part.
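As a rough sketch of the behaviour described above, here is the pattern run through Python’s re module. The full pattern is put together from the pieces discussed in this post, and Python’s regex engine is only a stand-in for whatever Tweetbot uses, so treat both as assumptions. The helper name matched_text is made up for the example.

```python
import re

# Assumed full pattern, assembled from the pieces described in the post:
#   (?i)        case insensitive
#   #?          optional leading hash
#   breaking    the literal word
#   (?= ?bad)   lookahead: "bad" must follow, with an optional space
PATTERN = re.compile(r"(?i)#?breaking(?= ?bad)")


def matched_text(tweet):
    """Return the text the pattern matched in the tweet, or None if no match."""
    match = PATTERN.search(tweet)
    return match.group(0) if match else None


samples = [
    "breaking bad",
    "Breaking Bad",
    "#BreakingBad",
    "breaking news",  # "breaking" on its own
    "bad idea",       # "bad" on its own
]

for tweet in samples:
    print(repr(tweet), "->", repr(matched_text(tweet)))
```

The first three samples match, but the matched text is only “breaking” (plus the hash, where there is one). The lookahead checks for “bad” without including it in the match, which is why the whole term isn’t highlighted.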
Assembling the URL
The next bit is to make it valid inside a URL. I cheated and used Eric Meyer’s URL Decoder/Encoder, but the invalid characters are below:
There are obviously many tools to help do that bit.
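One such option, for anyone without an encoder to hand, is Python’s urllib.parse. This is a minimal sketch that encodes only the regex, not the rest of the Tweetbot URL, and the pattern is again the one assembled from the pieces in this post (an assumption).

```python
from urllib.parse import quote, unquote

# Assumed pattern, assembled from the pieces described in the post.
regex = "(?i)#?breaking(?= ?bad)"

# safe="" percent-encodes every character that is not a letter,
# a digit, or one of "_", ".", "-", "~".
encoded = quote(regex, safe="")
print(encoded)

# Work out which characters had to be escaped, and what each became.
escaped = {
    ch: quote(ch, safe="")
    for ch in sorted(set(regex))
    if quote(ch, safe="") != ch
}
for ch, code in escaped.items():
    print(repr(ch), "->", code)

# Decoding should give back the original pattern.
print(unquote(encoded) == regex)
```

Passing safe="" matters here: by default quote leaves “/” alone, which happens not to appear in this pattern but could easily appear in another.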
So now, the next time I want to avoid spoilers for the finale of a pretty good TV show, I’ll be safe in the knowledge that I once wrote down how I did it.