Mail Monitor: regular expressions - Queries / Submissions
Started by Denis




12 posts in this topic
Denis
Member
***


0
192 posts 19 threads Joined: Dec 2002
09-22-2003, 07:45 AM -
#1
Hi all.

I thought I'd start a thread where queries about the use of regular expressions in the creation of filter rules could be posted.
It could also be used to share regular expression rules that are found to work well against spam.
Denis
09-22-2003, 07:51 AM -
#2
Submitted regular expression.

If the subject contains regular expression:
Code:
[[:space:]]{2,}
This will flag all the annoying spam whose subject contains 2 or more consecutive spaces, like:
Code:
"Hello stranger    r673atvu"
"Goods news  
Denis
09-22-2003, 11:29 AM -
#3
Submitted regular expression.

Sometimes, spammers replace the letter "i" with the digit "1", especially in words that spam-killers would easily catch.
In such cases, you can use a regular expression similar to the following one.

Subject contains regular expression:
Code:
v(1|i)agra
This will flag "viagra" as well as "v1agra".
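
For anyone who wants to try this rule outside Mail Monitor, here is a minimal Python sketch. The helper name and the sample subjects are made up for illustration, and case-insensitive matching is an assumption (the thread does not say how Mail Monitor treats case). It also shows that the alternation can be written as a character class.

```python
import re

def flags_subject(pattern: str, subject: str) -> bool:
    """Mimic a 'subject contains regular expression' rule."""
    # Assumption: case-insensitive matching; not confirmed for Mail Monitor.
    return re.search(pattern, subject, re.IGNORECASE) is not None

ALTERNATION = r"v(1|i)agra"  # pattern from the post
CHAR_CLASS = r"v[1i]agra"    # equivalent character-class spelling

# Made-up sample subjects
for subject in ("Cheap viagra", "Cheap V1AGRA", "Cheap v!agra"):
    print(subject, flags_subject(ALTERNATION, subject), flags_subject(CHAR_CLASS, subject))
```

Both spellings accept exactly "1" or "i" in the second position, so neither one catches "v!agra".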
Denis
09-29-2003, 01:25 PM -
#4
Submitted regular expression.

This one is similar to the previous regular expression, but will catch even more spam.

Subject contains regular expression:
Code:
v[[:print:]]?agra
This will flag "viagra", "v1agra", "v|agra", "v;agra", "v:agra", "vagra", etc.
Basically, anything containing the word "viagra" where the "i" might have been replaced by any printable character or simply removed.
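
A rough Python sketch of this rule, for testing outside Mail Monitor. Python's re module does not support POSIX classes such as [[:print:]], so the ASCII printable range is used as a stand-in; that substitution, the case-insensitive flag and the function name are all assumptions.

```python
import re

# Assumption: [\x20-\x7e] (ASCII printable characters) stands in for [[:print:]].
OBFUSCATED = re.compile(r"v[\x20-\x7e]?agra", re.IGNORECASE)

def flags_subject(subject: str) -> bool:
    """True if the subject contains 'v', at most one printable character, then 'agra'."""
    return OBFUSCATED.search(subject) is not None

for subject in ("viagra", "v1agra", "v|agra", "vagra", "vagrant laws"):
    print(f"{subject!r}: {flags_subject(subject)}")
```

Because the middle character is optional, the pattern also matches inside ordinary words such as "vagrant", so it may be worth running it as a test rule before letting it delete anything.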
Denis
09-29-2003, 01:31 PM -
#5
Submitted regular expression.

This very useful regular expression will flag emails whose subject does not contain at least two consecutive letters.

Subject does not contain regular expression:
Code:
[[:alpha:]]{2,}
This will flag "a", "c2", "d f", etc.
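
A small Python sketch of the "does not contain" logic, assuming [A-Za-z] is an acceptable stand-in for [[:alpha:]] (Python's re module has no POSIX classes). The function name is hypothetical.

```python
import re

# Assumption: [A-Za-z] stands in for the POSIX class [[:alpha:]].
LETTER_RUN = re.compile(r"[A-Za-z]{2,}")

def lacks_letter_run(subject: str) -> bool:
    """True when no two letters sit side by side, i.e. the rule would flag the subject."""
    return LETTER_RUN.search(subject) is None

for subject in ("a", "c2", "d f", "Hi there"):
    print(f"{subject!r}: {lacks_letter_run(subject)}")
```

Note that an empty subject, or one made only of digits, has no letter run either and would be flagged as well.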
Bob Freeman
Member
***


0
104 posts 18 threads Joined: Aug 2001
10-01-2003, 08:49 AM -
#6
I tested the expression v[[:print:]]?agra and s[[:print:]]?xual and both work very well.

I don't have any experience in programming so if you have others then please post them.

Thank you for making these.
Denis
10-01-2003, 05:42 PM -
#7
"Thank you for making these."

You're welcome.

I don't have much experience with regular expressions either, which is why I thought it could be useful to share my victories in the war against spam.

I only wish more TLB/Mail Monitor users would participate in the forums.
It would give it more of a community feel.
Denis
10-01-2003, 05:52 PM -
#8
Submitted regular expression.

Today I received an email in which the spammer used another trick, replacing a character with a space:
"V agra"
I had to change the regular expression I previously submitted.

Subject contains regular expression:
Quote:v[[:print:]]?[[:space:]]?agra
This will flag "v agra", "vi agra", "viagra", "v1agra", "v|agra", "v;agra", "v:agra", "vagra", etc.
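
To see what the extra optional space buys, here is a hedged Python comparison of the earlier pattern and the revised one. As assumptions, [\x20-\x7e] stands in for [[:print:]], \s for [[:space:]], and matching is case-insensitive; the variable names are made up.

```python
import re

PRINTABLE = r"[\x20-\x7e]"  # assumption: ASCII stand-in for [[:print:]]
OLD_RULE = re.compile(rf"v{PRINTABLE}?agra", re.IGNORECASE)
# Assumption: \s stands in for [[:space:]].
NEW_RULE = re.compile(rf"v{PRINTABLE}?\s?agra", re.IGNORECASE)

for subject in ("V agra", "vi agra", "viagra", "v  agra"):
    old_hit = OLD_RULE.search(subject) is not None
    new_hit = NEW_RULE.search(subject) is not None
    print(f"{subject!r}: old={old_hit} new={new_hit}")
```

Since a space is itself a printable character, "V agra" is matched by both patterns in this sketch; the revised rule additionally covers a substituted character followed by a space ("vi agra") and a double space.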
Denis
10-01-2003, 06:07 PM -
#9
Submitted regular expression.

This is a condition that mixes regular expression with "standard" text comparison.
I find it very useful for flagging spam sent to a string of recipients (in the "To" or "Cc" field) who share the same ISP in their email addresses.
Ex: "To: denis@talk21.com; paul@talk21.com; billy@talk21.com;...".


"To" (or Cc) contains (my address):
Quote:denis@talk21.com
And "To" contains regular expression:
Quote:[^s]@talk21\.com
Basically, it says, if the "To" contains my address, and also contains another "talk21.com" address where the last letter before the "@" is not "s" (the last letter in my name), flag the message.

This can be applied to the "Cc", but also to both the "To" and "Cc" (unless you use a very common address, in which case this could be a genuine email, also sent to a friend with the same ISP).

"To" contains (my address):
Quote:denis@talk21.com
And "Cc" contains regular expression:
Quote:[^s]@talk21\.com
or

"To" contains regular expression:
Quote:[^s]@talk21\.com
And "Cc" contains (my address):
Quote:denis@talk21.com
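
The combined condition can be sketched in Python as follows. This is a simplification under stated assumptions: the "To" and "Cc" fields are joined and checked together (which covers the variants above in one go), the dot is escaped so it only matches a literal ".", and the function name and the extra sample addresses are invented.

```python
import re

MY_ADDRESS = "denis@talk21.com"
# Another talk21.com address whose local part does not end in "s".
OTHER_SAME_ISP = re.compile(r"[^s]@talk21\.com")

def looks_like_isp_blast(to_field: str, cc_field: str = "") -> bool:
    """True if my address and a different same-ISP address both appear in To/Cc."""
    recipients = f"{to_field}; {cc_field}".lower()
    return MY_ADDRESS in recipients and OTHER_SAME_ISP.search(recipients) is not None

print(looks_like_isp_blast("denis@talk21.com; paul@talk21.com; billy@talk21.com"))
print(looks_like_isp_blast("denis@talk21.com"))
print(looks_like_isp_blast("denis@talk21.com", "paul@talk21.com"))
print(looks_like_isp_blast("denis@talk21.com; chris@talk21.com"))
```

The last call shows the rule's blind spot: another recipient whose name also ends in "s" (here the made-up "chris") is not detected.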
Denis
10-01-2003, 06:19 PM -
#10
Useful Condition

Not a regular expression, but very simple and useful.
If the "To" contains none of your addresses and the "Cc" is not present, flag it.

"To" does not contain:
Quote:denis@talk21.com
And "To" does not contain:
Quote:denis@teaching-tools.com
And "To" does not contain:
Quote:denis@another-of-my-addresses.com
And "Cc" is not present.

When using conditions like this one, I normally flag it as a "Deletion Test", until I am happy it will not flag genuine messages (I then choose to have the flagged messages automatically deleted).
If it does, I simply tweak the condition.
But it could safely be flagged as spam.
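
As a rough illustration of the same condition outside Mail Monitor, here is a Python sketch using the standard library's email parser. The function name and the sample messages are hypothetical, and the case-insensitive substring test is an assumption about how "does not contain" behaves.

```python
from email import message_from_string

MY_ADDRESSES = (
    "denis@talk21.com",
    "denis@teaching-tools.com",
    "denis@another-of-my-addresses.com",
)

def not_addressed_to_me(raw_message: str) -> bool:
    """True if 'To' holds none of my addresses and there is no 'Cc' header."""
    msg = message_from_string(raw_message)
    to_field = str(msg.get("To", "")).lower()
    to_has_me = any(address in to_field for address in MY_ADDRESSES)
    cc_present = msg.get("Cc") is not None
    return not to_has_me and not cc_present

# Made-up sample messages
SPAM = "From: someone@example.com\nTo: undisclosed@example.com\nSubject: offer\n\nBuy now"
GENUINE = "From: friend@example.com\nTo: denis@talk21.com\nSubject: lunch\n\nSee you at noon"

print(not_addressed_to_me(SPAM))
print(not_addressed_to_me(GENUINE))
```

Mailing-list traffic is often addressed to the list rather than to you, which is one reason to start with a test rule as described above.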
Denis
10-02-2003, 07:35 AM -
#11
Submitted regular expression.

Sometimes, it looks like the subject of spam messages has been randomly created, e.g. "wtsyuw", "ptrvwik", etc.
It is fairly tricky to catch them all just using the subject field, but here is a simple condition to flag the ones that do not contain a vowel ("kndwvw", "zxtcx",...)

If the subject does not contain regular expression:
Code:
[aeiouy]+
If the subject doesn't contain at least one "a", "e", "i", "o", "u", or "y", it will be flagged.
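
A minimal Python sketch of the no-vowel check, assuming case-insensitive matching (not confirmed for Mail Monitor) and using a made-up function name. For a "contains" test the trailing "+" makes no difference, so it is dropped here.

```python
import re

# Assumption: case-insensitive matching, so "FYI" counts as containing a vowel.
VOWEL = re.compile(r"[aeiouy]", re.IGNORECASE)

def has_no_vowel(subject: str) -> bool:
    """True when the subject contains none of a, e, i, o, u, y (the rule would flag it)."""
    return VOWEL.search(subject) is None

for subject in ("kndwvw", "zxtcx", "wtsyuw", "ptrvwik", "12345"):
    print(f"{subject!r}: {has_no_vowel(subject)}")
```

As the post says, this only catches some random subjects: "wtsyuw" and "ptrvwik" both contain a vowel and slip through, while an all-digit subject such as "12345" is flagged.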
Centaur
Junior Member
**


0
9 posts 3 threads Joined: Oct 2001
10-23-2003, 09:20 AM -
#12
To == Yuri <yuri@our-domain> — live people address me by first name AND last name, or by e-mail address ONLY.

To == "Yuri" <yuri@our-domain> — ditto.

From =~ [[:digit:]]+@((aol)|(hotmail))\.com — this is for those "Make your own DVDs".

To contains @www.our-domain: live people write to @our-domain, even though www.our-domain and our-domain both resolve to the same IP address.

X-Authentication-Warning contains "claimed to be our-domain" — “A stranger came and told me that he is myself but he really looked like someone else. And handed me a bunch of letters.” Let’s drop them on the floor. We don’t talk to ourselves.
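
The "From" rule above can be tried in Python roughly like this. It is a sketch under assumptions: "=~" is read as "contains a match", [0-9] replaces [[:digit:]] (which Python's re module does not support), the extra grouping parentheses are dropped, and the sample addresses are invented.

```python
import re

# Assumption: [0-9] stands in for [[:digit:]]; "=~" is treated as a search.
NUMBERED_FREEMAIL = re.compile(r"[0-9]+@(aol|hotmail)\.com", re.IGNORECASE)

def from_looks_numbered(from_field: str) -> bool:
    """True if the sender's local part ends in digits at aol.com or hotmail.com."""
    return NUMBERED_FREEMAIL.search(from_field) is not None

# Made-up sample senders
for sender in ("dvdmaker4821@aol.com", "Bob <bob77@hotmail.com>",
               "jane.doe@hotmail.com", "4821@example.com"):
    print(f"{sender!r}: {from_looks_numbered(sender)}")
```

Genuine correspondents with digits at the end of their address would be flagged too, so this one is best paired with a whitelist.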

