Harry Putnam
2017-05-27 14:58:02 UTC
all.SCORE:
((mark -100)
 ("from"
  ("***@gmail" -101 nil r)
  ("sina\.com" -101 nil r)
  ("@aol\\.com" -101 nil r)
  ("@[0-9]+\\.com>" -101 nil r)
  ("***@gmail" -101 nil r)
  ("s[ea]l[el]\\|discount\\|free\\|wholesale\\|paypal" -101 nil r))
 ("subject"
  ("~~" -101 nil r)
  ("~~\\|>>>\\|\\[A-Z\\]\\{4\\}" -101 nil r)
  ("!!\\|free\\|discount\\|wholesale" -101 nil r)))
I've forgotten how that was generated but would like to hand edit it.
You can see the term `free' in two places... in the last `from' element
and the last `subject' element.
I want the `free' in the last `from' element to be more restrictive, as
it is hitting quite a few false positives on network names that contain
free next to a dot, like: `free.', `.free' and `.free.'
This is happening in groups with thousands and thousands of messages
so I don't want to get it wrong... not sure how to re-run it.
So something like (please ignore the elisions (`[...]')):
[...] |[^\.]free[^\.]\\|[...]
But does it need the double slashes like:
[...] |\\[^\.\\]free\\[^\.\\] [...]
^^ ^^ ^^
Will that even accomplish what I am after: to keep `free' in any of the
combinations `.free', `free.' or `.free.' from being scored down?
Is there a handy way to test the regex?
Is there a handy way to rerun all those messages through `all.SCORE'?
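
One way to try a candidate regexp before touching the score file is to
evaluate it against a few sample strings in *scratch* or M-: using
`string-match-p'. The sketch below is only an illustration: the names
`my-free-regexp' and `my-free-match-p' and the sample From strings are
made up, and binding `case-fold-search' to t is an assumption about how
the score entries are matched. Two points about the syntax: inside a Lisp
string each regexp backslash is written `\\', and the brackets of a
character class are not escaped at all (`\\[' would match a literal `['),
nor does a dot inside a class need a backslash.

```elisp
;; Hypothetical candidate: `free' with no dot directly before or after it.
;; \\(^\\|[^.]\\)  = start of string, or any character that is not a dot
;; \\($\\|[^.]\\)  = end of string, or any character that is not a dot
(defvar my-free-regexp "\\(^\\|[^.]\\)free\\($\\|[^.]\\)"
  "Candidate regexp for `free' that skips `.free', `free.' and `.free.'.")

(defun my-free-match-p (string)
  "Return t if STRING matches `my-free-regexp', ignoring case."
  (let ((case-fold-search t))          ; assumption: case-insensitive matching
    (and (string-match-p my-free-regexp string) t)))

;; Made-up sample headers; evaluate each form with C-x C-e.
(my-free-match-p "Free Stuff <a@example.com>")   ; should match
(my-free-match-p "joe@news.free.example")        ; `.free.'  should not match
(my-free-match-p "user@free.example")            ; `free.'   should not match
(my-free-match-p "bob@mail.free")                ; `.free'   should not match
```

If the candidate behaves as wanted, it would replace the bare `free'
alternative in the last `from' entry. For re-running the scores, the Gnus
manual documents `V R' (`gnus-summary-rescore') in the summary buffer,
which may be worth trying on a small group first.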