Trying to figure out spamassassin globbing rules - spamassassin

How do the globbing rules for spamassassin work? I've looked at the docs, but they are not clear as to whether sub-domains are included in a whitelist rule. For example, does:
whitelist_from *@somewhere.com
also whitelist addresses from subdomain.somewhere.com? This seems not to be the case, as subdomains are still labeled as spam, if they fail checking.
Should I use something like this:
whitelist_from *@*.somewhere.com
I've added this for some addresses to find out, and it passes spamassassin --lint, but it may be a while before I get another email from one of those subdomains, so I thought I'd just ask here.
Thanks

I eventually found the answer. I can use the whitelist_from_rcvd directive instead.
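For anyone checking the glob question itself, here is a minimal Python sketch of the file-glob semantics the SpamAssassin docs describe for whitelist_from (* matches any run of characters, ? matches a single one). The helper glob_matches is hypothetical and only approximates the documented behaviour; it is not SpamAssassin's own code, and it assumes addresses use the usual @ separator.

```python
import re


def glob_matches(pattern: str, address: str) -> bool:
    """Approximate a whitelist_from-style glob: '*' = any run, '?' = one char."""
    # Escape everything, then turn the escaped wildcards back into regex wildcards.
    regex = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    return re.fullmatch(regex, address, re.IGNORECASE) is not None


# The plain domain glob does not cover a subdomain address...
plain_hits_subdomain = glob_matches("*@somewhere.com", "user@sub.somewhere.com")
# ...while the subdomain glob does, but then misses the bare domain.
sub_hits_subdomain = glob_matches("*@*.somewhere.com", "user@sub.somewhere.com")
sub_hits_plain = glob_matches("*@*.somewhere.com", "user@somewhere.com")
```

Under these assumptions the two globs cover different addresses, which matches the behaviour described in the question: both entries would be needed to cover the domain and its subdomains.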

Related

Documentation for SpamAssassin rules (HTML_30_40)

I'd like to refine the password reset mails sent by my web application so that they are not mistaken for spam; a customer forwarded me a mail header that contains several SpamAssassin rule names.
Some of the rules I could find, e.g. BAYES_40, but others I couldn't find there; those are:
HTML_30_40
TO_NO_BRKTS_HTML_ONLY
TO_NO_BRKTS_NORDNS
TO_NO_BRKTS_NORDNS_HTML
What do these rules mean; are there documentation pages somewhere?
The SpamAssassin which reported them is version 3.3.2; the latest version as of now is 3.4.1. Do those rules still exist?
The HTML_30_40 rule is no longer included in SpamAssassin, but if I remember correctly it was some test that concluded the email consisted of 30-40% HTML codes. Why that has any relevance for spam filtering I cannot see, and probably that is why it is no longer present.. :)
Those other rules still exist in SpamAssassin version 3.4.1. There is no explicit documentation per rule, other than an occasional comment or description along the rule implementation itself:
describe TO_NO_BRKTS_HTML_ONLY To: misformatted and HTML only
describe TO_NO_BRKTS_NORDNS_HTML To: misformatted and no rDNS and HTML only
You are probably sending emails from an IP address with no reverse-DNS name, and the To: line is poorly formatted. Things should improve significantly if you get the DNS problems fixed (or relay the emails via your ISP) and format the To: line in the email properly, e.g.
To: "J Random User" <jrnd@email>
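As a rough illustration of producing a To: value in that display-name-plus-angle-brackets form, here is a small Python sketch using the standard library's email.utils.formataddr. The address jrnd@example.com is a hypothetical placeholder, and this assumes the mails are built in Python, which the question does not say.

```python
from email.utils import formataddr

# Hypothetical recipient; substitute the real display name and address.
display_name = "J Random User"
address = "jrnd@example.com"

# formataddr wraps the address in angle brackets and quotes the
# display name only when it contains characters that require it.
to_value = formataddr((display_name, address))
header_line = f"To: {to_value}"
```

Whatever language the application uses, the point is the same: let a mail library format the header rather than concatenating strings by hand.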

Match all email addresses belonging to a specific domain and its subdomains

I am looking to match all email addresses from a specific domain.
Any email coming from example.com or foo.example.com should match, everything else should be rejected. To do this, I could do some basic string matching to check if the given string ends with, or contains, example.com which would work fine but it also means that something like fooexample.com will pass.
Hence, based on the above requirements, I started working on a pattern that would pass the domain and its sub-domain. I was able to come up with the following regex pattern:
`/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.example.com\b/i`
This only matched subdomains, but I have seen the pattern at "How to match all email addresses at a specific domain using regex?" which handles the main domain.
Is there a way to combine these two into something that works for any address from example.com?
How about
/\b(?:(?![_.-])(?!.*[_.-]{2})[a-z0-9_.-]+(?<![_.-]))@(?:(?!-)(?!.*--)[a-z0-9-]+(?<!-)\.)*example\.com\b/i
This one would also match 'tagged' and 'tagged-subdomain' mails like a+b@example.com and a+b@i.example.com
(([A-Za-z0-9]+_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+@(?:(?!-)(?!.*--)[a-z0-9-]+(?<!-)\.)*example\.com\b
Hope it helps you
I'd recommend reading "Stop Validating Email Addresses With Your Complex Regex".
From that point, I'd look for:
/@.*\bexample\.com/
For instance:
%w[foo@example.com foo@barexample.com foo@subdomain.example.com].grep(/@.*\bexample\.com/)
=> ["foo@example.com", "foo@subdomain.example.com"]
It's too easy to end up with a regex that is a maintenance nightmare, and that doesn't accomplish what you need. I highly recommend keeping it simple.
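In the same keep-it-simple spirit, here is a Python sketch that only checks the domain part and anchors it at the end of the string, so that look-alike hosts are rejected. The helper name is_example_address and the sample addresses are made up for illustration, and it assumes the input is a single address using the usual @ separator. The loose pattern is included for comparison because, being unanchored, it also accepts an address such as foo@example.com.evil.org.

```python
import re

# Zero or more "label." prefixes, then exactly example.com at the end.
DOMAIN_RE = re.compile(r"@(?:[a-z0-9-]+\.)*example\.com$", re.IGNORECASE)

# The simpler, unanchored pattern, kept here only for comparison.
LOOSE_RE = re.compile(r"@.*\bexample\.com")


def is_example_address(address: str) -> bool:
    """True if the address is at example.com or one of its subdomains."""
    return DOMAIN_RE.search(address) is not None


candidates = [
    "foo@example.com",
    "foo@barexample.com",
    "foo@subdomain.example.com",
    "foo@example.com.evil.org",
]
accepted = [a for a in candidates if is_example_address(a)]
loosely_accepted = [a for a in candidates if LOOSE_RE.search(a)]
```

Here accepted keeps only the bare domain and the subdomain address, while loosely_accepted also lets the evil.org address through.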

Zeus Rewrite Rules

I have a website that renders the URL:
/work.php?cat=identity
Normally I would research how to use mod_rewrite but unfortunately my hosting (Namesco) uses Zeus and not Apache, which is strange. How would I use Zeus' rewrite rules to convert to:
/work/identity
This is a much cleaner, nicer SEO friendly version. On top of this, I still need the $_GET variable to be active because it requests information about the variable cat from the database.
I've never rewritten URLs before so I've no idea where to begin. I've attempted the change with this rewrite.script file which is saved within my web folder
match URL into $ with ^/work.php?cat=/(.*)
if matched set URL= /work/$
Unfortunately it doesn't work. Can anyone help or perhaps offer an alternative?
I had a quick play with this, and I believe I have proven to myself that Request Rewriting is not able to manipulate the query element of the URL.
There is a potential solution, but it gets even more ugly!
You could use the "Perl Extensions" of ZWS to achieve this. Essentially, you pass the request to the Perl engine within ZWS, run a script against it, then pass the result back to ZWS.
I am afraid this is a bit beyond my capabilities however! I am a "Zeus Traffic Manager" sort of chap...
Nick
Zeus Rewrite Rules are able to access the query part of a URL string. It looks like the key things you're missing are the 1 following the $ in the output URL and the removal of the slash after cat=:
match URL into $ with ^/work.php?cat=/(.*)
if matched set URL= /work/$
should be
match URL into $ with ^/work.php?cat=(.*)
if matched set URL= /work/$1
I am wondering if the rewrite rules are available for the query portion of the URI? The docs do seem to only speak about the path element.
http://support.zeus.com/zws/docs/2005/12/16/zeus_web_server_4_3_documentation
page 141 seems to be the start of it...
I will attempt to fire up a ZWS VM and test this myself.
Nick
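For reference, the mapping the question asks for (clean URL in, script URL with the cat query parameter out) can be sketched in Python. This only illustrates the regex logic; it is not Zeus rewrite syntax, the function name rewrite is hypothetical, and whether Zeus lets a rule set the query string this way is exactly what is in doubt above.

```python
import re

# /work/<category>, with an optional trailing slash and no further path parts.
CLEAN_URL = re.compile(r"^/work/([^/?]+)/?$")


def rewrite(url: str) -> str:
    """Map /work/<category> to /work.php?cat=<category>; leave other URLs alone."""
    match = CLEAN_URL.match(url)
    if match is None:
        return url
    return f"/work.php?cat={match.group(1)}"


rewritten = rewrite("/work/identity")
untouched = rewrite("/contact")
```

Because the script still receives cat as a normal query parameter, $_GET['cat'] would keep working unchanged on the PHP side.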

Should I allow underscores in first and last name?

We have a form that has fields for first and last name. I was asked to allow underscores. I don't know of any sql injection that uses underscores, but I also don't know of anyone with an underscore in their name. Is there a good reason to allow or not allow underscores in names?
EDIT: I'm using parameters and server side validation. This is for client side validation via the jQuery validation plugin.
EDIT 2: I didn't mean for this to become a discussion on whether or not I should do any validation...I just wanted to know if there was any compelling reason to accept underscores, like I should accept Irish people or hyphens. Based on that, I'm accepting Oren's answer.
You should be as liberal as possible in what you allow as a name. There is no good reason to disallow an underscore, so why do it? There are many horror stories of people who try to utilize software that disallows their actual name. Have a look at Falsehoods Programmers Believe About Names for assumptions you should not make.
DO NOT PREVENT SQL INJECTION USING WHITELISTS!
Have you come across an O'Neill yet?
Instead, use parameters.
I will admit, though, that whitelists will work better than blacklists
Re: EDIT:
You should not do such validation at all.
If your server-side code can handle it, there's nothing wrong with the name --'!#--_.
If your server-side code cannot handle it, it should.
You're doing your validation wrong. When preventing sql injection, just use placeholders or your database library's escape function to escape the data. What characters you use in the name doesn't matter then.
You'll need to allow apostrophes and hyphens (O'Reilly, Double-Barrel). Never heard of an underscore in a name though.
Ideally, you should be able to allow any characters and not have a problem with SQL injection because you are using parameterized queries etc.
Do you disallow '? How do you think Mr O'Reilly likes that?
If you prevent underscores with the assumption that we are not aware of names with underscores, would you do the same for the other dozens (hundreds) of other "special characters"?
Unless there is some reason to block underscores, I would leave it up to the user to be able to enter their name as they want.
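To make the "use parameters" advice concrete, here is a minimal Python sketch using the standard library's sqlite3 module with an in-memory database. The table and column names are invented for the example, and the question does not say which database or language is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, last_name TEXT)")

# Names with apostrophes, hyphens, underscores and worse are all just data.
names = [
    ("Sean", "O'Neill"),
    ("Mary_Ann", "Double-Barrel"),
    ("X", "--'!#--_"),
]

# The ? placeholders keep the values out of the SQL text entirely,
# so no character in a name can change the statement.
conn.executemany(
    "INSERT INTO people (first_name, last_name) VALUES (?, ?)", names
)

rows = conn.execute(
    "SELECT first_name FROM people WHERE last_name = ?", ("O'Neill",)
).fetchall()
total = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
```

With the values bound as parameters, the character set allowed in a name becomes a usability decision rather than a security one.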

Creating user/search engine friendly URLs

I want to create a URL like www.facebook.com/username, just as Facebook does. Can we use mod_rewrite to do it? The username is the name of the user in a table; it is not a subdirectory. Please advise.
Sure, mod_rewrite can do that. Here is a tutorial on it.
Yes you can do this but you might have a couple of initial hurdles to get it going correctly.
The first is that you will have to use a regular expression to match it. If you don't know regex then this can be confusing at first.
The second is that, if you are going to rewrite the top-level path on the domain, you will need some mechanism for rewriting only if the file doesn't exist.
I guess if mod_rewrite supports testing whether the URL points at a real file, that will be easy. If not, you might have to use a blacklist of words that it won't rewrite, as you will need to have some reserved words.
This would include at the least the folder that contains your images, css, js, etc and the index.php your site runs off, plus any other php files you have kicking around.
I would like to be more help but I am a .net guy and I usually help out in asp.net url rewriting issues with libraries such as UrlRewriter.net which have different configurations than mod_rewrite.
To match the username I would use a regex like this:
^/(\w*)/?$
this would then put the bit in the brackets into a variable you can use in the rewrite like
/index.php?profileName={0}
The regex I provided means:
^ nothing before this
/ forward slash
(\w*) any number of word characters (letters, digits, or underscores), including none
/? optional forward slash
$ nothing after this
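The pattern and the reserved-word idea above can be tried out in Python before committing to a rewrite rule. This is a sketch of the logic only, not mod_rewrite configuration: the names rewrite_profile and RESERVED are hypothetical, the reserved list is a made-up example, and it uses \w+ instead of \w* so that the bare / is not treated as an empty username.

```python
import re

# One path segment of word characters, with an optional trailing slash.
PROFILE_URL = re.compile(r"^/(\w+)/?$")

# Hypothetical reserved names that must keep pointing at real folders.
RESERVED = {"images", "css", "js", "index"}


def rewrite_profile(path: str) -> str:
    """Map /<username> to /index.php?profileName=<username> unless reserved."""
    match = PROFILE_URL.match(path)
    if match is None or match.group(1).lower() in RESERVED:
        return path
    return f"/index.php?profileName={match.group(1)}"


profile = rewrite_profile("/alice")
static_dir = rewrite_profile("/css")
script = rewrite_profile("/index.php")
```

Paths containing a dot, such as /index.php, never match \w+ and so pass through untouched; only extensionless single-segment paths need the reserved list.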

Resources