Regex validation, was Re: Programmers with network engineering skills

Joel Maslak jmaslak at
Tue Mar 13 10:03:48 CDT 2012

On Mon, Mar 12, 2012 at 9:18 PM, Mark Andrews <marka at> wrote:

> Only if you don't properly quote/escape the arguments you are passing.

You're using your OS wrong if you are quoting/escaping the arguments.
You do not need a shell to run another program: fork() + exec() +
wait() never involves one (assuming Unix; I also suspect libc has a
nice packaged function for this that is not insecure like system(),
but it's not all that hard to roll your own).  In Perl, use the
multi-argument form of system(), not the single-argument version.
In both cases you should clear the environment as well prior to the
exec()/system() unless you know nobody can play with variables such
as LD_PRELOAD or IFS.
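
A minimal sketch of the no-shell approach, in Python rather than Perl
purely for illustration.  The helper name run_without_shell and the
PATH value are assumptions made for this example, not anything from a
particular system.  The argument list goes straight to exec(), so
shell metacharacters in an argument are just data, and the child gets
a small explicit environment instead of the caller's:

```python
import subprocess
import sys


def run_without_shell(argv, env=None):
    """Run argv directly (no shell) with a minimal, explicit environment."""
    # Assumption: a fixed PATH is all the child needs. Nothing from the
    # caller's environment (LD_PRELOAD, IFS, ...) is passed through.
    clean_env = {"PATH": "/usr/bin:/bin"} if env is None else env
    return subprocess.run(
        argv,                 # a list: no shell ever parses these strings
        env=clean_env,
        capture_output=True,
        text=True,
        check=False,
    )


# An argument that would be dangerous if a shell ever parsed it.
hostile = "example.com; echo INJECTED"

# The child here is just the Python interpreter echoing its first argument.
result = run_without_shell(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]
)
print(result.stdout.strip())
```

The child sees the hostile string as one literal argument; no quoting
or escaping was needed to get it there.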

This is one of my pet peeves about programming - programmers calling
out to insecure functions when secure alternatives are available.

The same goes for SQL statements - if you need to quote things to
prevent SQL injection, you're using your SQL database wrong.  Look up
prepared statements.  Generally, it's very bad practice to dynamically
build SQL strings.  It's also very common practice, which is why so many
applications have SQL injection vulnerabilities.  It's the Perl/PHP
equivalent of the buffer overflow that simply wouldn't exist if
developers, instead of trying to figure out how to quote everything,
simply used prepared statements and placeholders.
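
A small sketch of placeholders, using Python's built-in sqlite3 module
and an in-memory database.  The table and column names are made up for
the example.  The SQL text stays constant and the values are bound
separately, so apostrophes and attempted injections are stored as
plain data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")

# Values are passed separately from the SQL text; nothing is quoted by hand.
tricky = "Robert'); DROP TABLE people;--"
conn.execute("INSERT INTO people (name) VALUES (?)", (tricky,))
conn.execute("INSERT INTO people (name) VALUES (?)", ("O'Brien",))

# The same placeholder style works for lookups.
rows = conn.execute(
    "SELECT name FROM people WHERE name = ?", ("O'Brien",)
).fetchall()
count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(rows, count)
```

Other database drivers use different placeholder markers, but the idea
is the same: the statement and the data never get mixed into one string.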

As for checking for bogus email addresses, read the RFC and code it
right.  That's not with a too-simple regex, nor is it with a complex
regex.  You need a parser, which is the right tool for the job.  Regex
is not.  But there is value in not passing utter garbage to another
program (it has a tendency to clog mail queues, if for no other
reason) - just make sure you do it right.
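
As one illustration of "use a parser", here is a sketch that leans on
the address parser in Python's standard library (email.headerregistry)
instead of a regex.  The wrapper name parse_addr_spec is hypothetical,
and the sketch assumes that parser's reading of the RFC is acceptable
for your purposes; it checks syntax only and says nothing about
whether the mailbox exists:

```python
from email import errors
from email.headerregistry import Address


def parse_addr_spec(text):
    """Return an Address if the stdlib parser accepts text, else None."""
    if not text:
        return None
    try:
        return Address(addr_spec=text)
    except (ValueError, errors.MessageError):
        # Rejected input: either the parser reported a defect or it
        # could not parse the string as an addr-spec at all.
        return None


good = parse_addr_spec("user@example.com")
bad = parse_addr_spec("no-at-sign")
print(good, bad)
```

Anything that comes back as None can be rejected before it ever
reaches the mail queue.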

I might add that the same goes for names.  People don't just have a
first name and a last name - some people just have one name, some
people have three or four names, some people have surnames with
spaces, hyphens, or apostrophes (remember what I said about SQL?!),
etc.  Yet most systems I see assume people have two names with no
spaces, apostrophes, hyphens, etc.  Big mistake.  And don't get me
started on addresses, which might have one address line, two address
lines, even 5 address lines, to say nothing of how international
addresses may or may not put the "street" part first.  It's certainly
not easily regex-able.
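
One way to avoid baking those assumptions into a system is to store
what the person gave you rather than forcing it into first/last or
street/city slots.  A rough sketch follows; the Contact class, its
field names, and the sample values are all invented for illustration:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Contact:
    # One free-form name: no assumption about how many parts it has
    # or which characters it contains.
    full_name: str
    # Any number of address lines, kept in the order they were given.
    address_lines: List[str] = field(default_factory=list)


contacts = [
    Contact("Example"),  # a single name, no address on file
    Contact(
        "Anne-Marie O'Neil van der Berg",
        ["Line 1", "Line 2", "Line 3", "Line 4", "Line 5"],
    ),
]

for c in contacts:
    print(c.full_name, len(c.address_lines))
```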

Okay, I'll step off the soap box and let the next person holler about
how I was wrong about all this!

More information about the NANOG mailing list