From: "Ashley M. Kirchner" on
I'm not a regexp person (wish I was though), and I'm hoping someone can give
me a hand here. Consider the following strings:



- domain\username@example.org

- domain\username

- the same as above but with / instead of \ (hey, it happens)

- username@example.org

- username



Essentially I have a sign-up form where folks will be typing in their
username. The problem is, in our organization, when you tell someone to
enter their username, it could end up being any of the above examples
because they're used to a domain log in procedure where in some cases they
type the whole thing, in other cases just the e-mail, or sometimes just the
username.



So what I'd like is a way to capture just the 'username' part, regardless of
what other pieces they put in. In the past I would write a rather
inefficient split() routine and eventually get what I need. With split()
getting deprecated, I figured I may as well start looking into how to do it
properly. There's preg_split(), str_split(), explode() ... possibly others.



So, what's the proper way to do this? How can I capture just the part I
need, regardless of how they typed it in?



Thanks!



A

From: Jochem Maas on
On 3/15/10 1:54 AM, Ashley M. Kirchner wrote:
> So, what's the proper way to do this? How can I capture just the part I
> need, regardless of how they typed it in?
>

<?php

$inputs = array(
    // double backslashes just due to escaping in literal strings
    "domain\\username@example.org",
    "domain\\username",
    "domain/username@example.org",
    "domain/username",
    "username@example.org",
    "username",
);
foreach ($inputs as $input) {
    // four backslashes in the PHP string = two in the actual regexp
    // = one literal backslash to match
    if (preg_match("#^(?:.*[\\\\/])?([^@]*)(?:@.*)?$#", $input, $match)) {
        var_dump($match[1]); // string(8) "username" for each input
    }
}

?>

... just off the top of my head, probably could be done better than this.
I would recommend reverse engineering the given regexp to find out what
it's doing exactly ... painstaking but worth it to go through it char by char,
it might be the start of a glorious regexp career :)
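
As a starting point for that char-by-char reading, here is a sketch of the same pattern written out with the "x" modifier, so each piece can carry a comment. The constant name USERNAME_PATTERN and the helper extract_username() are made up for illustration, and the "~" delimiter replaces "#" only because "#" starts a comment in extended mode.

```php
<?php

// Nowdoc syntax (<<<'RE') keeps backslashes literal, so "\\" below reaches
// the regex engine as an escaped backslash without any PHP-level doubling.
const USERNAME_PATTERN = <<<'RE'
~
^                 # start of the string
(?: .* [\\/] )?   # optional prefix: anything up to the LAST backslash or slash
( [^@]* )         # group 1: the username, everything that is not an at-sign
(?: @ .* )?       # optional suffix: an at-sign and whatever follows it
$                 # end of the string
~x
RE;

// Hypothetical helper: returns the captured username, or null on no match.
function extract_username($input)
{
    if (preg_match(USERNAME_PATTERN, $input, $match) !== 1) {
        return null;
    }
    return $match[1];
}

$samples = array(
    'domain\\username@example.org',
    'domain/username',
    'username@example.org',
    'username',
);
foreach ($samples as $sample) {
    var_dump(extract_username($sample));
}
```

The greedy ".*" in the prefix is what makes the pattern pick the last slash, so something like a\b\username would still end up as just the final segment.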


From: Al on


On 3/14/2010 9:54 PM, Ashley M. Kirchner wrote:
> So, what's the proper way to do this? How can I capture just the part I
> need, regardless of how they typed it in?

The basic problem is that the forward slash is a legitimate character in the
local part of an address per RFC 5322 (a backslash is only allowed inside a
quoted string); see http://en.wikipedia.org/wiki/E-mail_address

However, per the "Notwithstanding the addresses permitted by these
standards...." paragraph, I disallow the "! # $ % * / ? ^ ` { | } ~" and have
not found a problem.

You didn't mention whether you have control over the whole submission process. If
so, I'd suggest checking the submitted address and sending the client a message
asking them to correct their submission and resend it.

You can check for any of the above characters and bounce the submission back to
the client.

Check out filter_var($emailAddr, FILTER_VALIDATE_EMAIL). It will catch a lot of
errors.

Also, you'll find preg_match("%^[[:alnum:][:punct:]]+$%", $addr) helpful; these
are the legit characters. First remove everything following the @ and the @ itself.
Obviously, you can use if(!preg_match....) to catch anything not valid.

Several str functions will do that removal, or simply use
preg_replace("%@.*%", '', $addr).
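
Put together, the checks above might look roughly like the sketch below. The function name check_local_part() is hypothetical, and adding the backslash to the rejected characters is an assumption based on the domain\username inputs in the original post, not part of the list above.

```php
<?php

// Hypothetical helper: returns the part before the @ if it passes the checks,
// or false so the caller can bounce the submission back to the client.
function check_local_part($addr)
{
    // Drop the @ and everything after it.
    $local = preg_replace('%@.*%', '', $addr);

    // Reject empty input and the disallowed characters. The trailing "\\"
    // (one backslash) is an added assumption, see the note above.
    if ($local === '' || strpbrk($local, '!#$%*/?^`{|}~\\') !== false) {
        return false;
    }

    // Whatever remains must be letters, digits and punctuation only.
    // \z anchors at the very end, so a trailing newline is rejected too.
    if (preg_match('%^[[:alnum:][:punct:]]+\z%', $local) !== 1) {
        return false;
    }

    return $local;
}

var_dump(check_local_part('username@example.org'));
var_dump(check_local_part('domain/username@example.org'));

// filter_var() only applies when a complete address was typed in.
var_dump(filter_var('username@example.org', FILTER_VALIDATE_EMAIL) !== false);
```

With this approach the domain\username and domain/username forms are rejected rather than silently trimmed, which fits the suggestion of asking the user to correct and resend.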

From: shiplu on
Here is the regex for you.


$company_domain = '\w+'; // replace with your own company domain pattern.
$user_name = '\w+'; // replace with your own username pattern
// google for a standard domain name regex pattern and replace this one.
$email_domain = '\w+\.\w{2,4}';

$regexp = "~({$company_domain}[\\\\/])?(?P<username>$user_name)(@$email_domain)?~";

$text = 'domain\\username@example.org'; // sample input; use the submitted form value

preg_match($regexp, $text, $matches);

print_r($matches); // $matches['username'] will contain username.
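
One way to try this against the strings from the original post is sketched below. The ^ and $ anchors and the helper name username_from() are additions that are not in the pattern above; the anchors are there so that input the sub-patterns do not cover is rejected instead of partially matched.

```php
<?php

// Placeholder sub-patterns, as above; swap in your own rules.
$company_domain = '\w+';
$user_name      = '\w+';
$email_domain   = '\w+\.\w{2,4}';

// Assumption: anchored with ^ and $ so the whole input has to match.
$regexp = "~^({$company_domain}[\\\\/])?(?P<username>{$user_name})(@{$email_domain})?$~";

// Hypothetical helper: returns the named "username" group, or null on no match.
function username_from($text, $regexp)
{
    return preg_match($regexp, $text, $matches) === 1 ? $matches['username'] : null;
}

$samples = array(
    'domain\\username@example.org',
    'domain/username',
    'username@example.org',
    'username',
    'not a username!',
);
foreach ($samples as $sample) {
    var_dump(username_from($sample, $regexp));
}
```

With the placeholder \w+ patterns, a username containing a dot or hyphen will not match at all once the anchors are in place, so the sub-patterns need adjusting to whatever your organization actually allows.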

--
Shiplu Mokaddim