Saturday, April 11, 2009

Regular Expression

Regular expressions are specially formatted strings used to find patterns in text.

This is like searching for files in Windows by providing a search pattern such as *.exe or ms*.exe.

What does "specially formatted" mean?

It means the string is not ordinary text: it follows a specific syntax that describes the strings you want to match. For example:

A zip code must consist of exactly 5 digits, with no spaces, hyphens (-), underscores (_), dots (.), or any other characters.

It must not be shorter or longer than 5 digits.

For example: 12365 is accepted,

but 123 and 444444 are not accepted.

So a regular expression that matches what we want may look like this:

"^\d{5}$"


Don't you see that it is specially formatted?!

It means: match any digit "\d" exactly five times "{5}", with nothing else in the whole string "^....$".

If the pattern were just "\d{5}",

then a number such as 123456 would be accepted, because the pattern finds five consecutive digits inside it, even though it consists of 6 digits and we want only 5.
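The difference between the anchored and the unanchored pattern can be checked with a small console sketch. The class and method names (ZipCodeDemo, IsExactlyFiveDigits, ContainsFiveDigits) are made up for this illustration, and it uses the static Regex.IsMatch method from the System.Text.RegularExpressions namespace, which simply returns true or false.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original tutorial.
public static class ZipCodeDemo
{
    // Anchored pattern: the whole string must be five digits.
    public static bool IsExactlyFiveDigits(string input)
    {
        return Regex.IsMatch(input, @"^\d{5}$");
    }

    // Unanchored pattern: true if five digits in a row appear anywhere.
    public static bool ContainsFiveDigits(string input)
    {
        return Regex.IsMatch(input, @"\d{5}");
    }

    public static void Main()
    {
        foreach (string s in new[] { "12365", "123", "444444", "123456" })
        {
            Console.WriteLine("{0}: anchored={1}, unanchored={2}",
                s, IsExactlyFiveDigits(s), ContainsFiveDigits(s));
        }
    }
}
```

Only the anchored version rejects 444444 and 123456; the unanchored one accepts them because it is satisfied by the first five digits it finds.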

We will describe some of the regular expression symbols in this tutorial.

When you need to validate form input, regular expressions are very useful.

Here we will try the Regex class in the System.Text.RegularExpressions namespace.

You have to declare at the beginning of the file that you want to use this namespace:

// To enable you to use RegularExpressions
// namespace's Classes like Regex & Match.
using System.Text.RegularExpressions;


The Regex class contains several static methods that allow you to use a regular expression without explicitly creating a Regex object. Using a static method is equivalent to constructing a Regex object, using it once and then destroying it.

One of these static methods is Match. It searches an input string for an occurrence of a regular expression and returns the result as a single Match object (an instance of the Match class).
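As a rough illustration of that equivalence, the sketch below runs the same pattern once through the static Regex.Match method and once through a Regex object created with new. The class and method names are invented for this example; the zip code pattern is the one used earlier.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original tutorial.
public static class StaticVersusInstance
{
    // Static call: no Regex object is created by our code.
    public static bool MatchWithStaticMethod(string input)
    {
        Match m = Regex.Match(input, @"^\d{5}$");
        return m.Success;
    }

    // Instance call: build the Regex once, then call Match on it.
    public static bool MatchWithInstance(string input)
    {
        Regex zip = new Regex(@"^\d{5}$");
        Match m = zip.Match(input);
        return m.Success;
    }

    public static void Main()
    {
        foreach (string s in new[] { "12365", "123" })
        {
            Console.WriteLine("{0}: static={1}, instance={2}",
                s, MatchWithStaticMethod(s), MatchWithInstance(s));
        }
    }
}
```

Both calls give the same answer for the same input. Keeping a Regex object around is mainly worthwhile when the same pattern is applied many times.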

Example:

Suppose we want to validate a name, which must start with an uppercase letter and may contain any number of letters, but no digits.

For example:

Mohammed: accepted

john: not accepted, because it starts with a lowercase letter [j].

Step 1:

What is a suitable regular expression for this job?

It is "^[A-Z]+[a-zA-Z]*$"

What does this expression mean?

^ and $ match the positions at the beginning and end of the string, which means the entire string must fit the pattern.

[A-Z] means a range of characters from A to Z in uppercase, so the 1st character of the string must be a capital letter.

+ means: match one or more occurrences of the preceding pattern, here [A-Z].

[a-zA-Z] is a range of characters in upper or lower case.

* means: match zero or more occurrences of the preceding pattern, here [a-zA-Z]. (+ and * always come after the pattern they apply to.)

Now we can read the expression as follows:

A name that starts with one or more capital letters (A-Z), followed by any number of letters in upper or lower case.
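To check that reading against a few inputs, here is a small console sketch. The class name NamePatternDemo, the method IsValidName and the extra sample strings are assumptions added for illustration; the pattern is the one given above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original tutorial.
public static class NamePatternDemo
{
    // True when the whole string is capital letter(s) followed by letters only.
    public static bool IsValidName(string name)
    {
        return Regex.Match(name, "^[A-Z]+[a-zA-Z]*$").Success;
    }

    public static void Main()
    {
        // "John3" and the empty string are extra samples added for this sketch.
        foreach (string s in new[] { "Mohammed", "john", "John3", "" })
        {
            Console.WriteLine("\"{0}\" -> {1}", s, IsValidName(s));
        }
    }
}
```

Because of the + after [A-Z], the empty string is rejected: at least one capital letter is required before anything else.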

Step 2: How can we code this in C# using the Regex class?

Suppose we have a TextBox called txtName.

Code:

// public static Match Match(string input, string pattern);
// This is the signature of the Match method in the Regex class.
// The method returns a reference to a Match object,
// so that we can access its properties and methods
// through this reference.
Match MyRegMatch = Regex.Match(txtName.Text, "^[A-Z]+[a-zA-Z]*$");
// Use the Success property of our Match object to
// check whether the match occurred or not.
if (!MyRegMatch.Success)
{
    // The name was incorrect because no match occurred.
    MessageBox.Show("Invalid First Name",
        "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
    txtName.Focus();
    return;
}


We can also use Regex.Match(txtName.Text,"^[A-Z]+[a-zA-Z]*$").Success directly in the if condition.

This is a very simple example of what a regular expression is and how it can be used. Later we will introduce more about regular expressions and their use in other tutorials.

Notes: Useful resources:

Regular expression symbols in detail in the VS.NET documentation:

Link #1

More about Regex Class:

Link #2

----------------------------------------------------------------------------

Example of a regular expression pattern with its description:

"^[0-9]+\s+([a-zA-Z]+|[a-zA-Z]+\s+[a-zA-Z]+)$"


[0-9]+ means the string starts with one or more digits, like 1234,

\s+ followed by one or more whitespace characters, like a space or a tab.

([a-zA-Z]+|[a-zA-Z]+\s+[a-zA-Z]+) is the complex part, which means:

either a single word made of one or more letters

in any letter case (no digits); | means OR;

or a word of one or more letters, followed

by one or more whitespace characters, followed by another word of one or more letters.

Examples that are accepted:

123 NasrCity Cairo

1234 Giza

Example that is not accepted:

123
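The accepted and rejected examples can be verified with a short console sketch. The class name AddressPatternDemo, the method IsValidAddress and the extra sample "123 Nasr_City" are assumptions made for this illustration. The sketch writes the letter class as [a-zA-Z]; a range that ends in lowercase z, such as [A-z], would also admit punctuation like _ and ^, which lie between Z and a in the character table.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original tutorial.
public static class AddressPatternDemo
{
    // Digits, whitespace, then one word or two words separated by whitespace.
    private const string Pattern =
        @"^[0-9]+\s+([a-zA-Z]+|[a-zA-Z]+\s+[a-zA-Z]+)$";

    public static bool IsValidAddress(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }

    public static void Main()
    {
        // "123 Nasr_City" is an extra sample added for this sketch.
        string[] samples = { "123 NasrCity Cairo", "1234 Giza", "123", "123 Nasr_City" };
        foreach (string s in samples)
        {
            Console.WriteLine("\"{0}\" -> {1}", s, IsValidAddress(s));
        }
    }
}
```

The plain number 123 fails because the pattern requires whitespace and at least one word after the digits.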
