Windows Vista Forums


Vista - Regex help please?

 
Old 04-21-2009   #1 (permalink)
Jobbsy


 
 

Regex help please?

Hi

I feel this particular one is complicated, and unfortunately, despite trying
on a good few occasions, I am unable to get my head around even moderately
complex Regex.

I have two strings as follows (assume each one runs on one line):

$Str1 = " 20% (1%) 0% ( 0%) Apr 20 19:00
RUB-FOK-P01(0118054600)_vr_SCORT1_2.21 (somethingthatmayormaynotbehere)"

$Str2 = " 20% ( 0%) 0% ( 0%) Apr 20 18:57
RUB-FOK-P01(0118054600)_vr_SCORT1_2.22"

from $Str1 I require the 'RUB-FOK-P01(0118054600)_vr_SCORT1_2.21' part as a
variable

from $Str2 I require the 'RUB-FOK-P01(0118054600)_vr_SCORT1_2.22' part as a
variable

I know there are some Regex Gurus out there - can you please help? A
good link for learning Regex would also be appreciated. Thanks in advance.
--
jobbsy@xxxxxx

Old 04-21-2009   #2 (permalink)
Robert Robelo


 
 

Re: Regex help please?

Try this Regex:

$pat = '^.+\s+(\w+.+\.\d+).*$'
$Str1, $Str2 -replace $pat,'$1'

# good tutorial
http://www.regular-expressions.info/tutorialcnt.html
# good reference
http://msdn.microsoft.com/en-us/library/az24scfc.aspx

--
Robert
Old 04-21-2009   #3 (permalink)
Robert Robelo


 
 

Re: Regex help please?

# the RegEx can be even simpler:
$pat = '^.+\s+(\w\S+).*$'
$var1, $var2 = $Str1, $Str2 -replace $pat,'$1'
$var1
$var2

--
Robert
Old 04-21-2009   #4 (permalink)
Alex K. Angelopoulos


 
 

Re: Regex help please?

Jobbsy:

(First, I'm NOT a regex guru, just a regular "user". However, here are some
things that might help.)

On your search problem:
Due to the nature of regexes, putting together a good regex for a particular
problem usually comes down to as precise a description as possible of what
you are trying to find. Depending on the text it is in, you may instead find
it easier to define what you _don't_ want to find. Both of these can
eventually be turned into a pattern. For some thorny cases, you may use both
together.

Could you come up with a generalized description of that string you're
after? This might help; I've extracted a few possible "rules" based on what
you show, as well as some questions that could lead to rules. You could
comment on whether they're correct or not, and for the questions try to fill
in the details.

(1) The target element doesn't have any whitespace characters within it (no
spaces or tabs).

(2) It only contains certain character types: letters (upper or lower case),
numerals, and the characters
( ) . _ -
Mention any others it might include.

(3) Is the element always a particular length? Is it the longest element in
the string?

(4) Is there a structure to the element? Specifically, are there any
characters that are always there, AND is there some kind of form that it
always conforms to?
This could be very useful. For example, IF your strings always begin with
RUB-FOK, you could use that as a literal anchor at the beginning of the
substring. If not - I assume you just repeated that for simplicity - it
still may conform to a pattern of 3 sets of 3 alphanumeric characters joined
by dashes, followed by a parenthesized numeric sequence, followed by 3
varying-length number+letter+period sequences joined by underscores.

As a demo, I'll assume I know what you're after. I assume it's constructed
of 3 sequences of alphanumeric characters with dashes separating them. This
can be described as follows in a .NET regex:

([a-z0-9]+-){2}([a-z0-9]+)

The [a-z0-9]+- means one or more alphanumerics followed by "-". By
parenthesizing it (.NET regexes use unescaped parentheses for grouping) and
following it with {2}, we say we are looking for exactly two consecutive
occurrences. This is then followed by one more alphanumeric sequence. (The
lower-case [a-z] still matches your upper-case letters because PowerShell's
-match operator is case-insensitive by default.)
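
As a quick check of that first piece on its own, here is a minimal sketch. It assumes $Str1 holds the sample line from the question joined onto one line; the names $prefixPattern, $prefix and $caseSensitiveHit are made up for illustration.

```powershell
# Sample input copied from the question, joined onto a single line.
$Str1 = ' 20% (1%) 0% ( 0%) Apr 20 19:00 RUB-FOK-P01(0118054600)_vr_SCORT1_2.21 (somethingthatmayormaynotbehere)'

# Two "alphanumerics plus dash" groups, then one more alphanumeric run.
$prefixPattern = '([a-z0-9]+-){2}([a-z0-9]+)'

# -match ignores case by default, so [a-z] also matches upper-case letters.
if ($Str1 -match $prefixPattern) {
    $prefix = $matches[0]
}
$prefix             # RUB-FOK-P01

# -cmatch is the case-sensitive variant; the same pattern finds nothing here.
$caseSensitiveHit = $Str1 -cmatch $prefixPattern
$caseSensitiveHit   # False
```

If the real data can contain lower-case letters only in some positions, spelling the class out as [A-Za-z0-9] would make the intent independent of which operator is used.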

The middle section - which I assume is always really ( ) surrounding pure
numbers - would be done like this, with \ to tell .NET the parentheses are
just parentheses symbols, not for grouping.
\([0-9]+\)
The final sequence looks like it is always composed of 3 elements of an
underscore followed by letters/numbers/decimal point. That would be
described like this:
(_[a-z0-9.]+){3}

Putting the whole thing together, you have the following pattern:

$pattern = "([a-z0-9]+-){2}[a-z0-9]+\([0-9]+\)(_[a-z0-9.]+){3}"

and you can look for it like this:

$str1 -match $pattern; $matches[0]
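
To try the assembled pattern against both sample strings, something like the following sketch could be used. It assumes each string sits on one line, as stated in the question; Get-TargetName is a hypothetical helper written for this example, not a built-in cmdlet.

```powershell
# Sample lines from the question, each joined onto a single line.
$Str1 = ' 20% (1%) 0% ( 0%) Apr 20 19:00 RUB-FOK-P01(0118054600)_vr_SCORT1_2.21 (somethingthatmayormaynotbehere)'
$Str2 = ' 20% ( 0%) 0% ( 0%) Apr 20 18:57 RUB-FOK-P01(0118054600)_vr_SCORT1_2.22'

# Single quotes keep PowerShell from interpreting anything inside the pattern.
$pattern = '([a-z0-9]+-){2}[a-z0-9]+\([0-9]+\)(_[a-z0-9.]+){3}'

# Hypothetical helper: emits the matched text, or nothing when there is no match.
function Get-TargetName {
    param([string]$Line, [string]$Pattern)
    if ($Line -match $Pattern) { $matches[0] }
}

$var1 = Get-TargetName $Str1 $pattern
$var2 = Get-TargetName $Str2 $pattern
$missing = Get-TargetName 'no target on this line' $pattern

$var1   # RUB-FOK-P01(0118054600)_vr_SCORT1_2.21
$var2   # RUB-FOK-P01(0118054600)_vr_SCORT1_2.22
```

Wrapping -match in an if keeps the True/False result out of the output and avoids reading a stale $matches value left over from an earlier successful match.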


References on regular expressions:

There are several books out purely on the topic of regular expressions,
usually related to Perl. There is one VERY important thing to remember:
REGULAR EXPRESSION SYNTAX IS NOT UNIVERSAL. There are some common conventions,
but specific tools or languages have their own idiosyncrasies, and there is no
single standard that all of them follow. There's a POSIX syntax which is used
by many Unixy tools, but (like most other convenient regex engines) the .NET
framework uses Perl-derived regex syntax.

Anyway, Perl references on regular expressions will probably be useful. For
specifics of what is or is not correct (and also for some decent examples)
you will want to see the .NET regex docs as well. Try this link as a
starter:

http://msdn.microsoft.com/en-us/library/hs600312.aspx

If you'd like to get some context for regular expressions, you may want to
try the Wikipedia page as well, and just skip over the bits that make your
eyes roll back into your head (I did):

http://en.wikipedia.org/wiki/Regular_Expression

Old 04-21-2009   #5 (permalink)
Alex K. Angelopoulos


 
 

Re: Regex help please?

That's much simpler. From what I can see, the only special assumption this
makes is that the target must be the first element containing letters in the
line?

"Robert Robelo" <Kiron@xxxxxx> wrote in message
news3BD08C3-C6BD-4092-8791-27BD336DC189@xxxxxx
Quote:

> # the RegEx can be even simpler:
> $pat = '^.+\s+(\w\S+).*$'
> $var1, $var2 = $Str1, $Str2 -replace $pat,'$1'
> $var1
> $var2
>
> --
> Robert
Old 04-22-2009   #6 (permalink)
Robert Robelo


 
 

Re: Regex help please?

> the only special assumption this makes is that the target must be the first element containing letters in the line?

'^.+\s+(\w\S+).*$'

No. Because of the anchors, especially the EOS/EOL ($), and because the leading .+ is greedy, the RegEx captures the last substring that is preceded by Whitespace characters (^.+\s+), begins with a Word character (\w) and continues with Non-Whitespace characters (\S+). The capture ends either just before the next Whitespace character, with the rest of the line consumed by .* (in $Str1 this makes it the next-to-last element), or at EOS/EOL ($) (in $Str2 it is the last element).
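
A small sketch of that behaviour, using shortened, made-up lines rather than the real data (the variable names are only for illustration):

```powershell
$pat = '^.+\s+(\w\S+).*$'

# Trailing element starts with "(", which is not a Word character, so the
# greedy .+ backs off to the element before it.
$withParenTail = 'Apr 20 19:00 RUB-FOK-P01 (optional)' -replace $pat, '$1'

# Trailing element starts with a letter: it is now the last qualifying
# substring, so it is captured instead of the intended target.
$withWordTail = 'Apr 20 19:00 RUB-FOK-P01 optional' -replace $pat, '$1'

# No whitespace at all: the pattern cannot match, and -replace returns the
# input unchanged.
$noMatch = 'RUB-FOK-P01' -replace $pat, '$1'

$withParenTail   # RUB-FOK-P01
$withWordTail    # optional
$noMatch         # RUB-FOK-P01
```

So the pattern relies on any trailing element beginning with a non-Word character such as "(". If that cannot be guaranteed, a pattern that describes the target itself is the safer choice.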

--
Robert