I'm pretty sure I found a bug in Powershell (get-content -readcount)

Old 12-18-2007   #1 (permalink)
Jacob Saaby Nielsen

Hey all,

I created this script today to trawl SMTP log files (had to do
one myself, Shay, gotta learn! I'll also try yours though).

The problem is that I'm getting wildly varying results when using the
-readcount parameter of Get-Content. I just couldn't figure out why
some e-mail addresses didn't show up in my list, since I knew that my
regular expressions were correct (I loaded the file into RegexBuddy and
tested my regexes on it, and they hit that exact same e-mail address).

So I tested it out a bit. I'll first post my script, so you can check
whether the script itself is at fault, and then the results I got from
each test.

Script

#region Script Settings
#<ScriptSettings xmlns="http://tempuri.org/ScriptSettings.xsd">
# <ScriptPackager>
# <process>powershell.exe</process>
# <arguments />
# <extractdir>%TEMP%</extractdir>
# <files />
# <usedefaulticon>true</usedefaulticon>
# <showinsystray>false</showinsystray>
# <altcreds>false</altcreds>
# <efs>true</efs>
# <ntfs>true</ntfs>
# <local>false</local>
# <abortonfail>true</abortonfail>
# <product />
# <version>1.0.0.1</version>
# <versionstring />
# <comments />
# <includeinterpreter>false</includeinterpreter>
# <forcecomregistration>false</forcecomregistration>
# <consolemode>false</consolemode>
# <EnableChangelog>false</EnableChangelog>
# <AutoBackup>false</AutoBackup>
# </ScriptPackager>
#</ScriptSettings>
#endregion

# ================================================================================================================
#
# Script Information
#
# Title: get-smtperrors.ps1
# Author: Jacob Saaby Nielsen, jsy@xxxxxx
# Originally created: 18-12-2007 - 09:21:50
# Original path: C:\Documents and Settings\jsy\My Documents\AdminScriptEditor\get-smtperrors.ps1
# Description: Gets all e-mails from smtp log lines with a 5xx status code, and the reason for the 5xx.
#
# ================================================================================================================

function isSMTPErr([string]$LogString)
{
    $Regex = [regex] "\s5\d{2}\+\d{1}.\d{1}.\d{1}\+|\s5\d{2}\+"

    if ($Regex.IsMatch($LogString) -eq $true)
    { return $true }
    else
    { return $false }
}

function get-SMTPErr([string]$LogString)
{
    $Regex = [regex] "\s5\d{2}\+"
    $Match = $Regex.Match($LogString)
    return ($Match.Value).Substring(1,3)
}

function get-StatusCode([string]$LogString)
{
    $Regex = [regex] "\+\d{1}.\d{1}.\d{1}\+"
    $Match = $Regex.Match($LogString)
    if (($Match.Value).Length -gt 1)
    {
        return ($Match.Value).Substring(1,5)
    }
    else
    { return "" }
}

function get-Email([string]$LogString)
{
    $Regex = [regex] "([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})"
    $Match = $Regex.Match($LogString)
    return $Match.Value
}

[string]$ResultsDir = "c:\"
[string]$ResultsFile = (Get-Date -format ddMMyyyy) + "_" + (Get-Date -format HHmm) + ".csv"
[string]$SMTPLogDir = "c:\test"

$userArray = @{ }

Write-Host "Counting how many logfiles are available..."

$TotalNumberOfFiles = (Get-ChildItem $SMTPLogDir -recurse | where {!$_.psiscontainer}).length
$FileCounter = 0

foreach ($SMTPlog in Get-ChildItem $SMTPLogDir -recurse)
{
    if (!$SMTPlog.PSIsContainer)
    {
        $status = "Processing file {0} of {1}: {2}" -f $FileCounter, $TotalNumberOfFiles, $SMTPlog.Fullname
        Write-Progress $status -PercentComplete ((100 / $TotalNumberOfFiles) * $FileCounter) -Activity "Processing logfiles" -ID 1

        foreach ($logline in Get-Content $SMTPlog.Fullname -readcount 1)
        {
            if ((isSMTPErr $logline) -eq $true)
            {
                $userArray[(get-Email $logline)] = @{ SMTPErr = get-SMTPErr($logline); SMTPStatus = get-Statuscode($logline); SMTPlogFile = $SMTPlog.Fullname }
            }
        }

        $FileCounter++
    }

    foreach ($email in $userArray.Keys | Sort-Object)
    {
        Add-Content -path ($ResultsDir + $ResultsFile) -value ($email + "," + $userArray[$email].SMTPErr + "," + $userArray[$email].SMTPStatus + "," + $userArray[$email].SMTPlogFile) -ErrorAction continue
    }

    $userArray = @{ }
}

Readcount Results

The value is what I set -readcount to; the result is how many rows ended
up in my .csv file. Value 0 means no -readcount parameter at all.

Value 0, Results 320
Value 1, Results 320
Value 10, Results 2492
Value 50, Results 1435
Value 100, Results 845
Value 200, Results 475
Value 500, Results 230
Value 1000, Results 142
Value 5000, Results 53
2nd Try Value 0, Results 320
3rd Try Value 0, Results 320

All tests were run on the same set of files, which were NOT altered at any
point (I copied the SMTP logs to a local folder on my machine and used the
local copies for testing while creating my script).

I would also like to point out that I ran the tests with the EXACT same
version of the script; the only thing I changed was the number of objects
read by -readcount!

I still have the script as seen here, along with the unaltered set of test
files and the set of result files.

Does anyone have any input on the issue? Should I file a bug report with
Microsoft, and if so, where do I do that?

Best Regards,
Jacob Saaby Nielsen
mailto:jacob.saaby@xxxxxx



Old 12-18-2007   #2 (permalink)
Marco Shaw [MVP]


Quote:

> The value is the one I set -readcount to, result is how many rows were
> in my .csv file. Value = 0 means no readcount parameter at all.
>
> Value 0, Results 320
> Value 1, Results 320
> Value 10, Results 2492
> Value 50, Results 1435
> Value 100, Results 845
> Value 200, Results 475
> Value 500, Results 230
> Value 1000, Results 142
> Value 5000, Results 53
> 2nd Try Value 0, Results 320
> 3rd Try Value 0, Results 320
I may need some data to test against, but I tried something basic
with the v2 CTP:

v2 CTP>(gc 100.tmp).count
101
v2 CTP>(gc -readcount 0 100.tmp).count
101
v2 CTP>(gc -readcount 5 100.tmp).count
21
v2 CTP>(gc -readcount 10 100.tmp).count
11
v2 CTP>(gc -readcount 99 100.tmp).count
2
v2 CTP>(gc -readcount 101 100.tmp).count
101
v2 CTP>(gc -readcount 100 100.tmp).count
2
v2 CTP>(gc -readcount 101 100.tmp).count
101
v2 CTP>(gc -readcount 102 100.tmp).count
101
v2 CTP>(gc -readcount 999 100.tmp).count
101
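
The counts in the transcript above can be reproduced without the original logs. The following is a minimal sketch, assuming a throwaway 101-line file in the temp directory (the file name and contents are made up for illustration). It suggests that with -ReadCount the count reflects the number of chunks sent down the pipeline rather than the number of lines, and that no lines are lost once the chunks are flattened again.

```powershell
# Hypothetical scratch file; the name and contents are assumptions for illustration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-demo.tmp'
1..101 | ForEach-Object { "line $_" } | Set-Content -Path $path

# Default: one string per line, so the count is the number of lines.
$perLine = @(Get-Content -Path $path).Count

# -ReadCount 10: each pipeline object is an array of up to 10 lines,
# so the count is the number of chunks, not the number of lines.
$chunksOf10 = @(Get-Content -Path $path -ReadCount 10).Count

# Emitting each chunk from ForEach-Object enumerates it back into single lines.
$flattened = @(Get-Content -Path $path -ReadCount 10 | ForEach-Object { $_ }).Count

"{0} lines, {1} chunks of 10, {2} lines after flattening" -f $perLine, $chunksOf10, $flattened

Remove-Item -Path $path
```

The chunk count for a 101-line file with -ReadCount 10 should match the 11 shown in the transcript.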

I'm assuming you're running PSH v1...

I don't know if you're willing to zip it all up and send it to me or
anyone else running v2 (marco DOT shaw AT gmail).

For feedback/bugs:
https://connect.microsoft.com/site/s...aspx?SiteID=99

--
Microsoft MVP - Windows PowerShell
http://www.microsoft.com/mvp

PowerGadgets MVP
http://www.powergadgets.com/mvp

Blog:
http://marcoshaw.blogspot.com
Old 12-18-2007   #3 (permalink)
Jacob Saaby Nielsen


Hey Marco,

I can probably convince my superiors to let me send the logfiles to Microsoft,
but I'm pretty sure I'm not
allowed to send them to anyone else.

However, if you grab X number of standard smtp logfiles from Exchange, and
run the test the same way I
did, that should do equally well (and I'm betting you'll be getting varying
results too).

Yup, I'm running PowerShell v1

I hear my boss on the phone outside my office, I'll ask him how to proceed.

Best Regards,
Jacob Saaby Nielsen
mailto:jacob.saaby@xxxxxx

Old 12-18-2007   #4 (permalink)
Shay Levi


-readcount controls how many lines of content are sent through the pipeline
at a time.


I read a file 60 lines long:

PS > (gc ExBackup.ps1).count
60

If I do:
PS > (gc ExBackup.ps1 -ReadCount 10).count
6

which means it sends 10 rows to the pipeline 6 times.

PS > (gc ExBackup.ps1 -ReadCount 2).count
30

So the resulting count varies with the number of lines in the file and the -ReadCount value.
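
One plausible way this chunking could produce the varying row counts in the original script (not confirmed anywhere in this thread) is that each chunk is an array, and passing an array to a function with a [string] parameter joins all of its lines into one string. The sketch below uses made-up log lines and a hypothetical helper, Show-LogString, to illustrate that assumption and a per-line loop that avoids it.

```powershell
# Hypothetical helper with a [string] parameter, like the functions in the script above.
function Show-LogString([string]$LogString) { $LogString }

# Made-up stand-in for one chunk emitted by Get-Content -ReadCount 3.
$chunk = @(
    'a@example.com 250+OK',
    'b@example.com 550+5.1.1+User+unknown',
    'c@example.com 250+OK'
)

# The whole array is converted to a single string; the elements are joined
# with a space (the default $OFS).
$joined = Show-LogString $chunk

# A 5xx test on the joined string succeeds, but the first address found
# belongs to a different line than the 5xx status.
$has5xx = $joined -match '\s5\d{2}\+'
$firstAddress = [regex]::Match($joined, '\S+@\S+').Value

# Looping over the chunk hands each line to the function on its own.
$errorLines = foreach ($line in $chunk) {
    if ($line -match '\s5\d{2}\+') { Show-LogString $line }
}

"5xx found: $has5xx; first address in joined chunk: $firstAddress"
"5xx lines when processed one by one: $errorLines"
```

If this is what happens, a script that wants the speed of a larger -ReadCount would need an inner loop over each chunk before applying per-line regexes.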



Also, this:
[string]$ResultsFile = (Get-Date -format ddMMyyyy) + "_" + ( Get-Date -format HHmm ) + ".csv"

can be shortened to:
[string]$ResultsFile = (Get-Date -format "ddMMyyyy_HHmm") +".csv"


As Marco said, it's hard to say without the real files.

-----
Shay Levi
$cript Fanatic
http://scriptolog.blogspot.com
Hebrew weblog: http://blogs.microsoft.co.il/blogs/scriptfanatic



Old 12-18-2007   #5 (permalink)
Jacob Saaby Nielsen

Hey Marco,

the Connect URL doesn't let me do anything but post feedback. No "bug submission"
that I can see.
Where do I go when I log into Connect ?

I went to the Windows PowerShell program, but I can only submit 2.0 CTP feedback
there.

Best Regards,
Jacob Saaby Nielsen
mailto:jacob.saaby@xxxxxx

Old 12-18-2007   #6 (permalink)
Marco Shaw [MVP]

Jacob Saaby Nielsen wrote:
Quote:

> Hey Marco,
>
> I can probably convince my superiors to let me send the logfiles to
> Microsoft, but I'm pretty sure I'm not
> allowed to send them to anyone else.
>
> However, if you grab X number of standard smtp logfiles from Exchange,
> and run the test the same way I
> did, that should do equally well (and I'm betting you'll be getting
> varying results too).
I understand. I can get access to my own, or just build them randomly.

Can you provide a sample filename, and sample format? How much data are
you running this against?

X files? X directories? X MBs?

Marco


Old 12-18-2007   #7 (permalink)
Jacob Saaby Nielsen

Hey Marco Shaw [MVP],

Sample filename: ex07121714.log (we run hourly logs)
1 folder containing 37 logfiles, 19.2 MB in total.

Best Regards,
Jacob Saaby Nielsen
mailto:jacob.saaby@xxxxxx

Quote:

> Can you provide a sample filename, and sample format? How much data
> are you running this against?
>
> X files? X directories? X MBs?
>
> Marco
>

Old 12-18-2007   #8 (permalink)
Jacob Saaby Nielsen

Hey Shay,

as I understand it, -readcount lets me read X lines at a time, which
dramatically speeds up the reading of e.g. logfiles.

No matter what I as a scripter set it to, PowerShell should handle that
itself. I basically don't care how many reads it does of the file. All I'm
interested in is reading the logfile as fast as possible, and PS needs to
make sure that the data flowing through is treated consistently.

My logfiles are standard Exchange SMTP logfiles, no hocus-pocus. I'll even
tell you which fields I log, if you'd like me to.

But we have defense contracts etc., so I can't just send the original
logfiles to anyone. Anyway, my immediate superior will ask his superior
whether we can send them to Microsoft directly, provided they can guarantee
us complete secrecy regarding the email addresses in the logfiles.

Best Regards,
Jacob Saaby Nielsen
mailto:jacob.saaby@xxxxxx

Old 12-18-2007   #9 (permalink)
Jacob Saaby Nielsen

Hey Shay,

thanks a lot for the tip

Best Regards,
Jacob Saaby Nielsen
mailto:jacob.saaby@xxxxxx
Quote:

> Also, this:
> [string]$ResultsFile = (Get-Date -format ddMMyyyy) + "_" + ( Get-Date -format HHmm ) + ".csv"
> can be shortened to:
> [string]$ResultsFile = (Get-Date -format "ddMMyyyy_HHmm") +".csv"

Old 12-18-2007   #10 (permalink)
Keith Hill [MVP]

"Jacob Saaby Nielsen" <jacob.saaby@xxxxxx> wrote in message
news:97b6e7b81f828ca0fb12dbb7f92@xxxxxx
Quote:

> Hey Shay,
>
> as I understand it, -readcount lets me read X number of lines at a time,
> which dramatically speeds up the
> reading of e.g. logfiles.
>
> No matter what I as a scripter set it to, PowerShell itself should control
> that itself. I'm basically don't care
> how many reads it does of the file. All I'm interested in, is reading the
> logfile as fast as possible, and PS
> needs to control that the data that flows through, is treated
> consistently.
Just an FYI: I did a write-up on this and found that on my PC, when reading
a 75 MB log file, a readcount of 1000 was optimal, but only slightly better
than a readcount of 100.

http://keithhill.spaces.live.com/blo...3A97!756.entry
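
A comparison like this can be tried locally. The following is a rough sketch, not the method from the write-up linked above: it assumes a generated scratch file (name, size and contents are made up), times a few -ReadCount values with a Stopwatch, and uses [pscustomobject], which needs a newer PowerShell than the v1 discussed in this thread. Timings will differ from machine to machine, so none are shown here.

```powershell
# Hypothetical scratch file standing in for a real log; size and contents are assumptions.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-timing.tmp'
1..20000 | ForEach-Object { "2007-12-18 10:00:00 line $_" } | Set-Content -Path $path

$timings = foreach ($readCount in 1, 100, 1000) {
    $lines = 0
    $stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

    Get-Content -Path $path -ReadCount $readCount | ForEach-Object {
        # $_ is a chunk (array of lines) when -ReadCount is greater than 1,
        # so walk it line by line to keep the per-line work identical.
        foreach ($line in $_) { $lines++ }
    }

    $stopwatch.Stop()
    [pscustomobject]@{
        ReadCount    = $readCount
        Lines        = $lines
        Milliseconds = $stopwatch.ElapsedMilliseconds
    }
}

$timings | Format-Table -AutoSize

Remove-Item -Path $path
```

The Lines column should be the same for every -ReadCount value; only the elapsed time is expected to change.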

--
Keith
