Welcome to Vista Forums, your forum for discussing Windows Vista x64 and x86 systems. Whether you need help or just want to post an idea you have about Vista, this is the forum for you.
| | #1 (permalink) |
| Guest | I'm pretty sure I found a bug in Powershell (get-content -readcount)

Hey all,

I created this script today to trawl SMTP log files (had to do one myself, Shay, gotta learn! I'll also try yours, though).

The problem is that I'm getting wildly varying results when using the -readcount parameter of get-content. I just couldn't figure out why some e-mail addresses didn't show up in my list, since I knew my regular expressions were correct (I loaded the file into RegexBuddy and tested my regexes on it, and they hit that exact same e-mail address). So I tested it out a bit...

I'll first post my script, so you can check whether the script itself is at fault, then the results I got from each test.

Script

    #region Script Settings
    #<ScriptSettings xmlns="http://tempuri.org/ScriptSettings.xsd">
    #  <ScriptPackager>
    #    <process>powershell.exe</process>
    #    <arguments />
    #    <extractdir>%TEMP%</extractdir>
    #    <files />
    #    <usedefaulticon>true</usedefaulticon>
    #    <showinsystray>false</showinsystray>
    #    <altcreds>false</altcreds>
    #    <efs>true</efs>
    #    <ntfs>true</ntfs>
    #    <local>false</local>
    #    <abortonfail>true</abortonfail>
    #    <product />
    #    <version>1.0.0.1</version>
    #    <versionstring />
    #    <comments />
    #    <includeinterpreter>false</includeinterpreter>
    #    <forcecomregistration>false</forcecomregistration>
    #    <consolemode>false</consolemode>
    #    <EnableChangelog>false</EnableChangelog>
    #    <AutoBackup>false</AutoBackup>
    #  </ScriptPackager>
    #</ScriptSettings>
    #endregion

    # ================================================================================================================
    #
    # Script Information
    #
    # Title: get-smtperrors.ps1
    # Author: Jacob Saaby Nielsen, jsy@xxxxxx
    # Originally created: 18-12-2007 - 09:21:50
    # Original path: C:\Documents and Settings\jsy\My Documents\AdminScriptEditor\get-smtperrors.ps1
    # Description: Gets all e-mails from smtp log lines with a 5xx status code, and the reason for the 5xx.
    #
    # ================================================================================================================

    function isSMTPErr([string]$LogString)
    {
        $Regex = [regex] "\s5\d{2}\+\d{1}.\d{1}.\d{1}\+|\s5\d{2}\+"
        if ($Regex.Ismatch($LogString) -eq $true) { return $true } else { return $false }
    }

    function get-SMTPErr([string]$LogString)
    {
        $Regex = [regex] "\s5\d{2}\+"
        $Match = $Regex.Match($LogString)
        return ($Match.Value).Substring(1,3)
    }

    function get-StatusCode([string]$LogString)
    {
        $Regex = [regex] "\+\d{1}.\d{1}.\d{1}\+"
        $Match = $Regex.Match($LogString)
        if (($match.value).length -gt 1) { return ($Match.Value).Substring(1,5) } else { return "" }
    }

    function get-Email([string]$LogString)
    {
        $Regex = [regex] "([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})"
        $Match = $Regex.Match($LogString)
        return $Match.Value
    }

    [string]$ResultsDir = "c:\"
    [string]$ResultsFile = (Get-Date -format ddMMyyyy) + "_" + ( Get-Date -format HHmm ) + ".csv"
    [string]$SMTPLogDir = "c:\test"
    $userArray = @{ }

    Write-Host "Counting how many logfiles are available..."
    $TotalNumberOfFiles = (Get-ChildItem $SMTPLogDir -recurse | where {!$_.psiscontainer }).length
    $FileCounter = 0

    foreach ($SMTPlog in Get-ChildItem $SMTPLogDir -recurse)
    {
        if (!$SMTPlog.PSIsContainer)
        {
            $status = "Processing file {0} of {1}: {2}" -f $FileCounter, $TotalNumberOfFiles, $SMTPlog.Fullname
            Write-Progress $status -PercentComplete ((100 / $TotalNumberOfFiles) * $FileCounter ) -Activity "Processing logfiles" -ID 1

            foreach ($logline in Get-Content $SMTPlog.Fullname -readcount 1)
            {
                if ((isSMTPErr $logline) -eq $true)
                {
                    $userArray[( get-Email $logline )] = @{ SMTPErr = get-SMTPErr($logline); SMTPStatus = get-Statuscode($logline); SMTPlogFile = $SMTPlog.Fullname}
                }
            }
            $FileCounter++
        }

        foreach ($email in $userArray.Keys | Sort-Object)
        {
            Add-Content -path($ResultsDir + $ResultsFile) -value($email + "," + $userArray[$email].SMTPErr + "," + $userArray[$email].SMTPStatus + "," + $userArray[$email].SMTPlogFile) -ErrorAction continue
        }
        $userArray = @{ }
    }

Readcount Results

"Value" is what I set -readcount to; "Results" is how many rows ended up in my .csv file. Value = 0 means no readcount parameter at all.

    Value 0, Results 320
    Value 1, Results 320
    Value 10, Results 2492
    Value 50, Results 1435
    Value 100, Results 845
    Value 200, Results 475
    Value 500, Results 230
    Value 1000, Results 142
    Value 5000, Results 53

    2nd Try
    Value 0, Results 320

    3rd Try
    Value 0, Results 320

All tests were done on the same set of files, which were NOT altered at any point in time (I copied the SMTP logs to a local folder on my machine, and used the local copies to test on while creating my script).

I would also like to point out that I did my tests with the EXACT same version of the script; the only thing I changed was the number of objects read by -readcount! So, no script changes except for the -readcount value.

I still have the script as seen here, along with the unaltered set of test files and the set of result files.

Anyone with any input on the issue? Should I file a bug report with Microsoft, and if yes, where do I do that?

Best Regards,
Jacob Saaby Nielsen
mailto:jacob.saaby@xxxxxx
| | #2 (permalink) | ||||||||||||
| Guest | Re: I'm pretty sure I found a bug in Powershell (get-content -readcount)
With the v2 CTP:

    v2 CTP>(gc 100.tmp).count
    101
    v2 CTP>(gc -readcount 0 100.tmp).count
    101
    v2 CTP>(gc -readcount 5 100.tmp).count
    21
    v2 CTP>(gc -readcount 10 100.tmp).count
    11
    v2 CTP>(gc -readcount 99 100.tmp).count
    2
    v2 CTP>(gc -readcount 101 100.tmp).count
    101
    v2 CTP>(gc -readcount 100 100.tmp).count
    2
    v2 CTP>(gc -readcount 101 100.tmp).count
    101
    v2 CTP>(gc -readcount 102 100.tmp).count
    101
    v2 CTP>(gc -readcount 999 100.tmp).count
    101

I'm assuming you're running PSH v1... I don't know if you're willing to zip it all up and send it to me or anyone else running v2 (marco DOT shaw AT gmail).

For feedback/bugs: https://connect.microsoft.com/site/s...aspx?SiteID=99

--
Microsoft MVP - Windows PowerShell
http://www.microsoft.com/mvp
PowerGadgets MVP
http://www.powergadgets.com/mvp
Blog: http://marcoshaw.blogspot.com
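The counts in this transcript look consistent with `.count` reporting the number of chunks, which would be the line count divided by the -readcount value, rounded up. The sketch below only redoes that arithmetic; the figure of 101 lines is taken from the first command's output, and the variable names are made up for illustration. For values of 101 and above there is presumably just one chunk, so the parenthesized expression hands back the line array itself and `.count` shows 101 again.

```powershell
# Assumption: 100.tmp has 101 lines, as reported by (gc 100.tmp).count above.
$totalLines = 101

# Hypothetical helper table: -readcount value -> expected number of chunks.
$expectedChunks = @{}
foreach ($readCount in 5, 10, 99, 100) {
    # Each chunk holds up to $readCount lines; the last chunk may be shorter.
    $expectedChunks[$readCount] = [int][math]::Ceiling($totalLines / $readCount)
    "ReadCount {0,3} -> {1} chunks" -f $readCount, $expectedChunks[$readCount]
}
```

The values this produces for 5, 10, 99 and 100 can be compared directly with the 21, 11, 2 and 2 shown in the transcript.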
| | #3 (permalink) | ||||||||||||||||||||||||
| Guest | Re: I'm pretty sure I found a bug in Powershell (get-content -readcount)

Hey Marco,

I can probably convince my superiors to let me send the logfiles to Microsoft, but I'm pretty sure I'm not allowed to send them to anyone else.

However, if you grab X number of standard smtp logfiles from Exchange, and run the test the same way I did, that should do equally well (and I'm betting you'll be getting varying results too).

Yup, I'm running PowerShell v1.

I hear my boss on the phone outside my office, I'll ask him how to proceed.

Best Regards,
Jacob Saaby Nielsen
mailto:jacob.saaby@xxxxxx
| | #4 (permalink) | ||||||||||||
| Guest | Re: I'm pretty sure I found a bug in Powershell (get-content -readcount)

-readcount controls how many lines of content are sent through the pipeline at a time.

I read a file 60 lines long:

    PS > (gc ExBackup.ps1).count
    60

If I do:

    PS > (gc ExBackup.ps1 -ReadCount 10).count
    6

which means it sends 10 rows to the pipeline 6 times.

    #PS > (gc ExBackup.ps1 -ReadCount 2).count
    #30

So the count result may vary depending on the number of lines in the file.

Also, this:

    [string]$ResultsFile = (Get-Date -format ddMMyyyy) + "_" + ( Get-Date -format HHmm ) + ".csv"

can be shortened to:

    [string]$ResultsFile = (Get-Date -format "ddMMyyyy_HHmm") +".csv"

As Marco said, it's hard to say without the real files.

-----
Shay Levi
$cript Fanatic
http://scriptolog.blogspot.com
Hebrew weblog: http://blogs.microsoft.co.il/blogs/scriptfanatic
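To make the chunking described here concrete, the sketch below writes a throwaway 25-line file and reads it back with -ReadCount 10. The file contents and variable names are invented for illustration. It also shows what happens when a whole chunk is handed to something expecting a single string, as the `[string]$LogString` parameters in the script from post #1 do: the lines are joined into one string. That is a likely explanation for the varying row counts, not a confirmed diagnosis, since the original logfiles were never shared.

```powershell
# Build a small throwaway file so the sketch is self-contained (made-up data).
$tmp = [System.IO.Path]::GetTempFileName()
1..25 | ForEach-Object { "line $_" } | Set-Content -Path $tmp

# With -ReadCount 10, Get-Content emits arrays of up to 10 lines each.
$chunks = @(Get-Content -Path $tmp -ReadCount 10)
"Chunks emitted: " + $chunks.Count

# Casting a chunk to [string] joins its lines with spaces (the default $OFS),
# so a regex Match() on it only ever returns the first hit in the chunk.
$joined = [string]($chunks[0])
$joined

# To process every line, loop over the chunks and then over each chunk's lines.
$lineCount = 0
foreach ($chunk in $chunks) {
    foreach ($line in $chunk) {
        $lineCount++
    }
}
"Lines processed: " + $lineCount

Remove-Item -Path $tmp
```

With the nested loop, the per-line functions keep receiving one line at a time regardless of the -ReadCount value, which is what the original loop with -readcount 1 relies on.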
| | #5 (permalink) | ||||||||||||
| Guest | Re: I'm pretty sure I found a bug in Powershell (get-content -readcount) Hey Marco, the Connect URL doesn't let me do anything but post feedback. No "bug submission" that I can see. Where do I go when I log into Connect ? I went to the Windows PowerShell program, but I can only submit 2.0 CTP feedback there. Best Regards, Jacob Saaby Nielsen mailto:jacob.saaby@xxxxxx
| | #6 (permalink) | ||||||||||||
| Guest | Re: I'm pretty sure I found a bug in Powershell (get-content -readcount) Jacob Saaby Nielsen wrote:
Can you provide a sample filename, and sample format? How much data are you running this against? X files? X directories? X MBs? Marco -- Microsoft MVP - Windows PowerShell http://www.microsoft.com/mvp PowerGadgets MVP http://www.powergadgets.com/mvp Blog: http://marcoshaw.blogspot.com | ||||||||||||
| | #7 (permalink) | ||||||||||||
| Guest | Re: I'm pretty sure I found a bug in Powershell (get-content -readcount)

Hey Marco Shaw [MVP],

Sample filename: ex07121714.log (we run hourly logs)

1 folder containing 37 logfiles, 19.2 MB.

Best Regards,
Jacob Saaby Nielsen
mailto:jacob.saaby@xxxxxx
| | #8 (permalink) | ||||||||||||||||||||||||||||||||||||||||||||||||
| Guest | Re: I'm pretty sure I found a bug in Powershell (get-content -readcount)

Hey Shay,

as I understand it, -readcount lets me read X number of lines at a time, which dramatically speeds up the reading of e.g. logfiles. No matter what I as a scripter set it to, PowerShell should handle that itself. I basically don't care how many reads it does of the file. All I'm interested in is reading the logfile as fast as possible, and PS needs to make sure that the data flowing through is treated consistently.

My logfiles are standard Exchange smtp logfiles, no hocus-pocus. I'll even tell you which fields I log, if you'd like me to. But we have defense contracts etc., so I can't just send the original logfiles to anyone.

Anyway, my closest superior will ask his superior if we can send them to Microsoft directly, if they can guarantee us complete secrecy regarding the email addresses in the logfiles.

Best Regards,
Jacob Saaby Nielsen
mailto:jacob.saaby@xxxxxx
| | #9 (permalink) | ||||||||||||
| Guest | Re: I'm pretty sure I found a bug in Powershell (get-content -readcount)

Hey Shay,

thanks a lot for the tip.

Best Regards,
Jacob Saaby Nielsen
mailto:jacob.saaby@xxxxxx
| | #10 (permalink) | ||||||||||||
| Guest | Re: I'm pretty sure I found a bug in Powershell (get-content -readcount) "Jacob Saaby Nielsen" <jacob.saaby@xxxxxx> wrote in message news:97b6e7b81f828ca0fb12dbb7f92@xxxxxx
MB log file that a readcount of 1000 is optimal but only slightly better than a read count of 100. http://keithhill.spaces.live.com/blo...3A97!756.entry -- Keith | ||||||||||||
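A comparison like the one mentioned here could be set up roughly as follows. This is only a sketch: it uses a generated 20,000-line temp file rather than a real log, the variable names are invented, and it assumes PowerShell v2 or later for `New-Object -Property`. Timings will differ from machine to machine, so no particular result is implied.

```powershell
# Generate a throwaway file to read (made-up content, not a real SMTP log).
$tmp = [System.IO.Path]::GetTempFileName()
1..20000 | ForEach-Object { "sample log line $_" } | Set-Content -Path $tmp

# Time a full pass over the file for a few -ReadCount values.
$timings = foreach ($readCount in 1, 100, 1000) {
    $elapsed = Measure-Command {
        Get-Content -Path $tmp -ReadCount $readCount | ForEach-Object {
            # $_ is a single line for -ReadCount 1 and an array of lines otherwise;
            # foreach handles both cases.
            foreach ($line in $_) { $null = $line.Length }
        }
    }
    New-Object PSObject -Property @{
        ReadCount    = $readCount
        Milliseconds = $elapsed.TotalMilliseconds
    }
}

$timings | Format-Table ReadCount, Milliseconds -AutoSize
Remove-Item -Path $tmp
```

Running it several times and on a file of realistic size gives a fairer picture than a single pass.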