#11
Re: Searching large text files

So whenever -readCount is set to a value -gt 1 (because the file's content is very large), and these collections are sent through the pipeline to Where-Object, one could:
a) split each collection passed through and filter them again to get each item that matches the first filter criteria;
b) set -readCount to 1, or
c) avoid Where-Object and filter the collections in a Foreach-Object loop.

--
Kiron
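The three options listed above could look roughly like the following. This is a minimal sketch rather than code from the thread: the temporary file, its six lines, the 'error' pattern and the variable names are all made up for illustration, and -ReadCount 2 stands in for a large value.

```powershell
# Assumption: a tiny temp file stands in for the "very large" text file.
$path = Join-Path ([IO.Path]::GetTempPath()) 'readcount-demo.txt'
'alpha', 'beta error', 'gamma', 'delta error', 'epsilon', 'zeta' |
    Set-Content -Path $path

# a) Let Where-Object pass the matching collections, unroll them,
#    then filter the individual lines again.
$a = @(Get-Content $path -ReadCount 2 |
    Where-Object   { $_ -match 'error' } |
    ForEach-Object { $_ } |
    Where-Object   { $_ -match 'error' })

# b) Read one line at a time, so Where-Object only ever sees single strings.
$b = @(Get-Content $path -ReadCount 1 |
    Where-Object { $_ -match 'error' })

# c) Skip Where-Object: -match applied to an array returns the matching
#    elements, so ForEach-Object can do the filtering itself.
#    @($_) keeps this true even if a chunk arrives as a single string.
$c = @(Get-Content $path -ReadCount 2 |
    ForEach-Object { @($_) -match 'error' })

Remove-Item $path
```

All three variables should end up holding the same matching lines; only the amount of pipeline work differs.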
#12
Re: Searching large text files

"Kiron" <Kiron@discussions.microsoft.com> wrote in message
news:B400D5CC-303C-4DA3-86AA-FFCD62A36FE5@microsoft.com...
> So whenever -readCount is set to a value -gt 1 (because the file's content
> is very large), and these collections are sent through the pipeline to
> Where-Object, one could:
> a) split each collection passed through and filter them again to get each
> item that matches the first filter criteria;
> b) set -readCount to 1, or
> c) avoid Where-Object and filter the collections in a Foreach-Object loop.

You've got it. BTW A and C look good to me, but I would avoid B on large files: a readCount of 1 just kills performance. On a 75 MB text file, on my machine, reading the file line by line and counting lines takes over 3 minutes:

3> measure-command { gc large.txt -read 1 | measure }

Days              : 0
Hours             : 0
Minutes           : 3
Seconds           : 22
Milliseconds      : 574
Ticks             : 2025743467
TotalDays         : 0.00234461049421296
TotalHours        : 0.0562706518611111
TotalMinutes      : 3.37623911166667
TotalSeconds      : 202.5743467
TotalMilliseconds : 202574.3467

While effectively getting the same count info using a -readCount of 1000 takes less than half that time:

5> measure-command { gc large.txt -read 1000 | %{$_} | measure }

Days              : 0
Hours             : 0
Minutes           : 1
Seconds           : 28
Milliseconds      : 501
Ticks             : 885018447
TotalDays         : 0.00102432690625
TotalHours        : 0.02458384575
TotalMinutes      : 1.475030745
TotalSeconds      : 88.5018447
TotalMilliseconds : 88501.8447

FYI I decided to benchmark a range of readCount values, and it seems that for my 75 MB text file a readCount of 1000 was optimal:

ReadCount ElapsedTime
--------- -----------
        1 00:03:07.4161690
       10 00:01:38.5779661
      100 00:01:17.9219998
     1000 00:01:14.9202370
    10000 00:01:22.1434037
   100000 00:01:17.8457756
  1000000 00:01:17.9850525
 10000000 00:01:19.0217524

Here's the script I used to test this:

23> $ht = @{};for ($i = 1; $i -le 10MB; $i *= 10) {
>>     write-progress "Measuring gc -readCount $i" "% Complete" `
>>         -perc ([math]::log10($i)*100/[math]::log10(10MB))
>>     $ts = measure-command { gc large.txt -read $i | %{$_} | measure }
>>     $ht[$i] = $ts
>> }
>>
24> $ht.Keys | sort | select @{n='ReadCount';e={$_}}, @{n='ElapsedTime';e={$ht[$_].ToString()}} | ft -a

And if you have PowerGadgets you can chart this like so:

33> $ht.Keys | sort | select @{n='ReadCount';e={"RC: $_"}}, @{n='ElapsedTime';e={$ht[$_].TotalSeconds}} | out-chart -title 'Optimal ReadCount for 75MB Text File'

--
Keith
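The timings above pipe through `%{$_}` before `measure` so that the lines, not the chunks, get counted. A minimal sketch of that difference, assuming a made-up 2500-line temporary file in place of large.txt (the file name and variable names are illustrative only):

```powershell
# Assumption: a generated 2500-line temp file stands in for large.txt.
$path = Join-Path ([IO.Path]::GetTempPath()) 'readcount-count-demo.txt'
1..2500 | ForEach-Object { "line $_" } | Set-Content -Path $path

# Without unrolling, Measure-Object receives one object per chunk.
$chunks = (Get-Content $path -ReadCount 1000 | Measure-Object).Count

# Unrolling each chunk with ForEach-Object restores a per-line count.
$lines = (Get-Content $path -ReadCount 1000 |
    ForEach-Object { $_ } |
    Measure-Object).Count

"Objects seen without unrolling: $chunks"
"Objects seen after unrolling:   $lines"

Remove-Item $path
```

The unrolling stage costs some time of its own, which is presumably why the benchmark keeps it in every run so the readCount values are compared on equal terms.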
#13
Re: Searching large text files

Thanks for the benchmark script and the PowerGadgets graph, but no PowerGadgets here... not yet, although it is a nice tool.

Definitely, -readCount 1 won't do on large files. I brought this up because I thought something wasn't right.

Thanks again!

--
Kiron