Re: Searching large text files
"Kiron" <Kiron@discussions.microsoft.com> wrote in message
news:O2H3piwrHHA.4364@TK2MSFTNGP04.phx.gbl...
> Now try filtering each object with an If statement inside a Foreach-Object
> scriptblock. Count is constantly 10 as expected.
> Where-Object and Get-Content's -ReadCount <-gt 1> don't get along:
>
Yeah, what I said wasn't quite right. Setting -readcount to something like 5
reads five lines at a time and sends each batch down the pipeline as one
array object. For a 10-line file that means two array objects, each with 5
strings in it:
64> gc test.txt -read 5 | get-typename # get-typename from PSCX
Object[]
Object[]
65> gc test.txt -read 5 | %{$_} | get-typename
String
String
String
String
String
String
String
String
String
String
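If you don't have PSCX installed, the same check can be sketched with GetType(). This is only an illustration: the temp file name is made up, and I'm assuming test.txt holds the ten lines a, ab, ... abcdefghij shown in the output further down.

```powershell
# Hypothetical stand-in for test.txt: ten lines, a .. abcdefghij.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-types.txt'
$letters = 'abcdefghij'
1..10 | ForEach-Object { $letters.Substring(0, $_) } | Set-Content -Path $path

# Each pipeline object is a whole batch of 5 lines, so the type is an array.
$batchTypes = Get-Content $path -ReadCount 5 |
    ForEach-Object { $_.GetType().Name }

# Emitting $_ first unrolls each batch, so the next stage sees single strings.
$lineTypes = Get-Content $path -ReadCount 5 |
    ForEach-Object { $_ } |
    ForEach-Object { $_.GetType().Name }

"Batches seen: $(@($batchTypes).Count) of type $($batchTypes[0])"
"Lines seen:   $(@($lineTypes).Count) of type $($lineTypes[0])"

Remove-Item $path
```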
"Typically" these arrays are handled the same way as if you had sent
the strings one at a time, but not in all cases. In the Foreach-Object
example above, the scriptblock writes the array to the pipeline, which
unrolls it and sends the individual elements. The -like operator works on
an array as well as a scalar:
66> gc test.txt -read 5 | where {$_ -like 'a*'}
a
ab
abc
abcd
abcde
abcdef
abcdefg
abcdefgh
abcdefghi
abcdefghij
or
68> (ql a ab abc abcd) -like "a*" # ql or quote-list from PSCX
a
ab
abc
abcd
Many cmdlets will accept an array of input and then operate on each element
individually. However, in your case what measure-object is counting is the
arrays themselves, because the Where-Object cmdlet just sends the
"original" object (which is an array) on down the pipeline if the expression
evaluates to true. Fortunately, both -like and -match operate on arrays and
return just the elements that match:
2> (ql ab ba cd af) -match '^a'
ab
af
3> (ql ab ba cd af) -like 'a*'
ab
af
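Putting the two together, here is a rough sketch of how the counts differ and one way to get a per-line count while still using -ReadCount. The temp file and its contents are made up for the demo; the idea is simply to let Foreach-Object emit the result of -like instead of letting Where-Object pass the whole batch through.

```powershell
# Hypothetical sample data: 10 lines, 6 of which start with 'a'.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-demo.txt'
'a','ab','b','abc','c','abcd','d','abcde','e','abcdef' | Set-Content -Path $path

# Where-Object forwards the original batch whenever -like returns any match,
# so Measure-Object ends up counting batches rather than matching lines.
$batches = (Get-Content $path -ReadCount 5 |
    Where-Object { $_ -like 'a*' } |
    Measure-Object).Count

# Foreach-Object emits whatever -like returns, i.e. only the matching strings,
# and the pipeline unrolls that result into individual lines.
$lines = (Get-Content $path -ReadCount 5 |
    ForEach-Object { $_ -like 'a*' } |
    Measure-Object).Count

"Where-Object count:   $batches"
"Foreach-Object count: $lines"

Remove-Item $path
```

With this sample data the Foreach-Object count is the number of matching lines, while the Where-Object count is the number of batches that contained at least one match.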
What I'm not seeing is get-content ballooning the memory requirements of
PowerShell. I ran the following command on a 77 MB text file:
84> measure-command { gc large.txt | ?{$_ -match 'dg\s*$'} }
Days : 0
Hours : 0
Minutes : 2
Seconds : 18
Milliseconds : 162
Ticks : 1381622340
TotalDays : 0.00159909993055556
TotalHours : 0.0383783983333333
TotalMinutes : 2.3027039
TotalSeconds : 138.162234
TotalMilliseconds : 138162.234
and PowerShell never gets above ~53 MB of private memory.
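If you want to watch the memory yourself rather than eyeball it in Task Manager, something along these lines should do. It is only a sketch: the small temp file is a stand-in for large.txt, and sampling PrivateMemorySize64 before and after does not capture the peak during the run.

```powershell
# Hypothetical small stand-in for large.txt; every line ends in 'dg'.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'stream-demo.txt'
1..1000 | ForEach-Object { "line $_ dg" } | Set-Content -Path $path

# Private memory of the current PowerShell process before the run.
$before = (Get-Process -Id $PID).PrivateMemorySize64

# Same filter as above; lines are streamed through Where-Object one at a time.
$hits = @(Get-Content $path | Where-Object { $_ -match 'dg\s*$' })

# Private memory after the run.
$after = (Get-Process -Id $PID).PrivateMemorySize64

"{0} matching lines; private memory {1:N1} MB -> {2:N1} MB" -f `
    $hits.Count, ($before / 1MB), ($after / 1MB)

Remove-Item $path
```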
--
Keith