Vista - Re: Handling Large Numbers of Files

Old 12-03-2008   #1 (permalink)
Keith Hill [MVP]


I would consider using a hash table to collect all the files in the archive
dir. Comparing large arrays would be very slow. If you use a hash table
then you can use a very quick key test to see if the file you are processing
exists in the archive dir:

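# First populate the hashtable with all the file names in the archive dir
# (run this from the archive dir, or replace "." with its path).
# Passing FullName avoids Get-Hash resolving a bare file name against the current dir.
$archive = @{}
Get-ChildItem . -Recurse | Where-Object { !$_.PSIsContainer } |
    ForEach-Object { $archive[$_.Name] = Get-Hash $_.FullName }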

Then you have two options for testing. Simple:

if ($archive[$sourceFile.name]) { .... }

or to see if two files are really the same and not just named the same:

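# Compare hash strings on both sides, not a string against the whole hash object
$hash = Get-Hash $sourceFile.FullName
if ($archive[$sourceFile.Name].HashString -eq $hash.HashString) { .... }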

Note that Get-Hash is from the PowerShell Community Extensions.
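
The two tests above can be combined into one pass over the source directory. What follows is only a minimal sketch under some assumptions: the function name Remove-ArchivedDuplicate and its parameters are hypothetical, and the built-in Get-FileHash (PowerShell 4.0 and later) is swapped in for the PSCX Get-Hash so the example has no external dependency. If the same file name occurs in more than one archive subdirectory, the last one found wins.

```powershell
# Hypothetical helper: the name and parameters are illustrative, not from the thread.
function Remove-ArchivedDuplicate {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param(
        [Parameter(Mandatory = $true)][string]$SourceDir,
        [Parameter(Mandatory = $true)][string]$ArchiveDir
    )

    # Build the lookup table once: file name -> full path in the archive.
    $archive = @{}
    Get-ChildItem -LiteralPath $ArchiveDir -Recurse |
        Where-Object { -not $_.PSIsContainer } |
        ForEach-Object { $archive[$_.Name] = $_.FullName }

    # Stream the source files; each lookup is a single key test.
    Get-ChildItem -LiteralPath $SourceDir |
        Where-Object { -not $_.PSIsContainer } |
        ForEach-Object {
            # Not in the archive by name: leave the file alone.
            if (-not $archive.ContainsKey($_.Name)) { return }

            # Assumption: Get-FileHash stands in for the PSCX Get-Hash here.
            $sourceHash  = (Get-FileHash -LiteralPath $_.FullName).Hash
            $archiveHash = (Get-FileHash -LiteralPath $archive[$_.Name]).Hash

            # Delete only when the contents match, and honor -WhatIf.
            if ($sourceHash -eq $archiveHash -and
                $PSCmdlet.ShouldProcess($_.FullName, 'Remove duplicate')) {
                Remove-Item -LiteralPath $_.FullName
            }
        }
}
```

Running it with -WhatIf first (for example, Remove-ArchivedDuplicate -SourceDir C:\logs -ArchiveDir D:\archive -WhatIf, with both paths being placeholders) lists what would be deleted without touching anything. Because the source files are processed through the pipeline, there is no need to collect them into an array first.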

--
Keith
http://www.codeplex.com/powershellcx


"Michael Powe" <michael+gnus@xxxxxx> wrote in message
news:87k5ahgq6d.fsf@xxxxxx

> Hello,
>
> I have a file maintenance task that involves a large number of files.
> I have a target directory which acts as an archive and a source
> directory which contains log files. I want to delete from the source
> directory all the files that already exist in the archive
> directory. "Large number" is about 130,000 files right now in source
> and about 190,000 files in target. After the initial processing, I'll
> be processing probably in the neighborhood of 10,000 files at a time
> in the source -- not so bad. But, of course, archive will keep
> getting bigger for a while yet.
>
> I wrote a simple function that collects the names of the files into
> two arrays and then compares them. Dupes are deleted with
> remove-item. That works but is slower than a herd of turtles. It has
> taken all morning to remove about 4000 duplicates.
>
> I thought it would be much better if I could collect some finite
> number of files at a time, like 1000, process them, collect the next
> 1000 and so forth. But, I don't see how to do that.
>
> I would appreciate any advice on how I can make this process quick
> enough that I won't need another haircut before it completes.
>
> Thanks.
>
> mp
>
> --
> Michael Powe michael@xxxxxx Naugatuck CT USA
>
>
> Q: How do you keep a moron in suspense?

Old 12-04-2008   #2 (permalink)
Flowering Weeds



>
> I have a file maintenance task that involves
> a large number of files.
> I have a target directory which acts as an archive
> and a source directory which contains log files. I
> want to delete from the source directory all the files
> that already exist in the archive directory. "Large
> number" is about 130,000 files right now in source
> and about 190,000 files in target.

Mmm, data parsing the file system
and data parsing files!

Perhaps try Microsoft's IIS (local or remote)
data parser, Log Parser 2.2.

Log Parser runs in almost any Windows process,
even within the Windows admin's automation tool,
(now GUI based) powershell.exe.

For more help, ask any PowerShell user or MVP or
any IIS user or MVP. Log Parser comes from IIS, but
there is no need to have IIS installed in order to use
Log Parser within the Windows admin's automation
tool, powershell.exe. Or just perhaps try:

LogParser.exe -h

Have fun data parsing
within powershell.exe!

