Windows Vista Forums

Vista - Indexing - How it does/should work? - Debate

 
Old 04-20-2007   #1 (permalink)
Julian


 
 

Indexing - How it does/should work? - Debate

I read often hereabouts that the index does not update in real time, to which
the obvious question is: why not?

The indexer receives notifications of events that relate to finding & use
(such as create and delete - I don't know what the whole list is...) so it
seems to me logical that it should cache recent events.

Searches for files should be masked with the contents of the cache: if the main
index says the file exists but the cache says it doesn't then don't show that
result; if a file is moved then redirect the result to the new location using
the cache; etc. etc.; if the file is only in the cache then... well, there's
no conflict with the main index. (NB I do hope that when a file is moved the
Indexer doesn't delete and rebuild for that file - I hope it's clever enough
to save that effort...)
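
The masking idea above can be sketched in a few lines of Python. This is
only an illustration of the proposal, not how the Vista indexer is known to
work: the Event type, the mask_results function and the matches callback
are all hypothetical names, and the "cache" is assumed to be a list of
recent events in the order they happened.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Event:
    """One recent file-system event held in the (hypothetical) cache."""
    kind: str                       # "create", "delete" or "move"
    path: str                       # path the event applies to
    new_path: Optional[str] = None  # destination, only used for "move"


def mask_results(index_hits: List[str],
                 events: List[Event],
                 matches: Callable[[str], bool]) -> List[str]:
    """Overlay recent events (oldest first) on possibly stale index hits."""
    results = list(index_hits)
    for ev in events:
        if ev.kind == "delete":
            # The main index still lists the file, the cache says it is gone.
            results = [p for p in results if p != ev.path]
        elif ev.kind == "move":
            # Redirect the hit to the new location instead of re-indexing.
            results = [ev.new_path if p == ev.path else p for p in results]
        elif ev.kind == "create":
            # Cache-only file: it can be matched on basic attributes only.
            if matches(ev.path) and ev.path not in results:
                results.append(ev.path)
    return results


hits = ["C:/docs/a.txt", "C:/docs/b.txt"]
recent = [
    Event("delete", "C:/docs/a.txt"),
    Event("move", "C:/docs/b.txt", "C:/old/b.txt"),
    Event("create", "C:/docs/c.txt"),
]
print(mask_results(hits, recent, lambda p: p.endswith(".txt")))
```

Here a deleted file drops out, a moved file is redirected, and a brand-new
file shows up on its filename alone; content matches would have to wait for
the main index to catch up.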

As for the cache itself... well, if indexing responds to various events at
some later time, it must be keeping a record of them somewhere.

Now, I can appreciate that users create few files manually, but can delete
very large numbers - and could create very large numbers programmatically.

It would therefore make sense to separate file attributes from content as
far as indexing and searching are concerned - to a certain degree... you could
separate search results for new files based on basic OS attributes only from
searches based on the content of those files (highlight them the way new apps
get highlighted after install?)

No one can reasonably expect an arbitrarily large amount of data to be
indexed by content in an arbitrarily short space of time - but if the file
can be created/deleted/moved and not lost in the process one can reasonably
expect the Search function to know about basic attributes such as the
filename at the very least.

And with regard to indexing removable media, if the medium is r/w why not
store the index on the medium - space/bandwidth permitting. You could even
give the user the choice:... no index, index of filenames only, full index,
etc. especially since at the filename only level it surely wouldn't take too
long compared to the time to update the directory anyway - would it?

[And BTW - since searching for shortcuts produces oddities... what does the
Save Shortcut Properties indexing filter do?]

I'd be interested to know more about how and why it works (gremlins and
their offspring excepted) ... and other users' opinions.

Julian

Old 04-23-2007   #2 (permalink)
Ilia Sacson [MS]


 
 

Re: Indexing - How it does/should work? - Debate

Dear Julian,

Microsoft is not an open source company, yet; thus I won't be able to
satisfy your curiosity in full. Here are some vague and hand-waving answers
though. We do listen to USN journal change notifications, as a matter of
fact we do pretty much everything you've suggested and more, except for
supporting multiple catalogs. The previous incarnation of the indexer (the
one Yellow Dog of XP told you to turn on) could do that, maybe we'll
resurrect that later, or maybe we won't.

The reason why we still do not update stuff in real time is quite simple -
you don't want us to. First thing any XP optimization guide suggests is
turning off CISVC. Indexing is an expensive hobby, since it is very heavy on
disk IO. One of our main objectives is to stay out of the way and let you
get stuff done while the indexer is up and running, so that you wouldn't
turn it off to begin with. We haven't achieved the perfect balance yet, but
don't expect the index to ever update instantaneously.

If you want to index millions of files without significant perf impact, make
sure your data and your catalog (and preferably your swap file) are on
different physical HDDs, use a multicore CPU, 2 GB of RAM or more, and throw
some ReadyBoost in for good measure. Then you can go and set values
starting with DisableBackOff under
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows Search\Gathering Manager to
various interesting numbers, and see the indexer behaving quite differently.
Purely hypothetically speaking, of course...
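
For anyone curious what is under that key before changing anything, a
minimal read-only sketch in Python follows. It uses the standard winreg
module and only lists the values; the function name is made up for this
example, and which values exist or what they mean is not something this
thread documents. On a non-Windows machine, or if the key cannot be opened,
it simply returns an empty dict.

```python
import sys

# Key path as quoted in the post above (relative to HKEY_LOCAL_MACHINE).
GATHERING_MANAGER_KEY = r"SOFTWARE\Microsoft\Windows Search\Gathering Manager"


def read_gathering_manager_values():
    """Return {value_name: data} for the key, or {} if it is unavailable."""
    if sys.platform != "win32":
        return {}  # winreg exists only on Windows
    import winreg

    values = {}
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE,
                            GATHERING_MANAGER_KEY) as key:
            # QueryInfoKey returns (subkey count, value count, last modified).
            value_count = winreg.QueryInfoKey(key)[1]
            for i in range(value_count):
                name, data, _value_type = winreg.EnumValue(key, i)
                values[name] = data
    except OSError:
        # Key missing or not readable by this user.
        return {}
    return values


for name, data in sorted(read_gathering_manager_values().items()):
    print(name, "=", data)
```

Nothing is written back, so running it does not change indexer behaviour.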

Thanks,
Ilia

Old 04-24-2007   #3 (permalink)
Julian


 
 

Re: Indexing - How it does/should work? - Debate

Hi Ilia

Thanks for the reply, [I notice it's only Ms/Mr Briefcase at MS who is
keeping their head down <g> ]... it's extremely good to get even hand-wavings
from the source and it all seemed very sensible... Yes, indexing was one of
the first things to turn off in XP and I didn't miss it.

I don't think I was as clear as I might have been re "instant" update of
indexing - I tried to acknowledge the impracticability of RT content
indexing... I meant to emphasise masking any type of result from the full
index with simple file property information from the "cache" (UNS (Update
Notification Service?) Journal?) , specifically to avoid issues such as "I
deleted that file but the indexer still lists it".

Or was that what you meant when you said "we do most of that...and more" -
on re-read I think perhaps you did...

[I did just create a txt file on the desktop - instantly there from Start
Search, instantly gone when I renamed it, but I have read of other users'
issues and have been puzzling over indexing's operations for a while
recently.]

Whether or not there is a return of multiple indices, indexability of removable
media would be a big plus.

I do have a dual core CPU and 2GB RAM, readyboost soon maybe - but don't
think the laptop will be getting a second disc, so very interesting as the
hypothesis is (thanks for the provocative thoughts!) it won't be tested for a
while.

And finally, emphasising that I'm not being ironic or aggressively critical,
whilst it is unlikely that MS will become Open Source soon (<g> how long
before a rumour starts from a random hit for "MS" "Open Source"), easier
access to functional specifications might be an interesting idea... I bet all
the good stuff gets patented ASAP so the implementations are protected. (Not
meaning to start a long debate about that either: I can already hear it in my
head)

Thank you again... when someone from MS hears the question, the answers are
worth listening to.

(An equally forthcoming response to the more immediate Briefcase issues
would be appreciated... but the deafening silence isn't your fault!)

Julian
Old 04-24-2007   #4 (permalink)
Ilia Sacson [MS]


 
 

Re: Indexing - How it does/should work? - Debate

Hi Julian,
We are doing our best. You can read about the USN Journal here:
http://msdn2.microsoft.com/en-us/library/aa363803.aspx. Deletes not being
processed is a bug, we'll fix it as soon as we can repro it locally (which
we still can't btw) and that will be the end of that. It's not
representative of indexer perf in general. Re open source, check this out:
http://www.microsoft.com/resources/s...g/default.mspx.
Thanks,
Ilia