Vista - Major Performance Bug in DataTable?

03-03-2008   #1
Brian Gideon

Major Performance Bug in DataTable?

I came across this disturbing problem. The code below adds 10000 rows
to a DataTable and then changes the value of one field on each row. It
should be super fast, right? As written it takes 19s to complete on my
machine.

using System;
using System.Data;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        DataTable table = new DataTable();
        table.Columns.Add("Value", typeof(Object));
        table.Columns[0].AllowDBNull = false;

        // Load 10000 rows, each holding a boxed int.
        table.BeginLoadData();
        for (int i = 0; i < 10000; i++)
        {
            table.Rows.Add(i);
        }
        table.EndLoadData();

        // Time overwriting every row with a boxed double.
        Stopwatch sw = new Stopwatch();
        sw.Start();
        foreach (DataRow row in table.Rows)
        {
            row["Value"] = 3.14;
        }
        sw.Stop();
        Console.WriteLine(sw.Elapsed.TotalSeconds);
        Console.ReadLine();
    }
}

The above code is so slow because the row["Value"] = 3.14 line is
generating an exception that is being swallowed.

Now, if you make any one of the following changes it will take less
than 1/100th of a second to complete. You only need to make one of the
3 to see a difference.

1) Comment out table.Columns[0].AllowDBNull = false
2) Comment out table.BeginLoadData and table.EndLoadData
3) Change row["Value"] = 3.14 to row["Value"] = 3

Can someone else confirm this bug? I'm using .NET 2.0.
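
For anyone who wants to try the three changes side by side, here is a minimal sketch of the same test with each change turned into a switch. The names DataTableRepro, BuildTable and TimeUpdate are made up for illustration, and the timings will depend on the machine and the .NET version, so none are shown.

```csharp
using System;
using System.Data;
using System.Diagnostics;

public static class DataTableRepro
{
    // Builds the table from the post. The two flags correspond to
    // change 1 (AllowDBNull) and change 2 (BeginLoadData/EndLoadData).
    public static DataTable BuildTable(int rowCount, bool disallowNull, bool useLoadData)
    {
        DataTable table = new DataTable();
        table.Columns.Add("Value", typeof(object));
        if (disallowNull) table.Columns[0].AllowDBNull = false;

        if (useLoadData) table.BeginLoadData();
        for (int i = 0; i < rowCount; i++)
        {
            table.Rows.Add(i); // each row starts out holding a boxed int
        }
        if (useLoadData) table.EndLoadData();
        return table;
    }

    // Writes newValue into every row and returns the elapsed seconds.
    // Passing 3 instead of 3.14 corresponds to change 3.
    public static double TimeUpdate(DataTable table, object newValue)
    {
        Stopwatch sw = Stopwatch.StartNew();
        foreach (DataRow row in table.Rows)
        {
            row["Value"] = newValue;
        }
        sw.Stop();
        return sw.Elapsed.TotalSeconds;
    }

    public static void Main()
    {
        const int rows = 10000;
        Console.WriteLine("as posted:          {0}", TimeUpdate(BuildTable(rows, true, true), 3.14));
        Console.WriteLine("1) nulls allowed:   {0}", TimeUpdate(BuildTable(rows, false, true), 3.14));
        Console.WriteLine("2) no BeginLoadData: {0}", TimeUpdate(BuildTable(rows, true, false), 3.14));
        Console.WriteLine("3) int stays int:   {0}", TimeUpdate(BuildTable(rows, true, true), 3));
    }
}
```

Each call builds a fresh table, so no configuration can be affected by state left over from a previous run.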

03-03-2008   #2
Cowboy (Gregory A. Beamer)

Re: Major Performance Bug in DataTable?

Why do you not strongly type the field to accept a floating point? That
would likely work much better than the boxing you are doing to use an
object type. You will probably find the error going away with strong
typing, as well.
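
A minimal sketch of what strong typing could look like here, assuming every value in the column can be represented as a double. The class and method names are hypothetical; only the column type differs from the posted code.

```csharp
using System;
using System.Data;

public static class TypedColumnSketch
{
    // Same table as in the original post, but the column is declared
    // as double instead of object.
    public static DataTable BuildTypedTable(int rowCount)
    {
        DataTable table = new DataTable();
        table.Columns.Add("Value", typeof(double));
        table.Columns[0].AllowDBNull = false;

        table.BeginLoadData();
        for (int i = 0; i < rowCount; i++)
        {
            table.Rows.Add((double)i); // store the same type the column declares
        }
        table.EndLoadData();
        return table;
    }

    public static void Main()
    {
        DataTable table = BuildTypedTable(10000);
        foreach (DataRow row in table.Rows)
        {
            row["Value"] = 3.14; // old and new values are both doubles
        }
        Console.WriteLine(table.Rows[0]["Value"]);
    }
}
```

With a typed column the old and new values always share one type, so the update never has to deal with an int and a double side by side.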

--
Gregory A. Beamer
MVP, MCP: +I, SE, SD, DBA

*************************************************
| Think outside the box!
|
*************************************************

03-03-2008   #3
RobinS

Re: Major Performance Bug in DataTable?

Doing

row["Value"]

repeatedly is a performance hit. Get the column ordinal and use that
instead. It will speed up your process.
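
A minimal sketch of the ordinal approach, using a hypothetical helper named SetAll. It resolves the column name once and then indexes each row by position.

```csharp
using System;
using System.Data;

public static class OrdinalLookupSketch
{
    // Looks the column up by name a single time, then uses its
    // ordinal for every row instead of repeating the name lookup.
    public static void SetAll(DataTable table, string columnName, object newValue)
    {
        int ordinal = table.Columns[columnName].Ordinal;
        foreach (DataRow row in table.Rows)
        {
            row[ordinal] = newValue;
        }
    }

    public static void Main()
    {
        DataTable table = new DataTable();
        table.Columns.Add("Value", typeof(int));
        for (int i = 0; i < 10; i++)
        {
            table.Rows.Add(i);
        }

        SetAll(table, "Value", 3);
        Console.WriteLine(table.Rows[0]["Value"]);
    }
}
```

This only removes the per-row name lookup; it does not change which values are stored or how they are compared.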

RobinS.
GoldMail, Inc.

03-04-2008   #4
Brian Gideon

Re: Major Performance Bug in DataTable?

On Mar 3, 6:31 pm, "Cowboy (Gregory A. Beamer)"
<NoSpamMgbwo...@xxxxxx> wrote:
Quote:

> Why do you not strongly type the field to accept a floating point? That
> would likely work much better than the boxing you are doing to use an object
> type. You will probably find the error going away with strong typing, as
> well.
>
> --
> Gregory A. Beamer
> MVP, MCP: +I, SE, SD, DBA
>

The field comes from a special kind of non-relational database that
can be many different types. But, yes, changing it to a strongly
typed field speeds it up again. But then again so does commenting out
BeginLoadData and EndLoadData. That doesn't make any sense, does it?
Think about it this way. Do you think any of those 3 changes I
mentioned should change the performance by a factor of 10^4 (or more)
for a DataTable that only holds 10,000 rows?

03-04-2008   #5
Brian Gideon

Re: Major Performance Bug in DataTable?

On Mar 3, 8:12 pm, "RobinS" <rob...@xxxxxx> wrote:
Quote:

> Doing
>
>    row["Value"]
>
> repeatedly is a performance hit. Get the column ordinal and use that
> instead. It will speed up your process.
>

So referencing a column by name is at least a factor of 10^4 slower
than referencing by ordinal? Why then does making any one of the 3
changes I listed speed it up so much?

03-04-2008   #6
Jon Skeet [C# MVP]

Re: Major Performance Bug in DataTable?

Brian Gideon <briangideon@xxxxxx> wrote:
Quote:

> >    row["Value"]
> >
> > repeatedly is a performance hit. Get the column ordinal and use that
> > instead. It will speed up your process.
>
> So referencing a column by name is at least an order of 10^4 slower
> than referencing by ordinal? Why then does making any one of the 3
> changes I listed speed it up so much?

Getting the column doesn't make any significant difference here.

The problem is that you're changing the type of the value stored. I
haven't looked through the problem in its entirety, but the basic
message is "don't do that". Work out the type of the column to start
with, and make sure you only store that type of data in there. Changing
the data type (when you're storing primitive value types) will be
expensive.

--
Jon Skeet - <skeet@xxxxxx>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
World class .NET training in the UK: http://iterativetraining.co.uk

03-04-2008   #7
Brian Gideon

Re: Major Performance Bug in DataTable?

On Mar 4, 9:45 am, Jon Skeet [C# MVP] <sk...@xxxxxx> wrote:
Quote:

> Getting the column doesn't make any significant difference here.

Exactly.

Quote:

> The problem is that you're changing the type of the value stored. I
> haven't looked through the problem in its entirety, but the basic
> message is "don't do that". Work out the type of the column to start
> with, and make sure you only store that type of data in there.

Changing types is definitely related to the problem. But that's not
the whole story. I can still change the data type during a call to
the row["Value"] setter, and as long as I comment out the BeginLoadData
and EndLoadData calls around the first loop, the second loop will be at
least 10^4 times faster. Likewise, setting AllowDBNull = true speeds up
that row["Value"] setter call...again by a factor of at least 10^4.

Quote:

> Changing
> the data type (when you're storing primitive value types) will be
> expensive.

By a factor of at least 10^4? No way. The difference based on a
10,000 iteration loop should be so imperceptible that even the
high-precision Stopwatch class wouldn't pick it up.

Because I've profiled the code I can see that the row["Value"] setter
is internally generating and swallowing an exception. I can even see
what part of the code it's occurring in using Reflector. Sure
enough, Microsoft has code embedded way down deep that swallows
exceptions from that call in some cases. And again, making
seemingly unrelated calls to the DataTable before the loop somehow
causes the problem to start occurring.

Running the posted code in the debugger is even worse. In fact, I
have no idea how long it takes because I'm too impatient to wait. I'm
assuming that's because the debugger is capturing that first chance
exception and trying to decide whether or not to halt based on the
rules you define in the Debug | Exceptions menu option.
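
Outside the debugger, first-chance exceptions can also be counted in code. The sketch below uses the AppDomain.FirstChanceException event, which is an assumption about the environment: it was added in .NET Framework 4 and is not available on the .NET 2.0 runtime used above. FirstChanceCounter and Count are hypothetical names.

```csharp
using System;
using System.Runtime.ExceptionServices;

public static class FirstChanceCounter
{
    // Runs the given work and returns how many exceptions were thrown
    // while it ran, including ones that were caught and swallowed.
    public static int Count(Action work)
    {
        int count = 0;
        EventHandler<FirstChanceExceptionEventArgs> handler =
            delegate(object sender, FirstChanceExceptionEventArgs e) { count++; };

        AppDomain.CurrentDomain.FirstChanceException += handler;
        try
        {
            work();
        }
        finally
        {
            // Always unhook so later code is not counted.
            AppDomain.CurrentDomain.FirstChanceException -= handler;
        }
        return count;
    }

    public static void Main()
    {
        int thrown = Count(delegate
        {
            for (int i = 0; i < 5; i++)
            {
                try { throw new InvalidOperationException(); }
                catch (InvalidOperationException) { /* swallowed on purpose */ }
            }
        });
        Console.WriteLine(thrown);
    }
}
```

The event fires for every exception in the application domain, so exceptions thrown on other threads during the measurement would be counted too.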

03-04-2008   #8
Jon Skeet [C# MVP]

Re: Major Performance Bug in DataTable?

Brian Gideon <briangideon@xxxxxx> wrote:
Quote:

> > The problem is that you're changing the type of the value stored. I
> > haven't looked through the problem in its entirety, but the basic
> > message is "don't do that". Work out the type of the column to start
> > with, and make sure you only store that type of data in there.
>
> Changing types is definitely related to the problem. But, that's not
> the whole story. I can still change the data type during a call to
> the row["Value"] setter and as long as I comment out the BeginLoadData
> and EndLoadData from the first loop the second loop will be at least
> 10^4 times faster. Likewise, setting AllowDBNull = true speeds up
> that row["Value"] setter call...again by a factor of at least 10^4.

Sure - so there's definitely something interesting going on.

Quote:

> > Changing
> > the data type (when you're storing primitive value types) will be
> > expensive.
>
> By a factor of at least 10^4? No way. The difference based on a
> 10,000 iteration loop should be so imperceptible that even the high
> precision Stopwatch class wouldn't pick it up.
>
> Because I've profiled the code I can see that the row["Value"] setter
> is internally generating and swallowing an exception. I can even see
> what part of the code it's occurring in using the Reflector. Sure
> enough, Microsoft has code embedded way down deep that swallows
> exceptions from that call in some cases. And again, by making
> seemingly unrelated calls to the DataTable before the loop somehow
> causes the problem to start occurring.

Swallowing 10,000 exceptions certainly wouldn't take 10 seconds. It's
possible that it's swallowing millions of exceptions, but not just
10,000.

I wonder whether it's trying to do something nasty in terms of hashing
or internal sorting - although I'd expect that to happen if you *don't*
have BeginLoadData, not if you do...
Quote:

> Running the posted code in the debugger is even worse. In fact, I
> have no idea how long it takes because I'm too impatient to wait. I'm
> assuming that's because the debugger is capturing that first chance
> exception and trying to decide whether or not to halt based on the
> rules you define in the Debug | Exceptions menu option.

Certainly exceptions are significantly slower in the debugger.

--
Jon Skeet - <skeet@xxxxxx>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
World class .NET training in the UK: http://iterativetraining.co.uk

03-04-2008   #9
Brian Gideon

Re: Major Performance Bug in DataTable?

On Mar 4, 10:14 am, Brian Gideon <briangid...@xxxxxx> wrote:
Quote:

> Because I've profiled the code I can see that the row["Value"] setter
> is internally generating and swallowing an exception. I can even see
> what part of the code it's occurring in using the Reflector. Sure
> enough, Microsoft has code embedded way down deep that swallows
> exceptions from that call in some cases. And again, by making
> seemingly unrelated calls to the DataTable before the loop somehow
> causes the problem to start occurring.

It looks like the exception is being generated and swallowed in
System.Data.Common.ObjectStorage.Compare. Using Reflector I can see
that there is a try-catch that doesn't rethrow. The posted code
generates and swallows a mind-numbing 313,627 exceptions! No wonder
it's so slow.
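
The pattern can be sketched in isolation. This is not the System.Data source, only a hypothetical compare routine with a try-catch that doesn't rethrow. The fallback in the catch block is made up so the method has something to return. It shows that comparing a boxed int with a boxed double through IComparable throws an ArgumentException every time.

```csharp
using System;

public static class SwallowedCompareSketch
{
    // Number of exceptions caught and discarded so far.
    public static int SwallowedCount;

    // Hypothetical stand-in for a compare routine that catches the
    // exception and carries on instead of rethrowing it.
    public static int Compare(object x, object y)
    {
        try
        {
            // Int32.CompareTo(object) throws ArgumentException when
            // the argument is not an Int32 (and likewise for Double).
            return ((IComparable)x).CompareTo(y);
        }
        catch (ArgumentException)
        {
            SwallowedCount++;
            // Made-up fallback for illustration only.
            return string.CompareOrdinal(x.ToString(), y.ToString());
        }
    }

    public static void Main()
    {
        object boxedInt = 3;
        object boxedDouble = 3.14;

        Compare(boxedInt, 7);           // int vs int: no exception
        Compare(boxedInt, boxedDouble); // int vs double: one swallowed exception
        Console.WriteLine(SwallowedCount);
    }
}
```

If an index has to compare the new double against many existing ints, each of those comparisons pays for a throw and a catch.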

I did some testing where I changed the number of rows in the DataTable
and it appears the number of exceptions being generated is
proportional to n*log(n). I can see some activity happening inside of
a red-black tree implementation of an index, which might explain the
n*log(n) growth.
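
As a rough check of that estimate against the one figure given above, the sketch below computes n*log(n) for n = 10,000 and compares it with the 313,627 exceptions observed. Using log base 2 is an assumption; a different base only changes the constant factor. The class name is hypothetical.

```csharp
using System;

public static class ExceptionGrowthEstimate
{
    // n * log2(n). Base 2 is an assumption made for this estimate.
    public static double NLogN(int n)
    {
        return n * Math.Log(n, 2.0);
    }

    public static void Main()
    {
        const int rows = 10000;
        const int observed = 313627; // exception count reported above

        double estimate = NLogN(rows);
        Console.WriteLine("n*log2(n)           = {0:F0}", estimate);
        Console.WriteLine("observed / estimate = {0:F2}", observed / estimate);
    }
}
```

For n = 10,000 this gives roughly 132,877, so the observed count works out to about 2.36 exceptions per unit of n*log2(n). A single data point cannot confirm the growth rate on its own; it only shows the constant is small.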

I can't imagine how this is by design. It must be a bug.

03-04-2008   #10
Brian Gideon

Re: Major Performance Bug in DataTable?

On Mar 4, 10:43 am, Jon Skeet [C# MVP] <sk...@xxxxxx> wrote:
Quote:

> Sure - so there's definitely something interesting going on.
>

Definitely. See my reply to myself; I've included more information there.
