#1

Major Performance Bug in DataTable?

I came across this disturbing problem. The code adds 10,000 rows to a DataTable and then changes the value of one field on each row. It should be super fast, right? As written, it takes 19 seconds to complete on my machine.

    using System;
    using System.Data;
    using System.Diagnostics;

    class Program
    {
        static void Main(string[] args)
        {
            // One untyped (Object) column that does not allow nulls.
            DataTable table = new DataTable();
            table.Columns.Add("Value", typeof(Object));
            table.Columns[0].AllowDBNull = false;

            // Load 10,000 rows holding boxed ints.
            table.BeginLoadData();
            for (int i = 0; i < 10000; i++)
            {
                table.Rows.Add(i);
            }
            table.EndLoadData();

            // Time overwriting every row with a boxed double.
            Stopwatch sw = new Stopwatch();
            sw.Start();
            foreach (DataRow row in table.Rows)
            {
                row["Value"] = 3.14;
            }
            sw.Stop();
            Console.WriteLine(sw.Elapsed.TotalSeconds);
            Console.ReadLine();
        }
    }

The reason the above code is so slow is that the row["Value"] = 3.14 line generates an exception that is swallowed.

Now, if you make any one of the following changes, it takes less than 1/100th of a second to complete. You only need to make one of the three to see a difference.

1) Comment out table.Columns[0].AllowDBNull = false
2) Comment out table.BeginLoadData and table.EndLoadData
3) Change row["Value"] = 3.14 to row["Value"] = 3

Can someone else confirm this bug? I'm using .NET 2.0.
#2

Re: Major Performance Bug in DataTable?

Why do you not strongly type the field to accept a floating point? That would likely work much better than the boxing you are doing to use an object type. You will probably find the error going away with strong typing, as well.

--
Gregory A. Beamer
MVP, MCP: +I, SE, SD, DBA
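The strong-typing suggestion can be sketched as follows. This is a minimal variation of the posted program, assuming the column only ever needs to hold floating-point values; the class name `TypedColumnSketch` is made up for illustration.

```csharp
using System;
using System.Data;
using System.Diagnostics;

class TypedColumnSketch
{
    static void Main()
    {
        DataTable table = new DataTable();
        // Assumption: every value fits in a double, so the column is
        // declared as double instead of Object.
        table.Columns.Add("Value", typeof(double));
        table.Columns[0].AllowDBNull = false;

        table.BeginLoadData();
        for (int i = 0; i < 10000; i++)
        {
            table.Rows.Add(i); // the int is converted to double when stored
        }
        table.EndLoadData();

        Stopwatch sw = Stopwatch.StartNew();
        foreach (DataRow row in table.Rows)
        {
            row["Value"] = 3.14; // same type as the stored values
        }
        sw.Stop();
        Console.WriteLine(sw.Elapsed.TotalSeconds);
    }
}
```

With a typed column, old and new values always share one type, so mixed-type comparisons cannot arise. It does not help when the column genuinely has to hold values of different types.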
#3

Re: Major Performance Bug in DataTable?

Doing

    row["Value"]

repeatedly is a performance hit. Get the column ordinal and use that instead. It will speed up your process.

RobinS.
GoldMail, Inc.
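For reference, the ordinal lookup described above might look like this. It is a small sketch, not a fix for the slowdown reported in the thread; the class name `OrdinalLookupSketch` and the variable `valueOrdinal` are illustrative.

```csharp
using System;
using System.Data;

class OrdinalLookupSketch
{
    static void Main()
    {
        DataTable table = new DataTable();
        table.Columns.Add("Value", typeof(object));
        for (int i = 0; i < 10000; i++)
        {
            table.Rows.Add(i);
        }

        // Resolve the column name to its ordinal once, outside the loop.
        int valueOrdinal = table.Columns["Value"].Ordinal;

        foreach (DataRow row in table.Rows)
        {
            row[valueOrdinal] = 3; // index by ordinal instead of by name
        }

        Console.WriteLine(table.Rows[0][valueOrdinal]);
    }
}
```

This only saves the per-row name lookup, which is a small constant cost per access.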
#4

Re: Major Performance Bug in DataTable?

On Mar 3, 6:31 pm, "Cowboy (Gregory A. Beamer)" <NoSpamMgbwo...@xxxxxx> wrote:
> Why do you not strongly type the field to accept a floating point? That
> would likely work much better than the boxing you are doing to use an object
> type. You will probably find the error going away with strong typing, as
> well.

[...] can be many different types. But, yes, changing it to a strongly typed field speeds it up again. But then again, so does commenting out BeginLoadData and EndLoadData. That doesn't make any sense, does it?

Think about it this way. Do you think any of those 3 changes I mentioned should change the performance by a factor of 10^4 (or more) for a DataTable that only holds 10,000 rows?
#5

Re: Major Performance Bug in DataTable?

On Mar 3, 8:12 pm, "RobinS" <rob...@xxxxxx> wrote:
> Doing
>
>    row["Value"]
>
> repeatedly is a performance hit. Get the column ordinal and use that
> instead. It will speed up your process.

So referencing a column by name is at least a factor of 10^4 slower than referencing by ordinal? Why then does making any one of the 3 changes I listed speed it up so much?
#6

Re: Major Performance Bug in DataTable?

Brian Gideon <briangideon@xxxxxx> wrote:
> >    row["Value"]
> >
> > repeatedly is a performance hit. Get the column ordinal and use that
> > instead. It will speed up your process.
>
> So referencing a column by name is at least a factor of 10^4 slower
> than referencing by ordinal? Why then does making any one of the 3
> changes I listed speed it up so much?

The problem is that you're changing the type of the value stored. I haven't looked through the problem in its entirety, but the basic message is "don't do that". Work out the type of the column to start with, and make sure you only store that type of data in there. Changing the data type (when you're storing primitive value types) will be expensive.

--
Jon Skeet - <skeet@xxxxxx>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
World class .NET training in the UK: http://iterativetraining.co.uk
#7

Re: Major Performance Bug in DataTable?

On Mar 4, 9:45 am, Jon Skeet [C# MVP] <sk...@xxxxxx> wrote:
> Getting the column doesn't make any significant difference here.

> The problem is that you're changing the type of the value stored. I
> haven't looked through the problem in its entirety, but the basic
> message is "don't do that". Work out the type of the column to start
> with, and make sure you only store that type of data in there.

Changing types is definitely related to the problem. But that's not the whole story. I can still change the data type during a call to the row["Value"] setter, and as long as I comment out the BeginLoadData and EndLoadData from the first loop, the second loop will be at least 10^4 times faster. Likewise, setting AllowDBNull = true speeds up that row["Value"] setter call, again by a factor of at least 10^4.

> Changing
> the data type (when you're storing primitive value types) will be
> expensive.

By a factor of at least 10^4? No way. The difference based on a 10,000 iteration loop should be so imperceptible that even the high-precision Stopwatch class wouldn't pick it up.

Because I've profiled the code, I can see that the row["Value"] setter is internally generating and swallowing an exception. I can even see what part of the code it's occurring in using the Reflector. Sure enough, Microsoft has code embedded way down deep that swallows exceptions from that call in some cases. And again, making seemingly unrelated calls to the DataTable before the loop somehow causes the problem to start occurring.

Running the posted code in the debugger is even worse. In fact, I have no idea how long it takes because I'm too impatient to wait. I'm assuming that's because the debugger is capturing that first-chance exception and trying to decide whether or not to halt based on the rules you define in the Debug | Exceptions menu option.
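One way to observe swallowed exceptions without a profiler is to count first-chance exceptions in-process. The sketch below wraps the posted repro with a counter. It relies on `AppDomain.FirstChanceException`, which exists only in .NET Framework 4.0 and later, so it would not have been available on the .NET 2.0 runtime used in this thread. The class and field names are illustrative, and whether a newer runtime still throws here is not something the thread establishes.

```csharp
using System;
using System.Data;
using System.Threading;

class FirstChanceCounterSketch
{
    static int exceptionCount;

    static void Main()
    {
        // Fires for every thrown exception, including ones that library
        // code later catches and swallows. Requires .NET Framework 4.0+.
        AppDomain.CurrentDomain.FirstChanceException += (sender, e) =>
        {
            Interlocked.Increment(ref exceptionCount);
        };

        // Same setup as the posted repro.
        DataTable table = new DataTable();
        table.Columns.Add("Value", typeof(object));
        table.Columns[0].AllowDBNull = false;

        table.BeginLoadData();
        for (int i = 0; i < 10000; i++)
        {
            table.Rows.Add(i);
        }
        table.EndLoadData();

        foreach (DataRow row in table.Rows)
        {
            row["Value"] = 3.14;
        }

        Console.WriteLine("First-chance exceptions: " + exceptionCount);
    }
}
```

A count far above the number of rows would point at repeated internal failures rather than one exception per assignment.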
#8

Re: Major Performance Bug in DataTable?

Brian Gideon <briangideon@xxxxxx> wrote:
> > The problem is that you're changing the type of the value stored. I
> > haven't looked through the problem in its entirety, but the basic
> > message is "don't do that". Work out the type of the column to start
> > with, and make sure you only store that type of data in there.
>
> Changing types is definitely related to the problem. But, that's not
> the whole story. I can still change the data type during a call to
> the row["Value"] setter and as long as I comment out the BeginLoadData
> and EndLoadData from the first loop the second loop will be at least
> 10^4 times faster. Likewise, setting AllowDBNull = true speeds up
> that row["Value"] setter call...again by a factor of at least 10^4.

> > Changing
> > the data type (when you're storing primitive value types) will be
> > expensive.
>
> By a factor of at least 10^4? No way. The difference based on a
> 10,000 iteration loop should be so imperceptible that even the high
> precision Stopwatch class wouldn't pick it up.
>
> Because I've profiled the code I can see that the row["Value"] setter
> is internally generating and swallowing an exception. I can even see
> what part of the code it's occurring in using the Reflector. Sure
> enough, Microsoft has code embedded way down deep that swallows
> exceptions from that call in some cases. And again, by making
> seemingly unrelated calls to the DataTable before the loop somehow
> causes the problem to start occurring.

[...] possible that it's swallowing millions of exceptions, but not just 10,000. I wonder whether it's trying to do something nasty in terms of hashing or internal sorting - although I'd expect that to happen if you *don't* have BeginLoadData, not if you do...

> Running the posted code in the debugger is even worse. In fact, I
> have no idea how long it takes because I'm too impatient to wait. I'm
> assuming that's because the debugger is capturing that first chance
> exception and trying to decide whether or not to halt based on the
> rules you define in the Debug | Exceptions menu option.

--
Jon Skeet - <skeet@xxxxxx>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
World class .NET training in the UK: http://iterativetraining.co.uk
#9

Re: Major Performance Bug in DataTable?

On Mar 4, 10:14 am, Brian Gideon <briangid...@xxxxxx> wrote:
> Because I've profiled the code I can see that the row["Value"] setter
> is internally generating and swallowing an exception. I can even see
> what part of the code it's occurring in using the Reflector. Sure
> enough, Microsoft has code embedded way down deep that swallows
> exceptions from that call in some cases. And again, by making
> seemingly unrelated calls to the DataTable before the loop somehow
> causes the problem to start occurring.

[...] System.Data.Common.ObjectStorage.Compare. Using Reflector I can see that there is a try-catch that doesn't rethrow. The posted code generates and swallows a mind-numbing 313,627 exceptions! No wonder it's so slow.

I did some testing where I changed the number of rows in the DataTable, and it appears the number of exceptions being generated is proportional to n*log(n). I can see some activity happening inside a red-black tree implementation of an index, which might explain the n*log(n) growth.

I can't imagine how this is by design. It must be a bug.
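To get a feel for what that many swallowed exceptions cost, here is a rough sketch that does not touch DataTable at all. `CompareSwallowing` is a hypothetical stand-in for a comparison routine with a try-catch that doesn't rethrow; it is not the actual body of `ObjectStorage.Compare`, and the exception type thrown inside the framework is an assumption here. What is certain is that `Int32.CompareTo(object)` throws `ArgumentException` when handed a boxed double.

```csharp
using System;
using System.Diagnostics;

class SwallowedExceptionCostSketch
{
    // Hypothetical comparison that swallows failures instead of rethrowing.
    static int CompareSwallowing(object x, object y)
    {
        try
        {
            return ((IComparable)x).CompareTo(y);
        }
        catch (ArgumentException)
        {
            return 0; // illustrative fallback only
        }
    }

    static void Main()
    {
        const int count = 313627; // exception count reported in the post
        object boxedInt = 3;
        object boxedDouble = 3.14;

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            CompareSwallowing(boxedInt, boxedInt); // same type, no exception
        }
        Console.WriteLine("Same type:  " + sw.Elapsed.TotalSeconds + " s");

        sw = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            CompareSwallowing(boxedInt, boxedDouble); // throws, then swallowed
        }
        Console.WriteLine("Mixed type: " + sw.Elapsed.TotalSeconds + " s");
    }
}
```

The absolute timings depend on the machine and runtime, and on whether a debugger is attached, so treat the gap between the two loops as indicative only. For scale, with n = 10,000 rows, n*log2(n) is roughly 133,000, so the reported 313,627 exceptions is a little over twice that figure.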
#10

Re: Major Performance Bug in DataTable?

On Mar 4, 10:43 am, Jon Skeet [C# MVP] <sk...@xxxxxx> wrote:
> Sure - so there's definitely something interesting going on.