(How)
C^# You Are - 11 February 2007
DataReader and DataSet!!
- What is a data set? Answer
A data set is an in-memory representation of data from a database (normally).
The data set breaks up the data into tables. Each table contains a list of
typed columns and zero or more rows containing the data from the database.
A data set provides read/write access to data in the database. A data set
also tracks the changes made to the in-memory data and therefore can be used to
flush changes back to a database. A data set is also known as a disconnected
data component because once the data is loaded in the data set the database connection
can be dropped. It will not be needed again until the data set needs to update
or refresh its data. Finally a data set is well supported by
UI controls.
The biggest disadvantage of a data set is the amount of memory that it consumes.
In addition to the raw data a data set also keeps the table schema in memory along
with any changes made to each row. For a large amount of data (thousands of
rows) the overhead can be really high.
- What is a data reader? Answer
A data reader is a forward, read-only stream that allows you to retrieve data from
a database. You iterate through a reader one row at a time. Each row
contains the data from the database. A reader can buffer the data so only
a few rows are loaded at a time. The data contained in a reader can not be
modified and pushed back to the database. A data reader is a connected data
component. As long as the reader is open a connection to the database must
be open. The reader can, and generally should, be configured to auto-close
the connection when the reader is closed.
DataReader implements IDisposable and therefore
should be wrapped with a Using statement.
- Which data access class is faster for reading data from a database: DataSet
or DataReader? Answer
The DataReader is the fastest way to retrieve data from a database.
DataSet uses a DataReader under the hood the load
the data and therefore will always be slower. Slower does not necessarily
mean that you should not use it. There are advantages to using a DataSet.
- You have a table containing a list of employees in a company. There are approximately
10,000 employees. You need to display the employees in a grid so HR can access
an employee's information and edit it. This is a web application. Should
you use a DataSet or a DataReader? Answer
Either answer is technically correct. However you need to evaluate the performance
of the application. Given the size of the data you should consider using a
DataReader. The DataReader will allow you
to read all the data without the overhead of a DataSet. The
requirements are not specific enough to come up with an absolute answer so you need
to look at the scenarios.
One solution is to load the basic employee data into the grid using a DataReader.
This will be relatively fast and low on memory. When the user selects an employee
you could redirect the user to another page where they can edit the employee's information
and save it back to the database. This secondary page could use a DataSet
to load only the specific employee's data.
There are two problems with this solution. Firstly the users have to go to
a separate page to actually edit the data. It is a good bet that in general
a user would not want to edit a lot of users anyway so this is not a bad thing.
However for an application where a user might need to edit many users in a single
data it could become a usability problem. A solution would be to split the
page into two sections. The top section displays the grid and the bottom section
displays the details of the currently selected employee. In this case AJAX
or callbacks could be used to dynamically load the bottom section as users are selected
in the grid.
The second problem actually applies whether you use a DataReader
or a DataSet. Displaying 10,000 employees in a grid is not
useful considering that most of the employees are not needed. Instead you
should use paging and/or filtering to reduce the number of employees displayed at
any one time. The data will be loaded into memory only as it is needed.
- You have created a business layer for your application. You now need to load
the data from the database into the business classes. Should you use a
DataReader or a DataSet? Answer
This, in my opinion, is pretty clear cut. Use a DataReader.
A DataSet is designed to be used as a business object. Therefore
using it to populate business classes only slows down the loading process and eats
up memory. A DataReader is designed to get the data into
the application but not store it long term. Hence business objects go hand-in-hand
with DataReaders.
- How do you update data contained in a DataSet? Answer
All you have to do is change the value in the table/row/column that is impacted.
For example if you have a table called Employees and you want to
update Fred Jone's pay rate you would find the row in the table and then
set the appropriate column's value, like so:
private void UpdatePay ( DataSet ds, string employee, float newPay )
{
DataTable tbl = ds.Tables["Employees"];
DataRow row = //Find employee
row["PayRate"] = newPay;
}
Of course you can combine all this into a single command or even enumerate over
the rows and apply a change to all employees.
- How do you find specific rows in a DataSet? Answer
There are a couple of different ways. The first way is to use the DataTable.Select
method to return a collection of rows from the table that meet the selection criteria.
For example, finding the employee with the name Fred Jones might look like
this:
private DataRowCollection FindEmployees ( DataSet ds, string firstName, string lastName
)
{
return ds.Tables["Employees"].Select(String.Format("FirstName={0} AND
LastName={1}", firstName, lastName));
}
The second method is to use DataRowCollection.Find to find the
row with a specific primary key. For example if you want to find the employee
with the PK of 102 then you could do this:
private DataRow FindEmployee ( DataSet ds, int primaryKey )
{
return ds.Tables["Employees"].Rows.Find(primarykey)[0];
}
- Once you've made changes to a DataSet how do you update the database? Answer
You need to use a DataAdapter to flush the changes back to the
database. For whatever reason DataAdapter itself does not
provide a mechanism to specify the commands to run to insert, update or delete rows.
Instead you'll need to use DbDataAdapter or, preferably, one of
the database-specific adapters like SqlDataAdapter. These
adapters expose properties for you to specify the insert, update and delete command
to run for each modified row. For example the follow could would update the
employee data in the database.
private void UpdateEmployees ( DataSet ds )
{
using (SqlConnection conn = ... )
{
SqlDataAdapter da = new SqlDataAdapter();
da.InsertCommand = new SqlCommand("INSERT INTO...", conn);
da.UpdateCommand = new SqlCommand("UPDATE...", conn);
da.DeleteCommand = new SqlCommand("DELETE...", conn);
conn.Open();
da.Update(ds);
};
}
- How does a DataSet know what rows where added, modified or removed? Answer
Internally a DataSet keeps the original data that was loaded.
When a new row is added to the DataSet it is marked as new.
When a row is deleted from the DataSet it is marked as deleted
but not actually removed. Finally when a change is made to a row the changed
row is stored while the original row is cached off. You can access the rows
that were modified along with their original values (for modified rows) using
GetChanges. Each row will contain a RowState
property of type DataRowState that specifies the state of the row:
added, removed, etc.
This feature also allows you to serialize out the changes to a data set and later
serialize them back in. This feature also allows you to handle concurrency
errors.
- How do you serialize a DataSet? Answer
As of v2 you can use the WriteXml method to write out a DataSet.
You can write out the data with or without the schema or you can write out only
the changes from the original data. You read the data back using ReadXml.
When reading the data in you have to decide whether the serialized data replaces
or is merged with the existing data, if any, in the DataSet.
|
|