(How) C# You Are - 10 February 2008
This week's questions revolve around streams again.
- You have unmanaged code that writes an integer to a stream representing the
length of a string, followed by the string itself. In your reader code on
the managed side you use the following code, but the string is generally garbage.
What is wrong?
public string GetEmployeeName ( BinaryReader reader )
{
    return reader.ReadString();
}
ReadString does not read a fixed-size integer length, although you might have inferred that from the documentation.
The method actually reads a compressed (7-bit encoded) integer. The length occupies one byte when it fits in seven bits and
grows a byte at a time as needed. Therefore, if the writer sends a fixed-size integer, the method will not read the string properly.
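To see the compressed length for yourself, you can dump the bytes that BinaryWriter produces for a string, since ReadString expects the same layout. This is only an illustrative sketch; the PrefixDemo class and Encode method are names made up for this example.

```csharp
using System;
using System.IO;

public static class PrefixDemo
{
    // Returns the raw bytes BinaryWriter emits for a string:
    // a 7-bit encoded byte count followed by the UTF-8 bytes.
    public static byte[] Encode ( string value )
    {
        using (MemoryStream stream = new MemoryStream())
        {
            BinaryWriter writer = new BinaryWriter(stream);
            writer.Write(value);
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

Encoding "abc" gives the four bytes 03 61 62 63, so the prefix is a single byte. A string of 200 ASCII characters gets the two-byte prefix C8 01 instead. Neither matches the four-byte integer that the unmanaged code writes.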
Last week's questions provided the correct way to read such a string. You read the integer value first and allocate a byte array
for the string using the length (or twice the length if the file is Unicode). You then use ReadBytes to read the
string into memory and the Encoding class to convert the byte array to a string.
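The steps above might look something like the following sketch. It assumes the unmanaged side writes a 4-byte little-endian character count with no null terminator, and that the text is either ASCII or UTF-16. Those are assumptions about the file format, not something the question states. The class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class LengthPrefixedStrings
{
    // Reads a 4-byte length (assumed to be a character count) followed by the
    // string data, then decodes it. isUnicode selects UTF-16 (two bytes per
    // character) instead of ASCII (one byte per character).
    public static string ReadLengthPrefixedString ( BinaryReader reader, bool isUnicode )
    {
        int charCount = reader.ReadInt32();
        if (charCount < 0)
            throw new InvalidDataException("Negative string length.");

        int byteCount = isUnicode ? charCount * 2 : charCount;

        // ReadBytes returns fewer bytes than requested if the stream ends early.
        byte[] bytes = reader.ReadBytes(byteCount);
        if (bytes.Length != byteCount)
            throw new EndOfStreamException("Stream ended inside the string data.");

        Encoding encoding = isUnicode ? Encoding.Unicode : Encoding.ASCII;
        return encoding.GetString(bytes);
    }
}
```

GetEmployeeName would then call ReadLengthPrefixedString(reader, false) rather than reader.ReadString().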
- You use the following code to read some data from a stream. ReadHeader works just fine but ReadData fails. Why?
public void Read ( Stream stream )
{
    ReadHeader(stream);
    ReadData(stream);
}

private void ReadHeader ( Stream stream )
{
    BinaryReader reader = new BinaryReader(stream);
    int len = reader.ReadInt32();
    string str = reader.ReadString();
}

private void ReadData ( Stream stream )
{
    using (BinaryReader reader = new BinaryReader(stream))
    {
        int len = reader.ReadInt32();
        byte[] arr = reader.ReadBytes(len);
    }
}
The problem lies in the reader. A reader closes the underlying stream when it is disposed. The problem with the above code is
that it might work sometimes and not others, depending upon when the reader is disposed. Either way, once a reader is disposed the stream
is closed as well.
You cannot haphazardly create readers and writers and associate them with a stream. Instead you should create the reader/writer once
and then keep it for as long as the stream is needed.
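One way to follow this advice is to let a single object own the reader for the lifetime of the stream. The sketch below is only one possible arrangement. The RecordReader class is hypothetical, and the header layout simply mirrors the calls in the question.

```csharp
using System;
using System.IO;

public sealed class RecordReader : IDisposable
{
    // Created once and shared by every read method.
    private readonly BinaryReader _reader;

    public RecordReader ( Stream stream )
    {
        _reader = new BinaryReader(stream);
    }

    public string ReadHeader ()
    {
        int id = _reader.ReadInt32();   // value read but unused, as in the question
        return _reader.ReadString();
    }

    public byte[] ReadData ()
    {
        int len = _reader.ReadInt32();
        return _reader.ReadBytes(len);
    }

    // Closing the reader closes the underlying stream, so do it only
    // when the stream is no longer needed.
    public void Dispose ()
    {
        _reader.Close();
    }
}
```

The caller wraps the RecordReader in a using block, calls ReadHeader and ReadData in order, and the stream is closed exactly once at the end.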
- You need to be able to peek at the next character in a stream but the stream does not support seeking. How can you do it?
Unfortunately there is no really good solution for this problem. The best solution is to buffer lookahead bytes somewhere. When the stream is read again you first
use the bytes, if any, in the lookahead buffer. Unfortunately there are no built-in readers that can do that. One is not hard to write though. If you support peeking
with a lookahead then you can easily extend it to allow seeking within the entire stream.
If you do not feel like writing your own then you can also read the stream into memory using MemoryStream or buffer the data
using BufferedStream. Each has advantages and disadvantages. The MemoryStream will require enough memory to hold
the entire stream. For small streams this is not a big deal but for large streams it will not scale well. The BufferedStream solves this
issue by buffering only a portion of the stream. While great for peeking it will not allow for generalized seeking.
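The lookahead approach described above can be sketched with a one-byte buffer. This is a minimal illustration rather than a full reader. It works in bytes, so it assumes a single-byte character encoding, and PeekableReader is a hypothetical name.

```csharp
using System;
using System.IO;

public sealed class PeekableReader
{
    private readonly Stream _stream;
    private int _lookahead;       // buffered byte; meaningful only when _hasLookahead is true
    private bool _hasLookahead;

    public PeekableReader ( Stream stream )
    {
        _stream = stream;
    }

    // Returns the next byte without consuming it, or -1 at the end of the stream.
    public int PeekByte ()
    {
        if (!_hasLookahead)
        {
            _lookahead = _stream.ReadByte();
            _hasLookahead = true;
        }
        return _lookahead;
    }

    // Returns the next byte, using the lookahead buffer first, or -1 at the end.
    public int ReadByte ()
    {
        if (_hasLookahead)
        {
            _hasLookahead = false;
            return _lookahead;
        }
        return _stream.ReadByte();
    }
}
```

All reads must go through the wrapper once a byte has been peeked. Reading the underlying stream directly would skip the buffered byte.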
- How do you check for the end of a stream?
Surprisingly there is no easy way to do this for an arbitrary stream. For seekable streams, like FileStream, you can
compare the length of the stream with the current position. For non-seekable streams the best you can do is attempt to read the next byte with ReadByte. If
it returns -1 then you have reached the end of the stream. If it is anything else then you will have to store the returned value for later use.
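Both cases can be folded into one helper, roughly as sketched here. StreamEnd and IsAtEnd are hypothetical names. The out parameter is just one way to hand back a byte that had to be consumed to answer the question.

```csharp
using System;
using System.IO;

public static class StreamEnd
{
    // Returns true if the stream has no more data. For non-seekable streams
    // a byte must be consumed to find out; it is returned in pending
    // (or -1 if nothing was consumed) and the caller must process it
    // before reading further.
    public static bool IsAtEnd ( Stream stream, out int pending )
    {
        pending = -1;

        if (stream.CanSeek)
            return stream.Position >= stream.Length;

        pending = stream.ReadByte();
        return pending == -1;
    }
}
```

For a seekable stream nothing is consumed and pending stays -1. For a non-seekable stream a false result always comes with a byte in pending that the caller now owns.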
- How can you both read and write a stream at once, such as a file?
There is no built-in class to both read and write streams at the same time. Instead the stream supports basic reading and writing. A reader/writer is really
just a simple wrapper around the basic Stream.Read and Stream.Write methods with some validation and buffering code. You can
quite easily create a class that supports both reading and writing simply by replicating the reader and writer code.
A problem does arise when a stream is being read and written at the same time. Specifically, what should happen if you read some data, write some data and
then attempt to read some more data? Should the read occur at the position just after the last read or after the last write? This question determines whether you
use a single position for reading/writing or a separate position for each. If you assume that reads will be sequential, then how would you be able to move to the
end of the stream so you could read the data you just wrote? While Seek might be a good choice, how would you differentiate between a seek for
reading and a seek for writing? If you use a single position, then a read followed by a write followed by a read would produce an exception as there would be nothing
left to read.
.NET does not really provide any good recommendations on this. The only real recommendation it makes is that when you read from a stream that might have been written
to, you should first flush the stream to ensure the read works properly. Complicating this further, for those with C++ backgrounds, C++ supports reading a file
while appending to the end of it. For such a function to exist in .NET you would have to either use two separate position indicators (one for read and one for write)
or you would have to save the current read position, move to the end of the file, write the data and then seek back to the saved read position. Such an operation
would not work on most streams other than file streams.
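The save, append and restore approach might look like the following sketch. It is not a built-in facility. StreamAppend and AppendPreservingPosition are hypothetical names, and the helper refuses to run on streams that cannot seek.

```csharp
using System;
using System.IO;

public static class StreamAppend
{
    // Appends data to the end of a seekable stream and then restores the
    // position the caller was reading from.
    public static void AppendPreservingPosition ( Stream stream, byte[] data )
    {
        if (!stream.CanSeek || !stream.CanWrite)
            throw new NotSupportedException("Stream must support seeking and writing.");

        long readPosition = stream.Position;    // remember where reading left off
        try
        {
            stream.Seek(0, SeekOrigin.End);
            stream.Write(data, 0, data.Length);
            stream.Flush();                      // flush before the next read
        }
        finally
        {
            stream.Position = readPosition;      // resume reading where we were
        }
    }
}
```

A read issued after the call continues from the saved position, and the appended bytes are reached once the reads get that far.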