(How) C# You Are - 21 October 2007
This week's questions revolve around interoperability with unmanaged code.
This seems to be a popular topic these days. If you don't know a little C then you'll
probably have a hard time with interop as most of it uses C.
- You need to call an unmanaged function with the following signature. How should
you do it?
void Trace(char const* message, int type );
Many people try to make interop a lot harder than it needs to be. The important
thing to remember is that getting data to unmanaged code is pretty much automatic
in most cases. If you are working with simple types then assume the marshaler
will handle the details. For the above function you are passing a string and
an int. The marshaler can handle both automatically.
[DllImport("...", CharSet=CharSet.Ansi)]
private static extern void Trace ( string message, int type );
A few notes about this and subsequent examples. Firstly I specified the character
set explicitly. This saves the marshaler from having to figure out the character
set when dealing with strings. It saves time so use it when
you can. Most interop code (except to Windows) will use ANSI. For these
examples I'll be leaving out the attribute unless it is important to the discussion.
The next point is that all interop functions should reside in a special class designed
for the sole purpose of interop. Several functions can be collected in a single
class. MSDN has a big write-up on this. The other important thing to
do is always make the function private, static and extern. .NET clients should
never directly call an unmanaged routine. As the interop code gets more complex
and error handling becomes more important a wrapper method comes in handy.
Get used to it now. This is especially true when interoping custom types like
structures or enumerations. Always expose a .NET type in lieu of unmanaged
types and always use .NET types in method signatures over unmanaged interop types.
- You need to call the following function. It expects the caller to give it
a buffer with at least 256 bytes of space and it returns the actual length. When
called the length should be passed as input.
How do you do it?
void GetUser(char* name, int* length);
Similar to the last question but slightly more complex. In this case the unmanaged
code will return a string of unknown length. Additionally the buffer
might not be completely filled. Firstly let's handle the fact that the length
is an in/out parameter. That is easy to do, just mark the parameter with
ref as you normally would. For the string we have a problem.
Remember that strings are immutable so you can't just pass one to the unmanaged code
and expect it to be filled in. Instead you have to allocate the memory, copy the
data (sometimes) and then create and copy the string when the call returns. As with
.NET code you should use StringBuilder to handle string allocations. In this case
StringBuilder can preallocate the memory for you and the marshaler copies the result
back when you're done. Here's the signature.
private static extern void GetUser ( StringBuilder name, ref int length );
To call this function you have to preallocate the space in the StringBuilder to
make it at least 256 bytes (as per the function requirement).
StringBuilder sb = new StringBuilder(256);
int length = 256;
GetUser(sb, ref length);
See why a wrapper method is important?
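To make the in/out contract concrete, here is a minimal sketch of what the unmanaged side of GetUser might look like. Only the signature comes from the question; the body, the made-up user name, and the convention that the returned length excludes the terminating NUL are all assumptions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical body for the unmanaged GetUser; only the signature comes
   from the article. On input *length is the capacity of name in bytes,
   on output it is the number of characters written (NUL not counted). */
void GetUser(char* name, int* length)
{
    static const char user[] = "jsmith";   /* made-up value */
    int capacity;
    int written;

    assert(name != NULL && length != NULL);

    capacity = *length;
    if (capacity <= 0) {                   /* no room, not even for the NUL */
        *length = 0;
        return;
    }

    written = (int)strlen(user);
    if (written > capacity - 1)
        written = capacity - 1;            /* truncate, keep room for the NUL */

    memcpy(name, user, (size_t)written);
    name[written] = '\0';
    *length = written;
}
```

Because the function writes through both pointers, the managed side needs a writable buffer and a by-reference length, which is what StringBuilder and ref int provide.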
- You need to call a function that returns an array of bytes. The array is allocated
by the caller. How do you do it?
long GetData ( byte* buffer, int count );
There are two issues here. The first is the return type. The second
is the array of data that must be passed back and forth. Let's deal with the
return type first. Instinctively what type should it be? Long?
Wrong. A long in C#/.NET is 64 bits. A long
in C/C++ on Windows is only 32 bits. Yes, an int and a long
are the same size there. Don't ask me why because I didn't make that decision.
Nevertheless the return type should be an int in managed code so
it remains 32 bits.
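If you want to check the native widths for yourself, a small sketch like the following reports them for whatever compiler builds it. The helper names are made up for illustration. With the Microsoft compilers the first function returns 32; on most 64-bit Unix systems it returns 64, which is why a fixed-width type such as int32_t is the safer thing to match a managed int against.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Illustrative helpers, not part of any API. */

/* Width of the native long, in bits, for the compiler building this file. */
int long_width_bits(void)
{
    /* The C standard guarantees that long has at least 32 bits. */
    assert(sizeof(long) * CHAR_BIT >= 32);
    return (int)(sizeof(long) * CHAR_BIT);
}

/* Width of int32_t, in bits. This is 32 wherever the type exists. */
int int32_width_bits(void)
{
    return (int)(sizeof(int32_t) * CHAR_BIT);
}
```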
The more difficult issue is the byte array. We have to pass a pre-existing
array of bytes to the function and it will return some data in the buffer.
It is very similar to the string problem of earlier except we can't use StringBuilder.
In this case the marshaler will have no problems sending the array to unmanaged
code. The marshaler knows how big the array is, and because byte is a blittable
type it can simply pin the array and hand the unmanaged code a pointer to the
elements. Whatever the function writes lands directly in the managed array, so
nothing special is needed coming back. For arrays of non-blittable types the
marshaler copies instead, and marking the parameter [In, Out] tells it to copy
the same number of elements back. Pass the array by value, not by ref. A ref
byte[] is marshaled as a pointer to a pointer (byte**), which is not what this
function expects.
private static extern int GetData ( [In, Out] byte[] buffer, int count );
//Calling
byte[] buffer = new byte[1024];
int count = buffer.Length;
int read = GetData(buffer, count);
This particular example can be generalized to any fixed size array. In this
case the array doesn't change size when passed to the unmanaged code, it is only
filled in. Presumably the return value says how many of the bytes are meaningful.
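For reference, the unmanaged side of such a function could look like the sketch below. Only the signature is from the question; the body, the sample bytes, and the meaning of the return value (number of bytes written) are assumptions. byte is not a built-in C type, so the sketch defines it.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char byte;   /* assumed definition; C has no built-in byte */

/* Hypothetical body: copies up to count bytes into the caller's buffer and
   returns how many were written. */
long GetData(byte* buffer, int count)
{
    static const byte source[] = { 0xDE, 0xAD, 0xBE, 0xEF, 0x01 };  /* made-up data */
    int n = (int)sizeof(source);

    assert(buffer != NULL);

    if (count < n)
        n = count;             /* never write past the caller's buffer */
    if (n < 0)
        n = 0;

    memcpy(buffer, source, (size_t)n);
    return n;
}
```

The function never allocates anything. It writes straight into memory the caller owns, and only the first n bytes are meaningful afterwards.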
- You need to call a function that accepts an unmanaged structure of simple types.
What is the function signature?
struct Version
{
int cbSize;
short MajorVersion;
short MinorVersion;
short Release;
int Build;
bool Debug;
};
void GetVersionInformation ( Version* version );
This is more typical in interop code. You need to map a custom type to a managed
type before you can use it. For structures you would map the information to
a .NET structure. Do not use a class as they have completely different semantics.
You should make the structure private (since it won't be accessible outside the
wrapper class) and you should make all the fields public. Do not bother with
properties or methods as nobody outside the class will use it. Your goal is
to simply create an identical in-memory object that can be used to pass data back
and forth. Don't move the fields around either. Sometimes you might
want to create a constructor to help initialize some of the fields. Here's
the structure.
[StructLayout(LayoutKind.Sequential)]
private struct Version
{
public int cbSize;
public short MajorVersion;
public short MinorVersion;
public short Release;
public int Build;
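[MarshalAs(UnmanagedType.I1)]
public bool Debug;
}
Notice the StructLayout attribute at the top. This attribute tells the compiler/CLR that
the structure should stay in memory in the exact format that is laid out here.
You didn't know that the CLR can rearrange fields in a class to optimize memory
usage? Surprise. Sequential happens to be the default
for structures but it doesn't hurt to make it explicit. Now the structure
will lie in memory in the way you specified here. The MarshalAs on Debug is there
because the marshaler treats a bool as a 4-byte Win32 BOOL by default, whereas the
unmanaged bool here is (assuming it is the C++ type) a single byte.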
To call the function it really is no different than using any other type, just add
a ref.
private static extern void GetVersionInformation ( ref Version version );
//Calling
Version ver = new Version();
ver.cbSize = Marshal.SizeOf(ver);
GetVersionInformation(ref ver);
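It can help to see the layout the managed structure has to reproduce. The following sketch spells out the native structure and notes the offsets a typical compiler assigns (4-byte int, 2-byte short, 1-byte bool, natural alignment). Those offsets are an assumption about the ABI, not something the question states, and the body of GetVersionInformation and its version numbers are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Version {
    int   cbSize;         /* offset 0                                  */
    short MajorVersion;   /* offset 4                                  */
    short MinorVersion;   /* offset 6                                  */
    short Release;        /* offset 8, followed by 2 bytes of padding  */
    int   Build;          /* offset 12                                 */
    bool  Debug;          /* offset 16, followed by 3 bytes of padding */
} Version;

/* Hypothetical body: refuses to write anything unless the caller has set
   cbSize to the size of the structure it allocated. */
void GetVersionInformation(Version* version)
{
    assert(version != NULL);
    if (version->cbSize != (int)sizeof(Version))
        return;

    version->MajorVersion = 1;      /* made-up values */
    version->MinorVersion = 2;
    version->Release      = 3;
    version->Build        = 1234;
    version->Debug        = true;
}
```

The padding after Release and Debug is why the field order must not change on the managed side: the marshaler has to arrive at the same offsets and the same 20-byte total.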
- Here is another structure to deal with. How do you do it?
struct EmployeeInformation
{
int Id;
char FirstName[255];
char LastName[255];
};
void GetEmployeeInformation ( int id, EmployeeInformation* pEmployee );
Nothing we haven't seen before. Except, of course, the fixed size string buffers.
Now you could revert to StringBuilder but there is a problem here.
StringBuilder is for separately allocated memory whereas these strings are inline
in the structure. The structure consists of 514 bytes of data (516 once the
compiler pads it to a multiple of four), not 12 bytes
plus two memory blocks somewhere. Enter fixed arrays.
Fixed arrays are only allowed in structures (in this context) and require unsafe
code. They allow us to specify the size of the array in advance and the marshaler
will handle the rest. Here's the structure.
[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
private unsafe struct EmployeeInformation
{
public int Id;
public fixed byte FirstName[255];
[MarshalAs(UnmanagedType.ByValTStr, SizeConst=255)]
public string LastName;
}
private static extern void GetEmployeeInformation ( int id, ref EmployeeInformation
pEmployee );
The first buffer is allocated using the fixed keyword and specifying
the length as a normal array. It is declared as byte rather than char because a
C# char is always two bytes (Unicode) while the unmanaged char is one. The second
buffer is declared as a normal string but uses the MarshalAs attribute with
ByValTStr to tell the marshaler it is an inline, fixed length character array.
They both accomplish the same thing, although the second hands you a ready-made
string. Behind the scenes the compiler turns a fixed field into a nested structure
tagged with the FixedBuffer attribute. You never apply that attribute yourself;
the C# compiler rejects it and tells you to use the fixed modifier instead.
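The inline layout is easy to confirm from the unmanaged side. This sketch declares the structure from the question and notes the offsets a typical compiler produces (4-byte, 4-aligned int); the body of GetEmployeeInformation and the names it returns are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct EmployeeInformation {
    int  Id;               /* offset 0   */
    char FirstName[255];   /* offset 4   */
    char LastName[255];    /* offset 259 */
} EmployeeInformation;     /* 514 bytes of fields, padded to 516 */

/* Hypothetical body with made-up names. Both strings are written into the
   structure itself; no separate memory blocks are involved. */
void GetEmployeeInformation(int id, EmployeeInformation* pEmployee)
{
    assert(pEmployee != NULL);

    memset(pEmployee, 0, sizeof *pEmployee);
    pEmployee->Id = id;
    /* Copy at most 254 characters so the zeroed last byte stays a NUL. */
    strncpy(pEmployee->FirstName, "Ada", sizeof pEmployee->FirstName - 1);
    strncpy(pEmployee->LastName, "Lovelace", sizeof pEmployee->LastName - 1);
}
```

Since the character data sits between Id and the end of the structure, the managed declaration has to reserve exactly 255 bytes for each name in the same positions.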
- You are working with an API for using an external device. Here are the calls
that are needed. How do you do it?
HANDLE OpenDevice ( char const* deviceName );
BOOL WriteNextBlock ( HANDLE hDevice, byte buffer[1024] );
void CloseDevice ( HANDLE hDevice );
Not really much new here except HANDLE. In general, when you run across a
HANDLE or related type you will use IntPtr to wrap it. The
biggest issue with this approach though is that there is no automatic clean up of
the handle in the event of an exception. For a single use handle that is a
reasonable limitation. However if you are going to be using the handle a lot
or exposing it to managed code you should consider wrapping it in a SafeHandle-derived
class instead. For this simple example it isn't worth the effort especially
since, in this case, we won't be exposing the handle to managed code.
The next area of concern is the BOOL return type. BOOL is an int in C so you
might assume that you should use an int, and you can, but that
just makes it harder to use. Instead use the MarshalAs attribute
on the return type to tell the marshaler to treat it as a boolean.
The final area is the fixed size buffer in the parameter list. Despite the
[1024], C passes an array parameter as a pointer to its first element, so nothing
is copied onto the stack and this is really no different than the byte* buffers
we saw earlier. We'll use the MarshalAs attribute here to tell the marshaler how
many elements to send across. Notice the unmanaged type is LPArray since we are
just passing a pointer. ByValArray is reserved for arrays that live inline as
fields of a structure.
private static extern IntPtr OpenDevice ( string deviceName);
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool WriteNextBlock ( IntPtr hDevice, [MarshalAs(UnmanagedType.LPArray,
SizeConst=1024)] byte[] buffer );
private static extern void CloseDevice ( IntPtr hDevice );
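A quick way to see how C treats the byte buffer[1024] parameter is to assign the function to a pointer whose type spells the parameter as byte*. The sketch below does that. The typedefs are stand-ins for the Win32 ones, and the in-memory "device" and the function body are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char byte;   /* assumed definition */
typedef void* HANDLE;         /* stand-in for the Win32 typedef */
typedef int   BOOL;           /* stand-in for the Win32 typedef */

byte g_lastBlock[1024];       /* hypothetical in-memory "device" */

/* Hypothetical body: remembers the most recent block written. */
BOOL WriteNextBlock(HANDLE hDevice, byte buffer[1024])
{
    if (hDevice == NULL)
        return 0;
    memcpy(g_lastBlock, buffer, sizeof g_lastBlock);
    return 1;
}

/* The same function seen through a pointer-typed signature. This compiles
   without a cast because C adjusts the array parameter to byte*. */
BOOL (*const WriteNextBlockViaPointer)(HANDLE, byte*) = WriteNextBlock;
```

Since the two signatures are the same type, the managed declaration only needs to pass a pointer to 1024 bytes.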
- Now we need to read data from the device of earlier. What is the signature?
int ReadNextBlock ( HANDLE hDevice, byte* buffer, int size, int* read );
The only issue here is how to ensure that the data that is written gets back to
managed code. Remember that the marshaler knows how much to send to unmanaged
code but coming back it doesn't. We use the MarshalAs attribute
again. SizeParamIndex names the zero-based parameter that carries the element
count, here size (index 2). The read parameter indicates how many elements were
actually written, and hence how many of them are valid when the call returns.
As before the array itself is passed by value with [In, Out], not by ref.
private static extern int ReadNextBlock ( IntPtr hDevice, [In, Out, MarshalAs(UnmanagedType.LPArray,
SizeParamIndex=2)] byte[] buffer, int size, ref int read );
- Unmanaged callbacks occur infrequently. How can you get unmanaged code to
call back into managed code like in the following function?
typedef void (*ENUMDEVICE_CALLBACK)(char const* deviceName);
void EnumerateDevices ( ENUMDEVICE_CALLBACK callback );
Callbacks are going to require a delegate in managed code. Not a big deal
as this particular callback is pretty straightforward. It's just a function
with a string parameter. The issue, though, is garbage collection
and the CLR moving objects around in memory. In this particular case we must
be sure that, while unmanaged code is using the method pointed to by the callback,
it neither moves around nor gets collected.
The marshaler handles most memory issues automatically for us. When we use
ref, for example, the marshaler will either copy the data to a temporary location
or pin the memory (depending on the type) so that it stays put for the duration
of the call. You can read all about the rules the marshaler follows on MSDN under
Copying & Pinning in the marshaler section. For copied ref parameters it doesn't
really matter whether the real value moves around in memory or not as the
copy is what is actually used for the call. Upon
return the marshaler can easily copy the results back to the original value (whether
it was moved or not).
Delegates are different because unmanaged code is calling directly into managed
code. The delegate method won't move around in memory as the implementation
isn't associated with a particular object, but the delegate instance (and with it
the object whose method you're going to call) might get GC'ed and that would be
bad. The problem is
that the GC doesn't have any way of knowing when the delegate is no longer used
by the unmanaged code. For enumeration (as in the example above) we can ensure
the GC doesn't collect the object by making sure we keep a reference to it in
our wrapper method.
private delegate void EnumDeviceCallback ( string deviceName );
private static extern void EnumerateDevices ( EnumDeviceCallback callback );
//Later
internal class MyDevice
{
public void EnumerateDevices ( string device ) { }
}
//Calling it
private void EnumerateTheDevices ( )
{
MyDevice dev = new MyDevice();
EnumerateDevices(new EnumDeviceCallback(dev.EnumerateDevices));
}
An alternative is to use GC.KeepAlive which keeps an instance from
being collected from the start of the calling function to the point where
GC.KeepAlive is called.
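Seen from the unmanaged side, an enumeration callback is only used while the enumerating function is running. The sketch below shows one way such a function might be written, assuming the callback is passed as a plain function pointer (the typedef already is one). The device names and the CountDevice sample callback are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*ENUMDEVICE_CALLBACK)(char const* deviceName);

/* Hypothetical body with made-up device names. The callback is used only
   while this function runs; the pointer is not stored anywhere. */
void EnumerateDevices(ENUMDEVICE_CALLBACK callback)
{
    static char const* const devices[] = { "COM1", "COM2", "LPT1" };
    size_t i;

    assert(callback != NULL);
    for (i = 0; i < sizeof devices / sizeof devices[0]; ++i)
        callback(devices[i]);
}

/* Sample callback for trying the sketch from C: counts the devices seen. */
int g_devicesSeen = 0;

void CountDevice(char const* deviceName)
{
    (void)deviceName;   /* the name itself is not needed for counting */
    ++g_devicesSeen;
}
```

Because nothing holds on to the pointer once EnumerateDevices returns, keeping the delegate alive for the duration of the call is all the managed side has to guarantee.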
- Eventing is rare in unmanaged code but it can happen. What is the signature
for this eventing API?
typedef void (*TRACE_CALLBACK) (char const* message );
void AddTraceListener ( TRACE_CALLBACK listener );
void RemoveTraceListener ( TRACE_CALLBACK listener );
Similar problem to the last question. The gotcha here is that the delegate
instance must survive at least until the remove function has been called.
A simple local variable on the stack won't work here (like we did earlier) because
once the function returns the object can be collected. Instead we need to "cache"
the instance until we no longer need it. The easiest approach is to use
a wrapper class to store the instance in a field.
internal delegate void TraceCallback ( string message);
internal class TraceListenerImpl
{
private Collection<TraceCallback> callbacks = ...;
public void AddListener ( TraceCallback callback )
{
AddTraceListener(callback);
callbacks.Add(callback);
}
public void RemoveListener ( TraceCallback callback )
{
RemoveTraceListener(callback);
callbacks.Remove(callback);
}
}
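The reason the delegate has to outlive the call becomes obvious once you picture the unmanaged side. In the sketch below the library stores each listener pointer and calls it later. The listener table, its size, the RaiseTrace helper and the CountMessage sample listener are all hypothetical, and the listeners are assumed to be passed as plain function pointers.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*TRACE_CALLBACK)(char const* message);

#define MAX_LISTENERS 8                             /* arbitrary limit for the sketch */
static TRACE_CALLBACK g_listeners[MAX_LISTENERS];   /* NULL marks a free slot */

/* Stores the pointer; it will be called long after this function returns. */
void AddTraceListener(TRACE_CALLBACK listener)
{
    size_t i;

    assert(listener != NULL);
    for (i = 0; i < MAX_LISTENERS; ++i) {
        if (g_listeners[i] == NULL) {
            g_listeners[i] = listener;
            return;
        }
    }
    /* Table full: silently ignored in this sketch. */
}

/* Forgets the pointer; only now is it safe to release the listener. */
void RemoveTraceListener(TRACE_CALLBACK listener)
{
    size_t i;

    for (i = 0; i < MAX_LISTENERS; ++i) {
        if (g_listeners[i] == listener) {
            g_listeners[i] = NULL;
            return;
        }
    }
}

/* Hypothetical helper standing in for whatever raises the event. */
void RaiseTrace(char const* message)
{
    size_t i;

    for (i = 0; i < MAX_LISTENERS; ++i)
        if (g_listeners[i] != NULL)
            g_listeners[i](message);
}

/* Sample listener for trying the sketch from C: counts the messages seen. */
int g_messagesSeen = 0;

void CountMessage(char const* message)
{
    (void)message;
    ++g_messagesSeen;
}
```

Every call to RaiseTrace between the add and the remove goes through the stored pointer, so a delegate that was collected in the meantime would leave the library calling into freed memory.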