Thursday, September 30, 2010

threads and C#

I'm old school, I guess. Probably because I'm old. But whenever I need to do some multi-threaded programming, I open up the System.Threading namespace and start creating threads. So that I don't overwhelm the system, I usually throttle these threads and keep them in a thread collection somewhere.
It's old school.
I only recently found out that there is a slick way to do this in .Net.
First, you'll need a way to know when the thread has finished. Since there will be n of them, you'll need some type of collection.
Specifically, this will need to be a collection of ManualResetEvent objects.
For example,
System.Collections.Generic.List<ManualResetEvent> al = new List<ManualResetEvent>();

Each time a new thread is needed, you'll need to create a new ManualResetEvent object, like this:
ManualResetEvent m = new ManualResetEvent(false);
(the boolean is the initial state; false means the event starts out unsignaled, so anything waiting on it blocks until Set() is called).

Then you'll need to add this to the list:
al.Add(m);

OK, now that some of the thread junk is out of the way, we'll want to actually *do* something.

For example, let's say we create a class called threadClass.
threadplay.threadClass c = new threadplay.threadClass(m);
In this example, I'm passing the ManualResetEvent to the constructor. The object needs to hold a reference to it so the callback can signal it later.

Next, you'll need a callback method. This is what the thread calls to do its work. It's the "DoStuff" method.
In my class, I called it "theCallback". It will need to have this signature:
public void theCallback(Object x)

Inside the callback method, you can typecast the Object to whatever makes sense. So you can pass in data connections, collections, or just about anything you like.
In my case, I created a class called "playObj" with some boring properties, which I set through its constructor:
new threadplay.playObj("kjh", i)

To set this up to run on a thread, just pop it onto the ThreadPool:

ThreadPool.QueueUserWorkItem(c.theCallback, new threadplay.playObj("kjh", i));

This queues the work item to run as soon as a pool thread is free. By default .Net manages the thread pool for you. You can read or change the minimum and maximum number of threads (ThreadPool.GetMinThreads/GetMaxThreads and SetMinThreads/SetMaxThreads) if you want to, but .Net is supposed to scale this correctly based on system resources. If you're contending for a resource .Net doesn't know about (for example, a hard limit on the number of open database connections), you may want to modify the default behavior. But it is supposed to account for CPU utilization and be optimized already.
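Here's a minimal sketch of reading and capping the pool limits mentioned above. The class and method names (PoolLimits, GetWorkerLimits, TryCapWorkers) and the cap of 64 are made up for illustration; only the ThreadPool calls are real. Note that the setting is process-wide, and SetMaxThreads returns false (and changes nothing) if the requested cap is below the current minimum or the processor count.

```csharp
using System;
using System.Threading;

public static class PoolLimits
{
    // Returns { min worker threads, max worker threads } for the shared pool.
    public static int[] GetWorkerLimits()
    {
        int minWorkers, minIo, maxWorkers, maxIo;
        ThreadPool.GetMinThreads(out minWorkers, out minIo);
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        return new[] { minWorkers, maxWorkers };
    }

    // Tries to cap worker threads at 'cap', leaving the I/O completion
    // thread limit unchanged. Returns false if the pool rejects the value.
    public static bool TryCapWorkers(int cap)
    {
        int maxWorkers, maxIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        return ThreadPool.SetMaxThreads(cap, maxIo);
    }

    public static void Main()
    {
        int[] limits = GetWorkerLimits();
        Console.WriteLine("min workers: {0}, max workers: {1}", limits[0], limits[1]);

        // 64 is an arbitrary example value, e.g. a database connection limit.
        bool accepted = TryCapWorkers(64);
        Console.WriteLine("cap accepted: {0}", accepted);
    }
}
```

If the real constraint is something like a connection limit, a cap on the whole pool is a blunt tool, since every other user of the pool in the process is throttled too.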

Inside the callback method, you'll want to call Set() on the ManualResetEvent. Do it at the bottom of the method, when the work is done, so that anything waiting on the event is released.

Generally, the call to QueueUserWorkItem will happen in a loop (otherwise, what's the point?).
The purpose of the ManualResetEvent is to allow you to wait for each thread to finish. Since you're not controlling the thread pool, you can't get to the actual threads easily, so you can't do a Join(). Instead, you can use WaitHandle.WaitAll() to wait for all the threads to finish (or WaitAny(), if you only want the first one).

Note that WaitAll() expects an array, not a List<>. But you can convert. (Also note that WaitAll() accepts at most 64 handles per call.)
WaitHandle.WaitAll(al.ToArray());
(With a non-generic collection such as ArrayList, you would have to typecast the result. Since I was using a List<ManualResetEvent>, ToArray() returns a ManualResetEvent[], which .Net accepts where a WaitHandle[] is expected.)

That's it. The hardest part is the ManualResetEvent stuff.

This will use a pool of n threads, queue any work items added after the nth until they can run, and wait for all of them to finish before continuing. Wiring it together is a bit of a pain because of the ManualResetEvent complexity, but besides that, it's very clean and removes the overhead of managing system resources yourself.
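Putting the pieces together, here's a minimal sketch of the whole pattern. The names WorkItem, Worker, and ThreadPoolSketch are stand-ins for playObj and threadClass (whose real contents aren't shown here), and squaring the index is just placeholder work. One change from the description above: Set() sits in a finally block, so the waiter is released even if the work throws.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical payload, standing in for playObj.
public class WorkItem
{
    public string Name;
    public int Index;
    public int Result;
    public WorkItem(string name, int index) { Name = name; Index = index; }
}

// Hypothetical worker, standing in for threadClass.
public class Worker
{
    private readonly ManualResetEvent done;
    public Worker(ManualResetEvent done) { this.done = done; }

    // Matches the WaitCallback signature: void (object).
    public void TheCallback(object state)
    {
        try
        {
            WorkItem item = (WorkItem)state;       // cast back to the real type
            item.Result = item.Index * item.Index; // placeholder work
        }
        finally
        {
            done.Set(); // signal completion even if the work failed
        }
    }
}

public static class ThreadPoolSketch
{
    // Queues 'count' items, waits for all of them, returns their results.
    // WaitAll accepts at most 64 handles, so keep count <= 64 here.
    public static int[] RunAll(int count)
    {
        if (count == 0) return new int[0]; // WaitAll rejects an empty array

        var events = new List<ManualResetEvent>();
        var items = new List<WorkItem>();
        for (int i = 0; i < count; i++)
        {
            var m = new ManualResetEvent(false); // starts unsignaled
            events.Add(m);
            var item = new WorkItem("kjh", i);
            items.Add(item);
            ThreadPool.QueueUserWorkItem(new Worker(m).TheCallback, item);
        }

        WaitHandle.WaitAll(events.ToArray()); // blocks until every Set() has run

        var results = new int[count];
        for (int i = 0; i < count; i++)
        {
            results[i] = items[i].Result;
            events[i].Dispose(); // release the OS handle
        }
        return results;
    }

    public static void Main()
    {
        int[] results = RunAll(5);
        Console.WriteLine(string.Join(",", results)); // prints 0,1,4,9,16
    }
}
```

For more than 64 work items, one option is to wait in batches; another is a single shared counter with one event, but that's a different design from the one described here.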

--kevin



Wednesday, September 22, 2010

Linux security

Thought this was interesting. To sum it up, there's a huge flaw in the 64-bit Linux kernel, specifically related to running 32-bit apps. With the right hack, you can create a buffer overflow, which will allow you to do just about anything you want.
As I've noted before, I think the whole idea of a root account is one of Linux's most concerning features. The Debian-based distros have tried to hide that by not allowing root to actually log in (i.e., making you use sudo), but all that does is add a level of abstraction on top of the issue.

Making matters more concerning is the groundless opinion of the Linux faithful that Linux is so solid, no one ever needs a virus protection tool. So a flaw like this can go uncaught. In addition, with this bug, a hacker can create a root session and persist that to disk via a binary. The impact of this is that even after the patch is applied, the root session may still be available, if you've already been hacked. It's just an executable file on the disk that contains a shell with root privs. It could be called anything, making it just about impossible to find.

Anyway, thought it was interesting.
http://sota.gen.nz/compat1/

Monday, September 13, 2010

CLR

Had the opportunity to work with a few CLR stored procedures in SQL Server.
Cool stuff.
Here's a quick primer on how to do it, what to do, what not to.
We'll start with something simple, but useful. We've got a need to pass in a string, do "something" to it, then return the modified string.
First, I'll pull together the CLR/.Net code.
Let me put it here, then I'll explain a bit.
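using System;
using System.Data;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public class HelloWorldProc
{
    // Marks the method as a CLR stored procedure entry point.
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void HelloWorld(SqlString x, out SqlString sOut)
    {
        // Trivial body: hand the input straight back through the out parameter.
        sOut = x;
    }
}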

The System.Data.SqlTypes is important, as is the SqlServer.Server using statement.

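I think ( ? ) the class needs to be in the default namespace. I couldn't get it to work any other way. (If the class does live in a namespace, the documented form is to bracket the namespace-qualified class name in EXTERNAL NAME, for example HelloWorld.[MyNamespace.HelloWorldProc].HelloWorld, where MyNamespace is a made-up name.)
The method needs to be static, which does make sense, I think.
Also, note the weird output thing I'm doing. I have to declare the method as void, then hand the result back through an "out" parameter.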

The reason is the limitation placed on SQL CLR stored procedures. They can only return integer types (for example, System.Int32, SqlInt32, etc) or void. This mirrors T-SQL, where a stored procedure's RETURN value is always an integer status code. Still seems silly to me that they can handle an output parameter, but not a string return value.
Note that the parameter types are SqlStrings, not strings. Since both sides of the assignment are SqlString, it doesn't even need an explicit typecast.
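Once the method does more than copy its input, the SqlString conversions start to matter. Here's a minimal sketch, runnable as a plain console program outside SQL Server; the class name SqlStringSketch, the method Shout, and the upper-casing step are all made up for illustration. Going from string to SqlString is implicit, going the other way needs .Value (or an explicit cast), and a SQL NULL has to be checked before touching .Value.

```csharp
using System;
using System.Data.SqlTypes;

public static class SqlStringSketch
{
    // Same shape as the stored procedure method: void, with an out parameter.
    // Inside SQL Server this would also carry the [SqlProcedure] attribute.
    public static void Shout(SqlString x, out SqlString sOut)
    {
        if (x.IsNull)
        {
            sOut = SqlString.Null; // pass SQL NULL through; x.Value would throw
            return;
        }

        string s = x.Value;          // SqlString -> string via .Value
        sOut = s.ToUpperInvariant(); // string -> SqlString is implicit
    }

    public static void Main()
    {
        SqlString result;

        Shout("one", out result);          // the string literal converts implicitly
        Console.WriteLine(result.Value);   // prints ONE

        Shout(SqlString.Null, out result);
        Console.WriteLine(result.IsNull);  // prints True
    }
}
```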
On the SQL Server side, the assembly has to be registered in the database.

CREATE ASSEMBLY HelloWorld from 'c:\temp\HelloWorld.dll' WITH PERMISSION_SET = SAFE;

Next I created a stored procedure to wrap the assembly.

CREATE PROCEDURE SerializeXmlNodes
    @x NVARCHAR(1000),
    @sOut NVARCHAR(1000) OUTPUT
AS
EXTERNAL NAME HelloWorld.HelloWorldProc.HelloWorld;

The input and output of the stored procedure match the input and output parameters of the .Net method.
Calling the stored proc is done about the same way as calling a "normal" stored proc:

DECLARE @sOut NVARCHAR(1000);

EXEC SerializeXmlNodes @x = 'one', @sOut = @sOut OUTPUT;

SELECT @sOut;
In this case, @sOut will be whatever you set @x to, but that's because it's a trivial example.
To get rid of the CLR procedure, you need to drop the stored procedure first ("DROP PROCEDURE SerializeXmlNodes"), then drop the assembly ("DROP ASSEMBLY HelloWorld").
Very cool stuff, but a little confusing to set up. Still, it allows string manipulations that aren't available through T-SQL without serious pain.
--kevin