Wednesday, November 10, 2010

threadPools again

Here's a supremely annoying ThreadPool undocumented/under-documented feature that cost me hours.
If you have a resource you're using -- say a database connection -- and you want to limit it -- say because you have a hard limit on the number of simultaneous database connections you can open -- then you can limit the number of simultaneous threads a ThreadPool will run by using the SetMaxThreads method:

ThreadPool.SetMaxThreads(X, Y);

All good. Only:
-- under-documented feature: if you set the max threads to less than the number of CPU cores on your system, the setting is ignored. That is, if you're running on a quad-core machine and you set the max threads to 2, the ThreadPool assumes you don't know what you're talking about, assumes it can think for you, and disregards your setting. (The only hint you get is that SetMaxThreads returns false when it rejects the request.)

-- un-documented feature: if you set the max threads to less than the number of cores, the max threads setting *doesn't* fall back to the number of cores; it's ignored completely. The ThreadPool doesn't clamp to the lowest allowed value, it just keeps whatever maximum it feels like using.
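
Here's a minimal sketch of the behavior; the only hint that the throttle was rejected is the false return value from SetMaxThreads:

using System;
using System.Threading;

class ThreadPoolDemo
{
    static void Main()
    {
        // Try to throttle the pool to 2 worker threads.
        // SetMaxThreads returns false when it rejects the request --
        // e.g., when 2 is less than Environment.ProcessorCount.
        bool accepted = ThreadPool.SetMaxThreads(2, 2);
        Console.WriteLine("Setting accepted: " + accepted);
        Console.WriteLine("Cores: " + Environment.ProcessorCount);

        int workers, io;
        ThreadPool.GetMaxThreads(out workers, out io);
        Console.WriteLine("Actual max worker threads: " + workers);
    }
}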

This makes the ThreadPool mostly useless.

Look, if I'm writing an app that uses no resources except the CPU, that's great. But who writes apps that don't use any resources?
Moreover, most software developers don't have any control over the hardware on which their app runs. If I'm a typical line of business developer, I'm writing code that is set up to run on a server somewhere against a database. Likely, I have no control over the server. And it can be upgraded at any time.
In the days of virtualization, an administrator can change the number of CPU cores assigned to a virtual machine on the fly. So if my app runs on a server, and an unrelated app runs on the same server, and the unrelated app needs more horsepower, admins can add CPU cores without even telling me.

If they do that, then all of a sudden my app -- which was throttled to a small number of threads so that it wouldn't smash the database connections -- has its max threads set below the new core count. In that case, the ThreadPool completely ignores the setting and spins off a zillion threads if it feels like the CPU can take it... completely crushing the database.

It's so frustrating when Microsoft does this. They make a great product, with a great feature, then make it totally unusable in real-world, line-of-business solutions.

GRRRRRRRRRRRR

--kevin

Thursday, September 30, 2010

threads and C#

I'm old school, I guess. Probably because I'm old. But whenever I need to do some multi-threaded programming, I open up the System.Threading namespace and start creating threads. So that I don't overwhelm the system, I usually throttle these threads and keep a thread collection somewhere.
It's old school.
I only recently found out that there is a slick way to do this in .Net.
First, you'll need a way to know when the thread has finished. Since there will be n of them, you'll need some type of collection.
Specifically, this will need to be a collection of ManualResetEvent objects.
For example,
System.Collections.Generic.List<ManualResetEvent> al = new List<ManualResetEvent>();

Each time a new thread is needed, you'll create a new ManualResetEvent object, like this:
ManualResetEvent m = new ManualResetEvent(false);
(the boolean sets the initial state; false means the event starts out unsignaled).

Then you'll need to add this to the list:
al.Add(m);

ok, now that some of the thread junk is out of the way, we'll want to actually *do* something.

For example, let's say we create a class called threadClass.
threadplay.threadClass c = new threadplay.threadClass(m);
In this example, I'm passing the ManualResetEvent to the constructor. The object will need a reference to it somewhere.

Next, you'll need a callback method. This will be what the thread calls to do its work. It's the "DoStuff" method.
In my class, I called it "theCallback". It will need to have this signature:
public void theCallback(Object x)

Inside the callback method, you can typecast the Object to whatever makes sense. So you can pass in data connections, collections, or just about anything you like.
In my case, I created a class called "playObj" with some boring properties, which I passed into the constructor:
new threadplay.playObj("kjh", i)

To set this up to run from a thread, just pop it onto the ThreadPool:

ThreadPool.QueueUserWorkItem(c.theCallback, new threadplay.playObj("kjh", i));

This will enqueue the work item to run at the next possible moment. By default, .Net will manage the thread pool for you. You can read or change the max or min number of threads if you want to, but .Net is supposed to scale this correctly based on system resources. If you're contending for a resource .Net doesn't know about (for example, a hard limit on the number of open database connections), you may want to modify the default behavior. But out of the box, it's supposed to account for CPU utilization and already be optimized.

Inside the callback method, you'll want to call Set() on the ManualResetEvent. Do it at the bottom of the method, when the work is done (a try/finally is a good idea, so a thrown exception can't leave the handle unsignaled).

Generally, the call to QueueUserWorkItem will happen in a loop (otherwise, what's the point?).
The purpose of the ManualResetEvent is to allow you to wait for each thread to finish. Since you're not controlling the thread pool, you can't get to the actual threads easily, so you can't do a Join(). Rather, you can use WaitHandle.WaitAll() to wait for all the threads to finish (or WaitAny(), if you only want the first one).

Note that WaitAll() expects an array, not a List<>. But you can convert:
WaitHandle.WaitAll(al.ToArray());
(for some collection types, you may need to typecast this to (ManualResetEvent[]), but since I was using a List<ManualResetEvent>, ToArray() gives me the right type for free).

That's it. The hardest part is the ManualResetEvent stuff.

This will create a pool of n threads, queue anything added after the nth until a thread frees up, and wait for all of them to finish before continuing. Wiring it together is a bit of a pain because of the ManualResetEvent plumbing, but besides that, it's very clean and removes the overhead of managing system resources yourself.
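
Putting the pieces together, here's a minimal end-to-end sketch (same toy names as above; the "work" is just a Console.WriteLine, and the playObj properties are made-up placeholders):

using System;
using System.Collections.Generic;
using System.Threading;

namespace threadplay
{
    // Worker class: holds the ManualResetEvent it will signal when done.
    public class threadClass
    {
        private readonly ManualResetEvent _done;
        public threadClass(ManualResetEvent done) { _done = done; }

        // Matches the WaitCallback signature that QueueUserWorkItem expects.
        public void theCallback(Object x)
        {
            try
            {
                playObj p = (playObj)x;
                Console.WriteLine("Working on {0} / {1}", p.Name, p.Id);
            }
            finally
            {
                _done.Set(); // signal completion even if the work throws
            }
        }
    }

    public class playObj
    {
        public string Name;
        public int Id;
        public playObj(string name, int id) { Name = name; Id = id; }
    }

    class Program
    {
        static void Main()
        {
            List<ManualResetEvent> al = new List<ManualResetEvent>();
            for (int i = 0; i < 10; i++)
            {
                ManualResetEvent m = new ManualResetEvent(false);
                al.Add(m);
                threadClass c = new threadClass(m);
                ThreadPool.QueueUserWorkItem(c.theCallback, new playObj("kjh", i));
            }
            // Note: WaitAll is limited to 64 handles; batch the waits for larger counts.
            WaitHandle.WaitAll(al.ToArray());
            Console.WriteLine("All done.");
        }
    }
}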

--kevin



Wednesday, September 22, 2010

Linux security

Thought this was interesting. To sum it up, there's a huge flaw in the 64-bit Linux Kernel, specifically related to running 32-bit apps. With the right hack, you can create a buffer overflow, which will allow you to do just about anything you want.
As I've noted before, I think the whole idea of a root account is one of Linux's most concerning features. The Debian-based distros have tried to hide that by not allowing root to actually log in (i.e., you use sudo instead), but all that does is add a level of abstraction on top of the issue.

Making matters more concerning is the groundless opinion of the Linux faithful that Linux is so solid no one ever needs virus protection, so a flaw like this can go uncaught. In addition, with this bug, a hacker can create a root session and persist it to disk as a binary. The impact is that even after the patch is applied, the root session may still be available if you've already been hacked. It's just an executable file on the disk that contains a shell with root privs. It could be named anything, making it just about impossible to find.

Anyway, thought it was interesting.
http://sota.gen.nz/compat1/

Monday, September 13, 2010

CLR

Had the opportunity to work with a few CLR stored procedures in SQL server.
Cool stuff.
Here's a quick primer on how to do it, what to do, what not to.
First, something simple, but useful. We've got a need to pass in a string, do "something" to it, then return the modified string.
First, I'll pull together the CLR/.Net code.
Let me put it here, then I'll explain a bit.
using System;
using System.Data;
using Microsoft.SqlServer.Server;
using System.Data.SqlTypes;

public class HelloWorldProc
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void HelloWorld(SqlString x, out SqlString sOut)
    {
        sOut = x;
    }
}

The System.Data.SqlTypes is important, as is the SqlServer.Server using statement.

I think (?) the class needs to be in the default namespace. I couldn't get it to work any other way.
The method needs to be static, which does make sense, I think.
Also, note the weird output thing I'm doing. I have to make the return type void, then deal with the output as an "output" parameter.

The reason is a limitation placed on SQL CLR procedures: they can only return integer types (for example, System.Int32, SqlInt32, etc.) or void. Frankly, I've no clue why. Seems silly to me that they can handle an output parameter but not an arbitrary return type.
Note that the parameter types are SqlStrings, not strings. Yet the assignment doesn't even need an explicit typecast.
On the SQL Server side, the assembly has to be registered. (If you haven't already, you'll also need to enable CLR integration on the server: sp_configure 'clr enabled', 1, then RECONFIGURE.)

CREATE ASSEMBLY HelloWorld from 'c:\temp\HelloWorld.dll' WITH PERMISSION_SET = SAFE;

Next I created a stored procedure to wrap the assembly.

CREATE PROCEDURE SerializeXmlNodes
    @x NVARCHAR(1000),
    @sOut NVARCHAR(1000) OUTPUT
AS
EXTERNAL NAME HelloWorld.HelloWorldProc.HelloWorld;

The input and output of the stored procedure match the input and output parameters of the .Net method.
Calling the stored proc is done about the same way as calling a "normal" stored proc:

DECLARE @sOut NVARCHAR(1000);
EXEC SerializeXmlNodes @x = 'one', @sOut = @sOut OUTPUT;
SELECT @sOut;
In this case, @sOut will be whatever you set @x to, but that's because it's a trivial example.
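
If you want to call it from .Net client code, it's the standard output-parameter dance. Here's a sketch (the connection string is a placeholder; the proc name is the one created above):

using System;
using System.Data;
using System.Data.SqlClient;

class CallClrProc
{
    static void Main()
    {
        // The connection string is a placeholder -- point it at your own server.
        using (SqlConnection conn = new SqlConnection(
            "Data Source=someServer;Initial Catalog=someDb;Integrated Security=SSPI"))
        using (SqlCommand cmd = new SqlCommand("SerializeXmlNodes", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@x", "one");
            SqlParameter outParam = cmd.Parameters.Add("@sOut", SqlDbType.NVarChar, 1000);
            outParam.Direction = ParameterDirection.Output;

            conn.Open();
            cmd.ExecuteNonQuery();
            Console.WriteLine(outParam.Value); // "one" in this trivial example
        }
    }
}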
To get rid of the CLR, you need to drop the stored procedure first, then drop the assembly ("DROP ASSEMBLY HelloWorld").
Very cool stuff, but a little confusing to set up. Still, it allows string manipulations that aren't available in TSql without serious pain.
--kevin

Friday, August 13, 2010

Killing Java


Will Oracle kill Java? They're currently suing Google over the way Google used Java in Android.
http://www.reuters.com/article/idUSTRE67B5G720100813


I don't know anything about the lawsuit or what's being alleged except what's in the article. Most interesting is Oracle's CEO's statement:


Oracle Chief Executive Larry Ellison has said he views the Java software as a key asset, pointing to its use in a variety of electronic devices, from PCs to DVD players. "Sun's corporate philosophy was obviously very different from Oracle's in terms of enforcing the Java patents," said Edward Reines, an IP litigator at Weil Gotshall who is involved in separate patent litigation against Oracle.



The fact that Oracle sees Java as an "asset" and is talking about patents is probably the end of Java. For the last 10 years, open source enthusiasts have been begging Sun to make Java open source. Sun kept saying "no" but they kept throwing little bones out -- submitting Java to ANSI, free compilers, free tools, etc. Looks like maybe that's over. Gosling quit citing "ideological differences."

Now this.


As to Google, my prediction is that they will create their own language. Google's search engine is built on Linux, but the kernel was completely re-written. So I think Google's response, if Oracle takes their ball and goes home, is to create another ball.


For the open source community -- they're in trouble. If Oracle starts to lock down MySQL (now that they own that too), the open sourcers are in the strange situation of not having a database or a language. Their only desktop OS has almost no vendor support now (NVidia already dropped Linux support for their video cards). Apache is about the only thing left. And Mono.


It would be ironic if C#/Mono became the new open source standard. In truth, when I look at the desktop world, I think it's almost shifted so that Microsoft is now one of the least locked down, most usable environments. Apple has made huge strides and become a major player. Based on market cap, they're actually bigger than Microsoft now. But they're infinitely proprietary. Google is awesome, but nothing if not proprietary. With Oracle's hard-line stand on Java (if it continues), Microsoft is actually the least locked down of all the major vendors. I frankly never understood all the Microsoft hatred. Most of the Linux guys I know are fanatics about Apple, which is way more locked down than anything Microsoft ever did. Many of them are Oracle fans, too, and Oracle has always been locked down.


Java is 15 years old now and, while it's really cool, is in need of an overhaul. Maybe this will be the impetus for a new Java-like/C#-like open source language?

Although, more likely, Google will just create its own and lock it.

Thursday, July 29, 2010

powershell shortcut

Just thought this was cool.
Suppose you want to monitor or look at the scheduled tasks on a remote server.
You could RDP to the box, then start the Task Scheduler UI and look at them.
Or you could open a command prompt and run:

schtasks.exe /query /S someServer /U someDomain\someUser /P somePassword

(substituting your own server name and credentials, of course)

You'll get a list of everything scheduled that you have privs to see, including Next Run Time and Status.
This doesn't only include things you put in the Task Scheduler yourself, but even system tasks scheduled to run. For example, things from
Folder: \Microsoft\Windows\Tcpip
like
IpAddressConflict1

Pretty cool stuff.

Sunday, July 18, 2010

workflow timeouts

just thought this was strange and it took me a bit of debugging so I thought I'd share.

Suppose you have a workflow that does database updates -- say to SQL Server. Your SQL commands have timeouts. So does your workflow. But these two timeouts are unrelated. If your SQL timeout is longer than your workflow one, you can find yourself in the situation where your workflow times out before your query finishes. It's an easy mistake to make, especially if multiple people are working on it, or if you're using default values and it's not obvious what the timeouts are.

There are a series of delegates you set up for workflows: OnWorkflowAborted, OnWorkflowCompleted, OnWorkflowTerminated, etc.

You'd think -- or at least I did -- that one of the delegate methods would fire if the workflow timed out. But it's not true. Instead, WaitOne() on the wait handle returns false when the timeout expires. So:

bool Ok = waitHandle.WaitOne(WORKFLOW_TIMEOUT);

In case of a timeout, Ok is set to false. And that's the only indication that the flow timed out.

So you need to do something like
if (waitHandle.WaitOne(WORKFLOW_TIMEOUT))
{
    // normal completion logic
}
else
{
    // timeout logic
}

What I think is even more interesting is that I couldn't find this in any of the sample workflow code I saw on the web. Really, most of it doesn't even set a timeout, which is probably "not optimal" (i.e., "really dumb").
But even the code that does set a timeout never checks to see whether it actually has timed out.
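
For reference, here's a minimal WF-style host sketch showing where the check goes (MyWorkflow is just a stand-in for your real workflow type):

using System;
using System.Threading;
using System.Workflow.Activities;
using System.Workflow.Runtime;

// Trivial stand-in workflow so the sketch compiles; use your real flow type.
class MyWorkflow : SequentialWorkflowActivity { }

class WorkflowHost
{
    static readonly TimeSpan WORKFLOW_TIMEOUT = TimeSpan.FromSeconds(30);

    static void Main()
    {
        using (WorkflowRuntime runtime = new WorkflowRuntime())
        {
            AutoResetEvent waitHandle = new AutoResetEvent(false);

            // These fire on completion or termination -- but NOT on a host-side timeout.
            runtime.WorkflowCompleted += delegate { waitHandle.Set(); };
            runtime.WorkflowTerminated += delegate { waitHandle.Set(); };

            WorkflowInstance instance = runtime.CreateWorkflow(typeof(MyWorkflow));
            instance.Start();

            // WaitOne returns false if the timeout elapses first --
            // the only indication that the flow timed out.
            if (waitHandle.WaitOne(WORKFLOW_TIMEOUT, false))
            {
                Console.WriteLine("Workflow finished (completed or terminated).");
            }
            else
            {
                Console.WriteLine("Workflow timed out; aborting.");
                instance.Abort();
            }
        }
    }
}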

Anyway, thought this may help someone sometime, so I'd post it.

--kevin


Friday, July 9, 2010

a while

I've been kinda busy and neglecting this blog. My bad. I've got a few geek thoughts to document about Windows Workflow and some TSql things that are pretty cool. I've also been playing with Studio 2010 a bit.
If I can get that documented this weekend, I will.

Until then, I ran across this quote from Einstein. At first I blew by it, but then I realized how it fits software development. The quote is:
"A clever person solves a problem. A wise person avoids it."

The part that's always disappointed me about IT is that cleverness is rewarded but wisdom is not. If you write terrible code that blows up and you work all weekend to fix it, you're a hero. A friend was telling me that Intel used to reward people for the number of bugs they found. Unfortunately, the people they rewarded were the people who were writing the code to begin with. The clever people added a lot of very complex bugs and then were heroes for finding them. And they got a bonus for each one.
I've seen that. One of my prior employers used to give spot bonuses for extra work. The people who won them were always the folks who did crappy work, but then would work all weekend to fix the problems it caused. I remember a couple cases where the systems were so bad, that they asked me to take a look at them. I corrected the issues, so that they didn't fail anymore. And I got... nothin'. Not even an "atta-boy".

Of course, on the positive side, I got to sleep at nights, since my stuff didn't blow.
:D

Saturday, June 12, 2010

Linux again

and now, Adobe has killed its support for 64-bit Linux.


Friday, May 28, 2010

microsoft's future

Not a very techie post this time. I have one I need to put up, but this caught my attention.

In summary, when you count up all the stock for Microsoft and for Apple, then multiply by the value of said stock, Apple is now worth more (higher market capitalization) than Microsoft. MS's CEO said, basically, "who cares." When you look at gross profits, Microsoft wins, but not by that much -- roughly $56 billion/year to Apple's $52 billion.

Problem is that only a few years ago, Microsoft had to bail Apple out of insolvency so that MS wouldn't become a monopoly (although some argued that, since it controlled 98% of the market, it was one anyway). So the trend is clear. While Microsoft has been basically stagnant, Apple has gone from a tiny, near-bankrupt player to an industry leader. It's pretty clear who has momentum.

I've blogged about this before, but I see PCs lasting about another 5-10 years and then disappearing from the mainstream. Software developers will use them. People who do a lot of photo manipulation will use them. People who edit home video will too. And there will be people who have specialty applications (accountants who use QuickBooks or something).
But most will use iPads and netbooks. For games, it will be the Xbox, the PS3, and the Wii. They'll watch video on game consoles too.

Note that none of those devices run Windows. In addition, Microsoft has a smart phone which sucks and can't compete with Google or Apple.

Microsoft is making great strides in the server market. But I think they're starting to look more and more like IBM about 10 years ago. With Oracle's purchase of Sun, and IBM re-inventing itself, I'm not sure the future is really on the servers. Servers do... what they do. Sure, there will be tweaks here and there, but, besides security and compliance, servers do about what they did 15 years ago. The improvements are mostly due to better hardware, not better OS's. And, frankly, I'm not seeing much growth in the U.S. server market (China is a different story). Sure, companies upgrade from time to time, but most everyone that needs a server has one by now. It's easy to think that servers are their future, but IBM thought the same thing. Now IBM has become very, very profitable making software and selling consultants. Their servers are still crankin', but sales of them have been pretty flat.

So, here's my prediction.

-- Based on what Ballmer is saying, I don't see MS changing direction any time soon.

-- MS' console sales are going to stay basically flat. They had a great competitive advantage when you could only watch Netflix on the Xbox. But now you can do it on PS3s and Wiis too.

-- MS' desktop OS sales are going to drop dramatically in the next 5-10 years. I'd say they're looking at a 20% loss or more. In fact, if Apple keeps stealing market share, it could be worse. Last I saw, MS was down from 98% market share on PCs to 92%. And when you look at Windows 7 alone, it hasn't even caught the Mac yet.

-- MS won't ever be a major player in cell phones or hand-held devices. The problem is that MS keeps trying to use Windows for everything. When you look at what Google did, they basically took a browser and made it into a small OS. It's light, small, and runs great on underpowered devices. That's *tons* easier than taking a bloated OS that's intended to run on every computer (PC and server) ever made and trying to squeeze it down to a phone. I have a Windows phone. I'd give it a C-. It uses Windows' interface, which is impossible to navigate on a cell; it crashes regularly, runs slow, and requires regular reboots.

-- MS will stay strong on servers. But I wouldn't expect a lot of revenue growth.

I guess I'm wondering if they aren't going the way of Sun and IBM. A few years ago, some developers walked around with an attitude because *they* were *serious* developers (meaning they worked on mainframes, not those silly PCs). I wouldn't be shocked if, 10 years from now, it's the Microsoft developers who do that, while the Android developers slowly take over.
If I'm right, it will mean that the software development industry moves back to the wild west. Since there will be less standardization among non-servers (PCs, game consoles, phones, iPads, netbooks, etc.), and each will use its own OS, there will be fewer tools available for development and fewer standards on how to do it.
This could be good for IT, and it could be bad. On the one hand, I think it will get harder to do client development. Even now, if you think about a bank app to check your balance, it has to run on a PC, a Mac, an iPad/pod/phone, a handful of other phones (the Palm Pre, for example), and Chrome OS. That *has* to be a pain.
On the bright side, it will keep us developers busy. I'd expect whole new investment opportunities and I'd expect to see developers make more money because of the additional challenges.
We'll see what the future holds.

Tuesday, May 4, 2010

Geek Humor

Lots of stuff to blog about, but for today, just some geek humor.
I've soooo been on this project.

Just made me laugh.
--kevin

Sunday, April 18, 2010

wasssup with java?

Every time I turn on my PC now, Java wants to update. What's up with that? Java is 14 years old. I really don't expect a 14-year-old system to keep needing new patches all the time. I mean, I get that the hardware changes and whatnot, and it needs to be updated from time to time. But seriously... I must have had to patch 6 times so far this month.
Now this:

I've gotten to do some Java development again recently, and I do still like it. It's a fine technology. But I wonder if it's getting left a bit behind. Microsoft has really been pushing some of the tools and extended features of .net. Generics, attributes, workflows, now Pex and some of the new tools. Don't get me wrong, I don't think Java's going anywhere. It's a great language and you can do awesome stuff with it.
But I've always thought Java was a language in search of a purpose. At first (remember?) it was designed for portable devices -- cell phones, PDAs, MP3 players. That was its birth reason. These hand-held things changed constantly, so having an interpreted language (write once, test everywhere) was a great idea.

Except that it was too top-heavy to run on those things (especially 14 years ago). So it was proclaimed to us that Java applets were the way of the future for the web.

But, of course, they're top-heavy too. And they took a lot of bandwidth, back when people were still connecting over modems.

So, we were told that what was coolest about Java was its ability to run on different servers, and Java servlets were the answer. This really was a great idea. And it took hold a little. But there were two problems with it. First, companies don't need to have the same language across multiple servers. They don't move apps from Unix to mainframe to Windows very often, so the idea of a cross-platform language was unneeded. Second, the hard-core OS developers (like Unix admins and guys who wrote assembler on mainframes) really didn't care about Java. These weren't the software geeks who got stoked by new object-oriented... anything. And they were perfectly happy with Perl, thank you very much.

So, the *real* reason for Java we were told was CORBA. This would allow developers to write thin tiers on Windows and Unix (and whatever else) to facilitate communication between boxes. And then the hard-core OS devs could write the real complex stuff, and it could be invoked from Java. Great idea. Terrible implementation. I'm sure someone somewhere is using it, but I never spoke to anyone who could actually get it to work reliably.

Next, we learned that the *real* purpose for Java was UI development on a desktop. Enter Swing (well, AWT pre-dated it, but sucked so bad I don't even want to remember it).
Swing, again, was a great idea. And a wonderful implementation. Personally, I think Microsoft has *finally* just caught up with some of the great ideas (like databound controls that bind to objects rather than databases, really separating the UI from the raw data).
I spent about a year developing Swing apps. And I have to say that it was the greatest framework I've ever used that really, really sucked. Developing in it was cool. But it was so, sooooooooo buggy. I remember writing window frames that the user had to resize manually in order to get them to refresh. There was nothing I could do programmatically to force it... that actually, you know... worked.

In the end, EJB won the fight. And it is cool, it really is. But EJB's big win is also its big loss: Sun, who developed the standard, doesn't actually build the implementations. Which means there's a huge disparity in reliability. What works in IBM's server doesn't in BEA's. What's great in JBoss isn't great in IBM's.

And now this.
Gosling quit.
Not a surprise after the takeover by Oracle. Still, it's a blow.

So I wonder where Java is going. I really would love to see Gosling connect with some of his old peeps and create a "newJava". I really like Java, but I think it's time for a makeover. What would be even cooler is if he really went open source this time and connected with the ANSI group from the start.

Meanwhile, I'd be happy if I just didn't have to download new patches every day.
--kevin

Tuesday, March 30, 2010

Is Linux finally dead?

I really like Linux. You have to understand. When I started in IT, I'd spent a fair amount of time learning to program PCs. I could tell you from memory which interrupt was the mouse interrupt (12, wasn't it?) and how PCs allocated hard drive files. When they sat me down at my desk my first day on the job, it was in front of a brown-screen Wyse terminal jacked into an HP Unix server. It wasn't so much a learning curve as a learning cliff. But I managed.

Over the years, I switched from HP UX to Solaris and learned how to do low-level file IO in C and how Unix/Linux/Posix does inodes. Good times. :D

At home, I've almost always run a Linux PC. I love the flexibility in X Windows. I geek out on running different desktop managers. I spent days (days!) changing my splash screen and getting Fluxbox correctly configured.

I almost abandoned Linux a few years ago, but Ubuntu brought me back. What an awesome build.

So I got a really nice laptop a few months ago -- quad core, 6gb, 1/2 TB hard drive (with an open bay) and a really kickin' ATI graphics card. In many ways, it's the PC I'd always promised I'd get someday.
It came with Windows, and I actually waited for the latest rev of Ubuntu (Karmic) to set it up for dual boot.

But... it kinda... sucks. The audio card works great, unless you plug in the headset; then it plays sound through the headset and the speakers both. The graphics look OK, but the vendor's graphics drivers don't work. They cause the PC to do all kinds of crazy things (like the wifi card just drops when I use the vendor's drivers). I can read CDs but can't seem to burn DVDs -- and the LightScribe feature is totally unavailable because the manufacturer won't support Linux. My phone plugs in, but Linux can't read it. Same with my camera.

I've read a lot online that Karmic is buggy. I can live with that. I mean, just between you and me, let's face it, Vista sucked. You upgrade and move on.

But the software for desktop Linux is disappearing. And what's left isn't working very well. VirtualBox (which is a really great product, by the way) has constant graphics issues with the new revisions of Linux. And it's crashed hard on me a few times. Rhythmbox crashes if I change the volume twice (I can do it once). Google *finally* added Linux support for Chrome, but it clearly wasn't at the top of their list.

And now I see this:

Basically, NVidia is dropping open source support. That means (in effect) Linux won't be able to really support NVidia cards anymore. As mentioned, my ATI doesn't work either. So that leaves Linux in a pretty bad place.

Add to that the fact that now Sony is dropping the option to (easily) install Linux on a PS3, and you start to see a trend.
Just about all the computer game makers have dropped their Linux support. There was a time when you could get the latest game ported to Linux, and available online or sometimes even at the local computer store. But those days are long gone.

Dell still sells Linux PCs (I think?). But last I looked, they had one (count 'em, ONE) Latitude laptop with Linux, and it came with the not-so-stunning, bottom-of-the-line Intel graphics card. And (I think?) Dell is about the only major manufacturer still offering Linux at all.

Does it matter? Well, here's my guess. In 5 years, no one will really own a PC -- I mean the non-geeks (you'll have to pry my HP laptop out of my cold, dead hands, brotha!). Everyone will have netbooks for their day-to-day stuff, and PS3s/Wiis for their games and movies. In some ways, it's a step backward, really, since these will all use proprietary operating systems. That said, in many ways, proprietary OS's make sense. In a very theoretical sense, the one-size-fits-all PC is really a bit much. Windows has gotten pretty bloated as they've tried to put things in there that no one actually uses. Linux has too. The model of a thin client with a hardware-tight proprietary OS makes a lot of sense, especially given the number of vendors now in the market.

The death of desktop Linux has been predicted a million times, and I hate to be the million-and-first. But I have to say that things are looking dimmer. Sure, there will always be some government office in Botswana or something that uses it. And sure, there will always be hardcore geeks who play World of Warcraft through Wine.
But I look for desktop Linux to take step back out of the mainstream. Not sure if that's a good thing or not. Maybe it's where it belongs. Going mainstream has its price, too.

--kevin

Tuesday, March 16, 2010

encrypted data store

For a while, I've been struggling with how to persist database connection strings. Some of them have user accounts and passwords, so I don't want them in plain text in the config file. I don't really want them hard-coded either. I have played with encryption mechanisms on the config files but can't find anything I like. Using Microsoft's default framework, either the encryption is tied to the user id -- so when you move from a dev account to a production one, you can't decrypt -- or it's tied to the machine, which has the same issue. If you manually encrypt, then you're just displacing the problem, since you now need a way to store the encryption key.

So, here's my solution. I've set up an encrypted database repository to store database connection strings. The connection string to this repository is in a plain-text config file, but since it's SQL Server, it can use integrated Windows auth to connect -- so, no passwords needed. Since it's an encrypted table, anyone looking to decrypt needs access to the database, the table, the cert, and the symmetric key. Without all those privs, you get nothing. If additional security is needed, you can add certificate enforcement on the connection itself.

Here’s how to set up the basics.

First, you'd need to create a table. I'll skip the full DDL. I used a "database name" column for the key. For the value, you'd want something like this:

ALTER TABLE DataConnectionValues
ADD ConnectionString varbinary(255);
GO

It should be a big enough column, of course. I picked 255, since it has to hold the encrypted string, which will be longer than the unencrypted one. And I picked binary since the encrypted value is ciphertext, not character data.

Ok. Next we'll need a database master key, if one doesn't exist:

IF NOT EXISTS
    (SELECT * FROM sys.symmetric_keys WHERE symmetric_key_id = 101)
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'someCleverPassword';
GO

The existence check isn't strictly needed on a fresh database, but I added it anyway.

Now a certificate:

CREATE CERTIFICATE DBConnection
WITH SUBJECT = 'Encryption for database connection strings';
GO

Next, a symmetric key using the cert

CREATE SYMMETRIC KEY DBConnection_01
WITH ALGORITHM = AES_256
ENCRYPTION BY CERTIFICATE DBConnection;
GO

I used AES 256, but SQL Server supports other encryption types.

That’s pretty much it for the set up.

Now you can insert a row into the table, then do an update to add the encrypted value.

To update a value, first open the key:

OPEN SYMMETRIC KEY DBConnection_01
DECRYPTION BY CERTIFICATE DBConnection;

Then do an update, using the EncryptByKey function (it takes the GUID of the key just opened above):

UPDATE DataConnectionValues
SET ConnectionString = EncryptByKey(Key_GUID('DBConnection_01'), N'someConnectionString');
GO

(N'someConnectionString' is a placeholder for the real connection string; note the cleartext has to be a character or binary type, not a number.)

The EncryptByKey will encrypt using the key, while the DecryptByKey function will decrypt, of course.

I’d recommend wrapping these in a set of functions or stored procedures so that the developers don’t have to mess with keys and certs. In addition, that means that the user accounts only need execute privs on the procedures, not select privs on the table.
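
As a sketch of what the read side might look like from .Net (the DatabaseName column and the helper itself are hypothetical, and the SQL is inline here for brevity -- in practice you'd put it in a stored procedure as just described):

using System;
using System.Data.SqlClient;

public static class ConnectionStringStore
{
    // Hypothetical helper: opens the key, decrypts one row, closes the key.
    public static string GetConnectionString(string repositoryConnection, string databaseName)
    {
        const string sql =
            "OPEN SYMMETRIC KEY DBConnection_01 DECRYPTION BY CERTIFICATE DBConnection; " +
            "SELECT CONVERT(NVARCHAR(255), DecryptByKey(ConnectionString)) " +
            "FROM DataConnectionValues WHERE DatabaseName = @name; " +
            "CLOSE SYMMETRIC KEY DBConnection_01;";

        using (SqlConnection conn = new SqlConnection(repositoryConnection))
        using (SqlCommand cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@name", databaseName);
            conn.Open();
            // Null if the row is missing; DBNull if the decrypt came back NULL
            // (a missing key or cert priv shows up as an error or a NULL result).
            object result = cmd.ExecuteScalar();
            return (result == null || result == DBNull.Value) ? null : (string)result;
        }
    }
}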

The coolish thing is that users get nothing if their accounts don't have the right privs.

To get a value back, you need connect and select privs on the table (or execute privs on the stored procedures), plus privs on the key and the cert.

You can add these cert and key privs by:

GRANT REFERENCES ON SYMMETRIC KEY::[DBConnection_01] TO [someUserId];
GRANT CONTROL ON CERTIFICATE::DBConnection TO [someUserId];

If you really want to get cool, you can add connection encryption to the initial connection string to get the database values. This is probably a good idea since you’ll be passing passwords across the wire. Doing that does two things: First, it encrypts the connection, second, it secures it via the certificate. Not only does this make it harder to hack the values, it adds an additional layer of authentication, since now, the user not only has to have the correct user credentials, but also has to have the correct certificate installed.

From the developer perspective, this is as simple as changing the initial connection string to something like:

Data Source=someServer;Initial Catalog=myEncryptedData;Integrated Security=SSPI;Encrypt=true

That last part sets the encryption to true, and secures the connection. To do this, you’ll need to set up a server certificate on the connecting server and also on the database server.

I’m out of space to discuss that here, but Google knows everything.


Next time, I'd like to blog about how to wrap this in a data connections library and use an object factory to really abstract away the complexity of connecting to databases.



--kevin

Friday, February 26, 2010

case sensitive dictionaries

This is just something I didn't know.
If I create a generic dictionary (a Dictionary<string, object>, for example), the keys are case sensitive. I'd spent hours trying to work around this. UNTIL I found out that the key comparison in Dictionaries is configurable.

new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);

solves the issue.
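
A quick before-and-after sketch:

using System;
using System.Collections.Generic;

class Demo
{
    static void Main()
    {
        // Default comparer: case sensitive, so "Key" and "key" are different keys.
        Dictionary<string, object> strict = new Dictionary<string, object>();
        strict["Key"] = 1;
        Console.WriteLine(strict.ContainsKey("key")); // False

        // OrdinalIgnoreCase comparer: lookups ignore case.
        Dictionary<string, object> relaxed =
            new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);
        relaxed["Key"] = 1;
        Console.WriteLine(relaxed.ContainsKey("key")); // True
    }
}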

Just something to put on the stack.

Tuesday, February 23, 2010

What does Microsoft have against OOA/D?

My guess is that it's because object-oriented development wasn't invented by Microsoft, but it sure seems like they have it out for OOA&D.
Of course, you can go back a few years and clearly see how they were dragged kicking and screaming into the OO world. Take VB4. I still have the box. It clearly says it's an "Object Oriented RAD" environment.
Forget for a minute that OO and RAD are two different (nearly mutually exclusive) things. RAD is designed to get stuff out the door quick. OO is designed to make what you build more supportable, even though it may take longer to build. But let's ignore that.
VB4, for all its "OO" claims, was to objects as Donald Trump is to good hair. There was no inheritance, no data protection. About the only thing even vaguely OO was polymorphism, and that was an accident due to the fact that VB supported late binding of variables.
It was no shock, then, that MS's web platform (ASP) relied on VBScript and JScript. JScript was no more OO than VB. Well, to be fair, it had the keyword "object", but you couldn't inherit objects or protect them. In fact, you could "superclass" them -- meaning you could actually change the API of the parent. They were also pretty loosely scoped, and properties defined in one were visible in another without even a direct object reference.

But that's all in the past, right?

Well, MS is better. C# is OO, no doubt. But you can tell that the gurus there still don't believe in this whole object junk.
For example, take Windows Workflow Foundation. If you create a new flow, the tool auto-magically creates a code-behind "designer" class for you. You can do all kinds of cool things with this -- you can create input and output properties and nifty methods that can actually change how the flow works at runtime. But you can't inherit it. It's a partial class (*choke*). (I once told my dev team that I'd slap the first one of them to use a partial class, but that's another story.)

So... hypothetical situation. Let's say you want to do something crazy, like, oh, I dunno, standardize the input and output elements in your flows, so that you can write a nice clean "workflow consumer" class that your junior devs can just use without having to mess with all the delegate and workflow runtime junk. So you think, "Aha! C# is OO. I'll create a parent class with the inputs and outputs, and make it a standard that all workflows subclass from it."
Nope. Notta. Can't do it. That would mean you'd be using OO. Not allowed.

And don't even try to inherit ASPX pages. Cuz, you know... you wouldn't want to do that, of course. Heaven forbid you'd want to pull out some common functionality. You can't really even inherit the code-behind, since it, too, is a partial class (*grumble*).

To this day, I'm still P.O.'ed at them over the "virtual" keyword. Why on earth would you ever want to make a method that subclasses *can't* override? OK, maybe I can see it in a few unusual cases. But as the default?

Seems to me every OO developer has created a class for some project. Six months later, they needed to extend it. They dutifully subclass, only to realize that they forgot to make the parent method virtual. Or they just didn't think of it. Now they have to open up the parent class and change it. Look, isn't that the whole *point* of using OO? So that you *don't* have to change the parent class?

For that matter... why the heck can't you override fields from parent classes? What were they thinking when they disallowed this: protected virtual string _x = ""; ? Oh, OK, I get it, you can do it with a property, but... wha? Why create a property for one stupid string, just so I can override it?

As a side note, the whole property thing confuses me. OK, I mean, I get it. Properties replace the getters and setters in Java and, frankly, they're cool. But I defy anyone to show me the value of

public virtual string X
{
    get { return _x; }
    set { _x = value; }
}
But yet, this is the default pattern. So basically you're saying you need four lines of code to wrap a variable? Come on. And it doesn't even help maintenance, really. I mean, it takes 3 seconds to add a property later if I need one. It's an awesome idea, but they're killing us with its implementation.
As an example, take a look at workflows again. All inputs and outputs need to be properties. So you potentially end up with 200 {get...set...} combinations that... provide... which value again?

I think this is true because reflection works nicely with properties. It's like the MS developers had a new feature and they wanted to use it everywhere. But sheesh.
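
(To be fair, the auto-implemented properties added in C# 3.0 at least shrink the boilerplate to one line, though you're still writing a property where a field would have done:)

public class Node
{
    // C# 3.0 auto-implemented property: the compiler generates the hidden
    // backing field; it stays virtual/overridable and reflection-friendly.
    public virtual string X { get; set; }
}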

Unfortunately, I don't see it changing. MS is stepping away from improvements in the core language and moving to improvements in the tools. Unfortunately, the tools don't support OO either -- they're mostly RAD-based. WCF is cool, but try to inherit a WCF transaction sometime. And going forward, I'd expect more things like SSIS and SSRS, and fewer things like Spring.Net, which actually makes use of inheritance. I do think it's telling that Java developers came up with Spring, while Microsoft came up with F#.

I don't think Microsoft's developers lack any mental capacity. I think they're smart folks. But I just don't think they've had occasion to use OO. I think that's why none of their tools really support activity diagrams and sequence diagrams very well. (Well, Visio lets you do a few things with them, but we all know it was an afterthought, and not very full-featured.)

On the other hand, Microsoft has been very kind in adding code snippets to Studio. After all, if you can't inherit, you're left with copy/paste, right?

Sunday, January 24, 2010

The Gap

Less techie-stuff today.
Here's a story. Many years ago a very smart young man went to work in IT. Naturally, he developed software in the language of the future (PL/1, with a bit of assembler thrown in). He worked with the computing gods, and aspired to become one.

But along the way, something happened. Maybe he just didn't have divine attributes, maybe he got bored or outsourced or decided he wanted a life. So he changed his aspirations. Rather than being one of the gods of computing Olympus, he would manage them.

He was good at management. He was hard-working and politically savvy. Over time, he worked his way up from Junior Assistant Lackey Manager to Senior Assistant Lackey. Pretty soon, he found himself in a Vice President's office.

But he had one problem. He had no clue what his employees did. He didn't remember much about computers, but he knew terms like "libraries" and "frames" and maybe even remembered a bit of JCL. His people were talking about EJBs and Ajax (wasn't that a cleaning product?) and SSIS and PCI and data warehouses.

Conceptually, he kind of got it, being a smart fellow. But the details were all a mixture of magic words and flowcharts. And he couldn't understand why IT had gotten so expensive or why it took so long to get things done. Why not just toss together some JCL and run the silly thing? Why did everything take 4 months?

Enter the vendors. They walk into his office with really spiffy demos. It's not a flow chart, it's something he can see. And it's pretty. The reports are all graphical and the demo shows how easy they are to use. And time to market? Oh, the vendor insists that they have a crack consulting team that can deliver on the fly. Why, as part of the demo, the vendor even builds a really cool report in minutes.

Our hero is hooked. He signs on and spends a zillion dollars.

Then he gets the product to his IT staff, and they tell him that they can already do the same work with other tools. Moreover, they insist that the complexity isn't producing the pretty report; it's aggregating the data from the 16 different heterogeneous data stores they have, manipulating it so that it actually matches, and then (this is the really hard part) making it mean something. It's easy to sum a table column. It's a lot harder to know what the data in the column means or what the rules are around summing it.

The problem is that there is a gap between the people who work in IT and the folks who manage it. The vendors and consultants make a ton of money trying to fill that gap. Each year, there's a bevy of products aimed at it. They each hold the promise of being able to close the gap -- and turn the IT systems from a collection of disparate pieces into a contiguous whole.

Only, in reality, the vendors don't really care much if they work. Oh, I'm not saying they are sloppy or bad people -- although some are, I'm sure. I'm just saying that, in the end, the vendors have very little incentive to really close the gap, since the gap is basically how they make their living. Without the gap, consulting services would have to shift to being little more than resource-augmentation providers, with no real competitive edge. Today, if a consultant can get certified in the latest hot product, they can bill themselves as an expert. If the gap closes, that will hold much less value.

The vendors are in the same position. If the gap closes, then the only way for them to make a living would be to be truly creative with their products or to build tools that actually provide value for the developers. Don't get me wrong, some companies do that. Microsoft has a bunch of them. Oracle does too, although they have a lot of "gap-closers" as well. Sun has a few, although they've been hamstrung by financial issues. But other companies would have to pony up and compete with the IBMs of the world.

And it's so much easier to simply pull together a good demo.


Friday, January 15, 2010

CORRECTION

My mistake.
Some days I just struggle to tie my own shoes.

In my last blog post, I talked about how to create an audit trigger. And I made a mistake that I found the hard way when I tried to run an update.

Here's the issue.
SET @OldValue = (SELECT B FROM deleted);
SET @NewValue = (SELECT B FROM inserted);

this is great if there's exactly one row in the trigger tables. That means if you're updating only one row at a time, everything's dandy. But if you do an update that impacts more than one row, the subquery returns multiple values and the trigger fails hard.

How do you get around it?
Basically, you open a cursor over the rows in the update tables, then step through the cursor one row at a time.

So in this case, first you build out a cursor:
DECLARE @A INT, @B NVARCHAR(100); -- types are examples; match your table

DECLARE cur_inserted CURSOR FOR
SELECT B, A FROM inserted;
OPEN cur_inserted;
FETCH NEXT FROM cur_inserted INTO @B, @A;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- log the old value corresponding to this cursor row
    -- (assuming A is a unique key)
    INSERT INTO MyAudit (Value, AuditType)
    SELECT B, 'OLD' FROM deleted WHERE A = @A;

    FETCH NEXT FROM cur_inserted INTO @B, @A;
END
CLOSE cur_inserted;
DEALLOCATE cur_inserted;

Basically, it does the same thing, but uses a cursor to walk through the modified rows, pulling the corresponding values from deleted for each cursor row. It pulls the "old" values into the log table while walking through the "new" ones. Of course, you can still add conditional statements as before. (For what it's worth, since INSERT...SELECT already handles multiple rows, you could also skip the cursor entirely and insert straight from deleted and inserted in two set-based statements.)
But this version will actually work for mass updates.
Sorry for the confusion.