Thursday, December 24, 2009

audit tracking on Sql Server

Here's the challenge. You've got a database. In that database, you've got a table (say T). In that table, there are some columns -- say A, B, & C. If column B changes, you want to log that change somewhere.

You've got a couple issues. First, you don't care about columns A or C. If they change, you don't want to log the change. So how do you know when it's column B that changes?

Next, you only want to log the change if the value actually *changes*. So if the value of B is x and someone does "UPDATE T SET B='x'", you don't want to log that, since it's not really changing the value, but just setting it to what it already was.

In Sql Server, you can do this easily with a table trigger, but how you do it is a bit whacked.

First you need to create the trigger, of course. Then we'll need 2 variables -- one for the old value and one for the new:

CREATE TRIGGER somename
ON T
AFTER UPDATE
AS
BEGIN
DECLARE @OldValue integer
DECLARE @NewValue integer

Note that the trigger actually fires after the update. I did this because, in this case, the trigger need not be transactional -- I don't want to roll back the change just because I can't audit it. There may be cases where this isn't true, of course, but it will do for what I need.


Next, you'll need to use this: COLUMNS_UPDATED()
COLUMNS_UPDATED() shows you which columns were updated by the statement that fired the trigger, but the weird part is that it returns them as a bit mask. If you don't know what a bit mask is and don't want to learn, you can use a built-in function to help. The column I'm looking for (B) has a corresponding bit in the mask, and you can check whether that bit is set -- but the way bit masks work, it's not quite that simple. Changing columns A or C at the same time as B produces a different bit mask value, so it's not as simple as comparing COLUMNS_UPDATED() to a single constant.

Because it's been like a hundred years since I've had to bit shift, I'm going to suggest using Microsoft's built-in function to look for this value. It is:
sys.fn_IsBitSetInBitmask()

So, the syntax:
IF (sys.fn_IsBitSetInBitmask(COLUMNS_UPDATED(), 2) >0 )

will return true if the 2nd column (in this case "B") has been set, regardless of whether A or C have.

But what it doesn't tell you is whether the value actually *changed*. This IF statement will return true any time the column appears in the SET clause of an update, even if the new value is the same as the old one.

But it's pretty easy to check, although it takes a few extra lines:

SET @OldValue = (SELECT B FROM deleted);
SET @NewValue = (SELECT B FROM inserted);

"deleted" and "inserted" are in-memory tables that hold the values of the update before and after the update itself. "Inserted" holds the new values, "deleted" holds the old ones. Since it's in memory, it's pretty quick. In my case, updates to the table will only happen one row at a time. If you're talking about mass-updates, you need to be careful that performance won't suffer. On the other hand, it is an in-memory query, so the numbers have to get pretty big before that's an issue. If you're doing such bulk transactions, you probably have other issues anyway.

With this, the old and new values can just be compared:
IF (@OldValue <> @NewValue)

To insert them into the audit table, just pull the values from either "inserted", "deleted" or both, depending on what it is you're looking to audit.

INSERT INTO MyAudit (Value, AuditType)
SELECT B, 'OLD' FROM deleted;
or
INSERT INTO MyAudit (Value, AuditType)
SELECT B, 'NEW' FROM inserted;

for the old and new values.

That's pretty much it.
You'll want to add some error handling, and be careful with it. If the audit insert throws an error, it probably means you can't insert into a database table -- and if that's the case, you may not be able to insert into any table (out of space or whatever). So you may want to log the issue to the system event log
(EXEC XP_LOGEVENT @ErrorNumber, @ErrorMessage, @ErrorSeverity )

or RAISE the error.

What's cool about this is that it's database centric. If anyone modifies anything, you have it -- whether it's a service that does it, a scheduled task, or even an individual user with database privs. You can pull the user id that initiated the update (SYSTEM_USER, for instance), if that's helpful, but keep in mind that the scheduled tasks and services probably all share an account.

--kevin





Tuesday, November 17, 2009

the excel syndrome

What is the excel syndrome?
It's an executive manager who uses Excel and is convinced he "knows" computers. You see it a lot now. And it's one of the big things that's changed in the last few years. Ten years ago, if you said "e-commerce" everyone's eyes glazed over and they said "just make something work".

Today, all the execs use Excel and buy things from Amazon. So when an IT professional says : "setting up this database correctly and adding accounts with the right security and tables with the right schemas will take... er... 2 weeks", the exec says "WHAT!?? that's insane! Why... I can do it in Excel in 10 minutes". Then he (or sometimes she, but that's pretty rare), slashes the project estimate.

Oh, I know I'm exaggerating for emphasis. But the syndrome exists. It really does. My CEO just said today that we weren't delivering fast enough. Fast enough compared to ... what? How does he have a feel for how long it *should* take?

I started noticing this a few years ago. Non-IT people started to think they understood IT well enough to second guess their IT staff. I see it among IT people too -- especially those who have worked in small environments and with older, more limited systems. Sure, it may be possible to export a file from DBase from a live system with a simple real-time query. But.. um... DBase isn't relational? Oh, and there's that little thing called "security"? And the DBase database that only held 2MB of data is a slightly different creature than a SqlServer database that holds 200GB.

It is part of a trend in the industry. IT has become big business. Consulting companies and vendors are big business. As the money that goes into IT grows, more and more people are trying to get their little part of it. Naturally, that includes the execs and the bean counters.

I'm not sure where IT will go in the future, but its days of free-wheeling are done, I think. It will become more and more driven by people who have an opinion on how the money is being spent, even if they don't know what their opinion means. It's kind of like those Windows 7 commercials where people say they invented Windows 7 because they sent an email to Microsoft telling them to make computers better.

There are a lot of those people.....


Thursday, November 12, 2009

data access

There are lots of ways to do data access, but here's something I've been doing that seems kinda cool.

First, create a namespace (DataObjects, in my case).
Next, a BaseDataObject -- the parent that all the data accessors will inherit from.

In the BaseDataObject go the common things. For example, I have a public property called "DatabaseName". Using this, the code will query for the correct connection string parameters when needed.

The BaseDataObject also owns the actual connection. This is interesting. The Connection goes into a thread static collection of connections (which is hard to say quickly :D ). When the connection is needed, the collection is searched. If the connection isn't there, it gets created and inserted for the life of the thread. Took a while to nail this, but this means that the connections are correctly cached by database name for the life of the thread and no longer.

In order for this to work, the connection is actually of type IDbConnection. Anything else will bind the data object to one database type.
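
Here's a rough sketch of how that caching can look. To be clear, this is my simplified sketch, not the actual class -- the method names (GetConnection, CreateConnection) and the connection string handling are invented:

using System;
using System.Collections.Generic;
using System.Data;

public abstract class BaseDataObject
{
    // one collection per thread; connections live only for the life of the thread
    [ThreadStatic]
    private static Dictionary<string, IDbConnection> connections;

    public string DatabaseName { get; set; }

    protected IDbConnection GetConnection()
    {
        if (connections == null)
            connections = new Dictionary<string, IDbConnection>();

        IDbConnection conn;
        if (!connections.TryGetValue(DatabaseName, out conn))
        {
            conn = CreateConnection();        // subclasses pick the concrete type
            conn.Open();
            connections[DatabaseName] = conn; // cached by database name
        }
        return conn;
    }

    // e.g. return a SqlConnection or a MySqlConnection built from the
    // connection string that DatabaseName resolves to
    protected abstract IDbConnection CreateConnection();
}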

All of the methods of the BaseDataObject follow that pattern -- each uses IDbCommand, IDataReader and the like.

The simplest subclasses only override the connection itself. For example, I've got a MySqlDataObject that allows connection to a MySql database. It overrides the "connectToDatabase" method to secretly append a MySql designation to the DatabaseName. This means the connection string resolution returns a MySql-compatible connection string, not a SqlServer one. In addition, when it creates the database connection, it creates a "MySqlConnection", which gets implicitly cast upward to an IDbConnection on the return.

The more complex subclasses (the SqlServer one, for example) have some additional functionality. For example, I've added an ExecuteNonQueryWithParameters method to the BaseDataObject, which accepts a Dictionary of key-value parameters and then uses this to populate the Sql Statement. The BaseDataObject resolves all this by looping through the dictionary and ingloriously building the sql string. The SqlDataObject overrides this to use the Parameters.AddWithValue method on the SqlCommand, since this is available.
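
As a rough illustration, and assuming a base class along the lines of the sketch above (with ExecuteNonQueryWithParameters declared virtual and the connection string lookup stubbed out), the SQL Server subclass might look something like this:

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public class SqlDataObject : BaseDataObject
{
    protected override IDbConnection CreateConnection()
    {
        // stand-in for the real DatabaseName -> connection string lookup
        return new SqlConnection("Server=.;Database=" + DatabaseName + ";Integrated Security=true");
    }

    public override int ExecuteNonQueryWithParameters(string sql, Dictionary<string, object> parameters)
    {
        SqlCommand cmd = new SqlCommand(sql, (SqlConnection)GetConnection());
        foreach (KeyValuePair<string, object> kv in parameters)
        {
            // let the provider handle quoting and typing instead of
            // gluing values into the SQL string by hand
            cmd.Parameters.AddWithValue("@" + kv.Key, kv.Value);
        }
        return cmd.ExecuteNonQuery();
    }
}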

In addition, I've occasionally further subclassed the child data objects to allow versions of the objects that toss exceptions back to the caller and versions that don't, but log the exceptions and return a null result.

To wrap this all up, I added an object factory to the DataObjects namespace. Pass in the DatabaseName, and you get an appropriate subclass of BaseDataObject, based on whatever business rules are applied to decide this. The returned object is basically a live connection, ready to be used.
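
The factory itself can be tiny. This is just a sketch -- the "_mysql" naming rule is invented, and the real decision can be whatever business rule makes sense:

public static class DataObjectFactory
{
    public static BaseDataObject Create(string databaseName)
    {
        BaseDataObject dataObject;

        // invented rule: names ending in "_mysql" resolve to the MySql subclass
        if (databaseName.EndsWith("_mysql"))
            dataObject = new MySqlDataObject();
        else
            dataObject = new SqlDataObject();

        dataObject.DatabaseName = databaseName;
        return dataObject;
    }
}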

So far, this has worked pretty well. My big frustration is the lookup from database name to connection string. I have been using a Microsoft "Settings" object for that, but, frankly, it sucks and doesn't provide any real benefit. The good news is that, since this is hidden by the objects in the namespace, I can change it anytime without issue. Moreover, the structure allows me to reference databases as objects. Not only do I like that, but the polymorphism and use of object factories mean I can move data from one database to another with no changes to code.

Just thought it was cool enough to post.





Wednesday, November 11, 2009

today

just something to share that doesn't mean anything.
I spent about 3 hours today debugging a workflow. I was furious at Microsoft.
The issue was that the AutoResetEvent wasn't getting set, but timing out. So, after the flow was complete, the "oncomplete" delegate would fire, but then sit there and wait for the AutoResetEvent to timeout.

I checked the code. I checked the flow. I debugged. I searched the web. I was typing up a post on MSDN, when I thought I'd better double check my logic.

My flow was being called from a subclass of a util class that I had written. After some additional hair pulling, I realized that the subclass was hiding the AutoResetEvent with a variable of the same name. What was really stupid was that in C#, you have to acknowledge that you're hiding the variable by declaring it in the subclass with the "new" keyword:
private new AutoResetEvent waitHandle = new AutoResetEvent(false);

Geez. So I was *intentionally* hiding the variable that I needed, and then wondering why it wasn't there.

My official thought: "DOH!"
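
For what it's worth, the mistake boiled down to something like this (names invented, but the shape is the same):

using System.Threading;

public class WorkflowHostBase
{
    // the handle the base class waits on and expects to be Set() when the flow completes
    protected AutoResetEvent waitHandle = new AutoResetEvent(false);
}

public class MyWorkflowHost : WorkflowHostBase
{
    // "new" hides the base class field, so the Set() and the WaitOne() end up
    // talking to two different handles -- hence the timeout
    private new AutoResetEvent waitHandle = new AutoResetEvent(false);
}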

So here's my takeaway. It's getting easier to build complex functionality with the new tools. They do a ton for you. Awesome.
But they hide things. Nothing in my debugging would show me which variable was getting updated or how.

I'm not blaming Microsoft. Well.. ok... I am, simply because I need someone to blame.

But it's a challenge that developers face now. Devs have always had to use code we didn't control, but the degree to which it's needed is getting much higher. Whether we're using EJB or Windows Something Foundation, you just have no clue what the code is really doing. A simple mistake in the setup -- a hidden variable, a mistaken delegate, a copy/paste error in a config file -- can cost days of developer time to find. In the end, it could even be a bug in the toolset or a mistake in the docs.

It gives me pause for thought. I really wonder if this is the best way to approach things.

--kevin

Wednesday, November 4, 2009

Workflow and Exceptions

This guy says lots of cool, good, happy things about handling exceptions in Windows Workflow.

AND he says it with a cool accent!

The basics are pretty simple, but there are some weird gotchas. Like you have to have a different Exception Handler for each kind of exception you want to handle, not just different code for each.

There are a couple additional things to keep in mind:
First, Studio (2008, anyway) doesn't seem to recognize that you have a workflow exception handler in place. So when you run the code in debug mode, it will stop at the exception and tell you that it's unhandled. If you hit F5, it'll step into the handler. Seems like a disconnect between the development tool and Workflow Foundation.


There are a bunch of things you can do with the exceptions, of course. But my real desire is usually to trap them, log them, then return them gracefully to the calling process.


The second thing to be aware of is how to do this. There are lots of options, but if you create a public, read-only property on your workflow called, say, "Exceptions", defined as, say, a custom collection (or a generic one) to hold the exceptions, these will get auto-magically returned to the caller. However, they are cleverly hidden, so it's not obvious where they are.
Upon completion of the workflow, all the public properties get serialized and passed as a parameter to the delegates registered for the workflowRuntime.WorkflowCompleted, WorkflowTerminated, and WorkflowAborted events.

So somewhere in the code, there should be a something like this:
workflowRuntime.WorkflowCompleted += OnWorkflowCompleted;

OnWorkflowCompleted is the delegate defined by:
protected void OnWorkflowCompleted(object sender, WorkflowCompletedEventArgs e)
Typically, here is where you will set the wait handle:
waitHandle.Set();
where waitHandle is defined as :
protected AutoResetEvent waitHandle = new AutoResetEvent(false);

But the 2nd parameter (WorkflowCompletedEventArgs e) carries a serialized representation of the completed workflow, so all the public properties are available from it (via its OutputParameters collection).

So, in my case, I just have my exception handlers add the exception to the public collection owned by the flow. This is really cool for capturing all the exceptions in one place. For example, say I'm persisting some data elements. It could be that many of the data elements are bad. If I throw an unhandled exception for the first one, I never see the rest until the data gets scrubbed and reposted. Then I bomb on the 2nd one. But with this approach, I can just add the exceptions to the output collection and de-serialize in the calling code.
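
Here's the bare-bones shape of that, as a sketch (names are mine; it assumes the usual WF 3.x assemblies and the wait-handle setup shown above):

using System;
using System.Collections.Generic;
using System.Workflow.Activities;
using System.Workflow.Runtime;

public sealed partial class MyWorkflow : SequentialWorkflowActivity
{
    private List<Exception> exceptions = new List<Exception>();

    // public and read-only -- this is what comes back in OutputParameters;
    // the workflow's fault handlers just call Exceptions.Add(ex)
    public List<Exception> Exceptions
    {
        get { return exceptions; }
    }
}

// and in the host, pull it back out when the flow finishes:
protected void OnWorkflowCompleted(object sender, WorkflowCompletedEventArgs e)
{
    List<Exception> flowExceptions = (List<Exception>)e.OutputParameters["Exceptions"];

    foreach (Exception ex in flowExceptions)
        Console.WriteLine(ex.Message);   // or real logging

    // then release the host
    waitHandle.Set();
}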



Thursday, October 29, 2009

SQL Server post Insert Trigger

I mostly want to document how to do this so I don't forget.

Here's the problem space: you've got a table (call it kjh) with some element (a). Whenever a new row is inserted into kjh, you want to insert a row into another table (kjh2), with some of the data (let's say column a) being duplicated.

This can be useful for a lot of things. You could use this to audit the values in a table, for example. In my case, I needed to copy off some of the data, so that the data could be modified, but the original data would be left alone for reporting purposes.

There are 2 small issues. First, how do you get the data from the newly inserted row into the copy? Second, how do you trap and ignore the error that comes from trying to insert a duplicate row into the kjh2 table? This matters, in my case, because there are other things that can insert into kjh2, so there's a race condition. In the vast majority of cases, the trigger should insert cleanly. But it's possible that other processes could insert first. Since there are no foreign key constraints here -- and since the other processes may legitimately have the right to do the insert to kjh2 before kjh -- I run into the possibility of this error.

One of the things that perplexed me is that this seems like it would be such a common problem, but multiple web searches basically led nowhere. I finally found a kind soul on MSDN to help.

On to some code. First the tables, just to show I have nothing up my sleeve.

CREATE TABLE [dbo].[kjh](
[a] [int] NULL,
[b] [int] NULL
) ON [PRIMARY]

CREATE TABLE [dbo].[kjh2](
[c] [int] NOT NULL UNIQUE -- unique, so a duplicate insert actually raises error 2627
) ON [PRIMARY]


Now the trigger:

CREATE TRIGGER trinsert ON kjh
AFTER INSERT
AS
IF ((SELECT a FROM inserted) BETWEEN 2 AND 19)
BEGIN TRY
INSERT INTO kjh2(c)
SELECT a
FROM inserted;
END TRY
BEGIN CATCH
DECLARE @ErrorMessage NVARCHAR(4000);
DECLARE @ErrorSeverity INT;
DECLARE @ErrorState INT;
DECLARE @ErrorNumber INT;

SELECT @ErrorMessage = ERROR_MESSAGE(),
       @ErrorSeverity = ERROR_SEVERITY(),
       @ErrorState = ERROR_STATE(),
       @ErrorNumber = ERROR_NUMBER();

IF (@ErrorNumber <> 2627)
    RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState); -- re-raise anything that isn't a duplicate

END CATCH

Just a couple notes.
"Inserted" is a temp table that SQL automatically creates which contains all the freshly inserted rows.
The error number mentioned is the error for a unique constraint violation.

and... *poof*... i've pulled a trigger out of a hat.

Wednesday, October 28, 2009

Missing Technical Phrases

Each season seems to bring a slew of new buzz words. Synergy, paradigm, core competencies and others have all graced our ears.

I'd like to officially propose some new, more down-to-earth ones. Please feel free to leave a comment with your own additions. Perhaps, we can create a list to present to those on high at Google and truly establish some meaningful phrases.

Inverse Redundancy:
Rather than putting one application on multiple servers in order to add stability and scalability, this is the practice of putting every application and service on the same server(s) in an apparent effort to increase clutter and decrease stability.
It has a related term...

Inverse Staff Redundancy
This is the condition of having the same person assigned to multiple roles on many projects. Typically, a project team or manager will present an org chart with many layers and lots of boxes. But looking closely, you'll see that most of the boxes below the management layer have the same resource. This is also known as "Office Space Syndrome", in reference to the guy in the movie Office Space who had eight bosses.

Psychotic Optimism
This is the view that things are going well in spite of substantial evidence to the contrary. You've been on that project. I know I have. It's 4 days from delivery. You're 7 weeks behind schedule and have no resources actually working on it anyway. And your executive manager declares that the project is going well and will be delivered ahead of schedule.

Security By Annoyance
This is the practice of making security policies that serve no real purpose except to annoy the users, thereby creating the illusion of security. Technical examples include things like password restrictions that are so complex, MIT students can't understand them, sessions that conveniently expire right before work is committed, and locally-running software (hard drive encryption, virus checkers, etc.) that consume 60% of PC resources.
The primary value of this is three-fold. First and foremost, it safeguards the jobs of the security team, since their work is visible and seemingly important. Second, it drives off would-be hackers by making the common functionality too painful to bother hacking. And third, it decreases system load as common users simply give up and resort to pencil and paper.
It should be noted that this is not strictly an IT term, but can be seen in any airport security checkpoint.



Feel free to add your own.

Thursday, October 22, 2009

bitlocker

I figured that since today is the official release date of Windows7 (I think?), I'd add an entry about one of its features.

Windows 7 now has drive encryption built in, in a feature called BitLocker.

I've been using it for a short time, and I have to say, it's pretty slick. It's just about what you'd want it to be. I have my sd card encrypted and it requires a password to view it the first time, then, once opened, allows free access until the computer restarts or the card is removed and re-inserted.

In general, I haven't really found anything with Windows 7 I don't like. I've got occasional annoyances with Microsoft because they seem to like to move things. You look for something in a menu or control panel section and it's not there. But you find it in a new section -- same functionality, same look and feel, just randomly moved.

I think one of the things about Vista was that Microsoft wanted to make it more user-friendly to people who weren't used to computers. At least, that's what they told me at the presentation I went to on it. I think there are 2 funny things about that. First, what is their target market? The 2 people in the mountains of Colorado who haven't looked at a computer in 12 years? I mean, computers are so key to our culture now that schoolkids grow up on them. Why target the computer illiterate when there are so few left? (Perhaps their real target is in China or other places where PCs aren't so common?)

The other thing about this is that, while a few things have become easier, most haven't. In fact, (to the point) Windows 7 is more closely tied to Server 2008 than anything else. So, depending on what you do, you may end up having to mess with user roles and group policies. Personally, I think this is great. I love having it broken down like that. But computer newbies will get lost.

Still, 7 is an awesome product. It's fast, clean, visually impressive, feature rich -- it's the best OS Microsoft has ever marketed. The integrated virtual machine is good (not awesome, but good), the BitLocker feature is pretty cool, the integration with Vista apps is cool -- and it actually *runs* them without crashing! All in all, it rocks.


Thursday, October 15, 2009

why your systems suck

It's been a while since I posted, I know. The delay hasn’t been for lack of interest or distraction as much as it has been from the fact that I just haven’t been doing anything interesting. Most of my time lately has gone to fixing “little stuff.” You know, that old app that works most of the time but has an issue that wakes everyone up once a month. Or that emergency “gotta-have-it” report that no one will bother to look at. Or that special request that comes from the local potentate that is meaningless, but that everything stops for because the potentate wants it.

It’s given me time to think about the overall question though: why do computer systems suck so bad?

If you’re bothering to read this blog, I assume you have to be a geek. Otherwise, you’d not be wasting your time. But we all know that geeks like to waste their time, so you’re here. So, for a minute, pretend you’re not a geek. Let’s say you’re a high-powered manager in a decent organization. You were that guy who played second string on the football team in college and drank a bit too much the night before the early morning practice. Or you were that young woman whose social calendar was filled with must-do events and whose classes just kept getting in the way.

But you got your MBA and found you had a real aptitude for understanding and managing how companies actually make money.

Now, you find yourself going in each day and looking at a spreadsheet with your morning coffee. This spreadsheet gives you a snapshot of yesterday’s performance and today’s challenges. And it sucks. The tabs aren’t right. The numbers are usually off. The thing is often late. You’ve learned to rely on this and you need it to get the competitive edge you crave. But the thing can’t even reliably calculate something as simple as a sales conversion rate.

You call your IT director and scream into your phone in what becomes a daily ritual. And you just can’t understand why creating a simple spreadsheet should be so beyond your IT department.

Why is it? You’ve got a bunch of good people who all work too many hours and really want to make it right. You’ve got a – maybe not ideal but – reasonable budget. What’s wrong?

What IT groups usually miss is the big picture and the end goal. That business manager cares about his or her spreadsheet. They couldn’t care less that their IT department just worked a 14 hour day to install a new version of a J2EE engine.

What tends to happen in IT is that you get a developer. Let’s say that developer’s name is Marvin (you know… the guy who wears white socks with black shoes and reads comic books at lunch). And he’s pretty good. The code that he writes incorporates complex logic to pull data across a variety of distributed platforms, converts data types – because we all know that duplicated data isn’t really cloned, it’s mutated – and does these complex mathematical calculations that would make a physics professor blink. And it works pretty well – say about 90% of the time. Let’s say that 3 days each month (or about 3/30) it has some issue. It could be something small, like it tried to write data, but couldn’t get a lock for one of the rows. It could be an all-out crash that was really simple to fix by just restarting the app, but required a little manual intervention. Or it could be anything in between – maybe Marvin has a bug in one of his calculations, so that if some value is less than zero he gets the wrong results. But, all in all, it only has issues 3 days each month. Not a huge deal. Someone calls Marvin, he gets out of bed, logs in, fixes the issue and republishes the data.

But suppose the server admin (Alvin) has the same success rate – 9 in 10. About 3 days each month, the server runs out of memory or disk or drops off line without warning to auto-install new patches or something.

If the failure rate of Marvin’s process is 1 in 10 and the failure rate of Alvin’s server is 1 in 10, then the combined failure rate is already close to 20% (strictly, 1 - 0.9 x 0.9 = 19%). But here’s the kicker. The whole is greater than the sum of the parts, because a failure in the server can cause downstream consequences that may not be visible instantly. (This is one of the things I can’t get my infrastructure staff to see, by the way.) So as part of his process, Marvin, let’s say, writes some temp files. When he gets called because the server did an unplanned restart and he needs to re-run his process, he logs in, goes to a command prompt and runs the process. But he runs it with his credentials, not the credentials of the scheduler’s account, thereby causing the scheduled process to run into privilege issues on its next run. Now he’s just created a problem that was not part of either 10% failure rate. It’s a perfectly reasonable mistake to make. It doesn’t make Marvin a bad guy.

But the end result is that the total failure rate of the end process is now 10% for Marvin’s process + 10% for Alvin’s server + X% for mistakes made during clean up + Y% for other failures in the process caused by fallout from the server crash – missing temp files, lost session data, etc.

In the end, the failure rate of the whole process may be something like 25% or 30%. And that likely doesn’t include the failure of whatever systems Marvin is pulling the data from. If those systems also have a 10% failure rate, and their failure can cause downstream problems, this can add an additional 15% or more to the overall failure of the process.

So the reason computer systems suck is that failure rates compound with every component and dependency you add.

How do you fix it?

Naturally, you need to fix the individual issues. You need to get Marvin’s 10% failure rate down to 5% and then under 2%. But the real answer is that the systems need to be more loosely coupled and self-resilient. Any cross-system dependency needs to be identified and planned for. And someone needs to own – not the architecture of the individual pieces but – the architecture of the interaction between the pieces. This is probably the most overlooked piece of the puzzle.

Someone once said: “the more complex you make the drain, the easier it is to stop up the plumbing”.

Wednesday, September 2, 2009

just a little annoyance

One thing I don't get is why Microsoft creates these really spiffy OS's for servers, then configures them like they're PCs.
My latest little bitty annoying thing is that Windows 2008 Server, by default, hides hidden files and file extensions in Windows Explorer.
I mean.. come on. For a retired couple who use their PC to download pics of the grandkids, I totally get it.
But for a professional server OS? "Oh, you highly trained, professional Windows admin, you can't possibly be expected to know what an "exe" file is.. we'd better hide that from you."

Yeah, I know you can change that to see these things, but how stupid does MS think system admins are that they set it up this way in the first place? What an insult.


Saturday, August 22, 2009

Windows 7

I played with the Windows 7 beta a while ago and really liked it. But I picked up a new laptop since then, and didn't want to put the beta OS on my new laptop. Now that the RTM is out, I just upgraded my laptop, so I thought I'd blog about my experience for anyone who cares.

First the geek details: I'm running an HP quad core with 6 gigs of RAM and a 4650 ATI graphics card. It came pre-installed with Windows Vista Home Premium 64-bit. I'm upgrading to Windows 7 Ultimate 64-bit.

Next, the install was long(ish). Let's face it, my laptop is a whopper. But it still took me a couple hours.

The results: stellar. I've been playing with computers for 100 years or so. I've installed Linux more times than I can remember. I've put every major version of Windows on except "Bob". I've never, ever, EV-Er had an upgrade go as well. I did the "upgrade" rather than the clean install. It picked up my graphics card, my network, and amazingly all my settings, down to my login password and desktop wallpaper.
I gotta say, this part blew me away -- *it even kept my installed programs*! I rebooted to 7, my virus checker was running, Google Chrome had all its bookmarks, everything was just "there".

It did a pre-check before installing to identify which programs it would have issues with. It found one that it didn't know what to do with (an HP-specific keyboard filter that I really don't care about), and 3 or 4 that I'll need to re-install. It saved the report, so I can go back and look at it in case I don't remember.

Now... Windows... 7.... ROCKS.
It's *much* faster than Vista, doesn't crash as much, and has a bunch of bells and whistles that are actually useful -- like the built-in ability to run a virtual XP instance.

I've only upgraded today, so I may have a different opinion in a few days, but so far, I'm very impressed.
MS got it right, this time.

Sunday, August 16, 2009

a quickie Unix thing

For those who don't know, I've worked with Unix/Linux/Posix for .. like... longer than I'll admit. And I keep getting into Posix style projects.

This is a really quick thing, but by far the most useful thing about Posix to me is the way you can chain commands. Probably the coolest command chain that I regularly use, or at least the one I use most often, is:

find ./ -name "*.something" -exec egrep -il "somepattern" {} \;

Yeah, for those non-"ix"-ers that syntax is ugly. Ok, for those of us who have been there, it's ugly too.
But what it will do is look through every file named *.something for the pattern "somepattern", ignoring case, and provide a list of all the files that have a pattern match in them (that's what the -l is for).

I use it constantly. Of course there are lots of variations, but the ability to look through a complex directory tree of php files for anything with a certain pattern, when one is totally unfamiliar with the code, is wildly useful.

It's not difficult, but I've run into a few people who don't know how to do this and stumble over it using pipes and the like.

Hope it helps someone.

Wednesday, July 22, 2009

Creating Custom Attributes

I don't want to seem like all I ever do is bash Microsoft. Like everyone, they have their ups and downs.
Personally, I think C# and Visual Studio are the best toolset I've used in like a zillion years in IT.
Here's one small reason why. In C#, you can add attribution to object properties and methods.
Create a workflow, examine the properties, and you may see something like this:
[DesignerSerializationVisibilityAttribute(DesignerSerializationVisibility.Visible)]
or this for a WCF service:
[ServiceContract]
public interface SomeService
{
[OperationContract]


This attribution is basically metadata about the properties and methods. Microsoft uses these for a whole list of things. One of the most obvious uses is to provide meaningful information to visual editors and other 3rd-party tools. Unit testing tools, for example -- NUnit, MbUnit, and even Microsoft's own test framework -- use this metadata to identify the test fixtures and test methods, with attributes like
[TestFixture] and [TestMethod]

This is great. And it gives .Net a real professionalism over other great tools powered by Java or C++ or Php or other things.

What's even better is that the attribution is extensible, so you can create your own attributes. Better yet, doing that is easy.
Simply create a class that subclasses the "Attribute" class in .net's System namespace. For example, suppose I wanted a custom attribute to indicate which properties were to be persisted to a database. This is similar to the "Serializable" attribute, but suppose I wanted to roll my own.
All I would do is this:
[AttributeUsage(AttributeTargets.All)]
public class DBSerializable: Attribute

As long as the class is in a namespace to which the target code has a reference (i.e., a "using" directive), you can add
[DBSerializable]
to any property of the target class. Visual Studio's IntelliSense will pick it up and -- tada -- instant custom attribution.
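
For example, on some hypothetical business object:

public class Customer
{
    [DBSerializable]
    public string Name { get; set; }          // gets persisted

    [DBSerializable]
    public decimal CreditLimit { get; set; }   // gets persisted

    public string ScratchPad { get; set; }     // ignored by the serializer
}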

Ok, so I've got it... now what do I *do* with it?
Well, in this case, all I want to do is identify which properties have this attribute so that I can serialize them or not.
Since .Net's reflection pulls the metadata about the properties, it pulls the attributes too -- even the custom ones!
So this becomes possible:
public static PropertyInfo[] myCustomAttributeDemo(object o) // input type of "object"
{
    Type t = o.GetType();                                // get the type of the object
    PropertyInfo[] pi = t.GetProperties();               // get the properties of the object
    List<PropertyInfo> found = new List<PropertyInfo>(); // the ones carrying our attribute

    foreach (PropertyInfo p in pi)                       // for each property in the object
    {
        object[] q = p.GetCustomAttributes(true);        // get the attributes
        for (int ii = 0; ii < q.Length; ii++)            // for each attribute
        {
            if (q[ii].GetType() == typeof(Util.DBSerializable)) // check the type
            {
                // do some custom logic for serializable properties
                found.Add(p);
            }
        }
    }

    return found.ToArray();
}
This can be really handy for writing really generic, reusable code. For example this was part of a generic object serializer that will generate insert statements for specific databases based on any object passed in. Doing that really cuts down on the amount of (basically) duplicate code. And it just has the small assumption that the persistable properties use the custom attribute.
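
Just to give the flavor, a stripped-down version of that serializer idea might look something like this. It's only a sketch -- it assumes column names match property names, and real code should use command parameters instead of string concatenation:

using System;
using System.Collections.Generic;
using System.Reflection;

public static class SimpleSerializer
{
    public static string BuildInsert(object o, string tableName)
    {
        List<string> cols = new List<string>();
        List<string> vals = new List<string>();

        foreach (PropertyInfo p in o.GetType().GetProperties())
        {
            // only persist properties carrying the custom attribute
            if (p.GetCustomAttributes(typeof(Util.DBSerializable), true).Length == 0)
                continue;

            cols.Add(p.Name);
            vals.Add("'" + Convert.ToString(p.GetValue(o, null)).Replace("'", "''") + "'");
        }

        return "INSERT INTO " + tableName +
               " (" + string.Join(", ", cols.ToArray()) + ")" +
               " VALUES (" + string.Join(", ", vals.ToArray()) + ")";
    }
}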

Very cool.

Wednesday, July 15, 2009

another MS rant

Seems I'm on a Microsoft-bash roll.
I guess this is a trivial thing, but it drives me batty (which isn't as much of a drive as a putt, but that's another issue).

On Windows 2008 Server (... note the "server" part of that?) all application errors open up cute little pop-up boxes that say really helpful things like "program stopped working, click the button to stop it". And *they* *just* *sit* *there*.

remember the "server" part? Why on earth would anyone want a server poping up little useless error messages to a console that probably doesn't even have a monitor attached to it?
Do they really expect us to pay an operations person to sit on a stool in the server room and watch for the boxes to make sure they get cleared in a timely manner?
What's more, the pop-ups don't even have any options. It's not like the old famous "Abort, Retry, Ignore". It's just "abort".

On a PC? maybe. On a *server*? I think the guys in Redmond need more Starbucks.

And it gets better. How do you stop them from popping up each time? It's a registry setting. Not a check box on the pop up ("click here to prevent this meaningless, stupid action in the future"). No, not a control panel button. No, it's a registry setting that you need to add.

Isn't that splendid?

the MSDN thread is here:

but the correct key is:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\DontShowUI

Even better, this doesn't exist by default -- you have to add DontShowUI yourself as a new DWORD value and set it to 1. Mahvalous.

/end rant

Friday, July 3, 2009

WCF

Does anyone else think that WCF is not much more than a media campaign?
The product itself just seems half-baked.

For example, suppose you want to consume a wcf service.
You end up having to first point to the service to generate the reference to it. Then you have to load the service in a browser or something to get a command line that you'll need. Then you open up a command prompt and use said command line to generate 2 files that you'll need -- one is a class file and the other is a config file.
Then it's back to Visual Studio where you need to open these and add them to your project.

Then you run the service consumer and get "Could not find default endpoint element that references contract"

(or at least, that's what I get). To be fair you don't always get this message. But if you move the service around -- like put it on your local box to write it, then move it to a devel environment to test it, then move it to QA.... good luck. It turns out that in addition to all this stuff, Microsoft generates some hidden files. Now that really irks me. I mean, if they're generating hidden files, why can't they just generate the stupid files I have to go to the dos prompt to get?
These hidden files don't always get regenerated.
In addition, the reference to the service isn't always fully qualified, which can cause issues when you move it. On top of that, studio caches a lot of binary info that it doesn't remove when you change the location of the service.

I've gotten it all to work before. But I keep wondering why I'm bothering. I'm really not getting any benefit from this. I feel like MS just added WCF to put one more thing on its glossy slide show, but didn't actually finish the product.
Am I the only one who feels like that?



Wednesday, July 1, 2009

Windows WorkFun

Here's a lovely thing about windows workflow.
Suppose you want to pass in some input parameters.
That's great. And WF supports it! You create a generic Dictionary<string, object> like this:
like this:

Dictionary<string, object> ret = new Dictionary<string, object>();

Now the WF framework will dutifully walk through the Dictionary, and for each string key, it will use .Net's reflection to look for a public workflow property with the same name.

So if you create a workflow and, in the code of the class, you add:

public string thisIsAHack
{
    get { return "hack"; }
    set { someVar = value; }
}

then you add a Dictionary entry like this:
ret.Add("thisIsAHack", "someDummyValue");

and if you pass the Dictionary to the workflow, then ("*ding*") the workflow property will be populated.
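
For completeness, passing it in looks something like this (assuming a workflowRuntime that's already set up and a workflow type called SomeWorkflow):

Dictionary<string, object> ret = new Dictionary<string, object>();
ret.Add("thisIsAHack", "someDummyValue");

// the overload that takes the dictionary is what does the reflection matching
WorkflowInstance instance = workflowRuntime.CreateWorkflow(typeof(SomeWorkflow), ret);
instance.Start();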

Ok... on one level that sounds cool.

On another level, it makes me say "what were you guys *THINKING*?"

First of all, there is no validation at all. It's very loosely coupled. So in many ways, it's a throwback to the old JScript days when all variables were "var x =". There's no strong typing. So you have no clue until runtime whether or not things are going to actually... you know... work. The least they could have done was provide a tool to do easy type-checking on the inputs. Or an "is this ok" method we could call beforehand that would do a test and return true/false, so we're not stuck with a runtime exception.
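
If you want that kind of pre-flight check, you can roll a crude one yourself with the same reflection WF uses. This is just a sketch of the idea:

using System;
using System.Collections.Generic;
using System.Reflection;

public static class WorkflowInputChecker
{
    public static bool InputsLookSane(Type workflowType, Dictionary<string, object> inputs)
    {
        foreach (KeyValuePair<string, object> kv in inputs)
        {
            // GetProperty throws AmbiguousMatchException on a name collision --
            // the same "Ambiguous match found" surprise described below
            PropertyInfo p = workflowType.GetProperty(kv.Key);

            if (p == null || !p.CanWrite)
                return false;

            if (kv.Value != null && !p.PropertyType.IsAssignableFrom(kv.Value.GetType()))
                return false;
        }
        return true;
    }
}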

But, ok.

The second (minor, I'll admit) thing is that they have to be public properties. The public part I kind of understand (although it throws out data protection for things in the same namespace). But the *property* part annoys me.

I'll be honest. I love properties. I get it. But it drives me *UP A WALL* that the coding standard is:

public string SomeProp

{

get{ return _someProp;} set {_someProp = value; }

}

That code adds *ZERO* value and just clutters the source. I mean, please. Properties are great if you want to do some special logic with the variables. But come on. What value does it add to put a wrapper on top of a field when the wrapper just gets and sets the value anyway?

So far, it's not so terrible. But here's the awful part.

Ok... first of all, the workflow classes are "partial" classes. Partial classes are a sin. I hope Microsoft repents from this before God smites the world because of it. Or at least the guys who invented OOA/D smite them.

Honestly, I think Microsoft has never liked object-oriented development. All their products talk about it, but the products themselves tell a different story.

I remember when VB4 came out. I still have the box. Right on it, it says that it's an "object oriented RAD" tool. Forget the fact that OO and RAD are nearly polar opposites; VB4 was about as OO as ... well... C. Yeah, it had a keyword "object", but no inheritance or data protection, and it only had polymorphism because the data was not typed. I mean, please.

We've seen this a lot. Databound Grid controls. Need I say more? That's about as non-OO as you can get. Oh, and MVC? well, it's in the docs, but you have to work pretty hard to do it.

Alright. So partial classes are an abomination. They're an oxymoron to anything object-based. The whole point of a class is to put all the logic and values together. Partial classes are... well... not classes. They're pieces of things. At my last job, I told my team that if I ever ran across one in a code review I'd slap the developer.

(note to self: *remember to breathe*)

Now here's where it gets good. The partial class workflow is a partial to... what? Oh, you can't tell. MS locked that down. So, suppose you're walking along and you add a public (*grumble*) property to your part of the partial class. And suppose there's a name collision of some weird type with the other side of the partial class. Or... even better, suppose there's a name collision with something in the hierarchy tree that... oh yeah... that's right, you can't see.

When you dutifully add the item to the input dictionary

r.add("Site", "somesite");

everything is great. But when you create the workflow, you'll get a runtime error with this very helpful exception message :

"Ambiguous match found".

Microsoft very cleverly doesn't tell you *which* property caused the ambiguous match, by the way. It's similar to some of the SQL Server errors I've seen, where they tell you there's a data mismatch on an insert but don't bother to mention which column is actually mismatched.

Gee. thanks guys.

Day1

I thought I'd start this blog as a way of keeping track of all the goofy little technical issues that I encounter and (hopefully) resolve, so that I don't forget the resolution and so that others may share and benefit from my pain and run-on sentences...
Enjoy
--kevin