www.codinghillbilly.com   kyle.baley.org  Subscribe / Contact
 
 
 
 
LATEST POSTS
Wednesday, February 15, 2006

Time to make this thing interactive. I have a moderately strong opinion on something and need to know the performance ramifications.

The topic is SQL. Sub-topic: inner joins. Very, VERY often, I will see SQL such as:

SELECT   *
FROM     Employee e,
         EmployeeDepartment ed,
         Department d,
         EmployeePosition ep,
         Position p
WHERE    e.EmployeeID = ed.EmployeeID
AND      ed.DepartmentID = d.DepartmentID
AND      e.EmployeeID = ep.EmployeeID
AND      ep.PositionID = p.PositionID
AND      e.EmployeeName = "Bobbie Sue Sister-Wife"

This is, of course, valid SQL. And I hate it. Here is how I would write it:

SELECT     *
FROM       Employee e
INNER JOIN EmployeeDepartment ed
ON         e.EmployeeID = ed.EmployeeID
INNER JOIN Department d
ON         ed.DepartmentID = d.DepartmentID
INNER JOIN EmployeePosition ep
ON         e.EmployeeID = ep.EmployeeID
INNER JOIN Position p
ON         ep.PositionID = p.PositionID
WHERE      e.EmployeeName = "Bobbie Sue Sister-Wife"

Both will return the same results but I like the second much better. The reason is thus: the filter is blatantly obvious. In most cases, I know the inner joins will be for stupid little tables that were included for the sake of normalcy and I can ignore them. I don't want to sift through all that crap in my where clause just to find out what data I really want. This way, I can quickly see where my data is coming from (i.e. the inner joins) and how it is filtered (i.e. the where clause). It's not all munged together for the sake of saving a few keystrokes.

This is especially true for complicated procedures and ones that involve outer joins. I never remember what direction += goes and it's not an easy thing to search for. Enter += into Google (or better yet, Google Groups) and see for yourself.

Here's where the interactive part comes in. Which one performs better? Note that in most cases, I'll go with the second one anyway because I don't think the performance is that much different as to make a noticeable difference either way. And I do prefer maintainable code over more-performant-than-anyone-will-notice code. But if I'm ever doing work on Amazon's database, I'd like to justify my astronomical rate.

So don't be shy. I'll be judging answers based on clarity of thought, support for my position, and potential clients.

Monday, February 13, 2006

The Coding Hillbilly has been through some trials and tribulations of late in the SharePoint world and wishes to turn his misfortune into fortune by increasing his page hits for potential profit. (Oh, don't look so shocked. I gots childun to feed.)

First up is the SharePoint development environment. As in, how should you set one up? That one's easy. Bil Simser's already described my ideal way. You set up a virtual machine on your computer with everything you need and have at 'er. But individual virtual machines are not always an option so if you're up for reading about second best in my world, let's chat.

The basic idea is that there is a central SharePoint server with several portals on it, preferably one per developer so that they aren't stepping on each others' toes. You can use the same host header for each portal but change the port numbers. This has caused some issues for when I needed to run the SharePoint Restore utility on it but it's usually easier than trying to rip a new host header out of the Grand Poobah of the DNS in your organization.

You create the portals like you would any other but I'll suggest not including developer names in the names of the portals. Developers come and go and you never know if Billy Bob isn't quite stable enough mentally to handle working on a portal called Bobby Jo's Moonshine Portal after Bobby Jo gets arrested for shirking his child support duties. I generally use the same name and add the port number to the end.

Once you have created your portals, assign a developer to each one and they can go about their business pretty much in the manner described in almost every book or article I've read about developing on SharePoint. I do have one suggested deviation, though.

Usually, the documentation says you should configure Visual Studio to compile the project to the server where the assemblies reside. I'm not a fan of this because I usually have references to at least half a dozen assemblies that never change and I'm sick of waiting the extra 2.6 seconds for them to copy over when I do a full re-build. So instead, I leave it set to compile to Debug\bin and set up post-build events on the projects that actually do change. The events are pretty simple: copy MyWebPart.dll Z:\bin (assuming, of course, Z: is mapped to your particular portal's location). One advantage to this is that each developer can map his or her Z: drive to a separate portal and the build event doesn't need to change.

Another suggestion: you'll probably want a separate portal to act as the main holding cell for the latest version of the entire project. It's essentially a build server.

There is a potential disadvantage to this if SQL Server is not on the same server: Going back to square one isn't as easy as restoring a virtual machine's snapshot. Theoretically, you could take an image of the server and a backup of the config database and all the portal databases and restore them all if the worst happens. You could also theoretically restore only the three databases associated with a single portal if only one portal has failed. You could also theoretically go backwards in time to the point where you screwed up your portal which, from my own experience with SharePoint, would have a higher probability of success.

So a lot of jawin' just to say "create one portal for each developer" but it's a setup to a post hopefully in the near future where I talk about how to copy the contents of one portal to another on the same server. Hint: I've seen more automation in cave drawings.

Tuesday, February 07, 2006

Some of you may be assuming from my lack of prolificacy on this blog that I am a lazy programmer. After all, if I'm to be taken seriously, I should put some effort into unleashing my vast hillbilly coding knowledge on the world.

Well, if you assumed I am a lazy programmer, you would be right. And I'll tell you why that's a good thing. Better yet, I'll encourage everyone reading this to be just as slack and shiftless as I am. So gather around, fellow idlers, and learn how you, too, can be a better developer by being lazy.

The main tenet of this new religion is thus: Somebody's already done it. That's right, hotshot. The problem you've been banging your head against a wall about the last three days? It's been fixed already. Three times over. In six languages. All you have to do is find it.

As a lazy programmer, you should be thinking not, "How can I solve this problem?" but "Where can I find someone else's solution to this problem?" And with sites like Google Groups, there is no excuse for not being able to find it. You need to do custom paging on a DataGrid? Find out how to query a CMS SDO package? Build a web part to index your porn collection? Trust me, it's been done. And better than you could have done it yourself, kiddo.

Now it's not always easy to find these solutions. No one ever said being lazy would be easy. The skills you will need to hone are how to make judicious use of quotes in Google and how to break down your issue into problems that already have solutions. Luckily, these are generally skills that can be learned. My favorite method is by hovering over someone smarter than I am while he or she solves a problem (usually one that I've given them).

The good part is that when you learn these skills, the amount of code you have to actually write is reduced drastically. Yesterday, I mentioned at work that I hadn't written a single line of code in three years. They all had a hearty laugh and said, "Oh, Coding Hillbilly, you're such a card." They'll come around...

I'll close with some some Windows configurations that have made me more productive at being lazy. Firstly, right-click your task bar and select Toolbars | Address. Expand it a bit and voila. You never need to click on the IE/Firefox icon again. For IE, you may want to go into Internet Options | Advanced and deselect the option to reuse windows, too.

Secondly, download TweakUI. It's got more features than any lazy programmer worth his salt would ever want to go through but the one I want to mention is Internet Explorer | Search. In that section of the powertoy, you can assign key words to websites you commonly search. I'll let you work through the details but the end result is, when I want to find, say, a long-lost relative, I type the following into the address bar on my task bar: google "coding hillbilly" "sister-wife" and up comes Google with that search already entered. I have similar keywords set up for Google Groups, IMDb, and MSDN (which I use Google for because frankly, it does a better job of searching the MSDN site than Microsoft does).

I could give you more details on how to set it up but I'm tired of talking to you people. Ping me if you're interested.

Coding Hillbilly Out!

Tuesday, February 07, 2006

Just had an interesting discussion with a couple of people that merits repeating here, if only because more people agreed with me than not. Which means less chance of me having to come up with credible counter-arguments for people who comment.

SQL functions or .NET expression columns. That was the topic under debate. The specific example I gave was thus: The database contains a field called fileSize which is an integer and represents the number of bytes for a file. On a web page, we want to display the size in Mb or Kb, which ever makes more sense depending on the size of the file.

So in our stored procedure, do we return fileSize as-is or do we return convertToBeautifulShinyString( fileSize ) as fileSize? The answer, clearly, is the former and I'll tell you why:

SQL should be stupid. I mean really stupid. "Walk through a screen door" stupid. "Let's start a war we can't finish" stupid. Well, maybe not THAT stupid...

Ask your average developer to name four SQL keywords and you will invariably hear: SELECT, INSERT, UPDATE, DELETE. That, to most developers, is what SQL does. It doesn't do CAST or DECODE or CASE or CONVERT. It gets data, adds data, updates data, and deletes data.

There are advantages too numerous to mention. I love saying that because you can sound correct without having to back up your claims. But I'll name a few anyway.

Firstly, debugging SQL sucks. When was the last time you right-clicked on convertToBeautifulShinyString and selected "Go to definition"? (To be fair, you might be able to do it in Toad but I have only one monitor so I usually can't see the entire right-click menu.) The error messages are often cryptic and you get even less help when you get results that aren't the ones you want.

Secondly, what if I want to sort on fileSize? Standard argument theory says I should pad this point out a little but I think it's obvious so e-mail me if you need further explanation.

Thirdly, what the hell is the database doing formatting data for display anyway? How does it know how the data is being displayed? How does it know what language to display the data in? Why does it care?

I'll borrow a phrase from one of the people that sided with me during the debate: It just feels wrong. It raises my finely-honed "hacker" alert.

So we return the fileSize and let the code butcher it any way it wants. Before .NET, this usually meant code that also raised my "hacker" alert because I can't stand looping through a recordset just to manipulate data. Now, though, it can be done with a single line of code:

myDataSet.Tables[0].Columns.Add( "displayFileSize" ).Expression = "IIF ( fileSize > 1000000....oh figure it out yourself, ya lazy bastard )";

To cover my ass, I will mention that this isn't an absolute rule. There are cases where I'll concede that using a SQL function or some simple SQL calculation is acceptable. An extended total for an invoice line, for example. Or even some simple business logic. But anything that involves string concatenation should make you as nervous as your average Canadian voter.

Tuesday, February 07, 2006

The article is coming. Keep your pants on. The Coding Hillbilly wishes to opine on performance first.

There are a lot of articles and essays on performance. Judging from my inbox, my own performance could use some work but I prefer less humbling performance problems so I'm sticking to coding.

My overall opinion is that there is an over-emphasis on performance. For example, the Database class in the Microsoft's Enterprise Library Data application block has a method called GetStoredProcCommandWrapperWithSourceColumns. (NOTE: This is a new method in the June 2005 update). The method allows you to create a stored procedure command wrapper without having to explicitly create the parameters. Instead, you specify a string array of column names from the DataTable you plan to update.

Here's an example of how it is used:


 Database database = DatabaseFactory.CreateDatabase( );

 DBCommandWrapper insertCommand =
  database.GetStoredProcCommandWrapperWithSourceColumns(
   "add_suspected_pappy", new string[]
   { "FirstName", "LastName", "SuspectedOccupation",
   "AlternateRelationshipToMother" } );

 database.UpdateDataSet( myDataSet,
  "Pappies", insertCommand, null, null, UpdateBehavior.Standard );

In this example, we are updating a database with the information stored in a DataTable (called Pappies) in a DataSet. For those of you familiar with the UpdateDataSet method, we are assuming that the table contains only inserted or unmodified rows. For the rest of you, I'll cover this in a later article. For now, assume that the UpdateDataSet call will loop through each row in the table and execute the add_suspected_pappy stored procedure on that row if it has a RowState of Added.

There is a lot going on in this code behind the scenes but the relevant point is that there are at least two calls to the database: one to retrieve information about the parameters of the add_suspected_pappy stored procedure, and one to execute the stored procedure.

It has been argued many times to me that this is inefficient. That we should create the parameters on the command wrapper manually. We know what the parameters are and we could save valuable time doing this work ourselves. Well, I say NAY!

The part I disagree with is "valuable time", and in particular "valuable". It is true that we will save time database access time writing this code ourselves. I argue that the time we save is negligible. Database access is pretty fast these days, especially since the connections are usually cached.

In this case, it doesn't matter much either way. There are only four parameters and it wouldn't be much of a stretch to add insertCommand.AddInParameter calls for them. And even with two dozen parameters, you could cut and paste your merry way through them in no time, although it does add the possibility of human error in.

The point is that the time you save in this case is probably not enough to justify the extra code. There could be other justifications for writing it out longhand but on performance alone, I don't buy it.

Put another way, if the entire user action that uses this code has a noticeable delay, the problem will not be with GetStoredProcCommandWrapperWithSourceColumns.

Nitpicking on performance issues like this harkens back to the old days when storing dates in two numbers made sense. Nowadays, there are other sources for performance problems: network connections, whether you are retrieving too much data for what you need, integration with other applications. In most cases, these are the things that take time, not whether you should convert a Bubble Sort to Quick Sort. The problem is that these aren't very interesting issues and they aren't ones you learn about in Data Structures 101.

And I don't want to suggest that you should blatantly flaunt obvious performance flaws. But in the heat of a deadline, don't over-exert yourself flipping through algorithm texts for the "optimal" solution. Chances are, your server is fast enough that you'll never know the difference.

So how do you know when to use something like GetStoredProcCommandWrapperWithSourceColums and when to create the parameters yourself? That's the subject for another op-ed piece.

Until then, the Coding Hillbilly reminds you not to drink and code.

Tuesday, February 07, 2006
A formal introduction to The Coding Hillbilly. Consider this the technical arm of The Hillbaley Ho Down and Extravaganza. In these virtual walls, we'll leave the yellow journalism of Dirty Sanchez behind and explore topics in software development, .NET, and other imponderables. You'll agree it's a natural spin-off.
 
So join me, Ramone "Hot" Carl, former editorial reporter for The Ho Down, as we discuss the nature of homemade stills and peyote binges in terms of objects, XML, and other topics designed to maximize search engine hits. My experience is in .NET, CMS, and recently, SharePoint, and you will be surprised how conducive those topics are to applications involving backwater gin.
 
I will close with a promise that any comments containing the phrase "quick question" will be beaten.
 
Coming soon eventually: Using AJAX to search for potential speed-traps on common rum-running routes.

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

Copyright © 2010 Kyle Baley. All rights reserved.
 
CATEGORIES
.NET General (18) alt.net (4) altnetconf (9) ASP.NET AJAX (40) ASP.NET MVC (29) Bahamas (1) Bahanet (9) BDD (1) Brownfield (18) Career (9) Castle (1) Code coverage (1) Coding Style (6) Communication (1) Community (18) Conscientious Coding (34) Continuous Integration (11) dasBlog (12) Development (16) DevTeach (4) Domain (2) Environment (4) Estimating (1) Featured (14) Flamingo (10) Games (1) Google App Engine (2) GWT (5) Hardware (6) Java (1) Javascript (7) Linq (2) Livelink (6) Lucene.NET (2) MbUnit (1) Metrics (1) Miscellaneous (24) Mocking (4) NAnt (4) NHibernate (12) NInject (1) Office (3) Office Development (6) Open Rasta (1) Patterns (5) Presenting (13) Professional Development (15) Refactoring (10) ReSharper (11) REST (2) S#arp Architecture (5) Security (3) Software (11) Sundry (18) TDD (19) Tools (21) User Interface (5) Utilities (8) Visual Studio (8) VSTO (1) Web development (12) Windows (3) Working Remotely (16) Workplace (3) Writing (4)
 
LATEST POSTS
 
POPULAR POSTS
 
 
ARCHIVE