One of the reasons I like Stackoverflow as a place for asking and answering programming questions is that I

  1. Almost always get what I am after
  2. Almost always learn something in the process (in answering and asking)

Case in point with a question I asked regarding hierarchy sorting in SQL Server. The question in full can be found on the adjacent link, but, in a nutshell I was looking at how I might sort a variable length hierarchy field in SQL Server 2000.

So, the field value could be 1. Or it could be 1.2.3.4. Or it could be 1.92.333.14.58.62.7.81231.

Given that I needed to sort by the last digit, then the second-last, etc, etc, AND that the length was variable, I found myself a little stumped as to an elegant solution. So I decided to call upon the generosity of the Stackoverflow community

First-up was a number of fixed length solutions, which while good, didn’t quite help. Then I got some help in SQL 2008, which was unfortunate, as I was wanting it in SQL 2000. And then user Vash popped up with an edit to his original post:

Solutions for 2000, 2005, 2008: Solutions to T-SQL Sorting Challenge here.

Marvellous! Here is the solution (modified sightly) that I adopted:

CREATE FUNCTION dbo.ufn_SortByHierarchy(@s as varchar(500), @delimiter as char(1))
  returns VARBINARY(2500)
AS
BEGIN
  DECLARE @pos as int, @r as varbinary(2500), @element as int;

  SET @r = 0x;
  SET @s = @s + @delimiter;
  SET @pos = charindex(@delimiter, @s);
  while @pos > 0
  BEGIN
	SET @element = cast(left(@s, @pos - 1) AS INT);

	SET @r = @r +
	  CASE
		WHEN @element = -2147483648 THEN 0x0000000000
		WHEN sign(@element) = -1
		  THEN 0x01 + cast(2147483647 + @element AS BINARY(4))
		ELSE 0x02 + cast(@element AS BINARY(4))
	  END;

	SET @s = stuff(@s, 1, @pos, '');
	SET @pos = charindex(@delimiter, @s);
  END

  RETURN @r;

END

And, as it turned out, I’d never heard of the varbinary data type, so right there was some extra reading and knowledge for me.

And that, I thought was it. But no! A short while later, user Martin Smith, popped up with a comment.

@James – Not sure what you ended up using. You can also use parsename for this as in this answer: stackoverflow.com/questions/3057532/

A comment! I’d have voted it up if it were an answer. So I followed the link, and voted that up.

So, I also learned about Parsename, which works in lowly SQL 2000 as well. unfortunately, it only works on a four-digit hierarchy, but still useful for me to know about.

So, lots of useful information, new things learned and a permanently documented solution to my problem. That is why I like Stackoverflow.

, , ,

Not in production environment, anyway!

Recently, I have been involved in implementing functionality that requires a modification to the structure of a SQL Server table. In fact, the nature of the change was a simple addition of a new column to the table in question.

Logic would dictate that that wouldn’t be too problematic. Right? Wrong!

The fact is, that when you use ‘*’ to reference all columns on a table, you’ll make it impossible to perform any reliable impact analysis on your code base. Columns will be used without being explicitly referenced.

If you add more columns to the table in future, who can guarantee that your existing views or stored procedures will actually need them? If you have code in them that performs a ‘Select *’ then they will, regardless. The obvious problem with this is the overhead of returning unwanted data from the database, and the potential degradation of performance inherent therein. However, there is another pitfall surrounding table joins, which I will now attempt to demonstrate:

Imagine you start out with two tables:

create table Test (Id int, code char(1))
create table Test2 (Id int, TestId int, name varchar(10))

And you would like a query that joins these two, so you create a view

create view vwJoinTests as
select t.code, t2.*
from test t
inner join test2 t2 on t.id = t2.testid

You can select from this, no problem

select code from vwJoinTests

A year later, another developer comes along and adds a new column to Test2:

alter table Test2 add code char(1)

Now, the column ‘Code’ exists on both tables.

Selecting from the view is still fine:

select code from vwJoinTests

And, because our original ‘select’ remains intact (it doesn’t break anything), this fault will not be picked up by testing.

But, if I want to reapply the view code, I get the following error:
Column names in each view or function must be unique. Column name ‘code’ in view or function ‘vwJoinTests’ is specified more than once.

Why would I want to reapply the view? Well:

  1. You may want to modify it
  2. You may want to rebuild a database
  3. You may want to install a brand new database on a client site

Technically, 2 and 3 are the same, but I thought I’d emphasise the potential for embarrasment when it comes to your clients.

It also emphasises the point that structural changes to the database, should be accompanied by a full database build test to ensure its integrity has been maintained.

,

Dynamic SQL is, by its nature, very flexible and often very useful. However, it is this flexibility that can be both a blessing and a curse.

The offical Oracle recommendations on Static vs Dynamic SQL state that, in practice, “static SQL will meet nearly all your programming needs.”Dynamic SQL

They go on to give a number of conditions under which they deem it acceptable for you to use dynamic SQL. They suggest it is used when one or more of the following items is unknown at precompile time:

  • text of the SQL statement (commands, clauses, and so on)
  • the number of host variables
  • the datatypes of host variables
  • references to database objects such as columns, indexes, sequences, tables, usernames, and views

Although this may be true, I would prefer to add the caveat “Only if you absolutely cannot find a way to do it in Static SQL”

They also state:

However, some dynamic queries require complex coding, the use of special data structures, and more runtime processing. While you might not notice the added processing time, you might find the coding difficult unless you fully understand dynamic SQL concepts and methods.

So rather than have a long and tedious discussion as to the relative merits, lets just list the advantages and disadvantages in a series of bullet points.

Advantages

  • It gives flexibility and scalability
  • It can reduce the number of lines of code written

Disadvantages

  • It can become very complex and difficult to read. Think about quotes embedded in quotes, and other such things.
  • It can have a detrimental effect on code stability. Some Dynamic SQL errors will not be known until run time. (An example of this is where you reference a non-existent table)
  • Dynamic SQL code is harder to test than the equivalent static SQL. It may also be impossible to test for every possible circumstance that your Dynamic SQL will encounter, thus introducing inherent risk.
  • It will be more difficult to conduct an effective impact analysis on Dynamic SQL in your code-base.
  • SQL injection and misuse – Dynamic SQL is more prone to misuse, and is invariably less safe than static SQL
  • The queries code within Dynamic SQL is not subject to a query plan, and as such optimisations may be missed. As such, it can be slower than the equivalent static SQL
  • As the SQL query is not known until runtime, it can be harder to performance-tune SQL Dynamic code (for example, determining the indexes that might be required on a table)

So there we are. Take that and pin in on a wall somewhere :-)

, , ,

Programmers like new toys that they can play with, be it a second monitor, a new IDE, or a new programming language. They also like to feel that they are learning new things and not getting assigned to the technological scrap-heap and so will inherently favour (for example) C#3 above VB6 even when the final solution might be suboptimal.C#

I found myself drifting into this mindset recently when prototyping a design for component to fit in with an existing framework of COM components.

The component needed to call out to SQL Server 2005 to start an Agent Job. My first instinct was that I had to interface with SQL 2005 Native Client components through the .NET libraries. I envisaged something something like this:

//...get the connectionString somehow...
string connectionString = "Data Source=SOMESERVER;Initial Catalog=SomeDatabase;Integrated Security=SSPI;Provider=SQLNCLI.1;";
OleDbConnection oleDbConnection = new OleDbConnection(connectionString);
ServerConnection serverConnection = new ServerConnection(oleDbConnection.DataSource);
Server oSqlServer = new Server(serverConnection);

JobServer oAgent = oSqlServer.JobServer;
Job oJob = oAgent.Jobs["Test"];
JobHistoryFilter oFilter = new JobHistoryFilter();
oFilter.JobName = "Test";
oJob.Start();

…or something similar therein.

The component would have to expose a COM Interface and be registered using REGASM, which would requrie manual intervension in the shipping process.

In addition to this, I would have neededed to organise for the SQL 2005 Client tools installed on four servers and undertake regression testing of the applications on these servers to inspire confidence that this hadn’t broken anything.

To be frank, it all seemed a little onerous and overcomplicated. Apart from simply wanting to use C#, the reason for this doing this vb6was based on an assumption that I needed the .NET library code to hook into SQL Server and that there was no other way. Really? Have you ever heard such nonsense?

Without realising, I’d very nearly falling into the Second System Effect trap (of sorts). All I really had to do was make a call out to the SQL Server sp_start_job to achieve the same effect, and all I needed was trusty old MDAC 2.8. This meant I could write it in VB6 to fit in with existing framework components with none of the impact or complication inherent with the C#-developed component.

So, I set about writing a little test application as proof of concept. It looked something like this:

Private Sub Command1_Click()
    Dim connection As New ADODB.Connection
    Dim connectionString As String

    connectionString = "Data Source=SOMESERVER;Initial Catalog=SomeDatabase;Integrated Security=SSPI;Provider=SQLOLEDB.1;"

    Call connection.Open(connectionString)

    connection.Execute ("EXEC msdb.dbo.sp_start_job @job_name=N'Test'")

    MsgBox ("Success")
End Sub

But I really wanted to use C#! However,  in this instance it just wasn’t feasible.

So, in conclusion, don’t get blinded by shiny new toys or by a desire to improve oneself, don’t simply dive into the first solution that you come across, and remember, complication is ruination!

, , ,

This was brought to my attention today, and made for interesting reading:

Oracle Database 11g vs. Microsoft SQL Server 2008Oracle 11g

The report claims in its ‘Executive Summary’ that Oracle 11g holds a substantial advantage over SQL Server, and alleges the following high level points:

  • Oracle is faster for DBAs to perform administrative functions
  • Oracle has fewer steps for standards RDBMS tasks
  • Increased Oracle DBA productivity due to lower complexity and higher efficiency
  • Oracle has architectural and functional capabilities for backup and recovery beyond those offered by SQL Server
  • Oracle demonstrates significant time savings with regard to performance diagnostics and tuning

Later in the Rationale section, however is the following statement:

Oracle has sponsored the studies in response to a challenge made by Edison to Oracle claims that Oracle Database was significantly easier and thus less costly to manage than the other major database management systems.

Which, of course, makes me suspicious. Regardless of Edison’s so-called ‘challenge’, the report is still sponsored by Oracle (and published on their web site), and I would have been surprised if this came back in SQL Server’s favour.

Predictably, Microsoft has published a response denouncing these claims. It alleges the following: SQL Server 2008

  • The report has unverifiable metrics and inconsistent interpretation of results
  • The report is not based on ‘real-world; data
  • SQL Licensing Costs are cheaper
  • Oracle databases consistently suffer from more security vulnerabilities
  • Security patching not reflected in true cost of operating Oracle 11g

I’ll not pass comment, but thought it interesting to share!

,

I’ve been checking whether or not service packs have been applied on SQL Server Installations. I keep forgetting the script that I run to check this,  and having to look it up.

So, here it is:

SELECT SERVERPROPERTY(‘productversion’), SERVERPROPERTY (‘productlevel’), SERVERPROPERTY (‘edition’)

This seems to work in SQL 2000, 2005 and 2008

Taken from this Microsoft link:

http://support.microsoft.com/kb/321185

You could also try

SELECT @@version

Which will give you everything (and more) in a single string:

Microsoft SQL Server 2005 - 9.00.3042.00 (Intel X86)   Feb  9 2007 22:47:07   Copyright (c) 1988-2005 Microsoft Corporation  Developer Edition on Windows NT 5.2 (Build 3790: Service Pack 2)

Go nuts! Here is a free EBook on Defensive SQL Server Programming from Red Gate

Defensive Database Programming with SQL Server

Alex Kuznetsov

ISBN: 978-1-906434-44-1

This leads us to another interesting article:

The Curse and Blessings of Dynamic SQL

, ,

I love it when I discover something like this…

I’m querying a table in SQL Server that has over 200 million rows. I’m just wanting the row count. So, naturally, I tried the following:

select count(*) from TABLE_NAME

The query took one minute when it was run initially. And select (1) performed no better. (See: SQL count(1) is not better than count(*))

To be fair, I was quite shocked that it took so long to get this value. Surely there was some value stored internally that I could just pluck out in the blink of an eye?

I searched around for this information, and stumbled across this: How can I get sysindexes information in Sql 2005?

So here it is. Takes no time at all to run:

select object_name(i.object_id) as objectName, sum(p.rows) as rowCnt, i.name as indexName
from sys.indexes i
join sys.partitions p on i.object_id = p.object_id
and i.index_id = p.index_id
where i.index_id <= 1
and i.object_id = object_id('TABLE_NAME')
group by i.object_id, i.index_id, i.name

And, if I need, I can strip out the second-last line to get it for all tables.

As it turns out, you can do the equivalent in SQL Server 2000:

select object_name(id), rowcnt
from sysindexes
where indid < 2
and id = object_id('TABLE_NAME')
order by 2

Which will run in 2005 as well, although its legacy, so we must discourage that! You can tell this, because SQL Management Studio doesn’t colour in sysindexs green, but SQL 2000 Query Analyzer does.

,