When I starting writing this blog, I was a little unsure as to what to put in it. I knew I wanted it to be mostly programming related, with articles about things that interested me. So I named it my ‘Programming Notebook’ and put into words exactly what I envisaged I’d be putting in it. This is what I put in my About page:

My main intention was for this blog was to be a notebook for myself in a presentable format – an aid to memory of things I’ve found useful on a day-to-day basis, and little snippets I’d like to share. A lot of it won’t be ground-breaking and you’ll almost certainly find it around elsewhere on the net, but it’s presented in a way that I learned it in the hope that others might find it useful.

And later on I started fretting that I was duplicating what already ‘out there’, which is nonsense, because almost everything that anyone writes is already ‘out there’ anyway.

None of this has anything to do with Identity Inserting in SQL Server, but maybe goes some way to explaining why I am writing about something that is well documented ‘out there’.

The reason is that I had to look it up twice. I needed to look it once, and then I forgot, so I needed to look it up again. What I needed was some sort of programming noteb…. hey… I have one!

So here it is:

SET IDENTITY_INSERT [MyTable] ON
INSERT INTO [MyTable] (IDField, Field2, Field3, Field4) values (1,212,2,2)
SET IDENTITY_INSERT [MyTable] OFF

You’ll need to set IDENTITY_INSERT ‘ON’ for the table into which you are inserting, and (for good measure) set it back to ‘OFF’ when you’re done.

You’ll also need to include a field list (which is good practice anyway). The following just won’t work:

INSERT INTO [MyTable] values (1,212,2,2)

So there we are, SQL Server Identity Inserting covered….again.

,

One of the reasons I like Stackoverflow as a place for asking and answering programming questions is that I

  1. Almost always get what I am after
  2. Almost always learn something in the process (in answering and asking)

Case in point with a question I asked regarding hierarchy sorting in SQL Server. The question in full can be found on the adjacent link, but, in a nutshell I was looking at how I might sort a variable length hierarchy field in SQL Server 2000.

So, the field value could be 1. Or it could be 1.2.3.4. Or it could be 1.92.333.14.58.62.7.81231.

Given that I needed to sort by the last digit, then the second-last, etc, etc, AND that the length was variable, I found myself a little stumped as to an elegant solution. So I decided to call upon the generosity of the Stackoverflow community

First-up was a number of fixed length solutions, which while good, didn’t quite help. Then I got some help in SQL 2008, which was unfortunate, as I was wanting it in SQL 2000. And then user Vash popped up with an edit to his original post:

Solutions for 2000, 2005, 2008: Solutions to T-SQL Sorting Challenge here.

Marvellous! Here is the solution (modified sightly) that I adopted:

CREATE FUNCTION dbo.ufn_SortByHierarchy(@s as varchar(500), @delimiter as char(1))
  returns VARBINARY(2500)
AS
BEGIN
  DECLARE @pos as int, @r as varbinary(2500), @element as int;

  SET @r = 0x;
  SET @s = @s + @delimiter;
  SET @pos = charindex(@delimiter, @s);
  while @pos > 0
  BEGIN
	SET @element = cast(left(@s, @pos - 1) AS INT);

	SET @r = @r +
	  CASE
		WHEN @element = -2147483648 THEN 0x0000000000
		WHEN sign(@element) = -1
		  THEN 0x01 + cast(2147483647 + @element AS BINARY(4))
		ELSE 0x02 + cast(@element AS BINARY(4))
	  END;

	SET @s = stuff(@s, 1, @pos, '');
	SET @pos = charindex(@delimiter, @s);
  END

  RETURN @r;

END

And, as it turned out, I’d never heard of the varbinary data type, so right there was some extra reading and knowledge for me.

And that, I thought was it. But no! A short while later, user Martin Smith, popped up with a comment.

@James – Not sure what you ended up using. You can also use parsename for this as in this answer: stackoverflow.com/questions/3057532/

A comment! I’d have voted it up if it were an answer. So I followed the link, and voted that up.

So, I also learned about Parsename, which works in lowly SQL 2000 as well. unfortunately, it only works on a four-digit hierarchy, but still useful for me to know about.

So, lots of useful information, new things learned and a permanently documented solution to my problem. That is why I like Stackoverflow.

, , ,

Not in production environment, anyway!

Recently, I have been involved in implementing functionality that requires a modification to the structure of a SQL Server table. In fact, the nature of the change was a simple addition of a new column to the table in question.

Logic would dictate that that wouldn’t be too problematic. Right? Wrong!

The fact is, that when you use ‘*’ to reference all columns on a table, you’ll make it impossible to perform any reliable impact analysis on your code base. Columns will be used without being explicitly referenced.

If you add more columns to the table in future, who can guarantee that your existing views or stored procedures will actually need them? If you have code in them that performs a ‘Select *’ then they will, regardless. The obvious problem with this is the overhead of returning unwanted data from the database, and the potential degradation of performance inherent therein. However, there is another pitfall surrounding table joins, which I will now attempt to demonstrate:

Imagine you start out with two tables:

create table Test (Id int, code char(1))
create table Test2 (Id int, TestId int, name varchar(10))

And you would like a query that joins these two, so you create a view

create view vwJoinTests as
select t.code, t2.*
from test t
inner join test2 t2 on t.id = t2.testid

You can select from this, no problem

select code from vwJoinTests

A year later, another developer comes along and adds a new column to Test2:

alter table Test2 add code char(1)

Now, the column ‘Code’ exists on both tables.

Selecting from the view is still fine:

select code from vwJoinTests

And, because our original ‘select’ remains intact (it doesn’t break anything), this fault will not be picked up by testing.

But, if I want to reapply the view code, I get the following error:
Column names in each view or function must be unique. Column name ‘code’ in view or function ‘vwJoinTests’ is specified more than once.

Why would I want to reapply the view? Well:

  1. You may want to modify it
  2. You may want to rebuild a database
  3. You may want to install a brand new database on a client site

Technically, 2 and 3 are the same, but I thought I’d emphasise the potential for embarrasment when it comes to your clients.

It also emphasises the point that structural changes to the database, should be accompanied by a full database build test to ensure its integrity has been maintained.

,

Dynamic SQL is, by its nature, very flexible and often very useful. However, it is this flexibility that can be both a blessing and a curse.

The offical Oracle recommendations on Static vs Dynamic SQL state that, in practice, “static SQL will meet nearly all your programming needs.”Dynamic SQL

They go on to give a number of conditions under which they deem it acceptable for you to use dynamic SQL. They suggest it is used when one or more of the following items is unknown at precompile time:

  • text of the SQL statement (commands, clauses, and so on)
  • the number of host variables
  • the datatypes of host variables
  • references to database objects such as columns, indexes, sequences, tables, usernames, and views

Although this may be true, I would prefer to add the caveat “Only if you absolutely cannot find a way to do it in Static SQL”

They also state:

However, some dynamic queries require complex coding, the use of special data structures, and more runtime processing. While you might not notice the added processing time, you might find the coding difficult unless you fully understand dynamic SQL concepts and methods.

So rather than have a long and tedious discussion as to the relative merits, lets just list the advantages and disadvantages in a series of bullet points.

Advantages

  • It gives flexibility and scalability
  • It can reduce the number of lines of code written

Disadvantages

  • It can become very complex and difficult to read. Think about quotes embedded in quotes, and other such things.
  • It can have a detrimental effect on code stability. Some Dynamic SQL errors will not be known until run time. (An example of this is where you reference a non-existent table)
  • Dynamic SQL code is harder to test than the equivalent static SQL. It may also be impossible to test for every possible circumstance that your Dynamic SQL will encounter, thus introducing inherent risk.
  • It will be more difficult to conduct an effective impact analysis on Dynamic SQL in your code-base.
  • SQL injection and misuse – Dynamic SQL is more prone to misuse, and is invariably less safe than static SQL
  • The queries code within Dynamic SQL is not subject to a query plan, and as such optimisations may be missed. As such, it can be slower than the equivalent static SQL
  • As the SQL query is not known until runtime, it can be harder to performance-tune SQL Dynamic code (for example, determining the indexes that might be required on a table)

So there we are. Take that and pin in on a wall somewhere :-)

, , ,

Programmers like new toys that they can play with, be it a second monitor, a new IDE, or a new programming language. They also like to feel that they are learning new things and not getting assigned to the technological scrap-heap and so will inherently favour (for example) C#3 above VB6 even when the final solution might be suboptimal.C#

I found myself drifting into this mindset recently when prototyping a design for component to fit in with an existing framework of COM components.

The component needed to call out to SQL Server 2005 to start an Agent Job. My first instinct was that I had to interface with SQL 2005 Native Client components through the .NET libraries. I envisaged something something like this:

//...get the connectionString somehow...
string connectionString = "Data Source=SOMESERVER;Initial Catalog=SomeDatabase;Integrated Security=SSPI;Provider=SQLNCLI.1;";
OleDbConnection oleDbConnection = new OleDbConnection(connectionString);
ServerConnection serverConnection = new ServerConnection(oleDbConnection.DataSource);
Server oSqlServer = new Server(serverConnection);

JobServer oAgent = oSqlServer.JobServer;
Job oJob = oAgent.Jobs["Test"];
JobHistoryFilter oFilter = new JobHistoryFilter();
oFilter.JobName = "Test";
oJob.Start();

…or something similar therein.

The component would have to expose a COM Interface and be registered using REGASM, which would requrie manual intervension in the shipping process.

In addition to this, I would have neededed to organise for the SQL 2005 Client tools installed on four servers and undertake regression testing of the applications on these servers to inspire confidence that this hadn’t broken anything.

To be frank, it all seemed a little onerous and overcomplicated. Apart from simply wanting to use C#, the reason for this doing this vb6was based on an assumption that I needed the .NET library code to hook into SQL Server and that there was no other way. Really? Have you ever heard such nonsense?

Without realising, I’d very nearly falling into the Second System Effect trap (of sorts). All I really had to do was make a call out to the SQL Server sp_start_job to achieve the same effect, and all I needed was trusty old MDAC 2.8. This meant I could write it in VB6 to fit in with existing framework components with none of the impact or complication inherent with the C#-developed component.

So, I set about writing a little test application as proof of concept. It looked something like this:

Private Sub Command1_Click()
    Dim connection As New ADODB.Connection
    Dim connectionString As String

    connectionString = "Data Source=SOMESERVER;Initial Catalog=SomeDatabase;Integrated Security=SSPI;Provider=SQLOLEDB.1;"

    Call connection.Open(connectionString)

    connection.Execute ("EXEC msdb.dbo.sp_start_job @job_name=N'Test'")

    MsgBox ("Success")
End Sub

But I really wanted to use C#! However,  in this instance it just wasn’t feasible.

So, in conclusion, don’t get blinded by shiny new toys or by a desire to improve oneself, don’t simply dive into the first solution that you come across, and remember, complication is ruination!

, , ,

This was brought to my attention today, and made for interesting reading:

Oracle Database 11g vs. Microsoft SQL Server 2008Oracle 11g

The report claims in its ‘Executive Summary’ that Oracle 11g holds a substantial advantage over SQL Server, and alleges the following high level points:

  • Oracle is faster for DBAs to perform administrative functions
  • Oracle has fewer steps for standards RDBMS tasks
  • Increased Oracle DBA productivity due to lower complexity and higher efficiency
  • Oracle has architectural and functional capabilities for backup and recovery beyond those offered by SQL Server
  • Oracle demonstrates significant time savings with regard to performance diagnostics and tuning

Later in the Rationale section, however is the following statement:

Oracle has sponsored the studies in response to a challenge made by Edison to Oracle claims that Oracle Database was significantly easier and thus less costly to manage than the other major database management systems.

Which, of course, makes me suspicious. Regardless of Edison’s so-called ‘challenge’, the report is still sponsored by Oracle (and published on their web site), and I would have been surprised if this came back in SQL Server’s favour.

Predictably, Microsoft has published a response denouncing these claims. It alleges the following: SQL Server 2008

  • The report has unverifiable metrics and inconsistent interpretation of results
  • The report is not based on ‘real-world; data
  • SQL Licensing Costs are cheaper
  • Oracle databases consistently suffer from more security vulnerabilities
  • Security patching not reflected in true cost of operating Oracle 11g

I’ll not pass comment, but thought it interesting to share!

,

I’ve been checking whether or not service packs have been applied on SQL Server Installations. I keep forgetting the script that I run to check this,  and having to look it up.

So, here it is:

SELECT SERVERPROPERTY(‘productversion’), SERVERPROPERTY (‘productlevel’), SERVERPROPERTY (‘edition’)

This seems to work in SQL 2000, 2005 and 2008

Taken from this Microsoft link:

http://support.microsoft.com/kb/321185

You could also try

SELECT @@version

Which will give you everything (and more) in a single string:

Microsoft SQL Server 2005 - 9.00.3042.00 (Intel X86)   Feb  9 2007 22:47:07   Copyright (c) 1988-2005 Microsoft Corporation  Developer Edition on Windows NT 5.2 (Build 3790: Service Pack 2)

Go nuts! Here is a free EBook on Defensive SQL Server Programming from Red Gate

Defensive Database Programming with SQL Server

Alex Kuznetsov

ISBN: 978-1-906434-44-1

This leads us to another interesting article:

The Curse and Blessings of Dynamic SQL

, ,

I love it when I discover something like this…

I’m querying a table in SQL Server that has over 200 million rows. I’m just wanting the row count. So, naturally, I tried the following:

select count(*) from TABLE_NAME

The query took one minute when it was run initially. And select (1) performed no better. (See: SQL count(1) is not better than count(*))

To be fair, I was quite shocked that it took so long to get this value. Surely there was some value stored internally that I could just pluck out in the blink of an eye?

I searched around for this information, and stumbled across this: How can I get sysindexes information in Sql 2005?

So here it is. Takes no time at all to run:

select object_name(i.object_id) as objectName, sum(p.rows) as rowCnt, i.name as indexName
from sys.indexes i
join sys.partitions p on i.object_id = p.object_id
and i.index_id = p.index_id
where i.index_id <= 1
and i.object_id = object_id('TABLE_NAME')
group by i.object_id, i.index_id, i.name

And, if I need, I can strip out the second-last line to get it for all tables.

As it turns out, you can do the equivalent in SQL Server 2000:

select object_name(id), rowcnt
from sysindexes
where indid < 2
and id = object_id('TABLE_NAME')
order by 2

Which will run in 2005 as well, although its legacy, so we must discourage that! You can tell this, because SQL Management Studio doesn’t colour in sysindexs green, but SQL 2000 Query Analyzer does.

,

Learned something really useful today….

The Problem

Recently I encountered a problem whereby an early-morning database merge background process kept overrunning

A series of stored procedures where taking a series of records from some SQL Server tables, and then merging them across to related table, i.e. applying all records marked as ‘inserts’, ‘updates’ and ‘deletes’. In other words:

MERGEDATA_TABLE1 —Merge–> TABLE1
MERGEDATA_TABLE2 —Merge–> TABLE2

On some days, the stored procedures that were merging larger tables were starting to take significantly longer. Merging 200,000 records into a table 180,000,000 in size took 40 seconds (on a pretty high spec server). However, merging 400,000 (double) into the same table took 40 minutes – sixty times as long for twice the number of records.

The Cause

When the stored procedures were applied to the database, a query plan was generated for them. The optimisations for these plans were based on the content of the tables at that time.

In particular, each of the MERGEDATA_ tables would have been empty, so the optimisations and query plans would have been completely unsuitable.

Static query plans are more suited to stored procedures that are run frequently, and that query tables that do not have vast fluctuations in size.

The Solution

Placing WITH RECOMPILE at the top of each stored procedure to force a recompile whenever it was executed seemed fixed the problem.

This was fine as we were only going to call the procedures once a day. If we were going to call them once a second then we may have had to rethink.

Code

CREATE PROCEDURE Table1_Merge

WITH RECOMPILE

AS

,

Have you ever found your PC running slowly after playing around with a particularly large SQL Database on you local machine? Well then, it’s possible that you have left the SQL Server ‘Maximum server memory’ setting at its default.

The default is 2,147,483,647 MB – Yes thats  bytes 2,251,799,812,636,672, or about 2 Petabytes.

SQL Server Configure Maximum Memory Usage

SQL Server Configure Maximum Memory Usage

I want a PC with that amount of memory!

To configure max memory usage for a particular instance of SQL Server 2005/2008 (and possibly later versions as and when they come :-) ), simply adjust this setting in the above dialog, which can be located by right-clicking your database instance in SQL Management Studio, and selecting Properties

Alternatively, you might sneer at the novice user interface shortcut and decide you want to script it:

EXEC sys.sp_configure N'show advanced options', N'1' RECONFIGURE WITH OVERRIDE
EXEC sys.sp_configure N'max server memory (MB)', N'300' RECONFIGURE WITH OVERRIDE
EXEC sys.sp_configure N'show advanced options', N'0' RECONFIGURE WITH OVERRIDE

The above code reconfigures the setting to 300 as shown in the second line.

,

The use of

    select count(1) from mytable

instead of

    select count(*) from mytable

is shrounded in rumour, myth and folklore.

Well, fortunately, Majita Lah at MI Lambda has taken the time to dispel this:
,