How should I test a field against an SQL variable which can be null?

406 Views Asked by At

I've got the following SQL :

CREATE TABLE tbFoo(
    a varchar(50) NULL,
) 


CREATE NONCLUSTERED INDEX IX_tbFoo_a ON tbFoo
(
    a ASC
)
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON)

insert into tbFoo select null
insert into tbFoo select 'test'

The two following queries work fine and use my index as expected :

select * from tbFoo where a='test'
select * from tbFoo where a is null

Now, let's pretend I want to store my comparison value in a variable, like this :

declare @a varchar(50)
select @a = NULL

The following query won't return the expected results if @a is null because I should use the "is" operator rather than "="

select * from tbFoo where a=@a 

The following will work but will do a table scan if @a is null (because of the 'test' row which forces the evaluation of the second parenthesis)

select * from tbFoo where (a is null and @a is null) or (a=@a)

Eventually, I've came up with this solution, which works fine and uses my index :

select * from tbFoo where (a is null and @a is null) or (@a is not null and a=@a)

Is my analysis of the situation correct?

Is there a better way to handle this situation ?

8

There are 8 best solutions below

1
richardtallent On

Nothing is ever "equal to" NULL... which is sort of the point of NULL.

Your solution will work fine. I'm surprised at how the query optimizer handles the shorter version. I would think that testing a for NULL before testing equality with a table scan would be a no-brainer.

0
Vincent Buck On

Eventually, I came up with this solution, which works fine and uses my index :

In SQL Server 2008, you can define a filtered index based on a predicate that excludes NULLs:

CREATE UNIQUE NONCLUSTERED INDEX IX_tbFoo_a 
ON tbFoo (a)
WHERE a IS NOT NULL;
0
Praesagus On

This what I do. It's very flexible. I am assuming that @a is an argument for the sproc. 'somethingweird' can be something that you will never see in your recordset '~~~' or whatever.

set @a = isnull(@a,'somethingweird')
select * from tbFoo where isnull(a,'somethingweird')=@a
0
SQLMenace On

another possibility is using setting ansi nulls to off

set ansi_nulls off

declare @a varchar(50)
select @a = NULL

select * from tbFoo where a=@a

set ansi_nulls on

Just keep in mind that you are breaking away from the default behavior here

0
MatBailie On

I don't have an instance at home to play with, but I can see the table scan getting very annoying. A possible alternative is to use UNION in place of the OR operator...

select * from tbFoo where (a is null and @a is null)
UNION ALL
select * from tbFoo where (a=@a and @a is not null)

(I'm not sure exactly what effect the "@a is not null" will have on performance, but my gut feel would be to include it. It's a constant expression that should allow the optimiser to know when the whole condition always fails. My technique is always to play and see what works best.)

I find this UNION trick to have two properties:
- It can significantly improve performance by simplfying queries
- It balloons code with multiple joins and cause major maintenance headaches

But then, life is jus a balancing act :)

1
Jon Erickson On

Just ISNULL both sides like so...

DECLARE @random VARCHAR(50)
SELECT  @random = 'text that never appears in your table'

SELECT * FROM @tbFoo WHERE ISNULL(a, @random) = ISNULL(@a, @random)
0
StriplingWarrior On

Perhaps your database engine optimizes what you've got automatically, but it seems to me like the following would be more efficient:

if @a IS NULL
    select * from tbFoo where a is null
else
    select * from tbFoo where a = @a

My reasoning for this is that you would perform the if @a IS NULL condition only once, rather than checking it for every line in the database. Again, though, a quality database engine should be able to convert your code into the same kind of data plan as this.

0
Jonathan Leffler On

Your analysis is correct - and is why 3-valued logic makes life difficult.

The suggestion from @StriplingWarrior is good; it finesses the problem by executing different SQL depending on whether the variable is null or not. Where that is not possible, your long-winded solution which repeatedly uses the host variable is necessary.