Here is the set up, question is below.
CREATE TABLE Person (
PersonId int Identity(1,1) not null,
PersonName varchar(20) not null
PRIMARY KEY CLUSTERED (PersonId)
)
GO
CREATE TABLE Size (
SizeId int IDENTITY (1,1) not null,
Size varchar(20) null,
PersonId int NOT NULL
PRIMARY KEY CLUSTERED (SizeId)
)
ALTER TABLE dbo.Size
ADD CONSTRAINT FK_Size_Person FOREIGN KEY (PersonId) REFERENCES dbo.Person (PersonId)
GO
INSERT INTO Person (PersonName)
VALUES ('Bob'), ('Bill'), ('Dave'), ('Nathan'), ('Luke')
GO
INSERT INTO Size (Size, PersonId)
VALUES ('32', 1), ('2XL', 1), ('Large', 1), ('Small', 2), ('Medium', 3)
CREATE FUNCTION [dbo].[fn_Sizes]
(
@PersonId int
)
RETURNS varchar(150)
AS
BEGIN
DECLARE @sizes varchar(150)
SET @sizes = (SELECT s.Size + ', '
FROM dbo.Size s
WHERE s.PersonId = @PersonId
FOR XML PATH(''))
-- Return the sizes
RETURN ISNULL(LEFT(@sizes, LEN(@sizes)-1),'')
END
Fixing a view and one of the issues is a function that combines multiple rows of values into one column. I'm not sure if the problem is the DISTINCT keyword, or the function, or the use of FOR XML PATH, or all of it. I am optimizing a view that takes 15 mins to run, and the execution plan says 90% of the work is in the DISTINCT sort. So I am looking for a way to get rid of it.
The actual view query is really big, but an example of this section is:
SELECT DISTINCT p.PersonName, dbo.fn_Sizes(p.PersonId) as Size
FROM dbo.Person p
JOIN dbo.Size s ON p.PersonId = s.PersonId
The DISTINCT is in there to get rid of the extra Bob rows, because Bob has many sizes. I need to keep the multiple sizes on Bob's row, but I don't want multiple rows of Bob. Is there a more efficient way to do that other than using DISTINCT?
Second, word on the street is that functions are best avoided if you are going for speed. I don't see a way to use "FOR XML PATH" outside the function, because I can't get it to work on just one column. So I am looking for a way to produce the multiple values on one row.