Fun with SQL Joins (for us Data Nerds)
Howdy all. I keep looking for inspiration for these articles in my every day job duties, and since I’ve been up to my eyeballs in SSRS reporting for the last several weeks, I’ve found something that might be a little interesting, and possibly even a bit fun, for the more technically inclined among you.
For those with no direct experience with these things, “SSRS” stands for SQL Server Reporting Services. It’s essentially the built-in reporting functionality for SQL Server. It’s generally installed with SQL Server itself, and can be used by anyone with the proper access and credentials for the database.
When creating SSRS reports, one of the keys to making your report work is the creation of your datasets, from which the report itself will pull and display the data you’re attempting to report. The SSRS report in question may only have one of these datasets, but in my experience at least, most of them have several.
Regardless of the number of datasets, however, each dataset in a given SSRS report is populated via query, and most queries depend on something called JOINS to link multiple tables in the database together in order to get data out that both makes sense as meets whatever criteria you’re actually trying to specify in the report.
For example, a very, very simple query with a join in it might looks something like this:
SELECT *
FROM Transactions
JOIN Users ON Transactions.Username = Users.Username
WHERE <meets some condition>
Translated into English, what that’s saying is: ”Show me all of the data from the Transactions and Users Tables, linked together on the Username field, where the data meets the specified condition”.
All fields in the Transactions and Users tables will be returned in the result set for all the records that meet the conditions.
This is a lot of data.
What I started running into in the SSRS reports recently was a need to speed up report generation by being a bit more selective in my SQL Joins. And interestingly enough, it turns out that you don’t actually have to join TABLES. Any data object will do just fine, as long as the syntax is correct, and as long as you reference it properly.
So rather than referencing tables, you can instead reference named-subqueries. A simple example of this would be to replace the Transactions and Users table references in the above query with named subqueries. So instead of “Transactions”, use a more targeted Transactions query and name that “TransactionData”, and then do the same for “Users”, named “UsersData”, like this:
SELECT *
FROM
(Select Username, TransactionNumber
FROM Transactions
Where DocumentType = 'Dissertation')
AS TransactionData
JOIN
(Select Username, Status, Department
FROM Users
WHERE Status = 'Faculty')
AS UsersData
ON TransactionData.Username = UsersData.Username
WHERE <meets some condition>
This new query is essentially the same, but the data being returned is limited to ONLY the fields we need to find rather than returning both entire tables. Less data means less time and effort for the report to manipulate and display everything.
And of course, once you realize that you can drop in any object for that JOIN in place of the table, you can get really complicated to get at that pesky data you need, like this:
SELECT DISTINCT
TransactionDataSet.Username AS [Users UserName]
,TransactionDataSet.Status
,TransactionDataSet.Department
,TransactionDataSet.Organization
,(CASE WHEN TransactionDataSet.AuthType = 'Aeon' THEN 'Local' ELSE TransactionDataSet.AuthType END) AS AuthType
,TransactionDataSet.State
,TransactionDataSet.Zip
,TransactionDataSet.Country
,ReadingRoomHistoryDataSet.Location
,ReadingRoomHistoryDataSet.ReadingRoom
FROM
(SELECT DISTINCT Users.Username, Users.[Status], Users.Department, Users.Organization, Users.AuthType, Users.[State], Users.Zip, Users.Country, Transactions.TransactionNumber
FROM
Transactions
INNER JOIN Users
ON Transactions.Username = Users.UserName
INNER JOIN Tracking
ON Tracking.TransactionNumber = Transactions.TransactionNumber
INNER JOIN Queues
ON Tracking.ChangedTo = Queues.ID
LEFT JOIN TransactionAccountLinks
ON Transactions.TransactionNumber = TransactionAccountLinks.TransactionNumber
LEFT JOIN Accounts
ON TransactionAccountLinks.AccountID = Accounts.ID
WHERE
Transactions.DocumentType = @DocumentType
AND Tracking.ChangedDate BETWEEN dbo.ufn_Aeon_ConvertLocalDateTimeToUTC (@BeginDate) AND dbo.ufn_Aeon_ConvertLocalDateTimeToUTC (DATEADD(day, 1, @EndDate))
) AS TransactionDataSet
INNER JOIN
(SELECT ReadingRoomHistory.Username, ReadingRoomHistory.ReadingRoom, ISNULL(ReadingRoomHistory.Location, 'N/A') AS Location
FROM ReadingRoomHistory
WHERE ReadingRoomHistory.TimeIn BETWEEN dbo.ufn_Aeon_ConvertLocalDateTimeToUTC (@BeginDate) AND dbo.ufn_Aeon_ConvertLocalDateTimeToUTC (DATEADD(day, 1, @EndDate))
AND ('N/A' IN (@ReadingRoom) OR ReadingRoomHistory.ReadingRoom IN (@ReadingRoom))
) as ReadingRoomHistoryDataSet
ON ReadingRoomHistoryDataSet.Username = TransactionDataSet.Username
See, I told you it was fun. 😊
- Kevin
Comments
Please sign in to leave a comment.