MS SQL Server: Algorithm and Query for Set-Based Running Accumulators (i.e., Accumulate Row-Level Values across multiple Rows) into a String of Delimited-values, with Break-Level Resets
Have you ever wanted to perform a set-based running-subtotal operation or running-accumulation-of-values (across multiple rows) in SQL-Server — i.e., have row-level accumulators store a running total or running subtotal or aggregation of values within a "break level", stored in a String as delimited-values (i.e., delimited-list of values)? Want the solution to not use cursors?Well, in this blog I present one such cursorless method of implementing that type of functionality in a SQL-Server Query, Stored-Procedure, or Function.
The source code provided here is for a Microsoft SQL-Server Transact-SQL (T-SQL) solution that provides a basis for a query / stored procedure (SP) that you wish to perform Set-Based Running String Accumulation with Break-Level Resets operations in. This demonstrates how to accumulate values from multiple rows in a table into a single string, and how to perform a "reset" of that accumulator at your chosen break-level.
This code is very handy for reporting requirements where you need to display on a report a single field whose value is really made up of the aggregation of values from multiple rows in a database. The example scenario used here is reporting all OrderLineIDs related to a product. The AdventureWorks database (sample) from Microsoft provided the basis and data for this example.
WHY DO THIS?
I have run into a few situations where this has been extremely useful. A typical situation involves reporting functionality, where a report is supposed to show a list of values that exist for all items in a group, but only report this information in summary at a group level. Like, e.g., you have a part number with (potentially) a lot of sub-parts / components that make up the "parent" product, and you want to generate a report showing (at the product level) information like price, cost, build-time, and a list of sub-components (in summary - like just their part numbers). Well, this query demonstrates one method for how that can be accomplished. Also, I have used this as a technique to "bind" two reporting procedures together where a master-detail report links the details as a comma-delimited list of primary-key values (and then, in the detail report, I use the power of a user-defined function to transform that delimited list of key-values into a table (link to source code here on my blog) for joining, using this user-defined delimited-list parser SQL function.I have seen running totals done before, with varied methods, but what I have not seen much about elsewhere is how to do running subtotals. The code I wrote that appears below can easily do both, and does. It is easily adapted to do multi-column break-value running subtotals, as noted in the code comments.
Note: for this example, I used the SQL Server 2012 Adventureworks sample database, which provided me with a test case of approximately 113,000 rows against which I perform the running-subtotal (by product level) logic and do my row-level accumulator manipulations (using SET logic). This query is relatively efficient and took only just under 6 seconds to execute on my early Core-i7 Desktop development PC within a VMware Workstation virtual machine, for the entire operation (before I limited output with TOP() function.
SQL-Server Query Source Code
--********************************************************************************
--This source code is Copyright (c) 2007-2017
-- Author: Mike Eberhart
--
--I hereby release this code under the terms of the MIT License (for freeware).
--
--Permission is hereby granted, free of charge, to any person obtaining a copy
--of this software and associated documentation files (the "Software"), to deal
--in the Software without restriction, including without limitation the rights
--to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
--copies of the Software, and to permit persons to whom the Software is
--furnished to do so, subject to the following conditions:
--
--The above copyright notice and this permission notice shall be included in
--all copies or substantial portions of the Software.
--
--THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
--IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
--FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
--AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
--LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
--OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
--THE SOFTWARE.
--********************************************************************************
--**********************************************************************************************
-- BEGIN: SET-BASED Running-String-Accumulator Technique
--**********************************************************************************************
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS; -- Clears the data cache
DBCC FREEPROCCACHE WITH NO_INFOMSGS; -- Clears the procedure cache
SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
DECLARE @starttime DATETIME = GetDate();
--Variable to accumulate our delimited string of values-per-break-level
DECLARE @RunningSubtotalDelimString VARCHAR(MAX) = '';
--Our break-level-value variables (Data-Types match those of the columns we are comparing to)
--Initialize these to some values that will NOT exist in each break-column's actual data.
DECLARE @SubTotalBreakValue1 INT = -999;
--Our work-table, where we can ensure the ORDER with which we later access the data, being
--by default, the order of the PRIMARY KEY declared here in UniqueID column.
DECLARE @Results TABLE
(
UniqueID INT IDENTITY NOT NULL PRIMARY KEY,
SubtotalBreakColumn1 INT,
ReferenceOrderID INT,
OrderLineIDsForProduct VARCHAR(MAX)
);
--Insert all values to be totaled, into our work table in the REQUIRED BREAK-LEVEL(S) ORDER
INSERT INTO @Results(
SubtotalBreakColumn1,
ReferenceOrderID)
SELECT
ProductID,
ReferenceOrderID
FROM
Production.TransactionHistory
ORDER BY
ProductID, --Insert into table in our subtotal-breaking order (IMPORTANT!)
ReferenceOrderID --and, if we care, sort the OrderIDs too
;
--**********************************************************************************************
-- ALGORITHM EXPLANATION:
-- See SET-BASED RUNNING SUBTOTALS Example 1, which this builds upon and can easily be
-- extended per the comments therein and/or as demonstrated in Subtotals Example 2.
--**********************************************************************************************
UPDATE
@Results
SET
@RunningSubtotalDelimString =
OrderLineIDsForProduct =
CASE
WHEN @SubTotalBreakValue1 = SubtotalBreakColumn1
THEN @RunningSubtotalDelimString +
CASE WHEN @RunningSubtotalDelimString <> '' THEN ',' ELSE '' END +
CONVERT(VARCHAR(10), ReferenceOrderID)
ELSE CONVERT(VARCHAR(10), ReferenceOrderID)
END,
@SubTotalBreakValue1= SubtotalBreakColumn1
;
SELECT DateDiff(ms, @starttime, GetDate()); --Display elapsed Milliseconds
--**********************************************************************************************
--Output the results, showing a few rows to demonstrate the accumulation...
--**********************************************************************************************
SELECT TOP(200) * FROM @results
ORDER BY UniqueID
--UniqueID SubtotalBreakColumn1 ReferenceOrderID OrderLineIDsForProduct
------------- -------------------- ---------------- ---------------------------------------
--1 1 426 426
--2 1 505 426,505
--3 1 588 426,505,588
--4 1 675 426,505,588,675
--5 1 758 426,505,588,675,758
--6 1 841 426,505,588,675,758,841
--...
--...
--45 1 3931 426,505,588,[...etc...],3852,3931
--46 2 425 425
--47 2 504 425,504
--48 2 587 425,504,587
--49 2 674 425,504,587,674
--...
--...
--**********************************************************************************************
--Perhaps we only want one row with the delimited list of unique values for entire break-level
--(i.e., accumulator level, "subtotal level", aggregation level, running subtotal level)
--NOTE: Limit sample using TOP here. Demonstrate GROUP BY with MAX to get just break-level val.
--**********************************************************************************************
SELECT TOP(10)
SubtotalBreakColumn1,
MAX(OrderLineIDsForProduct) AS AccumulatedOrderIDs --The "MAX()" row has the most OrderIDs
FROM @results
GROUP BY SubtotalBreakColumn1
ORDER BY SubtotalBreakColumn1
;
--SubtotalBreakColumn1 AccumulatedOrderIDs
---------------------- -------------------------------------------------------------------------
--1 426,505,588,675,758,841,[...etc...]3457,3536,3615,3694,3773,3852,3931
--2 425,504,587,674,757,836,[...etc...]3456,3535,3614,3693,3772,3851,3930
--**********************************************************************************************
-- END: SET-BASED Running-String-Accumulator Technique
--**********************************************************************************************
Continue to read this Software Development and Technology Blog for computer programming, software development, and technology Techniques, How-To's, Fixes, Reviews, and News — focused on Dart Language, SQL Server, Nvidia CUDA, VMware, TypeScript, SVG, other technology tips and how-to's, and my varied political and economic opinions.
No comments:
Post a Comment