Monday, September 26, 2016

Tuning Very Large SQL-Server Databases - Part 3 (Index Optimization - Clustered / Non-Clustered)

This is the third in a series of three SQL-Server best practice tuning guides to help you achieve optimal performance, speed, and efficiency from your Very Large SQL-Server Databases.  Although these techniques are written for Microsoft SQL Server, they will generally also apply to other relational database management systems (RDBMS) like MySQL, Oracle, DB2, Postgres, and others.

The series now continues with a focus on optimizing physical data-storage requirements (i.e., space reduction) by making smart use of Clustered Indexes when possible.

Tuning Very Large SQL-Server Databases — Part 3

In Part 1 of SQL-Server Large Database Performance Optimization Tuning and Part 2 of SQL-Server Large Database Performance Optimization Tuning, I discussed the need to Reduce Physical Storage Requirements and use proper data-normalization as the foundational first step in achieving optimal SQL-Server processing speed, and explored the reasons WHY a reduction in physical storage requirements is so important. Whether the physical data-storage is hard-drive storage, RAM device storage, SSDs (Solid State Disks), RAID arrays, or SAN / NAS networked storage, the need for speed implies a need to reduce storage requirements.

Note: the database performance tuning concepts discussed in this series thus far apply nearly universally to all mainstream RDBMS (Relational Database Management System) products — not just Microsoft SQL-Server, but also the likes of MySQL, PostgreSQL, Oracle, IBM DB2, and others. And some of the concepts discussed herein (clustered indexes and such) will apply to many other database solutions as well, since the concept is not unique to SQL-Server.

Now I will discuss some specific database-tuning strategies with respect to indexes in SQL Server relational databases.

Clustered vs. non-Clustered Indexes in SQL-Server

One of the most important decisions to make when defining an indexing structure and strategy throughout your SQL-Server databases is how to best make use of Clustered and/or non-Clustered indexes on individual tables.

Knowing how, and when, to best use a clustered index can make a serious difference when performance tuning large SQL-Server databases. Basically, if a key-column is ALWAYS increasing (in sort order) as data is added, and that column is relatively "narrow" (in bytes), make it CLUSTERED and you'll save a LOT of space (since the data for the Clustered version of an index is already PART of the same file). This is why applying clustered indexing to pre-ordered data is a great fit, and why applying clustered indexes to monotonically increasing values (like those in IDENTITY columns) makes a lot of sense — the table's data is physically ordered by those values, and adding rows to the end of the table can be done quite efficiently in most cases.

Certainly you can still use clustered indexes for data that is not pre-sorted or otherwise ordered already, but depending on the situation, this can lead to significant inefficiencies, as SQL-Server will need to physically re-order the table's data as values are added so that the table stays arranged by the clustered-index column(s) — which can make disks thrash like crazy. Keep this in mind, especially in high-volume and large database scenarios.
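
As a minimal sketch of this pattern (the table and column names below are hypothetical, not taken from any example in this series): cluster on the narrow, ever-increasing IDENTITY column so new rows always land at the "end" of the table, and index any wider lookup columns non-clustered.

-- Narrow, ever-increasing IDENTITY column becomes the clustered key,
-- so INSERTs append rather than forcing physical re-ordering:
CREATE TABLE dbo.EventLog
(
    EventID    INT IDENTITY(1,1) NOT NULL,
    EventTime  SMALLDATETIME     NOT NULL,
    DeviceID   SMALLINT          NOT NULL,
    EventCode  TINYINT           NOT NULL,
    CONSTRAINT PK_EventLog PRIMARY KEY CLUSTERED (EventID)
);

-- Wider lookup columns can still be indexed, just NON-clustered:
CREATE NONCLUSTERED INDEX IX_EventLog_DeviceID_EventTime
    ON dbo.EventLog (DeviceID, EventTime);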

Here are some further points to consider regarding Clustered vs. non-Clustered Indexes and performance in SQL-Server:

  • As mentioned before, if there is less data (actual bytes) to move, regardless of what speed your system can move this data (whether in RAM or to and from disk), SQL-Server will still perform the same operation in less time — and, there is a good chance the elapsed processing time will decrease by a percentage greater than the percentage of storage-space you have saved...
  • ... so, choose an indexing strategy that prevents unnecessary data-movement operations, and/or minimizes data-movement. This will involve understanding what percentage of time your application spends performing INSERTS as opposed to SELECTS, UPDATES, and DELETES. Why does this matter? Well...
  • beware that UPDATES to Clustered-Index Columns can be painfully slow operations... perhaps the reasons are obvious; if not, let me know and perhaps I can expand upon this article in the future. Basically, just think about how, if a table is physically ordered by the column(s) that you are altering, the entire table is subject to physical re-ordering when those column value(s) change.


REAL-WORLD SQL-SERVER TUNING EXAMPLE(S)

Now I will quickly discuss a real-world (i.e., production-environment) time-savings and performance comparison I have witnessed before and after optimizing database indexing strategies, so you can get a feel for the magnitude of performance increases that are possible:

Before Tuning

I encountered a rather substantial SQL-Server database at a client site that had an Order-Lines table with millions of rows of data residing in it. The client had a custom application that performed inserts, updates, and deletions on this OrderLines table on a regular basis, and with a rather high volume of transactions.

And, though the custom application's GUI was responding within the threshold of "acceptable response time" at a few seconds (up to as much as 10-15 seconds under the heaviest loads throughout the day), the delay was bothersome (and costly) as hourly workers had to basically slow down and wait for the application to "catch up to them".


Tuning Performed

A quick examination of the database stored procedures used to perform Order-Line additions, updates, and deletes revealed which tables, columns, and indexes were involved in the order-processing maintenance business process (and the related application GUI forms). Further digging revealed that the primary Order-Lines table had an already-existing clustered index applied to a group of columns that, together, uniquely identified each OrderLine row.

I also noticed that the table in question already had an OrderLineID column (a unique, monotonically increasing, auto-incrementing IDENTITY value). Perhaps it was added at a later point in the software development and database design process — who knows — but this jumped out as an easy, immediate, on-the-spot fix opportunity that would deliver serious performance improvements.

I simply altered the table's index definitions to make the IDENTITY column the Clustered index and the multi-column identifying group a non-clustered index — i.e., the new post-tuning setup clustered on the Identity values instead of on the combination of other columns' values, and what used to be the clustered index became a multi-column non-clustered index.
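
In rough T-SQL terms, the swap looked something like the sketch below. The index and column names here are hypothetical (the client's actual schema is not reproduced), and if the original clustered index had been backing a PRIMARY KEY constraint, you would drop and re-add the constraint rather than the index directly.

-- 1) Drop the old multi-column clustered index:
DROP INDEX IX_OrderLines_BusinessKey ON dbo.OrderLines;

-- 2) Re-cluster on the narrow, ever-increasing IDENTITY column:
CREATE UNIQUE CLUSTERED INDEX CX_OrderLines_OrderLineID
    ON dbo.OrderLines (OrderLineID);

-- 3) Keep the old identifying column-group as a (still unique) NON-clustered index:
CREATE UNIQUE NONCLUSTERED INDEX IX_OrderLines_BusinessKey
    ON dbo.OrderLines (OrderID, LineNumber, ProductID);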

After Tuning

The results were dramatic, and GUI operation responsiveness became nearly instant (as compared to multi-second response times). I also had to verify (as you always should) that my SQL Server Database index-tuning changes did not negatively impact the performance of OTHER application functions — like reporting operations, ad-hoc queries, web-application access to this same data, and so on.

This testing proved out OK, as no regressions were encountered; and, thanks to the opportunity to do this regression testing, I was even able to make a few other index-tuning changes that delivered additional speed gains to various portions of the database applications. This is just one simple case of looking for some "low hanging fruit" to pick when performance tuning with indexing strategy in focus.

I hope you have found this series of Microsoft SQL-Server database tuning and optimization information useful. Feel free to leave a comment or contact me if your business has a particular SQL performance-tuning challenge, or if you wish to leave any suggestions, point out errata, etc. Enjoy!

by Mike Eberhart (originally published: Jun-2007; updated Sep-2016)

Continue to read this Software Development and Technology Blog for computer programming, software development, and technology Techniques, How-To's, Fixes, Reviews, and News — focused on Dart Language, SQL Server, Nvidia CUDA, VMware, TypeScript, SVG, other technology tips and how-to's, and my varied political and economic opinions.

Sunday, September 25, 2016

Tuning Very Large SQL-Server Databases - Part 2 (Physical Storage Optimization - Continued)

This is the second in a series of three SQL-Server best practice tuning guides to help you achieve optimal performance, speed, and efficiency from your Very Large SQL-Server Databases.  Although these techniques are written for Microsoft SQL Server, they will generally also apply to other relational database management systems (RDBMS) like MySQL, Oracle, DB2, Postgres, and others.

The series continues with a focus on optimizing physical data-storage requirements (i.e., space reduction), since it is perhaps the most important overall tuning consideration for large databases.

Tuning Very Large SQL-Server Databases — Part 2

In Part 1 of SQL-Server Large Database Performance Optimization Tuning, I discussed the need to Reduce Physical Storage Requirements and use proper data-normalization as the first foundation-step in achieving optimal SQL-Server processing Speed. I am now going to go deeper into explaining the reasons why this is so important and how it can make considerable speed and performance improvements possible in SQL Server relational databases.

Your SQL Database Server has Finite Resources

I do not care how modern or powerful your database server is, how many processors it has, or how much physical memory (RAM) or hard drive space it has. The fact is, your SQL-Server machine has finite resources and bandwidth / processing-power, and it is utterly foolish to waste that power on pushing around extra data when it is not necessary.

In my years of dealing with large databases for various clients, I repeatedly see management more willing to throw huge sums of money at a "hardware solution" to a performance problem instead of applying some common sense database tuning and optimization techniques. Attempting to fix a performance problem by just throwing ever larger and more powerful computers at it may work in some cases, but in many situations it will just act to postpone the inevitable performance implosion that is lurking in poorly optimized databases. Certainly your database server equipment needs to be up to the task, but the "task" should not include performing completely avoidable disk I/O and extraneous processing.

I don't want to get too deep into how to configure an optimal SQL-Server hardware solution here, but the general things you will need to be aware of when tuning and optimizing for very large SQL databases include:
  • First and foremost, you will need to understand the various (interacting) limitations of Microsoft Operating Systems and Microsoft SQL Server versions. Both the Windows Server Operating System and the database product impose limits on the physical resources that can be utilized by your database application.  There are "Standard" and "Enterprise" (and other) editions of each product, and each imposes limitations on the number of processors and the amount of RAM that will be made available to your database, not to mention the availability or lack of other substantial features of the database or OS software itself.
  • Hardware: When I originally wrote this how-to article back in 2007, CPUs were nowhere near as advanced as they are in 2016, RAM was considerably more expensive (as were disk drives), SSDs were seen as ridiculously expensive, and things like virtual servers and cloud-based SQL-Server solutions were in their infancy. Regardless, some basics remain the same about how to choose appropriate hardware for your SQL-Server requirements.

    Processors (CPUs) — SQL Server can really benefit from multiple CPUs and processor cores. At a minimum, you should have a dual-processor server if you really want to improve performance.

    Likewise, RAM is your friend, and SQL-Server's friend too. Get as much as you can afford, and be sure the editions of SQL-Server and Windows Server you choose can actually make use of the amount of RAM in your system.

    Disk Space — you had better have a RAID array of some sort, or an ultra high-speed SAN or such for your physical storage device (if you are not simply using some hosted cloud solution that takes care of all this for you). And, though you will need a large amount of space, increasing the physical number of hard-drives (whether old-school spindles, or modern SSDs) is how to really start boosting performance. I.e., if you have your database on one or two huge disks, instead of a dozen or more (probably smaller) drives in an array that in aggregate offers the space you need, chances are you'll quickly reach the limitations of the disk subsystem's throughput.

    Note: our main focus is to keep physical disk I/O to a minimum, and use RAM wherever possible since physical disk I/O, even on the fastest drive arrays, is many orders of magnitude slower than going to RAM for data.

    Network — if you'll be moving large quantities of data to and from the database server machine(s), you best have a modern ultra-high-speed network in place between those machines.


Moving Data around is Costly, and occurs within the Confines of your Hardware

I've focused on minimizing the data-storage requirements in your massive SQL-Server database for the sole purpose of increasing performance. The primary way this strategy improves performance is by reducing the amount of data that is moved around, especially to and from (slow) disk drives.

Here are some explanations for how minimizing data-storage requirements increases performance and speed within your SQL Server database solution:
  • Just think about it: having less data (actual bytes) to move, regardless of what speed your system can move this data (whether in RAM or to and from disk), means that your machine will be able to perform the same operation in less time — and, there is a good chance the elapsed processing time will decrease by a percentage greater than the percentage of storage-space you have saved...
  • ...this is especially true if the reduced data-size causes operations that otherwise were unable to be performed in RAM to now fit within the confines of your server's RAM. If disk-swapping has been eliminated, performance improvements will be stellar.
  • Every SQL-Server operation, whether you are performing a SELECT, INSERT, UPDATE, or DELETE, must move data between RAM and physical storage. Smaller data equals faster movement and higher row-throughput per time-slice. Period. I am not ignoring caching here either: keep in mind that even data retrieved from any cache is some type of physical storage, even if it is other locations in RAM, and data is still going to move around.
  • If an operation is logged, as are any operations that modify data, the amount of data that must be written to the log will also be reduced, and thus speed up your database operation. Optimizing your database design so as to reduce physical storage requirements can therefore also help keep your transaction-log size under control.
  • Index-widths (in bytes) should be smaller, thus reducing storage requirements, and even more so allowing much more index information to be cached in RAM at any given time — this is crucially important for the speed of JOIN operations, sorting, and much more.
  • Any comparison operations will be faster. And, all sorts of operations are, at their core, comparison operations — JOIN conditions, ORDER BY clauses (sorts), and WHERE conditions all require comparisons to be performed. The less data there is to compare, the smaller the amount of resources (RAM / Disk) required to perform the comparison. If you've done a great job of reducing physical data-storage requirements, these operations will all run much faster.
  • Backup jobs, and even replication, will run much faster, since there is much less data to write out to tape or copy to another storage device. Likewise, data-recovery and transfer are much more efficient. (A quick way to measure how much table and index space you are starting from is sketched just below this list.)
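
Before chasing any of these gains, it helps to quantify what you are starting with. A quick size check might look like the sketch below (the table name is hypothetical); sp_spaceused gives a summary, and the catalog views give a per-index breakdown.

-- Summary of data + index space for one (hypothetical) table:
EXEC sp_spaceused N'dbo.OrderLines';

-- Per-index space breakdown from the catalog / DMV views (pages are 8 KB each):
SELECT  i.name                               AS IndexName,
        i.type_desc                          AS IndexType,
        SUM(ps.used_page_count) * 8 / 1024.0 AS UsedSpaceMB,
        SUM(ps.row_count)                    AS RowCnt
FROM    sys.dm_db_partition_stats AS ps
JOIN    sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id
WHERE   ps.object_id = OBJECT_ID(N'dbo.OrderLines')
GROUP BY i.name, i.type_desc;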

REAL-WORLD SQL-SERVER TUNING EXAMPLE(S)

Now I will quickly discuss a real-world (i.e., production-environment) time-savings and performance comparison I have witnessed before and after optimizing data-storage requirements, so you can get a feel for the magnitude of performance increases that are possible:

Before Tuning

I was confronted with a client's SQL-Server database where they had basically "hit a wall" with their internally designed database and stored procedures. Their (relatively new) database was growing at such a phenomenal pace, both in row-counts and physical size, that their database-server hardware was being pushed to its limits, and some reporting processes were unable to finish in less than a couple hours.

There were a few core tables, each in excess of 50 million rows, that were involved in the queries which were so long running. The total database size, with indexes, was pushing past 45 Gigabytes, and growing by nearly 10 Gigabytes a month (and, Indexes were about ⅓ of the total DB size). This thing was a serious time-bomb ready to implode, and it was (of course) meant to be a data-warehouse critical to management-reporting in a Fortune-100 firm.

Tuning Performed

This was one of the largest databases I had ever seen built by someone with absolutely no clue how to properly normalize data and use the minimum amount of storage space possible. Though the few core tables had relatively "narrow" (byte-width) rows (just a couple hundred characters per row), a large part of each row was used as an index (NON-Clustered), and the bulk of the columns in this index were VarChar fields where the columns were mostly filled with data on each row. There were columns like "DivisionName" and "OfficeName" and the like in this Index, each containing data like "Midwest US" and "Houston, Texas Branch". I couldn't even believe it when I saw it.

So, I began by ripping those VarChars out and replacing them with a TinyInt foreign-key to a Divisions table I made (there were only like eight total values!), and a SmallInt foreign-key to an Offices table I made (with less than a thousand offices). Needless to say, the database took a large step downward in size. I continued with this type of work...

After Tuning

By the time I was done, the database was only a third of its original size. And, I rewrote the reporting process that had been taking hours to run so that it joined on the various Integer columns (vs. all the VarChars) and only brought in the "lookup data" (Office Names, etc.) when needed at the final output stage.
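
A hedged sketch of that reporting pattern (the table and column names here are hypothetical, not the client's actual schema): aggregate on the small integer foreign keys first, and join to the lookup tables only for the final, already-small result set.

SELECT  d.DivisionName,
        o.OfficeName,
        agg.TotalCost
FROM
(
    -- The heavy lifting happens on narrow integer columns only:
    SELECT  t.DivisionID,            -- TINYINT foreign-key
            t.OfficeID,              -- SMALLINT foreign-key
            SUM(t.ActualCost) AS TotalCost
    FROM    dbo.FactTransactions AS t
    GROUP BY t.DivisionID, t.OfficeID
) AS agg
JOIN    dbo.Divisions AS d ON d.DivisionID = agg.DivisionID
JOIN    dbo.Offices   AS o ON o.OfficeID   = agg.OfficeID
ORDER BY d.DivisionName, o.OfficeName;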

As soon as I broke through the size-threshold where much more of the information required to perform complex JOIN conditions could fit in the database server's 4 Gigabytes of RAM (which was relatively large RAM back in 2007), performance was now incredible. Procedures that took hours to run before were now running in a minute or less.

This all required very few coding changes — just some logical application of normalization techniques. The biggest challenge was getting enough processing time to convert the existing production database to the new layout, since its poor design and massive size made alterations a very slow process — at first, the production server didn't even have enough disk-space to allow the conversion to take place!

More such examples to come, as I move forward with SQL Server Performance tuning through proper database design, normalization, and even Transact-SQL coding...

Continue with Part 3 of SQL-Server Database Tuning now.


by Mike Eberhart (originally published: May-2007; updated Sep-2016)


Saturday, September 24, 2016

Tuning Very Large SQL-Server Databases - Part 1 (Physical Storage Optimization)

This is the first in a series of three SQL-Server best practice tuning guides to help you achieve optimal performance, speed, and efficiency from your Very Large SQL-Server Databases.  Although these techniques are written for Microsoft SQL Server, they will generally also apply to other relational database management systems (RDBMS) like MySQL, Oracle, DB2, Postgres, and others.

The series begins with a focus on optimizing physical data-storage requirements (i.e., space reduction), since it is perhaps the most important overall tuning consideration for large databases.

Tuning Very Large SQL-Server Databases — Part 1

SQL-Server Performance Optimization and SQL-Server Speed Optimization should always be a primary consideration during the database design phase, regardless of the size of the relational database — with the exception perhaps of "tiny" or "trivial" databases where future growth is guaranteed not to be an issue. And, very large databases (in excess of a Gigabyte of data, or millions of rows of data in any particular table) definitely require the SQL-Server database designer / administrator to consider the ramifications of proper (or improper) database design. Perhaps this seems obvious, but given my many years of experience and encounters with large corporate databases, what should be obvious is anything but obvious to many developers, designers, and administrators that implement these massive databases.

Most of this discussion applies to all modern Microsoft SQL-Server versions (SQL 2008, 2012, 2014, 2016) as well as versions as early as SQL-Server 7.0, 2000, 2005. The newest versions of SQL-Server, especially the full-featured Enterprise editions, may also offer you additional features of relevance like built-in data-compression. But, even fancy features like data-compression are no substitute for proper design.

Reduce Physical Storage Requirements

The first thing I always make sure of when designing databases, and when tuning large SQL-Server databases for performance, is that physical storage space is not being wasted on data-types that are overly precise or excessively large for their intended function, and that data is properly normalized. If you don't understand the concepts of normalization, please study up on it before reading further (or pretending you can do your job as a SQL-Server developer or DBA).

Keep in mind that this discussion is all to be considered in the context of very large databases, where there will be millions of rows of data in one or more tables. In fact, you may be working with tens of millions or hundreds of millions of rows of data. As such, every byte of storage saved on a per-row basis will potentially result in Megabytes saved (in reduced physical storage), or perhaps even gigabytes saved in the overall table and database size.

I will now present some very simple rules for reducing physical storage requirements and some (storage space) savings-opportunity examples to consider:
  • Do not use NVarchar or NChar (vs. Varchar and Char respectively) unless you need Unicode character support. Using NVarchar (or NCHAR) where a standard Varchar/Char will suffice will double the storage requirements for a column, since Unicode characters map to double-byte pairs (to accommodate all those foreign language characters and extended symbols, etc).

    Savings Opportunity: I have encountered NVarChar use on columns where there is an Alphanumeric "TransactionID" on millions of rows of data, but where that AlphaNumeric value most certainly contains only the 10 digits and the standard 26 letters of the English alphabet. Instant space savings!
  • Use SmallDateTime instead of DateTime, unless you really need the precision. SmallDateTime fields are accurate to the nearest minute, whereas full DateTime types are accurate to 3.33 millisecond intervals. And, that extra precision comes with the price tag of double the storage — SmallDateTime requires 4 Bytes instead of 8 Bytes for DateTime.

    Savings Opportunities: if you see fields like "AccountingPeriodBeginDate", it should be obvious that an accounting period is going to start on a particular day boundary, and SmallDateTime will more than suffice. I also regularly see fields like "LastChangeDate" on tables in large databases where, upon inquiring, nobody can provide any reason why accuracy beyond a particular minute is needed, yet the fields are defined as full DateTime fields.
  • Use the smallest Integer data-types that will safely and comfortably handle all possible values in an integer-data column. SQL-Server provides a few datatypes for this, including bigint, int, smallint, and tinyint. Each has its own value range and storage-space requirements. TinyInt takes just one byte, but can only store values ranging from 0-255 (i.e., 2^8 values), whereas SmallInt is two bytes in length and can store 2^16 values, Int is 4-bytes and holds 2^32 values, and BigInt is 8-bytes and holds 2^64 values (see the pattern?)

    Savings Opportunity: look for columns that have an obviously small number of values, but whose column-definitions are grossly oversized. E.g., if you see a field like (accounting) "Quarter" defined as an Int, TinyInt can easily hold the four possible values (Quarter 1, 2, 3, or 4) and saves 3 bytes/row. Or, perhaps you'll see UserID defined as an Int where there's no way in the next 100 years the application will have over 10,000 or 20,000 users — this sure looks like a good situation to downsize to SmallInt and save half the space!
  • As with Integer types, the money data-types come in a large and small size — Money and SmallMoney respectively; the first of which takes 8-bytes to store, and the latter which takes only 4 bytes. SmallMoney values are limited to roughly +/-214,000.0000 compared to hundreds of trillions for Money.

    Savings Opportunity: if you have per-unit Costs, Charges, or the like stored in your database columns where the price is never going to even begin to approach that $214,000 value, use SmallMoney and cut your storage by half. E.g., if you are storing the price of items sold in your convenience-store locations, or even the price-per-gallon of gasoline you sell, this limitation should suffice for the foreseeable future.
  • Decimal and Numeric data-types provide a bit more complexity when analyzing the proper "best" size for the fields. See the SQL-Server Books Online to size appropriately, since the maximum total number of decimal digits that can be stored, both to the left and to the right of the decimal point, will determine the storage-bytes required. You want to size your columns in a way that they will always be large enough to hold the values you will place in them, without over-sizing them.
  • Variable-length fields where fixed-length should be used. Varchar fields require 2 extra bytes, in addition to the bytes consumed by the actual data stored, in order to track the length of the value. Varchars are great when there really is heavy variation in the actual length of data being stored. But, they do not make sense if you'll always be filling all bytes in the field (or even all bytes minus two).

    Savings Opportunity: perhaps one of the most ridiculous things you will encounter is a one or two character field that is defined as Varchar(1) or (2). This is one sure way to make a one-character field take 3-bytes of storage per row. I have seen this repeatedly with things like "Flag" fields or "Code" fields people design into table structures. I personally think the RDBMS should not allow a user to even define Variable-width columns of just one or two characters.

    Note: SQL-Server has one oddity to watch out for to ensure that Char fields are truly treated as fixed-length fields — as stated in the Books Online: "If SET ANSI_PADDING is OFF when either CREATE TABLE or ALTER TABLE is executed, a char column that is defined as NULL is handled as Varchar"... just so you are aware (note: I find this wacky to say the least - if I define something as Char, I really mean Char and not Varchar)
  • Look for obvious cases where proper data-normalization is being ignored. Quite often, you'll encounter massive data-repetition in tables with millions or tens/hundreds of millions of rows. The key to speed is to make that repetitive data take up as little space as possible (without employing some data-compression algorithm that is going to kill performance on save/retrieves - in case you were thinking about such a thing).

    Savings Opportunity: start by looking for just plainly blatant, ignorant denormalization. I have seen so many massive tables where a field like "LastUpdateUser" is a 10- or 20-character (if not Unicode character) field, in databases where those last-update "Users" are a finite group of individuals that can be counted on both hands. Talk about a waste! If you encounter this type of situation, it's time to create an "UpdateUsers" table (or whatever) with an Integer primary-key (in this example, a TinyInt) and a field called "UserName" or whatever, and replace the "LastUpdateUser" column on the massive-rows table with a "LastUpdateUserID" TinyInt column (a foreign-key to the new UpdateUsers table). This will save a ton of space! And, there is nearly zero cost in joining to such a small table later if you really need to get the name associated with the ID. (A sketch of this change appears just after this list.)

    Another Savings Opportunity: there is another technique I have employed to tune very large tables (in excess of 50 million rows/table) that may be a bit tougher to initially grasp. But, this technique can be a huge space and processing-time saver. It involves a bit of data-analysis, and thinking "out of the box". Basically, I will look for data-patterns that repeat, and whose unique (pattern) values are finite though not instantly apparent — especially when the patterns span multiple columns on the same table. I've routinely discovered that the unique values from a combination of a few fields may only number in the thousands, even though the total table rows exceed 50 or 100 million. When I find these, and establish the reason for the pattern (and ensure my logic will accommodate future conditions too), I will pull the unique field-combinations into a "lookup" table and create a small FK (SmallInt or Int quite often) to point to those combinations (and replace the column(s) whose data I have moved to a lookup table). Why? I'll get into this in the next section about why to do any of this space-tuning, but the bottom line is speed and efficiency. If you have a particular situation you need tuning help with, contact me.
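
The "LastUpdateUser" example above might look something like the following sketch in T-SQL. The wide table name (dbo.BigTable) is hypothetical; the UpdateUsers / UserName / LastUpdateUserID names follow the description above, and the TinyInt key assumes the user list really is tiny.

-- 1) Create the small lookup table (TinyInt key, per the example above):
CREATE TABLE dbo.UpdateUsers
(
    UpdateUserID  TINYINT     NOT NULL PRIMARY KEY,
    UserName      VARCHAR(20) NOT NULL
);

-- 2) Populate it from the distinct values already in the wide table:
INSERT INTO dbo.UpdateUsers (UpdateUserID, UserName)
SELECT  CAST(ROW_NUMBER() OVER (ORDER BY LastUpdateUser) AS TINYINT),
        LastUpdateUser
FROM    (SELECT DISTINCT LastUpdateUser FROM dbo.BigTable) AS src;

-- 3) Add the narrow foreign-key column (in its own batch, so the new column is
--    visible to the statements that follow):
ALTER TABLE dbo.BigTable ADD LastUpdateUserID TINYINT NULL;
GO

-- 4) Back-fill the new column from the lookup table:
UPDATE  b
SET     b.LastUpdateUserID = u.UpdateUserID
FROM    dbo.BigTable    AS b
JOIN    dbo.UpdateUsers AS u ON u.UserName = b.LastUpdateUser;

-- 5) Once everything is verified, drop the wide character column:
ALTER TABLE dbo.BigTable DROP COLUMN LastUpdateUser;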

If you begin with the above space-savings guidelines, chances are you'll find a way to make some significant storage-space reductions in your SQL-Server databases. And, you will have made the first steps towards performing some serious speed-optimizations in your database as well.

One note of caution when making any database changes: be sure you understand the ramifications beyond the scope of the database. Many of the changes I've outlined will be more or less transparent to other applications, depending on what (languages) the applications are developed in, and how the applications access the data (note: I always recommend that applications do not have direct access to the data-tables, but rather access data through stored procedures — for many reasons, one of which is to allow simplified maintenance to the underlying database tables without impacting other applications).

If you don't know what side-effects your changes could cause for Graphical User Interfaces (GUI applications), reporting applications, and so on, then be sure to involve someone who does. E.g., when changing Int fields to SmallInt, a Borland (aka CodeGear) Delphi based program most likely won't care, but C# data-bindings may have issues on an ASP.NET page or such. So, be careful. From what I have seen, both Delphi and C# seem fine with SmallDateTime in place of full DateTime fields in many cases, but it can also depend on how customized an application is.

Computing / Estimating Storage Requirements

To realize how much potential for storage-space savings there is, you will want to be able to quickly rough-out some approximations based on the columns in table(s), the number of bytes used per-column, and the number of rows per table.

In theory, you should be able to quickly estimate a table's (physical storage) size just by knowing the number of rows and multiplying that by the total of all columns' storage requirements. This can be put into a simple formula something like this (I didn't put ALL data types in here, but you should get the idea), where you just multiply the number of rows in a table by:
(NVarChar-field-widths-in-aggregate * 2 + #-of-Nvarchar-fields * 2) +
(VarChar-field-widths-in-aggregate + #-of-varchar-fields*2) +
(NChar-field-widths-in-aggregate * 2) +
(char-field-widths-in-aggregate) +
BigInts*8 + Int's*4 + SmallInts*2 + TinyInts*1 +
DateTimes*8 + SmallDateTimes*4 +
Money*8 + SmallMoney*4 +
(Total-bit-fields-DIV-8 + 1)
This calculation does not take into consideration Keys or Indexes, but it will give you the approximate data-space requirements. And, it's a good way to quickly estimate reductions in storage space too. For the data-types I didn't cover, see the Microsoft SQL-Server Books Online, which describes how much storage each data-type requires — see the "Data Types (Transact-SQL)" section.
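
As a quick worked example of the formula (the row-count and column mix here are hypothetical, chosen only to illustrate the arithmetic):

-- Hypothetical 50-million-row table with: one INT (4), one SMALLDATETIME (4),
-- one TINYINT (1), one SMALLMONEY (4), one CHAR(8) (8), and one VARCHAR(30)
-- assumed to average full width (30 + 2 bytes of length overhead).
DECLARE @RowCount    BIGINT = 50000000;
DECLARE @BytesPerRow INT    = 4 + 4 + 1 + 4 + 8 + (30 + 2);   -- = 53 bytes/row

SELECT  @BytesPerRow                                    AS EstBytesPerRow,
        (@RowCount * @BytesPerRow) / (1024.0 * 1024.0)  AS EstDataSizeMB;
-- Roughly 53 bytes/row * 50 million rows = ~2.5 GB of raw data space,
-- before row overhead, keys, and indexes are considered.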

Continue with Part 2 of SQL-Server Database Tuning now.


by Mike Eberhart (originally published: May-2007; updated Sep-2016)


Friday, September 23, 2016

SQL SERVER Function (UDF) to Parse / Split string of delimited Integers into Table

SQL Server UDF to Parse / Split string of delimited Integers into Table

Here is an incredibly useful Microsoft SQL-Server Transact-SQL (T-SQL) user-defined function (UDF) for splitting a string of delimited integer values into a table result.

This is very handy if you have a list of integer index or primary-key values that you want to retrieve results from a table for. Just pass a concatenated and delimited string of integer values to this function, and JOIN to the results of this function or use IN within the SELECT condition, etc.

For example, @vIDs is a string of comma-separated values, like '1,2,4,22,101':

CREATE PROCEDURE uspDoWhatever
    @vIDs    VARCHAR(8000)
AS

SELECT [column list]
  FROM [SomeTable]
  WHERE [intColumnName] IN (SELECT * from dbo.udfGetDelimitedIntegerListAsTable(@vIDs, ','))

This comes in very handy when working with one-to-many relationships in UIs, whether web-based or otherwise, and other applications.
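
The JOIN-style usage mentioned above might look like this (using the same [SomeTable] / [intColumnName] placeholders as the IN-clause example):

DECLARE @vIDs VARCHAR(8000) = '1,2,4,22,101';

SELECT  t.*
  FROM  [SomeTable] AS t
  JOIN  dbo.udfGetDelimitedIntegerListAsTable(@vIDs, ',') AS ids
        ON ids.IntegerFromList = t.[intColumnName]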

I will add links here to additional related functions and procedures.

SQL-Server User Defined Function (UDF) Source Code

--********************************************************************************
--This source code is Copyright (c) 2007-2016
--     Author: Mike Eberhart
--
--I hereby release this code under the terms of the MIT License (for freeware).
--
--Permission is hereby granted, free of charge, to any person obtaining a copy
--of this software and associated documentation files (the "Software"), to deal
--in the Software without restriction, including without limitation the rights
--to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
--copies of the Software, and to permit persons to whom the Software is
--furnished to do so, subject to the following conditions:
--
--The above copyright notice and this permission notice shall be included in
--all copies or substantial portions of the Software.
--
--THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
--IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
--FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
--AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
--LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
--OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
--THE SOFTWARE.
--********************************************************************************
--**********************************************************************************************
-- This function takes a string of delimited numbers and explodes them out into a table.  
-- By default, the delimiter is set to '|' if another is not specified in the second parameter.
--
-- The function also selects any "missing" values as NULL.
-- E.g., if a comma-delimited string to parse is '1,,,2' the function assumes each (missing) 
-- value between consecutive commas represents a NULL value.
-- The calling code can turn NULLs into whatever is needed.
--
-- NOTE:There is no error testing for malformed strings that contain non-numeric values other 
-- than the defined delimiter.  Also, currently just setup for INT values at max (vs. BIGINT).
--
-- SOME EXAMPLES / TEST CASES:
-- ===========================
--SELECT * FROM dbo.[udfGetDelimitedIntegerListAsTable](NULL, ',');       --No rows returned
--SELECT * FROM dbo.[udfGetDelimitedIntegerListAsTable](NULL, NULL);      --No rows returned
--SELECT * FROM dbo.[udfGetDelimitedIntegerListAsTable]('1', NULL);       --No rows returned
--SELECT * FROM dbo.[udfGetDelimitedIntegerListAsTable]('', ',');         --No rows returned
--SELECT * FROM dbo.[udfGetDelimitedIntegerListAsTable]('1', DEFAULT);    --Returns ONE value
--SELECT * FROM dbo.[udfGetDelimitedIntegerListAsTable]('1|2|3', DEFAULT);--Returns THREE values
--SELECT * FROM dbo.[udfGetDelimitedIntegerListAsTable]('1,2,3', ',');    --Returns THREE values
--SELECT * FROM dbo.[udfGetDelimitedIntegerListAsTable]('1,', ',');       --Return '1' and NULL
--SELECT * FROM dbo.[udfGetDelimitedIntegerListAsTable]('1,2,', ',');     --Two values and NULL
--SELECT * FROM dbo.[udfGetDelimitedIntegerListAsTable](',', ',');        --Return TWO NULLs
--SELECT * FROM dbo.[udfGetDelimitedIntegerListAsTable]('1,2,3', DEFAULT);--Conversion Error!
--**********************************************************************************************
CREATE FUNCTION [udfGetDelimitedIntegerListAsTable]
(
  @IntegerList  VARCHAR(MAX),
  @Delimiter    CHAR(1) = '|'
)
RETURNS @IntegersTable TABLE (IntegerFromList  INT)
AS
BEGIN
  IF (@IntegerList IS NULL) OR (LEN(@IntegerList) = 0) OR (@Delimiter IS NULL) RETURN;

  DECLARE  @DelimPos INT = PATINDEX('%' + @Delimiter + '%', @IntegerList);

  WHILE @DelimPos <> 0
  BEGIN
    --If nothing between delims, save as NULL
    IF LEN(SUBSTRING(@IntegerList, 1, @DelimPos - 1)) = 0
      INSERT INTO @IntegersTable(IntegerFromList)  VALUES(NULL);
    ELSE
      INSERT INTO @IntegersTable(IntegerFromList)
      VALUES(CONVERT(INT, SUBSTRING(@IntegerList, 1, @DelimPos - 1)));

    SET @IntegerList = SUBSTRING(@IntegerList, @DelimPos + 1, LEN(@IntegerList));
    SET @DelimPos = PATINDEX('%' + @Delimiter + '%', @IntegerList);
  END --While...

  --If no additional chars after a final delim, treat as an additional NULL
  IF LEN(@IntegerList) = 0
    INSERT INTO @IntegersTable(IntegerFromList)  VALUES(NULL);
  ELSE
    INSERT INTO @IntegersTable(IntegerFromList)  VALUES(CONVERT(INT, @IntegerList));

  RETURN;

END --Function



Wednesday, September 21, 2016

SQL-Server Set-Based Running Subtotals (i.e., Row-Level Accumulators) Example 2 - Multiple Simultaneous Running-Totals

SET-BASED Multiple Simultaneous Running Subtotals in SQL

Have you ever wanted to perform a set-based running-subtotals and/or running-totals operation in SQL-Server (i.e., have row-level accumulators store a running total or running subtotal?) Want the solution to not use cursors? Don't have access to SQL-Server 2012 or newer yet?

Here I present a cursorless method of implementing such functionality in a SQL-Server Query, Stored-Procedure, or User Defined Function (UDF) that works with all versions of SQL-Server (SQL-Server 2005, 2008, 2008r2, 2012, 2014, 2016...) without requiring SQL2012-specific OVER / PARTITION BY functionality. I have seen running totals done before, with varied methods, but what I have not seen much about elsewhere is how to do running subtotals. The code that appears below does both running totals and running subtotals. It is easily adapted to do multi-column break-value running subtotals, as noted in the code comments.

This example builds off of Set-Based Running Subtotals (i.e., Row-Level Accumulators) Example 1, which introduces the techniques shown here (and more), and includes more discussion of the solution(s) — see the comments in the example source code.

SQL-Server 2012 (or newer) Note: once SQL-Server 2012 became widely available, this example was updated to show how to implement the running totals and running subtotals using SQL-Server 2012 OVER and PARTITION BY (windowing functionality) clauses that are ANSI standard SQL available to Microsoft Transact-SQL query writers now. See the lower portion of the source code for a solution using these modern SQL features. Included are performance (run-time) comparison values for both solution methods.

Note: for this example, I used the SQL Server 2012 AdventureWorks sample database, which provided me with a test case of approximately 113,000 rows against which I perform the running-subtotal (by product level) logic and do my row-level accumulator manipulations (using SET logic). This query is relatively efficient and averaged just under 3 seconds to execute within a VM (virtual machine) on my single-CPU Intel Core-i7 development desktop machine.

SQL-Server Source Code

--********************************************************************************
--This source code is Copyright (c) 2007-2016
--     Author: Mike Eberhart
--
--I hereby release this code under the terms of the MIT License (for freeware).
--
--Permission is hereby granted, free of charge, to any person obtaining a copy
--of this software and associated documentation files (the "Software"), to deal
--in the Software without restriction, including without limitation the rights
--to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
--copies of the Software, and to permit persons to whom the Software is
--furnished to do so, subject to the following conditions:
--
--The above copyright notice and this permission notice shall be included in
--all copies or substantial portions of the Software.
--
--THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
--IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
--FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
--AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
--LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
--OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
--THE SOFTWARE.
--********************************************************************************
--**********************************************************************************************
-- BEGIN: SET-BASED Running-Subtotals Technique #2 - Multiple Simultaneous Running-Totals
--**********************************************************************************************
CHECKPOINT; --to produce a cold buffer cache
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS; -- Remove all buffers from the buffer pool / clear data cache
DBCC FREEPROCCACHE    WITH NO_INFOMSGS; -- Clears the procedure (plan) cache

SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

DECLARE @starttime DATETIME = GetDate();

--Accumulator variables
DECLARE @RunningTotal           MONEY = 0.0;
DECLARE @RunningTranDtSubtotal  MONEY = 0.0;
DECLARE @RunningProductSubtotal MONEY = 0.0;

--Our break-level-value variables (Data-Types match those of the columns we are comparing to)
--Initialize these to some values that will NOT exist in each break-column's actual data.
DECLARE @SubTotalBreakValue1    CHAR(8) = '18010101';
DECLARE @SubTotalBreakValue2    INT = -1;


--Our work-table, where we can ensure the ORDER with which we later access the data, being
--by default, the order of the PRIMARY KEY declared here in UniqueID column.
DECLARE @ResultsTable TABLE
(
    UniqueID                INT IDENTITY NOT NULL PRIMARY KEY,
    SubtotalBreakColumn1    CHAR(8),
    SubtotalBreakColumn2    INT,
    ActualCost              SMALLMONEY,
    RunningTranDtSubtotal   MONEY,
    RunningProductSubtotal  MONEY,
    RunningTotal            MONEY
);

--Insert all values to be totaled, into our work table in the REQUIRED BREAK-LEVEL(S) ORDER
INSERT INTO @ResultsTable
(
    SubtotalBreakColumn1,
    SubtotalBreakColumn2,
    ActualCost,
    RunningTranDtSubtotal,
    RunningProductSubtotal,
    RunningTotal
)
SELECT
    dbo.[udfConvertDateToString](TransactionDate),
    ProductID,
    ActualCost,
    0,
    0,
    0
FROM
    Production.TransactionHistory
ORDER BY
    TransactionDate, ProductID    --Insert into table in our subtotal-break order (IMPORTANT!)
;

--**********************************************************************************************
-- ALGORITHM EXPLANATION:
-- See SET-BASED RUNNING SUBTOTALS Example 1, which this builds upon as discussed 
-- in the comments therein.
--**********************************************************************************************
UPDATE
    @ResultsTable
SET
    @RunningTotal           = RunningTotal          = @RunningTotal + ActualCost,
    @RunningTranDtSubtotal  = RunningTranDtSubtotal =
        CASE 
            WHEN @SubTotalBreakValue1 = SubtotalBreakColumn1 
                THEN @RunningTranDtSubtotal + ActualCost
            ELSE ActualCost 
        END,
    @RunningProductSubtotal = RunningProductSubtotal =
    CASE 
        WHEN ( @SubTotalBreakValue1 = SubtotalBreakColumn1 AND 
                @SubTotalBreakValue2 = SubtotalBreakColumn2  ) 
            THEN @RunningProductSubtotal + ActualCost 
        ELSE ActualCost 
    END,
    @SubTotalBreakValue1    = SubtotalBreakColumn1,
    @SubTotalBreakValue2    = SubtotalBreakColumn2
;

--Output the results... format to fit printout better.
SELECT
    UniqueID,
    SubtotalBreakColumn1                    AS 'DateBreak',
    CONVERT(CHAR(6),SubtotalBreakColumn2)   AS 'ProdID',
    ActualCost,
    RunningTranDtSubtotal                   AS 'RunningDtSub',
    RunningProductSubtotal                  AS 'RunningProdSub',
    RunningTotal
FROM
    @ResultsTable;


PRINT dbo.[udfUtility_PrintElapsedRuntimeDebugInfo]('Custom Multiple-Running-Subtotals / Total Example', @starttime, DEFAULT);
--**********************************************************************************************
--Using the AdventureWorks DB (with SQL2005), you'll see a large set with data like this:
--**********************************************************************************************
--UniqueID DateBreak ProdID ActualCost   RunningDtSub      RunningProdSub    RunningTotal
---------- --------- ------ ------------ ----------------- ----------------- ---------------------
--1        20030901  3      0.00         0.00              0.00              0.00
--2        20030901  316    0.00         0.00              0.00              0.00
--...
--...
--25       20030901  534    0.00         0.00              0.00              0.00
--26       20030901  707    20.994       20.994            20.994            20.994
--27       20030901  707    20.994       41.988            41.988            41.988
--28       20030901  707    20.994       62.982            62.982            62.982
--...
--...
--75       20030901  707    34.99        1105.684          1105.684          1105.684
--76       20030901  707    34.99        1140.674          1140.674          1140.674
--77       20030901  708    34.99        1175.664          34.99             1175.664
--78       20030901  708    34.99        1210.654          69.98             1210.654
--79       20030901  708    34.99        1245.644          104.97            1245.644
--...
--...
--3942     20030901  999    388.7928     1674280.9813      7775.856          1674280.9813
--3943     20030901  999    388.7928     1674669.7741      8164.6488         1674669.7741
--3944     20030901  999    388.7928     1675058.5669      8553.4416         1675058.5669
--3945     20030902  3      0.00         0.00              0.00              1675058.5669
--3946     20030902  316    0.00         0.00              0.00              1675058.5669
--...
--...
--113437   20040903  929    37.086      1086.456          37.086             27307057.0617
--113438   20040903  930    43.0395     1129.4955         43.0395            27307100.1012
--113439   20040903  931    34.3455     1163.841          34.3455            27307134.4467
--113440   20040903  931    34.8705     1198.7115         69.216             27307169.3172
--113441   20040903  932    39.7635     1238.475          39.7635            27307209.0807
--113442   20040903  932    39.2385     1277.7135         79.002             27307248.3192
--113443   20040903  948    82.8345     1360.548          82.8345            27307331.1537


--**********************************************************************************************
-- END: SET-BASED Running-Subtotals Technique #2 - Multiple Simultaneous Running-Totals
--
-- Average run-time against source data count of 113K rows in AdventureWorks2012 is decent: 
-- 2980ms (3.0 seconds)
--**********************************************************************************************



--**********************************************************************************************
--**********************************************************************************************
-- BEGIN: alternative SET-BASED Running-Subtotal and Running-Totals Technique using 
-- "OVER" (windowing) functionality introduced in SQL-Server 2012.
--**********************************************************************************************
--**********************************************************************************************
CHECKPOINT; --to produce a cold buffer cache
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS; -- Remove all buffers from the buffer pool / clear data cache
DBCC FREEPROCCACHE    WITH NO_INFOMSGS; -- Clears the procedure (plan) cache

--DECLARE @starttime DATETIME = GetDate();
SET  @starttime = GetDate();

SELECT
    dbo.[udfConvertDateToString](TransactionDate) AS DateBreak, --show without zero'd HHMMSS info
    ProductID AS ProdID, 
    ActualCost, 
    SUM(ActualCost) OVER
        (
        PARTITION BY TransactionDate
        ORDER BY TransactionDate, ProductID
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS RunningDtSub,
    SUM(ActualCost) OVER
        (
        PARTITION BY ProductID
        ORDER BY TransactionDate, ProductID
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS RunningProdSub,
    SUM(ActualCost) OVER
        (
        PARTITION BY @@SERVERNAME
        ORDER BY TransactionDate, ProductID
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS RunningTotal
FROM
    Production.TransactionHistory
ORDER BY
    --Use @@SERVERNAME as first column in order by so we can create a "partition" of the ENTIRE set for Running-TOTAL.
    --ORDER BY requires non-constant-values for columns, and @@SERVERNAME yields a "constant" (without being flagged as such) 
    @@SERVERNAME,
    TransactionDate, 
    ProductID;

PRINT dbo.[udfUtility_PrintElapsedRuntimeDebugInfo]('OVER / PARTITION BY Multiple-Running-Subtotals / Total Example', @starttime, DEFAULT);
--SELECT DateDiff(ms, @starttime, GetDate()); --Display elapsed Milliseconds 
--**********************************************************************************************
-- END: "OVER" (WINDOWING FUNCTINO) BASED Runing-Total/Running-Subtotals Technique, 
-- which produces same result (NOTE: requires SQL-Server 2012+)
--
-- 1220ms (1.2 seconds) average: FAST!
-- Being "native" functionality in SQL-Server 2012, it is not surprising this approach produces
-- the SPEEDIEST EXECUTION TIME. 
--
-- But, for a non-native and/or pre-SQL2012 solution, our original approach we laid out (above)
-- is perhaps as performant as possible while being relatively easy to implement. 
--**********************************************************************************************



Tuesday, September 20, 2016

SQL-Server Set-Based Running Subtotals (i.e., Row-Level Accumulators) Example 1

SET-BASED Running Subtotals in SQL

Have you ever wanted to perform a set-based running-subtotals and/or running-totals operation in SQL-Server (i.e., have row-level accumulators store a running total or running subtotal?) Want the solution to not use cursors? Don't have access to SQL-Server 2012 or newer yet?

Here I present a cursorless method of implementing such functionality in a SQL-Server Query, Stored-Procedure, or User Defined Function (UDF) that works with all versions of SQL-Server (SQL-Server 2005, 2008, 2008r2, 2012, 2014, 2016...) without requiring SQL2012-specific OVER / PARTITION BY functionality. I have seen running totals done before, with varied methods, but what I have not seen much about elsewhere is how to do running subtotals. The code that appears below does both running totals and running subtotals. It is easily adapted to do multi-column break-value running subtotals, as noted in the code comments.

SQL-Server 2012 (or newer) Note: once SQL-Server 2012 became widely available, this example was updated to show how to implement the running totals and running subtotals using SQL-Server 2012 OVER and PARTITION BY (windowing functionality) clauses that are ANSI standard SQL available to Microsoft Transact-SQL query writers now. See the lower portion of the source code for a solution using these modern SQL features. Included are performance (run-time) comparison values for both solution methods.

Note: for this example, I used the SQL Server 2012 AdventureWorks sample database, which provided me with a test case of approximately 113,000 rows against which I perform the running-subtotal (by product level) logic and do my row-level accumulator manipulations (using SET logic). This query is relatively efficient and took just 2 seconds to execute within a VM (virtual machine) on my single-CPU Intel Core-i7 development desktop machine.

For a more complex running-totals / running-subtotals example that builds off this one, see Set-Based Running Subtotals (i.e., Row-Level Accumulators) Example 2 — Multiple Simultaneous Running-SubTotals, which builds upon the techniques shown here.

SQL-Server Source Code

--********************************************************************************
--This source code is Copyright (c) 2007-2016
--     Author: Mike Eberhart
--
--I hereby release this code under the terms of the MIT License (for freeware).
--
--Permission is hereby granted, free of charge, to any person obtaining a copy
--of this software and associated documentation files (the "Software"), to deal
--in the Software without restriction, including without limitation the rights
--to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
--copies of the Software, and to permit persons to whom the Software is
--furnished to do so, subject to the following conditions:
--
--The above copyright notice and this permission notice shall be included in
--all copies or substantial portions of the Software.
--
--THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
--IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
--FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
--AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
--LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
--OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
--THE SOFTWARE.
--********************************************************************************
--**********************************************************************************************
-- BEGIN: SET-BASED Running-Subtotals Technique #1
--**********************************************************************************************
CHECKPOINT; --to produce a cold buffer cache
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS; -- Remove all buffers from the buffer pool / clear data cache
DBCC FREEPROCCACHE    WITH NO_INFOMSGS; -- Clears the procedure (plan) cache

SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

DECLARE @starttime DATETIME = GetDate();

--Accumulator variables
DECLARE @RunningTotal    MONEY = 0.0;
DECLARE @RunningSubtotal MONEY = 0.0;

--Our break-level-value variable (Data-Type should match that of the column we are comparing)
--Initialize this to some value that will NOT exist in the break-column's actual data.
DECLARE @SubTotalBreakValue1    INT = -1;

--Our work-table, where we can ensure the ORDER in which we later access the data: by default,
--the order of the PRIMARY KEY declared here on the UniqueID column.
DECLARE @ResultsTable TABLE
(
    UniqueID                INT IDENTITY NOT NULL PRIMARY KEY,
    SubtotalBreakColumn1    INT,
    ActualCost              MONEY,
    RunningSubtotal         MONEY,
    RunningTotal            MONEY
);

--Insert all values to be totaled, into our work table in the REQUIRED BREAK-LEVEL(S) ORDER
INSERT INTO @ResultsTable
(
    SubtotalBreakColumn1,
    ActualCost,
    RunningSubtotal,
    RunningTotal
)
SELECT
    ProductID,
    ActualCost,
    0,
    0
FROM
    Production.TransactionHistory
ORDER BY
    ProductID    --Insert into table in our subtotal-breaking order (IMPORTANT!)
;

--**********************************************************************************************
-- ALGORITHM EXPLANATION:
-- We will be updating the row-level accumulators now through an UPDATE statement.
--
-- The subtotal logic relies on the fact that our break-value-detection variable(s) 
-- (@SubTotalBreakValue1,[...]) have their value(s) updated AFTER the current row's running 
-- total and subtotal variable values have had their values calculated and updated.
-- The sequence of variable and column updates here is important, since the variable(s) for
-- break-detection comparisons must have access to "prior row" values essentially, during the
-- CASE statement that determines whether to "reset" a running-subtotal variable at a break.
--
-- Additionally, you will notice the running-total and running-subtotal variables are SET at
-- the same time the table-column value is assigned, taking advantage of the order of operation
-- within the statement, whereby we can update both the variable(s) and the column(s) with any 
-- newly calculated running-total/subtotal(s).
--
-- This logic can easily be extended to support subtotal-breaks that rely on multiple-column 
-- values simply by:
-- 1) declaring additional break-value variables
-- 2) adding additional column(s) to the work-table for those break-levels and/or any
--     additional subtotal-accumulator column(s) you desire
-- 3) altering the INSERT (into work-table) to also ORDER BY the additional break-column(s)
-- 4) altering the CASE statement used in the UPDATE to compare more than one column prior to 
--    resetting the Subtotal accumulator.  E.g.,:
-- CASE
--  WHEN ( @SubTotalBreakValue1 = SubtotalBreakColumn1 AND
--         @SubTotalBreakValue2 = SubtotalBreakColumn2 AND...)
--   THEN @RunningSubtotal + ActualCost
--  ELSE ActualCost
-- END
-- (A commented-out multi-column sketch also follows the UPDATE statement below.)
--**********************************************************************************************
UPDATE
    @ResultsTable
SET
    @RunningTotal       = RunningTotal      = @RunningTotal + ActualCost,
    @RunningSubtotal    = RunningSubtotal   =
        CASE 
            WHEN @SubTotalBreakValue1 = SubtotalBreakColumn1 
                THEN @RunningSubtotal + ActualCost
            ELSE ActualCost 
        END,
    @SubTotalBreakValue1= SubtotalBreakColumn1
;
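
--**********************************************************************************************
-- OPTIONAL SKETCH (not part of the timed example): a hypothetical multi-column-break version
-- of the UPDATE above.  It assumes the work-table has been given an additional break column
-- (SubtotalBreakColumn2), a matching @SubTotalBreakValue2 variable has been declared and
-- initialized, and the INSERT orders by both break columns, per steps 1-4 described earlier.
-- It is commented out so this script still runs as-is.
--**********************************************************************************************
/*
UPDATE
    @ResultsTable
SET
    @RunningTotal       = RunningTotal      = @RunningTotal + ActualCost,
    @RunningSubtotal    = RunningSubtotal   =
        CASE
            WHEN @SubTotalBreakValue1 = SubtotalBreakColumn1
             AND @SubTotalBreakValue2 = SubtotalBreakColumn2
                THEN @RunningSubtotal + ActualCost
            ELSE ActualCost
        END,
    @SubTotalBreakValue1 = SubtotalBreakColumn1,
    @SubTotalBreakValue2 = SubtotalBreakColumn2
;
*/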

--Output the results...
SELECT 
    SubtotalBreakColumn1 AS ProductID, ActualCost, RunningSubtotal, RunningTotal 
FROM 
    @ResultsTable
ORDER BY
    UniqueID
;

--**********************************************************************************************
--Using the AdventureWorks2012 DB, you'll see a large result set with data like this:
--**********************************************************************************************
--ProductID ActualCost   RunningSubtotal RunningTotal
------------- ---------- --------------- ------------
--1         50.2635      50.2635         50.2635
--1         50.2635      100.527         100.527
--1         50.2635      150.7905        150.7905
--1         50.2635      201.054         201.054
--1         50.2635      251.3175        251.3175
--...
--... skip forward to first product break now...
--... 
--1         50.2635      2211.594        2211.594
--1         50.2635      2261.8575       2261.8575
--2         41.916       41.916          2303.7735
--2         41.916       83.832          2345.6895
--...
--... and take a look at the end of the entire result set now...
--...
--999       388.7928     286518.694      27306402.3709
--999       388.7928     286907.4868     27306791.1637
--999       0.00         286907.4868     27306791.1637
--999       539.99       287447.4768     27307331.1537


PRINT dbo.[udfUtility_PrintElapsedRuntimeDebugInfo]('Custom Running-Subtotal Example', @starttime, DEFAULT);
--SELECT DateDiff(ms, @starttime, GetDate()); --Display elapsed Milliseconds 
--**********************************************************************************************
-- END: SET-BASED Running-Subtotals Technique #1
--
-- Average runtimes against a source data count of 113K rows in AdventureWorks2012 are decent: 
-- 2100ms (2.1sec) when selecting the results for output
-- 1560ms (1.56sec) when calculating but not selecting results for output
--**********************************************************************************************



--**********************************************************************************************
--**********************************************************************************************
-- BEGIN: alternative SET-BASED Running-Subtotal and Running-Totals Technique using 
-- "OVER" (windowing) functionality introduced in SQL-Server 2012.
--**********************************************************************************************
--**********************************************************************************************
CHECKPOINT; --to produce a cold buffer cache
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS; -- Remove all buffers from the buffer pool / clear data cache
DBCC FREEPROCCACHE    WITH NO_INFOMSGS; -- Clears the procedure (plan) cache

SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

--DECLARE @starttime DATETIME = GetDate();
SET  @starttime = GetDate();

SELECT
    ProductID, 
    ActualCost,
    SUM(ActualCost) OVER
        (
        PARTITION BY ProductID
        ORDER BY ProductID
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS RunningSubtotal,
    SUM(ActualCost) OVER
        (
        PARTITION BY @@SERVERNAME
        ORDER BY ProductID
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS RunningTotal
FROM
    Production.TransactionHistory
ORDER BY
    --Use @@SERVERNAME as the first column in the ORDER BY so the ENTIRE set can be ordered as a single "partition".
    --ORDER BY requires non-constant values for columns, and this function produces a "constant" (without being flagged as such).
    @@SERVERNAME, 
    ProductID;


PRINT dbo.[udfUtility_PrintElapsedRuntimeDebugInfo]('OVER / PARTITION BY Running-Subtotal Example', @starttime, DEFAULT);
--SELECT DateDiff(ms, @starttime, GetDate()); --Display elapsed Milliseconds 
--**********************************************************************************************
-- END: "OVER" (WINDOWING FUNCTION) BASED Running-Total/Running-Subtotal Technique, 
-- which produces the same results (NOTE: requires SQL-Server 2012+)
--
-- 610ms average : FAST!
-- Being "native" functionality in SQL-Server 2012, it is not surprising this approach produces
-- the SPEEDIEST EXECUTION TIME. 
--
-- But, for a non-native and/or pre-SQL2012 solution, the original approach laid out (above)
-- is about as performant as possible while remaining relatively easy to implement.
--**********************************************************************************************
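
Side note: on SQL-Server 2012 and newer, the grand running total can also be written by simply omitting the PARTITION BY clause entirely, since the window then treats the whole result set as a single partition. A minimal sketch along these lines should yield the same RunningTotal column as the @@SERVERNAME approach above:

--Sketch: whole-set running total with no PARTITION BY at all (SQL-Server 2012+).
SELECT
    ProductID,
    ActualCost,
    SUM(ActualCost) OVER
        (
        ORDER BY ProductID
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS RunningTotal
FROM
    Production.TransactionHistory
ORDER BY
    ProductID;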



Tuesday, August 30, 2016

Apple Computer: EU / Ireland Taxation and Jobs in 2016

European Commission orders Apple to Pay €13 billion ($14.5 billion) in back taxes

This is certainly one of the bigger financial news stories today.  And, Apple's official response to this EU judgement is rather predictable.  As Business Insider (BI) reports:
"Apple's official statement on the EU ruling against its Irish tax arrangements tells you all you need to know about what is at stake: You can have taxes, or you can have jobs, but Apple is in no mood to deliver both."
Really? Can Ireland, or any other jurisdiction, truly choose between jobs and taxes?
If so, let's do the math and make the right choice: take the TAXES!

Jobs or Taxes: Taxes Win!


According to that same BI article:
"In the 15 years since [the 1991 tax deal with Ireland], Apple has created thousands of jobs in Ireland. By 2015 it had 5,000 employees in the country [Ireland]."
OK, now the math, people:
What is $14.5 billion divided by the number of current employees [i.e., apparently circa 5,000 employees] (never mind the fact that Apple didn't carry this many employees for the entirety of the 15 years in question), divided by that 15 years?  Well, it is a staggering $193,333 [€173,333] per employee per year for 15 years, and again, this assumes the same headcount for all that time, which was not truly the case.

So, forget the jobs and take the taxes! According to 2014 statistics, the average Irish worker only earned 34,465.85 Euros per year.  The Irish government could apply the €13 billion in taxes levied upon Apple toward, essentially, full career-length payouts to each and every one of those 5,000 current Apple workers, as this is enough money to pay each current worker nearly 87,000 Euros per year for the next 30 years!

The numbers are simply staggering!  Sure, you can argue that the periphery businesses built up around the Apple offices wouldn't get the same amount of business, but then again, with those 5,000 people having an income well over double the average annual Irish wage, they would have a lot of money to spend on all sorts of things.  Periphery businesses around larger businesses make quite a bit of their revenue from what employees spend, and these employees would still have plenty of income to spend, though perhaps not in the exact same neighborhoods.  As for any effect on the building of offices and facilities, those same people are probably likely to build homes and/or start businesses of their own (at least, the ones that don't simply want to do nothing for 30 years), and Ireland currently has a housing shortage too.

Update: I just decided that it may be easier for people to understand the magnitude of this Irish Apple computer company tax if I were to show how much tax €13 billion is if it were spread equally among every man, woman, and child in Ireland: it is still huge!  Ireland's population is approximately 6.4 million persons.  So, the math is quite easy, and it works out to just over €2,000 for each and every Irish Citizen... so a family of 4 would get €8000 in direct benefit from this proposed Apple tax settlement if it were simply distributed to the citizens equally.  That is extraordinary, and really demonstrates the size of the settlement, and how much tax has been saved, by Apple, at the apparent cost of the average citizen over the past decade or more.

US Tax Policy to Blame

The bottom line: this type of tax situation exists because the United States government, owned by and beholden to corporations and wealthy elites that can abuse existing tax law through shell companies and the like, refuses to implement a corporate tax on foreign-earned income that accrues and is owed immediately, just as tax on domestically-earned income is.  I.e., the USA needs to get rid of this ridiculous offshore zero-tax perpetual "holding-ground" for US corporate profits.

As a United States individual, you cannot simply park your earnings overseas and not pay taxes!  In fact, it is the extreme opposite.  Thanks to legislation like FATCA, you are instead burdened by extraordinary reporting requirements on any foreign income, and failure by an individual to declare their income and/or pay taxes upon any income recognized from offshore holdings/investments during a calendar year will lead to truly massive penalties and fines in addition to the tax owed.  So, why have corporations, and individuals that are able to use corporations and partnerships and the like to hold massive portions of their wealth overseas, been able to get such preferential treatment whereby their income is not taxable until they physically move it back into the United States?

Therein lies the obvious problem and the obvious reason that was already illuminated earlier: the US government tax-policy has been hijacked by powerful multinationals and ultra-wealthy individuals.


Monday, August 22, 2016

Tim Worstall: Master of Delusive Prattle Tries to Redefine Wealth


I may be mistaken with this first assumption:  I assume Tim Worstall is, at least moderately, intelligent.  Whether he is or not, Tim Worstall epitomizes the practice of using one's intelligence as a tool for developing and deploying subterfuge, to justify any and all nonsense one can come up with.

His latest article on Forbes, from 2016-Aug-19, is entitled: "CBO's Very Bad Report - It's Complete Nonsense That The Top 10% Hold 76% Of All US Wealth", and the title itself characteristically reflects Mr. Worstall's desire to employ his intelligence for the purpose of misleading readers.  His aim is quite clear: create uncertainty about any statistics that lend validity to positions that run contrary to his own personal socio-economic and political ideals.

I typically try to ignore Worstall's garbage, but I found myself so annoyed by the ridiculous assertions in his current article that I had to respond — though unlike Worstall, I don't have the benefit of the asymmetric media exposure that gives columnists like him an incredible advantage in readership and reach.  But, I have to try...

Worstall's Insane Assertion: Wealth is Not Counted Properly

Worstall states:
"It is true that they [the top 10% of people] have 76% of the wealth as it is being counted. But that’s the very problem, the manner in which the wealth is being counted." 
Why does Worstall assert that wealth is measured wrong? In summary, he contends that this 76% of wealth held by the few is not defined accurately because it is financial wealth and it does not include a couple things: the value of human capital and, to a lesser degree, the (assumed future) value of pensions and retiree benefit plans.

When most people are asked to define wealth, they typically arrive at a definition that sure sounds like financial wealth: the assets an individual currently holds, which, when enumerated, will almost invariably include things like cash, cash equivalents (instantly accessible holdings at a bank, etc.), physical property (the unencumbered portion of cars, homes, etc.), stocks and bonds, and most anything they feel can be reasonably quickly converted into cash.  These are core components of material prosperity, that is, of being "rich" by way of having accumulated assets that can readily be converted to cash, traded for other items of value, or pledged as collateral against loans.

Redefining Wealth

Now, according to Worstall's view of reality, financial wealth doesn't properly reflect true wealth and its distribution throughout society.  In order to market his absurd concepts and realize his agenda, he needs to convince people that "wealth" is not what most people consider it to be (i.e., financial wealth), but rather that it should additionally include unrealized-potential, of which, per his conjecture, there is a meaningful and measurable net-present-value (NPV) that will somehow dramatically alter the data and thus the statistical view of wealth distribution — and of course, he wants his proposed definition of wealth to somehow show that the "top" doesn't hold such a large proportion of all wealth.

There are so many holes in his proposed redefinition of wealth that I hardly know where to begin my analysis. Mr. Worstall apparently has an incredible ability to suddenly forget basic accounting and now confuse income-statement items with balance-sheet items. That is, within the foundation of his ideal new wealth-accounting mathematics there will be a ridiculous new condition where income is intentionally contorted into an asset.  Income and wealth (assets) are two completely different things, Tim.  I learned this very, very early on in life, as I am sure you did.

Even if you have a very high income, you may not be at all wealthy, depending on your spending habits.  Wealth is what remains of your accumulated income, as assets, after you have made your expenditures and offset your liabilities.  Assuming you started life with a zero balance, wealth is accumulated by repeatedly retaining a portion of your disposable income, overcoming the propensity to consume, and instead opting to fortify your personal balance sheet (i.e., your assets / wealth).

Unemployment Insurance = A Component of Wealth? Yeah, sure!

In making his pitch for a new definition of wealth, Tim even hurls this incredibly weak profession:
"Unemployment insurance has a value – all insurance policies have a value. That if I get fired I have an income is a source of wealth to me. But that’s not counted."
Well, duh, Tim... unemployment insurance is not counted as wealth because it is income, when actually collected, and that income is realized only upon becoming unemployed. It is further from being any part of wealth than the discarded aluminum cans lying beside the street as you walk or drive by: at least those cans can be collected and recycled for cash, without unemployment as a precondition to realizing the gain. Seriously, who falls for your baloney, Tim? If you are unemployed, even while collecting unemployment insurance, it is not very likely that you are accumulating wealth; just ask all the long-term unemployed during the recent great recession who were lucky just to be able to keep their house!

Going further, Tim even mentions food stamps as a source of wealth, stating that if he had no other income, perhaps he'd collect some $300 a month in food stamps or such, and that, looked at over a long time (he seems to ponder 20 years or so), the accumulated capital-value would be some $60-70K, of which some portion of this potential future income should somehow be considered "wealth" now.  But if that's the case, Tim, then per this imbecilic logic we would all accrue the exact same potential amount, since we each have the same 20 years in which to potentially collect it!

This logic is so transparently flawed and clearly designed to evoke a response from those who are against the concept of food stamps, welfare, etc.  That emotional response aside, just think his nonsense through a bit.  In Tim's world, even a newborn has a considerable chunk of "wealth" to be counted in the wealth-distribution curve statistics, as that baby is potentially entitled to an entire long life of future benefits — be it Social Security in the USA, food stamps, or whatever — and, there is certainly human capital lying there in a newborn's crib, which we need to also assign a NPV to according to Tim. This is simply a fantasy view of reality and certainly not remotely coupled to any financial reality.

Let's look at Tim's next major alleged uncounted "wealth" item while further deconstructing his absurdity...

Human Capital = Wealth?

At a corporate or business level, human capital definitely contributes something to intangible assets, though most often it is the resultant IP (Intellectual Property) and proprietary methods and processes derived or implemented by the employees that is the true asset.  This is out of scope here anyways, since Tim is talking about personal wealth. So, what is the true value of human capital when calculating personal wealth? 

Mr. Worstall cites a British Office for National Statistics (ONS) report that dared to mention the term "human capital" along with some proposed value of that capital, in such a way that it could apparently be bent to his liking:
"The ONS also produced figures for “human capital”, putting a monetary value on a person’s qualifications, age, health, personality and skills.
The value of the UK’s human capital increased by £890billion to £19.2 trillion last year"
Seeing that giant number (£19.2 trillion) must have immediately worked Tim into a frenzy... as he then deduces that "human capital is vastly more important as an economic number [compared to household financial wealth]", but the CBO (Congressional Budget Office) study —  which shows 76% of family wealth being held by the top 10% — didn't somehow allocate this giant human capital figure within their analysis of wealth distribution.  

Temporarily ignoring the fact that you cannot simply quickly exchange your lifetime human capital potential for financial wealth, let's contemplate its place in Tim's new definition of wealth.  Presumably, you are to arrive at an affirmative response to this question: if this human capital figure were somehow apportioned to every citizen correctly, would it change the financial wealth distribution curve? 

Valuing Human Capital

Make no mistake, there are noteworthy correlations between various aspects of individual humans, their human capital, their likely income production, and their eventual wealth accumulation.  As mentioned in his article, via citation, differences in education lead to differing degrees of financial wealth; it should be no real news that families headed by college-educated individuals end up with nearly four times the wealth of families headed by individuals with only a high-school education.  We should all have heard the well-known statistics by now that, e.g., acquiring a college degree leads to better pay. And, with a higher income, you have a much better chance of accumulating and retaining some of that income, beyond your propensity to spend, as assets, as wealth.

But, having a college degree does not make you wealthy. I know plenty of people with college degrees who have near-zero wealth, for the simple fact that they do not earn a high wage and/or they spend whatever income they receive, no matter how much that is.  Income and wealth are two completely different things, and claiming that human capital, assessed by applying some net-present-value calculation to attributes like your education (which merely imply potential earnings), captures part of your true "wealth" is again just an incredible misconception.

If you believe that your wealth is higher just because you have a college degree, due to the statistics of income potential associated with that degree, you might as well consider other, less equitable, income-correlation statistics and their implications.  It has been shown that your physical appearance will affect your earnings potential (just search Google for various studies / results).  Furthermore, your gender will dictate to some degree your potential earnings (the stats are clear: women earn less for the same role), and your race will predict your earnings potential too; here's a link to a 2016 Washington Post article that should help you get started with the underlying data behind these varied statements.  These types of study results are typical, and such data can be found everywhere.

Apparently, human capital "wealth", calculated as some proposed net-present-value amount, is going to depend on all sorts of truly sad statistics. Never mind that this whole human-capital current-"wealth" misconstruction is absurd.  But, if the idea were to be taken seriously, we'd all have to really get into the gutters and extract all the relevant actuarial filth... we'd even have to look at each person's genetics to see who has the longest or shortest life expectancy, contemplate all social behavior, and pay attention to personal health. Why?  Because when calculating an NPV, we need to know over what period our income assumptions apply, how much risk there is of those assumptions changing, etc.  This is all just nuts!  And, the end result would be a statistical caste system of sorts.

Then again,... while computing earnings potential, perhaps we should consider that America has become a country with very low economic mobility (CNN report) whereby if you are born poor you are likely to stay poor, and if born into wealth you are much more likely to remain in wealth. I.e., in the United States the economic class you are born into is likely the one in which you will stay... worse than many other developed countries. The stats are unreal (NYTimes, 2012 article), so much so that they even bear quoting here:


"... 42 percent of American men raised in the bottom fifth of incomes stay there as adults. That shows a level of persistent disadvantage much higher than in Denmark (25 percent) and Britain (30 percent) — a country famous for its class constraints.
Meanwhile, just 8 percent of American men at the bottom rose to the top fifth. That compares with 12 percent of the British and 14 percent of the Danes.

Despite frequent references to the United States as a classless society, about 62 percent of Americans (male and female) raised in the top fifth of incomes stay in the top two-fifths, according to research by the Economic Mobility Project of the Pew Charitable Trusts. Similarly, 65 percent born in the bottom fifth stay in the bottom two-fifths."

No matter how any sane person looks at it, income and wealth are two completely different things, and any argument that < future-earnings-potential = wealth(today) > is simply asinine, aside from perhaps assuming that real financial-wealth disparity will only increase over time, as borne out by existing statistics and socio-economic trends.  As such, I posit that if we assigned a "wealth" value to human capital when constructing wealth-distribution statistics, the end breakdown of this "wealth" would still be essentially identical to that of financial wealth, or perhaps even more concentrated (toward the top few). The data to justify this hypothesis is clear, as demonstrated above.

Proof of Human-Capital Value in the Present (or lack of it)

Sure, human capital is considered when applying for a loan, but make no mistake, it is not being accepted, at a personal level, as any type of "wealth" against which a loan is being granted. The fact you have a college degree may make a bit of a difference as to whether you get a loan, but what is really going to matter most to any banker/lender is your current financial wealth, current cash-flow, your income, and the demonstrated stability or trends in that income and your financial wealth.  The bankers may accept as collateral some of your other financial assets, but they are not going to simply let you get by on the fact you have future potential for income, outside of specialized discrete products like student loans.  

Perhaps the unsecured credit market (credit card companies and similar firms) cares about your human capital?  This is highly unlikely, and I'd argue that these firms actually demonstrate the fallacy of any current "wealth" attributable to human capital.  They grant credit based on current income, and they will instantly raise your rates if you lose your job and that income... you are now a credit risk.  This is not because of some giant shift in your "human capital" wealth; it is because they are not in the least bit concerned with that... they only care about your instantaneous cash flow, and whether you have any real financial wealth (i.e., assets that can provide the cash you need right now to pay your bills).

And, nowadays, even insurance companies (like car insurers) will raise your rates if you lose your income: your human capital didn't change, just your instantaneous income-production ability, and thus your credit rating, and somehow your perceived "risk" as a driver.  To argue that your human-capital "wealth" truly changed upon unemployment, wouldn't one first have to admit that there exists a real stigma, or bias, against the unemployed, strong enough to have a huge effect on your future income-production prospects, and thus on your supposed "wealth" when bringing forward a net present value of that income?

And, I can almost guarantee that your landlord, if you rent, couldn't care less about any supposed future-production-potential-valued-in-present-"wealth", but only about whether you can pay your rent right now in that universally accepted bit of wealth that truly matters: cash on hand.

Human Capital and Taxes

Next, if human capital is so important, why does the USA federal tax-code punish income resulting from the application of your human capital (i.e., you working) so much more than it punishes income resulting from putting existing financial wealth into play (i.e., investments)?  Regular income  —  wages earned from working at a typical job  —  is taxed at a much higher rate than income, within the same total earnings bracket, produced from dividends or capital gains. Well, aside from the most egregious example of a tax that exempts a select few, via the "carried interest" tax treatment, that is.

If Worstall can somehow distort income and the possibility of future income into a form of current wealth, then shouldn't that wage income also be treated just like the financial wealth income when it is employed in investments? Wealth is wealth, according to Tim.  So, why not lower the ordinary income tax rates for a married couple filing jointly, and making under $75K/year, from their current 15% down to where they match the zero percent tax that would accompany the same income if it were all produced from long-term capital gains [see table here]?

The simple fact is that there is a huge tax-code preference (favor; i.e., lower tax rate) for financial wealth derived from existing financial wealth as compared to income produced from human capital. Clearly the US tax code considers income from human capital to be something totally different from income produced from existing financial wealth.  The tax code is clearly telling us all that it is better, more advantageous by far, to have financial wealth upon which income can be derived than to produce income from wages.  That entire mess is a topic for another day, because it is almost certainly to blame for a large part of the real financial wealth-distribution issue, whereby most wealth is incredibly concentrated among the top few percent.

Pensions

Tim did mention one interesting bit regarding the measurement of wealth: how the funds held in a 401K are counted towards wealth, but a defined-benefit pension's future annual payout is not... again, my initial reaction is simply to say "duh, Tim"... a 401K, in theory, can be reasonably quickly converted into cash in the present (less a huge tax penalty), whereas the defined pension cannot (generally, to my knowledge).  Again, Tim intentionally confuses future income with what can be attributed to current wealth.

I would argue that, for a 401K to be counted fairly toward present financial wealth, its balance should be reduced by the amount of any applicable taxes and penalties. Aside from that, the 401K can be a current cash-equivalent asset whereas the defined pension plan cannot, unless some corporation wants to buy the rights to your pension (if it is even assignable) and cut you a check right now.  I am not sure about the logistics or legality of that... I will look into it more.

Conclusion

Perhaps in an upcoming blog or two I will dive deeper into thoughts on some of these topics. Worstall has clearly manipulated and distorted reality by way of his latest absurd assertions, surely in hopes of countering the widespread discussion surrounding the unfairness of the current tax and financial system, a system that has led to this massive redistribution of wealth from the masses to the top few.  Things will only get worse if people allow such nonsense to gain credibility.


Wednesday, August 17, 2016

Joe Kernen (CNBC) and his Private Sector Dreamland

This morning I was watching one of CNBC’s Squawk Box episodes where Joe Kernen (a co-host, nicknamed "The Kahuna" or such) was trying to debate with Mohamed El-Erian the merits of the private sector vs. the public sector when it comes to capital investment effectiveness. This particular episode has to do with Mohamed El-Erian's view that the US had better take 3 actions soon to avoid recession, one of which is embarking on large-scale infrastructure fixes to the nation's roads, bridges, and transportation systems, especially since the government's borrowing cost for any required capital is near zero now. El-Erian's other points concern tax reform and over-indebtedness (especially in things like student loans now), but that's another issue.

I want to focus on the infrastructure investment topic and some of what Joe Kernen said during this episode that I take issue with, specifically his statements about how wonderful and efficient the private sector is at deploying capital, as opposed to the government (aka, public) sector.  Readers, remember: don't confuse the public sector with publicly-traded private companies in this upcoming discussion.

Joe said [as closely as I could type it while watching the video] the following:
If you have a [private sector] business, ... whether you exist depends on you watching your P's and Q's and watching every penny, and not overpaying for things, ... you go out of business if you don't do that.... [a private sector business] treats capital frugally like it should be treated.  
Joe was telling Mr. El-Erian something along the lines of "... I'm just talking about a theoretical argument about whether government is as effective at deploying capital as the private sector," and it is clear that Joe believes the private sector is fantastically effective at deploying capital.

Private-Sector Capital Deployment Efficiency and Effectiveness? Sure...

Joe, you must be joking!
What world are you living in?  Surely you are not referring to the private businesses whose stocks adorn the Dow Jones, S&P or Nasdaq indices.  You need only look to any of your guests, El-Erian included, to quickly see how insane your bit about "treating capital frugally" is in this modern private-sector world, especially among the publicly-traded private sector businesses.   You are simply living in a dream land.

For starters, IF private sector firms were at all concerned about the effective or efficient deployment of capital, you wouldn't see them issuing pay packages worth tens or even hundreds of millions of dollars to senior executives (like, e.g., Mr. El-Erian reportedly received at Pimco, a subsidiary of Allianz).  And, I am waiting for someone to tell me how it is effective deployment of capital to secure such awesome talent that can only be had for these enormous sums, and how their superhuman management abilities deserve such pay — you know, their super-human ability to pay the average person in the organization one-three-hundredth of what they make, while laying off masses of those average persons because employees are just a terrible waste of capital when that money can go to their pay instead.  But, this insane executive-pay is just the starting point in a long list of inefficient deployment of capital in private firms.

How about all the cash parked overseas now by major companies?  Trillions!  And, even if it is sitting in negative-yielding government bonds, or otherwise near-zero-yield instruments, somehow that is effective or efficient use of all that capital?  Go ahead... make the obvious arguments over how it is more effective to park such massive sums overseas than to pay taxes on it in the USA, yada, yada...  well, the tax-code is certainly partially to blame, but then again, it ended up this way because these same corporations lobbied elected officials in order to write the tax law as it stands.  This needs to end.  Totally ineffective deployment of capital, on a grand scale.

Then come all the share buybacks being done to further enrich top insiders who have incredible pay deals tied to the stock price.  A share buyback is, in effect, a statement that the company has nothing better to do with its excess funds than repurchase its own shares, if you actually believe that is the case.

But therein lies the problem: the stats clearly show that corporations DO have better things to do with their capital, and they are obviously not treating it frugally, as Joe K's commentary says they should. Stock prices are soaring, yet companies do stock buybacks while their shares are already at very high values (hmmm... that sounds a bit like "overpaying" to me), while internal core-business capital investment, in things like machinery and equipment, is now at an astonishing low, a shortfall that is contributing to the overall productivity slowdown in the US economy.  As this article on SeekingAlpha pointed out, courtesy of the National Bank of Canada's Economics and Strategy economist:
"Borrowing by corporations for the purposes of stock buybacks instead of investment in machinery and equipment does little to enhance an economy's capacity for growth. We're getting more evidence of that in the US where the average age of fixed assets is the highest in half a century and productivity growth is the weakest on records."
Gee, that sure sounds like truly effective use of capital, Joe!  I think otherwise. Surely we can do better.

Corporations are sitting on mounds of cash, but yet instead of investing in core business assets (CAPEX), they waste their cash on stock buybacks, over-inflated executive compensation packages, and other unconscionably conspicuous deployment of their capital in ways aimed at self-enrichment and short-term stock price over long-term productivity and sustainable long-term returns to the average shareholder.

Somewhere along the line it has even become an accepted norm, apparently, that a full 10% (or more) of all corporate profits will go to just the few top C-level employees.  Seriously. Just pull up Google Finance and look at a few companies' total net profit values and compare it with what the top insiders are taking in compensation.  It is truly appalling.  Need some examples?  You can choose nearly any company... they all look similar (thus, I am not saying any of these companies are worse than any others... most all are terrible these days in this sense):

  • Try Lifeway Foods on Google Finance, which shows 16.16 million shares outstanding and EPS of 12 cents per share (thus total earnings of just under $2 million USD).  Now, go to Reuters and look at some of the top players' pay packages and share sales... a mere 3 individuals took home $3.3 million in basic compensation according to Reuters.  Wow!  Great use of capital, guys!  Ever heard of the average shareholder?  If you directed two thirds of that inflated executive pay towards the corporate bottom line, your earnings would more than double! And, presumably the share price would go up quickly if that profit accrued to shareholders and not just the entitled few insiders. And, you all own shares, which makes it even crazier that you are unwilling to risk your cash pay for mere share appreciation. What makes you all worth such great pay when your average shareholder has lost 50% in the past year?  If you have so much capital that paying such salaries is the only efficient thing to do, perhaps you should consider the average shareholder first, or maybe some capital improvements for expansion or productivity gains?
  • Gee, let's look at Capstone Turbine on Google Finance... wow... Reuters shows 4 guys pulling $2.2 million in basic compensation from a firm that loses money in a big way (net income: -$25 million annually).  Unreal.  Again, don't any of you overpaid, self-rewarding insiders see anything even slightly wrong with this situation?  Don't worry: Joe K. thinks you private-sector guys are doing wonderfully with your precious capital, if he truly includes you in his statement today.
  • And, so as not to leave out some big players, how about the likes of Chevron (G-finance site)... Google shows them losing 40 cents/share over the past 4 quarters, or roughly $760 million.  But, as you might now expect, the top insiders are doing wonders with that precious capital: there is a nice $50 million in otherwise useless capital that just had to be handed out to the five people for whom Reuters lists Chevron executive compensation.  Couldn't $50 million do anything at all to otherwise improve efficiency, even if it were R&D looking into efficiency gains?

There are so many examples of poor use of capital in private organizations, and of their "overpaying for things" (something Joe claimed they can't do and survive), that I could go on for weeks (and not only regarding excessive pay packages).  But, clearly, Joe Kernen's appreciation of the private sector, and its alleged ability to so effectively deploy capital, is due to some fantasy view he holds of modern corporate America versus a completely different reality.  Is the public sector really so terrible at capital deployment compared to all this? Maybe. But the bar set by the private sector doesn't really seem terribly high.