NULL and Aggregation (AVG())


While we create the database architecture, we decide to use the special value NULL or replace it with empty string or zero. Later on the development stage we manipulate the data and when we need to aggregate, we have to keep in mind the NULL or “blank” columns. The funny part is while using AVG().

Create a temp table in my playground.

Aggregate the data in a table that respects the NULL value.

1

A message “Warning: Null value is eliminated by an aggregate or other SET operation.” appears to tell us that the NULL value is not included in the aggregations. The numbers are good:

  • MIN of column A is 5; MAX is 15
  • AVG of column B is 5.5 –> ((1 + 10) / 2)…

Simulate a table that doesn’t respect the NULL value and uses empty string and zero instead.

Run the same example

2

There is no warning message, but the numbers…

  • MIN of A is 0, but 0 replaces NULL (Nothing) and infact it is not an actual number, representing the data; MAX is OK (if we store only negative values, all of them less than 0, it will be wrong)
  • AVG of B is wrong, as 0 isn’t representing a real data again –> (0 + 1 + 10) / 6 = 1.83

Conclusion:
SUM is OK as adding 0 to any summing doesn’t change it.
MIN and MAX are OK in some cases, when they don’t contradict the business logic.
AVG is the aggregate that needs a special attention as we are not sure which of the values are real zero and which are replacements of NULL.

Keep it simple :-)

Leave a comment

Your email address will not be published. Required fields are marked *