15 December 2020

How to define a variable within a Microsoft SQL query

Sometimes a SQL query ends up needing an expression to be represented more than once. For example, some queries I've worked on have required a complex-to-calculate denominator that needs to be reused in multiple division calculations.

Usually one just duplicates the expression, but this is undesirable when:

  • the SQL for the expression is particularly long
  • or, particularly computationally expensive to calculate
I had long wished there was a way to embed a variable in the body of a select statement. Something like:

  t.numerator1 / denominator,
  t.numerator2 / denominator,
  mytable t
  demoninator = ....some complex calculation ...       -- 'let' is NOT valid SQL

Sadly the above is not valid SQL. However, it can be achieved as follows:

  t.numerator1 / v.denominator,
  t.numerator2 / v.denominator,
  mytable t
outer apply (
     demoninator = ....some complex calculation ...
) v

Or more generally, this can be used with multiple variables, and the variables can be used in subsequent joins, and other query clauses.

  mytable t
outer apply (
     variable1 = ...,
     variable2 = ...,
     variable3 = ...
) v
  mytable2 t2 on t2.id = v.variable1
  v.variable2 = ...
order by

There is an important caveat with this: the outer-apply manifests in the query execution plan as a nested loop join. However, this isn't necessarily a bad thing - rather it's representing the fact that the variables will always be calculated prior to other elements of the query being processed.

Note: don't go overboard with this method. But it can be a useful solution when duplication of expressions becomes problematic.

27 June 2020

Moving Average in Excel Power Query

Here's a general-purpose Power Query 'rolling' function that can be used to make a rolling averages / moving average, or any other aggregate function such as a rolling standard deviation, etc.

A self-contained example of a four-item moving average over a list of the numbers 1 to 10:

let Rolling = (list, n, fn) =>
    buffered = List.Buffer(list)
        if _<n-1
        then null
        else fn(List.Range(buffered, _-n+1, n))
  result = Rolling({1..10}, 3, List.Sum)

Alternatively, to make as a reusable function:

  1. Select the Data tab
  2. Click Get Data > From Other Sources > Blank Query
  3. In the Power Query Editor window, type the name Rolling into the top-right Name box
  4. Open Advanced Editor (on the Home tab)
  5. Enter the following query:
(list, n, fn) =>
    buffered = List.Buffer(list)
        if _ < n-1
        then null
        else fn(List.Range(buffered, _-n+1, n))

The Rolling function should now be defined, and usable in other queries, such as:
=Rolling( {1..10}, 5, List.StandardDeviation )

for a 5 day rolling standard deviation over the numbers 1 to 10.

26 March 2020

Determining the growth rate of a published chart

A friend drew my attention to this page comparing the Bay Area COVID-19 cases to New York. But it can be hard to judge visually how steep a log chart is.

To calculate the growth rates, for example of New York on the following chart:

  1. Take a screen capture
  2. Paste it into MSPaint, or similar
  3. Drag the selection box from one log-axis marker to the next.
  4. Use MSPaint's ruler indicator (bottom left corner) to see how high the selction is.
    1. E.g. in the following example, measure from the 1,000 grid line to the 10,000 grid line.
    2. The distance is 198 pixels for a 10x multiplication 
    3. Let's call this PixelsPerTenfold
  5. Pick two date lines around the slope being measured
    1. Start dragging the selection box from the point where the data line intersects the first date line.
    2. Drag to a point where the data line intersections the last date line.
    3. Use the ruler in the corner to measure the vertical distance.
    4. Call this PixelsIncreased
    5. In the example below it is 240 pixels.
  6. Apply calculations as follows:
PixelsIncreased  = 240
PixelsPerTenfold  = 198
Days = 3
Daily Multiplier = 10(PixelsIncreased / PixelsPerTenfold / Days) = 2.54
Daily % Increase = 100 * (DailyMultiplier-1) = 154%
Days to Double = loge2 / logeDailyMultiplier = 0.75 days
Days to 10x = loge10 / logeDailyMultiplier = 2.47 days

Exponential growth - Year 10 maths revision

"When will I ever need to use logarithms, or raising something to the the power of 1/n in real life". During Covid-19, it turns out. Since no one likes logarithms, I thought I'd just post a few formulas. The following should work on any system of exponential growth.

(Naturally, for Excel formulas, substitute in the value or cell reference for X, Y, T, etc)

FormulaExcel FormulaExample
Initial CasesX112  Aus # Mar10
Final CasesY2,431  Aus # Mar25
TimeT15  days
Multiplier over durationM = Y/X=Y/X21.71
Daily MultiplierMD = M1/T=(Y/X)^(1/T)1.23
Daily % Increase    =(MD -1) * 100=((Y/X)^(1/T)-1)22.8%  Format as %
Time to doubleT2X = loge2 / logeMD=LN(2)/LN((Y/X)^(1/T))3.38  days
Time to 10xT10X = loge10 / logeMD=(Y/X)^(1/T)/LN(10)11.22  days
Convert from 'time to double' to 'time to 10x'
T10X = T2X * loge10 / loge2=TDOUBLE * LN(10)/LN(2)
Convert from 'time to 10x' to 'time to double'
T2X =T10X * loge2 / loge10=TTEN * LN(2)/LN(10)
Convert from 'time to double' to 'daily multiplier'
MD = elog2 / T2X=EXP( LN(2) / TDOUBLE)
Apply a daily multiplier for N days

6 April 2018

Using NULLIF to avoiding duplicating long T-SQL sub expressions for safe division, exclusive between

I seem to find myself generating a lot of SQL from other higher-level query structures. And it's bugged me for a while that if I have some long SQL expression, and then need to perform certain operations such as division, or exclusive BETWEEN, then it gets repeated.

So it seems like the ISNULL and NULLIF functions are awesome for solving this kind of problem.

The safe null-if-zero division problem

If we want to divide by some-long-expression and want to return NULL if it is zero, then we might find ourselves writing:

CASE WHEN some-long-expression = 0 THEN NULL ELSE numeratorsome-long-expression END

But we can avoid including some-long-expression twice by using:

numerator / NULLIF(some-long-expression, 0)

The inclusive/exclusive BETWEEN problem

The BETWEEN clause gives us lower <= test-value <= upper, including both lower and upper in the test range. But sometimes we might need lower only, or upper only, or both, to be exclusive tests.

This can lead to duplicating long expressions by having to write:

lower < (some-long-expression) AND (some-long-expression) < upper

But we can use NULLIF to similarly avoid repeating the long expression being tested:

NULLIF( NULLIF( some-long-expression, lower), upper) BETWEEN lower AND upper

There's a few different combinations of scenarios we might need, as follows:

ISNULL(expr, lower) BETWEEN lower AND upper
expr BETWEEN lower AND upper
NULLIF(ISNULL(expr, lower), upper) BETWEEN lower AND upper
NULLIF(expr, upper) BETWEEN lower AND upper
NULLIF(ISNULL(expr, upper), lower) BETWEEN lower AND upper
NULLIF(expr, lower) BETWEEN lower AND upper
NULLIF(NULLIF(ISNULL(expr, some-midpoint) lower), upper) BETWEEN lower AND upper
NULLIF(NULLIF(expr, lower), upper) BETWEEN lower AND upper

13 November 2017

Jaycar Arduino 16 Key Touch keypad XC4602

So my ten year old daughter wanted to build a keypad enabled electric lock for her bedroom "to keep her little brothers out".  TLDR: go here and solder.

We picked up the Jaycar Arduino Compatible 16 Key Touch keypad module cat XC4602, which is based on the TTP229 capacitative touch sensor IC. The board's by duinotech, and the only discernible numbers are 1003010 on the board, and 8229BSF on the IC.

There's a downloadable Arduino demo on the Jaycar page, but we didn't have any luck with it.

To get all 16 keys working, we followed the helpful instructions posted by this guy. Yes, it does actually seem like you have to solder at least jumper 3 (yellow link on his post). That's the only one we did.

Then he's got a link to his code, but it's hidden down the bottom of the page. Direct link. You'll need to create yourself a Dropbox account to be able to download it though. The only other gotcha is that his sample code writes to serial at 115200 baud. Make sure you set your monitor to the same or you'll just get rubbish.

After that, it's happy days. Just follow his wiring, and sample code. I figured I'd put this post out there pointing to his blog because there's a number of other posts out there about this IC that were much less helpful.

Aside, we're using the Jaycar Electric Strike EL-973 (fail safe version), Jaycar cat LA5081, for the lock mechanism. It needs 12V to activate, so I figured we'd drive it like a motor with an IRF520 mosfet. It works a treat.

29 January 2017

Arduino Grove LCD RGB Backlight not working

The Grove Starter Kit is a fun way to start on Arduino for young kids, because components can be easily plugged in, rather than needing to poke components into a board.

The first time I tried to use the Grove LCD RGB Backlight board, I was disappointed that it seemed to not work. I could change the backlight color, however no text would appear.

The simple resolution: The LCD specs state that it requires 5V to work. The Grove Base Shield has a slide switch to select 3.3V or 5V. Sensibly, it came preset to 3.3V. Slide it over to 5V and the LCD text starts working. When all else fails, RTM.