Gosh, it’s been almost a year since I last touched on the subject of JSON.

But a couple of articles crossed my path that I think are worth reflecting on, so I’m going to share them here:

  • TDWI had a good piece on the necessary evolution of data warehousing in the present era of denormalized data. Worth a read here.
  • Josh Willis has started a project called exhibit, which uses Hive’s UDF/UDTF to provide ‘nested SQL’ capabilities. I haven’t had time to explore this yet, but it is a creative approach.

The two pieces aren’t related beyond one principle: denormalization. Early on, many observers equated denormalization with schema-less-ness. The two ideas are related but distinct. When MongoDB offered a way to store JSON documents, the feature that everyone latched onto was that MongoDB did not enforce a schema for each document. So was born the idea of schema-on-read vs schema-on-write. But denormalization doesn’t require schema-less-ness: you can store denormalized data with a schema just as well.
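
To make that last point concrete, here is a minimal sketch in Python of a denormalized document that is still checked against a declared schema before it is accepted. Everything in it (the `ORDER_SCHEMA` layout, the `validate` helper, the order fields) is made up for illustration and is not taken from MongoDB, Hive, or either article.

```python
# Hypothetical schema for a denormalized order: the customer and the line
# items are nested inside the order instead of living in separate tables.
ORDER_SCHEMA = {
    "order_id": int,
    "customer": {"name": str, "city": str},
    "items": [{"sku": str, "quantity": int}],
}


def validate(value, schema):
    """Return True if value matches schema (an illustrative helper, not a library API)."""
    if isinstance(schema, dict):
        # A nested object must have exactly the declared keys.
        if not isinstance(value, dict) or set(value) != set(schema):
            return False
        return all(validate(value[key], schema[key]) for key in schema)
    if isinstance(schema, list):
        # A one-element list in the schema means "a list of this shape".
        return isinstance(value, list) and all(validate(v, schema[0]) for v in value)
    # Otherwise the schema entry is a plain Python type.
    return isinstance(value, schema)


good_order = {
    "order_id": 1001,
    "customer": {"name": "Ada", "city": "London"},
    "items": [{"sku": "A-1", "quantity": 2}, {"sku": "B-7", "quantity": 1}],
}

# Same nested shape, but one quantity is a string, so the schema rejects it.
bad_order = {
    "order_id": 1002,
    "customer": {"name": "Bob", "city": "Paris"},
    "items": [{"sku": "A-1", "quantity": "two"}],
}

print(validate(good_order, ORDER_SCHEMA))
print(validate(bad_order, ORDER_SCHEMA))
```

Calling `validate` at write time is schema-on-write; storing the document as-is and only interpreting it when queried would be schema-on-read. Either way the data stays denormalized.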

Obviously, this is just a small idea but one that requires more thought.

In the interim, I hope that all of you, my readers, had a Merry Christmas. Have a bright and Happy New Year too!

Update: Microsoft has just announced JSON support in SQL Server. There is a great post explaining their thinking and motivation here.