Gosh, it’s been almost a year since I last touched on the subject of JSON.
But a couple of articles crossed my path that I think are worth reflecting on, so I’m going to share them here:
- TDWI had a good piece on the necessary evolution of data warehousing in the present era of denormalized data. Worth a read here.
- Josh Willis has started a project called exhibit, which uses Hive’s UDFs and UDTFs to provide ‘nested SQL’ capabilities. I haven’t had time to explore this yet, but it is a creative approach.
The two pieces aren’t related beyond one shared principle: denormalization. Early on, many observers equated denormalization with schema-less-ness. But the two concepts are distinct, if related. When MongoDB offered a way to store JSON documents, the key feature that everyone latched onto was that MongoDB did not enforce a schema for each document. So was born the idea of schema-on-read vs. schema-on-write. But denormalization doesn’t require schema-less-ness. You can clearly store denormalized data with a schema: a nested record can have its fields and types declared and enforced just as a flat table row can.
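To make that last point concrete, here is a minimal sketch in Python of denormalized data that still has a schema. The `Order` and `LineItem` types and the `parse_order` helper are hypothetical names made up for illustration; they don’t come from MongoDB, Hive, or exhibit. The order embeds the customer name and its line items in a single nested record, and the shape is checked on write.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class LineItem:
    sku: str
    quantity: int


@dataclass(frozen=True)
class Order:
    order_id: int
    customer_name: str     # customer detail embedded, not a foreign key
    items: List[LineItem]  # child rows nested inside the parent record


def parse_order(doc: Dict[str, Any]) -> Order:
    """Check a nested JSON-style document against the schema (hypothetical helper)."""
    if not isinstance(doc.get("order_id"), int):
        raise ValueError("order_id must be an integer")
    if not isinstance(doc.get("customer_name"), str):
        raise ValueError("customer_name must be a string")
    raw_items = doc.get("items")
    if not isinstance(raw_items, list):
        raise ValueError("items must be a list")

    items = []
    for raw in raw_items:
        if not isinstance(raw, dict):
            raise ValueError("each item must be an object")
        if not isinstance(raw.get("sku"), str) or not isinstance(raw.get("quantity"), int):
            raise ValueError("each item needs a string sku and an integer quantity")
        items.append(LineItem(sku=raw["sku"], quantity=raw["quantity"]))

    return Order(doc["order_id"], doc["customer_name"], items)
```

A document such as `{"order_id": 1, "customer_name": "Ada", "items": [{"sku": "A-1", "quantity": 2}]}` passes, while one with a string `order_id` is rejected with a `ValueError`. The data is just as denormalized either way; the only difference is whether the shape is checked when it is written or left for the reader to sort out.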
Obviously, this is just a small idea but one that requires more thought.
In the interim, I hope that all of you, my readers, had a Merry Christmas. Have a bright and Happy New Year too!
Update: Microsoft has just announced JSON support in SQL Server. There is a great post explaining their thinking and motivation here.