Cross-posted from The SQL Herald
By Joey D’Antoni
The PASS Business Analytics Conference is a new concept for PASS. We’ve seen Business Intelligence (BI) User Groups and even SQLSaturdays dedicated to this subset of PASS, but a whole conference? What is driving this demand? I can’t explain the whole industry, but I can at least provide some perspective from what I see through my window.
I don’t intend to start a debate between relational databases and NoSQL datastores; that’s a religious war I have no intention of jumping into. I’m also not going to abuse the terms "big data" and "data" in combination with some body of water (data pond, data lake, data ocean, etc.; seriously, who comes up with this stuff?). What I will talk about is how a relational database isn’t the right answer for every data set, and how relational databases from major vendors (especially with enough cores to run serious analytic workloads) are REALLY EXPENSIVE. So, given that much of my expertise is in infrastructure-based solutions, how did I end up presenting at BaCON?
My organization sees the changing landscape of data, and we generate and save TONS of data. We’re not always choosing the best path for our architecture. Since I’m on the architecture team, I started investigating alternative solutions like Hadoop and Hive for less structured, non-transactional data. To make this stuff easier to learn, it helped to have a use case that I could take from start to finish. I’m not by any means an expert in data analysis, but I am fortunate to be presenting with a great friend who is: Stacia Misner. So what are we going to talk about at BaCON?
Our data set represents about a week’s worth of set-top-box data from the largest cable provider in the US. We are going to discuss our data source and how we used Hadoop and then Hive to allow us to perform multiple types of analysis on the data in an extremely nimble fashion. From there, using Power View and some other tools, we see the impacts of various events on metrics such as viewer engagement and channel preferences.
For those of you who are SQL Server and/or Oracle professionals: this is a brave new world, but think of it like learning a new version of something. You are building on an existing skill set; you already do tons of data analysis in your job. This is just another step in the process, and it will be part of the skill set of the 21st-century data professional.