What is Big Data? or What is Big Data to me? The dot-to-dot story

Well, Big Data may be overhyped, but it is here. I still often meet customers and peers who don’t understand the concept and just throw around buzzwords: “data is an asset”, Volume, Velocity and probably another 14 new V definitions.
Therefore, I will try to explain what Big Data is for me and why it is so cool to deal with analytics and big problems. Well, in the beginning there was data…
Forget about ACID and the CAP theorem for a moment; they are fundamental to understand, but they are not enough.

When my son was 5 years old, he enjoyed fun activities, and one of his favorites (before the DS came into our lives) was solving dot-to-dot puzzles. We bought him a 1-30 dot-to-dot workbook, and he enjoyed practicing numbers and discovering fun pictures.
One of the benefits of dot-to-dot activities is that they improve children’s motor skills and hand-eye coordination (one of the things I discovered when I researched it).


One day, my lovely wife noticed that his workbook was almost completed, so she went online to Amazon and ordered a new dot-to-dot workbook with a dinosaur theme (again, this was before the Ninjago and Pokemon times).

When he got the workbook, he was so excited. A little later, while I was sitting in my home office working, he approached me with tears in his eyes and said: “Daddy, it is too hard, I need your help”.
When I looked at his workbook, I was amazed: the workbook he got was an Extreme Dot-to-Dot, and every two pages had over 1,400 dots.

I laughed so hard and asked my wife what in the world she was thinking when she bought it. She was just as amazed: she hadn’t noticed the number of dots, she was simply looking for a dinosaur dot-to-dot.


We explained to Jonathan that this workbook is for ages 8 and up and that he is not supposed to do it, and that whether he is 8, 18 or 38, we will make him do it if he won’t behave.

That night, while reading and working on one of my projects, dealing with analytics, NoSQL and all the good stuff, it hit me: dot-to-dot!!! This is what Big Data is for me.

Since that day, I carry the two dot-to-dot workbooks with me. In the sessions I present, I show the 1-30 dots workbook and ask the audience to solve one picture or estimate how much time it would take them. The common answer is 20-30 seconds.

Then I hand over the 1,400 dots puzzle and ask them to solve it or assess how much time it would take them. Let’s say that it takes more than one minute :-)

With Big Data, we expect to solve the 1,400 dots in the same time it takes us to solve the 30 dots (mmmm, someone mentioned MapReduce). The logic is the same simple logic of following sequential numbers, but the insight is different (a different picture for every problem).
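For the curious, here is a minimal sketch of that idea in Python. It is my own toy illustration, not a real MapReduce job: the puzzle generator `make_dots`, the helper names and the choice of 4 workers are all made up for the example. The numbered dots are split into contiguous ranges, each worker connects its own range (the map step), and the partial drawings are stitched back together in order (the reduce step).

```python
from concurrent.futures import ThreadPoolExecutor


def make_dots(k):
    """Hypothetical puzzle: dot number i sits at an arbitrary (x, y) position."""
    return {i: (i, (i * i) % 7) for i in range(1, k + 1)}


def connect_chunk(dots, start, end):
    """Map step: draw the segments start->start+1, ..., end-1->end."""
    return [(dots[i], dots[i + 1]) for i in range(start, end)]


def solve(dots, workers=4):
    """Split the dots into ranges, connect each range, then merge in order."""
    k = len(dots)
    # Ceiling division so that the k-1 segments are spread over the workers.
    step = max(1, (k - 1 + workers - 1) // workers)
    # Neighbouring ranges share one dot, so no segment is lost at the seams.
    bounds = [(s, min(s + step, k)) for s in range(1, k, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns the results in the same order as bounds.
        parts = list(pool.map(lambda b: connect_chunk(dots, *b), bounds))
    # Reduce step: concatenate the partial drawings into one picture.
    picture = []
    for part in parts:
        picture.extend(part)
    return picture


small = solve(make_dots(30))
extreme = solve(make_dots(1400))
print(len(small), len(extreme))
```

The code that connects the dots is identical for both workbooks; only the number of dots handed to each worker changes. Python threads will not really speed up a job this tiny, so take it as a picture of the structure rather than a benchmark.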

To me it is symbolic, and I use this analogy a lot. Some argue that the problem is complex but is not Big Data, as it doesn’t involve velocity. My answer is that the dot-to-dot problem is an N-dimensional problem with K dots, one that requires you to discover something new and wait for the A-Ha moment!