Archive | Big Data

The Abstraction of Apache Hadoop

February 21, 2013


I think Apache Hadoop is more or less synonymous with Big Data, but in what context?

Is it Apache Hadoop as in the framework?
Is it Apache Hadoop as in the platform?


Continue reading...

Big Data – Storage Mediums & Data Structures

February 13, 2013


My working title was Big Data, Storage Dilemma.

They say dilemma. I say dilemna. I’m serious. I spell it dilemna.

Big Data presents something of a storage dilemma. There is no one data store to rule them all.

Should different data structures be persisted to different storage mediums?


Continue reading...

The Structure of Big Data

January 31, 2013


First things first, all data is more or less structured. That being said, there are…

  • Structured Data
  • Semi-Structured Data
  • Unstructured Data

I tend to think of it as: data, composite or simple, with or without content. In that context, email is structured composite data (from, to, subject, date) with unstructured content (message body). The composite data is structured. The content is unstructured. Simple data, though, may or may not be structured. The ‘subject’ data is unstructured. The ‘to’ data is structured: it is composed of a local-part (username) and a domain.
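To make the email example concrete, here is a minimal sketch in Python using only the standard library. The sample message, the `Address` class, and the `parse_address` helper are made up for illustration; this is one way to model the idea, not a definitive one. It shows the structured composite data (the headers), the structured simple data (the ‘to’ address, split into local-part and domain), and the unstructured pieces (the subject and the message body).

```python
from dataclasses import dataclass
from email import message_from_string
from email.utils import parseaddr


@dataclass
class Address:
    """Structured simple data: an address has two well-defined parts."""
    local_part: str  # the username, before the "@"
    domain: str      # everything after the "@"


def parse_address(header_value: str) -> Address:
    """Hypothetical helper: split an address header into its parts."""
    # parseaddr turns 'Name <user@host>' into (display name, 'user@host').
    _, addr = parseaddr(header_value)
    local_part, _, domain = addr.rpartition("@")
    return Address(local_part, domain)


# A made-up message, for illustration only.
RAW = (
    "From: Alice <alice@example.com>\n"
    "To: bob@example.org\n"
    "Subject: Lunch on Friday?\n"
    "Date: Thu, 31 Jan 2013 09:00:00 -0500\n"
    "\n"
    "Hey Bob, are you free for lunch?\n"
)

msg = message_from_string(RAW)

# Structured composite data: named fields we can look up directly.
to = parse_address(msg["To"])

# Unstructured simple data: the subject is just free text.
subject = msg["Subject"]

# Unstructured content: the message body is free text as well.
body = msg.get_payload()

print(to.local_part, to.domain)
print(subject)
print(body)
```

The headers can be looked up by name and the address can be taken apart by rule, but nothing short of reading the subject or the body tells you what it means.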


Continue reading...

Big Data and the Flying V

January 29, 2013


Big Data in Theory

What is it? It’s big data. Right?

I’m not sure if I like the term Big Data. I think it’s right up there with the term Cloud.

I do, however, like the framework created by Doug Laney: Volume, Velocity, and Variety. It’s the de facto description of Big Data, and it predates the Big Data phenomenon. That, and I like both alliteration and the KISS principle. Who doesn’t?


Continue reading...