Analyzing 5 Big Data premises

Big Data

In the previous post I enumerated 5 premises about Big Data:

  • Integrate high volumes of transactional and interaction data.
  • Trust your data.
  • Provide self-service to users: analysts, developers, data stewards, project owners and business users.
  • Adaptive service.
  • Administration of metadata.

Let’s look at them in more detail.

High volumes of data

Big Data requires dealing with high volumes of data, but these volumes can be higher or lower depending on the frequency of data generation. So, if you have a small business where you store every transaction, the amount of data will be very different if you capture only 10 operations/day than if you capture 10,000. Of course, as time goes by, the volume of data will also grow.
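To put numbers on that difference, here is a minimal sketch that accumulates both daily rates over time. The two rates come from the example above; the 1 KB record size is purely my assumption for illustration, and the function names are hypothetical.

```python
# Assumed average size of one stored transaction (illustrative only).
ASSUMED_RECORD_BYTES = 1_000


def accumulated_records(ops_per_day: int, days: int) -> int:
    """Total records stored after `days` days at a constant daily rate."""
    return ops_per_day * days


def accumulated_bytes(ops_per_day: int, days: int,
                      record_bytes: int = ASSUMED_RECORD_BYTES) -> int:
    """Approximate raw storage, ignoring indexes and compression."""
    return accumulated_records(ops_per_day, days) * record_bytes


if __name__ == "__main__":
    for rate in (10, 10_000):
        for days in (1, 365):
            total = accumulated_records(rate, days)
            size_mb = accumulated_bytes(rate, days) / 1_000_000
            print(f"{rate:>6} ops/day over {days:>3} days: "
                  f"{total:>9} records, ~{size_mb:.2f} MB")
```

Under these assumptions, the small shop ends the year with a few thousand records, while the busier one ends it with a few million, so the same business model can land in very different storage territory.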

Is it so important to have a high volume of data? I think the higher, the better, but I would also put some effort into validating the data and into how much you can trust it.

My theory is that a big part of the success in transforming data into information is due to the veracity of the data. Of course, the higher, the better!
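As an illustration of what "putting effort into validation" could look like, here is a small sketch that checks a transaction record before it is stored. The record layout (`id`, `amount`, `timestamp`) and the rules are my own assumptions, not a fixed schema.

```python
from datetime import datetime


def validate_transaction(record: dict) -> list[str]:
    """Return the problems found; an empty list means the record passed."""
    problems = []

    # Required fields (hypothetical schema).
    for name in ("id", "amount", "timestamp"):
        if name not in record or record[name] in (None, ""):
            problems.append(f"missing {name}")

    # The amount must be a positive number (bool is rejected explicitly).
    amount = record.get("amount")
    if amount not in (None, ""):
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            problems.append("amount is not a number")
        elif amount <= 0:
            problems.append("amount must be positive")

    # The timestamp must parse as an ISO 8601 string.
    ts = record.get("timestamp")
    if isinstance(ts, str) and ts:
        try:
            datetime.fromisoformat(ts)
        except ValueError:
            problems.append("timestamp is not ISO 8601")

    return problems
```

Records that come back with a non-empty list can be rejected or set aside for review, so that only data you can trust reaches the analysis stage.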

However, the biggest Big Data challenge is how to store all the data generated (of any type) about every aspect of a business. Things are improving in this area!

Big Data actors

Another important aspect of Big Data, and one that should be improved, is the actors: the people involved in the Big Data process. There are several actors:

  • The analyst should provide a detailed analysis of the database, the format for inserting data, the insertion process, backups, the preparation of data and queries for statistical analysis, and more.
  • The programmer should build tools to accomplish the business's objectives, and must provide a nice GUI so that all users can input data and see results.
  • The statistician must pose hypotheses and give meaning to the results.

Every actor requires data tailored to their needs.

Data about the data: metadata

Metadata is another important issue. It is sometimes overlooked, but it’s always good to know the source of your data.
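One simple way to keep the source from slipping past is to store it next to the data itself. The sketch below is only an illustration: the class names, fields and the example source name are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class SourceMetadata:
    """Describes where a dataset came from (hypothetical fields)."""
    source: str        # system or file the data was taken from
    collected_by: str  # person or job that collected it
    collected_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass
class Dataset:
    """Rows of data that always travel together with their metadata."""
    rows: list
    metadata: SourceMetadata


sales = Dataset(
    rows=[{"id": 1, "amount": 19.99}],
    metadata=SourceMetadata(source="pos_terminal_export",
                            collected_by="nightly_job"),
)
print(f"{len(sales.rows)} row(s) from {sales.metadata.source}, "
      f"collected by {sales.metadata.collected_by}")
```

Because the metadata is a required part of `Dataset`, nobody can create a dataset without stating where it came from.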

Have a nice day!