BIG DATA

AN ENDLESS WORLD OF POSSIBILITIES

Big Data

The term “Big Data” was originally associated with three key concepts: large volume of data, variety and complexity of data, and speed of processing.

Over the last decade, the amount of data available has grown exponentially as new types of data sources have been added, such as mobile phones and wearables, social networks, RFID tags, cameras, vehicles, TVs and IoT devices, among others. With such a huge volume of data now available, the concept of “Big Data” has evolved and is now more closely associated with the strategies, methods and tools used to process it.

Is Big Data replacing Business Intelligence?

No, quite the opposite: they complement each other very well. The main difference from traditional BI is technical: how the data is stored, which tools and architecture are used and, finally, which strategies are applied.

Both concepts share the same purpose of “making relevant and reliable information available, to the right people, at the right time” in order to support better decision-making, as quickly and efficiently as possible.

Big Data Cycle
Big Data vs Business Intelligence

However, processing huge volumes of data of different types and origins, much of it unstructured, in near real time requires different strategies and tools from those of traditional BI.

What is this unstructured data? E-mails or text strings, photographs, videos, audio recordings, geo-references, sensor values and the results of applying AI to any of these, among others.
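To make the distinction concrete, here is a minimal Python sketch that contrasts a structured record with a piece of unstructured text and derives a couple of structured fields from the latter. The sample e-mail, the `extract_fields` helper and the regular expressions are all assumptions made for illustration; they are not a general-purpose parser.

```python
import re

# Structured: fields are known in advance and fit a fixed schema.
structured_record = {"order_id": 1042, "amount": 59.90, "currency": "EUR"}

# Unstructured: free text with no predefined schema (hypothetical e-mail).
email_body = (
    "Hi, my order #1042 arrived late. "
    "Please write to ana@example.com when you have news."
)


def extract_fields(text):
    """Pull a few structured fields out of free text with regular expressions."""
    order = re.search(r"#(\d+)", text)
    email = re.search(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)
    return {
        "order_id": int(order.group(1)) if order else None,
        "email": email.group(0) if email else None,
    }


fields = extract_fields(email_body)
print(fields)
```

Once fields like these have been extracted, the message can be linked to the structured order record through the shared `order_id`.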

"Not everything that counts can be counted and not everything that can be counted counts".

Albert Einstein

Architecture and Big Data Tools

The information management cycle consists of four main stages: data collection, storage, processing and application. In Big Data in particular, the architecture is characterised by:

Distributed file systems

To facilitate the storage of large volumes of raw data

NoSQL databases

For the storage of unstructured data

In-memory databases

Needed to process data with these characteristics quickly enough

Together, these components address the three concepts that originally characterised Big Data: volume, variety and speed.
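As a rough illustration of the four stages of the information management cycle (collection, storage, processing and application), the following Python sketch walks a handful of hypothetical events through each one. The function names and the sample data are assumptions made for the example. A plain dictionary stands in for a NoSQL document store and for in-memory processing; it does not represent any particular product.

```python
import json
from collections import defaultdict
from statistics import mean


def collect():
    """Collection: gather raw events from heterogeneous (hypothetical) sources."""
    return [
        '{"source": "sensor", "device": "t-1", "value": 21.5}',
        '{"source": "sensor", "device": "t-1", "value": 22.5}',
        '{"source": "social", "text": "Great service today"}',
        '{"source": "sensor", "device": "t-2", "value": 18.0}',
    ]


def store(raw_events):
    """Storage: keep schema-less documents grouped by source.

    A dict of lists stands in for a NoSQL document store here.
    """
    documents = defaultdict(list)
    for line in raw_events:
        doc = json.loads(line)
        documents[doc["source"]].append(doc)
    return documents


def process(documents):
    """Processing: average the sensor readings per device, in memory."""
    readings = defaultdict(list)
    for doc in documents["sensor"]:
        readings[doc["device"]].append(doc["value"])
    return {device: mean(values) for device, values in readings.items()}


def apply_insight(averages, threshold=20.0):
    """Application: list the devices whose average exceeds the threshold."""
    return sorted(d for d, avg in averages.items() if avg > threshold)


averages = process(store(collect()))
hot_devices = apply_insight(averages)
print(averages)
print(hot_devices)
```

In a real deployment each stage would typically be handled by a dedicated tool, but the flow of data from one stage to the next stays the same.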

The 3 problems of Big Data

Various technological ecosystems can be assembled to develop a Big Data solution. The possibilities are endless, but it is worth highlighting Apache Hadoop, MongoDB, Jupyter, Apache Spark, Elasticsearch, Apache Storm and Apache Kafka, which are probably among the most widely used tools today.

"I only trust the statistics I have manipulated".

Winston Churchill

Challenges in Big Data

The quality of the data and the fairness of the code, the security and confidentiality of the system, and the ethics with which the information is treated are the most important challenges in Big Data.

Not all data generates information, but all data has a cost associated with its collection, processing, storage and eventual disposal. Currently, organisations use less than 2% of their data in the decision-making process, and it is estimated that 25% to 30% of the total volume of data could contribute value to that process.
