In recent years, new information sources have emerged that generate unprecedented volumes of data. Data from social networks, mobile devices, sensors and GPS devices, along with photos and videos, is stored in databases that can reach petabytes or exabytes.
According to IBM, 90% of the information that exists today was created in the last two years. This growth has popularized the term Big Data, which is expected to become one of the most promising technology trends of the coming years.
What is Big Data?
Arguably, the main objective of analyzing Big Data is to transform vast amounts of uncorrelated data into something useful for decision making. The potential of Big Data is practically immeasurable, and practical applications can be found in almost all sectors. Some are intuitive, such as detecting complex trends to support decisions in financial asset markets, or anticipating natural or weather-related disasters. Others are more sociological: by analyzing text patterns on Twitter you can understand and predict the behavior and sentiment of a social group, or detect knowledge gaps among children in order to design curricula. Big Data can also raise our quality of life, for instance by correlating a multitude of digital medical tests to improve medical diagnoses.
Big Data Challenges
More and more companies are realizing that the large amounts of information they accumulate can play a critical role in management decision making and in the creation of new business. However, this is not an easy task, and several challenges must be overcome to achieve it.
Today, almost any organization that wants to implement Big Data encounters technology gaps. Conventional tools are not designed to get the most out of large volumes of information, so new investment in technology is needed. The challenges that Big Data poses to companies' IT departments include mass data storage, integration of multiple formats (text, documents, photos and videos, among others), incorporation of information from multiple sources (even from outside the organization itself), and processing and delivery of results in real time.
While these technologies are being integrated, it is essential to develop Big Data analysis techniques that are appropriate to each company’s objectives. Existing techniques are extremely varied, ranging from comparison-based pattern analysis to genetic algorithms that detect nonlinear trends. For example, applying one set of analysis techniques to a consumption data set can reveal commercial behavior that helps a business, while applying different techniques to the same data set can detect possible fraud against the government. Close collaboration between IT teams and the different business departments is therefore a key challenge in determining which Big Data analysis techniques are most useful for decision making in each sector of activity.
Even with an appropriate technology platform and well-honed analysis techniques, the available data sample may be incomplete. In that case the conclusions drawn from the analysis may be unreliable and lead to bad decisions. Each day about 350 TB of tweets are generated, which might suggest that Twitter is an excellent source of information on current societal trends. However, this is not entirely correct, because the trends observed correspond only to the part of society that uses Twitter. Much of the success of Big Data lies in the correct presentation and interpretation of the analyses by management teams, taking into account the information sources and their biases.
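The effect of a biased source can be illustrated with a small worked example. The sketch below is purely hypothetical: the two groups, their shares and their approval rates are invented numbers chosen for illustration, not real Twitter or census data, and the names `Group`, `naive_estimate` and `reweighted_estimate` are assumptions introduced here. It compares the opinion measured on a platform alone with the opinion obtained after weighting each group by its share of the whole population.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    population_share: float  # fraction of the whole society (hypothetical)
    platform_share: float    # fraction of the platform's users (hypothetical)
    approval_rate: float     # fraction of the group holding the opinion (hypothetical)


def naive_estimate(groups):
    """Approval rate as observed on the platform alone."""
    return sum(g.platform_share * g.approval_rate for g in groups)


def reweighted_estimate(groups):
    """Approval rate after weighting each group by its real population share."""
    return sum(g.population_share * g.approval_rate for g in groups)


# Invented figures: younger people are over-represented on the platform.
groups = [
    Group("under 35", population_share=0.40, platform_share=0.70, approval_rate=0.80),
    Group("35 and over", population_share=0.60, platform_share=0.30, approval_rate=0.30),
]

naive = naive_estimate(groups)            # what the platform data suggests
reweighted = reweighted_estimate(groups)  # what the whole society would show

print(f"Platform-only estimate: {naive:.2f}")
print(f"Population-weighted estimate: {reweighted:.2f}")
```

With these invented figures the platform suggests about 65% approval, while the population as a whole sits at about 50%. Reweighting is only one simple correction, and it requires knowing each group's real share of the population, which is itself information about the source and its biases.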
Finally, it is worth mentioning the new challenges with regard to legislation. In a highly globalized environment, processing data from sources in different countries must comply with international data protection laws. One of the greatest potentials of Big Data is the combination of data from different sources, whether businesses or Public Administrations, since the larger the data sample, the more accurate the analyses will be and the more applications will emerge.
Big Data analysis is a new trend in information use that can generate benefits for both companies and society as a whole, given its significant number of practical applications. However, the technical challenges it poses are not easy and require substantial investment, so the benefits may not be significant in the short term. The effects of data protection legislation are harder to determine, but they can also hinder the realization of the full potential of Big Data.
Manager in the Corporate Data Warehouse, BBVA, Madrid (Spain)