What is Meant by Big Data?
The term Big Data describes a volume of structured, semi-structured, and unstructured data collected within an organization that is so large it is difficult to process using common database management tools or traditional data processing applications. With extremely large datasets, organizations have difficulty creating, manipulating, managing, transferring, and querying the data. Big Data is also difficult to work with using most relational database management systems, business intelligence and analytics applications, and desktop statistics and visualization packages. These applications and systems can typically handle large datasets, but not the massive datasets that make up Big Data. Instead, Big Data may require massively parallel software running on tens, hundreds, or even thousands of servers concurrently.
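To make the idea of parallel processing concrete, here is a minimal Python sketch of the split, process, and merge pattern that such software relies on. It is only an illustration under stated assumptions: it uses threads on one machine in place of many servers, and the word-count task, the function names, and the sample lines are all hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Map step: count the words in one partition of the data."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, workers=4):
    """Split the lines into partitions, count each in parallel, then merge."""
    size = max(1, -(-len(lines) // workers))  # ceiling division
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_words, chunks):
            total.update(partial)  # reduce step: merge the partial counts
    return total


# Hypothetical sample data; a real workload would read from distributed storage.
sample = ["big data is big", "data about data"]
result = parallel_word_count(sample, workers=2)
```

A production system would distribute the partitions across separate machines and handle failures, but the shape of the computation is the same: independent work on each partition followed by a merge.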
An organization that has Big Data may have billions to trillions of records stored in its databases, file systems, and storage units, which could amount to terabytes (1000 gigabytes), petabytes (1000 terabytes), exabytes (1000 petabytes), or even zettabytes (1000 exabytes) of data. This data could come from many different sources, including the web, point of sale, customer contact centers, social media, mobile devices, transactional systems, and data warehouses. In addition, the data could be stored in structured formats within a database management system, semi-structured formats including XML, JSON, and log files, or unstructured formats including text, video, audio, graphics, e-mail messages, word processing documents, spreadsheets, presentations, webpages, and many other kinds of business documents.
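With decimal (SI) prefixes, each storage unit is 1000 times the one before it, so a zettabyte is 1000 exabytes, or one million petabytes. The short Python sketch below works through that arithmetic. The helper names are hypothetical, and the 1 KB average record size is an assumption chosen only to make the example easy to follow.

```python
# Decimal (SI) storage units, each 1000 times the previous one.
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes."""
    return value * 1000 ** UNITS.index(unit)


def human_readable(num_bytes):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


# Assumption: one trillion records averaging 1 KB each.
records = 10**12
estimated_size = human_readable(records * to_bytes(1, "KB"))
```

Under that assumption, a trillion records come to about one petabyte, which shows how quickly record counts in the billions to trillions reach the storage scales listed above.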
The term Big Data is believed to have originated with web search companies that had to collect, categorize, and index very large distributed aggregations of loosely structured data while still providing fast response times for all queries. Moreover, an organization that has Big Data may have billions to trillions of records, documents, or files stored in its databases, file systems, and storage units. This data could be captured from a variety of sources, including internal applications, the web, e-mail, instant messaging, point of sale, customer contact centers, social media, mobile devices, transactional systems, and data warehouses.
Gartner, Inc., the world’s leading research and advisory company, defines Big Data as high-volume, high-velocity, and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.
Types of Big Data Formats:
- • Structured data within a database management system (DBMS)
- • Documents (Word, Excel, PowerPoint, PDF, etc.)
- • Email Messages
- • Text Messages
- • Text Files
- • HTML Files
- • Video
- • Audio / Voice
- • Graphics
- • Log Files
- • Data Files
- • XML Files
- • JSON Files
- • Binary Files
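Semi-structured formats from the list above, such as JSON, XML, and log files, usually have to be parsed into a common record shape before they can be analyzed together. The following Python sketch shows one way this might look using only the standard library. The sample records, field names, and log line layout are invented for illustration and do not come from any particular system.

```python
import json
import re
import xml.etree.ElementTree as ET

# Hypothetical log layout: "<timestamp> <LEVEL> <message>"
LOG_PATTERN = re.compile(r"(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)")


def from_json(text):
    """Parse a JSON object into a dictionary."""
    return json.loads(text)


def from_xml(text):
    """Flatten the direct children of an XML element into a dictionary."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


def from_log_line(line):
    """Extract named fields from a log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


records = [
    from_json('{"user": "alice", "action": "login"}'),
    from_xml("<event><user>bob</user><action>logout</action></event>"),
    from_log_line("2024-01-01T12:00:00Z ERROR disk full"),
]
```

Each source ends up as a plain dictionary, so downstream analysis code can treat them uniformly. Unstructured formats such as video or audio need very different processing and are not covered by this sketch.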
Characteristics of Big Data:
Six unique characteristics have been determined that provide a context for describing the overwhelming challenges of Big Data initiatives. These characteristics explain why the implementation of Big Data initiatives is different and more complex than other information technology system deployments.
Benefits and Advantages of Analysis of Big Data:
The importance of Big Data is not about how much data an organization has in its possession or the amount of data it has access to. Rather, the importance of Big Data is about how effectively the organization analyzes its data and how the organization makes decisions from its data. The more effectively an organization can analyze its data, the more potential it has to make more informed decisions and be successful. And even though there are significant challenges in conducting Big Data initiatives, there are also significant benefits and advantages for conducting analysis on Big Data.
- • Improved decision making: Big Data analysis allows for deep understanding of data, enables development of predictive models, identifies trends and correlations in the data, and uncovers hidden patterns in the data.
- • Increased ability to predict results: Big Data analysis enables quantitative analysis of data and the capability to model scenarios based on the data.
- • Increased fraud and waste detection: Big Data analysis enables anomalies and outliers in data to be rapidly identified and corrected.
- • Enhanced data transparency: Big Data initiatives increase the amount of data that can be shared with users and increase the relevance of the shared data to the user who is accessing it.
- • Improved organization productivity: Big Data analysis helps identify the most important issues faced by an organization and helps determine the best actions to be taken to resolve issues.
- • Improved customer service: Big Data analysis provides a wealth of understanding of both customer sentiment and customer issues, and enables organizations to rapidly and proactively implement solutions to customer issues.
- • Improved market intelligence: Big Data analysis provides insights into both the current and future state of the environment in which an organization operates.
- • Increased risk analysis: Big Data analysis helps quantify the impact and significance of an organization’s threats and opportunities.
- • Faster error recognition: Big Data analysis enables an organization’s mistakes and errors to be rapidly identified and quickly remedied.
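As one illustration of the fraud and waste detection benefit listed above, the Python sketch below flags outliers in a list of numbers. The article does not name a specific method, so this example assumes the modified z-score based on the median absolute deviation, with the commonly used cutoff of 3.5. The function name and the transaction amounts are hypothetical.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    Assumption: the modified z-score (0.6745 * |x - median| / MAD) is used
    here as one common outlier test; other methods are equally valid.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # no spread around the median, so nothing can be scored
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical transaction amounts with one suspicious entry.
amounts = [20, 22, 19, 21, 23, 20, 500]
suspicious = find_outliers(amounts)
```

A median-based test is used because a single extreme value can distort the mean and standard deviation enough to hide itself. At Big Data scale the same test would be computed per partition or with approximate medians, which this sketch does not attempt.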
Hi, I agree with you. Big data is all the unstructured voluminous data that a typical social media tool generates. It could be likes, shares, tweets, videos etc. You are right, it is very difficult to analyze big data. Technology infrastructure like Hadoop is helping us.
But then the bigger challenges remain. The question of ‘when to analyse’ the data is a challenge. Another challenge is ‘asking the right questions’. Unless the right questions are asked, it is nearly impossible to measure such voluminous data.
Enjoyed reading through your article. Cheers. Ramkumar