What is Meant by Big Data?

Big Data, Front Page Nov 13, 2013 1 Comment

The term Big Data describes a volume of structured and unstructured data collected within an organization that is so large that it is difficult to process using common database management tools or traditional data processing applications. With extremely large datasets, organizations have difficulty creating, manipulating, managing, transferring, and querying the data. Big data is also difficult to work with using most relational database management systems, business intelligence and analytics applications, and desktop statistics and visualization packages. These applications and systems can typically handle large datasets, but not the massive datasets that make up big data. Instead, big data could require massively parallel software running on tens, hundreds, or even thousands of servers concurrently.
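To make the idea of massively parallel processing a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: worker threads on a single machine stand in for separate servers, and the record layout (a dictionary with a "source" field) and the function names are made up for this example. The pattern is the same one real systems use at much larger scale: split the data into shards, process each shard independently, then merge the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(records, shard_count):
    """Deal the records out round-robin into shard_count smaller lists."""
    return [records[i::shard_count] for i in range(shard_count)]


def count_by_source(shard):
    """Work done independently on one shard: count records per source."""
    return Counter(record["source"] for record in shard)


def parallel_count(records, shard_count=4):
    """Count records per source by processing the shards concurrently.

    Threads are used here only as a stand-in for separate servers
    (an assumption made for illustration).
    """
    shards = split_into_shards(records, shard_count)
    total = Counter()
    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        # Each worker handles one shard; the partial counts are merged here.
        for partial in pool.map(count_by_source, shards):
            total.update(partial)
    return total


if __name__ == "__main__":
    # Hypothetical sample records, not real data.
    sample = [{"source": "web"}] * 3 + [{"source": "mobile"}] * 2
    print(parallel_count(sample))
```

Because each shard is counted without looking at any other shard, adding more workers (or, in a real deployment, more servers) lets the same job cover more data in roughly the same amount of time.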

An organization that has Big Data may have billions to trillions of records stored in its databases, file systems, and storage units, which could comprise petabytes (1 petabyte = 1,024 terabytes) or exabytes (1 exabyte = 1,024 petabytes) of data. This data could come from many different sources, including the web, point of sale, customer contact centers, social media, mobile devices, transactional systems, and data warehouses. In addition, the data could be stored either in structured formats within a database management system or in unstructured formats that include text, video, audio, graphics, e-mail messages, word processing documents, spreadsheets, presentations, webpages, and many other kinds of business documents.
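The unit arithmetic above can be checked with a short Python sketch. It uses the same 1,024-based units as this article; the function names and the average record size in the usage example are assumptions chosen for illustration, not measurements.

```python
# Each unit is 1,024 times the previous one, as in the article.
BYTES_PER_STEP = 1024
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def to_bytes(value, unit):
    """Convert a value in the given unit (e.g. 1, "PB") to bytes."""
    return value * BYTES_PER_STEP ** UNITS.index(unit)


def estimate_storage(record_count, avg_record_bytes):
    """Return (size, unit) for record_count records of a given average size."""
    total = record_count * avg_record_bytes
    # Walk from the largest unit down and use the first one that fits.
    for unit in reversed(UNITS):
        if total >= to_bytes(1, unit):
            return total / to_bytes(1, unit), unit
    return float(total), "B"


if __name__ == "__main__":
    # One petabyte expressed in terabytes, and one exabyte in petabytes.
    print(to_bytes(1, "PB") // to_bytes(1, "TB"))
    print(to_bytes(1, "EB") // to_bytes(1, "PB"))
    # Hypothetical workload: one trillion records at an assumed 1,024 bytes each.
    print(estimate_storage(10**12, 1024))
```

Changing the assumed average record size shows how quickly billions or trillions of records move from terabytes into petabytes.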

The term Big Data is believed to have originated with web search companies who had to collect, categorize, and index very large distributed aggregations of loosely-structured data that required fast response times for all queries.




One Response to “What is Meant by Big Data?”

  1. Ramkumar Yaragarla says:

    Hi, I agree with you. Big data is all the unstructured voluminous data that a typical social media tool generates. It could be likes, shares, tweets, videos etc. You are right, it is very difficult to analyze big data. Technology infrastructure like Hadoop is helping us.
    But then the bigger challenges remain. The question of ‘when to analyse’ the data is a challenge. Another challenge is ‘asking the right questions’. Unless the right questions are asked, it is nearly impossible to measure such voluminous data.
    Enjoyed reading through your article. Cheers. Ramkumar
