How Big MNCs Like Google, Facebook, and Instagram Manage Thousands of Terabytes of Data

Pravat kumar Nath sharma
4 min read · Mar 8, 2021

INTRODUCTION:- Big data is a term that describes the large volume of data — both structured and unstructured — that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important; it’s what organizations do with the data that matters. Big data can be analyzed for insights that lead to better decisions and strategic business moves. Nowadays, in the internet age, every sector generates terabytes of data.

Below, I will describe some of the techniques that big companies like Google, Facebook, and Instagram use to handle this data.

WHAT ARE THE PROBLEMS THEY FACE?

Internet usage is increasing day by day, and we now use the internet in every field. Because of this, the user bases of companies like Google, Facebook, and Instagram keep growing, and more and more of everyday life happens online.

These companies want to store user data for commercial purposes as well as for their own business analysis. But with the number of users rising every day, they face a lot of problems managing this data. The problems include the following (a small sketch of how distributed storage answers the first two appears after the list):

Where to store data?
Once stored, how to process the data?
How to retrieve data faster?
How to store and retrieve data in real time?
How to find raw data for the industry?
How to manage that untapped data?
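
One common answer to the first two questions is distributed storage: a large file is split into fixed-size blocks, and each block is copied onto several machines so it can be read back in parallel and survives node failures. The sketch below is a toy illustration of that idea in Python; the block size, node names, and replication factor are made-up values for the demo, not any company's real configuration.

```python
import hashlib

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB blocks, a size HDFS historically used
REPLICATION = 3                # each block is kept on 3 different nodes
NODES = ["node-1", "node-2", "node-3", "node-4", "node-5"]  # hypothetical cluster


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Cut a file's bytes into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_block(block_id, nodes=NODES, replication=REPLICATION):
    """Pick `replication` distinct nodes for a block (hash-based, for the demo)."""
    start = int(hashlib.md5(str(block_id).encode()).hexdigest(), 16) % len(nodes)
    return [nodes[(start + k) % len(nodes)] for k in range(replication)]


def store_file(data):
    """Return a 'block map': which nodes hold which block of the file."""
    return {i: place_block(i) for i, _ in enumerate(split_into_blocks(data))}


if __name__ == "__main__":
    fake_file = b"x" * (150 * 1024 * 1024)  # a 150 MB file -> 3 blocks
    for block_id, replicas in store_file(fake_file).items():
        print(f"block {block_id} stored on {replicas}")
```

Because every block lives on several nodes, a read can be served by whichever replica is closest or least busy, which is also what makes retrieval faster at scale.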

SOCIAL MEDIA:-

Every like, share, comment, and click from a user creates a quantifiable data point that can be tracked, segmented, and analyzed for insights. Social media data essentially creates a record of user behavior that allows companies to build engagement strategies that help to promote their business.

One of the key advantages of this data is that there’s simply so much of it. A staggering 2.62 billion people used a social media platform of some kind in 2018. By 2021, that number is expected to exceed three billion. Facebook, by far the most popular social media platform, has a little over two billion active users alone. The data generated by these platforms isn’t just vast; it also provides a real-time glimpse into what users are doing. Rather than waiting for annual or quarterly reports on consumer behavior, companies can follow trends and reactions as they happen.

FACEBOOK:- Nowadays everyone, from the young to the old, uses Facebook and spends a lot of time on the platform. Users are constantly active there, uploading videos, audio, and photos, which creates terabytes of data every day.

The platform had almost 2.7 billion active users as of the second quarter of 2020, and Facebook generates about 4 petabytes of data per day. Spread over 24 hours, that works out to roughly 46 gigabytes of new data every second.

  • GOOGLE:- Google is a search engine with about 4 billion users. It processes 3.5 billion searches per day, which breaks down to roughly 40,000 searches per second on average. Google processes approximately 20 petabytes of data per day through an average of 100,000 MapReduce jobs spread across its massive computing clusters.
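
MapReduce is the programming model behind those jobs: a map step turns each input record into key-value pairs, and a reduce step combines all the values that share a key. The snippet below is a minimal, single-machine sketch of that idea in plain Python; Google's production system runs the same pattern across thousands of machines, and the sample data and function names here are just for illustration.

```python
from collections import defaultdict

# Toy input: pretend each string is one document or log line.
documents = [
    "big data needs big clusters",
    "mapreduce splits work across clusters",
]


def map_phase(doc):
    """Map step: emit a (word, 1) pair for every word in the document."""
    for word in doc.split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all values by key so each reducer sees one key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values for one key (here, sum the counts)."""
    return key, sum(values)


# Run the three phases in sequence on a single machine.
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)  # {'big': 2, 'data': 1, ..., 'clusters': 2, ...}
```

The appeal of the model is that the map and reduce steps are independent per key, so the framework can spread them over as many machines as the data requires.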

The term “big data” refers to data that is so large, fast or complex that it’s difficult or impossible to process using traditional methods. The act of accessing and storing large amounts of information for analytics has been around a long time. But the concept of big data gained momentum in the early 2000s when industry analyst Doug Laney articulated the now-mainstream definition of big data as the three V’s:

Volume: Organizations collect data from a variety of sources, including business transactions, smart (IoT) devices, industrial equipment, videos, social media and more. In the past, storing it would have been a problem — but cheaper storage on platforms like data lakes and Hadoop has eased the burden.
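
A common way to keep that volume manageable in a data lake is to partition incoming records by attributes such as date and source, so later queries only scan the slices they need. The sketch below shows the layout idea with plain Python and local folders; the directory scheme and field names are illustrative, not a specific product's API.

```python
import json
from pathlib import Path

LAKE_ROOT = Path("datalake")  # stand-in for cloud or HDFS storage in this demo

# Toy events; in practice these would stream in from many sources.
events = [
    {"source": "web", "date": "2021-03-08", "user": "u1", "action": "click"},
    {"source": "iot", "date": "2021-03-08", "user": "u2", "action": "reading"},
    {"source": "web", "date": "2021-03-09", "user": "u1", "action": "purchase"},
]

for event in events:
    # Partition path: datalake/source=web/date=2021-03-08/events.jsonl
    part_dir = LAKE_ROOT / f"source={event['source']}" / f"date={event['date']}"
    part_dir.mkdir(parents=True, exist_ok=True)
    with open(part_dir / "events.jsonl", "a") as f:
        f.write(json.dumps(event) + "\n")

# A query for one day's web traffic now touches only one small partition.
target = LAKE_ROOT / "source=web" / "date=2021-03-08" / "events.jsonl"
print(target.read_text())
```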

Velocity: With the growth in the Internet of Things, data streams into businesses at an unprecedented speed and must be handled in a timely manner. RFID tags, sensors and smart meters are driving the need to deal with these torrents of data in near-real time.
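
Handling that velocity usually means processing events in small time windows as they arrive rather than in nightly batches. Below is a minimal sliding-window counter in plain Python to show the idea; real deployments would use a streaming stack such as Kafka with Spark or Flink, and the sensor readings here are made up.

```python
import time
from collections import deque

WINDOW_SECONDS = 10  # only keep readings from the last 10 seconds


class SlidingWindowCounter:
    """Count events per sensor over a short, moving time window."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.events = deque()  # (timestamp, sensor_id) pairs, oldest first

    def add(self, sensor_id, ts=None):
        self.events.append((ts if ts is not None else time.time(), sensor_id))

    def counts(self, now=None):
        now = now if now is not None else time.time()
        # Drop readings that have fallen out of the window.
        while self.events and self.events[0][0] < now - self.window:
            self.events.popleft()
        totals = {}
        for _, sensor_id in self.events:
            totals[sensor_id] = totals.get(sensor_id, 0) + 1
        return totals


counter = SlidingWindowCounter()
counter.add("meter-42", ts=100.0)
counter.add("meter-42", ts=104.0)
counter.add("rfid-gate-7", ts=105.0)
print(counter.counts(now=106.0))  # {'meter-42': 2, 'rfid-gate-7': 1}
```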

Variety: Data comes in all types of formats — from structured, numeric data in traditional databases to unstructured text documents, emails, videos, audio, stock ticker data and financial transactions.
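
In practice, dealing with variety means mapping very different inputs onto one common record shape before analysis. The sketch below normalizes a structured database row and an unstructured free-text note into the same Python dictionary layout; the field names are invented for the example.

```python
import re
from datetime import datetime


def from_db_row(row):
    """Structured input: a row from a transactions table."""
    return {
        "user": row["customer_id"],
        "amount": float(row["amount"]),
        "when": row["created_at"],
        "source": "database",
    }


def from_support_email(text, received_at):
    """Unstructured input: pull an order amount out of free text, if present."""
    match = re.search(r"\$(\d+(?:\.\d{2})?)", text)
    return {
        "user": None,                      # unknown until matched later
        "amount": float(match.group(1)) if match else None,
        "when": received_at,
        "source": "email",
    }


records = [
    from_db_row({"customer_id": "u1", "amount": "19.99",
                 "created_at": datetime(2021, 3, 8, 10, 0)}),
    from_support_email("I was charged $19.99 twice, please refund one.",
                       datetime(2021, 3, 8, 11, 30)),
]
print(records)
```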

SAS considers two additional dimensions when it comes to big data:

Variability:

In addition to the increasing velocities and varieties of data, data flows are unpredictable — changing often and varying greatly. It’s challenging, but businesses need to know when something is trending in social media, and how to manage daily, seasonal and event-triggered peak data loads.
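
One simple way to notice that something is trending, or that a seasonal peak has started, is to compare the current event rate against a recent moving average. The snippet below is a rough Python illustration of that check; the threshold and sample numbers are arbitrary.

```python
def is_spiking(history, current, factor=3.0):
    """Flag a spike when the current count is far above the recent average."""
    if not history:
        return False
    baseline = sum(history) / len(history)
    return current > factor * baseline


# Hourly mention counts for a hashtag: a quiet baseline, then a sudden burst.
recent_hours = [120, 95, 110, 130, 105]
print(is_spiking(recent_hours, 150))   # False: within normal variation
print(is_spiking(recent_hours, 900))   # True: likely trending, scale up capacity
```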

Veracity:

Veracity refers to the quality of data. Because data comes from so many different sources, it’s difficult to link, match, cleanse and transform data across systems. Businesses need to connect and correlate relationships, hierarchies and multiple data linkages. Otherwise, their data can quickly spiral out of control.
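
A small example of what "cleanse and match" means in code: records for the same person often arrive from different systems with slightly different formatting, and they have to be normalized before they can be linked. The sketch below deduplicates two such records in plain Python; the records and keys are invented.

```python
def normalize(record):
    """Cleanse one record: trim whitespace, lower-case the email, unify names."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
    }


def merge_by_email(records):
    """Match step: treat records with the same cleaned email as one person."""
    people = {}
    for rec in map(normalize, records):
        people.setdefault(rec["email"], rec)
    return list(people.values())


crm_record = {"name": "  ravi  kumar", "email": "Ravi@Example.com "}
web_record = {"name": "Ravi Kumar", "email": "ravi@example.com"}
print(merge_by_email([crm_record, web_record]))
# [{'name': 'Ravi Kumar', 'email': 'ravi@example.com'}]
```

Small, repeatable cleansing steps like this are what keep large, multi-source datasets trustworthy enough to analyze.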
