Lesson 2: Big data

The era of big data

The digital world is constantly collecting more and more data. Whenever you use an online service, you're contributing to a data set of user behavior. Even by simply using electricity and water in your house, you're contributing to a data set of utilities usage.
As more people and cities connect to the Internet, data sets keep growing. One report estimates that the total size of digital data will be 175 zettabytes in 2025.${}^{1}$
How much data is 175 zettabytes, anyway? A single zettabyte is a trillion gigabytes. A modern smartphone stores about 32 gigabytes. To store 175 zettabytes, we would need about 5.5 trillion smartphones (roughly 700 smartphones for every living person!).
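The smartphone comparison is simple division, so it's easy to check and to rerun with other numbers. The snippet below is a rough sketch, not part of the original lesson: it uses the 175 zettabytes and 32 gigabytes quoted above, and it assumes a world population of about 8 billion. The names `phones_needed` and `WORLD_POPULATION` are illustrative choices.

```python
# Back-of-the-envelope check of the smartphone comparison.
GB_PER_ZB = 10**12              # one zettabyte is a trillion gigabytes (from the text)
TOTAL_ZB = 175                  # estimated size of digital data in 2025 (from the text)
PHONE_GB = 32                   # storage of one smartphone (from the text)
WORLD_POPULATION = 8 * 10**9    # assumption: roughly 8 billion people


def phones_needed(total_zb, phone_gb):
    """Return how many phones of phone_gb gigabytes are needed to hold total_zb zettabytes."""
    total_gb = total_zb * GB_PER_ZB
    return -(-total_gb // phone_gb)  # ceiling division on integers


phones = phones_needed(TOTAL_ZB, PHONE_GB)
per_person = phones / WORLD_POPULATION

print(f"{phones / 1e12:.2f} trillion phones")
print(f"{per_person:.0f} phones per person")
```

With these inputs, the result is about 5.47 trillion phones, or roughly 680 per person. The per-person figure depends on the population you assume, so treat it as an order-of-magnitude estimate.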
Whew, that's a lot! But how big are the individual data sets?
These stats can give us an idea...
• A single MRI scan results in 20,000 images.${}^{1}$
• Google processes 3.5 billion search queries per day.${}^{2}$
• Instagram users post 54,000 photos each minute.${}^{3}$
• An autonomous vehicle generates 11 terabytes of data each day.${}^{4}$
• Twitter users post 3,000 tweets every second.${}^{5}$
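These stats use different time units (per day, per minute, per second), which makes them hard to compare at a glance. As a rough sketch, the snippet below converts the three rate-based figures from the list to a common per-day basis. It assumes the quoted rates hold steady around the clock, which is a simplification, and the dictionary name `daily_counts` is just illustrative.

```python
# Convert the rates quoted in the list to events per day.
# Assumption: each rate is constant over a full 24-hour day.
MINUTES_PER_DAY = 24 * 60
SECONDS_PER_DAY = 24 * 60 * 60

daily_counts = {
    "Google searches": 3_500_000_000,               # already per day
    "Instagram photos": 54_000 * MINUTES_PER_DAY,   # 54,000 per minute
    "Tweets": 3_000 * SECONDS_PER_DAY,              # 3,000 per second
}

for name, count in daily_counts.items():
    print(f"{name}: {count / 1e6:,.1f} million per day")
```

On these assumptions, the per-minute and per-second figures work out to about 77.8 million photos and 259.2 million tweets per day.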
Big data sets are so large that our traditional ways of storing and processing them are no longer adequate, presenting challenges to computer scientists and data engineers. On the plus side, they're also so large that they offer new opportunities for analysis that were impossible on a small data set.
In this lesson, we'll explore where big data comes from and the exciting ways that we can use it.

Want to join the conversation?

• Do we expect to deal with something on a scale even bigger than big data? Big data is here and we're only getting started. How long will that take, in a professional opinion?
• At the moment, we're collecting a lot of information. In fact, it's so much that it's difficult to figure out what's useful and what's just noise.
It's like trying to listen to 10 YouTube videos at once: it's very difficult to make sense of anything.

I would assume that as time progresses the tools used to analyze big data will become better and it will be possible to gain more insights from it (make sense of all the noise).
• "One report estimates that the total size of digital data will be 175 zettabytes in 2025."
WHAT IS A ZETTABYTE?