Search history

A search engine is a service that builds an index of the World Wide Web and gives users a way to search that index. The most popular search engine is Google, but it's not the only one, and as we'll see, search engines aren't all the same when it comes to data collection.
Now that search engines put an entire Web full of answers at our fingertips, it's tempting to use them to answer all of our burning questions.
An animated GIF of three search queries being typed into a search box: "how to build a jet ski?", "are green bananas poisonous?", "what are rubber baby buggy bumpers?"
Once we type our questions and press "Search", it's up to the search engine to decide what to do with that data.

Collected data

Depending on which search engine we're using, our queries might be getting logged in a database and stored for all time.
A search query itself isn't typically private information: there are probably many people in the world who want to build a jet ski. However, search engines can log much more than the query; they can attach all sorts of potentially identifying information to it.
A search query record in a database might look something like this:
| Search query | Date | Time | IP address | User agent |
|---|---|---|---|---|
| how can I build a jet ski? | March 11, 2020 | 11:14 AM | 49.121.111.73 | Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1 |
If you repeatedly use the same search engine on the same computer and Internet connection (as many of us do, at home), every query you make will be logged with the same IP address.
Consider what multiple queries would look like in that database:
| Search query | Date | Time | IP address | User agent |
|---|---|---|---|---|
| "how can I build a jet ski?" | March 11, 2020 | 11:14 AM | 49.121.111.73 | Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1 |
| "home depot near crescent city" | March 11, 2020 | 4:00 PM | 49.121.111.73 | Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1 |
| "cheap pizza delivery to 95543" | March 12, 2020 | 9:07 PM | 49.121.111.73 | Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1 |
| "windsor family tree" | March 13, 2020 | 2:32 PM | 49.121.111.73 | Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1 |
Search queries suddenly start to look a lot more like personally identifiable information.
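To see why a shared IP address matters, here is a minimal Python sketch of how log rows like the ones above could be grouped into a per-address profile. The `QueryRecord` class, the `group_by_ip` function, and the second visitor (with an address from a range reserved for documentation) are all invented for illustration; a real search engine's logging system will look different.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRecord:
    """One logged search, mirroring the columns of the example table."""
    query: str
    date: str
    time: str
    ip_address: str
    user_agent: str


UA = "Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1"

LOG = [
    QueryRecord("how can I build a jet ski?", "March 11, 2020", "11:14 AM", "49.121.111.73", UA),
    QueryRecord("home depot near crescent city", "March 11, 2020", "4:00 PM", "49.121.111.73", UA),
    # Hypothetical second visitor, not part of the original example.
    QueryRecord("are green bananas poisonous?", "March 12, 2020", "8:30 AM", "203.0.113.5", UA),
    QueryRecord("cheap pizza delivery to 95543", "March 12, 2020", "9:07 PM", "49.121.111.73", UA),
    QueryRecord("windsor family tree", "March 13, 2020", "2:32 PM", "49.121.111.73", UA),
]


def group_by_ip(records):
    """Collect every query seen from each IP address, in log order."""
    profiles = defaultdict(list)
    for record in records:
        profiles[record.ip_address].append(record.query)
    return dict(profiles)


profiles = group_by_ip(LOG)
for ip_address, queries in profiles.items():
    print(ip_address, queries)
```

No single row says who the searcher is, but the four queries grouped under 49.121.111.73 together hint at a hobby, a town, a ZIP code, and possibly a family name.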
Plus, the search history could also include a cookie or even a user ID if you were logged into the search engine website when you issued the query.

Uses of search history data

By storing both our queries and our identifying information, a search engine can personalize the search results.
For example, consider the search query "Python". If the searcher is a biologist who frequently searches for biology-related terms, the search engine might show them this as the first result:
Screenshot of a search result for the Wikipedia article on Pythonidae, the family of snakes.
If the searcher is instead a software developer and has many programming-related queries in their search history, the search engine might instead show this result:
Screenshot of a search result for the homepage of python.org, a programming language.
Programmers who don't like snakes might be very grateful for the personalized results. 🚫🐍
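One very simplified way to picture this kind of personalization is as a re-ranking step: score each candidate result by how well it matches words from the searcher's past queries. The Python sketch below assumes a made-up list of candidates, made-up topic words, and a made-up scoring rule; real search engines use far more sophisticated signals, and nothing here describes how any particular one works.

```python
from collections import Counter

# Hypothetical candidate results for the query "Python", each tagged with topic words.
CANDIDATES = [
    {"title": "Pythonidae - Wikipedia", "topics": {"snake", "reptile", "biology", "species"}},
    {"title": "Welcome to Python.org", "topics": {"programming", "code", "software", "language"}},
]


def rank_for_user(candidates, history):
    """Order candidates by how often their topic words appear in past queries."""
    word_counts = Counter(word for query in history for word in query.lower().split())

    def score(candidate):
        # A Counter returns 0 for words it has never seen.
        return sum(word_counts[topic] for topic in candidate["topics"])

    # sorted() is stable, so candidates with equal scores keep their original order.
    return sorted(candidates, key=score, reverse=True)


# Hypothetical search histories for the two searchers in the example.
biologist_history = ["reptile species list", "snake habitat biology"]
developer_history = ["programming language tutorial", "open source software code review"]

print(rank_for_user(CANDIDATES, biologist_history)[0]["title"])
print(rank_for_user(CANDIDATES, developer_history)[0]["title"])
```

With these invented histories, the biologist's top result is the Pythonidae article and the developer's is python.org, even though both typed the same query.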
Search engines frequently include advertisements along with search results, since that's how they make enough money to keep operating a search engine for free. Once they start collecting a user's search history, the advertisements can be based on more than just the current query.
In addition to operating a search engine, Google also runs a very popular advertising network that places ads on millions of non-Google websites. The Google ads system can use search history to personalize the ads that show up on those websites.
For example, I once spent a day researching smart sensor networks for an article, and I still get served advertisements about smart sensors, even while reading a fashion blog.
A cropped ad that says "Ditch ineffective, Old-School Monitoring for smart, real-time wireless monitoring solutions" next to a picture of two devices.
Part of an ad that showed up on a fashion blog. (Brand cropped off)
🤔 When you see an ad on a site that seems personalized to your interests, do you feel happy that it's catering to you or mad that it knows you so well?

Risks of search history collection

From the perspective of the search engines, they're using your search history to personalize your experience and make it better.
There are dangers to any form of online data collection, however.
In 2006, the online media company AOL released three months of "anonymized" search data for researchers to use. Their anonymization strategy was to replace the username column in the data with a numeric ID. Each username was always replaced by the same numeric ID, which meant that anyone with the data could group it by numeric ID and see every query a given user made during those three months. 😬
In less than a week, journalists at the New York Times were able to deduce the identity of user number 4417749 by combing through her queries and piecing together tidbits of personal information. [1] She was shocked to discover all her search queries were publicly viewable and told the journalist, "My goodness, it's my whole personal life. I had no idea somebody was looking over my shoulder."
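The weakness in that anonymization strategy can be sketched in a few lines of Python. This is not AOL's actual code: the usernames, the queries, and the choice to hand out IDs sequentially are assumptions made for illustration. The only property taken from the story is that the same username always maps to the same numeric ID.

```python
from collections import defaultdict
from itertools import count

# Hypothetical raw log rows of (username, query), invented for illustration.
RAW_LOG = [
    ("alice", "landscapers in my town"),
    ("bob", "cheap pizza delivery"),
    ("alice", "homes sold in my neighborhood"),
    ("alice", "people with my last name"),
]


def pseudonymize(rows):
    """Replace each username with a numeric ID that is the same every time."""
    ids = {}
    next_id = count(1)  # sequential IDs are an assumption of this sketch
    released = []
    for username, query in rows:
        if username not in ids:
            ids[username] = next(next_id)
        released.append((ids[username], query))
    return released


def queries_by_id(released):
    """What anyone holding the released data can do: group queries by numeric ID."""
    grouped = defaultdict(list)
    for numeric_id, query in released:
        grouped[numeric_id].append(query)
    return dict(grouped)


released = pseudonymize(RAW_LOG)
grouped = queries_by_id(released)
print(grouped)
```

The usernames are gone from `released`, yet all three of the first user's queries still sit together under one ID, and it is the queries themselves that give the person away.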

What's a user to do?

If you're suddenly feeling uncomfortable typing a query into a search engine, that's understandable. But don't worry, you don't need to swear off search engines for the rest of your life.
The first step is to understand what data is actually being stored by the search engine and how they are using that data. You can read the privacy policy for the search engine to find that out.
If you don't like how they're collecting the data but want to continue using the service, you can look for settings that will let you reduce or completely disable data collection. Not every search engine will offer such settings, but many will in order to accommodate privacy-conscious users.
If you're open to using a different service, look for alternative offerings. For example, DuckDuckGo is a privacy-focused search engine that stores search queries to improve features such as spelling correction but does not store IP addresses, user agents, cookies, or other potentially identifiable information. [2]
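As a rough illustration of what "storing the query but not the identifying details" could mean in practice, here is a small Python sketch that drops identifying fields before a record is saved. The field names and the `minimize` function are hypothetical, and this is not a description of how DuckDuckGo or any other search engine is actually implemented.

```python
# Hypothetical names for fields that could identify a searcher.
IDENTIFYING_FIELDS = {"ip_address", "user_agent", "cookie", "user_id"}


def minimize(record):
    """Return a copy of the record without potentially identifying fields."""
    return {key: value for key, value in record.items() if key not in IDENTIFYING_FIELDS}


# A made-up incoming request, reusing values from the earlier example table.
incoming = {
    "query": "how can I build a jet ski?",
    "ip_address": "49.121.111.73",
    "user_agent": "Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1",
    "cookie": "session=abc123",  # hypothetical cookie value
}

stored = minimize(incoming)
print(stored)
```

Only the query text survives in `stored`, so the records can still help with something like spelling correction without being grouped by IP address or cookie.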
🤔 Are you making any changes to your search behavior after learning more? What benefits or drawbacks are you anticipating to your new approach? Share them with us!