HTTP and HTML

Tumblr founder David Karp and Xbox program manager Jasmine Lawrence give a detailed description of how files and webpages are sent and received using HTTP and HTML.

Want to join the conversation?

  • ☺☻☺Natth4545☺☻☺
    Why is it called a cookie?
    (112 votes)
  • Dhruv Patel
    Where does a webpage get its certificate? Who issues it? Does the government issue it, the way it issues passports and other official documents?
    (30 votes)
    • Cameron
      A certificate authority (CA) issues the certificate. A CA is just an organization that people trust to issue these certificates. The most commonly used certificate authorities are Symantec (which owns Verisign), Comodo, and GoDaddy. (These are just private companies.)

      It would be difficult to convince many people that local or foreign governments could be trusted with the responsibility of issuing certificates. (Many governments attempt to monitor local and/or foreign communications.)
      (57 votes)
  • J. Miller
    DNS... did I miss something? Can a CA be owned or faked by the CIA, the FBI, or others?
    (8 votes)
    • Cameron
      DNS stands for Domain Name System. It's a system that converts a web address like http://www.khanacademy.org into an IP address.

      Here are some bad things that can happen to certificates issued by a CA:
      - A certificate that belongs to someone else can be mistakenly issued by a CA to an attacker, so everyone will think the attacker holding the signed certificate can be trusted (this happened to Microsoft once).
      - The certificate's private key is discovered by an attacker, allowing them to forge the signature of the certificate's owner.
      - Some powerful entity (such as a government) could tell the CA (which is just a company) to issue them forged certificates, or to revoke the legitimate certificates of others. However, if this were made public, people might stop trusting certificates issued by that CA.

      To handle these things, CAs do the following:
      - They add expiry dates to certificates, so that attackers don't have enough time to find the private key, and even if they do, the certificate will only be compromised until it expires.
      - They revoke certificates when they are suspected to be compromised.
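
The expiry-date idea can be illustrated with a minimal Python sketch. It assumes the expiry is given in the "notAfter" text format that Python's `ssl` module uses for certificate dates; the helper name `is_expired` and the sample date are illustrative assumptions, not taken from any real certificate.

```python
import ssl


def is_expired(not_after: str, now_seconds: float) -> bool:
    """Return True if a certificate's expiry time has already passed.

    not_after uses the certificate date format understood by
    ssl.cert_time_to_seconds, e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch
    return now_seconds >= expires_at


# Hypothetical expiry date, used only for illustration.
sample_not_after = "Jan  5 09:34:43 2018 GMT"
sample_expiry = ssl.cert_time_to_seconds(sample_not_after)

print(is_expired(sample_not_after, sample_expiry - 60))  # one minute before expiry
print(is_expired(sample_not_after, sample_expiry + 60))  # one minute after expiry
```

A real client would compare against the current time (for example `time.time()`) and would also check revocation, which this sketch does not attempt.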

      Hope this makes sense
      (20 votes)
  • Ayaan Heban
    At , does the cookie ID number always stay the same, or does it change every time you log in?
    (8 votes)
    • Cameron
      It depends.

      Each cookie (there may be several associated with each site you visit) has a name field, a value field, and possibly some attributes.

      If you put the following in the address bar of Chrome, you can see the cookies it is currently storing: "chrome://settings/cookies"

      So for example, if you go to "http://www.echoecho.com/samplecookie1.htm" it will put a cookie named "username" with a value of the name you give it.

      If a cookie is given an expiry date it will last until that date (these are persistent cookies), but if no date is provided the cookie will only last until you shut down your browser (these are session cookies).
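
The difference between persistent and session cookies can be sketched with Python's standard `http.cookies` module. This is a rough illustration: the helper name `is_session_cookie` and the sample header values are assumptions. It also treats a `Max-Age` attribute as an expiry, which the explanation above does not mention but which cookies may carry as well.

```python
from http.cookies import SimpleCookie


def is_session_cookie(set_cookie_value: str, name: str) -> bool:
    """Return True if the named cookie carries no expiry information."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)  # parse the text of a Set-Cookie header
    morsel = jar[name]
    # With neither "expires" nor "max-age" set, the browser keeps the
    # cookie only until it is closed.
    return not morsel["expires"] and not morsel["max-age"]


# Hypothetical header values, for illustration only.
persistent = "username=Ayaan; Expires=Wed, 21 Oct 2026 07:28:00 GMT; Path=/"
session = "sid=abc123; Path=/"

print(is_session_cookie(persistent, "username"))  # has an expiry date
print(is_session_cookie(session, "sid"))          # has no expiry date
```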

      Typically, if the cookie is being used to authenticate a person, a session cookie will be used with a random value. The value is random and changes each session so that it would be too hard for someone to guess the right value. If someone could just guess the right value then they could create the cookie themselves with that value on their computer and bypass the login of the website, because the website would look at the cookie and see the right number in the cookie indicating that the user had already logged in.
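
A hard-to-guess session value can be produced with Python's `secrets` module. This is only a sketch of the idea: the function name `new_session_id` and the choice of 32 random bytes are assumptions, not how any particular website does it.

```python
import secrets


def new_session_id() -> str:
    """Return a fresh, unpredictable value for a session cookie."""
    # 32 random bytes from the operating system's secure random source,
    # encoded as URL-safe text so it can travel in a cookie header.
    return secrets.token_urlsafe(32)


first_login = new_session_id()
second_login = new_session_id()
print(first_login != second_login)  # each session gets a different value
```

Because the value comes from a cryptographically secure source, an attacker cannot practically guess it and forge the cookie on their own computer.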

      Hope this makes sense
      (14 votes)
  • Tol
    At , they say that a cookie is the only way for a website to remember who you are. Isn't there also localStorage?
    (5 votes)
    • Cameron
      localStorage is a new technology.
      Both cookies and localStorage are ways to record state, but they are different:
      - cookies get sent to the website's server
      - localStorage is read by scripts running on your web browser (this data doesn't normally get sent to the website's server)

      So cookies are nice for the server side (the web site server) to remember state, and local storage is nice for the client side (your browser) to remember state. So for things like how the user wants the web page to appear, it might be reasonable to keep those in local storage, but for things like, whether the user is logged in or not you would use a cookie (a session cookie).
      (6 votes)
  • Mike Bgarop Kaylfe Xander
    Where can I create an HTML program in Khan Academy?
    (5 votes)
  • Yizhou A. Jiang
    Do we always get assigned the same cookie by a specific website like Tumblr? Sometimes I clear cookies in my browser; does that mean all customized information is lost? Or is the customized information linked to the cookie ID and stored on the server, so that the next time my computer talks to the server, the server assigns the same cookie and my browser retrieves the customized info from the server? If my IP address is constantly changing as I move between home, work, and cafes, do I still have the same cookie every time? Thanks in advance!
    (3 votes)
    • Cameron
      The cookie is just a file, associated with the website, that sits on your computer. What information is stored in the cookie varies for each website. If you delete the cookie, whatever is stored in it is lost.

      A cookie doesn't have to identify you as a user. It could contain something simple like the last time you visited the website. When you visit the website, it could read the cookie with that time and tell you the last time you visited the website. If you deleted that cookie it wouldn't be able to tell you when you last visited. A new cookie would be placed on your computer with the default settings.

      A cookie could contain an ID number. The next time you visit the website, it could look up information in its files that it has associated with that ID number. So for example, you go to a website, and it tracks which articles you have read. The next time you visit that website, it shows you advertisements that it thinks you would like based on the articles you have read. If you deleted the cookie, they wouldn't know who you are and they wouldn't be able to target advertisements towards you. A new cookie would be placed on your computer with a different ID number.

      The cookie is just a file on your computer that your browser manages, so it doesn't matter if your IP address changes. If you use different browsers, you will have different cookies for each browser.
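
The ID-number case can be sketched as a toy server in Python. Everything here is hypothetical (the `visitors` dictionary, the `handle_visit` function, the article names); it only illustrates that the server keeps the data and the cookie holds just the key.

```python
import secrets
from typing import List, Optional, Tuple

# Server-side records: cookie ID -> articles that visitor has read.
visitors = {}


def handle_visit(cookie_id: Optional[str], article: str) -> Tuple[str, List[str]]:
    """Record a page view and return (cookie ID to store, reading history)."""
    if cookie_id is None or cookie_id not in visitors:
        # No cookie (first visit, or the cookie was deleted): issue a new ID.
        cookie_id = secrets.token_hex(8)
        visitors[cookie_id] = []
    visitors[cookie_id].append(article)
    return cookie_id, list(visitors[cookie_id])


first_id, _ = handle_visit(None, "intro-to-http")
same_id, history = handle_visit(first_id, "intro-to-html")
print(same_id == first_id, history)   # the server recognizes the returning browser

new_id, fresh_history = handle_visit(None, "intro-to-dns")
print(new_id != first_id, fresh_history)  # cookie deleted: treated as a new visitor
```

Note that the IP address never appears in this sketch, which matches the point that a changing IP address does not affect the cookie.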

      Hope this makes sense
      (7 votes)
  • Shadow
    At , if your computer remembers you using the cookie data, then why, when I log in to Khan Academy, do I have to type my username and password every time? Why does it not just log me in automatically?
    (4 votes)
    • Jeremy Jameson
      If your KA account is set not to "remember" you (that is, not to depend on cookies to authenticate you), you'll have to manually authenticate each time. Alternatively, if your browser is set to delete cookies when it is closed, and you close your browser at the end of each session, you'll have to log in again at KA because the cookie they set was deleted. It may also be required if your IP address changes, if the cookie depends on you having the same IP address to work.
      (4 votes)
  • Majji Venkata Rohan
    Does the internet use satellite communication? What I mean is, for example, if the main server is in another country, how does it manage to send me data? Through satellites, or by some other method?
    (4 votes)
  • PradsPrasad
    So, does the browser need to send different requests for CSS and JavaScript?
    (2 votes)
    • Piquan
      Yes, usually. Generally, the browser will issue many requests: it issues separate requests for the HTML, the CSS, the JavaScript, and for every image on the page.

      However, the browser remembers the CSS and JavaScript. The browser can ask the web server, "Has this CSS file changed since last Thursday?", and if it hasn't changed, then it will reuse the version it remembered. This is your browser cache, and it applies to everything the browser uses: the images, the JavaScript, the CSS, the HTML, everything. Some sites can also send the browser a promise that a file won't change for a while. That way, the browser doesn't have to send a request for the CSS or JavaScript if it's been to the site before.
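
The "has this file changed since last Thursday?" exchange corresponds to HTTP's `If-Modified-Since` request header and the `304 Not Modified` response status. Below is a minimal sketch of the server's side of that decision; the function name `status_for_request` and the dates are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def status_for_request(last_modified: datetime,
                       if_modified_since: Optional[str]) -> int:
    """Return 304 if the browser's cached copy is still current, else 200."""
    if if_modified_since is not None:
        cached_copy_time = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached_copy_time:
            return 304  # Not Modified: the browser reuses its cached file
    return 200  # OK: the server sends the full file


# Hypothetical modification time of a CSS file.
file_changed = datetime(2024, 2, 1, tzinfo=timezone.utc)
cached_after_change = format_datetime(file_changed + timedelta(days=7), usegmt=True)
cached_before_change = format_datetime(file_changed - timedelta(days=7), usegmt=True)

print(status_for_request(file_changed, None))                  # 200, nothing cached
print(status_for_request(file_changed, cached_after_change))   # 304, cache is current
print(status_for_request(file_changed, cached_before_change))  # 200, cache is stale
```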
      (6 votes)

Video transcript

- I'm Jasmine Lawrence and I'm a Program Manager on the Xbox One engineering team. One of our biggest features is called Xbox Live. It's an online service that connects gamers from all around the world, and we rely on the internet to make that happen. This is no easy task and there are a lot of things happening behind the scenes. The internet is totally changing how people interact and connect. But how does it work? How do the computers all across the world actually communicate with each other? Let's look at web browsing. First, you open a web browser. It's the app you use to access the web pages. Next, you type in the web address, or URL, which stands for Uniform Resource Locator, of a website you want to visit like Tumblr.com. (upbeat electronic music) - Hi, I'm David Karp, the founder of Tumblr, and we're here today to talk about how those web browsers we use every day actually work. So you've probably wondered what actually happens when you type an address into your web browser and then hit enter, and it really is about as crazy as you can imagine. So in that moment, your computer starts talking to another computer called a server that's usually thousands of miles away, and in milliseconds your computer asks that server for a website, and that server starts to talk back to your computer in a language called HTTP. HTTP stands for Hypertext Transfer Protocol. You can kind of think of it as the language that one computer uses to ask another computer for a document. It's actually really pretty straightforward. If you were to intercept the conversation between your computer and a web server on the internet, it's mainly made up of something called "GET" requests. And those are really very simply the word "GET" and then the name of the document that you're requesting. 
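
As a rough illustration of the GET request described above, the following Python sketch builds the text of a minimal HTTP/1.1 request without sending it. The helper name `build_get_request`, the host name, and the `Connection` header are assumptions added for the example; a real browser sends more headers.

```python
def build_get_request(host: str, path: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # the word GET, then the document being requested
        f"Host: {host}",         # HTTP/1.1 requires a Host header
        "Connection: close",
        "",                      # a blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


request_text = build_get_request("www.tumblr.com", "/login")
print(request_text)
```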
So if you're trying to log into Tumblr and load our login page, all you're doing is sending a GET request to Tumblr's server that says "GET /login" and that tells Tumblr's server that you want all of the HTML code for the Tumblr login page. HTML stands for HyperText Markup Language and you can think of that as the language you use to tell a web browser how to make a page look. So if you think about something like Wikipedia, which is really just a big simple document, and HTML is the language that you use to make that title big and bold, to make the font the right font, to link certain text to certain other pages, to make some text bold, to make some text italic, to put an image in the middle of the page, to align the image to the right, to align the image to the left. - The text of a web page is included directly in the HTML, but other parts, images or videos, are separate files with their own URLs that need to be requested. The browser sends separate HTTP requests for each of these and displays them as they arrive. If a web page has a lot of different images, each of them causes a separate HTTP request and the page loads slower. Now sometimes when you browse the web, you're not just requesting pages with GET requests. Sometimes you send information, like when you fill out a form or type a search query. Your browser sends this information in plain text to the web server using an HTTP POST request. - So let's say you log in to Tumblr. The first thing you do is make a "POST" request. That is, a "POST" to Tumblr's login page that has some data attached to it. It has your email address, it has your password. That goes to Tumblr's server. Tumblr's server figures out that, okay, you're David. It sends a web page back to your browser that says, "Success! Logged in as David." 
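
The login POST described above might look roughly like the request built by this Python sketch. The field names `email` and `password`, the sample values, and the helper name `build_post_request` are assumptions for illustration; the video does not show Tumblr's actual form fields.

```python
from urllib.parse import urlencode


def build_post_request(host: str, path: str, form: dict) -> str:
    """Return the text of a minimal HTTP/1.1 POST request carrying form data."""
    body = urlencode(form)  # e.g. "email=...&password=..."
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body.encode('utf-8'))}",
        "",    # a blank line separates the headers from the attached data
        body,
    ]
    return "\r\n".join(lines)


# Hypothetical credentials, for illustration only.
post_text = build_post_request(
    "www.tumblr.com", "/login",
    {"email": "david@example.com", "password": "hunter2"},
)
print(post_text)
```

The form data sits in the request body as readable text, which is why sending it over an unencrypted connection exposes it to anyone who can observe the traffic.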
But along with that web page, it also attaches a little bit of invisible cookie data that your browser sees and knows to save, and that's really important because it's really the only way that a website can remember who you are. All that cookie data really is is an ID card for Tumblr. It's a number that identifies you as David. Your web browser holds onto that number, and then the next time you refresh Tumblr, the next time you go to Tumblr.com, your web browser knows to automatically attach that ID number with the request that it sends over to Tumblr's server, so now Tumblr's server sees the request coming from your browser, sees the ID number, and knows, okay, this is a request from David. - Now, the internet is completely open, all of its connections are shared, and information is sent in plain text. This makes it possible for hackers to snoop on any personal information that you send over the internet, but safe websites prevent this by asking your web browser to communicate on a secure channel using something called Secure Sockets Layer and its successor Transport Layer Security. You can think of SSL and TLS as a layer of security wrapped around your communications to protect them from snooping or tampering. SSL and TLS are active when you see the little lock that appears in your browser address bar next to the HTTPS. The HTTPS protocols ensure that your HTTP requests are secure and protected. When a website asks your browser to engage in a secure connection, it first provides a digital certificate, which is like an official ID card proving that it's the website it claims to be. Digital certificates are published by certificate authorities, which are trusted entities that verify the identities of websites and issue certificates for them, just like a government can issue IDs or passports. Now, if a website tries to start a secure connection without a properly issued digital certificate, your browser will warn you. 
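
The certificate check is not specific to browsers. As a small illustration, Python's standard `ssl` module applies comparable checks by default when a program opens a secure connection. The sketch below only inspects those default settings and does not connect to any site; it says nothing about how a particular browser is implemented.

```python
import ssl

# The default client-side settings Python recommends for TLS connections.
context = ssl.create_default_context()

# The server must present a certificate that chains to a trusted authority...
requires_certificate = context.verify_mode == ssl.CERT_REQUIRED
# ...and the certificate must match the host name being contacted.
checks_hostname = context.check_hostname

print(requires_certificate, checks_hostname)
```

With these settings, a connection to a site without a properly issued certificate fails with an error instead of proceeding silently, which plays the same role as the browser warning.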
That's the basics of web browsing, the part of the internet we see day to day. To summarize, HTTP and DNS manage the sending and receiving of HTML, media files, or anything on the web. What makes this possible under the hood are TCP/IP and router networks that break down and transport information in small packets. Those packets themselves are made up of binary, sequences of ones and zeros that are physically sent through electric wires, fiber optic cables, and wireless networks. Fortunately, once you've learned how one layer of the internet works, you can rely on it without remembering all the details. We can trust that all those layers will work together to successfully deliver information at scale and with reliability.
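
To make the "ones and zeros" concrete, here is a small Python sketch that shows the bits behind a short piece of text. The helper name `to_bits` is an assumption, and the example covers only plain ASCII text, not how packets are actually framed on the wire.

```python
def to_bits(text: str) -> str:
    """Return the ASCII bytes of text as groups of eight binary digits."""
    return " ".join(f"{byte:08b}" for byte in text.encode("ascii"))


print(to_bits("GET"))  # 01000111 01000101 01010100
```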