Humanities LibreTexts

8.4: Understanding URLs


    A URL (Uniform Resource Locator) is just the internet address for any given webpage:

    [Figure: screenshot of a web browser with the URL circled.]

    Understanding the component parts of a URL can be helpful in a variety of situations. Here are just a few reasons why understanding URLs is useful:

    • The URL often reveals key information about a site
    • An understanding of URLs provides the needed foundation for many advanced search strategies
    • A heightened attention to URLs helps searchers recognize fraudulent sites

    Each section below focuses on a different part of the URL. At the end of the webtext is a quiz that you can take to test your understanding of URLs.

    Locate the Protocol

    The “protocol” is the first part of a URL: the text before the colon and double slash (://), such as http or https. Some browsers simplify how addresses are displayed by hiding the protocol. For example, in Chrome and Firefox, http://writingcommons.org displays as writingcommons.org.

    The protocol https indicates that information sent through the page will be encrypted, and therefore harder to read if some third party intercepts the information. (The next time you are entering a username and password on a page, check for the “https” protocol.)
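
    As a minimal sketch of the check described above, Python's standard urllib.parse module can read the protocol from a URL. The helper names protocol_of and is_encrypted are hypothetical, chosen only for this illustration, and the https address is an assumed variant of the example above.

```python
from urllib.parse import urlsplit


def protocol_of(url: str) -> str:
    """Return the protocol (scheme) of a URL, or '' if none is written."""
    return urlsplit(url).scheme


def is_encrypted(url: str) -> bool:
    """True only when the URL uses the https protocol."""
    return protocol_of(url) == "https"


print(protocol_of("http://writingcommons.org"))    # http
print(is_encrypted("http://writingcommons.org"))   # False
print(is_encrypted("https://writingcommons.org"))  # True
```

    An address typed without a protocol, such as writingcommons.org, yields an empty string here, which mirrors what you see when a browser hides the protocol.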

    Locate the Domain Name

    The “domain name” identifies the site that contains the page you are viewing. It appears just before the first single slash (/). If there is no single slash, then the domain name is whatever appears at the end of the URL.

    For example, the following URLs all refer to pages on the Writing Commons site:

    • http://writingcommons.org/open-text/information-literacy
    • writingcommons.org/open-text/research-methods-methodologies/integrate-evidence/incorporate-evidence/1030-synthesizing-your-research-findings
    • www.writingcommons.org

    If you look carefully, you will see that most browsers try to help users out by boldfacing the domain name in the address bar.

    Being able to locate the domain name in a URL allows you to identify the entity that hosts the page you are viewing—a piece of information that is often crucial to understanding the nature of your source.
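
    The rule above (look just before the first single slash) can be sketched with Python's standard urllib.parse module. The function name host_of is hypothetical. Note that the result is the full host, which may still include a leading subdomain such as www.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the host of a URL (lowercased, without port or login info)."""
    # urlsplit only recognizes the host when a protocol is present, so add
    # one for addresses written the way a browser displays them.
    if "://" not in url:
        url = "http://" + url
    return urlsplit(url).hostname or ""


for url in [
    "http://writingcommons.org/open-text/information-literacy",
    "www.writingcommons.org",
]:
    print(host_of(url))
# writingcommons.org
# www.writingcommons.org
```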

    Recognize Sub-directories

    Elements of the URL that appear after the domain indicate different sub-directories. For example: “open-text,” “information-literacy,” and “rhetorical-analysis” are sub-directories of the domain writingcommons.org. Think of these as folders within folders.
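
    To make the “folders within folders” idea concrete, here is a small sketch that lists the sub-directories in one of the Writing Commons URLs shown earlier. The function name subdirectories is hypothetical.

```python
from urllib.parse import urlsplit


def subdirectories(url: str) -> list[str]:
    """Return the path segments that follow the domain, in order."""
    path = urlsplit(url).path
    # Splitting on "/" leaves empty strings at the edges; drop them.
    return [segment for segment in path.split("/") if segment]


print(subdirectories("http://writingcommons.org/open-text/information-literacy"))
# ['open-text', 'information-literacy']
```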

    Recognize Subdomains

    Subdomains are similar to sub-directories in that they give website developers a way to separate content, but subdomains appear before the domain name in the URL. Don’t let this trip you up. The domain name is still the text immediately before the first single slash (/) or, if there is no single slash, at the very end of the URL.

    For example, the domain name in all of the following URLs is google.com

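    Pay attention to the placement of the dots. The following is not a Google page:

    www.mgoogle.com

    Here the domain is mgoogle.com, not google.com. No dot separates the “m” from “google,” so “mgoogle” is a single name rather than a subdomain of google.com.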
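
    The role of the dots can be sketched in a few lines of Python. This is a simplified illustration that assumes the domain is always the last two dot-separated labels, which holds for google.com but not for every domain. The function name split_host is hypothetical, and mail.google.com is used only as an assumed example of a subdomain.

```python
def split_host(host: str) -> tuple[list[str], str]:
    """Split a host into (subdomains, domain).

    Assumption: the domain is the last two dot-separated labels.
    """
    labels = host.lower().split(".")
    return labels[:-2], ".".join(labels[-2:])


print(split_host("mail.google.com"))  # (['mail'], 'google.com')
print(split_host("www.mgoogle.com"))  # (['www'], 'mgoogle.com')
```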

    Recognize Top-level Domains

    In the domain name writingcommons.org, the “top-level domain” is .org. The top-level domain .org was originally intended for use by non-profit organizations—and many non-profits continue to use it—but it is now open to anyone.

    In the domain name amazon.com, the top-level domain is .com. Short for “commercial,” .com is the most common top-level domain in the world and is now used for a wide variety of sites—not just the sites of commercial enterprises.

    Some top-level domains have retained their original meanings and are especially helpful to know:

    domain   description        example
    .edu     university site    http://www.nu.edu
    .gov     government site    http://www.senate.gov
    .mil     military site      http://www.army.mil

    Newer top-level domains such as .museum, .bike, and .clothing are not yet widely used.

    Some domains include a country domain extension—or “country code top level domain.”

    Here are some examples:

    code   country          example
    .in    India            indianrail.gov.in
    .de    Germany          www.spiegel.de
    .ca    Canada           www.cbc.ca
    .jp    Japan            www.nicovideo.jp
    .uk    United Kingdom   www.ima.org.uk

    Pay attention to country domain extensions. When present in a URL, they represent a core component of the domain. Note, for example, that hydra.com and hydra.com.gr are different domains. The two are unrelated sites run by unrelated entities.
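
    The hydra.com example can be sketched in Python as follows. The set ILLUSTRATIVE_SUFFIXES is a tiny hand-written assumption covering only endings that appear in this section, and the function name registered_domain is hypothetical. Real software usually consults a maintained list of such endings rather than a hard-coded set.

```python
# Assumption: a tiny hand-picked set of two-part endings, for illustration only.
ILLUSTRATIVE_SUFFIXES = {"com.gr", "org.uk", "gov.in"}


def registered_domain(host: str) -> str:
    """Return the domain, keeping a country extension when one is present."""
    labels = host.lower().split(".")
    ending = ".".join(labels[-2:])
    if len(labels) >= 3 and ending in ILLUSTRATIVE_SUFFIXES:
        # The country extension is part of the domain, so keep three labels.
        return ".".join(labels[-3:])
    return ending


print(registered_domain("hydra.com"))          # hydra.com
print(registered_domain("www.hydra.com.gr"))   # hydra.com.gr
print(registered_domain("www.ima.org.uk"))     # ima.org.uk
```

    The two hydra results differ, which matches the point above: the sites are on different domains.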

    For a comprehensive list of top-level domains, consult one of the following:

    Use Your Understanding of URLs To Enhance Your Web Searching

    Once you understand URLs, certain kinds of advanced search strategies become easier to conceptualize, remember, and implement—for example, filtering by domain and top-level domain.

    Filter By Top-level Domain

    If you know that the kind of information you are seeking is most likely to appear on a site with a particular type of top-level domain, you can restrict your search to this type of site using the site: search operator.

    For example, if you are seeking government documents on student loans, a search for student loans site:gov returns only results from sites with the top-level domain .gov, filtering out a large number of sites that are not relevant to your research needs.

    Filter By Domain

    If you know the domain of the site on which your information will appear, you can use site: to search only that site.

    For example, a search for sample tests site:dmv.ca.gov will return only pages located on the California Department of Motor Vehicles (DMV) website (the domain of which is dmv.ca.gov).

    The site: operator works in all major search engines (Google, Bing, Baidu, DuckDuckGo, etc.).
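
    If you build searches in a script, the two filters above amount to appending site: and a domain (or top-level domain) to your search terms. The sketch below uses a hypothetical helper named site_query. The last line shows how the query text would be encoded for use inside a web address, using Python's standard quote_plus function.

```python
from urllib.parse import quote_plus


def site_query(terms: str, site: str) -> str:
    """Append a site: filter to the search terms."""
    return f"{terms} site:{site}"


print(site_query("student loans", "gov"))        # student loans site:gov
print(site_query("sample tests", "dmv.ca.gov"))  # sample tests site:dmv.ca.gov
print(quote_plus(site_query("student loans", "gov")))  # student+loans+site%3Agov
```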

    Practice Identifying Deceptive URLs

    The immediate benefit of the drill below will be to improve your ability to distinguish between real and fraudulent sites, but the exercise will also help you sharpen your overall URL-analysis skills by heightening your attention to the component parts of URLs.

    A) Which of the following are eBay.com web pages? Do not go to the sites. (Some sites masquerading as legitimate sites may contain harmful underlying code.) Just examine the URLs.

    1. http://pages.ebay.com
    2. http://movies.half.ebay.com
    3. http://pages.ebey.com
    4. http://68.112.112.34:8866/ebay.htm
    5. http://signin.ebay.com@10.19.29.2
    6. http://pages.@ebay.com
    7. http://signin-ebay.com
    8. http://www.ebay.com/electronics/ipad
    9. http://www.ebay.deals.com
    10. http://www.ebay.pro
    11. http://www.ebay.com.bb/motors/motorcycles
    12. http://www.ebay.com/itm/A-Planet-of-...-/191063912359

    B) Find the domain name in this URL:

    http://www.bankofamerica.com.sas.sig...0832yhIopOWjos

    Answer

    A) eBay page?

    1. http://pages.ebay.com YES This is an eBay page. The domain name is ebay.com
    2. http://movies.half.ebay.com YES This is an eBay page. The domain name is ebay.com (“movies” and “half” indicate subdomains).
    3. http://pages.ebey.com NO This is not an eBay page. Note that “ebay” is misspelled as ebey.
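    4. http://68.112.112.34:8866/ebay.htm NO This is not an eBay page. The first single slash (/) is preceded by a numeric IP address and port number (68.112.112.34:8866), not by the domain name ebay.com; “ebay.htm” is only a file name.
    5. http://signin.ebay.com@10.19.29.2 NO This is not an eBay page. Notice that there is no slash (/) after “ebay.com.” Everything before the @ is treated as login information, so the address actually points to 10.19.29.2.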
    6. http://pages.@ebay.com NO Do not treat this as an eBay page. The @ symbol cannot be part of a domain name; in a URL it separates login information from the host, so its presence in a link like this is a warning sign. One unexpected character can make all the difference.
    7. http://signin-ebay.com NO This is not an eBay page. The domain name is signin-ebay.com. If the hyphen were a period, “signin” would be a subdomain of ebay.com, but a hyphen is simply part of the name, so this is a different domain.
    8. http://www.ebay.com/electronics/ipad YES This is an eBay page. The domain name is ebay.com.  The first single slash (/) is directly preceded by .ebay.com
    9. http://www.ebay.deals.com NO This is not an eBay page. The domain name is deals.com (not ebay.com).
    10. http://www.ebay.pro NO This is not an eBay page. The domain name is ebay.pro (not ebay.com).
    11. http://www.ebay.com.bb/motors/motorcycles NO This is not an eBay page. The domain name is ebay.com.bb (not ebay.com).
    12. http://www.ebay.com/itm/A-Planet-of-...-/191063912359 YES This is an eBay page. The domain name is ebay.com. The first single slash (/) is directly preceded by .ebay.com
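
    The reasoning in the part A answers can be sketched as a short Python check. The function name is_ebay_page is hypothetical, and the sketch makes one deliberate assumption: any URL with an @ before the host is rejected as suspicious, even where a browser might technically reach the named host. The URLs are copied as they appear in the drill, including the shortened one.

```python
from urllib.parse import urlsplit


def is_ebay_page(url: str) -> bool:
    """Heuristic: True when the host is ebay.com or a subdomain of it."""
    parts = urlsplit(url)
    # Assumption: treat any "@" before the host as a sign of deception.
    if "@" in parts.netloc:
        return False
    host = parts.hostname or ""
    # The dot in ".ebay.com" matters: it rules out signin-ebay.com.
    return host == "ebay.com" or host.endswith(".ebay.com")


quiz = [
    "http://pages.ebay.com",
    "http://movies.half.ebay.com",
    "http://pages.ebey.com",
    "http://68.112.112.34:8866/ebay.htm",
    "http://signin.ebay.com@10.19.29.2",
    "http://pages.@ebay.com",
    "http://signin-ebay.com",
    "http://www.ebay.com/electronics/ipad",
    "http://www.ebay.deals.com",
    "http://www.ebay.pro",
    "http://www.ebay.com.bb/motors/motorcycles",
    "http://www.ebay.com/itm/A-Planet-of-...-/191063912359",
]

for number, url in enumerate(quiz, start=1):
    print(number, "YES" if is_ebay_page(url) else "NO")
```

    Under these assumptions the check answers YES only for items 1, 2, 8, and 12, which agrees with the answer key above.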

    B) The domain name in the following URL is bernadinec.com (not bankofamerica.com). Notice that bernadinec.com is what appears just before the first single slash (/):

    http://www.bankofamerica.com.sas.sig...0832yhIopOWjos


    8.4: Understanding URLs is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.