WEB SCRAPING WITH PYTHON: https://campus.datacamp.com/courses/web-scraping-with-python

November 04, 2024

Consider the HTML code:

<html>
  <body>
    <div>
      <p>Good Luck!</p>
      <p>Not here...</p>
    </div>
    <div>
      <p>Where am I?</p>
    </div>
  </body>
</html>

It's Time to P

The lecture showed that a double forward slash (//) in an XPath string selects matching elements at every level below the current position: children, grandchildren, and all later generations. In this exercise, you will select all paragraph (p) elements within the HTML. Because the goal is to reach every p element, you do not need to know how the HTML is structured; a simple XPath string using the double forward-slash notation is enough, namely '//p'.
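To see the double forward-slash selection in action outside the course environment, here is a minimal sketch. It assumes Python's standard-library xml.etree.ElementTree as a stand-in for the course's own tooling; that module supports only a subset of XPath and expects paths relative to an element, so the course's '//p' is written as './/p'.

```python
import xml.etree.ElementTree as ET

# The HTML from the top of these notes; it is well-formed,
# so an XML parser can read it directly.
html = """
<html>
  <body>
    <div>
      <p>Good Luck!</p>
      <p>Not here...</p>
    </div>
    <div>
      <p>Where am I?</p>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(html)

# './/p' is ElementTree's relative form of '//p':
# every p element at any depth below the root.
paragraphs = [p.text for p in root.findall(".//p")]
print(paragraphs)  # ['Good Luck!', 'Not here...', 'Where am I?']
```

All three paragraphs are returned, even though they sit inside different div elements.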

A classy span

Although we haven't yet gone deep into XPath, one thing we can already do is select elements by their attributes. For example, to select the div element within the HTML document whose id attribute is "uid", we could write the XPath string '//div[@id="uid"]'. The first part of this string, //div, matches all div elements in the HTML document. The brackets then narrow the match to only the div element with a specific id attribute (in this case, uid). The phrase @id="uid" inside the brackets is read as "attribute id equals uid".

In this exercise, you will select all span elements whose class attribute equals "span-class". (Note: span is just another tag name.)
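The attribute syntax can be tried with a short sketch. The HTML below is hypothetical (invented for illustration, not taken from the course), and xml.etree.ElementTree is again assumed as a stand-in, so each '//' from the course is written as './/'.

```python
import xml.etree.ElementTree as ET

# Hypothetical HTML, made up for this example only.
html = """
<html>
  <body>
    <div id="uid"><span class="span-class">first</span></div>
    <div id="other">
      <span class="span-class">second</span>
      <span class="plain">third</span>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(html)

# Course form: //div[@id="uid"]
divs = root.findall('.//div[@id="uid"]')

# Course form: //span[@class="span-class"]
spans = root.findall('.//span[@class="span-class"]')
span_texts = [s.text for s in spans]

print(len(divs), span_texts)  # 1 ['first', 'second']
```

Only the spans whose class attribute is exactly "span-class" are selected; the span with class "plain" is skipped.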

Fundamental techniques in computational web scraping

Uses of Web Scraping

· Web scraping can be useful for looking through product reviews to gauge public opinion about a particular product.

· Web scraping can be useful for reading through social media posts between users in different areas to compare different language usage.

· Web scraping can be useful for going through online news publications to pick out articles discussing a particular topic.

Choose DataCamp!

In this exercise, you get the opportunity to create your own XPath string. The task is to select the paragraph element containing the text "Choose DataCamp!".

Consider the following HTML:

<html>
  <body>
    <div>
      <p>Hello World!</p>
      <div>
        <p>Choose DataCamp!</p>
      </div>
    </div>
    <div>
      <p>Thanks for Watching!</p>
    </div>
  </body>
</html>
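These notes do not record the submitted answer, so the following is only a sketch of two XPath strings that reach the target paragraph: '/html/body/div[1]/div/p' (step by step from the root) and '//div/div/p' (a p inside a div inside a div). It assumes xml.etree.ElementTree as a stand-in for the course tooling, so both are rewritten as paths relative to the html root element.

```python
import xml.etree.ElementTree as ET

html = """
<html>
  <body>
    <div>
      <p>Hello World!</p>
      <div>
        <p>Choose DataCamp!</p>
      </div>
    </div>
    <div>
      <p>Thanks for Watching!</p>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(html)  # root is the html element

# Relative form of /html/body/div[1]/div/p:
# first div under body, then its child div, then that div's p.
by_steps = [p.text for p in root.findall("./body/div[1]/div/p")]

# Relative form of //div/div/p:
# any p whose parent is a div that is itself the child of a div.
by_nesting = [p.text for p in root.findall(".//div/div/p")]

print(by_steps, by_nesting)  # ['Choose DataCamp!'] ['Choose DataCamp!']
```

Both paths land on the same single paragraph because only one p is nested two div levels deep.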

Where it's @

In this exercise, you'll begin to write XPath strings that use attributes. The task is to select the paragraph element containing the text "Thanks for Watching!". Most of the XPath string has already been created for you.

Consider the following HTML:

<html>
  <body>
    <div id="div1" class="class-1">
      <p class="class-1 class-2">Hello World!</p>
      <div id="div2">
        <p id="p2" class="class-2">Choose DataCamp!</p>
      </div>
    </div>
    <div id="div3" class="class-2">
      <p class="class-2">Thanks for Watching!</p>
    </div>
  </body>
</html>

# Create an XPath string to select the desired p element
xpath = '//*[@id="div3"]/p'

# Print out the selection text
# (print_element_text is a helper from the course exercise, not defined in these notes)
print_element_text(xpath)
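Since print_element_text is not available outside the exercise, here is a minimal sketch that checks the same XPath with xml.etree.ElementTree, assumed as a stand-in for the course helper. The course string '//*[@id="div3"]/p' becomes './/*[@id="div3"]/p' in ElementTree's relative form.

```python
import xml.etree.ElementTree as ET

html = """
<html>
  <body>
    <div id="div1" class="class-1">
      <p class="class-1 class-2">Hello World!</p>
      <div id="div2">
        <p id="p2" class="class-2">Choose DataCamp!</p>
      </div>
    </div>
    <div id="div3" class="class-2">
      <p class="class-2">Thanks for Watching!</p>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(html)

# '*' matches any tag name; the predicate keeps only the element
# whose id is "div3"; '/p' then steps to its p children.
selected = [p.text for p in root.findall('.//*[@id="div3"]/p')]
print(selected)  # ['Thanks for Watching!']
```

Using the wildcard with an id predicate avoids having to know the tag name of the element that carries the id.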

# Create an XPath string to select the p element by class
xpath = '//p[@class="class-1 class-2"]'

# Print out the selection text
print_element_text(xpath)

<script.py> output: Hello World!
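The output above shows only "Hello World!" because @class="..." compares the whole attribute string, not individual class names. A minimal sketch of that behavior, again assuming xml.etree.ElementTree as a stand-in for the course helper:

```python
import xml.etree.ElementTree as ET

html = """
<html>
  <body>
    <div id="div1" class="class-1">
      <p class="class-1 class-2">Hello World!</p>
      <div id="div2">
        <p id="p2" class="class-2">Choose DataCamp!</p>
      </div>
    </div>
    <div id="div3" class="class-2">
      <p class="class-2">Thanks for Watching!</p>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(html)

# Exact match on the full attribute value "class-1 class-2".
both = [p.text for p in root.findall('.//p[@class="class-1 class-2"]')]

# Exact match on "class-2" alone: the first p is NOT selected,
# because its attribute value is the longer string "class-1 class-2".
only_two = [p.text for p in root.findall('.//p[@class="class-2"]')]

print(both)      # ['Hello World!']
print(only_two)  # ['Choose DataCamp!', 'Thanks for Watching!']
```

So an equality test on class is a string comparison; an element that merely includes class-2 among several classes does not match '@class="class-2"'.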
