Investigating the range of selectors offered by Cheerio
Web scraping with Cheerio uses selectors to specify the data we want. When we pulled the h2 tags from the page in the previous activity, we created a selector:
// Search for the elements we want
selection = $('h2')
The selector could then be used to pull the data from the page:
// Add the elements to the list
selection.each((i, el) => {
  // Declare text locally so it does not leak into the global scope
  const text = $(el).text()
  results.push({ country: text })
})
Cheerio has a number of selectors and selector patterns for different scenarios.
Here is a list of examples of the most common selector scenarios:
Selector | Action |
---|---|
$('*') | Select all elements on the page |
$('div') | Select all div elements on the page |
$('h1, h2') | Select all h1 and h2 elements on the page |
$('div > p') | Select all p elements directly under a div (children) |
$('div p') | Select all p elements under a div, either directly or indirectly (find) |
$('#xyz') | Select the element with an id of xyz |
$('.pqr') | Select all elements with the CSS class (style) pqr |
$('title') | Select the title element |
There are also some functions that we can apply to a selector:
Function | Purpose |
---|---|
selector.children() | Select all elements directly under the selector |
selector.children('p') | Select all p elements directly under the selector |
selector.find('tr') | Select all tr elements directly or indirectly under the selector |
selector.first() | Reduce the selection to its first element |
selector.last() | Reduce the selection to its last element |
selector.each(fn) | Loop through all elements in the selection and apply the function fn |
On the next page we will explore these selectors and functions.