Go to the web page you’d like to scrape. Launch Bardeen and navigate to the Compose tab. Type the following command “Do create new scraper model.”
Type “Get data using scraper model” to extract the data from the current web page. Use the “on url” variable to scrape data in the background.
Add this data to Sheets, Airtable, or Notion with the DO command. For example: “Do append to google sheet [sheet] rows from table [last table].”
Click on “New Playbook” from the Compose mode and further configure your playbook.
Yes. Websites get updated periodically. Small changes rarely affect scraper models, but more significant changes break the element mapping.
If you’d like to scrape a list spread out across multiple pages, use pagination. There are two options: infinite scroll and click pagination.
Websites such as Facebook or Instagram load new list elements when you scroll to the bottom of the page, so pick “infinite scroll” for them.
Other websites, such as Google or LinkedIn, require you to click a button to go to the next page. For those sites, pick “click pagination” and use the visual element selector on the element that takes you to the next page (not a link to a specific page).
You can specify how many pages you want to scrape with the pagination limit in the “Get data using scraper model” command.
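To make the pagination limit concrete, here is a conceptual sketch of how a click-pagination loop bounded by a page limit behaves. This is not Bardeen’s actual code; `fetch_page` and `find_next` are hypothetical stand-ins for “scrape the current page” and “a next-page element exists.”

```python
# Conceptual sketch (not Bardeen's implementation): a click-pagination
# loop that stops at `limit` pages or when no next-page element remains.

def scrape_with_pagination(fetch_page, find_next, limit):
    """Collect rows page by page, up to `limit` pages."""
    rows, page = [], 0
    while page < limit:
        rows.extend(fetch_page(page))  # scrape the current page
        page += 1
        if not find_next(page):        # no "next" button -> last page
            break
    return rows

# Simulated 3-page site with a limit of 2: only two pages are scraped.
pages = [["a1", "a2"], ["b1"], ["c1"]]
result = scrape_with_pagination(
    fetch_page=lambda i: pages[i],
    find_next=lambda i: i < len(pages),
    limit=2,
)
# result == ["a1", "a2", "b1"]
```

The same loop models infinite scroll if “click the next-page element” is replaced with “scroll to the bottom and wait for new elements to load.”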
When you scrape a page, the results are returned as a table. The column names in Airtable or Notion need to match the returned table exactly, so that Bardeen knows how to map the data to the right columns.
For example, when you create a new scraper model, you are asked to name each element that you want to scrape. If you give an element the name “Column A,” then Airtable must also have a column named “Column A” (case-sensitive). If Airtable doesn’t have a column with the exact same name, that data will not be imported.
Note that this restriction does not apply to Google Sheets.
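The matching behavior described above can be sketched in a few lines. This is a conceptual illustration, not Bardeen’s actual logic: fields whose names don’t exactly match a destination column (including case) are simply dropped.

```python
# Conceptual sketch (not Bardeen's implementation): map scraped rows to a
# destination table by exact, case-sensitive column names. Fields with no
# exact match in the destination are dropped, mirroring the Airtable/Notion
# behavior described above.

def map_rows(scraped_rows, destination_columns):
    """Keep only fields whose names exactly match a destination column."""
    dest = set(destination_columns)
    return [{k: v for k, v in row.items() if k in dest} for row in scraped_rows]

scraped = [{"Column A": "Acme", "column b": "acme.com"}]
mapped = map_rows(scraped, ["Column A", "Column B"])
# "column b" is dropped: it does not match "Column B" case-sensitively,
# so mapped == [{"Column A": "Acme"}]
```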