So! For example, if I want to scrape this page -
Game Developer Jobs - GameJobs.co - and create a record for each listing, how do I also open each item's URL, extract another piece of data from each separate page, and then sync it back with the main list? For example:
Ubisoft · Montréal, Quebec · 6 days ago
I’d want to get that info, but also open up the first link -
Narrative Director at Ubisoft - and get the ‘Apply’ button’s URL on that page - Ubisoft Narrative Director | SmartRecruiters - and I’d want that URL added to my original scraped list from the main jobs page.
I started creating this below, but I was stuck on how to scrape the URLs pulled from the first list as part of my flow, and I’m also not sure how to connect it all up once the scraping is done.
(as a bonus question - is there a way to make it so our flows only add new/unique entries, and don’t add entries if they find duplicates?)
Welcome to Bardeen
This use-case seems like the standard case for “deep scraping”.
You get the links with a list scraper on the first page
Then you scrape data from all the jobs in the background
Here’s the same use-case for scraping jobs out of a search on LinkedIn:
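Bardeen does all of this visually, but if it helps to see the two steps spelled out, here’s a minimal conceptual sketch in Python of what a deep scrape does: collect links from the list page, visit each job’s own page for the Apply URL, and merge it back into the original record. All the selectors, field names, and page contents here are made-up stand-ins, not Bardeen internals.

```python
# Deep-scraping sketch: step 1 scrapes the list page, step 2 scrapes
# each job's detail page, and the results are merged into one list.
# The HTML patterns and URLs below are hypothetical examples.
import re

def scrape_job_list(list_html):
    """Step 1: pull (title, url) records from the jobs list page."""
    return [
        {"title": title, "url": url}
        for url, title in re.findall(
            r'<a class="job" href="([^"]+)">([^<]+)</a>', list_html
        )
    ]

def scrape_apply_url(job_html):
    """Step 2: on a single job's page, grab the Apply button's link."""
    m = re.search(r'<a class="apply" href="([^"]+)">', job_html)
    return m.group(1) if m else None

def deep_scrape(list_html, fetch):
    """Run step 1, then step 2 per job, syncing both into one record."""
    jobs = scrape_job_list(list_html)
    for job in jobs:
        job["apply_url"] = scrape_apply_url(fetch(job["url"]))
    return jobs

# Demo with fake pages standing in for the list page and a job page:
pages = {
    "/jobs/narrative-director":
        '<a class="apply" href="https://smartrecruiters.example/apply/123">Apply</a>',
}
list_page = '<a class="job" href="/jobs/narrative-director">Narrative Director at Ubisoft</a>'
result = deep_scrape(list_page, pages.get)
print(result)
```

In a real flow the `fetch` step is the “scrape in the background” part: the list scraper produces the URLs, and the background scraper opens each one and hands back the extra field.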
Yep, using the “Update” commands, like this one: “Update Google Sheet rows”
First, let’s clarify the difference between “Update Google Sheet rows” and “Update Google Sheet tab rows”
Update Google Sheet rows - This action will update rows in a Google Sheet without requiring you to select a specific tab. It will always target the sheet tab that is in the first position. If you move the sheet tab away from that position, you may encounter an error when running your automation. This option is ideal for single-tab sheets or when you want the output to go to a …
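For the bonus question about only adding new/unique entries: the idea is to compare each scraped row against what’s already in the sheet on some key (the job URL works well) and skip matches. Here’s a small conceptual sketch of that dedup check; the field names and rows are hypothetical, not Bardeen’s actual data model.

```python
# Dedup sketch: only append rows whose key isn't already in the sheet.
def add_unique(existing_rows, new_rows, key="url"):
    seen = {row[key] for row in existing_rows}  # keys already in the sheet
    added = []
    for row in new_rows:
        if row[key] not in seen:
            seen.add(row[key])  # also guards against dupes within new_rows
            added.append(row)
    return existing_rows + added

sheet = [{"url": "/jobs/a", "title": "Job A"}]
scraped = [
    {"url": "/jobs/a", "title": "Job A"},  # duplicate, gets skipped
    {"url": "/jobs/b", "title": "Job B"},  # new, gets added
]
sheet = add_unique(sheet, scraped)
print(len(sheet))
```

Using the job URL as the key is usually safer than the title, since two companies can post identically titled roles.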
This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.