I’ve been trying for a few days to create an automatic scraper that triggers when a website changes. I use it to scrape new public tenders that are added to a website. They are usually added once every 2 days, but sometimes every day, and they are not all uploaded at the same time.
Once they are scraped, the Bardeen automation puts the relevant information (name, email, description…) into a Google Sheet.
Here is my issue: every time the automation triggers (i.e. when the website changes), the scraper scrapes the last 5 public tenders. But sometimes there are fewer than 5 new public tenders, so some tenders that were already scraped get scraped again, and I therefore end up with duplicates in my Google Sheet. I have tried to create a condition to avoid that, but it doesn’t seem to work.
You can find attached the screenshots of the automation.
Thanks for submitting this query. Our engineering team is looking into this; however, the ‘When website data changes’ action is meant to remember the last scraped result. Could you try removing the limit of 5 items to scrape?
As a temporary workaround, Google Sheets has a UNIQUE function that returns only the unique values in a data set. You can then add VLOOKUPs to pull the other columns of data for each unique value. I’m assuming that in your case each tender has a unique ID or name of some sort. Here is a quick video showing how you might set this up:
Customer Support - bardeen.ai | Bardeen Community
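For example (just a sketch, assuming the scraped rows land on a sheet named “Raw” with the tender name in column A and the email in column B; adjust the sheet name, ranges, and column indexes to match your own layout), you could build a deduplicated view on a second sheet like this:

In A2: =UNIQUE(Raw!A2:A) — lists each tender name once, dropping the duplicates
In B2: =VLOOKUP(A2, Raw!A:B, 2, FALSE) — pulls the email for that tender; copy the formula down and change the column index (2, 3, …) to bring in the other fields

This doesn’t stop the duplicates from being written, but it gives you a clean, duplicate-free sheet to work from until the trigger behaviour is sorted out.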