Duplicate rows when scraping Instagram Reels into Notion

Hi,

I have an Instagram Reels scraper that works fine, but my Autobook into Notion duplicates a lot of rows. The duplicates don't appear in the scraping step when I look at the results, so it might be Notion. They also seem to happen more on profiles with more posts: 20 posts usually works fine, but 80 posts produces around 30% duplicates.

Because of this, I want to create an Autobook that “finds” the Notion database and then either:

  • If the reel doesn’t exist yet, adds it
  • OR if it does exist, skips it
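
The add-or-skip check described above can be sketched outside Bardeen as a plain set lookup. This is only an illustration in Python, not how Bardeen implements it; `select_new_reels`, `existing`, `scraped`, and the `EXAMPLE-1` URL are hypothetical names made up for the example.

```python
def select_new_reels(scraped_urls, existing_urls):
    """Return scraped URLs not already stored, in order, without repeats."""
    seen = set(existing_urls)  # URLs already present in the database
    new_urls = []
    for url in scraped_urls:
        if url in seen:
            continue  # already exists: skip it
        seen.add(url)  # also guards against repeats within the same run
        new_urls.append(url)
    return new_urls


# Hypothetical data: one reel already stored, one new reel scraped twice.
existing = ["https://www.instagram.com/reel/C7ZibJ-NO-0/"]
scraped = [
    "https://www.instagram.com/reel/C7ZibJ-NO-0/",
    "https://www.instagram.com/reel/EXAMPLE-1/",
    "https://www.instagram.com/reel/EXAMPLE-1/",
]
print(select_new_reels(scraped, existing))
```

Adding each accepted URL to `seen` means a reel that shows up twice in one scrape is only added once.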

I’ve tried lots of different things over the last few hours, including a condition, but the condition functionality doesn’t seem to work at all. I tried running it while only swapping the “Yes” and “No” and changing nothing else, and everything still went through the filter both times.

I’ve tried using the AI to assist in creating different flows, but nothing actually works.

Thanks

Hi Lars,

Normally a combination of “Find Notion pages” and a condition should work well for this case. I think there may be a misconfiguration in your playbook. Would you mind sharing it with me?

Also, what counts as a duplicate in this case? Are those identical reels with the same author and description, or maybe the same video posted by many people?

Looking forward to your reply.
Victoria

Customer Support - bardeen.ai
Knowledge Base https://www.bardeen.ai/tutorials
Explore | @bardeenai | Bardeen Community

Hi, thanks for the quick response!

For the definition of the duplicates: it’s simply the URL to the reel.

In this case, in the Notion database, it’s the Title/Heading column I’ve named “Reel”, which is this URL (first column in the attached image). For example: https://www.instagram.com/reel/C7ZibJ-NO-0/

(I have another “Reel URL” property in the Notion database, but it’s only been used for testing)

I’m trying to match the “Reel” Notion property (which is the URL I explained above) with the “Post URL” from the Bardeen scraper for the duplicates.

Both of these work well in isolation and produce the correct results.

Autobook link: https://www.bardeen.ai/playbook/community/2024-Lars-IG-Reels-to-Notion-A5FPk1eXY9Ae32rHFO

Tried swapping places of operations #2 and #3 (scraper and Find Notion pages)

Tried swapping the conditional statement from “contains” & NO to “does not contain” & YES

Tried using the same conditional statement on both YES and NO with same flow, and it had the same results

Usually used “Commands” instead of “Text” in the conditional statement but tried a mix of both, to no avail

Just set it up again now with “Find Notion pages” and the Condition, and when I run it, it still adds duplicates.

Thanks, Lars

Hi Lars,

I’ve created this playbook for you: https://www.bardeen.ai/playbook/community/2024-Lars-IG-Reels-to-Notion-copy111-nlxaqTCfeLkC9ZSujV

Please make sure to connect it to your Notion database before testing. After a series of tests, this is the approach I ended up with:

  1. Scrape Instagram using your scraper
  2. Find Notion pages with no filters
  3. Merge text of “Find Notion pages”; otherwise it iterates over the scraped items as well as the Notion pages, and that creates a mix-up
  4. Conditional statement checking whether the merged text does not contain the Reel URL from the scraper
  5. Create Notion page
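
As a rough illustration of steps 3 to 5, the merge-then-check idea can be sketched in Python. This is not Bardeen's implementation; `reels_to_create`, `notion_titles`, `scraped_urls`, and the `EXAMPLE-2` URL are hypothetical names made up for the example.

```python
def reels_to_create(scraped_urls, notion_titles):
    """Keep only scraped URLs that do not appear in the merged Notion text."""
    # Step 3: merge all existing "Reel" values into one block of text,
    # so each scraped item is checked once against everything.
    merged_text = "\n".join(notion_titles)
    # Step 4: condition "merged text does not contain the Reel URL".
    return [url for url in scraped_urls if url not in merged_text]


# Hypothetical data: one reel already in Notion, one new reel.
notion_titles = ["https://www.instagram.com/reel/C7ZibJ-NO-0/"]
scraped_urls = [
    "https://www.instagram.com/reel/C7ZibJ-NO-0/",
    "https://www.instagram.com/reel/EXAMPLE-2/",
]

# Step 5: these are the URLs a page would be created for.
print(reels_to_create(scraped_urls, notion_titles))
```

In this sketch the merged text is built once, before the check, so it only reflects pages that existed when the run started. The check is also a substring match, so it relies on the stored value and the scraped URL being written identically.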

I hope this makes sense. But if you have questions, please let me know.

Thank you,
Victoria

Hi again, thanks for making a new playbook.

Just tested it with this link: Shared Playbook Template

I think/hope it’s set up correctly like yours, but it still adds duplicates to the final database. 87 reels became 134 entries, and 95 reels from another profile became 214 entries.

Also, I’m not sure if I really need the automation, because the Notion databases get so large with hundreds of reels and they start being slow to respond, etc.

I can do what I need to do without the automation anyway, and I already have and use a lot of other Bardeen automations :smiley:

Hi Lars,

What if you try with a new database to make sure there are no existing duplicates? I checked your playbook and can confirm that it should work the same as mine. I also tested mine once again and did not get any duplicates.

Also, could you try restarting Bardeen in case something got cached?

Thank you,
Victoria