Help getting a basic scraper set up

Hello everyone,

I’m working on a project where I need to aggregate text information from a series of hyperlinks. Specifically, I’m looking at the Hypercerts Docs webpage: Hello from Hypercerts | Hypercerts. This page contains several sections, each with multiple links. My goal is to scrape text from about 10-16 of these hyperlinks.

After scraping, I want to use the gathered data to query ChatGPT for insights and summaries. I’m still new to Bardeen and would greatly appreciate any guidance or tips on setting up a scraper for this specific use case.

First I select List, then click a few of the links, then choose no pagination. That leads me to the first place where I’m not sure what to do:

Here I tried checking both Click and Get Text, which then takes me here:

If I try to create a new scraper template, that seems to send me in a circle to start the process over.

@Jess any help you could provide would be greatly appreciated!

Hi @charrison, happy to take a look at this for you!

It appears you are trying to use the magic box to create the automation for you, but since this is pretty specific, I recommend creating it on your own.

I’ve created the below playbook for this use case:

Notes:

  • You must be on the https://hypercerts.org/docs/ page when executing the playbook automation within Bardeen.
  • Essentially we are grabbing all of the links, getting their HTML, and converting that into text so ChatGPT can generate a summary for each one, then writing the results into a Google Sheet.
  • Here’s the Public GSheet to see what the result will look like: @Charrison - Google Sheets
  • Feel free to edit the prompt if there is something else you are looking for from ChatGPT.

As this is a lot of text for OpenAI to summarize, it will take a little while. It might be better to run this automation once per link instead: Get a summary of the current page using OpenAI
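For anyone curious what those steps look like outside Bardeen, here is a minimal Python sketch of the same idea: collect the links from a page's HTML, then reduce a page's HTML to plain text. This is not how Bardeen implements its actions. The names (`LinkCollector`, `TextCollector`, `extract_links`, `html_to_text`) are made up for illustration, and it uses only Python's standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors without a usable href
                self.links.append(urljoin(self.base_url, href))


class TextCollector(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def extract_links(html, base_url):
    """Return absolute URLs for all links found in the given HTML."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def html_to_text(html):
    """Return the text content of the HTML, one text fragment per line."""
    parser = TextCollector()
    parser.feed(html)
    return "\n".join(parser.parts)
```

Downloading the pages is deliberately left out here; you would pass in whatever HTML you have already fetched.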

I hope this helps!

Please let me know if you have any questions or need more information.
Thank you,
Jess

Wow, thank you so much for doing that!!! As I look through what you built, a lot of it doesn’t make sense to me, and I don’t think I’ll be able to replicate it myself the next time I encounter a similar situation. I hate to ask you to do it again, but is there any chance you could take a screen capture? If I could see the whole process I think I’d understand it a LOT more.

As I thought about it, I also realized I don’t need it to have this many steps either. The time-consuming part for me to do manually is clicking each of the links and copy-pasting all the text into a document. That’s really all I need; I can easily take that content and manually put it into ChatGPT, which only takes a second.

I also wonder if it would be possible for the data to go into a .txt document instead of a spreadsheet. When I went to upload the spreadsheet into ChatGPT, it seemed to have trouble reading all the text in the second column, but when I copied all of that into a plain text document I was able to ask questions of the content from all those pages, which was the initial goal.
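As a rough illustration of the plain-text idea, here is a small Python sketch. It assumes the page text has already been collected into a dictionary keyed by URL. The function name `write_combined_text` and the `=== url ===` header format are hypothetical choices, not anything Bardeen produces.

```python
from pathlib import Path


def write_combined_text(pages, out_path):
    """Write every page's text into one UTF-8 .txt file, each under a URL header."""
    sections = []
    for url, text in pages.items():
        # A header line per page makes it easy to tell the sources apart later.
        sections.append(f"=== {url} ===\n{text.strip()}\n")
    path = Path(out_path)
    path.write_text("\n".join(sections), encoding="utf-8")
    return path
```

Putting everything in a single file means there is only one document to paste or upload afterwards.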

Even though I don’t need the ChatGPT step in the automation, I’m curious: how do I edit the ChatGPT prompt, for future reference? I don’t know if Bardeen Ambassador is a paid position or not, but I’d be happy to compensate you for your time.

Also, I was just reading this tutorial and I’m wondering if what you built qualifies as a ‘deep scraper’? It seems like it, based on this: “This is usually done when scraping search results and then going through every page on that list to extract additional data.”

Hi @charrison,

I’ll try to get you a video by the end of the week that explains the automation further, and I’ll rebuild it per your new request: without the ChatGPT step, and writing to a Google Doc instead. Your use case is atypical of how an automation is normally built, so please keep that in mind.

  • On the deep scraper question: kind of (conceptually yes), but not technically. A deep scraper involves two or more scraping actions, like first scraping a list of LinkedIn job URLs and then scraping further details about each job from those URLs, such as title, salary, etc. In this automation we only use one scraping action, “scrape data on active tab”, to grab the links first. Then we use a couple more Bardeen actions to convert the HTML of each linked page into text.
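To make the two-stage pattern concrete, here is a minimal Python sketch of a deep scrape: stage 1 scrapes a list page for links, and stage 2 visits each link and scrapes one detail from it (here, the page `<title>`). It is only an illustration of the concept, not Bardeen's implementation. The names `deep_scrape` and `fetch_html`, the `fetch` and `limit` parameters, and the UTF-8 decoding are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class _Links(HTMLParser):
    """Stage 1 scraper: collects raw href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href:
            self.hrefs.append(href)


class _Title(HTMLParser):
    """Stage 2 scraper: collects the text inside <title>."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch_html(url):
    """Default fetcher: download a page and decode it as UTF-8 (an assumption)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def deep_scrape(list_url, fetch=fetch_html, limit=16):
    # Stage 1: scrape the list page for links, dropping duplicates.
    links = _Links()
    links.feed(fetch(list_url))
    urls = []
    for href in links.hrefs:
        url = urljoin(list_url, href)
        if url not in urls:
            urls.append(url)
    # Stage 2: visit each link and scrape a detail from that page.
    rows = []
    for url in urls[:limit]:
        detail = _Title()
        detail.feed(fetch(url))
        rows.append({"url": url, "title": detail.title.strip()})
    return rows
```

Passing `fetch` in as a parameter makes it possible to try the logic on saved HTML without touching the network. A real deep scraper would pull out more fields in stage 2, such as the job title and salary in the LinkedIn example.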

Compensation would be greatly appreciated. Bardeen Ambassadors do not work for Bardeen but are representatives of the brand, so this is something extra I’m doing outside of my full-time job.
Here’s more about me if you care to learn:

I hope this helps!
Thank you,
Jess

Thanks Jess !

Hey Cody, welcome to the community.

I would like to invite you to join our recurring happy hours (every Tuesday and Thursday) and connect live with the Bardeen Team and fellow builders as we help you get started with Bardeen and activate your first automations.

We open this space to everyone in our community.

What can you expect:

  • An onboarding to Bardeen
  • Help activating your first automations
  • Explore use-cases to improve sales in your business
  • Q+A with Bardeen team

Duration: 1 hour

Book your session here: Bardeen Weekly Onboarding Sessions

We also have a series of resources on our YouTube channel that take you step by step through creating your first playbook. One of my favorites is the Ultimate Scrape Tutorial.


Hi @charrison,

Per your request, please find the walkthrough video of this use case below:

If you need this video for future reference, please download it as it expires in 7 days.
@Charrison - Scrape Hypersorts.org links into .txt File

Here’s the updated automation:

If you’ve found the help valuable and feel like showing some appreciation, here’s my PayPal.

As always, please let me know if you have any questions or need more information.
Thank you,
Jess

Thanks Vinohar! Signing up now

Thank you so much Jess! I watched the video you made, which was definitely helpful. I tried to run the automation myself, but only one .txt doc actually downloaded to my computer even though it says they all did.

Also, I sent you a bit on PayPal and would be interested to chat about an ongoing consultation if you’re open to it, like some Bardeen tutoring basically? Let me know what you would want to charge hourly if you are open to it, and I’ll see if I can budget it!

I’m glad it was helpful! Thank you for the PayPal Cody, I appreciate it 🙂

  • I think you just have to hit that “Full Download History” button in your screenshot above to be able to see all the files.
  • Wonderful, I’d be happy to help teach you Bardeen 1:1. Let’s chat via DM to discuss further details, and I’m sure we can work something out.

Thank you,
Jess

I’m pretty sure it only downloaded that one. I just searched my computer to make sure, and I can’t find any others.

Glad to hear you’re open to it! I’ll shoot you a DM and we can continue chatting about it 🙂

Wait can you send DMs on here?

It appears they turned off this feature in the Discourse community. Is there a preferred DM platform you’d like to chat on?

Okay weird, and you checked your downloads folder?
