Built a web scraping tool that scraped 400+ pages and then used ChatGPT to update it

Hi All,

Wanted to share something interesting I built with Bardeen + ChatGPT. I work in marketing and my employer shared a website of around 22500 companies (400+ pages) he wanted to have a team in India scrape and manually update missing information. The project would have cost north of $1-2k.

I figured I would have a go at it using Bardeen’s web scraping → Google Sheets. Thanks to Deyan Petrov’s help I was able to build the scraper and scraped 400+ pages and roughly 22500 records. It was actually super easy and took about an hour to scrape everything.

Then, once I had my list in Google Sheets, I used an extension called chatGPT for sheets to find rows that had incomplete information (i.e. missing the website) and do a web search for the best website that matched the company name. The results were pretty incredible.

Love Bardeen and can’t wait to discover more use-cases that would be helpful for marketers. If anyone has any questions about how I built this, I’d be happy to share details.

-Darshan

5 Likes

I would like to take some of the credit for that :smiley:

1 Like

Hi Darsh! Thanks for sharing your story, it’s pretty impressive!

Do you have any screenshots or automations to share on this story?

My curiosity is piqued! Would be cool to see what that spreadsheet looks like!

Also, what’s the extension for GPT for sheets? How was the experience compared to using AI directly in Bardeen?

“The project would have cost north of $1-2k.” My instant thought was: YOU SHOULD CHARGE easily 20%-30% of that and put that money into your pocket :wink: You still saved the company you work for at least $1,000, so you should get some bonus, in my opinion.

Besides that, CONGRATULATIONS :star_struck:

1 Like

@ivan - sorry for the delay. Here’s a screenshot of the spreadsheet. The extension, GPT for sheets, allowed me to bulk web search.

Bardeen helped me scrape the firm name, website and address off this site: Firm Directory - AIA

22,000+ records later, as I looked through the data, I saw that many websites were missing because the firms hadn’t shared their website. I then used GPT for sheets with a prompt along the lines of “find this companies website” and ran the prompt for all of the companies that were missing their website.
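
For anyone who wants to reproduce the "find the rows with no website, then run one prompt per row" step outside of a spreadsheet extension, here is a minimal Python sketch. It is not the GPT for sheets extension or anything Bardeen ships; the column names (`firm_name`, `website`, `address`), the sample rows, and the prompt wording are all made up for illustration, and sending the prompts to a model is left out.

```python
from typing import Dict, List

Row = Dict[str, str]


def rows_missing_website(rows: List[Row]) -> List[Row]:
    """Return the rows whose 'website' cell is absent, empty, or only whitespace."""
    return [r for r in rows if not (r.get("website") or "").strip()]


def build_prompt(row: Row) -> str:
    """Build one lookup prompt per row (hypothetical wording, adjust to taste)."""
    return (
        f'Find the official website for the company "{row["firm_name"]}" '
        f'located at {row["address"]}. Reply with the URL only.'
    )


# Hypothetical sample data standing in for the scraped sheet.
rows = [
    {"firm_name": "Example Architects", "website": "https://example.com", "address": "1 Main St"},
    {"firm_name": "Sample Design Studio", "website": "", "address": "2 Oak Ave"},
    {"firm_name": "Placeholder Partners", "website": "   ", "address": "3 Elm Rd"},
]

missing = rows_missing_website(rows)
prompts = [build_prompt(r) for r in missing]

for p in prompts:
    print(p)
```

Filtering first keeps the number of prompts down to only the incomplete rows, which matters when the sheet has 22,000+ records.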

I am curious how the same would have been done with Bardeen. GPT for sheets is capped at knowledge of up to 2021 and cannot web search. Perhaps there is a way to use Bardeen to access GPT-4 with plugins somehow?

2 Likes

Yes, you can use the GPT-4 model via Bardeen directly! So you can process the scraped data with a prompt and get the output in a new column :smile: :metal:

Another option is to leverage our “Get HTML from page” + “Get text from url” combo, so you can get data from a company website, then extract what you need using the AI actions.

For instance:
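
As a rough illustration of the general idea (take a page's HTML, reduce it to plain text, then hand that text to an AI step), here is a small Python sketch using only the standard library. It is not how Bardeen's actions are implemented; the `TextExtractor` class, the sample HTML, and the prompt wording are assumptions for the example, and fetching the page and calling a model are left out.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from HTML, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the page's visible text as one space-separated string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page content standing in for a company website.
sample_html = """<html><head><title>Example Architects</title>
<style>body{color:red}</style></head>
<body><h1>Example Architects</h1><p>Contact: hello@example.com</p>
<script>var x = 1;</script></body></html>"""

text = html_to_text(sample_html)
prompt = f"Extract the contact email from this page text:\n{text}"
print(prompt)
```

Stripping scripts and styles before building the prompt keeps the text short, so the AI step only sees content a visitor would actually read.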

Nice job. Very Clever

1 Like

Hi Dwayne!
New here :slight_smile: Thanks for helping other people in the community. I would love to learn how to do that. Where should I look to get started learning to do something like that? Is there a resource inside Bardeen that I should look into first? Thank you