Built a web scraping tool that scraped 400+ pages and then used ChatGPT to update it

Hi All,

Wanted to share something interesting I built with Bardeen + ChatGPT. I work in marketing, and my employer shared a website listing around 22,500 companies (400+ pages). He wanted to have a team in India scrape it and manually fill in the missing information. The project would have cost north of $1-2k.

I figured I would have a go at it using Bardeen’s web scraping → Google Sheets. Thanks to Deyan Petrov’s help I was able to build the scraper and scraped 400+ pages and roughly 22,500 records. It was actually super easy and took about an hour to scrape everything.

Then, once I had my list in Google Sheets, I used an extension called ChatGPT for Sheets to find rows that had incomplete information (i.e., missing the website) and do a web search for the best website that matched the company name. The results were pretty incredible.

Love Bardeen and can’t wait to discover more use cases that would be helpful for marketers. If anyone has any questions about how I built this, I’d be happy to share details.

-Darshan

I would like to take some of the credit for that :smiley:

Hi Darsh! Thanks for sharing your story, it’s pretty impressive!

Do you have any screenshots or automations to share on this story?

I'm curious! It would be cool to see what that spreadsheet looks like!

Also, what’s the extension for GPT for Sheets? How was the experience compared to using AI directly in Bardeen?

“The project would have cost north of $1-2k.” My instant thought was: YOU SHOULD CHARGE easily 20%-30% of that and put that money into your pocket :wink: You still saved the company you work for at least $1,000, so you should get some bonus, in my opinion.

Besides that, CONGRATULATIONS :star_struck:

@ivan - sorry for the delay. Here’s a screenshot of the spreadsheet. The GPT for Sheets extension allowed me to bulk web search.

Bardeen helped me scrape the firm name, website and address off this site: Firm Directory - AIA

22,000+ records later, as I looked through the data, I saw that many websites were missing because the firms hadn’t shared their website. I then used GPT for Sheets with a prompt along the lines of “find this company’s website” and ran the prompt for all of the companies that were missing their website.
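
For anyone who wants to try the same "find the incomplete rows, then prompt for each one" step outside of a spreadsheet, here is a minimal Python sketch. It is not what the extension does internally. The column names (`name`, `website`, `address`), the sample rows, and the prompt wording are all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical sample in the shape described above: firm name, website, address.
SAMPLE_CSV = """name,website,address
Acme Architects,https://acme.example,1 Main St
Birch Studio,,22 Oak Ave
Cedar Design,   ,9 Pine Rd
"""


def rows_missing_website(rows):
    """Return the rows whose website cell is empty or whitespace only."""
    return [row for row in rows if not (row.get("website") or "").strip()]


def build_prompt(row):
    """Build a lookup prompt for one firm (the wording is illustrative)."""
    name = row["name"]
    address = row["address"]
    return f'Find the official website for the company "{name}" located at {address}.'


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
missing = rows_missing_website(rows)
prompts = [build_prompt(row) for row in missing]

for prompt in prompts:
    print(prompt)
```

Each prompt would then be sent to whatever model or search tool you have access to, and the answer written back into the empty website cell. Answers from a model without web access should be spot-checked, since it may guess a plausible-looking URL.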

I am curious how the same would have been done with bardeen? GPT for sheets is capped at knowledge of up to 2021 and cannot web search. Perhaps there is a way to use bardeen to access GPT-4 with plugins somehow?

Yes, you can use the GPT-4 model via Bardeen directly! So you can process the scraped data with a prompt and get the output in a new column :smile: :metal:

Another option is to leverage our “Get HTML from page” + “Get text from url” combo, so you can get data from a company website, then extract what you need using the AI actions.
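
To give a rough idea of what "get the HTML, turn it into text, then extract" means, here is a small Python sketch using only the standard library. It is not how the Bardeen actions are implemented. The `TextExtractor` class, the `html_to_text` helper, and the sample page are hypothetical names and data made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(html):
    """Reduce an HTML document to its plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Hypothetical page standing in for a company website.
SAMPLE_HTML = """<html><head><title>Birch Studio</title><style>p{color:red}</style></head>
<body><h1>Birch Studio</h1><p>Contact: hello@birch.example</p><script>var x = 1;</script></body></html>"""

page_text = html_to_text(SAMPLE_HTML)
print(page_text)
```

The resulting plain text is what you would hand to an AI step with a prompt such as "extract the contact email from this text".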

For instance:

Nice job. Very Clever

Hi Dwayne!
New here :slight_smile: Thanks for helping other people in the community. I would love to learn how to do that. Where should I look to get started? Is there a resource inside Bardeen that I should look into first? Thank you!