I am currently working on building an automation with BardeenAI and the goal is: given website URLs in a Google Sheet, visit each page, extract pricing plan information (plan name, pricing information, billing frequency), and return the resulting data to the spreadsheet in a formatted manner. The initial tests I have done are outstanding, but I want to work on fine-tuning the playbook a bit and have no idea where to start.
The first issue is that all pricing pages are structured and built differently so sometimes Bardeen either can’t pull the info or is pulling info from the wrong areas of the page. What is the best way to set up BardeenAI properly to only scrape the key information I want and format the data cleanly, if the pricing pages differ in structure?
The data that is being scraped and returned from some website URLs is not formatted in a way that is easily readable. Is there any way I can finetune the Bardeen playbook to format the scraped data a certain way?
Thank you for sharing your playbook, but I’m unable to view the URLs in the GSheet as the GSheet isn’t public. Could you please provide the public view of the GSheet?
In order to answer this question thoroughly, I’d need to analyze the structure of the URLs to see if there is a way we could get Bardeen to scrape them properly.
Typically I would use the Regular Expression action to better format the data, but that would mean it would all have to be scraped in the same format in the first place. I’m not sure that’s happening in this case just yet. After receiving the URLs, I may be able to provide a better solution for you here too.
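To illustrate the kind of pattern a regular expression step could apply once the scraped text is reasonably consistent, here is a minimal Python sketch. It is not Bardeen's own action: the pattern, the `parse_price` helper, and the list of recognized billing words are all assumptions made for illustration, and real pricing pages will need more cases.

```python
import re

# Hypothetical pattern: a currency symbol, an amount, and an optional
# billing period written as "/mo", "/month", "per year", and so on.
PRICE_RE = re.compile(
    r"(?P<currency>[$€£])\s*(?P<amount>\d[\d,]*(?:\.\d+)?)"
    r"(?:\s*(?:/|per)\s*(?P<period>month|mo|year|yr)\b)?",
    re.IGNORECASE,
)

# Map the matched period word to a normalized billing frequency.
PERIODS = {"month": "monthly", "mo": "monthly", "year": "yearly", "yr": "yearly"}


def parse_price(text):
    """Return currency, amount, and billing frequency, or None if no price is found."""
    match = PRICE_RE.search(text)
    if match is None:
        return None
    amount = float(match.group("amount").replace(",", ""))
    period = match.group("period")
    return {
        "currency": match.group("currency"),
        "amount": amount,
        "billing": PERIODS[period.lower()] if period else "unknown",
    }


if __name__ == "__main__":
    for sample in ["Pro plan $29/mo", "Team: $1,200 per year", "Contact us"]:
        print(sample, "->", parse_price(sample))
```

Strings with no currency symbol, such as "Contact us", return `None`, so they can be flagged for manual review rather than silently misformatted.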
I see. I think that might defeat the purpose of using an automated solution, as building an individual playbook for each website might be more work than actually visiting each site manually.
I still find value in some of the results I get from my current playbook. However, are there any tweaks or improvements you would make to the playbook to improve the results of extracting data per domain URL?
Since each of the URLs in your GSheet has its own distinct website structure, there’s not a set piece of advice I could provide which would give better guaranteed results for all of the websites. Generally, when scraping from websites listed on a GSheet, we recommend sticking to the same format URLs so that Bardeen can return more reliable results. Hope this helps and please let us know if you require any further assistance. Thank you!
Omansh
Customer Support - bardeen.ai
Explore | @bardeenai | Bardeen Community
Would you mind making the GSheet public with edit rights for me? (I just requested access.)
I’ll do some troubleshooting/testing on my end with the exact data to see if I can get you better results.
Also, would you mind providing the data points you’re looking to scrape?
“Only extract the unique plan name and pricing of each subscription service the company is selling. The cost is usually found next to the currency symbol on the page.” Do you only want the price, plan name,
Gave you access, @Jess. Yes, all I want is to match each plan name with its respective pricing information for all of the services offered on the pricing page. No other information, such as the feature set, is needed.
Also, I received an email stating that today was the last day of my free trial. Any chance I can receive an extension while we work through this solution?
I could do some testing with just these URLs to give an educated answer, but I can’t give the more technical, probably more accurate answer, as I didn’t design this tool/feature and am unable to find much detail on it.
I’ll see if we can tweak it more to get these results, but I’m not totally confident this will work for all websites because of the differences in structure.
Since the pricing pages you’re pulling key information from are each structured differently, you might have more success with the “Create table from text with OpenAI” action, rather than using a scraper template. This will allow you to specify the information you’re looking for and have more fine-tuned control over the instructions for the model.
You can also use a Macro in Google Sheets to format your data! This knowledge base article walks you through the steps to do that.
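For readers who want to see what such a formatting pass amounts to, here is a rough Python sketch of the same idea, done outside Bardeen and outside a Google Sheets macro. The column names and the `to_tsv` helper are hypothetical, chosen to mirror the fields discussed in this thread. It collapses stray whitespace and writes each plan as one tab-separated row that can be pasted into a sheet.

```python
import csv
import io

# Hypothetical column layout mirroring the fields discussed in this thread.
COLUMNS = ["url", "plan_name", "price", "billing_frequency"]


def to_tsv(rows):
    """Format extracted rows as tab-separated text with a fixed header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        # Keep only the known columns and collapse runs of whitespace,
        # including newlines left over from the page layout.
        cleaned = {key: " ".join(str(row.get(key, "")).split()) for key in COLUMNS}
        writer.writerow(cleaned)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {
            "url": "https://example.com/pricing",
            "plan_name": "  Pro \n",
            "price": "$29",
            "billing_frequency": "monthly",
            "features": "ignored extra field",
        }
    ]
    print(to_tsv(sample))
```

Fields outside the fixed column list, such as feature descriptions, are dropped, which matches the request to keep only plan names and pricing.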
Hope this information helps!
Omansh