I have a website (Microdesk | ARID) with 498 classifications listed on it. Clicking each classification redirects to a new webpage, which has an option to download the file using an “Export to sheets” button. I need to download all 498 sheets, either as individual files or as a single Excel file. Step 1: from the “main” website (Microdesk | ARID), go to the 1st classification; a new webpage opens; click “Export to sheets”, then go back to the “main” website. Step 2: click the second classification and click “Export to sheets”. I have to do the same for all 498 classifications. How do I automate this using Bardeen? Any tips would be much appreciated.
Thanks
Hi @pbalabadrinath, most of this flow is possible, but unfortunately we don’t have a way to capture downloaded files, so this automation won’t be possible.
This use case is possible through Bardeen, but all of your 498 downloaded files would be located in your “Downloads” folder on your computer. Would that work for you @pbalabadrinath?
Update: it IS possible (in this case)
This particular website seems to have a direct link to download the Excel file.
Ex:
This link: https://aridpa.azurewebsites.net/api/AssetClassifications/ExportAssetClassificationSheet?id=586
Will download the file:
So what you can do in this case is build a deep scraper automation (learn how at https://www.youtube.com/watch?v=26Gt_9kFVok) that:
- Scrapes the full list of items
- Enters every item and scrapes the download link
- Adds rows to a Google Sheet to save the information

This will save all the links in Google Sheets, so you only need to click each one to download.
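As an aside, if you are comfortable with a little Python, the same "collect all the download links" step can be sketched outside Bardeen, since the export URL follows a visible pattern. This is a minimal sketch, assuming detail links on the index page look like `/AssetClassifications/Details/<id>` (as in the example URLs in this thread); `extract_ids` and `export_links` are hypothetical helper names, not part of Bardeen:

```python
import re

# Export endpoint observed earlier in this thread; the id comes from the
# classification's Details page URL.
EXPORT_URL = ("https://aridpa.azurewebsites.net/api/AssetClassifications"
              "/ExportAssetClassificationSheet?id={cid}")

def extract_ids(index_html: str) -> list[str]:
    """Pull classification ids out of links like /AssetClassifications/Details/432."""
    return re.findall(r"/AssetClassifications/Details/(\d+)", index_html)

def export_links(index_html: str) -> list[str]:
    """Build one export (download) URL per classification found on the index page."""
    return [EXPORT_URL.format(cid=cid) for cid in extract_ids(index_html)]
```

Feeding the function the HTML of the index page would yield the full list of 498 export URLs, which you could then save to a sheet or download directly.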
Yes, that is exactly what I want: all 498 files in my Downloads folder.
Awesome, we can create that automation through Bardeen.
Here are the steps:
- Create a List Scraper template on this website to grab all of the Classification Name links to scrape:
  https://aridpa.azurewebsites.net/AssetClassifications/index
- Create a Single Page Scraper template on one of those links, for example:
  https://aridpa.azurewebsites.net/AssetClassifications/Details/432
  - Make sure you click the “Export Excel Spreadsheet” button
  - It might be smart to add a delay before clicking this button, to allow the page to fully load before Bardeen runs the click
Then your Bardeen playbook would be the following actions:
- Scrape data on active tab https://aridpa.azurewebsites.net/AssetClassifications/index
- Scrape data in the background
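Once the links are scraped, the actual downloading is the easy part: each export URL can be fetched and written to disk. Here is a stdlib-only sketch of that final step, assuming you already have the list of export URLs; the `.xlsx` extension and the `sheet_filename`/`download_all` helper names are my own assumptions, so check what the site actually serves:

```python
from pathlib import Path
from urllib.parse import urlparse, parse_qs
from urllib.request import urlretrieve

def sheet_filename(export_url: str) -> str:
    """Derive a stable local filename from the export URL's `id` query parameter."""
    cid = parse_qs(urlparse(export_url).query)["id"][0]
    return f"classification_{cid}.xlsx"

def download_all(urls: list[str], dest: str = "downloads") -> None:
    """Fetch every export URL into `dest`, one spreadsheet per classification."""
    out = Path(dest)
    out.mkdir(exist_ok=True)
    for url in urls:
        urlretrieve(url, out / sheet_filename(url))
```

With 498 files it may also be polite to sleep briefly between requests so the server isn’t hammered.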
Thank you! I tried that, but Bardeen goes to the first classification, downloads the sheet, and stops there. How do I tell it to go back to the main webpage and do the same for the 2nd, 3rd, and subsequent classifications?
That’s where the list scraper comes in.
The list scraper will grab all of the links to sift through.
I’m sorry, could you please explain how to create the list scraper, or share a link to a tutorial?
This is the page that loads when you create a scraper template. Select List or Table - this is a list scraper
Configure the scraper template manually and it will give you the list of all the links you are needing to scrape:
When I use this, it automatically detects the table, but only the classification name is picked up. How do I add the links here?
Select “Get Link” instead of “Get Text” after selecting the classification name.
Here’s the automation that will work for your use case:
Just make sure to run the playbook from this active tab:
https://aridpa.azurewebsites.net/AssetClassifications/index
Wow, it works great, thanks much.
Just curious: when I did that, it only picked the first classification’s link. How did you add all of them?
You must’ve selected “Single Page Scraper” in the screenshot above, instead of “List or Table”.
Oh, I selected list/table only.
I’d have to watch you do it to see what’s exactly going on. Feel free to send a loom video. Thank you!
If you enjoyed the assistance and feel like showing some love, you can support my work via my Buy Me a Coffee link below:
This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.