I tried to scrape some LI groups for profiles and write them to a Google Sheets doc.
No matter what I do, I run into problems:
If I use the built-in LinkedIn Group Scraper, it extracts only 20 items. I can set maxItems (for items to scrape) or limit (for pages to scrape) or both, but it always seems to return only 20 profiles. At first glance, it looks like this scraper was set up as a page scraper rather than a list scraper. It doesn’t seem to recognize the navigation elements.
I tried to build my own scraper, but then I get 80 to 90 duplicates. Does Bardeen not know that it has already scraped the page when it goes over it on the 2nd, 3rd, etc. run after clicking “show more” (infinite scroll)?
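For anyone hitting the same duplicates: one possible workaround is to de-duplicate the rows after scraping, before they go into the sheet. Below is a minimal Python sketch of that idea, not anything Bardeen does internally. It assumes each scraped row is a dict with a `profile_url` field; that field name, the `dedupe_profiles` helper, and the sample rows are all made up for illustration.

```python
def dedupe_profiles(rows, key="profile_url"):
    """Keep the first occurrence of each profile, preserving scrape order.

    `key` is a hypothetical field name; use whatever uniquely identifies
    a profile in your own scraped data.
    """
    seen = set()
    unique = []
    for row in rows:
        # Normalize so trivially different URLs (trailing slash, case)
        # are treated as the same profile.
        ident = row[key].strip().rstrip("/").lower()
        if ident in seen:
            continue
        seen.add(ident)
        unique.append(row)
    return unique


# Made-up rows imitating what repeated passes over an infinite-scroll list
# might return: the first page shows up again on the second pass.
scraped = [
    {"name": "Ada", "profile_url": "https://example.com/in/ada"},
    {"name": "Bob", "profile_url": "https://example.com/in/bob"},
    {"name": "Ada", "profile_url": "https://example.com/in/ada/"},
]

unique_rows = dedupe_profiles(scraped)
```

Keying on the profile URL rather than the name avoids merging two different people who happen to share a name.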
Hey Phil! Thanks for reaching out and reporting this issue with the LinkedIn Group Scraper.
We appreciate your feedback and we’ll definitely check in on the scraper to see what’s going on. @danmelk can help us get the proper selectors to get all the results.
Sorry for any inconvenience caused, and we’ll do our best to resolve this for you. Hang in there!
Yo @phil, the issue here is that LinkedIn uses a hidden button that loads more people when it is clicked. We are currently fixing that, thanks for letting us know.
In the meantime, you can set the pagination selector type to “clicking” and set the selector to button[class*='scroll']
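For context on what that selector matches: `[class*='scroll']` is a CSS attribute substring match, so it selects any `button` element whose `class` attribute contains the text `scroll`. Here is a rough Python sketch of that matching rule using only the standard library. The sample HTML and its class names are invented for illustration and are not LinkedIn's real markup, and `ButtonClassMatcher` and `matching_button_classes` are hypothetical helpers, not part of Bardeen.

```python
from html.parser import HTMLParser


class ButtonClassMatcher(HTMLParser):
    """Collect class values of <button> tags whose class contains `needle`.

    This imitates the CSS selector button[class*='scroll'] for simple HTML.
    """

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # Only <button> elements are candidates, as in the selector.
        if tag != "button":
            return
        class_value = dict(attrs).get("class") or ""
        # [class*='...'] is a plain substring test on the attribute value.
        if self.needle in class_value:
            self.matches.append(class_value)


# Invented markup: one "load more" button, one unrelated button, and a div
# that contains "scroll" but is not a button.
SAMPLE_HTML = """
<button class="load-more-scroll-btn">Show more results</button>
<button class="join-btn">Join</button>
<div class="scroll-area"></div>
"""


def matching_button_classes(html, needle="scroll"):
    matcher = ButtonClassMatcher(needle)
    matcher.feed(html)
    return matcher.matches


result = matching_button_classes(SAMPLE_HTML)
```

Only the first button is selected here: the second button has no `scroll` in its class, and the `div` is skipped because the selector is restricted to `button` elements.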
The selector worked much better. It didn’t solve the duplicate problem, but I assume that lies in the nature of infinite scroll. I assume this is an XPath selector, if I am not mistaken?
I used it like this:
If I want to learn more about using Bardeen together with XPath, are there any good learning resources? I have already googled a few times, but I thought, why not ask the experts?
By the way, Bardeen is also a wonderful automation and webscraping tool, really love it!