Issues with scraper being consistent

thisshirtoffmyback · August 17, 2023, 9:16pm

I have been trying to scrape this site in the background. I can get some of the items to scrape properly, but there are still quite a few that do not.

This is an example of one of the pages:
https://www.courts.mo.gov/cnet/cases/newHeader.do?inputVO.caseNumber=2316-FC05450&inputVO.courtId=CT16#party

I’m trying to scrape the case number and parties at the top, plus the petitioner address and the respondent address.

Because some of the pages are a little bit different, it’s not scraping them properly. Sometimes both parties have an address, and sometimes one of them doesn’t. Sometimes there is one attorney on the right side, sometimes there are two, and sometimes there are none.

I’m not sure if the scraper is designed to find items in the exact same spot on the page each time, or if it reads the site’s code and locates each piece of information by the label for that field.

Any help is greatly appreciated. I have made a handful of different scrapers for this site, and none of them can get through my list of URLs without getting about half of them wrong.

Jess · August 17, 2023, 9:46pm

Hi @thisshirtoffmyback (lol) - could you please provide a few links you are trying to scrape from? I’ll see if we can get you the correct selectors to use for this use case and that should solve the issues you’re running into.

thisshirtoffmyback · August 17, 2023, 10:06pm

https://www.courts.mo.gov/cnet/cases/newHeader.do?inputVO.caseNumber=2316-FC05599&inputVO.courtId=CT16#party

https://www.courts.mo.gov/cnet/cases/newHeader.do?inputVO.caseNumber=2316-FC05371&inputVO.courtId=CT16#party

thisshirtoffmyback · August 17, 2023, 10:08pm

What I have figured out is that for any record that looks like the ones with parties on the left and no attorneys on the right, I have a scraper that will pull them perfectly. However, when I have a list of links that has both types of pages, I don’t know which ones pulled the info correctly unless I manually look at everything. I’m trying to figure out a way to run the scraper on all the links I have and have it pull the correct info on every one, or close to every one.
Hope that makes sense.

Jess · August 17, 2023, 10:18pm

Okay, I’ll trial/error on my end aiming to get the following data in a GSheet:

Jess · August 17, 2023, 10:54pm

I’m not having any luck on my end, though it could just be user error. It looks like these pages are built differently, so I’m not certain it’s possible.

Tagging @Deyan_Petrov (Advanced Scraping GURU) for further assistance. Deyan, are you able to find the correct selectors for these elements that @thisshirtoffmyback is looking to scrape?

Thank you for your assistance!

manvel · August 21, 2023, 3:15pm

Yes, if the page markup is different, you might need to create different scraper models to get data from the pages with different markup, but you can also try building a model with custom selectors.

I recommend checking this video: https://www.youtube.com/watch?v=26Gt_9kFVok
Also, please check the advanced section here: Tutorial: How to use Bardeen scraper | Bardeen.ai

But in general, yes: if the pages are different, then a different model is needed for each, unless the pages have a similar or identical layout.
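
To illustrate the difference between picking a field by its position and picking it by its label, here is a minimal Python sketch. It is not how the Bardeen scraper works internally, and the markup, class names (`party`, `role`, `address`, `attorney`), sample addresses, and the `addresses_by_role` helper are all hypothetical, not taken from the real court pages. It also assumes well-formed markup so the standard-library XML parser can read it; real pages would likely need a proper HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. This is NOT the real page structure.
PAGE_WITH_BOTH = """
<div id="parties">
  <div class="party">
    <span class="role">Petitioner</span>
    <span class="address">123 Example St</span>
  </div>
  <div class="party">
    <span class="role">Respondent</span>
    <span class="address">456 Sample Ave</span>
  </div>
</div>
"""

# Same idea, but with an extra attorney block and no respondent address.
PAGE_MISSING_ONE = """
<div id="parties">
  <div class="party">
    <span class="role">Petitioner</span>
    <span class="address">123 Example St</span>
  </div>
  <div class="attorney">
    <span class="name">Example Attorney</span>
  </div>
  <div class="party">
    <span class="role">Respondent</span>
  </div>
</div>
"""


def addresses_by_role(markup: str) -> dict[str, str]:
    """Map each party role to its address, or to "" when none is present."""
    root = ET.fromstring(markup)
    result: dict[str, str] = {}
    for block in root.iter("div"):
        # Skip anything that is not a party block (e.g. attorney blocks).
        if block.get("class") != "party":
            continue
        # Look fields up by their label/class, not by their index on the page.
        role = (block.findtext("span[@class='role']") or "").strip()
        address = (block.findtext("span[@class='address']") or "").strip()
        if role:
            result[role] = address
    return result


if __name__ == "__main__":
    print(addresses_by_role(PAGE_WITH_BOTH))
    print(addresses_by_role(PAGE_MISSING_ONE))
```

Because each value is looked up inside its own party block, an extra attorney block or a missing address does not shift the other fields. In the second sample the Respondent address simply comes back as an empty string, which also makes incomplete rows easy to spot without checking every page by hand.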

In this specific case, the pages below do use different layouts:

system · September 4, 2023, 7:38am

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.
