Howdy.
I am wondering whether my use-case is feasible.
What I want to do:
I would like to scrape Twitter (X) for various keywords, hashtags, or cashtags, e.g., “MOODENG.”
I’d then like to take that data and put it into Google Sheets or a CSV file.
I’d likely aggregate that data by hour or by day, and my final report would just be a graph, where the count of mentions of those keywords is on the Y-axis and the datetime is on the X-axis.
Note that I’d likely do this aggregation myself in Excel, from the raw data collected by my scraper. Attached is an image of what that would look like.
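For anyone who wants to see the aggregation spelled out, here is a rough sketch of the hourly count in Python rather than Excel. It assumes the exported CSV has a timestamp column in ISO format; the column name `created_at` and the sample rows are made up for illustration and are not what Bardeen actually exports.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def count_mentions_per_hour(csv_text, time_column="created_at"):
    """Return {hour_start: mention_count} from raw scraped rows.

    `time_column` is a hypothetical column name; adjust it to match
    whatever the scraper really writes out.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts = datetime.fromisoformat(row[time_column])
        # Truncate each timestamp to the start of its hour.
        bucket = ts.replace(minute=0, second=0, microsecond=0)
        counts[bucket.isoformat()] += 1
    # Sort by hour so the result can be plotted left to right.
    return dict(sorted(counts.items()))


# Made-up sample data standing in for the scraper's CSV export.
sample = """created_at,text
2024-10-01T13:05:00,$MOODENG to the moon
2024-10-01T13:40:00,MOODENG again
2024-10-01T14:02:00,#MOODENG
"""

hourly = count_mentions_per_hour(sample)
for hour, n in hourly.items():
    print(hour, n)
```

Each key is the X-axis value and each count is the Y-axis value. For daily totals, the same idea works by also resetting the hour to 0.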
I’d essentially be running multiple keywords a few times a week.
I plan on running this for, say, the last 48-72 hours per keyword, to keep the mention count as low as possible.
What I have done so far:
So, I just ran the basic Bardeen tool and limited it to 10 results, just to see if I could scrape data from X. I was able to do this w/minimal effort, without changing many settings.
Also attached is an image of a basic output I created and got into Google Sheets. I was also able to export this as a CSV, which I loaded into Excel 2013.
So despite some minor issues, I’m confident this is a semi-workable solution.
What I am worried about:
Well, I trialed and looked at a few solutions like Brand24, Tweet Binder, and quite a few others… and the issue for my use case is that you get limited on the number of mentions you can report back.
E.g., for Brand24, I ran a basic search on the cashtag “MOODENG.” It was mentioned 900 times in my report, which burns through a large portion of the monthly allowance. So it is not really feasible for my use-case.
This is the case for all of the paid tools I looked into.
I understand that w/the paid Bardeen, there is also a limit, but I’m hoping that w/the additional customization, I’ll be able to limit my searches to the last 24-72 hours and reduce my mention count.
I’ll also be sure never to run something like “Taylor Swift,” or a currently popular news topic like “Diddy,” as this could blow through all of my available data immediately. I’m also sure I’ll be able to add limiters in case a search begins to return TOO many results.
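To make the kind of limiter I have in mind concrete, here is a minimal sketch in Python. It only illustrates the logic (a time window plus a hard cap on rows); I have not confirmed whether Bardeen exposes equivalent settings, and the function name, the `created_at` field, and the default values are all hypothetical.

```python
from datetime import datetime, timedelta


def limit_mentions(rows, now, window_hours=72, max_rows=500):
    """Keep rows from the last `window_hours`, stopping at `max_rows`.

    `rows` is an iterable of dicts with a datetime under "created_at"
    (a hypothetical field name).
    """
    cutoff = now - timedelta(hours=window_hours)
    kept = []
    for row in rows:
        if row["created_at"] < cutoff:
            continue  # older than the window, skip it
        kept.append(row)
        if len(kept) >= max_rows:
            break  # hard cap reached, stop collecting
    return kept


# Made-up rows for illustration.
now = datetime(2024, 10, 4, 12, 0)
rows = [
    {"created_at": datetime(2024, 10, 4, 11, 0), "text": "$MOODENG"},
    {"created_at": datetime(2024, 10, 1, 11, 0), "text": "too old"},
    {"created_at": datetime(2024, 10, 3, 9, 0), "text": "#MOODENG"},
    {"created_at": datetime(2024, 10, 2, 9, 0), "text": "MOODENG"},
]

limited = limit_mentions(rows, now, window_hours=72, max_rows=2)
print(len(limited))
```

A cap like this only saves allowance if it is applied while scraping, not afterwards on data that has already been pulled, so the real question is whether the tool lets me stop a run early.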
So from my description above, my primary concern right now is the amount of data/mentions I’d need to pull. Does my use-case seem feasible as I’ve described it?
Thanks,
FP