I’m weighing the pros and cons of a few open-source web crawlers to narrow down which one to use for an upcoming project. I’m not tied to any particular language, but I would like to keep it open source. I am willing to go with a Microsoft stack if the crawler is worth it, and I am actually more comfortable with C# than anything else.
I’ll be using the crawler to search a few websites for specific text and image content, and will be saving that data to a database. It will run on a regular schedule to check for updates. I’ll then use this data in a small web app.
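To make the requirement concrete, here is a rough sketch of the kind of job I mean, written in Python with only the standard library just to keep it short. It is not tied to any particular crawler: the names `ContentScanner`, `scan_page`, `save_results`, and the `findings` table are all made up for illustration, and a real crawler would also handle link following, robots.txt, and rate limiting.

```python
import sqlite3
import urllib.request
from html.parser import HTMLParser


class ContentScanner(HTMLParser):
    """Collects <img> sources and text nodes that contain a keyword."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword.lower()
        self.images = []
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # Record the src of every image that has one.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_data(self, data):
        # Case-insensitive keyword match on each text node.
        # Note: this also sees text inside <script>/<style>.
        text = data.strip()
        if text and self.keyword in text.lower():
            self.matches.append(text)


def fetch(url):
    """Download a page as text (hypothetical helper, no retries)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def scan_page(html, keyword):
    """Return (matching text snippets, image URLs) for one page."""
    scanner = ContentScanner(keyword)
    scanner.feed(html)
    scanner.close()
    return scanner.matches, scanner.images


def save_results(conn, url, matches, images):
    """Store findings; the UNIQUE constraint makes repeat runs idempotent."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS findings ("
        "url TEXT, kind TEXT, value TEXT, UNIQUE(url, kind, value))"
    )
    rows = [(url, "text", m) for m in matches]
    rows += [(url, "image", i) for i in images]
    conn.executemany("INSERT OR IGNORE INTO findings VALUES (?, ?, ?)", rows)
    conn.commit()
```

A scheduled run would call `fetch` for each site, pass the HTML to `scan_page`, and hand the results to `save_results`. Because duplicates are ignored on insert, only new text or images add rows, which is roughly the "check for updates" behavior I am after.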
What would you use?
Also, if anyone can recommend any good tutorials, that would be much appreciated.