4DQSAR
Bluelighter
Joined: Feb 3, 2025
Messages: 5,449
I do keep saying this, but some people have developed what they term 'tar pits', which are designed to trap web crawlers so that they endlessly circle through directories that are also subdirectories of themselves. At the moment I suspect that whoever develops the crawlers uses site metrics to detect the likely presence of tar pits.
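For what it's worth, here is a rough Python sketch of the sort of check a crawler could run to avoid looping like that. This is purely my own guess at a guard, not how any real crawler works: the function name `looks_like_tarpit` and both thresholds are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def looks_like_tarpit(url: str, max_depth: int = 12, max_repeats: int = 3) -> bool:
    """Heuristic only: flag URLs whose path is very deep or keeps repeating a segment.

    The thresholds are arbitrary assumptions, not values taken from any real crawler.
    """
    # Split the path into non-empty segments, e.g. "/a/b/a/" -> ["a", "b", "a"]
    segments = [s for s in urlparse(url).path.split("/") if s]

    # An implausibly deep path is one warning sign
    if len(segments) > max_depth:
        return True

    # A directory that is a subdirectory of itself shows up as a repeated segment
    counts = Counter(segments)
    return any(n > max_repeats for n in counts.values())
```

A crawler would call this before queueing each link and skip anything flagged. It is easy to fool (a tar pit can just generate fresh directory names), which is presumably why site-wide metrics would be needed as well.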
But equally, and especially since chatbots are able to produce an almost infinite stream of realistic-looking text, it would not be difficult for someone to merge the two ideas to poison the datasets.
As I have noted, ATM at least, anything behind a paywall isn't crawled, as I assume being detected could be an extremely costly mistake. But what are the economics of crawlers? Even if it only cost a penny to read a news outlet for a month, those pennies wouldn't add up to much for most users, but multiplied across every site a crawler visits, I sense they WOULD make crawling very costly.
BTW I note chatbots are now cheeky enough to demand we prove ourselves to be human. I fully expect enshittification, where people have to pay. That would further undermine education, as the kid who can afford the powerful chatbot has a distinct advantage over the kid who cannot. So I've asked the few questions I really wanted quick answers for - using them just as natural language search engines. But oh my goodness, how slimy they are, telling me what a genius I am...
