My apologies, everyone; I really haven't had the time to look into this more deeply until now. Life is going to be what it's going to be, I guess!
So, a little update. I have been chatting to Deathrow about the outages, and our good friends the bots have been back in force: close on 1500 of the buggers in a sitting, so they are eating up bandwidth and server resources.
He has also messaged the host to find out why things are falling over the way they are. The system just cannot take that many concurrent users, which is why we are suffering the way we are.
All I can say is please bear with us, and hopefully we will find some sort of deterrent or a means to lessen the impact. The joys of the world we live in.
I'm also an admin of an online forum and solved a very similar problem on mine. It involved the robots.txt file, and it fully resolved the issue for me. I've sent you a personal message with how my file is set up.
Thank you for this. I've implemented the changes (and a few additional ones). Unfortunately, robots.txt takes a bit of time to be read again, so it won't take effect for a few days, and even then only for the scrapers/bots that honour robots.txt; I think the ones causing the real grief won't. I'll go into that more shortly.
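For anyone wanting to sanity-check a robots.txt before waiting days for crawlers to re-read it, here is a rough sketch using Python's standard urllib.robotparser. The file contents and the user-agent tokens (GPTBot, Amazonbot) are my assumptions for illustration; the actual file was shared privately and isn't shown in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
# The real file discussed in the thread is not public.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Amazonbot
Disallow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "Amazonbot", "Googlebot"):
        print(agent, allowed(agent, "/index.php"))
```

Bear in mind this only tells you what a bot that honours robots.txt would do; it says nothing about the ones that ignore it.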
It's behind Cloudflare, so shouldn't logged-out users be hitting the Cloudflare cache rather than the actual server?
Even when logged out, I'm seeing the following in the response headers, so Cloudflare isn't actually caching anything (unless Cloudflare itself is setting this header for some reason): no-cache, must-revalidate, max-age=0
It was a loooong time ago that we set up Cloudflare, but IIRC this is actually something stupid with the forum software that prevents Cloudflare from being useful for much other than the "I'm under attack" functionality it offers.
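To show why those directives rule out caching, here is a simplified sketch that reads a Cache-Control value and says whether a shared cache could reuse the response without going back to the origin. I'm assuming the quoted directives come from the Cache-Control header, and this is only a plain reading of the directives, not Cloudflare's actual decision logic, which also depends on how the zone is configured.

```python
def shared_cache_may_reuse(cache_control: str) -> bool:
    """Simplified check: can a shared cache serve this response without revalidating?"""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    # Any of these directives stops a shared cache from reusing the response as-is.
    if any(d in directives for d in ("no-store", "no-cache", "private")):
        return False

    # s-maxage takes priority over max-age for shared caches.
    max_age = directives.get("s-maxage", directives.get("max-age", ""))
    return max_age.isdigit() and int(max_age) > 0


if __name__ == "__main__":
    print(shared_cache_may_reuse("no-cache, must-revalidate, max-age=0"))
    print(shared_cache_may_reuse("public, max-age=3600"))
```

With no-cache present and max-age=0, every request has to go back to the forum server, which matches what we're seeing.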
OK, so after digging I've found that the real culprits seem to be scrapers from OpenAI and Amazon, both, I imagine, scraping the site for data to shove into their (in my opinion, worthless) LLMs. I've gone one step further than the robots.txt updates and added a .htaccess rewrite rule that sends requests with those two companies' user agents to a non-existent page. That will alleviate the stress immediately (at 2:30AM there were 2500 'active' users when I made the change; at 2:50 we're down to 1280).
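The logic of the user-agent rule can be sketched in a few lines of Python. This is not the actual .htaccess rule, just the same idea expressed as a function; the user-agent tokens and the dead-end path are assumptions, since the exact strings used aren't listed here.

```python
import re

# Assumed user-agent tokens; the thread does not list the exact strings blocked.
BLOCKED_AGENTS = re.compile(r"GPTBot|Amazonbot", re.IGNORECASE)

# Hypothetical path that does not exist on the server.
DEAD_END = "/no-such-page"


def route(user_agent: str, path: str) -> str:
    """Return the path to serve: a dead end for blocked agents, else the requested path."""
    if BLOCKED_AGENTS.search(user_agent):
        return DEAD_END
    return path


if __name__ == "__main__":
    print(route("Mozilla/5.0 (compatible; GPTBot/1.0)", "/index.php"))
    print(route("Mozilla/5.0 (X11; Linux x86_64)", "/index.php"))
```

Matching is a case-insensitive substring search on the whole user-agent string, so it only catches bots that identify themselves honestly.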
I've also added a link to the forum index page that is invisible to users; bots can still see it. It's also got a nofollow on it, so well-behaved bots will leave it alone, while bots that are acting in bad faith will follow it and scrape the non-existent page. I can then pick them out of the access logs and add them to the .htaccess file as a way to say "We asked you nicely, now we're not asking."
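Picking the offenders out of the access logs could look something like the sketch below. It assumes the logs are in Apache's combined format, and the trap path and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical URL of the invisible trap link.
TRAP_PATH = "/trap-page"

# Assumes Apache "combined" log format; adjust if the server logs differently.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Made-up sample lines, not real log data.
SAMPLE = [
    '203.0.113.7 - - [10/Oct/2024:02:31:07 +0000] "GET /trap-page HTTP/1.1" 404 196 "-" "BadBot/1.0"',
    '198.51.100.2 - - [10/Oct/2024:02:31:09 +0000] "GET /index.php HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2024:02:32:15 +0000] "GET /trap-page?x=1 HTTP/1.1" 404 196 "-" "BadBot/1.0"',
]


def trap_hits(lines):
    """Count requests for the trap path, keyed by user-agent string."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Ignore any query string when comparing against the trap path.
        if match and match.group("path").split("?")[0] == TRAP_PATH:
            hits[match.group("agent")] += 1
    return hits


if __name__ == "__main__":
    for agent, count in trap_hits(SAMPLE).most_common():
        print(count, agent)
```

Anything that shows up in that count ignored the nofollow, so its user agent is a candidate for the block list.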
This is all easily circumvented if they really want to get around it, but honestly they probably don't care enough to bother, so hopefully this sees the forum stable again.