...
- the number of simultaneous requests is often very high (up to millions of requests a day)
- requests often come from multiple IP addresses simultaneously. In some cases, over 200 different IP addresses were used by the same harvester to make simultaneous requests
- harvesters sometimes do not follow robots.txt restrictions
- the User-Agent string does not always declare that the user-agent is a bot
- the User-Agent string is often changed for each request, so as to make blocking based on user agent string difficult
- it is sometimes hard or impossible to tell harvester traffic from legitimate traffic
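
As a rough illustration of the User-Agent rotation described in the list, the sketch below counts requests and distinct User-Agent strings per IP address and flags addresses where nearly every request carries a different string. The function names, the thresholds, and the assumption that the access log has already been parsed into `(ip, user_agent)` pairs are all hypothetical; this is not a method taken from the source.

```python
from collections import defaultdict


def summarize_clients(records):
    """Count requests and distinct User-Agent strings per IP address.

    `records` is assumed to be an iterable of (ip, user_agent) pairs,
    one per request, already parsed from an access log.
    """
    stats = defaultdict(lambda: {"requests": 0, "user_agents": set()})
    for ip, user_agent in records:
        stats[ip]["requests"] += 1
        stats[ip]["user_agents"].add(user_agent)
    return stats


def rotating_agent_ips(records, min_requests=3, min_ratio=0.8):
    """Return the sorted IPs whose User-Agent changes on most requests.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    flagged = []
    for ip, s in summarize_clients(records).items():
        ratio = len(s["user_agents"]) / s["requests"]
        if s["requests"] >= min_requests and ratio >= min_ratio:
            flagged.append(ip)
    return sorted(flagged)


if __name__ == "__main__":
    # Documentation-range IP addresses and made-up User-Agent strings.
    sample = [
        ("203.0.113.5", "UA-1"),
        ("203.0.113.5", "UA-2"),
        ("203.0.113.5", "UA-3"),
        ("198.51.100.7", "Mozilla"),
        ("198.51.100.7", "Mozilla"),
        ("198.51.100.7", "Mozilla"),
    ]
    print(rotating_agent_ips(sample))
```

A per-IP heuristic like this would likely miss a harvester that spreads its requests across hundreds of addresses with only a few requests each, which is consistent with the point that harvester traffic can be hard or impossible to separate from legitimate traffic.
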
However, there are some observed differences from DDoS attacks:
...