Cloudy vision
CTI systems face several major challenges, ranging from the size of their collection network to its diversity, both of which ultimately affect how much confidence you can place in their signals. Are the signals fresh and reliable enough to avoid false positives and poisoning? Is there a risk of acting on outdated data? The difference between information and actionable information is significant, as actionable information can be directly weaponized against an aggressor. If raw data is the hayfield, information is the haystack, and the needle is the actionable signal.
To illustrate the point about collection network size and diversity, let’s imagine a large CDN provider, without naming anyone specific. Its role is to deliver content at scale over HTTP(S). This attracts a lot of “attention” and signals, but only on the HTTP layer. Also, a sophisticated attacker will likely avoid probing its IP ranges, which are public and known from its AS. Therefore, it only receives indiscriminate “Gatling gun” scans or direct attacks over the HTTP layer. This is a very narrow focus.
If you’re a large EDR/XDR or antivirus company, you can also claim to have a huge detection network spanning millions of devices… belonging to wealthy companies. Not every nonprofit, public hospital, or local library can afford these tools. Therefore, such a vendor potentially only sees threats aimed at sophisticated targets, and mostly those carried by malware on LAN machines.
There is no silver bullet on the honeypot front either. The “Gatling gun scanners” represent the background radioactivity of the Internet, a kind of static noise that is always present around any Internet-connected device. The problem here is that no decent cybercriminal group will spend meaningful resources on a honeypot machine. What’s the point of investing DDoS resources to knock down a straw dummy? Would you use a meaningful exploit or tool against a merely “potential” target, let alone burn your IP doing so? Honeypots collect “intentions” and automated exploitation, something along the lines of “this IP wants to know if you are (still) vulnerable to Log4j”.
This may be interesting to some extent, but it is limited to low-hanging fruit. Also, your diversity is limited by your ability to spread over many different locations. You cannot see everything, and you can be “dodged”, meaning criminals can voluntarily skip your IP ranges to avoid detection. You also need to organize your deployment system for each platform, and yet you will only see the IPs that are not avoiding GCP, AWS, or whatever cloud you’re using. And since these providers are not NGOs, the size of your network is also limited by... money. If a fully automated honeypot running on the XYZ cloud costs $20 a month, you need deep pockets to run thousands of them.
Establishing a counterattack system
Curbing the trajectory of large-scale cybercrime requires acting on a resource that is limited by nature. Otherwise, it is not possible to organize a proper “shortage”. The famous Conti leaks shed an interesting light on the real pain points of a large cybercriminal group. Obviously, (crypto) money laundering, recruitment, payroll, and so on are the classics you’d expect. But interestingly, reading the exchanges in their internal chat system shows that IPs are costly, both in time and money: changing them, borrowing, renting, cleaning them, installing tools, migrating ops and C2, and so on.
There are almost infinite variations of hashes: SHA-1 alone offers a space of 2^160 possibilities. So collecting them is one thing, but almost certainly every new malware variant will have a different signature. As we speak, the CI/CD pipeline of most decent cybercriminal groups already includes a single-byte modification before the payload is sent to a target.
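To make the point concrete, here is a minimal sketch using Python’s standard `hashlib`. The payload is a harmless placeholder, and the “build step” that flips one bit is an assumption made for illustration, not a description of any real group’s tooling.

```python
import hashlib


def sha1_hex(payload: bytes) -> str:
    """Return the SHA-1 digest of a payload as a 40-character hex string."""
    return hashlib.sha1(payload).hexdigest()


# Hypothetical stand-in for a binary; not real malware.
original = b"example payload bytes"
# Flip a single bit in the last byte, as an automated build step could.
variant = original[:-1] + bytes([original[-1] ^ 0x01])

digest_a = sha1_hex(original)
digest_b = sha1_hex(variant)

# Count the hex positions where the two digests differ.
differing = sum(a != b for a, b in zip(digest_a, digest_b))

print(digest_a)
print(digest_b)
print(f"{differing} of {len(digest_a)} hex characters differ")
print(f"Size of the SHA-1 space: {2 ** 160}")
```

A blocklist keyed on the first digest never matches the second, which is why a hash-only feed is always one build behind.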
Targeting domain names is also a battle against an infinite space. You can register domain1, domain2, domain3, and so on. There is no technical limit to the number of variations. There are smart systems that protect your brand by checking whether a domain name similar to yours has recently been registered. These pre-crime style systems are very useful for dealing with upcoming phishing attempts.
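One simple way such a brand-monitoring check could work is to compare newly registered names against your own using edit distance. The sketch below is only an assumption about the general technique, not how any particular product does it; the function names, the threshold, and the sample domains are all hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = cur
    return prev[-1]


def lookalikes(brand: str, new_domains: list[str], max_distance: int = 2) -> list[str]:
    """Return newly seen domains that are close to, but not equal to, the brand."""
    return [d for d in new_domains if 0 < edit_distance(brand, d) <= max_distance]


# Hypothetical feed of recently registered names.
feed = ["examp1e.com", "exarnple.com", "weather.org", "example.com"]
print(lookalikes("example.com", feed))
```

The space of possible names stays infinite, so this only narrows attention to the registrations most likely to be used against one specific brand.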
In any case, it is useful to track and index malicious binaries based on their hashes or the C2s they try to contact, or to index IPs that try to auto-exploit known CVEs, but doing so is a fairly passive stance. You do not strike back merely by knowing the enemy’s position and tactics; you strike back by neutralizing the enemy’s offensive capabilities. This is where IP addresses become very interesting. The system is decades old and will still be there after us.
There is currently one truly scarce resource: IPv4. The historical IP space is limited to around 4 billion addresses. Because this resource is scarce, you can actually be proactive and burn an IP address as soon as you realize the enemy is using it, so bringing the fight to this ground is efficient. Now, this landscape is constantly evolving. VPN providers, Tor, and residential proxy apps offer cybercriminals a way to borrow IP addresses, not to mention those available from already compromised servers sold on the dark web.
So an IP address that is used by an attacker at one moment may not be an hour later, and blocking it then results in a false positive. The solution is to create a crowdsourced tool that protects businesses of all sizes, in all kinds of places: geographies, clouds, homes, DMZs of private companies, and across all kinds of protocols. If the network is large enough, this IP rotation is not a problem. If the network stops reporting an IP, it can be released, while new IPs that are rising across many reports should be consolidated into blocklists. The bigger the network, the closer to real time it becomes.
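The release-and-consolidate logic described above can be sketched as a small data structure: an IP is blocked only while enough distinct participants have reported it recently. This is an illustrative sketch, not CrowdSec’s actual consensus algorithm; the class name, thresholds, and reporter identifiers are assumptions.

```python
from collections import defaultdict


class CrowdBlocklist:
    """Block an IP only while enough distinct reporters have seen it recently."""

    def __init__(self, min_reporters: int = 3, ttl_seconds: int = 3600):
        self.min_reporters = min_reporters
        self.ttl = ttl_seconds
        # ip -> {reporter_id: timestamp of that reporter's latest report}
        self._reports = defaultdict(dict)

    def report(self, ip: str, reporter: str, now: float) -> None:
        """Record that `reporter` saw malicious activity from `ip` at time `now`."""
        self._reports[ip][reporter] = now

    def blocked(self, now: float) -> set[str]:
        """Return the IPs with at least `min_reporters` non-expired reports."""
        result = set()
        for ip, by_reporter in self._reports.items():
            fresh = [r for r, ts in by_reporter.items() if now - ts <= self.ttl]
            if len(fresh) >= self.min_reporters:
                result.add(ip)
        return result


blocklist = CrowdBlocklist(min_reporters=3, ttl_seconds=3600)
for site in ("site-a", "site-b", "site-c"):
    blocklist.report("203.0.113.7", site, now=0)
blocklist.report("198.51.100.9", "site-a", now=0)  # a single report is not enough

print(blocklist.blocked(now=600))   # ten minutes later
print(blocklist.blocked(now=7200))  # two hours later, with no new reports
```

Passing the clock in explicitly keeps the example deterministic. In this sketch, an address that has rotated back to a legitimate user simply ages out once reports stop arriving.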
Almost any protocol can be monitored, except UDP-based ones. These should be excluded because UDP is connectionless, which makes the source address of a packet easy to spoof. If you rely on reports about UDP-based protocols to ban an IP, you can easily be fooled. Other than that, every protocol is worth monitoring. Similarly, you can definitely look for CVE exploitation, but looking at behavior is even better. That way, you can detect business-oriented attacks that are not CVE-based. A simple example, beyond traditional L7 DDoS, scanning, credential brute force, or credential stuffing, is scalping. Scalping is the practice of using bots on a website to automatically purchase products and then resell them on eBay and other sites for a profit. This is a business-layer issue rather than a strictly security-related one. CrowdSec, an open-source system, was designed to implement exactly this strategy.
Finally, for the last 20 years we have been told that IPv6 is coming and that we should be ready. Well... let’s say we had time to prepare. But it really is here now, and the deployment of 5G will only accelerate its use. IPv6 changes the game with a new addressable pool as large as 2^128. The usable space is still limited in many ways, not only because not all IPv6 ranges are in use yet, but also because everyone gets many IPv6 addresses at once, not just one. Still, we’re talking about a huge number of them now.
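The orders of magnitude are easy to check with Python’s standard `ipaddress` module. The second half of the sketch rests on an assumption that is not from the text above: since a single subscriber commonly holds a whole prefix, reputation could be tracked per /64 rather than per address. The `reputation_key` helper and its default prefix length are hypothetical.

```python
import ipaddress

ipv4_total = 2 ** 32
ipv6_total = 2 ** 128

print(f"IPv4 addresses: {ipv4_total}")
print(f"IPv6 addresses: {ipv6_total}")
print(f"IPv6 is 2^96 times larger: {ipv6_total // ipv4_total == 2 ** 96}")

# A single /64 (here the documentation prefix) already holds 2^64 addresses.
subscriber_prefix = ipaddress.ip_network("2001:db8::/64")
print(f"Addresses in one /64: {subscriber_prefix.num_addresses}")


def reputation_key(ip: str, v6_prefix_len: int = 64) -> str:
    """Map an address to the unit tracked for reputation (assumption: /64 for IPv6)."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        return str(addr)
    # strict=False masks the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{addr}/{v6_prefix_len}", strict=False))


print(reputation_key("192.0.2.1"))
print(reputation_key("2001:db8::dead:beef"))
```

Grouping by prefix is a trade-off: it stops an attacker from hopping between addresses inside one allocation, at the cost of treating everyone behind that prefix alike.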
Combining AI and crowdsourcing
AI seems like a logical avenue to explore when data starts flooding in from a massive crowdsourced network while the resource you are trying to make scarce keeps growing.
The network effect is already a good start on its own. An example here is credential stuffing. If an IP tries many login/password pairs at your place, you would call it credential brute force. Now, at the scale of the network, if the same IP is knocking at different places with different login/password pairs, it is credential stuffing: someone is reusing stolen credentials in many places to check whether they are valid. Seeing the same action, with the same credentials, from many different angles tells you more about the purpose of the behavior itself.
Honestly, you don’t need AI to sort credential brute force from credential reuse and credential stuffing, but there are places where AI excels, especially when it is paired with a large network that supplies huge amounts of data.
Another example is a large Internet scan carried out with 1024 hosts. Each host may scan only one port, which would likely go unnoticed. Unless, that is, you see the same IP scanning the same port within a similar time frame in many different places.
AI algorithms are good at identifying patterns that are invisible when you look at only one place at a time, but blatant at the scale of a large network.
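A rule-based version of that distinction, with no AI involved, could look like the following sketch. It follows the definitions given above (many attempts at one place versus the same IP seen at many places); the thresholds, the event format, and the function name are assumptions, and only usernames are tracked on the assumption that passwords are never logged.

```python
from collections import defaultdict


def classify_login_failures(events, min_attempts=5, min_sites=3):
    """Classify source IPs from (ip, site, username) failed-login events."""
    sites_by_ip = defaultdict(set)
    usernames_by_ip_site = defaultdict(set)
    for ip, site, username in events:
        sites_by_ip[ip].add(site)
        usernames_by_ip_site[(ip, site)].add(username)

    verdicts = {}
    for ip, sites in sites_by_ip.items():
        if len(sites) >= min_sites:
            # Same IP knocking at many places: only visible at network scale.
            verdicts[ip] = "credential stuffing"
        elif any(len(usernames_by_ip_site[(ip, s)]) >= min_attempts for s in sites):
            # Many distinct logins tried at a single place.
            verdicts[ip] = "credential brute force"
        else:
            verdicts[ip] = "unclassified"
    return verdicts


# Hypothetical events; addresses come from documentation ranges.
events = [("203.0.113.7", "shop", f"user{i}") for i in range(6)]
events += [("198.51.100.9", site, "alice") for site in ("shop", "blog", "bank")]
events += [("192.0.2.44", "blog", "bob")]
print(classify_login_failures(events))
```

Note that the second IP makes a single attempt per site, so no individual participant would have any reason to flag it.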
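The cross-site correlation described here can be sketched in a few lines: group sightings by source IP, destination port, and time bucket, then keep the groups reported from enough distinct places. The tuple layout, window size, and threshold are assumptions made for the example.

```python
from collections import defaultdict


def coordinated_scanners(sightings, window=300, min_sites=3):
    """Return (ip, port) pairs seen by at least `min_sites` sites in one time bucket.

    sightings: iterable of (timestamp, site, src_ip, dst_port) tuples.
    """
    sites_per_bucket = defaultdict(set)
    for ts, site, ip, port in sightings:
        # Fixed-size buckets keep the sketch simple; sightings that straddle
        # a bucket boundary can be missed, which a sliding window would avoid.
        sites_per_bucket[(ip, port, ts // window)].add(site)
    return {
        (ip, port)
        for (ip, port, _bucket), sites in sites_per_bucket.items()
        if len(sites) >= min_sites
    }


# Hypothetical sightings: one host probing port 5432 everywhere within minutes,
# and another host seen once at a single site.
sightings = [
    (10, "site-a", "203.0.113.7", 5432),
    (60, "site-b", "203.0.113.7", 5432),
    (200, "site-c", "203.0.113.7", 5432),
    (120, "site-a", "198.51.100.9", 22),
]
print(coordinated_scanners(sightings))
```

Each site on its own sees a single connection attempt on a single port; only the aggregated view shows the scan.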
Using graphs and embeddings to represent the data in a proper structure can reveal complex interactions between IP addresses, ranges, and even ASes (Autonomous Systems). This leads to identifying cohorts of machines working in concert towards the same goal. If several IP addresses split an attack into sequential steps, such as scanning, exploiting, installing a backdoor, and then using the target server to take part in a DDoS effort, those patterns repeat in the logs. So if the first IP of a cohort shows up at a given timestamp, the second one 10 minutes later, and so on, and this pattern repeats with the same IPs in many places, you can safely tell everyone to ban all four IP addresses at once.
AI and crowdsourced signals can effectively address each other’s limitations. Crowdsourced signals provide rich real-time data on cyberthreats, but can lack accuracy and context, ultimately leading to false positives. AI algorithms, on the other hand, usually only become relevant after absorbing vast amounts of data. In return, these models help refine and analyze the signals, removing noise and revealing hidden patterns.
There is a powerful couple to marry here.
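As a very reduced illustration of the graph idea, without embeddings or any learning, the sketch below links two IPs when one follows the other within a short gap at enough different sites, then reads cohorts off the connected components. The event format, thresholds, and function name are assumptions; a production system would need far more than this.

```python
from collections import defaultdict


def find_cohorts(events, max_gap=900, min_sites=3):
    """Group IPs that repeatedly appear in sequence across sites.

    events: iterable of (site, timestamp, ip) tuples.
    """
    by_site = defaultdict(list)
    for site, ts, ip in events:
        by_site[site].append((ts, ip))

    # For each ordered pair (a, b), collect sites where b follows a within max_gap.
    pair_sites = defaultdict(set)
    for site, seen in by_site.items():
        seen.sort()
        for i, (t1, a) in enumerate(seen):
            for t2, b in seen[i + 1:]:
                if t2 - t1 > max_gap:
                    break
                if a != b:
                    pair_sites[(a, b)].add(site)

    # Keep pairs confirmed by enough sites as edges of an undirected graph.
    graph = defaultdict(set)
    for (a, b), sites in pair_sites.items():
        if len(sites) >= min_sites:
            graph[a].add(b)
            graph[b].add(a)

    # Connected components of that graph are the candidate cohorts.
    cohorts, visited = [], set()
    for start in sorted(graph):
        if start in visited:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(graph[node] - component)
        visited |= component
        cohorts.append(sorted(component))
    return cohorts


# Hypothetical four-step sequence, ten minutes apart, repeated at three sites.
steps = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
events = [
    (site, base + 600 * i, ip)
    for site, base in (("site-a", 0), ("site-b", 5000), ("site-c", 9000))
    for i, ip in enumerate(steps)
]
events.append(("site-a", 100, "198.51.100.50"))  # unrelated visitor at one site
print(find_cohorts(events))
```

The unrelated visitor is paired with cohort members at a single site only, so it never reaches the site threshold and stays out of the cohort.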