The number of internet users grew from 2.53 billion in 2013 to 5.16 billion in January 2023, which means more than half of the world’s population is now online in some form or fashion. As the number of users increases, so does the amount of data produced. The volume of data created and consumed is projected to reach approximately 181 zettabytes within the next two years.
This increase in internet users, data production, and our ability to store data has fueled rapid growth in the machine learning (ML) field. ML is a subset of artificial intelligence (AI) and computer science that uses data and algorithms to imitate how humans learn. More data and more storage capacity allow researchers to expand the datasets used to train ML models and push the field further, faster.
ML is a powerful AI tool that will grow more sophisticated as we produce more data to feed it. Let’s look deeper into how ML is used in various industries and by security experts, as well as why it’s a key component in modern cybersecurity strategies and tools.
What Is the History of ML?
ML is a subset of AI, so the field’s history extends back to the 1950s, when AI first entered the scene. Some consider even earlier inventions, like Alan Turing’s code-breaking machine, the Bombe, a primitive form of ML.
However, IBM employee and computer gaming pioneer Arthur Samuel coined the term “machine learning” in 1959 when he created a computer learning program for the game of checkers. The IBM computer learned more each time it played the game and effectively incorporated strategy.
ML developed solely as a subdivision of AI until the late 1970s, when it branched off to evolve on its own. During this period, examples of ML included simple applications and use cases like the “nearest neighbor” algorithm in 1967 and the “Stanford Cart” in 1979.
In the 1990s, ML shifted beyond supporting broader AI initiatives as researchers realized the technology could solve practical problems and provide services.
Now, ML’s uses include traffic predictions (which help calculate the cost of your rideshare trip before you book it), facial recognition in video surveillance and social media, and even online fraud detection. How else did you think your bank knew that the unauthorized purchase in Belgium wasn’t yours?
What Is the Difference Between AI and ML?
Although the terms are often used interchangeably, AI and ML are quite distinct. AI is the umbrella term for technology that mimics human cognitive processes, performs complex tasks, and can learn from those tasks.
As a subfield of AI, ML relies on algorithms and trained models to recognize patterns and correlations in specific datasets. ML is restricted to usage in computer programming and is best suited for applications built on pattern recognition and data correlation. AI, by contrast, includes more sophisticated processes and tasks with broader applications outside of machines, including bioweapon production and gene editing.
How Is ML Used in Cybersecurity?
In addition to uses in nearly every industry, ML is a crucial tool in the battle against malicious threat actors. Organizations utilize ML in critical ways to support their cybersecurity strategies:
- Threat detection: ML analyzes network traffic, helps identify patterns in cyber attacks, and detects adversarial behavior (like malware) and other network anomalies. Compared to manual processes, ML can reduce false positives and improve overall threat detection speed. It also improves upon outdated detection methods like signature-based detection for malware, static firewall rules, and access control lists (ACLs) to define security policies.
- Threat response: Teams can utilize ML to automate threat response to varying degrees. Partially automated threat response allows ML programming to perform basic tasks like an automatic system reboot or execution of a particular script. There’s also full automation, a closed-loop process where ML predicts problems and proactively addresses them without manual involvement.
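To make the detection and response ideas above more concrete, here is a minimal sketch in Python. It assumes a simple statistical baseline (a z-score against traffic seen during normal operation) as a stand-in for a trained ML model. The traffic numbers, host names, function names, and the response action are all hypothetical and are not taken from any particular product.

```python
from statistics import mean, stdev

def fit_baseline(samples):
    """Learn the mean and standard deviation of normal traffic volumes."""
    return mean(samples), stdev(samples)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

def respond(host, value, baseline):
    """Partially automated response: choose an action and leave the
    execution to a script runner or an analyst (hypothetical actions)."""
    if is_anomalous(value, baseline):
        return f"isolate {host} and open a ticket"
    return "no action"

# Hypothetical requests-per-minute observed during normal operation.
normal_traffic = [120, 130, 125, 118, 122, 127, 131, 124]
baseline = fit_baseline(normal_traffic)

print(respond("host-a", 126, baseline))  # close to the learned baseline
print(respond("host-b", 900, baseline))  # far outside the learned baseline
```

A production system would learn from many more signals than a single traffic count, but the shape is the same: learn what normal looks like, score new observations against it, and map the result to a response.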
ML will continue to develop, and researchers believe that organizations will use the technology for identification purposes (like finding and tracking devices across their network), policy recommendations (such as creating ACLs for different devices), and security recommendations for firewalls.
How Does Black Kite Use ML?
When considering ML’s use cases in cybersecurity, it’s necessary to consider how it applies to third-party security. Your organization’s security is directly affected by third-party vendors’ security. After all, you can lock the doors of your home, but what if forty other people have a key too?
To help organizations meet the changing demands of the threat landscape, Black Kite harnesses ML in key ways to aid in third-party risk assessment, ransomware susceptibility analysis, and compliance.
Fraudulent Domain Detection
Black Kite began using ML in 2018 to aid in fraudulent domain detection. Like many other cybersecurity tools that utilize ML for threat detection and response, Black Kite integrates the technology into its platform to check various criteria like registration information, ownership details, and establishment dates that determine a domain’s primary use.
Without this program, organizations would have to complete this research independently, gathering hard-to-find information and performing a qualitative analysis to assess whether or not a particular domain was fraudulent. ML automates this process and reduces human error for more accurate decision making.
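As a rough illustration of how domain attributes could feed an automated decision, the sketch below uses a small k-nearest-neighbor classifier in Python. It is not Black Kite’s method. The three features (domain age, whether the registrant is hidden, and edit distance to a protected brand name), the labeled examples, and the `classify` function are all assumptions made for the example.

```python
from collections import Counter
from math import dist

# Hypothetical labeled examples:
# (domain age in years, registrant hidden (1/0), edit distance to brand) -> label
TRAINING = [
    ((12.0, 0, 0), "legitimate"),
    ((8.5, 0, 6), "legitimate"),
    ((15.0, 1, 7), "legitimate"),
    ((0.1, 1, 1), "fraudulent"),
    ((0.3, 1, 2), "fraudulent"),
    ((0.2, 0, 1), "fraudulent"),
]

def classify(features, k=3):
    """Label a domain by majority vote among its k nearest training examples."""
    nearest = sorted(TRAINING, key=lambda ex: dist(ex[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# A weeks-old lookalike domain with hidden ownership.
print(classify((0.05, 1, 1)))
# A decade-old domain with public ownership and an unrelated name.
print(classify((10.0, 0, 5)))
```

In practice the features would need scaling so that no single one dominates the distance, and the training set would be far larger. The point is only that criteria an analyst would check by hand can be encoded as numbers and compared automatically.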
Ransomware Susceptibility Index®
Black Kite uses ML in the Ransomware Susceptibility Index® (RSI™). Black Kite collected examples of ransomware and data from over 6,000 publicly exposed attacks to build a standardized set of criteria that accurately evaluates ransomware susceptibility. Using the data, ML analyzes the attacks and applies the data to discover similar patterns in other organizations and determine the likelihood of a ransomware attack.
The index’s source data and common indicator categories are also regularly updated to reflect socio-political events like the war in Ukraine, as well as newly emerging critical vulnerabilities.
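The general idea of learning patterns from past attacks and turning them into a likelihood can be sketched with a small logistic regression in Python. This is not the RSI methodology. The three binary indicators, the training rows, and the function names are hypothetical and chosen only to keep the example short.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train(rows, labels, epochs=2000, lr=0.5):
    """Fit logistic regression weights with batch gradient descent."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            grad_b += error
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def likelihood(x, model):
    """Estimated probability of an attack for one organization."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Hypothetical indicators: (exposed remote access, leaked credentials,
# unpatched critical vulnerability); label 1 means the organization was attacked.
rows = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1),
        (0, 0, 1), (0, 1, 0), (1, 0, 0), (0, 0, 0)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

model = train(rows, labels)
print(round(likelihood((1, 1, 1), model), 2))  # all indicators present
print(round(likelihood((0, 0, 0), model), 2))  # no indicators present
```

Retraining on refreshed data is how a model like this would pick up new indicators over time, which mirrors the regular updates described above.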
Document Parsing and Compliance
Black Kite’s document parsing tool, which uses natural language processing (NLP), combs a third-party vendor’s documents to understand the text and map the document to compliance frameworks. Organizations can use this tool to determine a vendor’s compliance with various security standards and regulations. Anecdotally, a Black Kite client said that the time needed to assess a third-party vendor’s compliance documents has dropped from 4-6 weeks to 4-6 hours.
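For a sense of what mapping text to a compliance framework can look like, here is a deliberately simple Python sketch. It swaps in bag-of-words cosine similarity for real NLP, so it only matches on shared words. The control identifiers, their descriptions, and the `map_to_control` function are hypothetical and do not come from any actual framework or from Black Kite’s tool.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical control descriptions; real frameworks word these differently.
CONTROLS = {
    "access-control": "restrict system access to authorized users with "
                      "passwords and multifactor authentication",
    "encryption": "encrypt sensitive data at rest and in transit",
    "incident-response": "detect report and respond to security incidents",
}

def vectorize(text):
    """Turn text into a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def map_to_control(sentence):
    """Return the best-matching control id, or None if no words overlap."""
    vec = vectorize(sentence)
    scores = {cid: cosine(vec, vectorize(desc)) for cid, desc in CONTROLS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(map_to_control("All customer data is encrypted in transit and at rest."))
print(map_to_control("Employees log in with multifactor authentication."))
```

A real parser would need to handle synonyms, negation, and context ("data is not encrypted" should not count as compliance), which is where trained language models earn their keep over word matching.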
In the future, Black Kite plans to create a ChatGPT-like interface that uses NLP to provide Black Kite’s crucial third-party risk data to customers. Clients will be able to “discuss” third-party vendor reports with the interface and find additional resources 24/7. Because of privacy concerns, Black Kite trains its large language models in-house instead of using OpenAI’s GPT.
The Case for ML in Cybersecurity and Third-Party Risk Management
Threat actors and cybersecurity experts are currently locked in an ML-fueled “cat and mouse” game. Even as experts build a better mousetrap with ML-assisted threat detection and response, threat actors use ML to write better phishing emails and launch ML-powered malware attacks.
As ML overcomes earlier constraints (e.g., insufficient data and limited power), the technology will continue to evolve. Experts predict that, eventually, computers will train themselves to cull and label the datasets they’re learning from (a process still performed by data analysts). While these advances bode well for future security technologies, threat actors will continue evolving their techniques with each new development.
For adequate protection, organizations need to adopt ML tools responsibly, not only to build their own security postures but also to understand the security postures of their third-party vendors.