The Power and Limitations of AI in Cybersecurity
Today’s chief information security officers (CISOs) face new cybersecurity challenges because of the increasing use of artificial intelligence (AI), particularly generative AI (GenAI). This is not a surprise given the growing use of GenAI in the workplace, with fully two-thirds of organizations last year reporting that they were already beginning to use it and only 3% of enterprises not planning to adopt it.
AI has become a double-edged sword for cybersecurity. On the one hand, it has lowered the barrier to entry into cybercrime, enabling would-be criminals to generate malware even when they lack programming skills and providing more sophisticated criminals with capabilities few could have imagined a relatively short time ago. On the other hand, cyber defenders can take advantage of AI for intelligent automation and defense strategies. AI has the potential to level the playing field even against AI-equipped adversaries and the dynamic threats they pose.
AI Challenges and Impact on Cyberthreats
With GenAI, a would-be malicious cyber actor no longer needs programming skills, because large language model (LLM) tools can be used to write malware. AI is also used to exploit software vulnerabilities quickly once they are publicly known, letting malicious actors weaponize these vulnerabilities faster than many customers can apply vendor patches or updates. GenAI can dramatically increase the sophistication of spear-phishing attacks, elevating them above the boilerplate content, spelling errors, and awkward grammar that organizations often teach users to look for. Now, when a malicious actor harvests a victim’s address book, they may also take email content and use it to generate tailored emails that match the syntax and subjects the compromised sender has used with each addressee.
AI also equips cybercriminals with new tools and capabilities. Consider business email compromise, in which an attacker poses as a senior executive such as the CEO and asks an employee to bypass normal processes and transfer funds. Organizations typically train employees to counter these requests by reaching out to the requester by phone or video to validate both the sender and the request. Criminals have begun using AI-generated voice and video impersonations of the purported sender, along with chatbot-generated responses, to thwart such checks.
AI-driven data analytics have given malicious cyber actors new tools for exploitation that make new classes of data attractive targets. A decade ago, only nation-states had the data centers and computing power needed to exploit large data sets. The AI-driven revolution in data mining and the growth of pay-as-you-go computing power and storage mean that massive data sets have become exploitable, and therefore attractive targets, for criminal actors as well as nation-states.
Using AI for Cyber Defense
Cybersecurity professionals use the term attack surface to describe the size and complexity of the digital environment and the difficulty of mapping or even fully understanding it, that is, of dealing with the “unknown unknowns.” AI and the growing use of cybersecurity mesh architectures provide the opportunity to turn the size and complexity of this environment from a liability for network defenders into a potential advantage. Sensors linked in a common architecture allow network operators and defenders to generate data in real time, and increasingly powerful AI and machine learning (ML) can make sense of it in real time.
Malicious cyber actors seldom succeed the first time they attack a target—even using AI—but rely on their failed attacks being missed in the deluge of alerts flooding into an enterprise security operations center each shift. AI helps spot anomalous activity, determine which anomalies are attacks, generate a real-time response to block the attack and inoculate the rest of the organization’s digital assets against further attacks. Remember, AI and ML are fueled by data—and the more data they have to train on and work with, the more effective they are. Generally, those who operate and defend an enterprise environment are better positioned to have such data than those seeking to break into the network. Some niches, such as spear phishing, asymmetrically favor the attacker; but as a general proposition, the “big data arms race” favors the defender.
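To make the idea of spotting anomalous activity concrete, here is a minimal sketch of the simplest possible approach: scoring new observations against a statistical baseline. It is an illustration only, not how any particular security product works. The function names, the failed-login counts, and the threshold of 3 standard deviations are all assumptions chosen for the example; production tools typically use far richer models and many more signals.

```python
from statistics import mean, stdev


def anomaly_scores(baseline, observed):
    """Return a z-score for each observed value relative to the baseline."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        sigma = 1.0  # flat baseline: avoid dividing by zero
    return [(x - mu) / sigma for x in observed]


def flag_anomalies(baseline, observed, threshold=3.0):
    """Return the indices of observations far above the baseline.

    The threshold (in standard deviations) is an assumed value;
    a real deployment would tune it against its own alert volume.
    """
    scores = anomaly_scores(baseline, observed)
    return [i for i, z in enumerate(scores) if z > threshold]


# Hypothetical hourly failed-login counts for one host.
baseline_failed_logins = [4, 6, 5, 7, 5, 6, 4, 5]
observed_failed_logins = [5, 6, 48, 7]

flagged = flag_anomalies(baseline_failed_logins, observed_failed_logins)
print(flagged)  # indices of the hours worth an analyst's attention
```

The point about data holds even in this toy version: the longer and cleaner the baseline the defender can collect, the more reliably a failed attack stands out from routine noise.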
As empowering as AI is for CISOs, enterprises face other challenges relating to using AI in the workplace. A key concern is that data contained in GenAI queries can become part of the dataset used to train the underlying large language model (LLM). Other common problems include copyright infringement, disclosure of personally identifiable information, unknowing use of biased or objectionable data, and AI “hallucinations,” which are glib but patently wrong outputs. Many organizations are proceeding cautiously in their use of GenAI; but in most cases, the workforce does not understand the reasons for this deliberative pace or see the digital guardrails that are being implemented. Employees are becoming accustomed to using GenAI in their private lives and are experimenting with it independently in the workplace. GenAI has become the latest form of shadow IT that CISOs and CIOs must deal with.
AI Best Practices
You should look at taking advantage of AI, but be smart about it. Investigate the market and work with providers whose commitment to security matches your needs. Look to implement GenAI solutions using one or more of these approaches:
- Run a foundational model in a private environment so the training data and output remain segregated, trading off some of the breadth and power of dynamic “live” LLM data for the assurance that your queries will not expose your organization’s sensitive data to outsiders.
- Use retrieval-augmented generation, which draws on validated external data to improve the accuracy of a foundational model's output without feeding the model additional training data. This approach reduces security and accuracy risks.
- Run data loss prevention as a filter on input into public LLMs.
- Talk to your GenAI provider and tailor your use cases with data security in mind. Look into privacy and security settings. Can you prohibit your data from being saved? Can you do it manually? On a timed basis? Can you run queries with anonymized data?
- If you use third-party apps or Software-as-a-Service providers that have embedded GenAI into their tools, ask the same questions and determine how they safeguard your input and results.
Best practices include:
- Incorporate strict access controls. Limit the use of specific datasets to authorized users.
- Use privacy-enhancing technologies: data obfuscation (adding “noise” or removing identifying detail, that is, anonymization), encrypted data processing (homomorphic encryption, multiparty computation), federated or distributed analytics (those processing the data cannot see its content), and data accountability tools (user-defined controls).
- Take a hard look at the volume of data. The more data you provide, the greater the likelihood of leakage.
- Train the team using the model to reflect best practices, compliance, and threats.
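As a rough illustration of what a data loss prevention filter on LLM input might look like, the sketch below redacts a few obvious identifiers from a prompt before it leaves the organization. Everything in it is an assumption made for the example: the function name, the placeholder labels, and the three regular expressions, which are deliberately simplistic. Commercial DLP tools rely on much broader detection (dictionaries, checksums, classifiers, document fingerprinting), so treat this as a sketch of the idea rather than a working control.

```python
import re

# Hypothetical, deliberately simple patterns; real DLP rules are far broader.
PATTERNS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def redact_prompt(prompt):
    """Replace matches with placeholders; return the text and match counts."""
    findings = {}
    redacted = prompt
    for label, pattern in PATTERNS.items():
        redacted, count = pattern.subn(f"[{label}_REDACTED]", redacted)
        if count:
            findings[label] = count
    return redacted, findings


prompt = (
    "Summarize the complaint from jane.doe@example.com about card "
    "4111 1111 1111 1111 and SSN 123-45-6789."
)
safe_prompt, findings = redact_prompt(prompt)
print(safe_prompt)
print(findings)
```

The returned counts could feed a log or a policy decision, for example blocking the query outright instead of redacting it when too many sensitive items are found. That choice, like the patterns themselves, depends on your organization's risk tolerance.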