ISO 42001 vs SOC 2 for AI: Why SOC 2 Isn't Enough

The "non-deterministic" problem

Standard testing frameworks fall short in AI due to the non-deterministic, free-form nature of LLMs. Changing a model or even a single prompt can completely alter the behavior of AI, potentially opening the door to prompt injection or unethical outputs.

A control effective at preventing a vulnerability in one area might completely fail in another. SOC 2 compliance remains a great step in the right direction for maintaining a secure environment, but the additional risks presented by AI reveal some key shortfalls:

Static security reviews. Standard security reviews are often checklist-based audits ill-equipped for machine learning. This is where AI-specific penetration tests (pentests) become a necessity. A well-executed AI pentest differs fundamentally from a standard application or network security review. It moves past a simple checklist-based audit of static controls to test deeper, more creative, adversarial approaches.
‍

The point-in-time problem. A SOC 2 audit happens only once a year. But the rapid changes in LLM models and attack techniques evolve daily. While a web app’s underlying framework typically stays static over time (how many apps out there are running on old versions of NodeJS, for example), AI models change so fast that a yearly audit is obsolete by the time it's finished.
‍

Insufficient penetration testing. SOC 2 requires a third-party pentest, but does not require a third-party to specifically test LLMs for common attacks. A company could be SOC 2 compliant yet remain open to prompt injection because the test didn’t adequately cover the AI portion. Staying current requires us to constantly re-evaluate our controls and enhance our testing to ensure we are using the latest techniques as we test our applications.

There have been multiple reports of LLM chatbots being tricked by customers who were able to bypass some of the generic guardrails in place. In one case, an Air Canada bot erroneously promised a flight discount, ultimately leading to the airline being held responsible. This is just one example of a threat that often is not covered in a traditional pentest.

A customer service conversation showing how Replicant’s guardrails prevent hallucinations in AI interactions.

Replicant’s multi-pronged security approach addresses the LLM safety gaps that traditional audits don’t always account for.

Our multi-pronged testing approach

There are many reputable firms that offer pentests for traditional web applications. But AI attacks are a new and constantly evolving field. Many firms lack the specific expertise required for a comprehensive, LLM-specific pentest.

This is where Replicant’s multi-pronged approach comes in. We don’t depend solely on SOC 2 compliance, nor do we rely only on generic third-party pentests. Our approach is built on several key pillars:

An AI-specific foundation. MITRE ATLAS is a great choice for those looking to protect their application from new LLM attacks because it is modeled after the tried and true MITRE ATT&CK framework, but designed specifically for AI/LLM tactics and techniques. The ATLAS Threat Matrix consists of 16 tactics and 155 techniques across areas such as LLM prompt crafting, poisoning training data, LLM jailbreak, and others. Utilizing this framework as we develop our own internal pentests helps ensure we are testing our application against a broad range of known attack techniques.
‍

Internal, automated pentesting. Our own internal, automated pentests pit AI against AI to perform known and novel attack techniques. These test our deployments to validate our controls and guardrails against a full battery of MITRE ATLAS tests and creative prompt injection techniques.
‍

Expert manual testing. We also utilize our own experts in machine learning to manually test our platform and attempt to trick it using novel techniques. The threat landscape is changing daily. By leveraging our expertise in LLMs, we can try out new and novel techniques to exploit our own applications.
‍

Third-party validation: Our internal pentests are validated with third-party experts in AI security that cover the controls aligned with contemporary guidance like the OWASP top 10 LLM for Applications.
‍

Security built-in at every step. Every AI Agent we ship is inherently designed with guardrails and design choices that eliminate hallucinations, increase visibility, and prevent LLMs from ever making a decision that should be deterministic.

Staying ahead of the curve

As the regulatory and compliance landscape for AI continues to develop, Replicant is taking a proactive approach to our AI security.

Some key standards like ISO 42001 and the NIST AI Risk Management Framework are helping to shape the AI security standards, and Replicant is well positioned to stay in compliance with new AI standards with our multi-pronged approach to AI security.

In addition to our penetration testing, read more about deterministic execution in our AI Agents and how we maximize it to enhance our security beyond what is standard in the industry.

‍Schedule time with an expert to learn more about how Replicant can transform your contact center with AI.

SOC 2 is a standard compliance framework for auditing SaaS products and web apps. But it wasn’t developed with AI in mind.

While SOC 2 covers general aspects of AI security, like access controls and data protection, IT compliance and cybersecurity firms like Shellman have made clear that the SOC 2 trust services criteria “were not designed to cover AI-specific risks, controls, and considerations.”

In other words, passing a SOC 2 audit does not alone guarantee the rigorous controls needed for enterprise, customer-facing applications of AI.

The "non-deterministic" problem

Static security reviews. Standard security reviews are often checklist-based audits ill-equipped for machine learning. This is where AI-specific penetration tests (pentests) become a necessity. A well-executed AI pentest differs fundamentally from a standard application or network security review. It moves past a simple checklist-based audit of static controls to test deeper, more creative, adversarial approaches.
‍
The point-in-time problem. A SOC 2 audit happens only once a year. But the rapid changes in LLM models and attack techniques evolve daily. While a web app’s underlying framework typically stays static over time (how many apps out there are running on old versions of NodeJS, for example), AI models change so fast that a yearly audit is obsolete by the time it's finished.
‍
Insufficient penetration testing. SOC 2 requires a third-party pentest, but does not require a third-party to specifically test LLMs for common attacks. A company could be SOC 2 compliant yet remain open to prompt injection because the test didn’t adequately cover the AI portion. Staying current requires us to constantly re-evaluate our controls and enhance our testing to ensure we are using the latest techniques as we test our applications.

Our multi-pronged testing approach

An AI-specific foundation. MITRE ATLAS is a great choice for those looking to protect their application from new LLM attacks because it is modeled after the tried and true MITRE ATT&CK framework, but designed specifically for AI/LLM tactics and techniques. The ATLAS Threat Matrix consists of 16 tactics and 155 techniques across areas such as LLM prompt crafting, poisoning training data, LLM jailbreak, and others. Utilizing this framework as we develop our own internal pentests helps ensure we are testing our application against a broad range of known attack techniques.
‍
Internal, automated pentesting. Our own internal, automated pentests pit AI against AI to perform known and novel attack techniques. These test our deployments to validate our controls and guardrails against a full battery of MITRE ATLAS tests and creative prompt injection techniques.
‍
Expert manual testing. We also utilize our own experts in machine learning to manually test our platform and attempt to trick it using novel techniques. The threat landscape is changing daily. By leveraging our expertise in LLMs, we can try out new and novel techniques to exploit our own applications.
‍
Third-party validation: Our internal pentests are validated with third-party experts in AI security that cover the controls aligned with contemporary guidance like the OWASP top 10 LLM for Applications.
‍
Security built-in at every step. Every AI Agent we ship is inherently designed with guardrails and design choices that eliminate hallucinations, increase visibility, and prevent LLMs from ever making a decision that should be deterministic.

Staying ahead of the curve

As the regulatory and compliance landscape for AI continues to develop, Replicant is taking a proactive approach to our AI security.

In addition to our penetration testing, read more about deterministic execution in our AI Agents and how we maximize it to enhance our security beyond what is standard in the industry.

‍Schedule time with an expert to learn more about how Replicant can transform your contact center with AI.

Why SOC 2 isn’t enough for AI security

The "non-deterministic" problem

Our multi-pronged testing approach

Staying ahead of the curve

Request a free call assessment

Schedule a call with an expert

Lorem ipsum dolor sit amet consectetur. Dignissim faucibus laoreet faucibus scelerisque a aliquam.

Lorem ipsum dolor sit amet consectetur.

”We have resolved over 125k calls, we’ve lowered our agent attrition rate by half and over 90% of customers have given a favorable rating.”

Why SOC 2 isn’t enough for AI security

The "non-deterministic" problem

Our multi-pronged testing approach

Staying ahead of the curve

What’s next

Ready for a true AI partnership?

Turn the roadmap into reality.

Still evaluating AI like it’s a demo?

Technical Perspectives on AI in the Contact Center

A framework for more rigorous enterprise evaluations

Why SOC 2 isn’t enough for AI security

The "non-deterministic" problem

Our multi-pronged testing approach

Staying ahead of the curve

Related Blog Posts

When data helps (and hurts) your ASR model

Why real voice AI is slower than the demo

From scripting to reasoning: the new world of Agent Experience Design (AXD)

Balancing the tradeoffs of supervisor LLMs in real-time conversations

Why SOC 2 isn’t enough for AI security

Request a free call assessment

Schedule a call with an expert

Lorem ipsum dolor sit amet consectetur. Dignissim faucibus laoreet faucibus scelerisque a aliquam.

Lorem ipsum dolor sit amet consectetur.

”We have resolved over 125k calls, we’ve lowered our agent attrition rate by half and over 90% of customers have given a favorable rating.”

Why SOC 2 isn’t enough for AI security

The "non-deterministic" problem

Our multi-pronged testing approach

Staying ahead of the curve

Share this post

What’s next

Ready for a true AI partnership?