August 28, 2023

February 5, 2025

Risk Grustlers EP 3 | AI with a Pinch of Responsibility

We explore the burgeoning world of AI in the third episode of our podcast Risk Grustlers, with none other than Walter Haydock, Founder and CEO of StackAware, a cybersecurity and risk management platform.

If you’re a security enthusiast you’re sure to have come across Walter’s blog Deploy Securely. You name an infosec hot topic, and Walter’s written about it!

In this episode, Walter gives us a crash course on all things LLM – from listing the differences between using a self-hosted LLM and a third party LLM to explaining the top five risks to watch out for while using them.

He also discusses how companies can collect information responsibly with our CEO Aayush Ghosh Choudhury. Get ready for an exciting deep dive into the world of AI!

Watch the complete podcast here

Here are some highlights from the captivating episode.

Aayush: Companies seem to either take an open-source Large Language Model (LLM), train it themselves, and build on it, or employ a third-party pre-trained LLM, like ChatGPT. How does the deployment approach affect the potential risks? What are the main pros and cons when it comes to security and risk management?

Walter: So, the classic cybersecurity risk assessment still holds true. It’s all about deciding if you should do your work and data handling in-house or hand it off to someone else.

Things like the vendor’s vulnerability management policies and their response capabilities matter, just like your own capabilities. Whether you’re using your own environment or someone else’s, those factors still count.

Now, let’s talk about AI tech, like Large Language Models (LLMs). There’s a tricky twist here, I call it unintentional training. This happens when you feed data into a model that you didn’t mean to, like stuff that your security policy might prohibit.

If the model learns from this unintended data, it could surface sensitive info to the vendor or their other customers. That could be a mess for you.

It’s not easy to pin down how often this risk comes to life. There are examples out there, like Samsung accidentally sharing confidential stuff with ChatGPT. There’s an article on it, but it’s not totally confirmed.

Amazon also had an interesting incident. Some answers from ChatGPT sounded a lot like Amazon’s interview questions. This implies someone might’ve trained the model using those questions. So, on top of regular third-party risk, you’ve got the twist of unintended training by the model.

Aayush: As a vendor, how can I figure out the risks involved in these two models? Is one option inherently riskier than the other? If it is, what’s the deal with that?

Walter: No, one isn’t inherently riskier than the other. They both come with their own characteristics and tradeoffs. If you’re into a third-party API like OpenAI’s, you’re banking on them to maintain the confidentiality, integrity, and availability of the data that you provide to it.

Now, OpenAI does things differently for data retention. The API deletes data in 30 days, but the user interface for ChatGPT is murkier. They’ll hang onto it for as long as it’s needed, which could be forever. You’ve got to dig into their data policies and security setup.

For instance, OpenAI’s got a SOC 2 Type II attestation. They’ve passed a third-party security audit. However, earlier, some user info leaked due to a vulnerability. It’s like giving someone your data to handle – you don’t see exactly how they’re keeping it locked up.

Now, if you take the self-hosting route, which is like using infrastructure as a service (like AWS), it’s all about where you land on the tech stack – higher up or down. You can peek at data processing and even have model control. You could even roll back if you goof up.

But, yep, risks hang out here too. You’re responsible for running the show, managing updates, and fixing vulnerabilities. It’s like housekeeping, but for your tech. And misconfigurations are a major culprit for security breaches, which you definitely want to dodge.

Some big players even struggle with keeping things up to date due to complex processes. While that might be cool for availability, it could be risky for security if a major vulnerability pops up and you need to patch it real quick.

Thing is, a software as a service (SaaS) provider, like OpenAI, is a pro at running things speedily and effectively. So these are the tradeoffs you’ve got to weigh for security.

Aayush: In terms of liability, what is the difference between using a self-hosted LLM and using a third party LLM should there be an incident?

Walter: It all comes down to your particular contractual and regulatory commitments. Certain regulations or contractual terms might outright forbid entrusting data to a third party, either entirely or without getting the green light from the original data owner. If you’re bound by those stipulations, make sure you adhere to them diligently and follow the terms of your contract.

However, assuming you’re not tied down by such requirements, your primary concern should be shielding data from any potential loss while upholding its confidentiality, integrity, and availability. Determine the most effective route that achieves these goals while still staying well within the lines of legal and compliance regulations.

Aayush: For application developers leveraging a third-party LLM to create a tool, there’s a wealth of information out there, including resources like the OWASP Top 10 and the NIST AI RMF. However, it can be overwhelming, especially for those working on LLM-based utilities. Can you list the top five key concerns they should keep an eye on?

Walter: Number one would be direct prompt injection. This is followed by indirect prompt injection. Then, coming in at number three is the unintentional training issue I mentioned earlier, which becomes especially relevant with third-party services.

Number four is data poisoning of the model itself. Finally, rounding out the top five is the risk of privacy regulation violations specific to LLM usage.

Aayush: Can you go into detail about direct prompt injection?

Walter: Prompt injection is quite a thorny issue in terms of security. It’s a challenge that doesn’t have a clear-cut solution. Balancing risks and rewards is essential here. Even though this problem isn’t fully solvable, it doesn’t mean you can’t use LLMs. Direct prompt injection is the simplest to grasp. Examples abound where users tell the model to commit crimes, create malware, or hack systems. Despite safety layers, people can still breach these bounds through direct prompt injection.

Direct prompt injection implies a user is intentionally manipulating the model against rules, terms, or laws. Picture a scenario where the LLM connects to a backend function that can start or stop a process. Imagine the chaos if an attacker tricks the LLM into shutting down a critical service through clever manipulation.

To counter such risks, you can employ rules-based filtering, but it’s not foolproof due to human ingenuity. A supervisory LLM can serve as a security checkpoint, evaluating prompts for hidden malicious content before the main model processes them. On the backend, data access control matters. Restrict the chatbot’s access to specific customer information, avoiding exposure of others’ data. Use non-LLM functions to manage data access and authentication securely.
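As a rough illustration of two of the controls mentioned above, here is a minimal Python sketch of a rules-based prompt filter and a non-LLM data access check. The deny-list patterns, the `ORDERS` store, and the `Session`, `passes_rules_filter`, and `get_orders` names are all hypothetical, made up for this example. A real deny list would need to be far broader and, as noted, still would not be foolproof.

```python
import re
from dataclasses import dataclass

# Hypothetical deny-list patterns; a real list would be larger and maintained over time.
DENY_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
]


def passes_rules_filter(prompt: str) -> bool:
    """Return False if the prompt matches any deny-list pattern."""
    return not any(pattern.search(prompt) for pattern in DENY_PATTERNS)


# Hypothetical in-memory data store keyed by customer ID.
ORDERS = {
    "cust-1": ["order-100", "order-101"],
    "cust-2": ["order-200"],
}


@dataclass(frozen=True)
class Session:
    """Authenticated session created by ordinary (non-LLM) login code."""
    customer_id: str


def get_orders(session: Session, requested_customer_id: str) -> list[str]:
    """Backend function the chatbot is allowed to call.

    The customer ID requested by the LLM is treated as untrusted input:
    only the ID bound to the authenticated session is ever served.
    """
    if requested_customer_id != session.customer_id:
        raise PermissionError("chatbot may only read the signed-in customer's data")
    return list(ORDERS.get(session.customer_id, []))
```

The point of the second function is that the authorization decision lives in plain code, so even a successfully manipulated model cannot talk its way into another customer's records.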

Aayush: Could you give us a few examples of indirect prompt injection, which was the second risk you mentioned?

Walter: So, this gets a bit trickier because security researchers are already pulling off some impressive feats. They’re embedding AI “canaries” in websites. These canaries instruct autonomous agents to perform actions, some harmless like saying hi, while others are more damaging, like extracting sensitive info or passwords from the user’s system. This creates a prompt injection issue, where the model follows someone else’s directions, inadvertently causing problems.

Here’s a neat example: A security researcher used multiple ChatGPT plugins, including the WebPilot plugin, to direct the model to a website with malicious instructions. The model executed a workflow, accessed the user’s email, and retrieved sensitive data. That’s indirect prompt injection revealing sensitive info.

Be cautious with autonomous agents. There’s an open-source project called Auto-GPT that lets a model roam the web independently. Scrutinize these use cases carefully. Applying safeguards to function calls, especially if the LLM can trigger actions, is crucial. You’d want the right checks and balances before diving into this.

In some cases, users might need to explicitly consent, but that’s not foolproof. Segregation of duties and strong authentication are essential controls. Avoiding autonomous LLM use, unless it’s necessary, might be wise. If you must use them, consider trusted websites to limit risks. While it won’t guarantee safety, it could lower the chances of stumbling upon a malicious script.
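The safeguards described here (checks on function calls, explicit user consent, and a trusted-website list) could be sketched roughly as follows. This is only an illustration under stated assumptions: the domain allowlist, the action names, and the `authorize_tool_call` function are hypothetical and not taken from any specific agent framework.

```python
from urllib.parse import urlparse

# Hypothetical allowlist and action names, for illustration only.
TRUSTED_DOMAINS = {"docs.python.org", "example.com"}
SENSITIVE_ACTIONS = {"send_email", "stop_service"}


def is_trusted_url(url: str) -> bool:
    """Allow only HTTPS URLs whose host is exactly on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in TRUSTED_DOMAINS


def authorize_tool_call(action: str, args: dict, user_confirmed: bool = False) -> bool:
    """Decide whether a tool call requested by the LLM may run."""
    if action == "browse":
        # Browsing is limited to trusted websites.
        return is_trusted_url(args.get("url", ""))
    if action in SENSITIVE_ACTIONS:
        # Sensitive actions need explicit consent from the human user.
        return user_confirmed
    # Anything not explicitly handled is denied by default.
    return False
```

Matching on the exact hostname means a look-alike such as `example.com.evil.test` is rejected, and the default-deny branch keeps unknown actions from running silently.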

Aayush: Could you tell us ways to mitigate the third risk that you listed—the unintentional training issue?

Walter: Imagine a scenario where an employee accidentally feeds personal data, credit card info, or credentials into a big language model. If that model is managed by a third party, it’s harder to undo the data entry later on. And that model might spit out that info to someone else, jeopardizing privacy.

On the confidentiality side, let’s say you input a trade secret. If the model uses that info, you might’ve just handed your competition a solution they didn’t have. Training chatbots can also lead them to new strategies they didn’t know before, potentially sharing your secrets.

Mitigating this risk involves a policy framework – clear guidelines on what kind of info can go into which model. You’d want to steer clear of personal and sensitive data in third-party models unless you have solid controls. Some services, like Azure OpenAI government, are certified for sensitive info and might be okay.
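One way to make such a policy enforceable is to express it as a lookup that tooling can check before data is sent anywhere. The sketch below assumes a simple four-level classification and three made-up deployment names; both are hypothetical and would need to match your own policy.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical data classification levels, lowest to highest."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3  # e.g. personal data, credentials, card numbers


# Hypothetical deployments and the most sensitive data each may receive.
MAX_SENSITIVITY = {
    "third_party_chat_ui": Sensitivity.PUBLIC,
    "third_party_api": Sensitivity.INTERNAL,
    "self_hosted_llm": Sensitivity.CONFIDENTIAL,
}


def is_allowed(model: str, data_level: Sensitivity) -> bool:
    """Return True if policy permits sending data of this level to the model."""
    ceiling = MAX_SENSITIVITY.get(model)
    # Unknown models are denied rather than silently allowed.
    return ceiling is not None and data_level <= ceiling
```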

Another way is pre-processing data before it hits the model. I made an open-source tool, GPT Guard, that detects sensitive info and replaces it before the model sees it. Commercial tools do this too. And if you self-host the model, you have more control and can roll back or monitor it closely.
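To show what this kind of pre-processing can look like, here is a minimal Python sketch that replaces email addresses and likely card numbers before text is sent to a model. It is not how GPT Guard or any commercial tool works; the regular expressions, placeholders, and function names are assumptions chosen for this example, and real detectors cover many more data types.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Check a digit string against the Luhn checksum used by payment cards."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact(text: str) -> str:
    """Replace likely card numbers and email addresses with placeholders."""

    def replace_card(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        # Only redact digit runs that pass the checksum, to limit false positives.
        return "[CARD]" if luhn_valid(digits) else match.group()

    text = CARD_RE.sub(replace_card, text)
    return EMAIL_RE.sub("[EMAIL]", text)
```

For instance, `redact("Card 4111 1111 1111 1111, mail jane@example.com")` returns `"Card [CARD], mail [EMAIL]"`, while a 16-digit reference number that fails the Luhn check is left untouched.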

However, self-hosting isn’t a silver bullet. If you have a customer chatbot with your secret sauce, even if it’s internal, a customer might dig it out. So the same safeguards apply, just with more insight into the model’s behavior.

Aayush: Can you explain data poisoning? How is it different from direct prompt injection and unintentional training?

Walter: Unlike prompt injection or unintentional training, where the model starts clean, data poisoning assumes the model began well, but the data used to train it was intentionally messed up. This can change how the model operates.

For instance, think of someone creating fake web pages praising themselves as a developer. The model learns from this and later, in a job interview where the interviewer uses the same model, the person gets the job because of these fake accomplishments. That’s data poisoning. Another case might be training the model to be mostly predictable, but at certain times, it secretly leaks sensitive data.

Imagine you’re building an internal model to detect credit card fraud. You show it both fraudulent and legitimate cases. Now, if an attacker sneaks in manipulative data that tricks the model, it might behave fine most of the time but leak credit card data to a malicious endpoint occasionally.

Two scenarios can cause this. One, the person training the model might unintentionally incorporate malicious data. Or, they might be malicious themselves and insert a tainted model into an organization’s workflow. An example from Mithril Security demonstrated how someone could upload a poisoned model to a platform like Hugging Face.

In one case, the model claimed Yuri Gagarin was the first man on the moon, which is incorrect. These risks show that even if the model starts pure, corrupted data or malicious actors can lead it astray in unexpected ways.

Aayush: Now, moving on to the fifth point, which involves privacy regulation violations. You might currently be adhering to local regulations, but these rules are ever-changing and vary greatly between countries when it comes to LLMs. Given the dynamic nature of these regulations, how can companies navigate this uncertainty? How do they mitigate business risks? Is there a foundational framework they can adopt?

Walter: The existing regulatory frameworks like ISO standards or even privacy laws such as GDPR and CCPA, while important, sometimes struggle to keep pace with swiftly evolving technologies like AI. New regulations are emerging on the horizon, like the potential European Union AI Act, which could put certain AI applications off-limits.

So, using AI inherently involves a degree of risk. To tread wisely, especially in terms of privacy, the smart approach would be to limit the data you collect and process. I mean, it’s baffling how much unnecessary data some businesses gather, right? Those lengthy forms and extensive recordings, they often don’t serve a clear purpose. And, some of that data could even fall under the biometric processing category under GDPR, if you’re analyzing videos for sentiment, for instance.

So, the golden rule here is to only gather the bare minimum personal information needed to achieve your goal. But I won’t sugarcoat it – there will be gray zones. We might see regulatory rules emerge after some enforcement action slaps a company with fines. It’s a bit of a dynamic dance, and companies need to be ready to pivot swiftly as the landscape evolves.

Aayush: What are some ways in which companies can exercise control over the nature of information they’re collecting?

Walter: Take a peek at your customer onboarding process, for instance, and just gaze at all those forms you make customers fill out. Consider if you really need all that information right off the bat, especially at the start of your marketing funnel.

My suggestion is to keep it simple. Grab an email if you can – that’s often enough. Maybe you’ve got to gather specific residency details for compliance, but here’s the deal: Don’t go overboard. Every piece of data might seem like a business booster, but if you’re not using it immediately, why bother, right?
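In code, this kind of data minimization can be as simple as an allowlist applied before anything is stored. The sketch below is illustrative only: the `REQUIRED_SIGNUP_FIELDS` set and the `minimize` function are hypothetical, and which fields count as required depends on your own compliance needs.

```python
# Hypothetical set of fields the signup step actually needs.
REQUIRED_SIGNUP_FIELDS = frozenset({"email"})


def minimize(form_data: dict, required: frozenset = REQUIRED_SIGNUP_FIELDS) -> dict:
    """Keep only the required fields and drop everything else before storage."""
    missing = required - set(form_data)
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return {key: form_data[key] for key in sorted(required)}
```

Anything a form happens to submit beyond the allowlist, such as a phone number or date of birth, never reaches the data store in the first place.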

Now, talk about data storage. Having duplicates isn’t just operationally messy, it’s also a security risk. So, streamline. Stick to one main data store, backed up of course for emergencies, but without clones floating around. And emails are a wild territory. People put all sorts of stuff in there. To keep things in check, use hyperlinks to secure spots like Google Drive or SharePoint. You can yank access if needed, trimming the data shared.

One more thing to consider: LLMs might dig into those emails down the line, for various reasons. By being careful with what goes into those emails from the get-go, you’re actually reducing the privacy risks down the road.

Don’t forget to check out these free resources to learn more about what was discussed in the podcast:

Security policy template

https://bit.ly/gen-ai-security-policy

5-day email course on AI security

https://bit.ly/ai-email-course
