ChatGPT Language Model Risks

ChatGPT has exploded across the Internet and ushered in a new era of Artificial Intelligence (AI). With AI tools becoming increasingly powerful, many business leaders are asking how to use them in their organisations.

AI chatbots and Large Language Models (LLMs) present a rising security threat, the British National Cyber Security Centre (NCSC) has warned. The NCSC has issued a detailed warning advising people to “take great care” with data they choose to submit to chatbots, given that the companies operating them will “almost certainly” access it.

ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users just two months after launch. 

It is a fast-growing application and its popularity is leading many competitors to develop their own services and models, or to rapidly deploy those that they’ve been developing internally. However, as the use of AI-powered language models such as ChatGPT becomes more prevalent in both business and personal settings, it's critical to understand the serious cyber security risks they present. 

They are powerful tools, but there are very real dangers to consider, as well as ethical implications, especially if you plan to use them in your business. 

What Are ChatGPT & LLMs?

ChatGPT is an Artificial Intelligence chatbot developed by OpenAI, a US tech startup. It's based on GPT-3, a language model released in 2020 that uses deep learning to produce human-like text, but the underlying LLM technology has been around much longer.

An LLM is an algorithm that has been trained on a large amount of text-based data, typically scraped from the open Internet. That data covers web pages and, depending on the LLM, other sources such as scientific research, books or social media posts. The volume is so large that it’s not possible to filter out all offensive or inaccurate content at ingest, so 'controversial' content is likely to be included in the model.

LLMs use algorithms to analyse the relationships between different words and turn those relationships into a probability model. It is then possible to give the algorithm a 'prompt' - by asking it a question, for example - and it will provide an answer based on the relationships of the words in its model.
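
As a rough illustration of a "probability model" built from word relationships, the toy sketch below counts which word follows which in a tiny text and turns the counts into probabilities. This is a deliberately simplified stand-in: real LLMs use large neural networks rather than word-pair counts, and the function names and sample text here are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count how often each word follows another, then normalise to probabilities."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1

    model = {}
    for current_word, followers in counts.items():
        total = sum(followers.values())
        model[current_word] = {w: c / total for w, c in followers.items()}
    return model


def most_likely_next(model, word):
    """Return the most probable next word, or None if the word was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return max(followers, key=followers.get)


# Hypothetical training text, chosen only to keep the example small.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)

print(most_likely_next(model, "the"))  # prints "cat"
```

In this sample, "the" is followed by "cat" twice and by "mat" once, so the model assigns "cat" a probability of two thirds. The answer reflects only what appeared in the training text, which is why inaccurate source data can lead to inaccurate output.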

Typically, the data in the model is static after it has been trained, although it can be refined by 'fine-tuning', which is training on additional data, and by 'prompt augmentation', which is providing context information about the question.
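
Prompt augmentation can be pictured as plain text assembly: the context is placed in front of the question before anything is sent to the model. The sketch below is a minimal illustration under that assumption. The function name, the prompt wording and the sample context are all hypothetical, and no LLM service is actually called.

```python
def augment_prompt(question, context_snippets):
    """Prepend context information to a question to form a single prompt string."""
    context_block = "\n".join(f"- {snippet}" for snippet in context_snippets)
    return (
        "Use the following context to answer the question.\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )


# Hypothetical example: the model itself is unchanged, only the prompt grows.
prompt = augment_prompt(
    "What is our refund window?",
    ["Refunds are accepted within 30 days of purchase."],
)
print(prompt)
```

Note that everything in the context block travels to the LLM provider along with the question, so the same care applies to the context as to the query itself.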

ChatGPT allows users to ask an LLM questions, as you would when holding a conversation with a chatbot. Other current examples of LLMs include Google’s Bard and Meta’s LLaMA.

LLMs are impressive for their ability to generate a huge range of convincing content in multiple human and computer languages. However, they contain some serious flaws. According to the NCSC:

  • They can get things wrong and ‘hallucinate’ incorrect facts.
  • They can be biased and are often gullible (in responding to leading questions, for example).
  • They require very large and expensive computer resources and access to vast data to train from scratch.
  • They can be coaxed into creating toxic content and are prone to ‘injection attacks’.

LLMs Could Reveal Your Information

A common concern is that an LLM might 'learn' from your prompts and offer that information to others who query for related things. Currently, LLMs are trained, and then the resulting model is queried. An LLM does not (at the time of writing) automatically add information from queries to its model for others to query. That is, including information in a query will not result in that data being incorporated into the LLM. However, the query will be visible to the organisation providing the LLM, which in the case of ChatGPT is OpenAI. Those queries are stored and will almost certainly be used for developing the LLM service or model at some point.

This could mean that the LLM provider and its business partners are able to read queries and may incorporate them into future versions. Consequently, the terms of use and privacy policy need to be thoroughly understood before asking sensitive questions.

A question might be sensitive because of data included in the query, or because of who is asking the question. Examples include a CEO who is discovered to have asked 'how best to lay off an employee?', or somebody asking revealing health or relationship questions. There is also the possibility of information being aggregated across multiple queries made using the same login.

Another risk, which increases as more organisations produce LLMs, is that queries stored online may be hacked, leaked, or, more likely, accidentally made publicly accessible. This could include potentially user-identifiable information. A further risk is that the operator of the LLM is later acquired by an organisation with a different approach to privacy than was the case when users first entered the data.

The NCSC Recommends

  • Do not include sensitive information in queries to public LLMs.
  • Do not submit queries to public LLMs that would lead to issues were they made public.
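
One way an organisation might support these recommendations is to screen queries before they leave the network. The sketch below masks two kinds of obviously sensitive data. It is an assumption-laden illustration and not NCSC guidance: the pattern names, the patterns themselves and the `redact` function are hypothetical, and the patterns are far from exhaustive.

```python
import re

# Illustrative patterns only; a real deployment would need a much broader, tested set.
REDACTION_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CARD_NUMBER": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def redact(query):
    """Replace each match of a known sensitive pattern with a labelled placeholder."""
    for label, pattern in REDACTION_PATTERNS.items():
        query = pattern.sub(f"[{label} REDACTED]", query)
    return query


safe_query = redact(
    "Email jane.doe@example.com about card 4111 1111 1111 1111 today"
)
print(safe_query)
# prints "Email [EMAIL REDACTED] about card [CARD_NUMBER REDACTED] today"
```

Pattern matching of this kind cannot recognise context, such as a query that is sensitive because of who is asking it, so it can only complement the judgement the recommendations call for.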

How Can You Safely Provide LLMs With Sensitive Information?

In the wake of the excitement around LLMs, many organisations may be wondering if they can use LLMs to automate certain business tasks, which may involve providing sensitive information either through fine-tuning or prompt augmentation. Whilst this approach is not recommended for public LLMs, ‘private LLMs’ might be offered by a cloud provider (for example), or can be entirely self-hosted:

  • For cloud-provided LLMs, the terms of use and privacy policy again become key (as they are for public LLMs), but are more likely to fit within the existing terms for the cloud service. Organisations need to understand how the data they use for fine-tuning or prompt augmentation is managed.
    • Is it available to the vendor’s researchers or partners?
    • If so, in what form? Is data shared in isolation or in aggregation with other organisations?
    • Under what conditions can an employee at the provider view queries?
  • Self-hosted LLMs are likely to be very expensive. However, following a security assessment, they may be appropriate for handling organisational data.

LLMs Make Life Easier For Cyber Criminals

There have been some examples of how LLMs can help write malware. The concern is that an LLM might help someone with malicious intent (but insufficient skills) to create tools they would not otherwise be able to deploy.

In their current state, LLMs produce output that looks convincing whether or not it is correct, and they are suited to simple tasks rather than complex ones. This means LLMs are useful for 'helping experts save time', as the expert can validate the LLM's output.

For more complex tasks, it's currently easier for an expert to create the malware from scratch than to spend time correcting what the LLM has produced. However, an expert capable of creating sophisticated malware is likely to be able to coax an LLM into writing capable malware.

This trade-off between 'using LLMs to create malware from scratch' and 'validating malware created by LLMs' will change as LLMs improve.

LLMs can also be queried to advise on technical problems. There is a risk that criminals might use LLMs to help with cyber attacks beyond their current capabilities, especially once an attacker has accessed a network. For example, if an attacker is struggling to escalate privileges or find data, they might ask an LLM, and receive an answer that's not unlike a search engine result, but with more context. 

Current LLMs provide convincing-sounding answers that may only be partially correct, particularly as the topic gets more niche. These answers might help criminals with attacks they couldn't otherwise execute, or they might suggest actions that hasten the detection of the criminal. In any case, the attacker’s queries will likely be stored and retained by LLM operators.

As LLMs improve, there is a risk of criminals using them to write convincing phishing emails, including emails in multiple languages. This may aid attackers who have high technical capabilities but lack linguistic skills, by helping them to create convincing phishing emails (or conduct social engineering) in the native language of their targets. Consequently, the NCSC suggests that we might soon see:

  • More convincing phishing emails as a result of LLMs.
  • Attackers trying techniques they didn't have familiarity with previously.
  • A risk of a lesser-skilled attacker writing highly capable malware.

Conclusion

LLMs and ChatGPT are exciting developments with the potential to engage users and gain widespread acceptance. But there are risks involved in the unrestricted use of public LLMs. Individuals and organisations should take great care with the data they choose to submit in prompts. The NCSC advises that organisations should ensure that those who want to experiment with LLMs are able to, but in a way that doesn't place the organisation's data at risk.

AI language models like ChatGPT offer incredible potential for businesses and individuals, but they also present serious security and ethical risks that must be addressed. By following best practices and taking proactive steps to mitigate the risks, the safe and responsible use of these tools can be ensured.

Sources: NCSC, Reuters, TechRadar, Proactive Investors, Maddyness
