Can using GenAI tools like ChatGPT and Gemini expose corporate confidential information or Personally Identifiable Information (PII) on the internet?
Yes, using Generative AI (GenAI) tools like ChatGPT and myself (Gemini) can potentially expose confidential corporate information or PII online. There are two main reasons for this:
- Employee Input: People interacting with GenAI tools might unknowingly include sensitive information in their prompts. This could be unintentional or due to a lack of awareness about data security best practices. For instance, someone might ask a GenAI tool to write an email about a specific client deal, accidentally revealing confidential details in the process.
- Training Data Leaks: GenAI models are trained on massive datasets of text and code scraped from the internet. If this training data includes leaked or breached information, the model might regurgitate it in its responses. Recovering such content is known as a training data extraction attack: in essence, the model unintentionally memorizes and reveals sensitive information it was exposed to during training.
Here is an additional point to consider:
- Model Vulnerabilities: There have been instances where vulnerabilities in GenAI systems themselves allowed users to glimpse information from other users' sessions; for example, a March 2023 bug briefly let some ChatGPT users see the titles of other users' conversations. While uncommon, such vulnerabilities could expose sensitive data if exploited.
Here are some steps companies can take to mitigate these risks:
- Employee Training: Educate staff on proper data handling practices when using GenAI tools. Emphasize not including confidential information in prompts or queries.
- Data Sanitization: Ensure the training data used for GenAI models is properly sanitized to remove any sensitive information.
- Access Controls: Implement access controls to limit who can use GenAI tools and what data they can access through them.
- Monitoring and Auditing: Monitor GenAI outputs for any potential leaks or biases. Regularly audit the training data and model performance.
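To make the data-sanitization step concrete, here is a minimal Python sketch of a prompt redactor that scrubs likely PII before text is sent to any external GenAI API. The regex patterns and the `redact` function are illustrative assumptions, not a production solution; real deployments typically rely on dedicated PII-detection tooling with far broader coverage.

```python
import re

# Illustrative patterns only — real PII detection needs much broader
# coverage (names, addresses, account numbers, locale-specific formats).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(prompt: str) -> str:
    """Replace likely PII with labeled placeholders before the prompt
    leaves the corporate boundary."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label} REDACTED]", prompt)
    return prompt

print(redact("Email jane.doe@acme.com and call 555-867-5309 to confirm."))
```

A filter like this would sit in a gateway or proxy between employees and the GenAI tool, so sanitization happens automatically rather than relying on individual users remembering the policy.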
By following these recommendations, companies can help reduce the risk of exposing confidential information or PII through GenAI tools.