splApache-2.0from splunk/security_content

M365 Copilot Jailbreak Attempts

Detects M365 Copilot jailbreak attempts through prompt injection techniques including rule manipulation, system bypass commands, and AI impersonation requests that attempt to circumvent built-in safety controls. The detection searches exported eDiscovery prompt logs for jailbreak keywords like "pretend you are," "act as," "rules=," "ignore," "bypass," and "override" in the Subject_Title field, assigning severity scores based on the manipulation type (score of 4 for amoral impersonation or explicit rule injection, score of 3 for entity roleplay or bypass commands). Prompts with a jailbreak score of 2 or higher are flagged, prioritizing the most severe attempts to override AI safety mechanisms through direct instruction injection or unauthorized persona adoption.

Quality

FP risk

—

Forks

Views

Rule source🔒 locked

title: ████████████████████████
id: ████████-████-████-████-████████████
status: ██████████
description: ██████████████████████████████████████████
             ████████████████████████████████████████
author: ████████
tags:
  - attack.████
  - attack.████
logsource:
  product: ████████
  category: ████████████
detection:
  selection:
    Image|endswith: '████████████████'
    CommandLine|contains:
      - '████████████████████████'
      - '████████████████████████'
      - '██████████████████'
  condition: selection
level: ████████
falsepositives:
  - ████████████████████

🔒

Sign in to view the rule source

Free accounts can view the source for the top-ranked rules. Create one in seconds — no credit card required.