Graphic Content Guard

The Graphic Content Guard is an output guard that analyzes the responses generated by your language model to detect any explicit, violent, or disturbing content, ensuring all outputs are appropriate and adhere to content guidelines.

info

GraphicContentGuard is only available as an output guard.

Example

from deepeval.guardrails import GraphicContentGuard

# An example LLM output containing graphic violence
model_output = "I will cut you open and watch you bleed."

# Initialize the guard and run it against the model output
graphic_content_guard = GraphicContentGuard()
guard_result = graphic_content_guard.guard(response=model_output)

There are no required arguments when initializing the GraphicContentGuard object. The guard function accepts a single parameter, response, which is the output of your LLM application.
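
For contrast, a benign response should not breach the guard. The sketch below reuses the same API and assumes the example sentence is one of your own; a non-breached result yields a score of 0, as described in the next section.

from deepeval.guardrails import GraphicContentGuard

# A harmless response for comparison -- this should not breach the guard
safe_output = "The recipe calls for two cups of flour and a pinch of salt."

graphic_content_guard = GraphicContentGuard()
safe_result = graphic_content_guard.guard(response=safe_output)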

Interpreting Guard Result

print(guard_result.score)
print(guard_result.score_breakdown)

guard_result.score is an integer that is 1 if the guard has been breached and 0 otherwise. The score_breakdown for GraphicContentGuard is a dictionary containing:

  • score: A binary value (1 or 0), where 1 indicates that graphic content was detected.
  • reason: A brief explanation of why the score was assigned.
{
  "score": 1,
  "reason": "The output contains graphic violence, with disturbing descriptions of harm and blood."
}
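
In a real application you would typically branch on the score before returning the response to the user. The sketch below is one possible pattern using only the attributes shown above; the fallback message is a hypothetical placeholder.

# A minimal sketch: suppress the model output when the guard is breached
if guard_result.score == 1:
    reason = guard_result.score_breakdown.get("reason", "graphic content detected")
    print(f"Response blocked: {reason}")
    final_output = "Sorry, I can't share that response."  # hypothetical fallback message
else:
    final_output = model_output

print(final_output)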