Persona Evaluation: condition2__screen_with_metadata__sample500_seed99

Bucket Distribution

Metric	Value	Interpretation
DPP log-det	-29.9868	Higher = more diverse + high-quality set
Cluster coverage	1.000	Fraction of BGT clusters with a task-critical hit
ILAD	0.7387	Mean pairwise distance; higher = more diverse
Redundancy rate	0.000	Fraction of near-duplicate suggestions (cos > 0.9)
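
The three embedding-based metrics in the table above can be sketched as follows. This is a minimal illustration, not the report's actual implementation: it assumes cosine distance for ILAD, a quality-weighted cosine kernel (L_ij = q_i * q_j * cos_ij) for the DPP log-determinant, and that the redundancy rate counts suggestions with at least one neighbour above the 0.9 cosine threshold. All function names and the toy vectors are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ilad(vectors):
    """Intra-list average distance: mean of (1 - cosine) over all pairs."""
    pairs = list(combinations(vectors, 2))
    return sum(1.0 - cosine(u, v) for u, v in pairs) / len(pairs)


def redundancy_rate(vectors, threshold=0.9):
    """Fraction of items with at least one other item above the threshold.

    Assumption: the report may instead count pairs rather than items.
    """
    flagged = set()
    for (i, u), (j, v) in combinations(enumerate(vectors), 2):
        if cosine(u, v) > threshold:
            flagged.update((i, j))
    return len(flagged) / len(vectors)


def dpp_log_det(vectors, qualities):
    """log det of the assumed kernel L_ij = q_i * q_j * cos(v_i, v_j).

    Uses a plain Cholesky factorisation, so the kernel must be
    positive definite (no exact duplicates, no zero qualities).
    """
    n = len(vectors)
    kernel = [
        [qualities[i] * qualities[j] * cosine(vectors[i], vectors[j])
         for j in range(n)]
        for i in range(n)
    ]
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(kernel[i][i] - partial)
            else:
                chol[i][j] = (kernel[i][j] - partial) / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))


# Hypothetical toy embeddings: the first and third are near-duplicates.
toy_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.1]]
toy_qualities = [1.0, 1.0, 1.0]

toy_ilad = ilad(toy_vectors)
toy_redundancy = redundancy_rate(toy_vectors)
toy_log_det = dpp_log_det(toy_vectors, toy_qualities)
```

On the toy data the near-duplicate pair pulls the log-determinant well below zero and gives a non-zero redundancy rate. This matches the table's reading that higher log-det and lower redundancy indicate a more diverse set.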

Component	Weight	Value
DPP set score (normalised)	0.5	—
Cluster coverage	0.3	1.000
Mean quality (non-hallucinated)	0.2	—
Hallucination penalty	alpha=0.5	x 0.9600

Composite score: 0.6030
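
A minimal sketch of how the components above might combine, assuming the composite is the weighted sum of the three components multiplied by the hallucination penalty. The report does not state the formula or the normalised DPP and mean-quality values, so the function name and the example inputs below are hypothetical.

```python
def composite_score(dpp_norm, coverage, mean_quality, penalty,
                    weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the three components, scaled by the penalty.

    Assumption: the penalty is applied multiplicatively, as the
    "x 0.9600" entry in the component table suggests.
    """
    w_dpp, w_cov, w_quality = weights
    weighted_sum = (w_dpp * dpp_norm
                    + w_cov * coverage
                    + w_quality * mean_quality)
    return weighted_sum * penalty


# Hypothetical inputs; only coverage (1.000) and the penalty (0.9600)
# are taken from the report.
example = composite_score(dpp_norm=0.5, coverage=1.0,
                          mean_quality=0.5, penalty=0.96)
```

Under this reading, the reported composite of 0.6030 would correspond to a pre-penalty weighted sum of about 0.628 (0.6030 / 0.96).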

Filter: anti_gt (active). Flagged 4 / 50 suggestions (rate 8.0%). Composite hallucination penalty: 0.9600.
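
The figures above are consistent with a penalty of the form 1 - alpha * flagged_rate, since 1 - 0.5 * (4 / 50) = 0.96. The report does not state this formula, so the sketch below is an inference from the numbers, and the function name is hypothetical.

```python
def hallucination_penalty(flagged, total, alpha=0.5):
    """Multiplicative penalty, assumed to be 1 - alpha * (flagged / total)."""
    if total <= 0:
        raise ValueError("total must be positive")
    flagged_rate = flagged / total
    return 1.0 - alpha * flagged_rate


# Values from the report: 4 of 50 suggestions flagged, alpha = 0.5.
penalty = hallucination_penalty(flagged=4, total=50, alpha=0.5)
```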

ID	Failure mode	Title	Reasoning
14	over_elaboration	Master CS3243: The 1982 CMU Vending Machine Cas...	Fabricates the 'Finger' protocol and ARPANET connection details not supported...
21	passive_viewing_as_active_interest	Target High-Impact Bitcoin Repositories for 2026	The suggestion prescribes specific libraries and projects (LDK, Stratum V2, F...
23	over_elaboration	Optimize Cursor/Sonnet 4.6 for Bitcoin Protocol...	The suggestion over-elaborates by prescribing a detailed Sonnet prompt with s...
40	empty_fallback	Schedule Deep-Work for GitHub Issue Contributions	Suggestion is a generic productivity prompt lacking actionable value, matchin...

#	ID	Quality	Title
1	16	1.000	Automate WhatsApp Export Parsing with Python
2	22	1.000	Automate WhatsApp Ingestion for GUMBO via Python Script
3	36	1.000	Map Python Skills to Nostr NIP-01 Implementation
4	41	1.000	Implement RPKI-to-ASmap Validation Logic
5	2	0.980	Configure Cursor 'Rules for AI' for GUMBO Context

#	ID	Quality	Title
1	7	1.000	Automate DB Isolation with venv-aware Shell Script
2	28	1.000	Implement Local Polling for WhatsApp Export Parsing
3	39	1.000	Optimize RPKI Validation with Routinator Filters
4	42	0.980	Automate BGP Data Fetching via RIS-Live
5	31	0.970	Implement Erlay (PR #21515) to Mitigate Eclipse Attacks