Agent4Rec · Repository · 24/100 via “evaluation metrics computation and causal analysis for recommendation performance”
Recommender system simulator with 1,000 agents
Unique: Integrates evaluation metrics computation with causal analysis, enabling not just performance measurement but also investigation of how recommendation algorithm choices causally influence agent behavior. The framework aggregates agent-level actions into system-level metrics and supports comparative analysis across multiple recommenders, grounding evaluation in simulated but realistic user interactions.
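The aggregation step described above can be pictured with a minimal sketch. Everything here is an assumption for illustration: the `AgentAction` record, its fields, and the `aggregate` helper are hypothetical names and are not taken from the Agent4Rec codebase.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AgentAction:
    """Hypothetical per-agent log record (not Agent4Rec's actual schema)."""
    agent_id: int
    recommender: str      # name of the algorithm that served this agent
    n_recommended: int    # items shown to the agent
    n_watched: int        # items the agent chose to watch
    satisfaction: float   # self-reported score; the scale is an assumption


def aggregate(actions):
    """Roll agent-level actions up into one metrics row per recommender."""
    grouped = defaultdict(list)
    for action in actions:
        grouped[action.recommender].append(action)

    summary = {}
    for name, rows in grouped.items():
        shown = sum(r.n_recommended for r in rows)
        watched = sum(r.n_watched for r in rows)
        summary[name] = {
            "agents": len({r.agent_id for r in rows}),
            "view_ratio": watched / shown if shown else 0.0,
            "mean_satisfaction": sum(r.satisfaction for r in rows) / len(rows),
        }
    return summary
```

Calling `aggregate` on logs collected under two or more recommenders yields rows that can be compared side by side, which is the kind of comparative analysis the entry describes.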
vs others: Broader than offline ranking metrics (e.g., NDCG) because it evaluates algorithms against simulated interactive user behavior instead of a static interaction log, but less reliable than online A/B testing because the metrics come from simulated rather than real users.
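For reference, the offline baseline mentioned above, NDCG, can be computed as in the sketch below. It assumes graded relevance labels listed in ranked order and takes the ideal ordering from that same list, a common simplification; the function names are illustrative and do not come from the repository.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (rank 0 is the top)."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A score like this depends only on a fixed set of relevance labels, which is why it cannot capture how agents react to a recommender over successive interactions.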