Rethinking LLM Evaluation with 3C3H: AraGen Benchmark and Leaderboard