GPT-5.5 vs Claude Opus 4.7 vs DeepSeek V4 vs Kimi K2.6: benchmark guide by task | Deep Research