Free Republic
Browse · Search
General/Chat
Topics · Post Article

To: numberonepal


...the most efficient and reliable Python and JavaScript code...

Grok agrees with you, numberonepal! GPT-5 excels!

Final note of the long Grok-4 report: Grok-4 closes 27/34 tickets with tests vs. GPT-5's 24/34, but GPT-5 passes 98% of Pytest suites post-refinement.
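For anyone who wants those ticket counts as percentages, here is a quick sketch of the arithmetic. The helper name close_rate is just made up for illustration; the only inputs are the 27/34 and 24/34 figures quoted above.

```python
def close_rate(closed: int, total: int) -> float:
    """Return the share of tickets closed, as a percentage rounded to one decimal."""
    return round(100 * closed / total, 1)

# Figures quoted from the Grok-4 report above.
grok4_rate = close_rate(27, 34)
gpt5_rate = close_rate(24, 34)

# Gap between the two close rates, in percentage points.
gap_points = round(grok4_rate - gpt5_rate, 1)

print(f"Grok-4: {grok4_rate}%  GPT-5: {gpt5_rate}%  gap: {gap_points} points")
```

That works out to roughly 79.4% vs. 70.6%. The 98% Pytest figure measures something different (suites passing after refinement), so it is not directly comparable to the ticket close rates.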

=================
Prompt used:
Rule 1. Establish Python code quality AI comparisons (latest versions of GPT and Grok 4).
Rule 2. Establish JavaScript code quality AI comparisons (latest versions of GPT and Grok 4).

Then, using Grok 4 DeepSearch, Think, Think Harder, and expert simulations simultaneously,
find recent, definitive, published comparisons between the latest versions of GPT and Grok 4.

Include these and OTHER benchmarks for coding:
1) Standardized Coding Benchmarks
2) HumanEval for Python (OpenAI)
3) Mostly Basic Python Problems (MBPP)
4) Software-Engineering Benchmark (SWE-Bench)
5) BigCodeBench
6) EvalPerf (Differential Performance Evaluation)
7) Artificial Analysis
8) Chatbot Arena LLM Leaderboard
9) LiveBench
10) CanAiCode Leaderboard

Follow with scoring on any capabilities with quality-proofing/testing tools such as:
11) Pytest: A simple, scalable framework with a rich plugin ecosystem.
12) Unittest: Python's built-in unit testing framework.
13) Others, if available.
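In case the difference between those two test tools is unclear, here is a minimal sketch of the same check written both ways. The clamp function and the test names are hypothetical, made up only for this illustration; they are not from the Grok report.

```python
import unittest


def clamp(value: int, low: int, high: int) -> int:
    """Hypothetical function under test: limit value to the range [low, high]."""
    return max(low, min(value, high))


# Pytest style: a plain function whose name starts with "test_" and uses bare asserts.
# Pytest discovers it automatically when you run `pytest` on the file.
def test_clamp_pytest_style():
    assert clamp(5, 0, 10) == 5
    assert clamp(-3, 0, 10) == 0
    assert clamp(42, 0, 10) == 10


# Unittest style: a TestCase subclass using the built-in assert* methods.
class ClampTest(unittest.TestCase):
    def test_clamp_unittest_style(self):
        self.assertEqual(clamp(5, 0, 10), 5)
        self.assertEqual(clamp(-3, 0, 10), 0)
        self.assertEqual(clamp(42, 0, 10), 10)
```

Pytest can also collect and run the unittest-style class, so a project can mix both; running `python -m unittest` picks up only the TestCase version.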

Working in the real world!

5,969 posted on 09/25/2025 7:44:30 PM PDT by foldspace


To: foldspace; numberonepal

Thanks to both of you.


6,035 posted on 09/26/2025 5:15:36 AM PDT by grey_whiskers (The opinions are solely those of the author and are subject to change without notice.)



FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson