...
OpenAI o3 or DeepSeek-R1: Which Is the Better Reasoning Model?
Compare OpenAI o3 and DeepSeek-R1 on different reasoning tasks such as coding, logic and problem-solving to analyze their performance.
In previous lessons, we compared various aspects of DeepSeek models against other competitors, including OpenAI, Gemini, Llama, and Mistral models. In this lesson, we will conduct our own experiments, testing DeepSeek’s R1 and OpenAI’s o3-mini (high)—currently among the best models for coding and reasoning, as shown in our comparisons in the previous lessons.
We will run multiple experiments to evaluate both models in coding, logical reasoning, and STEM-based problem-solving. For each task, we will provide the same prompt to both models and analyze their responses.
Coding
Let’s start with a coding example. We want to create an interactive physics-based animation using JavaScript. The animation will simulate a galaxy of stars moving under the influence of gravity while incorporating dynamic behaviors such as merging, color blending, and supernova explosions.
The prompt is given below:
Prompt:
Generate a JavaScript animation that should simulate a galaxy of stars moving in a gravitational field inside a container with the following features:
Randomly placed stars with different masses and colors (white, blue, yellow, green, and red)
Gravity simulation: Stars attract each other based on a simple Newtonian gravity model
Star merging: If two stars get close enough, they merge into a larger star, blending their colors using additive color mixing
Supernova effect: When a star reaches a certain mass threshold, it explodes into multiple smaller stars
Smooth physics updates with realistic-looking gravitational motion
In terms of response time, o3-mini-high took around 30 seconds to generate a response, whereas DeepSeek-R1 took almost 6 minutes because it kept thinking and rethinking the prompt. Such a slow response might frustrate some users.
Running the code generated by o3-mini-high shows that it does almost exactly what the prompt asked. It creates an interactive simulation of a galaxy in which stars move under gravitational forces, merge when they collide, and explode into supernovae when they become too massive. Stars can also be seen orbiting each other, and their paths bend when two stars pass close to each other without getting close enough to merge.
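For readers unfamiliar with the physics, the "simple Newtonian gravity model" requested in the prompt can be sketched roughly as follows. This is an illustrative sketch only, not the code produced by either model: the constants `G` and `SOFTENING`, the function name `stepGravity`, and the plain-object star representation are all assumptions made for this example.

```javascript
// Hypothetical constants, chosen only for illustration.
const G = 0.5;         // gravitational constant in simulation units
const SOFTENING = 25;  // added to r^2 so the force stays finite at tiny distances

// Advance every star by one time step dt using pairwise Newtonian attraction.
// Each star is a plain object: { x, y, vx, vy, mass }.
function stepGravity(stars, dt) {
  // First pass: accumulate the acceleration on each star from all the others.
  const acc = stars.map(() => ({ ax: 0, ay: 0 }));
  for (let i = 0; i < stars.length; i++) {
    for (let j = 0; j < stars.length; j++) {
      if (i === j) continue;
      const dx = stars[j].x - stars[i].x;
      const dy = stars[j].y - stars[i].y;
      const distSq = dx * dx + dy * dy + SOFTENING;
      const dist = Math.sqrt(distSq);
      // a = G * m_j / r^2, directed from star i toward star j.
      const a = (G * stars[j].mass) / distSq;
      acc[i].ax += (a * dx) / dist;
      acc[i].ay += (a * dy) / dist;
    }
  }
  // Second pass: update velocities first, then positions (semi-implicit Euler).
  stars.forEach((s, i) => {
    s.vx += acc[i].ax * dt;
    s.vy += acc[i].ay * dt;
    s.x += s.vx * dt;
    s.y += s.vy * dt;
  });
  return stars;
}
```

Calling `stepGravity(stars, dt)` once per animation frame would move the stars. The softening term is a common trick for keeping close encounters numerically stable, and it is one plausible reason a simulation of this kind looks smooth.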
The code generated by DeepSeek-R1 takes a different approach. The stars are much larger and move much more slowly, and after merging they all seem to converge to white. These differences come from the different approaches taken by the two programs.
In the code generated by o3-mini-high, when two stars merge, their color is blended using a mass-weighted average rather than a simple addition. This ensures that the resulting color realistically represents the combination of the original colors:
let newColor = {
  r: Math.min(255, (starA.color.r * starA.mass + starB.color.r * starB.mass) / totalMass),
  g: Math.min(255, (starA.color.g * starA.mass + starB.color.g * starB.mass) / totalMass),
  b: Math.min(255, (starA.color.b * starA.mass + starB.color.b * starB.mass) / totalMass)
};
On the other hand, code generated by DeepSeek merges stars using simple addition for RGB color values:
r: Math.min(255, s1.color.r + s2.color.r),
g: Math.min(255, s1.color.g + s2.color.g),
b: Math.min(255, s1.color.b + s2.color.b)
Since the RGB values increase upon merging, and Math.min(255, value) ensures they don’t exceed 255, most stars will eventually reach (255, 255, 255), which is white, after multiple merges.
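The difference between the two blending strategies can be reproduced in isolation. The sketch below re-implements both formulas from the snippets above as standalone functions. The names `blendAdditive` and `blendWeighted`, the three pure colors, and the unit masses are assumptions made for the example and do not appear in either model's code.

```javascript
// Clamped additive blending, following the DeepSeek-R1 snippet above.
function blendAdditive(c1, c2) {
  return {
    r: Math.min(255, c1.r + c2.r),
    g: Math.min(255, c1.g + c2.g),
    b: Math.min(255, c1.b + c2.b),
  };
}

// Mass-weighted average, following the o3-mini-high snippet above.
function blendWeighted(c1, m1, c2, m2) {
  const total = m1 + m2;
  return {
    r: Math.min(255, (c1.r * m1 + c2.r * m2) / total),
    g: Math.min(255, (c1.g * m1 + c2.g * m2) / total),
    b: Math.min(255, (c1.b * m1 + c2.b * m2) / total),
  };
}

// Hypothetical palette: three pure colors, each star with mass 1.
const red = { r: 255, g: 0, b: 0 };
const green = { r: 0, g: 255, b: 0 };
const blue = { r: 0, g: 0, b: 255 };

// Merge red with green, then merge the result with blue.
const additive = blendAdditive(blendAdditive(red, green), blue);
const weighted = blendWeighted(blendWeighted(red, 1, green, 1), 2, blue, 1);

console.log(additive); // { r: 255, g: 255, b: 255 }
console.log(weighted); // { r: 85, g: 85, b: 85 }
```

After only two merges, the additive version has already saturated to pure white, while the weighted version stays inside the range of its inputs. Every further additive merge can only keep a channel at 255 or push it there.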
A key requirement of the prompt was the supernova explosion. In the outputs, the supernova in the code generated by o3-mini-high looks smoother, whereas in the code generated by DeepSeek-R1 the stars appear stuck together in an odd cluster that keeps changing colors and then explodes seemingly at random. Let’s look at the code for both.
The following code for a supernova is generated by the o3-mini-high model:
...
let fragments = 8; // Number of fragments to create
for (let j = 0; j < fragments; j++) {
  let angle = Math.random() * 2 * Math.PI;
  let speed = randomInRange(1, 3);
  // Each fragment gets a fraction of the parent's mass
  let fragmentMass = star.mass / (fragments * 1.5);
  newStars.push(new Star(
    star.x,
    star.y,
    star.vx + Math.cos(angle) * speed,
    star.vy + Math.sin(angle) * speed,
    fragmentMass,
    star.color
  ));
...
When a star exceeds the supernova threshold, it breaks into 8 fragments, which are smaller versions of itself. Each fragment inherits the color of the original star, and each fragment gets a random outward velocity for a more natural explosion effect.
On the other hand, in the code generated by DeepSeek-R1, when the mass threshold is exceeded, the star splits into 4 fragments, and each fragment is assigned a random new color instead of keeping the parent's color. The fragments are placed close together (a random offset of up to ±10 pixels on each axis, since (Math.random()-0.5)*20 ranges from -10 to 10), and their velocity is random rather than strongly directed outward. As a result, the fragments don’t spread out enough, which produces odd-looking clusters instead of a dramatic explosion.
...
return Array.from({length: 4}, () => new Star(
  star.x + (Math.random()-0.5)*20, // Small random offset
  star.y + (Math.random()-0.5)*20,
  star.mass/4, // Even split of mass
  colors[Math.floor(Math.random()*4)], // Random color (unnatural effect)
  (Math.random()-0.5)*20, // Small random velocity
  (Math.random()-0.5)*20
));
...
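To make the contrast concrete, here is a minimal sketch of a supernova routine that combines the ideas discussed above: fragments start at the parent's position, keep its color, and receive a velocity that points radially outward. It is not taken from either model's output. The function name `explode`, its `count` and `speed` parameters, and the plain-object star representation are assumptions made for the example.

```javascript
// Split a star into `count` fragments that fly radially outward.
// `count` and `speed` are hypothetical parameters for this example.
function explode(star, count, speed) {
  const fragments = [];
  for (let k = 0; k < count; k++) {
    // Evenly spaced directions, with a little jitter so the burst
    // does not look perfectly regular.
    const angle = (2 * Math.PI * k) / count + (Math.random() - 0.5) * 0.2;
    fragments.push({
      x: star.x,
      y: star.y,
      // The parent's velocity is preserved; the outward kick is added on top.
      vx: star.vx + Math.cos(angle) * speed,
      vy: star.vy + Math.sin(angle) * speed,
      mass: star.mass / count, // even split, so total mass is conserved
      color: star.color,       // fragments inherit the parent's color
    });
  }
  return fragments;
}

const parent = { x: 200, y: 150, vx: 1, vy: 0, mass: 80, color: { r: 255, g: 255, b: 0 } };
const parts = explode(parent, 8, 2);
console.log(parts.length); // 8
```

Unlike the o3-mini-high snippet, which divides the mass by `fragments * 1.5` and therefore discards a third of it, this sketch splits the mass evenly. Either choice is defensible in a visual simulation, and shedding mass has the side effect of keeping fragments further below the supernova threshold.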
Based on these outputs, the code generated by o3-mini-high was better, as it met the requirements specified in the prompt. The code generated by DeepSeek-R1 attempted to meet the requirements, but some of them were not implemented properly, as discussed above.
Logical reasoning
In this experiment, both models were given a simple prompt: a logical reasoning puzzle with elements of relational reasoning. It primarily tests deductive logic and an understanding of how ordering works.
Prompt: In a distant kingdom, three wise scholars—Xander, Yvette, and Zachary—are known for their intellect and wisdom. The royal records state that if Xander is older than Yvette, and Yvette is older than Zachary, then Zachary is the youngest. However, a new decree adds that if Yvette and Zachary were born in the same year, but Zachary’s birth occurred later in the year than Yvette’s, he would still be considered younger. Given that the royal scribe recorded Yvette’s birth before Zachary’s and that Xander is older than both, who holds the title of the youngest scholar?
The o3-mini-high model responded within 5 seconds. Its thought process, shown below, was short, structured, direct, and methodical: it moved step by step, applied each rule once, and reached a clear conclusion without revisiting points.
Okay, I'm looking at the puzzle about the kingdom's three scholars: Xander, Yvette, and Zachary.
From what I gather, it says if Xander is older than Yvette, and Yvette is older than Zachary, then Zachary is the youngest. But there's also this extra rule: If Yvette and Zachary were born in the same year, but Zachary was born later, Zachary would still count as younger.
Since Yvette's birth is recorded before Zachary’s, and Xander's older than both, I think Zachary holds the title of the youngest.
Alright, the puzzle involves a decree stating that if Yvette and Zachary were born in the same year, but Zachary was born later in the year, he'd still be considered younger. Given that the records show Yvette's birth came before Zachary's, it seems clear: Yvette is older than Zachary. And Xander is older than both of them. So, in the chain: Xander > Yvette > Zachary. Therefore, Zachary is the youngest scholar according to the royal records.
The output generated by o3-mini-high is given below:
Output by o3-mini-high:
Given the records and decrees:
Xander is older than both Yvette and Zachary.
The records state that if Xander is older than Yvette, and Yvette is older than Zachary, then Zachary is the youngest.
Even if Yvette and Zachary were born in the same year, the new decree says that the one born later in the year is younger. Since the scribe recorded Yvette’s birth before Zachary’s, Zachary would be considered younger in that case as well.
Therefore, Zachary holds the title of the youngest scholar.
The DeepSeek model responded after 57 seconds, with the following thought process:
Okay, let's tackle this problem step by step. So, there are three scholars: Xander, Yvette, and Zachary. The royal records have some rules about determining who's the youngest. Let me parse through the information given.
First, the original rule says that if Xander is older than Yvette, and Yvette is older than Zachary, then Zachary is the youngest. That seems straightforward—age hierarchy based purely on birth years. But then there's a new decree added. This decree states that if Yvette and Zachary were born in the same year, but Zachary was born later in the year than Yvette, he would still be considered younger. Hmm, so even if they're the same age in terms of years, the exact birth date matters. If Zachary's birthday is later, he's considered younger, even in the same birth year.
Now, the royal scribe recorded Yvette’s birth before Zachary’s. So, Yvette was born earlier in the year than Zachary. Wait, but the new decree says that if they were born in the same year and Zachary's birth is later, he's considered younger. But here, it's stated that Yvette's birth was recorded before Zachary's. Does that mean they were born in the same year? Because if they were born in different years, the order of birth within the year wouldn't matter as much, right? For example, if Yvette was born in 1990 and Zachary in 1991, then Yvette is older regardless of the month. But if they were both born in 1990, and Yvette was born in January and Zachary in February, then according to the new decree, even though they're the same age in years, Zachary is younger because his birth is later in the year.
But the problem states that the scribe recorded Yvette’s birth before Zachary’s. So this could imply that they are in the same year, with Yvette having an earlier birth date. However, the problem also mentions that Xander is older than both. So Xander's birth year is earlier than both Yvette and Zachary. Now, the key here is whether Yvette and Zachary are in the same birth year or not.
If they are in the same birth year, then according to the new decree, since Yvette's birth was recorded before Zachary's (i.e., born earlier in the year), Zachary would still be considered younger. But wait, the new decree says that if they were born in the same year and Zachary's birth is later, he's considered younger. But here, Yvette's birth is recorded first, so she was born earlier in the year. Therefore, in that case, Zachary is younger. But if they were in different years, then Yvette would be older because her birth year is earlier. Wait, but the problem states that Xander is older than both. So Xander's birth year is earlier than both Yvette and Zachary. Now, we need to know if Yvette and Zachary are in the same year or not.
The problem says "the royal scribe recorded Yvette’s birth before Zachary’s." If the scribe records births in chronological order, then this could mean that Yvette was born before Zachary in the same year, or that Yvette was born in an earlier year. But the new decree specifically talks about the case where they are born in the same year. So maybe the fact that the scribe recorded Yvette's birth before Zachary's indicates that they are in the same year, with Yvette born earlier. Because if they were in different years, the scribe would just record the years, not the order within the year. Hmm.
So putting this together: Xander is older than both Yvette and Zachary. Now, Yvette and Zachary's birth order is recorded as Yvette first, then Zachary. If they are in the same year, then Zachary is younger because his birth was later in the year. If they are in different years, then Yvette's birth year is earlier than Zachary's, making Yvette older. But the problem doesn't explicitly state their birth years, only the order in which their births were recorded. The key point is the new decree's condition: if they were born in the same year and Zachary's birth is later, he's younger. But the scribe recorded Yvette's birth first, which in the context of the same year would mean she's older, making Zachary younger. Therefore, under the new decree, even if they are in the same year, Zachary is younger.
But wait, the original rule was if Xander > Yvette > Zachary in age, then Zachary is the youngest. But with the new decree, even if Yvette and Zachary are the same age (same year), the exact birth date matters. So the problem states that the scribe recorded Yvette’s birth before Zachary’s. This could mean that they are in the same year, so Zachary is younger. Since Xander is older than both, regardless of Yvette and Zachary's relation, Xander is the oldest, then Yvette, then Zachary. But wait, if Yvette and Zachary are in the same year, and Zachary is younger because his birth is later, then the order is Xander (oldest), Yvette, Zachary (youngest). If they are in different years, then Yvette's birth being recorded before Zachary's would mean she was born in an earlier year, so Yvette is older than Zachary. But Xander is older than both, so again the order is Xander, Yvette, Zachary. Therefore, in both scenarios, Zachary is the youngest. The new decree just clarifies that even if they share the same birth year, the exact date matters. Since the scribe recorded Yvette first, implying same year with earlier birth, making Zachary the youngest. So regardless of whether they are in the same year or different years, Zachary ends up as the youngest. Therefore, the answer should be Zachary.
Finally, the DeepSeek model came up with the following response:
Output by DeepSeek-R1:
Answer: Zachary holds the title of the youngest scholar.
Explanation:
Xander is older than both Yvette and Zachary (given).
Yvette's birth was recorded before Zachary's by the royal scribe.
If Yvette and Zachary were born in the same year, the new decree clarifies that Zachary, born later in the year, is considered younger.
If they were born in different years, Yvette’s earlier recorded birth implies she is older.
In both scenarios, the hierarchy is Xander > Yvette > Zachary.
Thus, Zachary is definitively the youngest.
Both models reached the correct answer, although DeepSeek-R1 took considerably longer. However, DeepSeek-R1's reasoning process is convoluted, with excessive repetition and unnecessary backtracking, which makes it inefficient and difficult to follow. While the model correctly concludes that Zachary is the youngest, the argument could have been structured more clearly by addressing the key conditions directly instead of repeatedly re-evaluating the same points. The core logic, which distinguishes the same-birth-year scenario from the different-birth-years scenario, could be condensed significantly without circling back. A more concise and structured approach would make the explanation more effective and easier to read.
In this particular task, which was quite easy, o3-mini-high performed better, reaching the answer in very little time. For a more complex problem, however, DeepSeek-R1's approach might be preferable, as it exposes its complete thinking process, including every step along the way.
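The two-case argument that both models rely on can also be checked mechanically. The sketch below encodes each scholar's birth as a year plus a day of the year and tests both scenarios. The specific dates, the record format, and the helper names `isOlder` and `youngest` are assumptions made for illustration, since the puzzle gives no actual dates.

```javascript
// A birth record is { year, dayOfYear } (hypothetical representation).
// Returns true if person a was born strictly before person b.
function isOlder(a, b) {
  if (a.year !== b.year) return a.year < b.year;
  // Same year: per the decree, the one born earlier in the year is older.
  return a.dayOfYear < b.dayOfYear;
}

// The youngest is the person whom every other person is older than.
function youngest(people) {
  const names = Object.keys(people);
  return names.find((name) =>
    names.every((other) => other === name || isOlder(people[other], people[name]))
  );
}

// Scenario 1: Yvette and Zachary share a birth year; Zachary is born later in it.
const sameYear = {
  Xander: { year: 1000, dayOfYear: 50 },
  Yvette: { year: 1002, dayOfYear: 30 },
  Zachary: { year: 1002, dayOfYear: 200 },
};

// Scenario 2: Yvette is born in an earlier year than Zachary.
const differentYears = {
  Xander: { year: 1000, dayOfYear: 50 },
  Yvette: { year: 1002, dayOfYear: 300 },
  Zachary: { year: 1003, dayOfYear: 10 },
};

console.log(youngest(sameYear));       // prints: Zachary
console.log(youngest(differentYears)); // prints: Zachary
```

Both scenarios give the same result, which is why the question of whether Yvette and Zachary share a birth year, the point DeepSeek-R1 keeps returning to, does not affect the answer.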
STEM problem solving
For this experiment, we will give both models the same STEM problem to test their skills in science, technology, engineering, and mathematics.
The following integration problem was given to both the models:
Prompt: Find the integral of
Once again, o3-mini-high reasoned for only about 4 seconds before starting to provide the solution. Its thought process was brief and went straight to the point:
To integrate
, we can use integration by parts twice.
The model then provided the following solution:
Solution:
Step 1: First Integration by Parts
Let
and Then:
...