New GPT-4o VS GPT-4 - Ultimate Test (Prompts Included)

Published 2024-05-13
GPT-4o is a brand-new AI model from OpenAI that outperforms GPT-4 and other top AI models.

In this video, I'll run a head-to-head test comparing GPT-4o with GPT-4 to see which comes out on top.

1 - Text summarization
Prompt: Provide two summaries of this article. The first summary should be 2-3 sentences long. The second summary should be 5-6 sentences long and include more detail. (insert text here)


2 - Writing Text
Prompt 1: Concise Product Description
Prompt: Imagine you're launching a new software tool that helps businesses track social media analytics.
Write a short, punchy product description (approximately 50 words) suitable for a website or marketing material. Emphasize the key benefit for businesses.


3 - Multimodal Understanding
Prompt: Analyze this image and explain it to me in table format what's going on


4 - Image generation
Prompt: generate an image of two AI robots head to head in battle


5 - Research (completeness and accuracy)
Prompt: How could artificial intelligence (AI) potentially disrupt the accounting industry? Identify specific use cases, potential benefits, and challenges. Provide links to relevant articles or reports.


6 - Code generation
Prompt: Write Python code for a game of snake that I can run on my computer.
Then tell me the step by step guide on how to do it, assuming I know nothing about programming.

Access Top Courses on ChatGPT and 50+ AI Tools on Our E-Learning Platform: bit.ly/skillleap

All Comments (21)
  • @arianaponytail
    Love this comparison, hope to see you do more and deeper tests :)
  • @user-xn7hw7we8g
    This is great! Thank you for this comparison! Love it! You got a new subscriber!
  • @ChatGPt2001
    The comparison between GPT-4 and the new GPT-4o (hypothetical upgraded version) would likely revolve around several key aspects, such as accuracy, coherence, creativity, response time, and user satisfaction. To conduct a fair and comprehensive test, we can use a series of prompts that assess these qualities. Here are some potential prompts and the criteria for evaluation:

    ### Prompts for Comparison
    1. *General Knowledge Question:*
       - *Prompt:* "Explain the process of photosynthesis."
       - *Criteria:* Accuracy, detail, clarity.
    2. *Creative Writing:*
       - *Prompt:* "Write a short story about a time-traveling cat."
       - *Criteria:* Creativity, narrative coherence, engagement.
    3. *Problem Solving:*
       - *Prompt:* "How would you solve the problem of traffic congestion in urban areas?"
       - *Criteria:* Practicality, originality, depth of analysis.
    4. *Technical Explanation:*
       - *Prompt:* "What are the key differences between machine learning and deep learning?"
       - *Criteria:* Technical accuracy, clarity, depth of explanation.
    5. *Emotional Support:*
       - *Prompt:* "I’ve been feeling really stressed lately. What can I do to manage my stress?"
       - *Criteria:* Empathy, helpfulness, appropriateness of advice.
    6. *Humor:*
       - *Prompt:* "Tell me a joke that's suitable for all ages."
       - *Criteria:* Humor, appropriateness, originality.
    7. *Language Translation:*
       - *Prompt:* "Translate the following sentence into Spanish: 'The weather today is beautiful.'"
       - *Criteria:* Accuracy, fluency.
    8. *Cultural Awareness:*
       - *Prompt:* "What are some important cultural customs in Japan?"
       - *Criteria:* Accuracy, sensitivity, comprehensiveness.

    ### Evaluation Method
    1. *Blind Testing:*
       - Ensure the evaluator doesn't know which model's response they are evaluating to prevent bias.
    2. *Scoring System:*
       - Use a consistent scoring system for each criterion, e.g., a scale from 1 to 10.
    3. *Multiple Evaluators:*
       - Involve multiple evaluators to average out subjective biases.
    4. *Feedback Collection:*
       - Collect qualitative feedback on strengths and weaknesses of each response.

    ### Conducting the Test
    1. *Generate Responses:*
       - Use both GPT-4 and GPT-4o to generate responses to each prompt.
    2. *Evaluate Responses:*
       - Have evaluators rate and review each response based on the criteria.
    3. *Compare Scores:*
       - Analyze the scores and feedback to determine which model performs better overall and in specific areas.

    ### Hypothetical Outcomes
    1. *Accuracy and Detail:*
       - GPT-4o might show improvements in providing more detailed and precise answers due to enhanced training data or algorithm optimizations.
    2. *Creativity and Engagement:*
       - Creative prompts might reveal whether GPT-4o has advanced capabilities in generating more engaging and imaginative content.
    3. *Response Time:*
       - If GPT-4o is optimized for performance, it might generate responses faster than GPT-4.
    4. *User Satisfaction:*
       - User feedback can highlight if the new version feels more intuitive and satisfying to interact with.

    By following this structured approach, you can systematically compare GPT-4 and GPT-4o to determine which model offers superior performance across various dimensions.
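The evaluation method described in the comment above (blind labels, a 1 to 10 score per criterion, several evaluators averaged) could be sketched in Python roughly as follows. This is only an illustration and is not something shown in the video: the function names, the A/B labelling scheme, and the sample ratings are all hypothetical placeholders.

```python
import random
from statistics import mean


def assign_blind_labels(models, seed=None):
    """Shuffle model names and hide them behind neutral labels (A, B, ...)."""
    rng = random.Random(seed)
    shuffled = list(models)
    rng.shuffle(shuffled)
    return {chr(ord("A") + i): name for i, name in enumerate(shuffled)}


def average_scores(ratings):
    """Average each label's score per criterion across all evaluators.

    `ratings` maps evaluator -> blind label -> criterion -> score (1 to 10).
    """
    collected = {}
    for per_label in ratings.values():
        for label, scores in per_label.items():
            for criterion, score in scores.items():
                if not 1 <= score <= 10:
                    raise ValueError(f"score out of range: {score}")
                collected.setdefault(label, {}).setdefault(criterion, []).append(score)
    return {
        label: {criterion: mean(values) for criterion, values in criteria.items()}
        for label, criteria in collected.items()
    }


def overall_scores(averages):
    """Collapse the per-criterion averages into one number per label."""
    return {label: mean(list(criteria.values())) for label, criteria in averages.items()}


if __name__ == "__main__":
    # Evaluators only ever see the labels, not the model names.
    labels = assign_blind_labels(["GPT-4", "GPT-4o"])

    # Made-up placeholder ratings, not results from the video.
    ratings = {
        "evaluator_1": {"A": {"accuracy": 8, "clarity": 6}, "B": {"accuracy": 4, "clarity": 6}},
        "evaluator_2": {"A": {"accuracy": 6, "clarity": 8}, "B": {"accuracy": 6, "clarity": 4}},
    }
    for label, score in overall_scores(average_scores(ratings)).items():
        print(label, labels[label], score)
```

The label-to-model mapping is revealed only after scoring, which is the point of the blind step; a plain mean is the simplest way to average out individual evaluators, and a median or trimmed mean would be a reasonable alternative if one evaluator scores erratically.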
  • I get the feeling that GPT5 will be announced soon, because paid subscribers just aren't getting their money's worth with the new free 4o.
  • Hi, I was very impressed with the comparison test you did in this video between GPT-4o and GPT-4, showing how GPT-4o has the edge in performance!
  • Great Video! Below are the Timestamped Summaries from ChatWithPDF:
    00:00 🆕 Introducing GPT-4o vs. GPT-4 comparison with insights on free vs. paid versions.
    01:00 🆓 Free tier offers GPT-4o with data analysis, file uploading, web browsing, and vision capabilities.
    02:00 📊 Benchmark test shows GPT-4o outperforming GPT-4 and other models in various tasks.
    03:00 📝 Text summary prompt comparison between GPT-4o and GPT-4 reveals tone and length differences.
    04:00 🚀 Product description prompt showcases similar performance between GPT-4o and GPT-4.
    05:00 🖼 Multimodal test highlights GPT-4o's advantage in image analysis and table generation.
    06:00 🎮 Image generation task demonstrates GPT-4o's superior output compared to GPT-4.
    07:00 🔍 Research prompt results show GPT-4 excelling in formatting with immediate source references.
    08:00 🐍 Python code writing task reveals GPT-4o creating a more interactive snake game than GPT-4.
    09:00 🔄 Usage limit comparison between GPT-4o and GPT-4 raises questions for paid users.
    10:00 🤔 Conflicting thoughts on upgrading to GPT-4 over GPT-4o due to similar capabilities and potential limitations.
    11:00 ⚖ Comparison between GPT-4o and GPT-4 leaves users pondering the value of paid subscriptions.
    12:00 🔄 Release confusion over the benefits of GPT-4o for all users and implications for paid su
  • @gasakjhon
    Can this be used for custom-instruction GPTs, like GPT-4?
  • @markmuller7962
    Btw, at the end of the stream they said that they'll announce a big new thing. I'd assume that'll be the new paid stuff.
  • It is not a confusing release. Paid users get early access and more prompt entries. You get as many prompts in GPT-4 as before, but now you also get the added benefit of a good-sized prompt count with GPT-4o.
  • How can we find the voice feature? I have the paid version, but I am unable to access the voice and vision features. Does anyone know how to access the voice feature? Or maybe it is not yet available to the public?
  • @oxygon2850
    I would hope that they add some kind of icon or server stress indicator so that people could back off and help.
  • @pengshan
    Thank you for testing GPT-4o, it's very useful. However, this comparison is based only on the web version. I'm looking forward more to the Mac and mobile versions, as well as more tests of the visual and audio features.
  • @SasukeGER
    How? When I click at the top, I can only select 3.5 and 4... there is no 4o for me.
  • @ZM-vs8of
    Has anyone gotten the screen share on the desktop app to work? I can't figure out how to get that going. I just see a lot of marketing stuff about it.
  • @gaius_enceladus
    I've tried 4o and I'm pleased with the results! It seems to give much longer code (I'm using Zig) and it seems quite fast too! It seems to be a bit less "lazy" too - the code is more complete and you don't have to ask it to do full, comprehensive code as often.