Hey ChatGPT, Summarize Google I/O

Shared on May 17, 2024
This was a week full of AI events! First, Marques gives a few thoughts on the new iPads since he missed last week and then Andrew and David bring him up to speed with all the weirdness that happened during Google I/O and the OpenAI event. Then we finish it all up with trivia. Enjoy!

Chapters
00:00 Intro
01:17 Marques iPad Thoughts
16:49 OpenAI GPT-4o
43:05 Trivia Question
44:05 Coda.io (Sponsored)
45:04 Google I/O Part 1
01:14:06 Trivia Question
01:14:54 Ad break
01:14:59 Google I/O Part 2
01:46:49 Trivia Answers
01:52:44 Outro

Links:
MKBHD iPad Impressions: bit.ly/3WzFFWk
MacStories iPadOS: bit.ly/3V1G0Qq
The Keyword: bit.ly/4blfFm5
OpenAI GPT-4o Announcements: bit.ly/3V3Sabv
9to5Google I/O 2024 Article: bit.ly/3V2rDLv
Merch tweet: bit.ly/4bnhNcV

Shop products mentioned:
Apple iPad Air: geni.us/SsXTRLt
Apple iPad Pro M4: geni.us/HXDlXo

Shop the merch:
shop.mkbhd.com

Socials:
Waveform: twitter.com/WVFRM
Waveform: www.threads.net/@waveformpodcast
Marques: www.threads.net/@mkbhd
Andrew: www.threads.net/@andrew_manganelli
David Imel: www.threads.net/@davidimel
Adam: www.threads.net/@parmesanpapi17
Ellis: twitter.com/EllisRovin

TikTok:
www.tiktok.com/@waveformpodcast

Join the Discord:
discord.gg/mkbhd

Music by 20syl:
bit.ly/2S53xlC

Waveform is part of the Vox Media Podcast Network.

Comments (21)
  • @RyanMorey1
    going to predict the number of times "AI" appears in this podcast: 34
  • @MarsOtter
    david’s “daaaammmnnn” in the intro needs to be on the soundboard
  • @Wade2003
    The natural human response to "How tall is the Empire State Building?" should be, "Uhh... I don't know bro, why don't you google it."
  • @gundmc13
    The idea of the "Where did I leave my glasses" demo was not to suggest that you would actually ask the AI assistant where you left things as a use case - it was a flex to show the assistant could recall a detail from an image a minute earlier that wasn't explicitly discussed and wasn't in its current field of view. It's another example of a huge context window and how it's helpful. A 2 million token context window isn't just for writing a really long prompt. It means everything in those 2 million tokens can be retrieved with perfect recall, whether that's a super long conversation dialogue or a 2-hour video file. Honestly, I think people are sleeping on how huge a deal that can be, and Google isn't doing a good job of telling people why they should care about a context window.
  • Gmail's search function is ABSOLUTELY Google's worst search function. 100% correct, Andrew. Thank you for saying that.
  • As a blind person, having these models have vision is super important and could be really, really helpful. It already is. Just look up Be My Eyes... I can't wait until it can help in real time with visual things. And actually be right about things. LOL
  • I appreciate the longer episodes of the pod. This is what podcasts are for! Getting into the nitty gritty of the products and chopping it up, letting your personalities show.
  • Marques tapping the mic to trigger the lights was low-key the funniest moment of the episode.
  • @jonathanvu769
    David is really writing off the extended context window 😂 This is a huge step toward the potential for a personal assistant that can know everything about you. It's also a big divergence from OpenAI, as Google is moving toward an infinite context window whereas OpenAI seems to be maximizing vector stores and RAG. A specific use case for my industry: the eventual possibility of individualized GPTs trained on patient data, meaning physicians could have a model, queryable via natural language, that gives clinical summaries of a patient's history. I agree a lot of new AI features are overhyped, but I don't think we should write off the underlying advancements in these models - very exciting stuff on the horizon!
  • @MrKevinPitt
    Love this show, listen/watch every week! But sometimes I think they are so immersed in the field of tech that they kinda miss the wonder of some of these innovations. I watched the OpenAI event and was absolutely blown away. Yes, it was silly at times and the use-case demos were a bit contrived, but where we are in contrast to where we were 15 years ago just absolutely amazes me. Wish the fellas took a step back sometimes and just appreciated that for a nanosecond. Love ya! We truly live in an age of wonders ;-)
  • @menithings
    The new iPad Pro is thinner (really, lighter) so its center of gravity can sit lower when docked to the Magic Keyboard. This weight shift means the iPad can be suspended further back on the keyboard (check out the hinge's new 90-degree angle), freeing up more space on the case for a larger trackpad and a function-key row. The Magic Keyboard is an almost ubiquitous accessory for the Pro, so the iPad's lighter weight now resolves the Magic Keyboard's two major flaws, making it a more attractive upsell.
  • I definitely agree with Marques' take on the very simple example OpenAI used to showcase their new model. The moment I saw their demo of GPT-4o reading math problems on a piece of paper, and especially their YouTube video showing it understanding things like geometrical objects on a screen, I immediately thought, "Oh! Maybe it can help me with my work!" And sure enough, I tested it, and it's very, very good (much better than before) as an assistant helping you figure out what kind of statistical analysis you can run on a dataset, guiding you through every step of the process: testing assumptions, suggesting alternatives such as transformations or different types of analyses, checking graphs of residual distributions, and so on, right up to the very end. It can even guide you through each step in a specific piece of software (as long as it's popular enough, for example SPSS). It really is great! I cannot wait for their desktop app to be released for Windows, because it would make the experience even smoother!
  • @sachoslks
    I think you are underselling GPT-4o. The fact that it does all the "understanding" in audio-to-audio form is such a big leap vs. the previous pipeline. You lose so much detail going audio-to-text -> text-to-text -> text-to-audio. I think of the new model as a kind of ChatGPT moment, but for audio instead of text. The thing can whisper, laugh, sing, "breathe", snicker, giggle, talk fast or slow, be sarcastic, and do different voices, all while significantly reducing latency. Not to mention all the multimodal examples they showed in their blog post, like the SOTA text generation in an image, text-to-3D capability, and sound-FX generation (although I'm not sure about that example yet). All of this happens in a single neural network - think how amazing that is. Plus it is much cheaper and faster than regular GPT-4 to begin with. It seems they managed to get GPT-4-level intelligence into a much smaller model that runs cheaper and faster while unlocking new modal capabilities, so I think it is fair to expect that when they scale it up to a much bigger model we could see some big improvements, although that may negatively affect the speed/response time of conversations.
  • Just wanted to say, from a person who is blind, the visual aspects of what OpenAI is doing are pretty exciting for me. I know you guys kind of glossed over the facial-expression thing, but imagine going through life not being able to see people's facial expressions. There are a lot of things I miss without nonverbal communication. For example, most conversations are started by talking with your eyes.
  • @divyz1010
    We need a podcast once Marques is brought up to speed... I would really be curious to hear his thoughts. The intonations/emotions GPT-4o displays and infers are technically a real leap ahead of anything we have right now, and they deserve a better review than "it's generic and of little use".
  • @sameerasw
    Can confirm: at work, Google Chat is one of the most-used tools for me. Especially since we use the whole Google Workspace, the chat's integration with Meet and such is awesome. It's very helpful in projects/teams for discussion, and also sometimes as a less formal alternative to email. But I hate the fact that it's in the Gmail app as well as its own app. I prefer the separate app, so it's isolated from my email browsing.
  • The book club idea is on point. I've had the same desire for years now. I want to talk about the book but with someone who's experiencing the book at the same time and possibly same pace as I am.
  • @BrightPage174
    Ngl the wow moment at I/O for me was mostly the ai overview stuff being able to take the really long and weirdly specific searches that my mom always does and actually give her a proper response. Huge for the people who don't realize search isn't a person with contextual knowledge of your situation 56:35 Exactly this. Being able to ask the computer questions like you would a regular person instead of keyworded queries to me is the real maturation point. People grow up learning how to talk to humans, not search engines