A Strange But Elegant Approach to a Surprisingly Hard Problem (GJK Algorithm)

329,032
0
Published 2021-03-24
In 1988, three engineers came together and developed one of the most clever solutions to the problem of detecting when two complex objects collide. Their solution, the Gilbert Johnson Keerthi (GJK) algorithm, named after the authors, made an incredible impact in the fields of robotics, control, and computer graphics. This video is about understanding this ingenious algorithm from first principles.

The video covers a broad range of topics from Minkowski sums and differences to support functions to the full implementation of the 2D GJK algorithm. But what I hope you get out of this is an appreciation of the incredible shifts in perspective that lead to the final algorithm. Coming up with the algorithm is an amazing feat and useful for specific applications, but the overarching problem solving techniques that come through in the journey to the solution is truly invaluable.

0:00 Introducing the Problem
2:02 Convexity
3:15 Infinite Point Perspective
4:07 Minkowski Sums and Differences
6:37 Triangles inside Minkowski Differences
7:41 Simplexes
8:57 Support Functions
13:05 Core GJK Algorithm: Broad Perspective
17:15 Remaining Key Questions
17:56 How to determine if a point passed the origin?
19:10 The line case
20:41 The triangle case
26:25 GJK Implementation
29:38 Recap and quick note about original GJK paper

This video is supported by a community of Patreons
Special Thanks to the following Patreons:
Burt Humburg
Justin Hiester
Michael Nawenstein
Richard Wells
Sebastian Gamboa
Zac Landis

There's a lot more to the GJK algorithm to learn for those interested. Here are some resources I recommend:
   • Implementing GJK - 2006   - a full walkthrough of how to implement GJK in 3D by Casey Muratori, the game developer/engineer that came up with some of the clever optimizations that I present in this video.
www.dyn4j.org/2010/04/gjk-gilbert-johnson-keerthi/ - a really nice writeup on GJK
   • GJK Algorithm Explanation & Implement...   - A quick intro to GJK and an implementation in C++
www.toptal.com/game/video-game-physics-part-ii-col… - solid resource for collision detection in general and goes deeper into the theory.
box2d.org/files/ErinCatto_GJK_GDC2010.pdf - an alternative perspective to GJK using barycentric coordinates -- I really wanted to cover this in this video, but it would have been an hour long instead of half an hour long so I'll compromise and give you this resource :)

Support: www.patreon.com/reducible
Twitter: twitter.com/Reducible20

This video wouldn't be possible without the open source library manim created by 3blue1brown: github.com/3b1b/manim

Here is link to the repository that contains the code used to generate the animations in this video: github.com/nipunramk/Reducible

Music attributions:
Midsummer Sky by Kevin MacLeod is licensed under a Creative Commons Attribution 4.0 license. creativecommons.org/licenses/by/4.0/
Source: incompetech.com/music/royalty-free/index.html?isrc…
Artist: incompetech.com/
Luminous Rain by Kevin MacLeod is licensed under a Creative Commons Attribution 4.0 license. creativecommons.org/licenses/by/4.0/
Source: incompetech.com/music/royalty-free/index.html?isrc…
Artist: incompetech.com/
Prelude No. 23 by Chris Zabriskie is licensed under a Creative Commons Attribution 4.0 license. creativecommons.org/licenses/by/4.0/
Source: chriszabriskie.com/preludes/
Artist: chriszabriskie.com/
All other tracks by Aakash Gandhi

All Comments (21)
  • @kattenelvis1778
    HAHAHAHAHAA I love the bit when you say that if I find a counter-example 3blue1brown would be after me
  • @NovaWarrior77
    I live for the day that these are on trending. I really do.
  • @MeshMachine
    I'm a game dev and implemented this algorithm in 3D for an in house physics engine, long since defunct, as commercial physics engines are used almost universally these days. This is a great illustration of a well designed algorithm.
  • @fredoo6627
    Manim should be used by every math teacher. It's so beautiful.
  • 9:30 makes me think of a great line from numberphile. "You've broken maths, Brady, stop that."
  • @mikemullen5563
    This was an interesting problem for TV games, since most actions involved collisions between screen objects. Early 8 bit computers couldn't have handled this algorithm, so a simpler system was needed. The solution was invented by Ralph Baer (see Wikipedia). Objects were 'painted' on the screen by shifting out bits from a shift register when the line number and pixel numbers matched. As well as going to the screen, the signals were sent to an AND gate, which went high if the same pixel was being loaded by two different objects. Magnavox bought the patent rights:. I helped develop the Odyssey II which used these patents. Other manufacturers bought rights to the circuit from us.
  • @zemaumm
    My eyes just opened up when you reinterpreted the problem as a minkowski difference problem… I’m a software engineer ( who fiddle with graphics a little ) and it’s genuinely beautiful how the maths I briefly learned at college is so diverse yet relevant to what I do.
  • @gormster
    9:01 I had to go back over this section a few times because it misses some key points I had to figure out on my own. Firstly - you can slide a shape around on the plane, and the support function always returns the same vertex for a given direction vector. While the actual values of the dot products will change a lot, which one is largest always remains the same. The confusing part of this is when the vector from the origin is actually pointing away from all of the vertices, but while all the dot products might be negative, one of them is still going to be greater than the rest. The other thing that confused me is the way that the highlighted point on the hexagon seems to jump instantly from one vertex to the next, which would appear to contradict the line “for every point on the shape, there is a direction where it is the furthest point.” What about the points along the edges of the hexagon? Well, when the direction is perpendicular to that edge, every point on that edge is the furthest point. Think about it like a single straight wave on a calm ocean, travelling in that direction. It starts way outside the shape, and then gradually passes over it. The support point is the very last point it touches. If it’s travelling from left to right in this hexagon example, the last point is the vertex on the far right. But if it’s travelling from top to bottom, the last thing it touches is the bottom edge, and it touches the whole edge at the same time. So the support point for that direction is every point on the bottom edge.
  • @eshh183
    This is definitely the most complex topic you have picked till date! And to support that, you came up with your BEST work till date! Truly speechless! After your last video on collision detection, I was inspired to read more into computer animation. And the shear number of times that the name GJK came up was astounding! I tried reading about it from a number of sources, but couldn't go on after the introduction of support functions! Little did I know that you had planned such a great surprise for us! Much like the immense ingenuity of this algorithm, the difficulty that one faces in communicating such beautiful, yet daunting math and science, is often overlooked! Thank you so much for making these videos!!
  • @kenthedawg6383
    As someone with a graduate background in computational geometry, this is a fantastic video and you really know your stuff! Please make more of these!
  • As a normal secondary school student who's into competitive programming, we usually tackle the triangle case with a much more natural idea: we calculate the 2D cross product AB * BO; BC * CO; CA * AO. If one is equal to zero, the point is collinear with the corresponding edge, and we just check if the point is on the side. Else, point O lies inside triangle ABC if and only if all of these values are all positive or negative. An intuitive way to understand it is if the triangle ABC contains O, then we can walk around the triangle, and the direction we see from that point we stand to the point O is always in our right hand or in our left hand. Hope that helps.
  • @Speykious
    Wow... This is... Beautiful... I've looked at it for 5 hours now
  • @Zimbleton
    Note that in computer games etc, shapes normally have a bounding sphere/circle or box that is used to perform very quick check for intersection and only if the simple bounding shapes intersect then you perform more complex algorithm like this.
  • @Waffle4569
    30:52 Calculating the closest distance between objects is extremely useful in physics engines, where you need to not only detect when a collision occurred, but also de-intersect the objects after the collision.
  • I heard about this algorithm for the first time while watching a talk by Casey Muratori, and my mind was blown! It was a really neat piece of math.
  • @paulomag2106
    This is perhaps the clearest and most beautifully explained GJK explanation I've ever seen. I do believe, though, the direction vector does not require normalization on any step. I've implemented this algorithm for my graduation paper, and I have to say it is one the most brilliant computer algorithms I've seen in my studies.
  • @ToothbrushMan
    Really well explained. You have only a few competitors for this subject, but this is the best I think I've seen. I've been through the process of implementing a 3D version. It's good fun, but I found myself using a recursive method because, quite frankly, it's easier to understand and yes, more aesthetically pleasing. It also deals very elegantly with the "edge" cases - where the origin lies on a vertex, edge or face of the simplexes (point, line, triangle, tetrahedron). There are quite a few "edge" cases, all have to be dealt with. So if you've got a tetrahedron and you find that your origin lies on one of the edges of the tetrahedron, then the next call is back to the function (already in the call stack) that deals with edges, passing in the new edge just discovered. The other nice thing about a recursion is that the optimisation of not having to check Voronoi regions because they've already been checked in previous recursions is performed by carefully cycling the order of the vertices in the function calls (with the most recently discovered vertex first and the oldest discovered vertex last). And a third nice thing is that all the data is held on the stack - no heap allocation/deletion. Hence falling out of the recursion after a termination is super quick. My GJK works (almost) faultlessly, but there are some (painful) lessons that I learnt that readers might be interested in: 1. Numerical precision. Detecting the "edge" cases (origin on a point/edge/face of an edge/triangle/tetrahedron) is not as easy as it seems, because although the origin can mathematically be on a point/edge/triangle - and a dot product should be zero - only it isn't because of numerical precision. With my recursion, I found that on successive calls to trap the origin in a tetrahedron were being thwarted when the origin appeared to "jump" from one side of a tetrahedron face to the other as I chased it - running out of stack in the process. I was using 4 byte floats and the numerical precision HAS to be considered, but if you're using 8 byte doubles, you should do much better. 2. There is the case that the triangles get smaller and smaller and flatter and flatter when iterating to towards an origin on the Minkowski difference between two smooth convex objects (e.g. two ellipsoids). If the tetrahedron get too small or too flat, you could be looking at numerical rounding issues again. 3. If the first two directions happen to be on opposite sides of the origin chosen then you've got an edge with an origin on it. Picking the 3rd direction as being perpendicular to this edge on the side of the origin gets a bit ambiguous. I spent a few hours with graph paper and a pencil with this. I surprised to see no mention of the Expanding Polytope Algorithm (EPA). The GJK and the EPA are a pair of algorithms that are complementary to each other. Whereas the GJK works out IF two solids intersect, the EPA works out HOW they intersect (penetration depth and collision normal) - and the collision normal is needed for the collision physics. But the great thing about the EPA is that the data structure that the GKJ finishes with (a simplex that surrounds the origin) is precisely the same data structure that the EPA starts with.
  • Thank you for putting in all the effort necessary to creating these. I love that they're an accessible, entertaining way to broaden your algorithmic problem solving skills.
  • @prahjex2501
    Beautiful walkthrough. Didn’t expect to imbibe a half hour geometry video first thing in the morning, but now I have a fresh appreciation for creative problem solving to propel me through some projects I’m working on. Thank you!