7 comments.

  1. gwern

    I've been skeptical of Dojo from the start. In particular, they don't seem to have any idea how they are going to program it to reach the utilization they need, and they picked an approach which has very high FLOPS on paper but will be extremely hard to program. Historically, similar approaches with the attitude "a Sufficiently Smart compiler/programmer will write all code forever" have not worked out well. (If you are going to take a 'first principles' approach to DL training, you start with the DL/software/algorithm end first, not the hardware end.)

    Years on, the Dojo project doesn't look like it's going smashingly well compared to just buying a ton of H100s...

    1. learn-deeply

      You've summarized very well why every single AI hardware startup has failed.

      1. gwern

        I wouldn't say every, but it is certainly a large graveyard. The software angle is why I'm mildly positive about Cerebras: a single very large, very fast chip with high bandwidth is a pretty good starting point for ease of use. Similarly, the new Etched proposal: by making it a Transformer ASIC, you sidestep all of these issues about expecting the programmer to manually schedule every single operation in parallel or similar craziness.

    2. Alternative_Advance

      I took a look at their R&D spending and it's HALF of Nvidia's, and that obviously includes actual development of cars, robots, and FSD. To get anywhere near Nvidia, I'd guess they need to outspend Nvidia for some years, so effectively 10x what they are spending today, on Dojo alone.

      It does feel like the money would have been better spent building relationships with Google and AMD and learning to use TPUs and the MI-series, to have alternatives...

  2. JelloSquirrel

    Hardware-wise, Dojo is two years out of date, and based on published specs and costs it's not competitive with commodity hardware in performance per watt or per dollar.

    That said, Dojo can scale to pretty insane amounts of high-speed RAM, except it's all via high-latency buses. But I think Dojo's unique gimmick is gonna be that they can have about 10TB of usable memory per pod. Both AMD and Nvidia will max out at a bit over 1TB via their interconnect hardware.

  3. infomer

    Is he leaving to make Xhitter great again?

  4. 3DHydroPrints

    Or possibly because he got abducted by aliens. Nobody knows.
