• brucethemoose@lemmy.world · 3 days ago

    Let’s look at a “worst case” on my PC: say 3 attempts, 1 main step, and 3 controlnet/postprocessing steps, so 64-ish seconds of generation at 300W above idle.

    …That’s about 5 watt-hours. You know, basically the same as using Photoshop for a bit, or gaming on a laptop for 2 minutes.
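
    For anyone who wants to check the arithmetic, here’s the back-of-the-envelope version (the ~150W laptop-gaming draw is my own assumption for the comparison):

    ```python
    # Energy for the "worst case" local generation described above.
    power_w = 300        # draw above idle while generating, in watts
    gen_seconds = 64     # ~64 s total across attempts + controlnet/postprocessing steps

    energy_wh = power_w * gen_seconds / 3600
    print(f"Generation: {energy_wh:.1f} Wh")              # ~5.3 Wh

    # Comparison: 2 minutes of laptop gaming at an assumed ~150 W draw.
    gaming_wh = 150 * 120 / 3600
    print(f"2 min of laptop gaming: {gaming_wh:.1f} Wh")  # ~5.0 Wh
    ```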

    Datacenters are much more efficient because they batch the heck out of jobs. 60 seconds on a 700W H100 or MI300X is serving many, many generations in parallel.
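
    Rough sketch of why batching matters; the batch size and wall-clock time here are illustrative assumptions, not measurements of any particular service:

    ```python
    # Datacenter inference amortizes the accelerator's power across the batch.
    accel_power_w = 700      # H100 / MI300X class board power
    batch_size = 32          # assumed number of images generated in parallel
    wall_seconds = 60        # assumed time to finish the whole batch

    total_wh = accel_power_w * wall_seconds / 3600
    per_image_wh = total_wh / batch_size
    print(f"Whole batch: {total_wh:.1f} Wh, per image: {per_image_wh:.2f} Wh")
    # ~11.7 Wh for the batch, ~0.36 Wh per image
    ```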

    Not trying to be critical or anything, I hate enshittified corpo AI, but that’s more-or-less what generation looks like.

      • brucethemoose@lemmy.world · 2 days ago

        At the risk of getting more technical: some near-future combination of bitnet-like ternary models, less-autoregressive architectures, taking advantage of sparsity, and models not being so stupidly general-purpose will bring inference costs down dramatically. Like, a watt or two on your phone dramatically. AI energy cost is a meme perpetuated by Altman so people will give him money, kinda like an NFT scheme.
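
        To make “bitnet-like ternary models” concrete, here’s a minimal numpy sketch of the general idea: weights live in {-1, 0, +1} plus a per-tensor scale, so the matmul collapses into additions and subtractions. The quantization rule and sizes below are illustrative stand-ins, not the actual BitNet training recipe:

        ```python
        import numpy as np

        def ternary_quantize(w: np.ndarray):
            """Map full-precision weights to {-1, 0, +1} with one scale per tensor.
            Simplified absmean-style rule; real BitNet-style models train through
            the quantizer rather than applying it after the fact."""
            scale = np.mean(np.abs(w)) + 1e-8
            w_ternary = np.clip(np.round(w / scale), -1, 1)
            return w_ternary.astype(np.int8), scale

        def ternary_matmul(x: np.ndarray, w_ternary: np.ndarray, scale: float):
            # Entries are only -1/0/+1, so hardware can replace multiplies with
            # adds and subtracts -- that's where the energy saving comes from.
            return (x @ w_ternary) * scale

        rng = np.random.default_rng(0)
        w = rng.normal(size=(256, 256)).astype(np.float32)
        x = rng.normal(size=(1, 256)).astype(np.float32)

        w_t, s = ternary_quantize(w)
        approx = ternary_matmul(x, w_t, s)
        exact = x @ w
        print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
        ```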

        …In other words, it’s really not that big a deal. Like, a drop in the bucket compared to global metal production or something.

        The cost of training a model in the first place is more complex (and really wasteful at some ‘money is no object’ outfits like OpenAI or X), but it’s also potentially very cheap. As examples, DeepSeek and Flux were trained with comparatively little electricity. So was Cerebras’s example model.