The book does not exist
  • "Initials" by "Florian Körner", licensed under "CC0 1.0". / Remix of the original. - Created with dicebear.comInitialsFlorian Körnerhttps://github.com/dicebear/dicebearHO
    howrar
    Now 100%

    It sounds like you don't like how LLMs are currently used, not their power consumption.

    I agree that they're a dead end. But I also don't think they need much improvement over what we currently have. We just need to stop jamming them where they don't belong and leave them be where they shine.

    1
  • The book does not exist
    howrar
    Now 100%

    Yeah, they operate very opaquely, so we can't know the true cost, but based on what I can know with certainty given models I can run on my own machines, the numbers seem reasonable. In any case, that's not really relevant to this discussion. Treat it as a hypothetical, then work out the math later to figure out where we want to be and what threshold we should be setting.

    1
    Quotes Now
    “Freedom is the right to tell people what they do not want to hear.” ― George Orwell
    The book does not exist
    howrar
    Now 100%

    Indeed. Though what we should be thinking about is not just the cost in absolute terms, but the cost in relation to the benefit. GPT-4 is one of the more expensive models to run right now, and you can accomplish very good results with their smaller GPT-4o mini at 0.5% of the energy cost^[1]^. That's the cost of running 0.07 LED bulbs over an hour, or running 1 LED bulb over 0.07 hours (i.e. about 4min). If that saves you 5min of time writing an email while the room is lit with a single LED bulb and your computer is drawing energy, that might just be worth it, right?

    [1] Estimated by using https://huggingface.co/spaces/genai-impact/ecologits-calculator and the pricing difference between GPT-4o, 4o mini, and 3.5 (https://openai.com/api/pricing/). The assumption I'm making is that the total hardware and energy cost scales linearly with the API pricing.
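The arithmetic behind the footnote can be sketched in a few lines. This is only an illustration of the stated assumption (energy scales linearly with API price), not a measurement: the function names, the 10 W bulb rating, and the example prices are all hypothetical placeholders.

```python
def scale_energy_by_price(reference_energy_wh, reference_price, target_price):
    """Apply the footnote's assumption: energy cost scales linearly with API price."""
    return reference_energy_wh * (target_price / reference_price)


def bulb_hours_to_minutes(bulb_hours):
    """Convert 'N bulbs for one hour' into 'one bulb for this many minutes'."""
    return bulb_hours * 60.0


def energy_to_bulb_minutes(energy_wh, bulb_watts=10.0):
    """Minutes a bulb could run on energy_wh; the 10 W rating is an assumed value."""
    return energy_wh / bulb_watts * 60.0


if __name__ == "__main__":
    # Placeholder prices chosen so the ratio is 0.5%, matching the comment.
    small_model_wh = scale_energy_by_price(100.0, reference_price=10.0, target_price=0.05)
    print(small_model_wh)               # energy of the cheaper model under the assumption
    print(bulb_hours_to_minutes(0.07))  # a little over 4 minutes of one bulb
```

The conversion shows that 0.07 bulb-hours is a little over 4 minutes of a single bulb, whatever wattage the bulb actually has.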

    1
  • The book does not exist
    howrar
    Now 100%

    The energy usage is mainly on the training side with LLMs. Generating afterwards is fairly cheap. Maybe what you want is to have fewer companies trying to train their own models from scratch and encourage collaborating instead?

    1
  • Building on my previous question, is it also allowed to use video game gameplay data to improve an AI? (I decided to split this into a separate question, but I don't know if this question is identical
    howrar
    Now 100%

    We've been doing this in RL research with Minecraft as well (see MineDojo). An excerpt from the GitHub page:

    MineDojo [...] provides open access to an internet-scale knowledge base of 730K YouTube videos, 7K Wiki pages, 340K Reddit posts.

    Again, no one has run into legal issues with this yet either, but this also isn't as ubiquitous compared to Atari, nor has it been around for as long.

    2
  • Why are we training AIs on reddit posts instead of Research Papers? We could be saving the world!
    howrar
    Now 100%

    The very first response I gave said you just have to reframe state.

    And I said "an augmented state space would make it Markovian". Is that not what you meant by reframing the state? If not, then apologies for the misunderstanding. I do my best, but I understand that falls short sometimes.

    1
  • Why are we training AIs on reddit posts instead of Research Papers? We could be saving the world!
    howrar
    Now 50%

    I'm not familiar with the term "beam" in the context of LLMs, so that's not factored into my argument in any way. LLMs generate text based on the history of tokens generated thus far, not just the last token. That is by definition non-Markovian. You can argue that an augmented state space would make it Markovian, but you can say that about any stochastic process. Once you start doing that, both become mathematically equivalent. Thinking about this a bit more, I don't think it really makes sense to talk about a process being Markovian or not without a wider context, so I'll let this one go.
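A toy sketch of the augmented-state point, under loud assumptions: `next_token_probs` is a made-up two-token "model" whose next-token distribution depends on the last two tokens, and it stands in for no real LLM. Conditioned on the last token alone, the process is not Markovian; conditioned on the pair of the last two tokens, it is.

```python
import random


def next_token_probs(history):
    """Hypothetical toy model: the distribution depends on the last TWO tokens."""
    last = history[-1]
    prev = history[-2] if len(history) > 1 else None
    if last == "a" and prev == "b":
        return {"a": 0.9, "b": 0.1}
    if last == "a":
        return {"a": 0.1, "b": 0.9}
    return {"a": 0.5, "b": 0.5}


def sample(probs, rng):
    """Draw one token from a {token: probability} mapping."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def step_augmented(state, rng):
    """One Markov transition on the augmented state (previous token, last token)."""
    token = sample(next_token_probs(list(state)), rng)
    return (state[-1], token)


if __name__ == "__main__":
    # Same last token, different earlier token: the distributions differ,
    # so the last token alone is not a sufficient state.
    print(next_token_probs(["b", "a"]))
    print(next_token_probs(["a", "a"]))
    # The augmented state carries everything the toy model needs.
    print(step_augmented(("b", "a"), random.Random(0)))
```

For a model that conditions on the whole history, the augmented state would have to be the whole history, which is why the same trick applies to any stochastic process.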

    nitpick that makes communication worse

    How many readers do you think know what "Markov" means? How many would know what "stochastic" or "random" means? I'm willing to bet that the former is a strict subset of the latter.

    0
  • Why are we training AIs on reddit posts instead of Research Papers? We could be saving the world!
    howrar
    Now 100%

    It's in reference to your complaint about the imprecision of "stochastic process". I'm not disagreeing that molecular diffusion is a stochastic process. I'm saying that if you want to use "Markov process" to describe a non-Markovian stochastic process, then you no longer have the precision you're looking for and now molecular diffusion also falls under your new definition of Markov process.

    1
  • Why are we training AIs on reddit posts instead of Research Papers? We could be saving the world!
    howrar
    Now 100%

    That's basically like saying that typical smartphones are square because it's close enough to rectangle and rectangle is too vague of a term. The point of more specific terms is to narrow down the set of possibilities. If you use "square" to mean the set of rectangles, then you lose the ability to do that and now both words are equally vague.

    2
  • What is or was your opinion on the VP debate in America?
    howrar
    Now 7%

    Everyone's weird in their own ways. It's just that one of them is trying to convince people that weird is bad while simultaneously trying to court their votes.

    -11
  • Why are we training AIs on reddit posts instead of Research Papers? We could be saving the world!
    howrar
    Now 100%

    Why does everyone keep calling them Markov chains? They're missing all the required properties, including the eponymous Markovian property. Wouldn't it be more correct to call them stochastic processes?

    Edit: Correction, turns out the only difference between a stochastic process and a Markov process is the Markovian property. It's literally defined as "stochastic process but Markovian".

    2
  • Why are we training AIs on reddit posts instead of Research Papers? We could be saving the world!
    howrar
    Now 100%

    I find it amusing that everyone is answering the question with the assumption that the premise of OP's question is correct. You're all hallucinating the same way that an LLM would.

    LLMs are rarely trained on a single source of data exclusively. All the big ones you find will have been trained on a huge dataset including Reddit, research papers, books, letters, government documents, Wikipedia, GitHub, and much more.

    Example datasets:

    25
  • Following up on [another question](https://lemmy.world/post/19184166) about open source funding, how does it usually work when there is funding to pay for the dev's work, then someone new joins in and makes significant contributions? Does the original dev still keep everything? Do you split the funds between the devs? If so, how do you decide how much each person gets? Are there examples of projects where something like this has happened?

    12
    2

    This community has been around for a few months now. How do we feel about it? Are things working out? Any plans for further growing the community? This is one of the topics I’ve been thinking about quite a bit for the past few years (i.e. how to set up a community that values discussions with diverse viewpoints), so I thought I’d share some of my thoughts in relation to what I’m seeing here.

    1. I think such a community necessarily needs to be a full self-contained instance, or else you’ll get very little activity. Think about how these discussions usually start. Someone posts an article/meme/question/etc, a few people show up and comment with similar thoughts about it worded in slightly different ways, then another shows up and goes against the grain, everyone dogpiles on them, and that’s when the real discussion starts. Very rarely do people go out of their way to ask “what do you think of X controversial topic?” And even if you do, that only leads to a very high level discussion that very quickly gets stale. If you get discussion in the context of specific events, then these discussions can be grounded in reality and lead to more unique context-dependent takes each time the topic comes up.

    2. Regarding upvotes/downvotes: as stated in the rules, they should be used to measure whether a post/comment is a positive contribution to the discussion rather than the number of people who agree with your viewpoint. I don’t believe there’s a way to actually enforce this with the voting system we currently have, but I also think a relatively simple change can fix it. It will require a bit of coding. My proposal is a voting system with two votes: one to say that you agree/disagree, and another to say good/bad contribution. With this system, you can easily see if someone only thinks posts they agree with are good contributions, and you can use that information to calculate a total score that weighs their votes accordingly. It’s also small enough of a change that I think most people won’t have a problem figuring it out.

    Thoughts? Also, thank you Ace for taking the initiative in creating this place. It makes me happy to see that others want to see this change too.
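One way the two-vote idea could be scored, as a rough sketch only. The post does not specify a formula, so everything here is an assumption: votes are hypothetical `(voter, comment, agree, quality)` tuples with values of +1 or -1, and a voter's weight shrinks toward zero as their quality votes become perfectly predictable from their agreement votes.

```python
from collections import defaultdict


def voter_weights(votes):
    """Weight each voter by how independent their quality votes are from agreement.

    A match rate of 1.0 (quality always equals agreement) or 0.0 (always the
    opposite) gives weight 0; a match rate of 0.5 gives weight 1. This formula
    is an illustrative assumption, not part of the original proposal.
    """
    matches = defaultdict(int)
    totals = defaultdict(int)
    for voter, _comment, agree, quality in votes:
        totals[voter] += 1
        if agree == quality:
            matches[voter] += 1
    return {v: 1.0 - abs(2.0 * matches[v] / totals[v] - 1.0) for v in totals}


def comment_scores(votes):
    """Sum the quality votes on each comment, scaled by each voter's weight."""
    weights = voter_weights(votes)
    scores = defaultdict(float)
    for voter, comment, _agree, quality in votes:
        scores[comment] += weights[voter] * quality
    return dict(scores)


if __name__ == "__main__":
    votes = [
        ("partisan", "c1", +1, +1),  # agrees, so calls it good
        ("partisan", "c2", -1, -1),  # disagrees, so calls it bad
        ("fair", "c1", +1, +1),
        ("fair", "c2", -1, +1),      # disagrees but still rates it a good contribution
    ]
    print(voter_weights(votes))
    print(comment_scores(votes))
```

A real system would need more care, for example smoothing for voters with only a handful of votes, but the sketch shows that the second vote gives enough information to discount purely partisan voting.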

    9
    7

    There are many posts here with the purpose of convincing people to support electoral reform, but not much that's actually actionable. What do we do if we want to change things? For a start, does anyone have information on who's responsible for the election system at each level of government in each of the major cities?

    16
    1

    I think it's generally agreed upon that large files that change often do not belong while small files that never change are fine. But there's still a lot of middle ground where the answer is not so clear to me. So what's your stance on this? Where do you draw the line?

    16
    9
    https://slrpnk.net/comment/8035803

    I suspect this is a problem with posts that have extremely long bodies like this one: https://slrpnk.net/comment/8035803 I'm trying to scroll down to the first comment and inevitably overshoot. When I try to scroll back up, it suddenly jumps back to the middle of the OP's body.

    8
    1
    https://lemmy.ca/pictrs/image/447e38f4-975e-4ef9-a80c-44d253714fc1.png

    I was looking up when babies can safely start eating untoasted bread and one of the images led me to this website that sells... stuff? Are they selling me the question? Who knows. Then if you scroll down to the related products, you can buy a basketball club for $30, down from $15! ![](https://lemmy.ca/pictrs/image/5689223c-afbf-4b99-8ff6-365df272dc13.png) I'm guessing this is some phishing website looking to steal credit cards. I also still haven't found an answer to my original question.

    54
    11

    Is it possible for posts to show the domain (TLD and SLD) of link posts? Use case: I don't want to watch videos so I want to avoid clicking YouTube links. I would like to know that they are YouTube videos without having my phone spend the next minute trying to open YouTube.
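For illustration, extracting the domain to display could look roughly like this. It is a naive sketch using only the standard library; `display_domain` is a hypothetical helper, not an existing Lemmy or client function, and taking the last two labels breaks on multi-part suffixes such as `co.uk`, where a public suffix list would be needed.

```python
from urllib.parse import urlparse


def display_domain(url):
    """Return the last two labels of the URL's hostname, e.g. 'youtube.com'.

    Naive on purpose: 'example.co.uk' would come back as 'co.uk'.
    """
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


if __name__ == "__main__":
    print(display_domain("https://www.youtube.com/watch?v=abc"))
    print(display_domain("https://lemmy.ca/pictrs/image/example.png"))
```

Short link hosts such as `youtu.be` would still need their own mapping if the goal is to label every YouTube link the same way.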

    37
    5
    "Initials" by "Florian Körner", licensed under "CC0 1.0". / Remix of the original. - Created with dicebear.comInitialsFlorian Körnerhttps://github.com/dicebear/dicebearDA
    datahoarder howrar Now 100%
    How do you store your metadata?

    By metadata, I'm talking about things like text descriptions of a photo/video and where it comes from, or an explanation of what a certain binary blob contains, its format, how to use it, etc. The best solution I have right now is xattrs, but those are dependent on the file system, and there's no guarantee that they will stay when the files get moved, especially if the person moving them is unaware of their existence. The alternative is to keep a plaintext file with this metadata alongside every photo/video/binary/etc, but that would be a huge pain to keep in sync since both files have to be moved together. So my question to you: do you keep this kind of metadata? If so, how do you manage it?
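A minimal sketch of the sidecar-file alternative, assuming a hypothetical naming convention (`<filename>.meta.json` next to the file) and hypothetical helper names. It only eases the sync problem when moves go through the helper; anyone moving files by hand can still leave the sidecar behind.

```python
import json
import shutil
from pathlib import Path

SIDECAR_SUFFIX = ".meta.json"  # assumed naming convention, not a standard


def sidecar_path(path):
    """Path of the metadata file that sits next to the given file."""
    path = Path(path)
    return path.with_name(path.name + SIDECAR_SUFFIX)


def write_metadata(path, metadata):
    """Store metadata as JSON alongside the file."""
    sidecar_path(path).write_text(json.dumps(metadata, indent=2), encoding="utf-8")


def read_metadata(path):
    """Load the sidecar metadata, or an empty dict if there is none."""
    sidecar = sidecar_path(path)
    if not sidecar.exists():
        return {}
    return json.loads(sidecar.read_text(encoding="utf-8"))


def move_with_metadata(src, dst):
    """Move a file and, if present, its sidecar to the matching new location."""
    src, dst = Path(src), Path(dst)
    shutil.move(str(src), str(dst))
    sidecar = sidecar_path(src)
    if sidecar.exists():
        shutil.move(str(sidecar), str(sidecar_path(dst)))
```

Usage would be along the lines of `write_metadata("photo.jpg", {"description": "...", "source": "..."})` followed later by `move_with_metadata("photo.jpg", "archive/photo.jpg")`.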

    6
    5

    With the rapid advances we're currently seeing in generative AI, we're also seeing a lot of concern about large scale misinformation. Any individual with sufficient technical knowledge can now spam a forum with lots of organic looking voices and generate photos to back them up. Has anyone given some thought to how we can combat this? If so, how do you think the solution should/could look? How do you personally decide whether you're looking at a trustworthy source of information? Do you think your approach works, or are there still problems with it?

    14
    11

    Is there a community meant for anything that doesn't currently fit into the existing communities?

    10
    7
    "Initials" by "Florian Körner", licensed under "CC0 1.0". / Remix of the original. - Created with dicebear.comInitialsFlorian Körnerhttps://github.com/dicebear/dicebearHO
    Now
    11 969

    howrar

    lemmy.ca