Why is no one talking about how unproductive it is to have to verify every "hallucination" ChatGPT gives you?

phoneymouse@lemmy.world · 5 days ago

sp3ctr4l@lemmy.zip · edit-2 5 days ago

I just tried out Gemini.

I asked it several questions of the form ‘are there any things of category x which are also in category y?’

It would often confidently reply ‘No, here’s a summary of things that meet all your conditions to fall into category x, but sadly none also fall into category y’.

Then I would reply, ‘wait, you don’t know about thing gamma, which does fall into both x and y?’

To which it would reply ‘Wow, you’re right! It turns out gamma does fall into x and y’ and then give a bit of a description of how/why that is the case.

After that, I would say ‘… so you… lied to me. ok. well anyway, please further describe thing gamma that you previously said you did not know about, but now say that you do know about.’

And that is where it gets … fun?

It always starts with an apology template.

Then, if it’s some kind of topic that it has almost certainly been manually dissuaded from talking about, it then lies again and says ‘actually, I do not know about thing gamma, even though I just told you I did’.

If it is not a topic that it has been manually dissuaded from talking about, it does the apology template and then also further summarizes thing gamma.

…

I asked it ‘do you write code?’ and it gave a moderately lengthy explanation of how it is comprised of code, but does not write its own code.

Cool, not really what I asked. Then I gave it the command ‘write an implementation of bogo sort in python 3.’

… and then it does that.
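
(For anyone unfamiliar: bogo sort just shuffles a list until it happens to come out sorted. Below is a minimal sketch of the kind of thing such a request might produce. It is not Gemini's actual output, and the names `is_sorted` and `bogo_sort` are illustrative.)

```python
import random


def is_sorted(items):
    """Return True if every element is <= the one after it."""
    return all(a <= b for a, b in zip(items, items[1:]))


def bogo_sort(items, rng=random):
    """Shuffle a copy of the list until it happens to be sorted.

    Expected running time grows factorially with the list length,
    so this is only usable on very short lists.
    """
    result = list(items)  # leave the caller's list untouched
    while not is_sorted(result):
        rng.shuffle(result)
    return result


print(bogo_sort([3, 1, 2]))  # [1, 2, 3]
```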

…

Awesome. Hooray. Billions and billions of dollars for a shitty way to reform web search results into a conversational form, which is very often confidently wrong and misleading.

archomrade [he/him]@midwest.social · 5 days ago

Idk why we have to keep re-hashing this debate about whether AI is a trustworthy source or summarizer of information when it’s clear that it isn’t - at least not often enough to justify this level of attention.

It’s not as valuable as the marketing suggests, but it does have some applications where it may be helpful, especially if given a conscious effort to direct it well. It’s better understood as a mild curiosity and a proof of concept for transformer-based machine learning that might eventually lead to something more profound down the road but certainly not as it exists now.

What is really un-compelling, though, is the constant stream of anecdotes about how easy it is to fool into errors. It’s like listening to an adult brag about tricking a kid into thinking chocolate milk comes from brown cows. It makes it seem like there’s some marketing battle being fought over public perception of its value as a product that’s completely detached from how anyone actually uses or understands it as a novel piece of software.

sp3ctr4l@lemmy.zip · edit-2 5 days ago

Probably it keeps getting rehashed because people who actually understand how computers work are extremely angry and horrified that basically every idiot executive believes the hype and then asks their underlings to implement it, and will then blame them for doing what they asked them to do when it turns out their idea was really, unimaginably stupid, but the idiot executive gets a golden parachute and the software person gets fired.

That, and/or the widespread proliferation of this bullshit is making stupid people more stupid, and just making more people stupid in general.

Or how like all the money and energy spent on this is actively murdering the environment and dooming the vast majority of our species, when it could be put toward building affordable housing or renovating crumbling infrastructure.

Don’t worry, if we keep throwing exponentially increasing amounts of effort at the thing with exponentially diminishing returns, eventually it’ll become God!

archomrade [he/him]@midwest.social · 5 days ago

Then why are we talking about someone getting it to spew inaccuracies in order to prove a point, rather than the decision of marketing execs to proliferate its use for a million pointless implementations nobody wants at the expense of far higher energy usage?

Most people already know and understand that it’s bad at most of what execs are trying to push it as, it’s not a public-perception issue. We should be talking about how energy-expensive it is, and curbing its use on tasks where it isn’t anything more than an annoying gimmick. At this point, it’s not that people don’t understand its limitations, it’s that they don’t understand how much energy it’s costing and how it’s being shoved into everything we use without our noticing.

Somebody hopping onto openAI or Gemini to get help with a specific topic or task isn’t the problem. Why are we trading personal anecdotes about sporadic personal usage when the problem is systemic, not individualized?

people who actually understand how computers work

Bit idea for moderators: there should be a site or community-wide auto-mod rule that replaces this phrase with ‘eat all their vegetables’ or something that is equally un-serious and infantilizing as ‘understand how computers work’.

sp3ctr4l@lemmy.zip · edit-2 5 days ago

Your original comment is posted under mine.

I am going to assume you are responding to that.

… I wasn’t trying to trick it.

I was trying to use it.

This is relevant to my more recent reply to you… because it is an anecdotal example of how broadly useless this technology is.

…

I wasn’t aware the purpose of this joke meme thread was to act as a policy workshop to determine an actionable media campaign aimed at generating mass awareness of the economic downsides of LLMs, which wouldn’t fucking work anyway because LLMs are being pushed by a class of wealthy people who do not fucking care what the masses think, and have essentially zero reason at all to change their course of action.

What, we’re going to boycott the entire tech industry?

Vote them out of office?

These people are on video, on record saying basically, ‘eh, we’re not gonna save the climate, not happening, might as well burn it all down even harder, even faster, for a tiny percentage chance our overcomplicated autocomplete algorithm magically figures out how to fix everything afterward’.

…

And yes, I very intentionally used the phrase ‘understand how computers actually work’ to infantilize and demean corporate executives.

Because they are narcissistic privileged sociopaths who are almost never qualified, almost always make idiotic decisions that will only benefit themselves and an increasingly shrinking number of people at the expense of the vast majority of people who know more and work harder than they do, and who often respond like children having temper tantrums when they are justly criticized.

Again, in the context of a joke meme thread.

Please get off your high horse, or at least ride it over to a trough of water if you want a reasonable place to try to convince it to drink in the manner in which you prefer.

archomrade [he/him]@midwest.social · 5 days ago

… I wasn’t trying to trick it.

I was trying to use it.

Err, I’d describe your anecdote more as an attempt to reason with it…? If you were using google to search for an answer to something and it came up with the wrong thing, you wouldn’t then complain back to it about it being wrong, you’d just try again with different terms or move on to something else. If ‘using’ it for you is scolding it as if it’s an incompetent coworker, then maybe the problem isn’t the tool but how you’re trying to use it.

I wasn’t aware the purpose of this joke meme thread was to act as a policy workshop to determine an actionable media campaign

Lmao, it certainly isn’t. Then again, had you been responding with any discernible humor of your own I might not have had reason to take your comment seriously.

And yes, I very intentionally used the phrase ‘understand how computers actually work’ to infantilize and demean corporate executives.

Except your original comment wasn’t directed at corporate executives, it appears to be more of a personal review of the tool itself. Unless your boss was the one asking you to use Gemini? Either way, that phrase is used so much more often as self-aggrandizement and condescension that it’s hard to see it as anything else, especially when it follows an anecdote of that person trying to reason with a piece of software lmao.

sp3ctr4l@lemmy.zip · 5 days ago

It is not that it responded “Sorry, I cannot find anything like what you described, here are some things that are pretty close.”

It affirmatively said “No, no such things as you describe exist, here are some things that are pretty close.”

There’s a huge difference between a coworker saying “Dang man, I dunno, I can’t find a thing like that.” and “No, nothing like that exists, closest to it is x y z,”

The former is honest. The latter is confidently incorrect.

Combine that with “Wait what about gamma?”

And the former is still honest, and the latter, who now describes gamma in great detail and how it meets my requirements, is now an obvious liar, after telling me that nothing like that exists.

If I now know I am dealing with a dishonest interlocutor, I am forced to consider tricking it into being honest.

Or, if I am less informed or more naive, I might just, you know, believe it the first time.

A standard search engine that is not formatted to resemble talking to a person does not prompt a user to expect it to act like a person, and thus does not suffer from this problem.

If you don’t find what you’re looking for, all that means is you did not find it.

If you are told that no such thing exists, a lot of people are going to believe that no such thing exists.

That is typically called spreading disinformation, when the actor knows what they are claiming is false.

It’s worse than unhelpful, it actively spreads lies.

…

Anyway, I’m sorry that you don’t see humor in multi billion dollar technology failing at achieving its purported abilities, I laugh all the time at poorly designed products, systems, things.

…

Finally, I did not use the phrase in contention in my original post.

I used it in my response to you, specifically and only within a single sentence which revolved around incompetent executives.

…

It appears that reading comprehension is not your strong suit, maybe you can ask Gemini about how to improve it.

Err, well, maybe don’t do that.

archomrade [he/him]@midwest.social · 5 days ago

reading comprehension

Lmao, there should also be an automod rule for this phrase, too.

There’s a huge difference between a coworker saying […]

Lol, you’re still talking about it like it’s a person that can be reasoned with bud. It’s just a piece of software. If it doesn’t give you the response you want you can try using a different prompt, just like if google doesn’t find what you’re looking for you can change your search terms.

If people are gullible enough to take its responses as given (or scold it for not being capable of rational thought lmao) then that’s their problem - just like how people can take the first search result from google without scrutiny if they want to, too. There’s nothing especially problematic about the existence of an AI chatbot that hasn’t been addressed with the advent of every other information technology.

antonim@lemmy.dbzer0.com · 5 days ago

to fool into errors

tricking a kid

I’ve never tried to fool or trick AI with excessively complex questions. When I tried to test it (a few different models over some period of time - ChatGPT, Bing AI, Gemini) I asked stuff as simple as “what’s the etymology of this word in that language”, “what is [some phenomenon]”. The models still produced responses ranging from shoddy to absolutely ridiculous.

completely detached from how anyone actually uses

I’ve seen numerous people use it the same way I tested it, basically a Google search that you can talk with, with similarly shit results.

archomrade [he/him]@midwest.social · 5 days ago

Why do we expect a higher degree of trustworthiness from a novel LLM than we do from any given source or forum comment on the internet?

At what point do we stop hand-wringing over llms failing to meet some perceived level of accuracy and hold the people using it responsible for verifying the response themselves?

There’s a giant disclaimer on every one of these models that responses may contain errors or hallucinations. At this point I think it’s fair to blame the user for ignoring those warnings and not the models for not meeting some arbitrary standard.

antonim@lemmy.dbzer0.com · edit-2 3 days ago

Why do we expect a higher degree of trustworthiness from a novel LLM than we do from any given source or forum comment on the internet?

The stuff I’ve seen AI produce has sometimes been more wrong than anything a human could produce. And even if a human would produce it and post it on a forum, anyone with half a brain could respond with a correction. (E.g. claiming that an ordinary Slavic word is actually loaned from Latin.)

I certainly don’t expect any trustworthiness from LLMs, the problem is that people do expect it. You’re implicitly agreeing with my argument that it is not just that LLMs give problematic responses when tricked, but also when used as intended, as knowledgeable chatbots. There’s nothing “detached from actual usage” about that.

At what point do we stop hand-wringing over llms failing to meet some perceived level of accuracy and hold the people using it responsible for verifying the response themselves?

at this point I think it’s fair to blame the user for ignoring those warnings and not the models for not meeting some arbitrary standard

This is not an either-or situation, it doesn’t have to be formulated like this. Criticising LLMs which frequently produce garbage is in practice also directed at people who do use them. When someone on a forum says they asked GPT and paste its response, I will at the very least point out the general unreliability of LLMs, if not criticise the response itself (very easy if I’m somewhat knowledgeable about the field in question). This is practically also directed at the person who posted that, such as e.g. making them come off as naive and uncritical. (It is of course not meant as a real personal attack, but even a detached and objective criticism has a partly personal element to it.)

Still, the blame is on both. You claim that:

There’s a giant disclaimer on every one of these models that responses may contain errors or hallucinations

I don’t remember seeing them, but even if they were there, the general promotion and the ways in which LLMs are presented are trying to tell people otherwise. A few disclaimers are irrelevant for forming people’s opinions compared to the extensive media hype and marketing.

Anyway my point was merely that people do regularly misuse LLMs, and it’s not at all difficult to make them produce crap. The stuff about who should be blamed for the whole situation is probably not something we disagree about too much.

archomrade [he/him]@midwest.social · 3 days ago

The stuff I’ve seen AI produce has sometimes been more wrong than anything a human could produce. And even if a human would produce it and post it on a forum, anyone with half a brain could respond with a correction.

Seems like the problem is that you’re trying to use it for something it isn’t good or consistent at. It’s not a dictionary or encyclopedia, it’s a language model that happens to have some information embedded. It’s not built or designed to retrieve information from a knowledge bank, it’s just there to deconstruct and reconstruct language.

When someone on a forum says they asked GPT and paste its response, I will at the very least point out the general unreliability of LLMs, if not criticise the response itself (very easy if I’m somewhat knowledgeable about the field in question)

Same deal. Absolutely chastise them for using it in that way, because it’s not what it’s good for. But it’s a bit of a frequency bias to assume most people are using it in that way, because those people are the ones using it in the context of social media. Those who use it for more routine tasks aren’t taking responses straight from the model and posting it on lemmy, they’re using it for mundane things that aren’t being shared.

Anyway my point was merely that people do regularly misuse LLMs, and it’s not at all difficult to make them produce crap. The stuff about who should be blamed for the whole situation is probably not something we disagree about too much.

People misuse it because they think they can ask it questions as if it’s a person with real knowledge, or they are using it precisely for its convincing bullshit abilities. That’s why I said it’s like laughing at a child for giving a wrong answer or convincing them of a falsehood merely from passive suggestion - the problem isn’t that the kid is dumb, it’s that you’re (both yourself and the person using it) going in with the expectation that they are able to answer that question or distinguish fact from fiction at all.

taladar@sh.itjust.works · 5 days ago

And then more money spent on adding that additional garbage filter to the beginning and the end of the process which certainly won’t improve the results.

pyre@lemmy.world · 5 days ago

copilot did the same with basic math. just to test it I said “let’s say I have a 10x6 rectangle. what number would I have to divide width and height by, in order to end up with a rectangle that’s half the area?”

it said “in order to make it half, you should divide them by 2. so [pointlessly lengthy steps explaining the divisions]”

I said “but that would make the area 5x3 = 15 units which is not half the area of 60”

it said “you’re right! in order to … [fixing the answer to √2 using approximation]”

I don’t know if I said it then, or after some other fucking nonsense but when I said “you’re useless” it had the fucking audacity to take offense and end the conversation!

like fuck off, you don’t get to have fake pride if you don’t have basic fake intelligence but use it in your description.
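
(For reference, the arithmetic behind that exchange is quick to check. Dividing both sides by k divides the area by k², so halving the area needs k = √2, not 2. A minimal sketch, assuming the 10x6 rectangle from the question; the helper name `divisor_for_area_fraction` is made up for illustration.)

```python
import math


def divisor_for_area_fraction(fraction):
    """Divisor k for both sides so the new area is `fraction` of the old.

    (w / k) * (h / k) = (w * h) / k**2, so k = sqrt(1 / fraction).
    """
    return math.sqrt(1 / fraction)


width, height = 10, 6
original_area = width * height               # 60

# The chatbot's first answer: divide both sides by 2.
naive_area = (width / 2) * (height / 2)      # 5 * 3 = 15, a quarter of 60

# The corrected answer: divide both sides by sqrt(2), roughly 1.414.
k = divisor_for_area_fraction(0.5)
halved_area = (width / k) * (height / k)     # 30, up to float rounding

print(original_area, naive_area, round(k, 3), round(halved_area, 6))
```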

sp3ctr4l@lemmy.zip · edit-2 5 days ago

It’s a perfect encapsulation of the corpo mindset:

Whatever I do is profound, meaningful, with endless possibilities for future greatness…

… even though I’m just talking out of my ass 99% of the time…

… and if you have the audacity, the nerve, to have a completely normal reaction when you determine that that is what I am doing, pshaw, how uncouth, I won’t stand for your abuse!

…

They’ve done it. They’ve made a talking (not thinking) machine in their own image.

And it was not good.

You start a conversation you can’t even finish it
You’re talkin’ a lot, but you’re not sayin’ anything
When I have nothing to say, my lips are sealed
Say something once, why say it again?

Psycho Killer Qu’est-ce que c’est

Knock_Knock_Lemmy_In@lemmy.world · 4 days ago

please further describe thing gamma that you previously said you did not know about, but now say that you do know about.’

It’s quite amusing to ask it about conspiracy theories. There’s a huge amount in its training set (not because the theories are true, just that they are often written about) that it has been dissuaded from discussing.

UnderpantsWeevil@lemmy.world · edit-2 5 days ago

Cool, not really what I asked. Then command ‘write an implementation of bogo sort in python 3.’

… and then it does that.

Alright, but… it did the thing. That’s a feature older search engines couldn’t reliably perform. The output is wonky and the conversational style is misleading. But it’s not materially worse than sifting through wrong answers on StackExchange or digging through a stack of physical textbooks looking for Python 3 Bogo Sort IRL.

I agree AI has annoying flaws and flubs. And it does appear we’re spending vast resources doing what a marginal improvement to Google five years ago could have done better. But this is better than previous implementations of search, because it gives you discrete applicable answers rather than a collection of dubiously associated web links.

sp3ctr4l@lemmy.zip · edit-2 5 days ago

But this is better than previous implementations of search, because it gives you discrete applicable answers rather than a collection of dubiously associated web links.

Except for when you ask it to determine if a thing exists by describing its properties, and then it says no such thing exists while providing a discrete response explaining in detail how there are things that have some, but not all of those properties…

… And then when you ask it specifically about a thing you already know about that has all those properties, it tells you about how it does exist and describes it in detail.

What is the point of a ‘conversational search engine’ if it cannot help you find information unless you already know about said information?!

The whole, entire point of formatting it into a conversational format is to trick people into thinking they are talking to an expert, an archivist with encyclopedic knowledge, who will give them accurate answers.

Yet it gatekeeps information that it does have access to but omits.

The format of providing a bunch of likely related links to a query is much more reminiscent of doing actual research, with no impression that you will find what you want right away, only that this is a tool to aid you in your research process.

This is only an improvement if you want to further unteach people how to do actual research and critical thinking.

UnderpantsWeevil@lemmy.world · 5 days ago

Except for when you ask it to determine if a thing exists by describing its properties

Basic search can’t answer that either. You’re describing a task neither system is well equipped to accomplish.

sp3ctr4l@lemmy.zip · edit-2 4 days ago

With basic search, it is extremely obvious that that feature does not exist.

With conversational search, the search itself gaslights you into believing it has this feature, as it understands how to syntactically parse the question, and then answers it confidently with a wrong answer.

I would much rather buy a car that cannot fly, knowing it cannot fly, than a car that literally talks to you and tells you it can fly, and sometimes manages to glide a bit, but also randomly nose dives into the ground whilst airborne.

Zeppo@sh.itjust.works · 4 days ago

I don’t feel like off-the-cuff summaries by AI can replace web sites and detailed articles written by knowledgeable humans. Maybe if you’re looking for a basic summary of a topic.

UnderpantsWeevil@lemmy.world · 4 days ago

I don’t feel like off-the-cuff summaries by AI can replace web sites and detailed articles written by knowledgeable humans

No. But that’s not what a typical search result returns.

There’s also no guarantee the “detailed articles” you get back are well-informed or correct. Lots of top search results are just ad copy or similar propaganda. YouTube, in particular, is rife with long winded bullshitters.

What you’re looking for is a well-edited trustworthy encyclopedia, not a search engine.

Zeppo@sh.itjust.works · 2 days ago

Sure. It depends on the topic. It also depends how good you are at searching. Personally, I don’t have any difficulty finding quality websites via Google or DuckDuckGo. Sometimes it requires refining the search terms, and many people don’t know how to do that properly. For certain topics, I might not. However, I stand by my assertion that some statistically generated text which you can’t trust at all is less useful than a good article on a topic. Is Wikipedia useful also? Yes, it is.

1stTime4MeInMCU@mander.xyz · 5 days ago

I’m convinced people who can’t tell when a chat bot is hallucinating are also bad at telling whether something else they’re reading is true or not. What online are you reading that you’re not fact checking anyway? If you’re writing a report you don’t pull the first fact you find and call it good, you need to find a couple citations for it. If you’re writing code, you don’t just write the program and assume it’s correct, you test it. It’s just a tool and I think most people are coping because they’re bad at using it

BluesF@lemmy.world · 5 days ago

Yeah. GPT models are in a good place for coding tbh, I use it every day to support my usual practice, it definitely speeds things up. It’s particularly good for things like identifying niche python packages & providing example use cases so I don’t have to learn shit loads of syntax that I’ll never use again.

Aceticon@lemmy.world · 5 days ago

In other words, it’s the new version of copying code from Stack Overflow without going to the trouble of properly understanding what it does.

Rekorse@sh.itjust.works · 5 days ago

Pft you must have read that wrong, it’s clearly turning them into master programmers one query at a time.

BluesF@lemmy.world · 5 days ago

I know how to write a tree traversal, but I don’t need to because there’s a python module that does it. This was already the case before LLMs. Now, I hardly ever need to do a tree traversal, honestly, and I don’t particularly want to go to the trouble of learning how this particular python module needs me to format the input or whatever for the one time this year I’ve needed to do one. I’d rather just have something made for me so I can move on to my primary focus, which is not tree traversals. It’s not about avoiding understanding, it’s about avoiding unnecessary extra work. And I’m not talking about saving the years of work it takes to learn how to code, I’m talking about the 30 minutes of work it would take for me to learn how to use a module I might never use again. If I do, or if there’s a problem I’ll probably do it properly the second time, but why do it now if there’s a tool that can do it for me with minimum fuss?

archomrade [he/him]@midwest.social · 5 days ago

The usefulness of Stack Overflow or a GPT model completely depends on who is using it and how.

It also depends on who or what is answering the question, and I can’t tell you how many times someone new to SO has been scolded or castigated for needing/wanting help understanding something another user thinks is simple. For all of the faults of GPT models, at least they aren’t outright abusive to novices trying to learn something new for themselves.

Aceticon@lemmy.world · edit-2 5 days ago

I fully expect an LLM trained on Stack Overflow is quite capable of being just as much of an asshole as a Stack Overflow user.

Joking aside, whilst I can see that “not going to the trouble of understanding the code you got” is mostly agnostic in terms of the source being Stack Overflow or an LLM (whilst Stack Overflow does naturally have more context around the solution, including other possible solutions, an LLM can be interrogated further to try and get more details), I think only time will tell whether using an LLM ultimately makes for less well informed programmers than being a heavy user of Stack Overflow.

What I do think is more certain is that figuring out a solution yourself is a much better way to learn that stuff than getting it from an LLM or Stack Overflow, though I can understand that often the time is not available for that more time-consuming method, plus that method is an investment that will only pay off if you get faced with similar problems in the future, so sometimes it’s simply not worth it.

The broader point I made still stands: there is a class of programmers who are copy & paste coders (no idea if the poster I originally replied to is one or not) for whom an LLM is just a faster to query Stack Overflow.

archomrade [he/him]@midwest.social · 5 days ago

There will always be a class of programmers/people that choose not to interrogate or seek to understand information that is conveyed to them - that doesn’t negate the value provided by tools like Stack Overflow or chatGPT, and I think OP was expressing that value.

tired_n_bored@lemmy.world · 5 days ago

I beg someone to help me. There is this new guy at my workplace, officially a developer, who can’t write code at all. He pasted an entire project I did into ChatGPT with “optimize this” and opened a pull request with the result. I swear.

wizardbeard@lemmy.dbzer0.com · 5 days ago

Report up the chain, if it’s safe to do so and they are likely to understand.

Also, check what your company’s rules regarding data security and LLM use are. My understanding is that at many places putting private company or customer data into an outside LLM is seen as shouting company secrets out to the open internet. At least that’s the policy where I’m at. Pasting an entire project in would definitely violate things for my workplace.

In general that’s rude as hell. New guy comes in, grabs an entire project they have no background with, and just chucks it at an LLM? No actual review of it themselves, just an assumption that your code is so shit that a general use text generator will do better? Doesn’t sound like a “team player” to me (management eats that kind of talk up).

Maybe couch it as “I want to make sure that as a team, we’re utilizing the tools available to us in the best way possible to multiply our strengths. That said, I’m concerned the approach that [LLM idiot] is using will only result in more work for the team. Using chatGPT as he has is an explosive approach, when I feel that a more scalpel-like approach to address specific areas for improvement would be the best method moving forward. We should be using these tools to address specific concerns, not chucking everything at the wall in some never ending chase of an undefined idea of ‘more optimized’.”

Perhaps frame it in terms of man hours? The immediateness of 5 minutes in chatGPT can cost the team multiple workdays in reviewing the output, whereas more focused code review up front can reduce the man hour cost significantly.

There’s also a bunch of articles out there online about how overuse of LLMs is leading to a measurable decrease in code quality and increase in security issues in code bases.

tired_n_bored@lemmy.world · 4 days ago

Such a great answer, thank you lots!

WalnutLum@lemmy.ml · edit-2 4 days ago

Reminder that all these Chat-formatted LLMs are just text-completion engines trained on text formatted like a chat. You’re not having a conversation with it, it’s “completing” the chat history you’re providing it, by randomly(!) choosing the next text tokens that seem like they best fit the text provided.

If you don’t directly provide, in the chat history and/or the text completion prompt, the information you’re trying to retrieve, you’re essentially fishing for text in a sea of random text tokens that seems like it fits the question.

It will always complete the text, even if the tokens it chooses minimally fit the context, it chooses the best text it can but it will always complete the text.

This is how they work, and anything else is usually the company putting in a bunch of guide bumpers to reformat prompts into coaxing the models to respond in a “smarter” way (see GPT-4o and “chain of reasoning”)
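
A toy sketch of that “randomly choose the next token” step, for illustration only. The scores below are made-up numbers, not output from any real model, and `sample_next_token` is a hypothetical helper. The point is that the sampler hands back a token whether the scores strongly favour one option or barely distinguish between them.

```python
import math
import random


def sample_next_token(scores, temperature=1.0, rng=random):
    """Pick one token at random, weighted by softmax(scores / temperature)."""
    tokens = list(scores)
    top = max(scores.values())  # subtract the max for numerical stability
    weights = [math.exp((scores[t] - top) / temperature) for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical scores for the next token in two different contexts.
well_supported = {"alpha": 9.0, "beta": 2.0, "gamma": 1.0}   # one clear fit
barely_fits = {"alpha": 0.1, "beta": 0.1, "gamma": 0.1}      # nothing stands out

rng = random.Random(0)  # seeded only so repeated runs behave the same
print(sample_next_token(well_supported, rng=rng))
print(sample_next_token(barely_fits, rng=rng))  # still returns *something*
```

In the second case every option is an equally poor fit, yet a token still comes out, which is the “it will always complete the text” behaviour described above.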

HackerJoe@sh.itjust.works · 4 days ago

They were trained on reddit. How much would you trust a chatbot whose brain consists of the entirety of reddit put in a blender?

I am amazed it works as well as it does. Gemini only occasionally tells people to kill themselves.

JackbyDev@programming.dev · 5 days ago

Because if I haven’t found anyone asking the same question on a search index, ChatGPT won’t tell me to just use Google or close my question as a duplicate when it’s not a duplicate.

bl_r@lemmy.dbzer0.com · 4 days ago

My job uses a data science platform that has a special ai assistant trained on its own docs.

The first time I tried using it, it used the wrong language. The second time I used it, it was hallucinating its own functions, but after looking up the docs I told it what function to use and it gave me code that worked.

I have not used it a third time. I don’t think I will.

Nurse_Robot@lemmy.world · 5 days ago

sigh people do talk about this, they complain about it non-stop. These same people probably aren’t using it as intended, or are deliberately trying to farm a “gotcha” response. AI is a very neat tool which can do a lot of things well, but it’s important to recognize its limitations. I don’t use it for things I don’t understand because I won’t recognize if it’s spitting out nonsense, but for topics I do understand it’s hard to overstate how efficient and time saving it is.

ByteOnBikes@slrpnk.net · 5 days ago

The FuckAI people are valid for their concerns.

Unfortunately, their anger seems to constantly be misdirected at the weirdest things, instead of root issues.

taladar@sh.itjust.works · 5 days ago

Oh, there is plenty of hate for the hype cycle in general which is about as close to the root of the issue as you can get.

brucethemoose@lemmy.world · edit-2 5 days ago

My take is they should be fighting the corporate API vs open source models war, instead of just “screw all AI” which really means “screw open source AI and let Sam Altman enshittify everything”

Especially on Lemmy.

It’d be like blanket railing against social media and ultimately getting the Fediverse banned, while Facebook and X walk away.

zarkanian@sh.itjust.works · 5 days ago

“Give me a vegan recipe using <ingredient>” has been flawless. The recipes are decent, although they tend to use the same spices over and over.

Paradigm_shift@sh.itjust.works · 5 days ago

I sometimes use it to “convert” preexisting bulletpoints or informal notes into a professional sounding business email. I already know all the information so proofreading the final product doesn’t take a lot of time.

I think a lot of people who shit on AI forget that some people struggle with putting their thoughts into words. Especially if they aren’t writing in their native language.

Rekorse@sh.itjust.works · 5 days ago

Efficiency depends on the cost, doesn’t it?

Nurse_Robot@lemmy.world · 5 days ago

The cost to me, the user, is nothing

taladar@sh.itjust.works · 5 days ago

Sorry to hear that you consider your time worthless. Have you tried therapy for that?

Nurse_Robot@lemmy.world · 5 days ago

There’s something so uniquely funny about being too stupid to insult someone properly. Thanks for the chuckle

surph_ninja@lemmy.world · 5 days ago

Depending on the task, it’s quicker to verify the AI response than work through the blank page phase.

TrickDacy@lemmy.world · 5 days ago

Probably because they’re not checking them

hoshikarakitaridia@lemmy.world · 5 days ago

Because in a lot of applications you can bypass hallucinations.

getting sources for something
as a jump off point for a topic
to get a second opinion
to help argue for or against your position on a topic
get information in a specific format

In all these applications you can bypass hallucinations because either its task is non-factual, or it’s verifiable while prompting, or you will be able to verify it in any of the subsequent tasks.

Just because it makes shit up sometimes doesn’t mean it’s useless. Like an idiot friend, you can still ask it for opinions or something and it will definitely start you off somewhere helpful.

WalnutLum@lemmy.ml · 5 days ago

All LLMs are text completion engines, no matter what fancy bells they tack on.

If your task is some kind of text completion or repetition of text provided in the prompt context, LLMs perform wonderfully.

For everything else you are wading through territory you could probably cover more easily using other methods.

burgersc12@mander.xyz · 5 days ago

I love the people who are like “I tried to replace Wolfram Alpha with ChatGPT why is none of the math right?” And blame ChatGPT when the problem is all they really needed was a fucking calculator

leftzero@lemmynsfw.com · 4 days ago

The fucking problem is they stole my damn calculator and now they’re trying to sell me an LLM as a replacement.

LLMs are an interesting if mostly useless toy (an excessively costly one, though; Eliza achieved mostly the same results at a fraction of the cost).
The massive scam bubble that’s been built around them, however, and its absurd contribution to enshittification and global warming, is downright monstrous, and makes anyone defending commercial LLMs worthy of the utmost contempt, just like those who defended cryptocurrencies before LLMs became the latest fad.

ms.lane@lemmy.world · 5 days ago

Also just searching the web in general.

Google is useless for searching the web today.

fibojoly@sh.itjust.works · 5 days ago

Not if you want that thing that everyone is on about. Don’t you want to be in with the crowd?! /s

ohwhatfollyisman@lemmy.world · 5 days ago

so, basically, even a broken clock is right twice a day?

dev_null@lemmy.ml · 5 days ago

Yes, but for some tasks mistakes don’t really matter, like “come up with names for my project that does X”. No wrong answers here really, so an LLM is useful.

ohwhatfollyisman@lemmy.world · 5 days ago

great value for all that energy it expends, indeed!

dev_null@lemmy.ml · 5 days ago

Can’t agree

archomrade [he/him]@midwest.social · 5 days ago

The energy expenditure for GPT models is basically a per-token calculation. Having it generate a list of 3-4 token responses would barely be a blip compared to having it read and respond to entire articles.

There might even be a case for certain tasks with a GPT model being more energy efficient than making multiple google searches for the same. Especially considering all the backend activity google tacks on for tracking users and serving ads, complaining about someone using a GPT model for something like generating a list of words is a little like a climate activist yelling at someone for taking their car to the grocery store while standing across the street from a coal-burning power plant.

ohwhatfollyisman@lemmy.world · 5 days ago

… someone using a GPT model for something like generating a list of words is a little like a climate activist yelling at someone for taking their car to the grocery store while standing across the street from a coal-burning power plant.

no, it’s like a billion people taking their respective cars to the grocery store multiple times a day each while standing across the street from one coal-burning power plant.

each person can say they are the only one and their individual contribution is negligible. but get all those drips together and you actually have a deluge of unnecessary wastage.

archomrade [he/him]@midwest.social · 5 days ago

Except each of those drips are subject to the same system that preferences individualized transport

This is still a perfect example, because while you’re nit-picking the personal habits of individuals who are a fraction of a fraction of the total contributors to GPT model usage, huge multi-billion dollar entities are implementing it into things that have no business using it and are responsible for 90% of LLM queries.

Similar for castigating people for owning ICE vehicles, who are not only uniquely pressured into their use but also account for less than 10% of GHG emissions in the first place.

Stop wasting your time attacking individuals using the tech for help in their daily tasks, they aren’t the problem.

Rekorse@sh.itjust.works · 5 days ago

How is that faster than just picking a random name? No one picks software based on name.

dev_null@lemmy.ml · edit-2 5 days ago

And yet virtually all software has a name that took some thought or creativity, and/or has some interesting history. Like the domain name of your Lemmy instance. Or Lemmy.

And people working on something generally want to be proud of their project and not name it the first thing that comes to mind, but take some time to decide on a name.

Rekorse@sh.itjust.works · 4 days ago

Wouldn’t they also not want to take a random name off an AI generated list? How is that something to be proud of? The thought, creativity, and history behind it is just that you put a query into ChatGPT and picked one out of 500 names?

Maybe it’s just a difference of perspective, but that’s not only not a special origin story for a name, it’s taking from others in a way you won’t be able to properly credit them, which is essential to me.

I would rather avoid the trouble and spend the time with a coworker or friend throwing ideas back and forth and building an identity intentionally.

I suppose AI could be nice if I was alone nearly all the time.

dev_null@lemmy.ml · edit-2 4 days ago

The process of throwing ideas back and forth usually doesn’t include just choosing one, but generating ideas as jumping off points, usually with some existing concept in mind. Talking with friends, looking at other projects, searching for inspiration online and in the real world, and now also generating some more ideas with an LLM to add to the mix. Using one source and just picking a suggestion probably won’t get you a good result.

onionsinmypores@sh.itjust.works · edit-2 5 days ago

No, maybe more like, even a functional clock is wrong every 0.8 days.
https://superuser.com/questions/759730/how-much-clock-drift-is-considered-normal-for-a-non-networked-windows-7-pc

The frequency is probably way higher for most LLMs though lol

Kushan@lemmy.world · 5 days ago

They don’t give you the answer, they give you a rough idea of where to look for the answer.

I’ve used them to generate chunks of boilerplate code that was 80% of what I needed, because I knew what I needed and wanted to save time.

BakerBagel@midwest.social · 5 days ago

There are ways of doing that which don’t require burning an acre of rainforest.

wizardbeard@lemmy.dbzer0.com · 4 days ago

Yep. The overwhelming majority of IDEs have support for making templates/snippets.

VScode/VScodium has a very robust snippet system where you can set parts as “fill in the blank” that you can tab between, with optional drop down menus for choices. You can even link different “fill in” sections so you can do stuff like type in an argument name and have it propagate that same name through multiple places in your snippet.
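
For anyone who hasn’t seen one, here is a rough sketch of such a snippet, built in Python only so the JSON can be generated and checked. It assumes the user-snippets JSON layout from the VS Code docs (`prefix`, `body`, `description`); the snippet name, the `tfn` prefix, and the function skeleton itself are made up for illustration:

```python
import json

# One entry of a user snippets file, expressed as a Python dict.
snippet = {
    "Typed function": {  # hypothetical snippet name
        "prefix": "tfn",  # hypothetical trigger text
        "body": [
            # ${1:...} and ${2:...} are "fill in the blank" tab stops with defaults;
            # ${3|a,b,c|} is a drop-down choice.
            "def ${1:func_name}(${2:arg}: ${3|int,str,float|}) -> None:",
            # Reusing $2 links these spots to the argument name typed above.
            '    """Do something with $2."""',
            "    print($2)",
            "    $0",  # where the cursor lands after the last tab stop
        ],
        "description": "Function skeleton with a linked argument name",
    }
}

snippet_json = json.dumps(snippet, indent=2)
print(snippet_json)
```

Pasting the printed JSON into a language’s user snippets file should be enough to try it, though the exact menu path for opening that file varies a bit between versions.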

If that’s too much, how the fuck can any dev (or even someone hacking together scripts) survive without at least one file of common shit they made before that they can copy paste from? I really feel like that’s bare minimum.

Either it’s boilerplate you can already copy from somewhere else (documentation or previous work), or it’s something you should probably review (at least briefly) and make into a template or snippet you can copy and paste later. That’s part of the magic of programming: you get to build your own toolbox over time.

snooggums@lemmy.world · 5 days ago

Because most people are too lazy to bother with making sure the results are accurate when they sound plausible. They want to believe the hype, and lack critical thinking.

Chip_Rat@lemmy.world · 5 days ago

I don’t want to believe any hype! I just want to be able to ask “hey ChatGPT, I’m looking for a YouTube video by Technology Connections where he discusses dryer heat pumps.” And not have it spit out “it’s called ‘the neat ways your dryer heat pumps save energy!’”

And it is not; that video doesn’t exist. And it’s even harder to disprove it at first glance because the LLM is mimicking what Alex would have called the video. So you look and look with your sister’s very inefficient PS4 controller-to-YouTube interface… And finally ask it again and it shy flowers you…

But I swear he talked about it ?!?! Anyone?!?

gamermanh@lemmy.dbzer0.com · edit-2 5 days ago

He hasn’t

I think in a recent video he mentioned he will soon, but he hasn’t done a video with even a segment on heat pumps in dryers yet

Fairly confident in this, recently finished a rewatch of basically all his content

Chip_Rat@lemmy.world · 5 days ago

Damn it… I was sure he mentioned them briefly in one of his heat pump videos but I trust you over ChatGPT…

He should do a video! I am constantly enchanted by his heat pump explainers… I don’t know why but it’s one of those concepts that’s just a bit out of my wheelhouse. So I always “knew” how it worked. But the lightbulb moment. The aha! Pure crack.

ms.lane@lemmy.world · 5 days ago

This sounds awfully familiar, like almost exactly what people were saying about Wikipedia 20 years ago…

julietOscarEcho@sh.itjust.works · 5 days ago

Pretty weak analogy. Wikipedia was technologically trivial and did a really good job of avoiding vested interests. Also the hype is orders of magnitude different; no one ever claimed Wikipedia was going to lead to superhuman intelligences or to replacement of swathes of human creative/service workers.

Actually since you mention it, my hot take is that Wikipedia might have been a more significant step forward in AI than OpenAI/latest generation LLMs. The creation of that corpus is hugely valuable in training and benchmarking models of natural language. Also it actually disrupted an industry (conventional encyclopedias), whereas I’m struggling to think of anything that LLMs have replaced in the same way thus far.

snooggums@lemmy.world · 5 days ago

Those people were wrong because Wikipedia requires actual citations from credible sources, not comedic subreddits and Infowars. Wikipedia is also completely open about the information being summarized, both in who is presenting it and where someone can confirm it is accurate.

AI is presented to the user as a black box, and it gets portrayed as equivalent to a human with terms like ‘hallucinations’, which really mean ‘is wrong a bunch, lol’.

kjaeselrek@lemmy.ml · 3 days ago

who the fuck is scraeming ‘RTFM’ at my house. show yourself, coward. i will never r any fm

orcrist@lemm.ee · 4 days ago

What are you talking about? We mention this on a daily basis. That’s the #1 complaint about ChatGPT when used for factual purposes

buddascrayon@lemmy.world · 4 days ago

when used for factual purposes

I think the point of the post is that anyone who uses it for this is a fucking moron.

Liz@midwest.social · 4 days ago

Literally the only use I’ve found for it that’s better than any other alternative is describing a thing to it that you can’t remember the name of. It’s usually right, and when it’s wrong you were probably never gonna find the thing on your own anyway.

But I don’t go to it first, only when I can’t figure out how to find the name any other way.

alsimoneau@lemmy.ca · 4 days ago

Yeah, I’ve used it when I’m looking for a specific function I don’t know the name of, and then I go read the documentation.

orcrist@lemm.ee · 3 days ago

That’s true, and my point is that the post didn’t say that. The post specifically said something different that was not true.

ugjka@lemmy.world · 5 days ago

The only reason I use ChatGPT for some quick stuff is just that search engines suck so bad.

brucethemoose@lemmy.world · 5 days ago

Perplexity (or open source equivalents) are much better for this.