Sam Altman is one of the dullest, most incurious and least creative people to walk this earth. This is, after all, the person who once tweeted 'i am a stochastic parrot and so are u', in response to Emily Bender's (entirely incisive and absolutely brilliant) critique of what his large language model...
After all, there's almost nothing that ChatGPT is actually useful for.
It's takes like this that just discredit the rest of the text.
You can dislike LLM AI for its environmental impact or questionable interpretation of fair use when it comes to intellectual property. But pretending it's actually useless just makes someone seem like they aren't dissimilar to a Drama YouTuber jumping in on whatever the latest on-trend thing to hate is.
"Almost nothing" is not the same as "actually useless". The former is saying the applications are limited, which is true.
LLMs are fine for fictional interactions, as in things that appear to be real but aren't. They suck at anything that involves being reliably factual, which is most things, including all the stupid places LLMs and other AI are being jammed into despite being consistently wrong, which tech bros love to call hallucinations.
They have LIMITED applications, but are being implemented as useful for everything.
To be honest, as someone who's very interested in computer generated text and poetry and the like, I find generic LLMs far less interesting than more traditional markov chains because they're too good at reproducing clichés at the exclusion of anything surprising or whimsical. So I don't think they're very good for the unfactual either. Probably a homegrown neural network would have better results.
I'm in the same boat. Markov chains are a lot of fun, but LLMs are way too formulaic. It's one of those things where AI bros will go, "Look, it's so good at poetry!!" but they have no taste and can't even tell that it sucks; LLMs just generate ABAB poems and getting anything else is like pulling teeth. It's a little more garbled and broken, but the output from a Markov chain generator (MCG) is a lot more interesting in my experience. Interesting content that's a little rough around the edges always wins over smooth, featureless AI slop in my book.
slight tangent:
I was interested in seeing how they'd work for open-ended text adventures a few years ago (back around GPT2 and when AI Dungeon was launched), but the mystique did not last very long. Their output is awfully formulaic, and that has not changed at all in the years since. (of course, the tech optimist-goodthink way of thinking about this is "small LLMs are really good at creative writing for their size!")
I don't think most people can even tell the difference between a lot of these models. There was a snake oil LLM (more snake oil than usual) called Reflection 70b, and people could not tell it was a placebo. They thought it was higher quality and invented reasons why that had to be true.
Like other comments, I was also initially surprised. But I think the gains are both real and easy to understand where the improvements are coming from. [ . . . ]
I had a similar idea, interesting to see that it actually works. [ . . . ]
I think that's cool, if you use a regular system prompt it behaves like regular llama-70b. (??!!!)
It's the first time I've used a local model and did [not] just say wow this is neat, or that was impressive, but rather, wow, this is finally good enough for business settings (at least for my needs). I'm very excited to keep pushing on it. Llama 3.1 failed miserably, as did any other model I tried.
For story telling or creative writing, I would rather have the more interesting broken english output of a Markov chain generator, or maybe a tarot deck or D100 table. Markov chains are also genuinely great for random name generators. I've actually laughed at Markov chains before with friends when we throw a group chat into one and see what comes out. I can't imagine ever getting something like that from an LLM.
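For anyone who hasn't played with one, here is a minimal sketch of the kind of word-level Markov chain generator being described. It is not any particular bot's code: the function names `build_chain` and `generate`, the `order` parameter, and the sample text are all made up for illustration. The idea is just to record which words follow each run of `order` words in the source text, then walk those records at random.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=20, seed=None):
    """Random-walk the chain, emitting at most `length` words."""
    rng = random.Random(seed)
    # Start from a random state that has at least one recorded follower.
    state = rng.choice(sorted(chain))
    out = list(state)
    for _ in range(length - len(state)):
        followers = chain.get(state)
        if not followers:
            # Dead end: this state only appeared at the very end of the text.
            break
        nxt = rng.choice(followers)
        out.append(nxt)
        # Slide the window forward by one word.
        state = state[1:] + (nxt,)
    return " ".join(out)


if __name__ == "__main__":
    # Hypothetical stand-in for a group chat export.
    chat_log = "the cat sat on the mat and the dog sat on the cat"
    print(generate(build_chain(chat_log, order=1), length=12))
```

With `order=1` the output is at its most garbled; raising `order` makes it track the source text more closely, which is roughly the dial between "surprising" and "coherent".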
Agreed, our chat server ran a Markov chain bot for fun.
Compared to ChatGPT on a 2nd server I frequent, it had much funnier and more random responses.
ChatGPT tends to just agree with whatever it chose to respond to.
As for real-world use, ChatGPT produces the wrong answer 90% of the time. I've enjoyed Circuit AI however. While it also produces incorrect responses, it shares its sources so I can more easily get the right answer.
All I really want from a chatbot is a gremlin that finds the hard things to Google on my behalf.
Let's be real here: when people hear the word AI or LLM they don't think of any of the applications of ML that you might slap the label "potentially useful" on (notwithstanding the fact that many of them also are in an all-that-glitters-is-not-gold kinda situation). The first thing that comes to mind for almost everyone is shitty autoplag like ChatGPT, which is also what the author explicitly mentions.
I'm a senior software engineer and I make use of it several times a week either directly or via things built on top of it. Yes you can't trust it will be perfect, but I can't trust a junior engineer to be perfect either—code review is something I've done long before AI and will continue to do long into the future.
I empirically work quicker with it than without and the engineers I know who are still avoiding it work noticeably slower. If it was useless this would not be the case.
ah, a señor software engineer. excusé-moi monsoir, let me back up and try once more to respect your opinion
uh, wait:
but I can’t trust a junior engineer to be perfect either
whoops no, sorry, can't do it.
jesus fuck I hope the poor bastards that are under you find some other place real soon, you sound like a godawful leader
and the engineers I know who are still avoiding it work noticeably slower
yep yep! as we all know, velocity is all that matters! crank that handle, produce those features! the factory must flow!!
fucking christ almighty. step away from the keyboard. go become a logger instead. your opinions (and/or the shit you're saying) are a big part of everything that's wrong with industry.
The survey found that 75.9% of respondents (of roughly 3,000* people surveyed) are relying on AI for at least part of their job responsibilities, with code writing, summarizing information, code explanation, code optimization, and documentation taking the top five types of tasks that rely on AI assistance. Furthermore, 75% of respondents reported productivity gains from using AI.
...
As we just discussed in the above findings, roughly 75% of people report using AI as part of their jobs and report that AI makes them more productive.
And yet, in this same survey we get these findings:
if AI adoption increases by 25%, time spent doing valuable work is estimated to decrease 2.6%
if AI adoption increases by 25%, estimated throughput delivery is expected to decrease by 1.5%
if AI adoption increases by 25%, estimated delivery stability is expected to decrease by 7.2%
and that's a report sponsored and managed right from the fucking lying cloud company, no less. a report they sponsor, run, manage, and publish is openly admitting this shit. that is how much this shit doesn't fucking work the way you sell it to be doing.
but no, we should trust your driveby bullshit. motherfucker.
This is a pretty funny interaction when you realise that you just misread froztbyte's self-reply (and the survey) as pro-AI, so you were just aggressively agreeing with each other all along
these arseslugs are so fucking tedious, and for almost 2 decades they've been dragging everything and everyone around them down to their level instead of finding some spine and getting better
word. When I hear someone say "I'm a SW developer and LLM xy helps me in my work" I always have to stop myself from being socially unacceptably open about my thoughts on their skillset.
and that’s the pernicious bit: it’s not just their skillset, it also goes right to their fucking respect for their team. “I don’t care about just barfing some shit into the codebase, and I don’t think my team will mind either!”
let me back up and try once more to respect your opinion
The point of me saying that was to imply I've been in the industry for a couple of decades, and have a good amount of experience from before all this. It wasn't any kind of appeal to authority, but I can see how you can read it that way.
jesus fuck I hope the poor bastards that are under you find some other place real soon, you sound like a godawful leader
I'm sorry, do you trust junior engineers blindly? That's gonna lead to a much worse outcome than if they get feedback when they do something wrong. Frankly, I don't trust any engineer to be perfect, we're humans and humans make mistakes, that's why we do code review as a fundamental skill in this industry. It's one of the primary ways for people to develop their ability.
yep yep! as we all know, velocity is all that matters! crank that handle, produce those features! the factory must flow!!
In an industry where many companies are tightening the belt, yes it's important to perform well—I kinda want to keep my job and ideally get a good bonus. It would be pretty foolish to leave free productivity on the table when the alternative is working harder to bridge the gap, where I could spend that energy doing more productive stuff.
as a starting position, fucking YES. you know why I hired that person? because I believe they can do the job and grow in it. you know what happens if they make a mistake? I give them all the goddamn backup they need to handle it and grow.
"this is why code review is so important" jfc. you're one of those "I've worked here for 4 years and I'm a senior" types, aren't you
@froztbyte@9point6 There's a distinct difference between "I have twenty years of experience" and "I've had the same ten minutes of experience over and over again, over a twenty year period" 🤷
yep yep. no code review. no version control either. that’s weak shit only babies use. over here you deploy patches by live editing app memory in production, and you update the codebase by editing the central repo using vscode remote. everyone has access to it because monorepos are what google do and so do we.
you have a 100% correct comprehension takeaway of what I said, well done!
Interesting you bring up reading comprehension because this whole thread started with me saying I would not trust a junior engineer to be perfect or trust them blindly.
You proceed to die on the hill that you would do that for some reason, despite now implying that you do, in fact, do code reviews—which we do because people can't be trusted to be perfect
Acting superior presses the dopamine button. Especially since the other poster keeps being mature and kind in their responses, really gets that feedback loop going.
Nice, me too, and whenever some tech-brained C-suite bozo tries to mansplain to me why LLMs will make me more efficient, I smile, nod politely, and move on, because at this point I don't think I can make the case that pasting AI slop into prod is objectively a worse idea than pasting Stack Overflow answers into prod.
At the end of the day, if I want to insert a snippet (which I don't have to double-check, mind you), auto-format my code, or organize my imports, which are all things I might use ChatGPT for if I didn't mind all the other baggage that comes along with it, Emacs (or Vim, if you swing that way) does this just fine and has done so for over 20 years.
I empirically work quicker with it than without and the engineers I know who are still avoiding it work noticeably slower.
If LOC/min or a similar metric is used to measure efficiency at your company, I am genuinely sorry.
I agree with you on the examples listed, there are much better tools than an LLM for that. And I agree no one should be copy and pasting without consideration, that's a misuse of these tools.
I'd say my main uses are kicking off a new test suite (obviously you need to go and check the assertions are what you expect, but it's usually about 95% there) which has gone from a decent percentage of the work for a feature down to an almost negligible amount of time. This one also results in me enjoying my job a bit more now too as I've always found writing tests a bit of a drudgery.
The other big use for me is that my organisation is pretty big and has a hefty amount of code (a good couple of thousand repos at least), we have a tool that's based on GPT which has processed all the code, so you can now ask queries about internal stuff that may not be well documented or particularly obvious. This one saves a load of time because I now don't always have to do the Slack merry go round to try and find an engineer that knows about what I'm looking for—sometimes it's still unavoidable, but they're less frequent moments now.
If LOC/min or a similar metric is used to measure efficiency at your company, I am genuinely sorry.
It's tied to OKR completion, which is generally based around delivery. If you deliver more feature work, it generally means your team's scores will be higher and assuming your manager is aware of your contributions, that translates to a bigger bonus. It's more of a carrot than a stick situation IMO, I could work less hard if I didn't want the extra money.
It’s tied to OKR completion, which is generally based around delivery. If you deliver more feature work, it generally means your team’s scores will be higher and assuming your manager is aware of your contributions, that translates to a bigger bonus.
holy fuck. you’re so FAANG-brained I’m willing to bet you dream about sending junior engineers to the fulfillment warehouse to break their backs
motherfucking, “i unironically love OKRs and slurping raises out of management if they notice I’ve been sleeping under my desk again to get features in” do they make guys like you in a factory? does meeting fucking normal software engineers always end like it did in this thread? will you ever realize how fucking embarrassing it is to throw around your job title like this? you depressing little fucker.
I worked at one of the biggest AI companies and their internal AI question/answer was dogshit for anything that could be answered by someone with a single fold in their brain. Maybe your co has a much better one, but like most others, I'm gonna go with the smooth brain hypothesis here.
Senior software engineer programmer here. I have had to tell coworkers "don't trust anything chat-gpt tells you about text encoding" after it made something up about text encoding.
Sadly all my best text encoding stories would make me identifiable to coworkers so I can't share them here. Because there's been some funny stuff over the years. Wait where did I go wrong that I have multiple text encoding stories?
That said I mostly just deal with normal stuff like UTF-8, UTF-16, Latin1, and ASCII.
My favourite was a junior dev who was like, "when I read from this input file the data is weirdly mangled and unreadable so as the first processing step I'll just remove all null bytes, which seems to leave me with ASCII text."
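The story doesn't say what the file's encoding was, but the symptoms fit UTF-16 being read as a single-byte encoding, so here is a small sketch under that assumption (UTF-16 little-endian, no BOM). The helper names `strip_nulls` and `decode_utf16` are made up for illustration. It shows why deleting null bytes seems to work on plain ASCII content and then quietly corrupts everything else.

```python
def strip_nulls(data: bytes) -> bytes:
    """The 'fix' from the story: drop every null byte."""
    return data.replace(b"\x00", b"")


def decode_utf16(data: bytes) -> str:
    """Decode the bytes as what they are assumed to be: UTF-16 little-endian."""
    return data.decode("utf-16-le")


if __name__ == "__main__":
    # ASCII-only text: every character is <byte> 00, so stripping nulls
    # happens to leave clean ASCII behind.
    ascii_raw = "hello".encode("utf-16-le")
    print(ascii_raw)               # b'h\x00e\x00l\x00l\x00o\x00'
    print(strip_nulls(ascii_raw))  # b'hello'

    # Non-ASCII text: U+0100 is encoded as 00 01, so stripping nulls
    # turns it into a stray control byte.
    mixed_raw = "café Ā".encode("utf-16-le")
    print(strip_nulls(mixed_raw))  # b'caf\xe9 \x01'
    print(decode_utf16(mixed_raw)) # café Ā
```

Decoding with the right codec as the first processing step avoids the problem entirely, at least when the encoding guess is correct.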
Another professional here. Lemmy really isn’t a place where you’re going to find people listening to what you have to say and critically examining their existing positions. You’re right, and you’re going to get downvoted for it.
In this and other use cases I call it a pretty effective search engine, instead of scrolling through stackexchange after clicking between google ads, you get the cleaned up example code you needed. Not a Chat with any intelligence though.
That ChatGPT can be more useful than a web search is really more indicative of how bad the web has got, and can only get worse as fake text invades it. It's not actually better than a functional search engine and a functional web, but the companies making these things have no interest in the web being usable. Pretty depressing.
Remember when you could read through all the search results on Google rather than being limited to the first hundred or so results like today? And boolean search operators actually worked and weren't hidden away behind a "beware of leopard" sign? Pepperidge Farm remembers.
"despite the many people who have shown time and time and time again that it definitely does not do fine detail well and will often present shit that just 10000% was not in the source material, I still believe that it is right all the time and gives me perfectly clean code. it is them, not I, that are the rubes"
The problem with stuff like this is not knowing when you don't know. People who had not read the books SSC Scott was reviewing didn't know he had missed the points (or not read the book at all) till people pointed it out in the comments. But the reviews stay up.
Anyway this stuff always feels like a huge motte-and-bailey, where we go from 'it has some uses' to 'it has some uses if you are a domain expert who checks the output diligently' back to 'some general use'.
A lot of the "I'm a senior engineer and it's useful" people seem to just assume that they're just so fucking good that they'll obviously know when the machine lies to them so it's fine. Which is one, hubris, two, why the fuck are you even using it then if you already have to be omniscient to verify the output??
Ahah I'm totally with you, I just personally know people that love it because they have never learned how to use a search engine. And these generalist generative AIs are trained on gobbled up internet basically, while also generating so many dangerous mistakes, I've read enough horror stories.
I'm in science and I'm not interested in ChatGPT, wouldn't trust it with a pancake recipe. Even if it was useful to me I wouldn't trust the vendor lock-in or enshittification that's gonna come after I get dependent on a tool in the cloud.
A local LLM on cheap or widely available hardware with reproducible input / output? Then I'm interested.
It's useful insofar as you can accommodate its fundamental flaw of randomly making stuff the fuck up, say by having a qualified expert constantly combing its output instead of doing original work, and don't mind putting your name on low quality derivative slop in the first place.
I don't see what useful information the motte and bailey lingo actually conveys that equivocation and deception and bait-and-switch didn't. And I distrust any turn of phrase popularized in the LessWrong-o-sphere. If they like it, what bad mental habits does it appeal to?