The other thread hit bump limit and I'm addicted to talking about the birth of the ̶a̶l̶l̶-̶k̶n̶o̶w̶i̶n̶g̶ ̶c̶o̶m̶p̶u̶t̶e̶r̶ ̶g̶o̶d̶ the biggest financial bubble in history and the coming jobless eschaton, post your AI news here
Previous thread:
>>27559
>>31167
It's an interesting contrast between the measured effectiveness of AI, which seems to be a -20% productivity improvement, and the projections of task completion. Though crucially this is for experienced developers in codebases they're familiar with, who are however not familiar with LLMs. One is mostly tempted to reject this study.
>>31170
The monopoly is just in the IP, no? LLMs haven't wiped that out yet, though ideally they would.
>>31165
>>31175
METR's metrics showing exponential improvement in the length of tasks, measured in human labor hours, that AI can complete at a 50% success rate [^1].
METR's metrics on programming tasks failing to increase productivity for experienced programmers with no LLM experience in repositories they're deeply familiar with [^2].
A Stanford study documenting some of the decline in the job market for entry-level positions exposed to artificial intelligence [^3].
[^1]: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
[^2]: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
[^3]: https://digitaleconomy.stanford.edu/wp-content/uploads/2025/08/Canaries_BrynjolfssonChandarChen.pdf
>>31176
>programmers with no LLM experience
Look at the acceptance rates here:
https://github.blog/news-insights/research/the-economic-impact-of-the-ai-powered-developer-lifecycle-and-lessons-from-github-copilot/
Six months of using Copilot takes your acceptance rate from around 28% to 34%. That's barely anything, and even then it's not clear if it is due to better use or just getting bored of having to review Copilot's code. Tellingly, inexperienced developers are more likely to accept what Copilot generated for them… I'm sure it's not because they have lower standards…
>>31201
Tried kimi yet? It's considerably more direct than deepseek.
LLMs aren't that impressive of a concept really, it's just the final form of search engines before the web inevitably goes back to surf-ability oriented design. If you come at it like that it's not that bad. It'll never be like the Sci-fi AIs if that's the koolaid you bought.
>>31238 (me)
how amazon or whatever works around these sorts of limitations is that they make a suite of tests that attempt to translate business requirements into a means for the AI to "check its work", but this is just a roundabout way of working, to the point where it gets rather absurd. if you need detailed prompts, have to break down problems into discrete tests, and handhold your AI into outputting what you expect, then it stands to reason that this technology is not saving you time. you're doing the same work in a roundabout way, because most senior roles aren't even vomiting code all that much.
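for what it's worth, a minimal sketch of that "tests as a way for the AI to check its work" loop, in python. everything named here is made up for illustration (the discount rule, attempt_1, attempt_2, first_passing), and the two attempts are hand-written stand-ins for model output, not anything amazon or any vendor actually runs.

```python
from typing import Callable, Iterable, Optional

# Prices are integer cents so the checks do not depend on float rounding.
PriceFn = Callable[[int], int]
Check = Callable[[PriceFn], bool]


def passes_all(candidate: PriceFn, checks: Iterable[Check]) -> bool:
    """Return True only if the candidate satisfies every check."""
    for check in checks:
        try:
            if not check(candidate):
                return False
        except Exception:
            # A candidate that crashes counts as a failed check.
            return False
    return True


def first_passing(candidates: Iterable[PriceFn],
                  checks: list[Check]) -> Optional[PriceFn]:
    """Return the first candidate that passes every check, else None."""
    for candidate in candidates:
        if passes_all(candidate, checks):
            return candidate
    return None


# Hypothetical business requirement, written out as tests:
# "orders of 100.00 or more get 10% off, smaller orders are untouched".
def check_discount_applied(price_fn: PriceFn) -> bool:
    return price_fn(20000) == 18000


def check_small_orders_untouched(price_fn: PriceFn) -> bool:
    return price_fn(5000) == 5000


# Stand-ins for two generated attempts (assumed, not real model output).
def attempt_1(total: int) -> int:
    return total * 90 // 100  # discounts everything, fails the second check


def attempt_2(total: int) -> int:
    return total * 90 // 100 if total >= 10000 else total


CHECKS = [check_discount_applied, check_small_orders_untouched]
accepted = first_passing([attempt_1, attempt_2], CHECKS)
```

here `accepted` ends up being `attempt_2`. note that the loop itself is trivial, and all the actual thinking went into writing `CHECKS`, which is the work the post above says you end up doing anyway.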
>>31237
>>31238
I don't want to spam this thread anymore, but the issues that SWE-bench is sourced from, and that feed the METR study, are all sort of like this:
https://github.com/scikit-learn/scikit-learn/issues/13314
That is, they have a clearly defined problem statement, with very straightforward acceptance criteria, and instructions for how to repro. The repo itself has great test coverage too, so the machine can know if it's fucking something up by just running the tests. Most SWE is just not like this, not in my experience. Maybe some of you have worked with amazing QAs, I dunno.
>>31244
lmao why the fuck would you run research grade software, which depends on code carefully written by multiple nobel prize winning scientists, on anything other than a VM (not on other people's servers) that can be nuked if it goes wrong, or better, on a cluster of airgapped compute
rotflmao, unless you're asian, then it's fine; get the compute do the paper audit properly; black people also good, wypipo messy coders, western education shit
American Communist Party is teaching American People how to use tape measures
This is bad for science and technology, just saying
Very interesting article on the economics of GenAI and how much of a delicate balancing act it'd take to make it a profitable industry.
I've long observed various problems like "more users means more costs and it isn't always a good thing for them", but this one is more detailed.
https://gauthierroussilhe.com/en/articles/how-to-use-computing-power-faster
>>31308
lol the power of asking even basic follow up questions
hilarious that it takes Tucker to do it, absolute state of US media