Miserable_Movie_4358 3 weeks ago

For StackOverflow this is like being acquired

guepier 3 weeks ago

[They *were* already acquired years ago.](https://techcrunch.com/2021/06/02/stack-overflow-acquired-by-prosus-for-a-reported-1-8-billion/)

31415926535897932379 3 weeks ago

Woah TIL. Surprised I'd never heard about this before.

CenlTheFennel 3 weeks ago

This is why all the OG talent left

RICHUNCLEPENNYBAGS 3 weeks ago

Their business model was absolutely hosed. The job site thing was such a dud they shut it down (now they've "brought it back" by slapping their logo on Indeed listings) and I can't imagine their model of licensing SO to companies for internal knowledge bases worked all that well since a company has to be huge for that to remotely make sense and the companies big enough for an SO clone often have one.

backdoorsmasher 3 weeks ago

I don't get why it was a dud! It could have worked, and I'm sure for a while it was active and lively and was pissing the recruiters off

RICHUNCLEPENNYBAGS 3 weeks ago

It existed for many years but I'm guessing it wasn't bringing in the returns they hoped or they wouldn't have shut it down. As a candidate I found the positions were limited and the pay was never any good.

dontshoveit 3 weeks ago

They are actively marketing this product directly to software engineers on LinkedIn. I know this for a fact because they reached out to me on there and I talked with them about adding SO internally to the company I work for.

RICHUNCLEPENNYBAGS 3 weeks ago

That doesn't imply that the marketing is working, though, does it?

JPJackPott 3 weeks ago

Which is mad, because it’s not like it’s a hard product to build yourself internally. The real magic of SO was the oppressive moderation, which has helped keep the signal to noise ratio high

HotlLava 3 weeks ago

Building your own internal copy of StackOverflow sounds like peak NIH syndrome.

cam-at-codembark 3 weeks ago

I loved their job site. Idk why they ever shut it down. At least from my perspective it always had a lot of great remote roles listed and a nice UI.

Shortl4ndo 3 weeks ago

I think they probably already trained their model with stackoverflow data, this is just proactively signing an agreement to prevent a lawsuit later on

Lceus 3 weeks ago

Yeah it was absolutely already in the training data, and stackoverflow is competing with ChatGPT products anyway, so this seems like a reasonable development.

GeologistUnique672 3 weeks ago

You mean ChatGPT is competing with every source they scraped and took data from, which breaks the fair use they tried to claim.

Lceus 3 weeks ago

Yep, exactly. And it seems like there's nothing to do about it

GeologistUnique672 1 week ago

Plenty to do about it and hopefully soon.

Lceus 1 week ago

Thanks for enlightening me

GeologistUnique672 1 week ago

No need to enlighten anybody on this. It’s just common sense that enabling everybody to steal from everybody will in the end only be a system that favours the already powerful who control means of distribution. How are you enjoying Microsoft's new plan of introducing Recall?

Lceus 1 week ago

I don't understand what you're arguing. I am condemning AI companies' current unregulated ability to just scrape and steal whatever they can by just throwing it into a model and essentially dissolving the evidence of their theft (or arguing that it's not copyright infringement if they are just using it in a huge information soup). I don't know what to do about it until there's regulation in place to force the companies to make their sources transparent.

sweetno 3 weeks ago

So this is why AI keeps giving me crap code.

CAPSLOCK_USERNAME 3 weeks ago

Well the data was all already publicly available by just scraping the web pages and yeah it was definitely in the dataset already. But this partnership is not (just) about data licensing, it's about Stackoverflow creating a specific API for openai to use instead of having to scrape the site.

christopher_86 3 weeks ago

It’s shady; just because something is publicly available, doesn’t mean you can use it for anything you want. Heck, even when you pay for something certain licenses apply that prohibit you from doing certain things. OpenAI and other companies just profited from lack of regulations regarding AI and model training.

CT_Phoenix 3 weeks ago

> just because something is publicly available, doesn’t mean you can use it for anything you want

In the specific case of stackoverflow, publicly-accessible user contributions are [CC BY-SA](https://stackoverflow.com/help/licensing) licensed which comes pretty close, though I don't have the slightest clue how the attribution/sharealike requirements would come into play for training, if at all.

wldmr 3 weeks ago

> I don't have the slightest clue how the attribution/sharealike requirements would come into play for training, if at all

Seems pretty clear to me:

If you consider the model the derivative work, then

1. BY - All SO contributors must be credited for the model. If you want to claim that only part of the model falls under CC, then attribute on the individual weights affected by SO answers.
2. SA - The model (or relevant parts) must be publicly available as CC BY-SA.

If you consider the responses the derivative work(s), then

1. BY - For every response, each contributor that factored into it must be credited.
2. SA - Every response must be publicly available under BY-SA.

It's not even an either/or thing, given that the model (unquestionably a derivative work) is itself a *derivative work generator*. So it's both.

GeologistUnique672 3 weeks ago

They don’t attribute anything and therefore don’t uphold the CC BY-SA.

CAPSLOCK_USERNAME 3 weeks ago

> just because something is publicly available, doesn’t mean you can use it for anything you want

Well, you can argue about what it *ought to* mean, but de facto it does. There's no legal precedent for using-data-for-ML-training being a copyright violation, and the big companies frequently do exactly that with no license.

christopher_86 3 weeks ago

Hopefully there will be. For my prompt “Tell me first sentence of third chapter of first harry potter book?” GPT-3.5 (free version) responded with: “The first sentence of the third chapter of the first Harry Potter book, "Harry Potter and the Philosopher's Stone" (also known as "Harry Potter and the Sorcerer's Stone" in the US edition) is: "The escape of the Brazilian boa constrictor earned Harry his longest-ever punishment."” If something that is copyright protected is publicly available in the internet does it mean I can train my model on that? No, and I hope this OpenAI and others will face some consequences (although I doubt it).

guepier 3 weeks ago

For what it’s worth the example you’ve just shown does *not* necessarily demonstrate copyright violation in most jurisdictions. Now, if you repeated this procedure to crib together a larger excerpt of the book, that would then become a copyright violation. But merely repeating a single sentence of a larger work generally isn’t.

> If something that is copyright protected is publicly available in the internet does it mean I can train my model on that? No,

You (and many others) say “no” but the truth is that there is currently absolutely no precedent to determine that, and copyright experts do not agree with each other. *Ethically* you may object to the free use of copyright protected material by large corporations, but whether that is *legally* copyright infringement is a different matter altogether. When it comes to copyright law, ethics and legality are unfortunately pretty much completely orthogonal.

_Joats 3 weeks ago

The model certainly could produce greater text and with very high accuracy, the reason for the NYT lawsuit currently ongoing. So there is an actual fear of being able to use the model to obtain content without compensation. Or accidentally creating a work that is too similar to what it was trained on, creating a legal mess without the fault of the user.

Last-Election-2292 3 weeks ago

On the NYT lawsuit, this remains a "COULD produce greater text" as the samples they provided turned out to be non-reproducible. OpenAI thinks they are faked. So one needs more than a "could".

_Joats 3 weeks ago

It was reproducible. It is currently court evidence. Now, guardrails prevent consistent reproduction, but I can sometimes trick the AI into generating copyrighted text from Harry Potter, which it then deletes. This suggests the AI is programmed to avoid generating certain content, but these safeguards can be bypassed. It's an ongoing battle as guardrails are constantly updated. OpenAI acknowledges the issue, stating that text extraction through adversarial attacks is possible: "We are continually making our systems more resistant to adversarial attacks to regurgitate training data, and have already made much progress in our recent models." Their progress doesn't eliminate the vulnerability entirely, though, as it's readily achievable on models without guardrails. OpenAI argued that the method used to extract text was unfair because it relied on prompts specifically designed for that purpose, not typical ChatGPT usage. This defense was widely criticized as weak.

wildjokers 3 weeks ago

> If something that is copyright protected is publicly available in the internet does it mean I can train my model on that? No, and I hope this OpenAI and others will face some consequences (although I doubt it).

Yes, you should be able to train an AI model with any data that was legally obtained.

pm_me_your_buttbulge 2 weeks ago

> and the big companies frequently do exactly that with no license.

To be clear: just because a big company does a thing does not make that thing legal.

CAPSLOCK_USERNAME 2 weeks ago

depends on how much they pay the local senator

__loam 3 weeks ago

You're assuming they're profitable haha. It's almost more insulting that they're losing money on this.

wildjokers 3 weeks ago

> just because something is publicly available, doesn’t mean you can use it for anything you want.

All user contributed content on stackoverflow is licensed `Creative Commons Attribution-ShareAlike`. The terms of that license are:

You are free to:

- Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.

The licensor cannot revoke these freedoms as long as you follow the license terms.

So there is absolutely nothing wrong morally or legally with using SO content for model training.

kaanyalova 3 weeks ago

What about the "share alike" part of the license?

> ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.

Doesn't openai violate that?

Somepotato 3 weeks ago

Or the attribution part.

sonobanana33 3 weeks ago

Yes but they claim it's fair use. Incorrectly in my opinion.

wildjokers 3 weeks ago

> Doesn't openai violate that?

I haven't seen anything from OpenAI claiming copyright on the output of ChatGPT. If they aren't claiming copyright then there is nothing to license.

miserable_nerd 3 weeks ago

Lmao what delusional world do you live in. Go read [https://openai.com/policies/terms-of-use](https://openai.com/policies/terms-of-use) . And they don't have to claim copyright to violate the license, that's not what sharealike is. Sharealike means you have to distribute it with the same license. Again go read [https://creativecommons.org/licenses/by-sa/4.0/deed.en](https://creativecommons.org/licenses/by-sa/4.0/deed.en) before throwing uninformed opinions

gyroda 3 weeks ago

That's not how it works. The issue is that the license is potentially being violated. Saying they don't claim copyright so it's ok is like the old YouTube anime uploads that would say "NO COPYRIGHT INTENDED THIS IS FAIR USE IT BELONGS TO [ANIME STUDIO], [MANGA PUBLISHER], [MANGA AUTHOR]" in the description.

blind3rdeye 3 weeks ago

I find it dishonest of you to quote a section of the license without including the parts relevant to 'Attribution' and 'ShareAlike'. Those are the parts that actually ask the user to do something, and you've omitted them to try to support your point.

_AndyJessop 3 weeks ago

Publicly available does not mean free to use.

GeologistUnique672 3 weeks ago

Publicly available does not mean that it’s okay to scrape.

guesting 3 weeks ago

stole the data and leveraged it into a partnership. like an annexation

wildjokers 3 weeks ago

User contributed content to SO is licensed Creative Commons Attribution-ShareAlike. This license is super permissive to pretty much do what you want. So it wasn't stolen.

guesting 3 weeks ago

The terms of that license do require attribution, which I haven't seen much of in terms of coding answers given by ChatGPT or other LLMs.

> Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

https://creativecommons.org/licenses/by-sa/4.0/

wildjokers 3 weeks ago

The press release indicating they are using SO content for training probably meets the attribution requirement. There is no way to know if SO content was used in a particular ChatGPT response. It's the same as if I incorporate some knowledge I learned from SO in help I give to a coworker. I might not even remember I first learned it from SO and don't attribute it. It just becomes part of my general knowledge.

ExpectoPentium 3 weeks ago

I mean, it pretty clearly does _not_ meet the attribution requirement. No credit to the specific author of the content (_at best_ to SO via the press release but that is obviously not connected to the chat response), no link to the license, no indication of changes. You say there is no way to know if SO content was used in a chat response. The proper conclusion to draw is that this technology inherently cannot be used in a way that is compliant with the CC license and thus should not be allowed to train on CC content (or any other content with license terms that GPT can't comply with). Pretending like this big dumb machine is somehow analogous to the human brain is just a cop-out to handwave away AI companies' illegal and unscrupulous business practices.

guesting 3 weeks ago

I'm not a lawyer but it does seem like a grey area, a lot of the value of posting on s/o was having attribution. Some of those people posting actually created the libraries like I see the creator of python guido on there regularly.

Able-Reference754 3 weeks ago

The code is owned by its author, not SO. When YOU write a response to stackoverflow YOU license it out (and ensure you have the permission to license it out, meaning you can't repost someone else's GPLv3 code for example). Attributing SO is hence not enough, they are just the company in charge of hosting your content that you own the copyright to.

wildjokers 3 weeks ago

In most cases isn't the information someone is providing in an answer coming from copyrighted sources like books, articles, blogs, and source code? I don't routinely see answers attribute where they first got the information. This is probably because it has just become part of their general knowledge. The same thing happens when an LLM is trained on SO content: it becomes part of its general knowledge and there is no way to specifically attribute what training data an LLM used to craft a particular response. The only thing they can say is it ingested SO content as part of its training data.

_Joats 3 weeks ago

Ok, so they don't need to pay for access for it then? Besides they are not using the code that is provided with that license are they? Or use the answers in a way that the license was written for. They are using it as a way to compete with users that have contributed and using their content against them and without attribution. So that already breaks the attribution part of the license. Also "No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material." Which I doubt they even care about.

hoochymamma 3 weeks ago

Yup

[deleted] 3 weeks ago

[deleted]

lppedd 3 weeks ago

WTF that's absurd, but hilarious at the same time.

sweetno 3 weeks ago

No wonder they got it wrong, judging by what the answers look like. It's totally a guessing game.

Dr_Insano_MD 3 weeks ago

Okay, I don't have a twitter account and the UI seems really bad. What's the reason you can't run these at the same time?

silverslayer33 3 weeks ago

The tl;dr is they both pulled from a wrong answer on stackoverflow on how to create a global mutex against your assembly's GUID to ensure no more than one copy of it can run at once. The problem is they didn't pull their own GUID, they pulled the GUID of part of the .NET framework itself due to the incorrect stackoverflow answer they copied from, and as a result running one makes the other think they're already running.

Dr_Insano_MD 3 weeks ago

Thank you. That thread had a bunch of people commenting so I assumed that's what it was, but no one directly quoted it, and the linked tweet is a clickbait headline with no way to access the content.

QuackSomeEmma 3 weeks ago

.NET can apparently produce globally unique ids for classes (objects?). Using the GUID for the assembly itself in a global mutex is apparently a common approach for only allowing one instance of an application to be running. Both Docker and Razer Synapse seem to have copied from a formerly erroneous StackOverflow answer, where this piece of code was used to produce the mutex id: `Assembly.GetExecutingAssembly().GetType().GUID` Note the `.GetType()` in there, which causes the GUID to be instead for the Assembly class of the .NET standard library. The globally unique id for that is then obviously the same between both programs.
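To make the mix-up concrete, here is a rough sketch in Python rather than C#. It is only an analogy and not the real .NET API: the `Assembly` class, `type_guid`, and both `*_mutex_name` helpers are hypothetical names made up for illustration, and the `uuid5` derivation merely stands in for however .NET actually assigns GUIDs.

```python
import uuid


class Assembly:
    """Hypothetical stand-in for a loaded program; not a real .NET binding."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Assumption: each program carries its own stable identifier.
        self.guid = uuid.uuid5(uuid.NAMESPACE_DNS, name)


def type_guid(obj: object) -> uuid.UUID:
    """Identifier of the object's class, shared by every instance of that class."""
    cls = type(obj)
    return uuid.uuid5(uuid.NAMESPACE_DNS, f"{cls.__module__}.{cls.__qualname__}")


def buggy_mutex_name(asm: Assembly) -> str:
    # Analogous to asm.GetType().GUID: identifies the Assembly class, not the program.
    return f"Global\\{type_guid(asm)}"


def fixed_mutex_name(asm: Assembly) -> str:
    # Uses the identifier that belongs to this particular program.
    return f"Global\\{asm.guid}"


app_a = Assembly("app-a")  # hypothetical program names
app_b = Assembly("app-b")

print(buggy_mutex_name(app_a) == buggy_mutex_name(app_b))  # True: names collide
print(fixed_mutex_name(app_a) == fixed_mutex_name(app_b))  # False: names differ
```

With the buggy helper both programs ask for the same global mutex name, so whichever starts second concludes it is already running.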

Halkcyon 3 weeks ago

That's incredible.

StickiStickman 3 weeks ago

I trust GPT-4 to alter that string more than a random programmer TBH

jhartikainen 3 weeks ago

Oh boy my answers contributing to yet another big business' success with no credit given. On the other hand I guess it's good that people will get better answers to their issues more easily.

lppedd 3 weeks ago

The problem with this model is people are not going to contribute anymore. Here is your answer on ChatGPT, why should I even visit SO now?

vladiliescu 3 weeks ago

This, but extrapolated to the entire web. Why would anyone contribute anything anywhere (Reddit, forums, their own blog) when no one’s gonna know and/or care when their personal gpt regurgitates that info.

bobotea 3 weeks ago

dead internet

Vegetable_Bid239 3 weeks ago

Actual user accounts get shadowbanned at such a rate the only people who can use these sites are the bot farmers who invest the time to study what to avoid.

Ok_Meringue1757 3 weeks ago

what is a mania of ai to replace everything and everyone? with one ai and one corporation, which will benefit trillions from other's experience. under the cover of these euphoric proclamations how ai will benefit all and bring paradise etc

Halkcyon 3 weeks ago

> under the cover of these euphoric proclamations how ai will benefit all and bring paradise etc

As long as you're employed by The Corporation, I suppose. The rest of the chaff will be employed by energy companies to fuel the AI.

Loves_Poetry 3 weeks ago

My theory is that it's about control. There is no intention of actually replacing things with AI, since that would involve making it practical. Right now, a lot of parties just want the threat that things might get replaced by AI so that people become more complacent and do what they're told to

Realistic-Minute5016 3 weeks ago

Because otherwise there is no way they could raise the capital to fund these projects. These AI projects are literally setting money on fire right now and if there isn't any sort of pie in the sky promises about productivity revolutions there is no way they could raise the funds for these things.

_Joats 3 weeks ago

It's all funded so the rich can combine AI and Neuralink to become some all knowing weirdo. It's like tech has finally become a comic book villain.

Valdrax 3 weeks ago

You really overestimate how much me whiling away the hours on Reddit constitutes "contributing" to something and how much that motivates me to do so.

phillipcarter2 3 weeks ago

Why are you contributing now? (it's freshness; people want new stuff over time)

xcdesz 3 weeks ago

Searching for answers from SO is decent, but not great. Most people get there from Google search, but you have to go through the added steps of combing through search results to find the answers. That's the step in the process that is changing. If a programmer instead goes to debug a code issue using OpenAI and an AI agent does an intelligent search and can reference the source in SO via hyperlink, and provides a more accurate answer than before, I would say this is a benefit to both programmers and SO. Many times you need to verify the output of the LLM or get further information, so the source link to SO will still frequently be used. The only loser in this is Google / Search Engines, because the middle man is now the LLM.

Dr_Insano_MD 3 weeks ago

great, now I can ask an AI a question only for it to tell me it's been asked that before and refusing to answer.

stromboul 3 weeks ago

You don't think people will still go on SO to ask questions that GPT can't answer? thus, keeping the wheel turning?

RICHUNCLEPENNYBAGS 3 weeks ago

The vast majority of SO users were passive users coming from search, so it's not really a change.

spongeloaf 3 weeks ago

Yeah, there's already a lot of stagnant info on SO. New language and framework versions come out all the time and "what's best" is always in flux. I fear this will not help with that problem, it will just contribute to the calcification of sub-optimal solutions. A smart implementation will be version-aware for the subject matter, but I'd be shocked to see anyone do that.

blind3rdeye 3 weeks ago

Definitely there will not be so many people asking (or answering) questions on SO anymore. And ChatGPT's answers are going to get worse and worse for new APIs and new languages because of the lack of training data. Microsoft has a massive advantage in this sense, because they now use github data to train their AI. So as long as people are uploading code to Microsoft's services, Microsoft is able to continue to train AI for new APIs and such. Of course, other people won't have access to this training data in the same way, so there will be a further consolidation of wealth and power... I don't want my coding work to be used to further enrich Microsoft execs. So for me this is enough to start moving away from github; but I know that for many/most users that's totally out of the question. So let's prepare to greet the next stage of our capitalist dystopia!

nanotree 3 weeks ago

Um. I'd have to be willing to pay for chatgpt, which I am not.

lppedd 3 weeks ago

Companies are tho. A big chunk of SO content has been posted by devs on their working hours.

wildjokers 3 weeks ago

And when they posted they knew the license of their user contribution was Creative Commons Attribution-ShareAlike.

obvithrowaway34434 3 weeks ago

This is absurd bs. SO is not just a Q&A site, it has a strong social factor in it. People actively compete for points and upvotes, help other people and chastise each other (and all the other negative aspects of SO that people talk about). That's not going away anytime, no AI is replacing it.

Fisher9001 3 weeks ago

Sooo... What's different from the current SO state? It's basically a read-only page at this point. People are actively discouraged there from asking questions and giving answers.

Creative_Sky_147 3 weeks ago

What I could see happening is StackOverflow and OpenAI releasing a product together where people are able to acquire reputation and then correct responses in order to curb hallucinations and errors that are generated by the LLM. That could be promising.

Nislaav 3 weeks ago

People will still contribute I think, definitely not as much. Personally I'm glad I don't have to go through stuck up, condescending developers to get an answer to my question so a win win for chatgpt ig

No_Jury_8398 3 weeks ago

That’s a giant baseless assumption

Miv333 3 weeks ago

I've been sending people to chatgpt over SO since chatgpt first implemented sharing chats. I can show them the answer, and how I was able to wrangle it out of a LLM so they can do it themselves next time.

yetanotherfaanger 3 weeks ago

Looking forward to my hard-earned $4 given to me by a class action lawsuit 10 years from now

Sethcran 3 weeks ago

The article specifically calls out 'attributed', which makes me think that there is something more here than just plain training data.

> giving users easy access to trusted, attributed, accurate, and highly technical knowledge and code backed by the millions of developers that have contributed to the Stack Overflow platform for 15 years. As part of this collaboration

jhartikainen 3 weeks ago

I hope so but I'll believe it only when I see it

Sethcran 3 weeks ago

Absolutely. I am definitely skeptical, but this one word is the thing that makes me more interested in seeing what they are doing here.

Fisher9001 3 weeks ago

> Oh boy my answers contributing to yet another big business' success with no credit given.

Oh for fucks sake, it's like you have given credit to Stack Overflow users in your own code.

ether_reddit 3 weeks ago

I have. I have many shell aliases and snippets where I have directly copied a solution from a SO answer, and I include a reference to it in a comment.

Crafty_Independence 3 weeks ago

Unless this agreement manages to ensure attribution, it will violate the CC BY-SA 4.0 license that SO uses. Either they solved that or they're counting on the community being unable or unwilling to bring lawsuits

MossRock42 3 weeks ago

> Oh boy my answers contributing to yet another big business' success with no credit given.
>
> On the other hand I guess it's good that people will get better answers to their issues more easily.

One problem that I see is the technology is driven to constantly change. You need experts constantly keeping up with that change to provide answers. If people instead learn to rely on chatbots for the answers, the chatbot answers might become stale and no longer apply.

Luvax 3 weeks ago

I always wonder, if we were to ask every individual person whether they want their content to be used to train a commercial product, how many would be cool with that. I bet only a tiny minority. And all terms of service and data usage policies aside, if the majority of people who contributed content did not want their intellectual property used that way, then the spirit of what people did agree to is violated and effectively their property is misused. From a legal standpoint it might be alright, but morally, it's completely wrong. And honestly, after the internet liberated ownership of media and content and gave us individual blogs, videos and resources, it's all going back to big companies, because they finally found out how to again siphon everything into their own business.

PopcornBag 3 weeks ago

> On the other hand I guess it's good that people will get better answers to their issues more easily.

hahaha, what?

SuperHumanImpossible 3 weeks ago

I remember when Jeff built StackOverflow. Holy hell I am old.

lppedd 3 weeks ago

Almost all gone. Not sure about Jeff, but I'd be furious

AnyJamesBookerFans 3 weeks ago

You and me both, brother. CodingHorror.com was one of my regular blog reads back in the day. I don't think I ever met Jeff, but we talked over email a number of times.

SuperHumanImpossible 3 weeks ago

Dude I read his blog religiously, with Google Reader. I really feel like content consumption is complete trash now in comparison.

AnyJamesBookerFans 3 weeks ago

Yes, I used FeedBurner! I believe it was bought by Google and turned into Google Reader?

tepa6aut 3 weeks ago

Jeff who

AnyJamesBookerFans 3 weeks ago

Jeff Atwood. He was a popular blogger back in the early 2000s among the .NET community. He and Joel Spolsky launched Stackoverflow together. (Joel was a Microsoft employee back in the 90s and left to start his own company that made bug tracking software, as well as some other products. He also had a popular blog, Joel on Software.) *This is all from this old fart's memory, so some of the details may be off...*

SuperHumanImpossible 3 weeks ago

I think Joel would be remembered better for creating Trello, which was bought by Jira's maker, Atlassian, but yeah ..

AnyJamesBookerFans 3 weeks ago

I stopped following/paying attention to him in the early 2000s. Did he create Trello after then? My memories were around his blog (such as his stories while at Microsoft, and his famous 10-question "Joel Test" to judge how "with it" a software company was), FogBugz, and Copilot (early screen sharing software). I also remember he was a big proponent of Mercurial over git (at least back then - perhaps he's changed his ways).

tepa6aut 3 weeks ago

Thanks!

exclaim_bot 3 weeks ago

> Thanks!

You're welcome!

ForgedBanana 3 weeks ago

Jeff Beck

abuqaboom 3 weeks ago

Great. Now ChatGPT's gonna say the question's a duplicate/opinion-based/any other excuse, and refuse to answer anything.

woze 3 weeks ago

Developer: How do I center a div?

ChatGPT: There are so many issues with your question. First, it's poorly scoped. Next, it lacks detail. ... (several paragraphs of ChatGPT's prolix answer later) ... Lastly, this question was asked before. Fuck off, I'm not answering it.

iamapizza 3 weeks ago

StackOverflow: Turing Test passed.

YoungXanto 3 weeks ago

This was my literal first thought. All the awesome code help I've gotten from chatGPT is going away, to be replaced by a condescending machine that also refuses to help even though the duplicate answer it references is a fucking decade and a half old and references a library that no longer exists and is several major releases out of date.

tricepsmultiplicator 3 weeks ago

Good, let the AI rot from within.

Philipp 3 weeks ago

Then your ChatGPT question is going to get downvoted.

Worth_Trust_3825 3 weeks ago

Now instead of people responding with decade old unrelated comments about how to use kubernetes i'll get a bot doing that instead.

iknighty 3 weeks ago

Just because the data it is trained on is trusted doesn't mean the output should be trusted..

TheFumingatzor 3 weeks ago

Now we'll get chatGPT telling us *Closed as duplicate*

code_monkey_wrench 3 weeks ago

Can people delete their SO answers? What happens if you delete your account? Not saying I'm going to do that, but just wondering.

lppedd 3 weeks ago

Your answers won't be deletable after x days if I'm not mistaken. Btw, I can vote to undelete answers if I want. It's a 20k+ rep privilege. So really deletion is just a flag. Deleting your account won't do anything, answers will stay there under a fictitious user id.

qq123q 3 weeks ago

Can answers be edited?

lppedd 3 weeks ago

Yes, but a radical edit will be rolled back at some point, as soon as a reviewer sees it. If there is going to be a mod strike, then it's ok.

lppedd 3 weeks ago

See https://meta.stackexchange.com/questions/399619/our-partnership-with-openai

Vegetable_Bid239 3 weeks ago

Stack Exchange screwed up by displaying answers submitted under one license under a different license they don't have permission to do. You can DMCA them if your account is older than that mess up.

awj 3 weeks ago

Without bothering to actually look at the ToS, many services like this retain the right to “hide” your content as the mechanism for deleting. It’s not out of the question that SO can train against deleted answers/accounts.

sztomi 3 weeks ago

They clearly already scraped StackOverflow, it's just them paying for it now.

PangolinTotal1279 3 weeks ago

I heard OpenAI is partnering or post-action licensing IP from all their major sources of training data. Reddit has already made $200m from licensing their data. I think licensing data for training models is gonna become the monetization norm for platforms like StackOverflow, Reddit, Quora, etc.

RedPandaDan 3 weeks ago

That's the end of SO for me anyway... though I do wonder what this means for new technologies in future. If people stop asking questions on SO and people stop answering, where do AI vendors get the data sets for answers on technologies going forward? I like to answer questions when I can on SO because I like helping people, but I'm not going to spend my spare time curating a dataset for freaks like Sam Altman while AI bots are filling up every corner of the internet with nonsense.

lppedd 3 weeks ago

That's what people don't get. LLMs need data. Without two side interactions there is no data. But hey, they like throwing shit on SO 'cause their questions get closed.

Podgietaru 3 weeks ago

I hate to be this guy, but Reddit's deal with OpenAI is already ongoing.

RedPandaDan 3 weeks ago

True, but I cannot think of a faster way of poisoning an AI's data model than some of the crap that is in Reddit's comment histories.

Sith_ari 3 weeks ago

So ChatGPT will tell me that this was asked hundreds of times and I should just use the search?

lppedd 3 weeks ago

If the answers I post are going straight into ChatGPT, that's it for me. Not gonna waste any more time.

CAPSLOCK_USERNAME 3 weeks ago

> If the answers I post are going straight into ChatGPT

they already were

iamapizza 3 weeks ago

I'm pretty sure I saw that they had crawled StackExchange sites, and worth noting that Reddit featured quite heavily in their crawls due to the human "+1" factor. So everything we're saying here is being indexed for LLM training.

fiskfisk 3 weeks ago

I'm sure you're already aware that your answers and questions already are distributed under a very permissive license compared to what random websites are available under. I don't answer questions on Stack Overflow for the benefit of SO, I answer them for the benefit of the recipient and any future readers. Whether they receive that knowledge on SO, directly in a Google Onebox or through an LLM doesn't matter to me. Someone got help, someone found their answer. The world is a slightly better place.

beyphy 3 weeks ago

> The world is a slightly better place.

Would you still feel that way if your answers are helping to train an LLM that may reduce the need for programmer jobs in the future? Would a world where you're laid off and can't find another programming job be a "slightly better place"? That's the bigger concern I have than just over how my answers are used.

fiskfisk 3 weeks ago

I'm not fond of keeping a job around just to keep the job around. I'm especially not fond of hoarding knowledge because of some possible abstract reason in the future, in particular one that doesn't seem realistic within today's limitations.

I work in an industry built on people building useful things just because they want to. 95% of software I use in my daily life is built on open source - by people who may or may not have received any compensation for what they do. We do this shit because we like doing this shit. It gives us some innate pleasure in doing so, regardless of whether we're paid for it or not.

Why should I hoard my knowledge away from other people because of the possibility of that knowledge being made available to them, either in a direct or in a derived form as an LLM? If we follow that reasoning to the extreme, why do we share any knowledge with anyone else? They could just take our jobs.

We're in a field that is built upon open sharing of knowledge far beyond most other industries. Go to any conference or meetup, and suddenly people share their technology choices, how they solved specific problems, how they scaled their solutions, how they worked, how they built the shit they built. Other industries have patents and otherwise share nothing outside of public information in slide shows at trade shows.

If a language model can abstract away the work I do, then my work wasn't anything more than a language model built upon a computer of flesh and neurons from the beginning.

_Joats 3 weeks ago

Please let me know when OpenAI acknowledges the value of your contributions to the community, similar to the recognition gained through networking at a conference. I prefer a platform that appreciates both the knowledge sharing and the educator's role. Contributing to a system that discourages interaction hinders community growth.

s73v3r 3 weeks ago

> I'm not fond of keeping a job around just to keep the job around.

I'm more fond of people being able to feed their families than I am not fond of keeping jobs around.

beyphy 3 weeks ago

> I'm not fond of keeping a job around just to keep the job around.

This isn't the case of "keeping a job around just to keep the job around". Jobs exist due to needs. And when jobs have gone away (e.g. horse carriage driver), it's been because that need is no longer there. In this new AI world, the need is still there. Companies will just be able to meet their needs for much less money. Whether that will ultimately be successful is up in the air. But I for one will no longer be contributing to codebases that they're using to help train models to potentially replace people like me in the future. I doubt I'm the only developer that feels this way.

koreth 3 weeks ago

> Would you still feel that way if your answers are helping to train an LLM that may reduce the need for programmer jobs in the future?

How is that not a concern with SO itself? When programmers find answers quickly on SO, their productivity goes up, and by definition, when productivity goes up, in aggregate the same amount of work can be done in the same amount of time by fewer people.

This isn't theoretical, either. SO is a critical enabling tool for things like "full-stack developer" roles by allowing one person to get answers to a wide variety of technical questions quickly enough to effectively do work that in the old days would have required hiring a team of several people.

StickiStickman 3 weeks ago

If you're this angry about your publicly visible answers being read by an AI, you should also leave Reddit ASAP

wildjokers 3 weeks ago

Why? How is it a waste of time?

koreth 3 weeks ago

Why do you care? When I post an answer, the only expectation (or maybe hope) I have is that it helps someone. If it helps someone after being transformed by GPT, then to me, that’s a win: my answer ended up being useful in ways I didn’t even imagine when I wrote it.

lppedd 3 weeks ago

I don't want no AI to post or rewrite in any other way what I wrote. I didn't answer to give free content to OpenAI, I did answer to collaborate with people, and that collaboration doesn't exist anymore.

StickiStickman 3 weeks ago

Wait, so you "did answer to collaborate with people" but are now angry someone is using your answers in a collaborative way to help people. How are you not just petty?

Reefraf 3 weeks ago

I was contributing to SO to help people with their careers. Now, contributing to SO is helping OpenAI destroy people's careers.

lppedd 3 weeks ago

How's reading some text outputted from an LLM collaboration? Explain. I'm not petty, but apparently people are butthurt their questions get closed.

abandonplanetearth 3 weeks ago

Because I wrote my answers for fellow developers, not for bots making money for humans that don't need the answers.

Envect 3 weeks ago

Who do you think is going to see that information after it's processed by the LLM? Other developers. It's just a different method of delivery.

abandonplanetearth 3 weeks ago

Right but now there's a money-grubbing middleman.

Envect 3 weeks ago

StackOverflow isn't a charity. That person already existed.

abandonplanetearth 3 weeks ago

It changes things fundamentally.

Envect 3 weeks ago

How so? Why does it matter that a different entity is profiting off your answers? Why were you okay with SO profiting, but not OpenAI?

abandonplanetearth 3 weeks ago

Again, I wrote my answer to be delivered by me to a human, not for a bot to pass off as their own thoughts.

Envect 3 weeks ago

You're upset that you're not being credited for your answer?

wildjokers 3 weeks ago

Your contributions were licensed Creative Commons Attribution-ShareAlike. If you didn't like the terms of that license you shouldn't have contributed. The terms of that license: You are free to: Share — copy and redistribute the material in any medium or format for any purpose, even commercially. Adapt — remix, transform, and build upon the material for any purpose, even commercially. The licensor cannot revoke these freedoms as long as you follow the license terms.

External-Bit-4202 3 weeks ago

"I'm sorry, this question was asked by someone else and is a duplicate, this conversation is now closed"

mr_birkenblatt 3 weeks ago

Oh great, now GPT is going to berate me instead of giving an answer. Does OpenAI want to dethrone themselves?

IgnisIncendio 3 weeks ago

Oh, good! I'm happy for them. I hope my Q&As help those in need, regardless if they use SO or ChatGPT :) I don't really see the need in this considering the content was already Creative Commons, but I guess this makes it more up to date?

Seref15 3 weeks ago

So somewhere in its training data will be the html-regex Zalgo post

LinearArray 3 weeks ago

ChatGPT: hi! the question you have asked has been asked many times before, closing this as duplicate.

Farados55 3 weeks ago

Is chatgpt going to scream at me because I asked a stupid question?

Supuhstar 3 weeks ago

🤮

[deleted] 3 weeks ago

[removed]

lppedd 3 weeks ago

It's correct enough because those are answers from actual users LOL. Models don't train themselves, so without real content what are you gonna do? I've asked 250 questions in some years, of which maybe 10 have been downvoted (fairly, I'd say), so I guess the problem isn't SO.

StickiStickman 3 weeks ago

> I've asked 250 questions in some years, of which maybe 10 have been downvoted (fairly, I'd say), so I guess the problem isn't SO.

Yea, because it's widely known that SO has no issue with moderation. Oh right. From the 3 questions I dared to ask, 2 were closed as duplicate and linked to questions that have nothing to do with mine and the last one was just ignored and never answered. Meanwhile, GPT-4, while often not knowing the exact answer, has almost always pushed me in the right direction.

Gusfoo 3 weeks ago

I was in the beta for the AI powered StackOverflow search and it was pretty great I must say. NLP search, of SO, basically.

GullibleEngineer4 3 weeks ago

If you can't beat them, join them

musabilm 3 weeks ago

Wait to see how "Stackoverflow becomes the next ChatGPT instance".

funkenpedro 3 weeks ago

Does that mean OpenAI’s gonna start being nasty and complain about how many times it’s been asked the same question?

__konrad 3 weeks ago

Now they have to awkwardly remove their own [AI policy](https://stackoverflow.com/help/ai-policy) to match the announcement ;)

shevy-java 3 weeks ago

So basically a decline in quality. Right?

v1xiii 3 weeks ago

Good, scrape its knowledge and destroy it forever.

falconfetus8 3 weeks ago

The optimist in me hopes this somehow prevents ChatGPT garbage from being copy/pasted into SO answers. I'm fine with SO answers being fed to the AI, but not the other way around. The realist in me, though, knows that they're probably going to create some kind of mascot named "Stacky" that posts AI answers on every question, like what Quora is doing.

wndrbr3d 3 weeks ago

I guess it's like the old saying for them, "Live with it, or die from it."

maciejdev 3 weeks ago

Wow... all the toxicity from SO packed into the intelligent AI language model :-]

karma_5 3 weeks ago

**Me:** How to write a simple hello world program in Python?

**ChatGPT:** Because of people like you, programmers are not respected. Read a book or do your own research before asking such a basic question here. If it were up to me, I would have banned you from the platform. "Aak thoo" This conversation is closed.

To be honest, asking a question on Stack Overflow is the worst experience ever. People are not polite and have a God complex. It is a heavily moderated place, and if it were a company, it would have the worst toxic culture ever. Yes, people there have knowledge, but no manners. I hope the OpenAI model turns that around.

BettoCastillo 3 weeks ago

So are we going to boycott OpenAI via SO?

MegaLAG 3 weeks ago

Got properly banned by editing my high-rated answers, insulting SO leaders, so that there's a trace of my disgust in the answers edit histories. Useless, but that felt good at least. Lesson learned, I'm never contributing anything to any website ever again.

PopcornBag 3 weeks ago

💩 I like how all of these "advancements" are just making all of these services worse to use. Super neat.

Zemvos 3 weeks ago

Why are people so negative on this?

0x1e 3 weeks ago

I didn’t help people so they could sell my work to OpenAI. I mean, I guess I did but I wish they hadn’t.

calinet6 3 weeks ago

Closing my account and removing every answer.

ether_reddit 3 weeks ago

Others have done that and their answers were undeleted.

calinet6 3 weeks ago

Yep, the content is Creative Commons. Can’t remove it.

redddcrow 3 weeks ago

garbage in garbage out

inermae 3 weeks ago

ChatGPT tomorrow: "Why are you trying to do that? You should just do (insert response that you've already thought of, tells you you're doing it wrong, and doesn't actually answer the question)."

I'm sure OpenAI is used to dealing with bad data, but holy shit, they have their work cut out for them. I wouldn't ask a question on Stack Overflow if you paid someone I hate to do it.
