TL;DR (this is a lengthy post, but stay with us until the end: as a lawyer, I am not allowed to be brief):
We are, unfortunately, seeing more and more commercial entities collecting public data, including Reddit content, in bulk with no regard for user rights or privacy. We believe in preserving public access to Reddit content, but in distributing Reddit content, we need to work with trusted partners that will agree in writing to reasonable protections for redditors. They should respect user decisions to delete their content as well as anything Reddit removes for violating our Content Policy, and they cannot abuse their access by using Reddit content to identify or surveil users.
In line with this, and to be more transparent about how we protect data on Reddit, today we published our Public Content Policy, which outlines how we manage access to public content on our platform at scale.
At the same time, we continue to believe in supporting public access to Reddit content for researchers and those who believe in responsible non-commercial use of public data. This is why we’re building new tools for researchers and introducing a new subreddit, r/reddit4researchers. Our goal is for this sub to evolve into a place to better support researchers and academics and improve their access to Reddit data.
Hi, redditors - I’m u/Traceroo, Reddit’s Chief Legal Officer, and today I’m sharing more about how we protect content on Reddit.
Our Public Content Policy
Reddit is an inherently public platform, and we want to keep it that way. Although we’ve shared our POV before, we’re publishing this policy to give you all (whether you are a redditor, moderator, researcher, or developer) a better sense of how we think about access to public content and the protections that should exist for users against misuse of public content.
This is distinct from our Privacy Policy, which covers how we handle the minimal private/personal information users provide to us (such as email). It’s not our Content Policy, which sets out our rules for what content and behavior is allowed on the platform.
What we consider public content on Reddit
Public content includes all of the content – like posts and comments, usernames and profiles, public karma scores, etc. (for a longer list, you can check out our public API) – that Reddit distributes and makes publicly available to redditors, visitors who use the service, and developers. To be extra clear, it doesn’t include stuff we don’t make public, such as private messages or mod mail, or non-public account information, such as email address, browsing history, IP address, etc. (this is stuff we don’t and would never license or distribute, because we believe Privacy is a Right).
Preventing the misuse and abuse of public content
Unfortunately, we see more and more commercial entities using unauthorized access or misusing authorized access to collect public data in bulk, including Reddit public content. Worse, these entities perceive they have no limitation on their usage of that data, and they do so with no regard for user rights or privacy, ignoring reasonable legal, safety, and user removal requests. While we will continue our efforts to block known bad actors, we can’t continue to assume good intentions. We need to do more to restrict access to Reddit public content at scale to trusted actors who have agreed to abide by our policies. But we also need to continue to ensure that users, mods, researchers, and other good-faith, non-commercial actors have access.
The policy, at-a-glance
Our policy outlines the information partners can access via any public-content licensing agreements. It also outlines the commitments we make to users about usage of this content, explaining how:
- We require our partners to uphold the privacy of redditors and their communities. This includes respecting users’ decisions to delete their content and any content we remove for violating our Content Policy.
- Partners are not allowed to use content to identify individuals or their personal information, including for ad targeting purposes.
- Partners cannot use Reddit content to spam or harass redditors.
- Partners are not allowed to use Reddit content to conduct background checks, facial recognition, government surveillance, or help law enforcement do any of the above.
- Partners cannot access public content that includes adult media.
- And, as always, we don’t sell the personal information of redditors.
What’s a policy without enforcement?
Anyone accessing Reddit content must abide by our policies, and we are selective about who we work with and trust with large-scale access to Reddit content. We will block access to those that don’t agree to our policies, and we will continue to enhance our capabilities to hunt down and catch bad actors. We don’t want to but, if necessary, we’ll also take legal action.
What changes for me as a user?
Nothing changes for redditors. You can continue using Reddit logged in, logged out, on mobile, etc.
What do users get out of these agreements?
Users get protections against misuse of public content. Also, commercial agreements allow us to invest more in making Reddit better as a platform and product.
Who can access public content on Reddit?
In addition to those we have agreements with, Reddit Data API access remains free for non-commercial researchers and academics under our published usage threshold. It also remains accessible for organizations like the Internet Archive.
Reddit for Research
It’s important to us that we continue to preserve public access to Reddit content for researchers and those who believe in responsible non-commercial use of public data. We believe in and recognize the value that public Reddit content provides to researchers and academics. Academics contribute meaningful and important research that helps shape our understanding of how people interact online. To continue studying the impacts of how behavioral patterns evolve online, access to public data is essential.
That’s why we’re building tools and an environment to help researchers access Reddit content. If you're an academic or researcher, and interested in learning more, head over to r/reddit4researchers and check out u/KeyserSosa’s first post.
Thank you to the users and mods who gave us feedback in developing this Public Content Policy, including u/abrownn, u/AkaashMaharaj, u/Full_Stall_Indicator, u/Georgy_K_Zhukov, u/Khyta, u/Kindapuffy, u/lil_spazjoekp, u/Pedantichrist, u/shiruken, u/SQLwitch, and u/yellowmix, among others.
EDIT: Formatting and fighting markdown.
So if I am reading this right, reddit will still bundle and sell bulk user data, but there will at least be some privacy restrictions and respect for EU and California privacy laws. What is changing is that random groups that may or may not care about all of the laws will not be allowed to scrape and sell Reddit data.
I am glad that researchers will still be supported though. There actually is valid research that is done, and supporting that is valuable.
Of course, reddit bulk user data will only be valuable for another year or two, and then chatgpt bots will have so thoroughly polluted it that it becomes more or less worthless.
The ultimate question is what will Reddit, Inc. do about these non-partner groups that are violating the policy? Should we expect Reddit to start filing lawsuits?
For those who we find are violating the privacy of redditors, we have a number of different ways to respond. Our options range from asking you nicely to knock it off to more aggressive actions. It’s always great when the former works promptly.
Ah, the "speak softly and carry a big stick" strategy.
Are there any plans to inform users about such violations? Might be nice to know who's not playing by the rules.
This has been an issue since day one. Anyone can be banned from the platform under the guise of, let's say, "hate", but reddit doesn't provide a clear definition of what it is or how it violates policy. So it's just more chiefs carrying big sticks with a "shut up or I'm banning you" attitude.
...why can't any of the people down voting me provide an explanation instead of just hating all the time? A little transparency instead of carving out loopholes for yourself would be nice.
A downvote is not hate.
I did not vote.
Maybe ask yourself whether your previous comment oversimplified and/or overgeneralised things.
[deleted]
Quite right. There is a lack of transparency.
I'd much prefer to be able to read what was banned, because I want to. Reddit denies us this by removing content, including nullifying real people (I don't mind spam and bots getting removed, but I do mind censorship done against real people - whatever happened to free speech, anyway? Why do laws in real life protect free speech but on reddit this is all ignored?).
Because this subreddit in particular is full of moderators who are on that same very power trip.
That makes sense. I remember when warnings and explanations of those violations were the norm. Now there's no explanation, no warning, just "you're banned". There needs to be more transparency about these violations so we know how to proceed. Not knowing whether something violates a "policy" leaves people self-censoring much of the time out of fear. I just wish it would stop, because it's nonsensical. It's more tyrannical by the day: many people are leaving, AI is taking over, and Reddit as a whole is going down the toilet of Big Tech authoritarianism. That is another reason why many have left these platforms for alt platforms where there is greater transparency and greater freedom. What is your take on an "internet bill of rights"?
[deleted]
Really? Then why is rumbl3 dot com a blocked domain?
Perhaps alternative platforms will eventually rise up to the challenge.
I'd love to have a good alternative; the censorship on reddit kills everything.
I had an account (another one) for many years, I think since 2010 or even before that. The changed policies ruined a large part of reddit.
[deleted]
Ah, so reddit also uses AI to auto-ban? That may explain why they got so much more aggressive in general in the last ~15 months or so.
Hey I need your help can u inbox me
I need help please inbox me
N O T H I N G
I'm trying to figure out how to become a trusted user so that I'm allowed to post.
I think the doj should investigate reddit
Can I opt out of my personal stories and conversations on Reddit being sold to AI chatbot developers?
The silence is your answer.
Can you opt out of speaking out in a public space, and having other people present hear and remember what you said and then make something out of that (i.e. adopting an opinion, using it as a source of information, or being inspired by it)?
I fully agree with your sentiment on any kind of conversation that is supposed to occur in a private space (i.e. DMs), but subreddits are pretty much themed open forums. Think a theme cafe or a clubhouse. You cannot expect to have full privacy control over your words after they have left your mouth in a public space, and neither should you expect the same from a public site such as reddit.
The fact that anything written on the internet is digitally available in potential perpetuity doesn't change that initial premise.
There's a difference between the fact that public statements are obviously accessible to everyone and the fact that reddit intends to sell all of our conversations to AI chatbot developers.
If the chatbot developers were continuing to simply scrape publicly available data from a publicly available API like in the old days, that would be one thing, but the idea of my conversations being specifically sold to AI chatbot developers for profit makes me feel icky.
And that's where your analogy doesn't really hold up. It'd be more like you speaking out in a public space and someone else recording it and selling the video of you for profit.
Hmmm, that's a good point. I don't see a reason to complain about the general public getting access to whatever I say in public, but when a 3rd party specifically gets control over which of my public remarks are available to whom, profiting off of selling exclusive rights to something that should be innately public, we can agree that's an issue.
Thanks for correcting my analogy, I indeed didn't consider the "sold to" detail well enough.
The comparison falls flat because reddit censors discussions at will, whereas speaking in a public space is protected, e.g. by the US constitution as such. I think logically the US constitution needs to extend onto reddit too - otherwise the constant censorship will continue to be rampant here.
It is not. Speech, in public spaces or not, is only ever protected from infringement by the government. It neither applies to any other actor (such as a person or company), nor does it even mention privacy, and thus does not apply to the context of the discussion to begin with.
This is nice in theory, but let's say we have an AI being trained on reddit users' data - which we do. Our comments and content are part of that dataset. We've seen that AI models can be used to output their training data in many cases, because they encode a lot of that training data inherently in the model.
So with this in mind:
If I delete content from my reddit account, are you saying that these companies will be forced to delete that content from their training data, and retrain their models?
Similarly, if I train an AI model on reddit content, and that AI model is then put into the public for other people to use, someone might ask it "Does the reddit user /u/james20k have any questionable information in their background I should know before I hire them for a job?". That AI model will have been trained on a dataset that contains a significant amount of information on me, and it will have an answer
Does the no-background-checks etc encompass a commitment to prevent partnered large language models trained on reddit being used by downstream third parties for these purposes, or does it only encompass the immediate third parties themselves using it directly for these purposes?
You make a good point here. The privacy principles in this policy sound great, but as soon as LLMs get involved, those principles fall somewhere between impractical and impossible to follow in practice. My guess is that they'll have to stretch the meaning of the policy to create loopholes for AI, or else just play dumb and ignore the ramifications of AI entirely.
More importantly, will Reddit honor the spirit of deletion or will they pull a Stack Overflow: https://arstechnica.com/information-technology/2024/05/stack-overflow-users-sabotage-their-posts-after-openai-deal/
I don't want my content in Reddit's LLM models. Do I have any way of preventing that?
Can you explain what's meant when you say partners have to respect user decisions to delete their content? Like, suppose they've bulk downloaded a bunch of info containing my posts. How would they ever know if I deleted my Reddit account later?
For those that do legitimate bulk download of Reddit content, we provide a compliance API that notifies them when content is deleted by users. See https://support.reddithelp.com/hc/en-us/articles/26417433892756-Do-Reddit-s-data-licensees-have-to-stop-using-data-deleted-from-Reddit.
They have an API that tells legitimate partners "pretty pretty please delete these things" and that's it.
Hi traceroo,
First off, I just want to say how happy I am to see a public data policy, particularly one that forefronts user privacy (unlike some other platforms *cough cough*). I know this is something you all have been thinking about for a while, but given that one of Reddit's key assets right now is its data, making those internal policies and values public is even more important now than ever.
I have a couple of questions about details:
Thanks SarahAGilbert! Great questions.
As to (1), this is another reason we want to understand what third parties are doing with publicly-accessible content. Removed content can be particularly useful in helping create powerful tools for moderation teams. But there are nuances here that those with experience moderating communities would appreciate, and it is still paramount that the developer respect the privacy expectations of redditors.
As to (2), that is definitely something we are pondering. We prefer convincing third parties that our policies make sense, but sometimes conversation is not enough unfortunately.
Thanks for your response!
So if I'm understanding correctly, moderated data is currently being treated as public data, but it's something you're working with mods on? That's great!
For 2, I'm glad to hear you're considering it! I've done some related research showing that awareness helps people feel more comfortable and less concerned when their data is reused, so I think it's also important to share who the licensees are, not just the ones who've violated the policy. The results of the same paper show that context matters to people, including who is using the data (and what data is used, and for what purpose). So that added level of awareness and transparency would help people make more informed decisions about their participation on Reddit, which I know y'all care about.
Yet as a lawyer, you are allowed to prepare briefs. Ironic, no?
On a more serious note, thanks for keeping us updated on Reddit's efforts to protect our privacy.
Tools to access deleted posts are crucial to modding. Banning such tools will cripple us.
FWIW, Reddit's CTO said in the other thread that Pushshift will not be impacted by this policy.
What about pullpush?
PullPush has never operated in accordance with the Data API terms of service and was sent a cease and desist order months ago for their repeated violations. After seeing the owner/operator's behavior here on Reddit and screenshots from their Discord, I would not touch that service with a ten foot pole, particularly as a moderator of a reputable subreddit.
Good to know, some of these so-called "services" need a major audit.
OK, thanks.
Pullpush is (IMNSHO) a major motivation for the adoption of this policy. It is maliciously operated.
We totally understand, and we are working on approaches that protect redditors’ privacy while allowing the proper investigation of bad actors.
It's hard to have much faith when the pattern of "ban/disable something, promise a replacement, radio silence for 5 years" keeps happening over and over.
I'm going to hazard a guess you're not going to get a reply.
So I mean, at least they're predictable in lying through their teeth.
Yeah right. Reddit removed a shitton of moderator tools when you guys basically banned 3rd party applications from accessing the API while at the same time providing absolutely no beneficial alternatives. Please don't pretend you guys give any shred of a crap for what moderators do. You'd imagine that the lies you guys seep through your teeth would've worn down the enamel in your fake smiles by now.
I'm sure its propagated by the massive amounts of corroded ear wax of issues that seemingly they only want to hear, while finding other ways to discriminate others without saying they are doing it.
*unless they pay us $60 million. Then they can have all of it.
[deleted]
TIL! Also, username checks out.
When you say personal information here, what exactly qualifies? Are you aligned with GDPR's definition of personal information, CCPA/CPRA's, or is this section referring only to PII?
Are there limits on how partners use anonymized personal information that they collect from Reddit? For example, could Google construct a machine learning model that uses my Reddit personal information to conclude that "BIPOC men like plastic robots" without identifying me personally?
If Google then independently identifies me as a BIPOC man using its own data collection and targets ads to me accordingly outside of the Reddit platform, is this a violation of the policy?
Can you expand a bit on what this might look like?
u/traceroo two questions.
How can I determine when content is deleted without re-accessing it from the API each time? I'm fairly sure your commercial partners have access to a feed of deleted object IDs to remove from their data set, but that's not available to the rest of us.
If content is public on reddit, does that mean we can keep using it even if the author doesn't want us to (outside things like copyright)?
Re: #1, they have access to the Firehose API which, as you said, includes a feed of deleted object IDs. It's been very unclear how everyone else, including Devvit app developers, is supposed to operate without access to it.
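For anyone wondering what consuming a deleted-ID feed could look like on the receiving end, here is a minimal sketch. Everything in it is an assumption for illustration: the `apply_deletions` helper, the shape of the feed (a plain list of object IDs), and the sample IDs are hypothetical and do not reflect the actual Firehose or compliance API.

```python
from typing import Dict, Iterable, List


def apply_deletions(store: Dict[str, dict], deleted_ids: Iterable[str]) -> List[str]:
    """Remove every stored object whose ID appears in the deletion feed.

    Returns the IDs that were actually purged from the local store.
    """
    purged = []
    for object_id in deleted_ids:
        # pop() with a default tolerates IDs that were never downloaded.
        if store.pop(object_id, None) is not None:
            purged.append(object_id)
    return purged


# Hypothetical local dataset keyed by object ID.
local_store = {
    "t1_abc123": {"body": "a comment"},
    "t3_def456": {"title": "a post"},
    "t1_ghi789": {"body": "another comment"},
}

# Hypothetical batch received from a deletion feed.
purged = apply_deletions(local_store, ["t1_abc123", "t1_zzz999"])
print(purged)
print(sorted(local_store))
```

Without a feed like this, the only option is the one described above: re-fetching each object and checking whether it still exists.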
Is anything happening with the "allow my data to be used for research purposes" preference? It still shows up in preferences (at least on old.reddit), but it doesn't seem to have any effect on this
Thanks for including us in the process!
Thanks for taking the time to discuss it with us!
Thanks for working to protect Redditors and for seeking out user/mod feedback as part of the process!
If you’re reading this and are interested in giving Reddit feedback on various aspects of the platform, consider joining one of Reddit’s collaborative programs. Check out the User Feedback Collective and the Mod Council. 🎉
Edit: fixed a typo
Thanks for the shoutout of these great programs! We’re always looking to source and incorporate candid, constructive feedback from redditors.
If only the last decade didn't show Reddit's pattern of mostly ignoring user feedback.
Unless you've got a time machine, you can either comment on the spilled milk, or be happy that it's not getting spilled (as much), you do you.
If you do have a time machine, I'd like to borrow it!
What are you talking about? Reddit did listen to user and mod feedback
Remind me again how much the API costs for moderator tools to access it? I remember there being a massive thing that happened last year when reddit decided to ban critical moderator tools through price gouging the API to an absurd level. :)
They got rid of gold and awards, which if they had solicited feedback for, would have been met with feedback telling them not to given they're rolling back some of the changes.
New reddit exists, despite being told that it was bad at all stages, and now there's new new reddit to fix it.
Their apps suck ass, and they don't take any feedback to fix it. Just look at the state of r/beta, sorry, r/redditmobile, sorry, it's r/bugs where they want bugs to be reported so they can ignore them all in one place.
This comment from 3 months ago about multiple flairs, to which the admin replied "You may be surprised to hear this, but we haven’t seen/heard mods request the ability to have multiple flairs on a post much before". And another user brought receipts dating back literally 10 years of this exact request.
Third Party Apps feedback (need I say more)
[deleted]
Because new new reddit is even worse.
Worse in what points? I find it better in speed and the new mod queue is really handy
You can't see who posted something without clicking through. You can't go directly to the image/article without an extra stop at the comment section. That's just the first two that leapt out.
That has already been the case on new.reddit. It also only happens on the home feed. When you go to the subreddit page and browse there, you can see the username.
Just click on the image and you'll be directly on the image. No stop at the comment section.
Just click on the full link or the article thumbnail and you'll go directly to the article. Also no stop at the comment section.
That doesn't make it better.
I see what you mean about tapping on the thumbnail, but you can't do that for text posts. Also it's still loading in a new page - I don't want that when I'm browsing the feed. I want it opening inline so I don't have to use back and re-scroll, or fuck about with tabs.
Like this https://imgbox.com/RXBL2siS
Edit: just found another one. When you edit a comment, it loses all the line spacing https://imgbox.com/hT3HknFP
YMMV. https://sh.reddit.com/comments/1co0xnu/-/l3f1buc/, for example, is:
Let's not hijack this post :)
Where best to discuss?
TIA
How's that CSS support going?
It does exist for old.reddit, and I don't think that custom CSS would be supported on sh.reddit or new.reddit. Custom CSS per subreddit makes the UX inconsistent and takes away from the corporate identity that Reddit tries to establish.
So they lied when they promised CSS support would be coming to new reddit then?
When and where did they say that?
You new here?
Downvoted for rudeness in response to a polite question.
I was not previously aware. Rudeness in this situation is a massive turn-off.
/u/miowiamagrapegod enlightens with facts without belittling people.
Lol.
It's literally in the subreddit design ui
https://www.reddit.com/r/modnews/comments/6auyq9/reddit_is_procss/
Thanks for the invite to the roundtable discussion!
I'm curious what implications this might have for this new policy? It looks like the judge is ruling that:
The conflict seems to arise from trying to claim both Section 230 safe harbor protections and ownership and exclusive control of platform content:
So, how does this affect reddit? It seems to me like the judge is saying that platforms don't get to charge for access to public data without losing access to certain legal protections. Here's the judge's order, for anyone who's interested.
Thanks for involving us in the process! Are there any plans to improve the "make your content non-public" process? Right now it's extremely tedious to bulk delete posts and comments on accounts with extensive histories. Many users have to rely upon (and trust) third-party scripts or websites. Would Reddit ever consider implementing an automatic content deletion setting in the user profile similar to that offered on Mastodon?
Why won't certain moderators tell me how I violated the rules? They haven't given any explanation and when I ask for one they mute me for a week without answering. I did not point out any race or ethnicity when I said I'd experienced begging in developing countries and that got me banned without warning. I thought moderators were required to give some sort of rationale. This is not a report on a specific subreddit.
WHAT IS KARMA AND HOW DO I GET IT i’m new and barely on this app but want to ask questions in certain groups BUT THEY WONT LET ME BC I DONT HAVE ENOUGH “KARMA” HELP😭😂
You just have to post/comment in communities that don't have karma requirements, you get karma from upvotes on your posts or comments.
So if Reddit says they "own" the content produced by their users on this site (by choosing who can and cannot view it), isn't that a violation of Section 230 and Reddit is giving up their safe harbor protections? Because that's what a judge says.
I don't see how this policy is legal given the above. Either I own the copyright to my comments as a third-party (at which point Reddit cannot deny access to others, as they do not control my copyright), or by me posting here Reddit takes the copyright of my comment and in turn loses Section 230 privileges.
That’s a district court ruling from a single judge, it’s not binding precedent. Also that decision was literally issued yesterday, it’s going to be a long time before the legal question raised is settled
Well ain’t that something. Will Reddit do anything about it until they get sued? Absolutely not.
"We are, unfortunately, seeing more and more commercial entities collecting public data,"
You mean like YOU? So you can sell it to google without notifying anyone?
OpenAI strikes Reddit deal to train its AI on your posts
https://www.theverge.com/2024/5/16/24158529/reddit-openai-chatgpt-api-access-advertising
you're welcome, steve.
it's fine though you can hand my ip to the quantum computer to keep it safe and use all my personality data to assemble legally "entitled" bot clones it's "totally" cool bro "entities" aren't doesn't even
bt if i buy reddit gold
Racism seems alive and well in the r/combatfootage sub and the moderators allow it. How do I take the next step?
Hello, while I understand the posts made are public and anyone can view them, are there any efforts being made to prevent YouTube creators from using a member's post to create videos that they subsequently earn revenue on?
I have been getting calls for months I wouldn't answer if I didn't know them and they would have me click on a business like alcohol treatment centers near by I still have the text on my phone the master card I used was blocked already I had thru Merrick Bank and I filed a police report and called the trade commission I had thought they worked there because I talk to the same person almost Everytime I got a different lady Saturday and she swore a card had been mailed and I told her I had faxed the information along with the police report 345 dollars at a restaurant called Family's First Gourmet in Louisville Tn 37777 no such thing here in this town. I should get a card this week. I still had. my bank card on lock out of the blue I got a card from NetSpend I was trying to freeze Trans union and Experian. and I couldn't do it online the 1800 number was not taking calls. They took my contacts out I didn't know my mom's email or my sister's. My husband is littlesralphm@gmail.com.
We buy prepaid cards for streaming and for the last six months someone has been using our card and watching Peacock we have called the 1800 number to do a dispute and we thought we did and when we call back they say no one has done one. Is there something we can put on our wifi? We had a doorbell camera and it would get hacked into the cars would be black as the car passed our house. I cut that off if you look at my maps they have given me or showed me the houses that were hacking. I'm going to get my doorbell camera cut back on my house as the only one blurred out on Wheeler Rd . How can they hide movies on Netflix?? This has been going on for a minute I was stupid to think I had that many ads. My Facebook account has been hacked so much I have about 8 accounts someone takes them down I would put another one up. I think that person had a twitter account. Merrick Bank sent me dispute papers when they said it was a card
wew
What avenues are there to stop commercial entities from taking screenshots of Redditors' content and using it on other platforms without permission, possibly exposing someone to harm by doing so? In this instance I am specifically referring to news channels.
When it comes to the piracy lawsuit against you, since the plaintiffs argue they only want to prove posts were made on a specific ISP, could your attorneys try to reach a compromise where you share only the subnet part of the address?
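To make "sharing only the subnet part" concrete, here is a small sketch using Python's standard `ipaddress` module. The `to_subnet` helper and the prefix lengths (/24 for IPv4, /48 for IPv6) are assumptions chosen for illustration; real ISP allocations vary, and this says nothing about what Reddit stores or what a court would accept.

```python
import ipaddress


def to_subnet(address: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Return the enclosing network of an address, dropping the host bits."""
    ip = ipaddress.ip_address(address)
    # Prefix lengths are illustrative assumptions, not ISP facts.
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False accepts a host address and yields its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network)


# Addresses from the reserved documentation ranges.
print(to_subnet("203.0.113.77"))
print(to_subnet("2001:db8:1234:5678::1"))
```

The shorter the prefix, the more subscribers share the same truncated value, so the choice of prefix length is what determines how much privacy the compromise would actually preserve.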
What is the problem? it is public.
What I would really want is to change this name. Hot-Cocroach? Really?
Reddit reports are a joke. Actually the site promoted misinformation
Ban me from the fucking app and delete my account FFS! I’m sick of the bullying and bullshit here and the Reddit autobots who support based on karma scores.
I am vehemently against every move that fragments the world wide web and turns it into a private version for, e.g., [insert huge mega-mega-corporation here].
I just found Reddit recently. What do people think of this article, and what can be done to keep Reddit's wonderful ecosystem alive? https://www.joanwestenberg.com/reddits-anti-protest-policy-exposed/
I've been quite keen on the whole harassment filter topic since it was introduced to moderators in March this year. From what I've been able to see, some have been very happy, while some couldn't handle the false positive rate and basically saw half their community's posts being flagged. Is it visible from the normal user view if a subreddit is using the harassment filter to moderate its content?
[removed]