Also, I figure (if it hasn’t happened already) some federated instances out there are nefarious, set up to harvest data.
[Citations needed] or it didn’t happen. There’s precious little extra information that a “nefarious” instance can harvest that any basic web scraper can’t.
This is such a bullshit challenge. I often see it used to essentially bully someone into a side issue about citations. It’s a great way to avoid discussing the original issue.
I have knowledge (that I rarely share) that I am absolutely not going to cite, because I’m not jeopardising sources or clearances, or violating my obligations under the Official Secrets Act, just to play someone’s status games.
If someone makes a claim, I am perfectly able to go find the relevant citations myself, if there are any. I am more interested in the structure and content of what they’re adding to the discussion.
You may well have, but that’s not what I’m doing. I’m familiar with ActivityPub’s & Lemmy’s APIs, and I’m calling bullshit on OP’s hyperbolic claim without evidence or elaboration.
So, from your knowledge of those APIs, this isn’t possible? I don’t need to develop a defensive protocol for it? I like to be comprehensive, especially with a potential (ideological and propaganda, if not literal) invasion from the new fascist state to my south, but if this is a low-level probability, I can put it way down my priority list.
If privacy is what you’re looking for, ActivityPub is never going to provide it, because it wasn’t designed for it and privacy can’t be retrofitted onto it. You should log off and use (or create) something else altogether.
I think this mindset is naïve and unrealistic.
People were saying the same thing for decades in response to a small minority warning about government surveillance, often dismissing them with labels like “paranoid”. Eventually, Snowden came along and produced the citations, at extreme risk to himself and his loved ones. It’s an anomaly that they were ever revealed at all.
History is replete with examples of bad stuff going on for ages before irrefutable evidence of it became widely known. In general, if something can be abused to someone’s advantage, it will be, and likely already is.
You have a point there, but consider also that effective web scraping uses significantly more resources than having the data you want handed to you. Monitoring Lemmy through federation would be much more efficient.
Can’t an instance also collect IP addresses and device info, if its owner adds some scripts to its web version?
An instance owner can only collect the IP addresses/browser fingerprints of users logged in to their instance. In other words, only slrpnk.net could collect that information about you, because you are only directly connecting to slrpnk.net.
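To make that concrete, here is a rough sketch of what a remote instance gets to read when a comment is federated to it. The payload is made up for illustration (the `home.example` URLs, the text, and the timestamp are all assumptions); only its general shape follows an ActivityStreams `Create` activity. The helper `federated_view` and the `CONNECTION_METADATA` set are hypothetical names, not part of Lemmy or any library. The delivery request itself comes from the home server, so the remote side would see that server's address rather than the author's.

```python
import json

# Illustrative payload only: the shape follows an ActivityStreams "Create"
# activity, but the URLs, text, and timestamp are invented for this sketch.
SAMPLE_ACTIVITY = json.loads("""
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Create",
  "actor": "https://home.example/u/alice",
  "to": ["https://www.w3.org/ns/activitystreams#Public"],
  "object": {
    "type": "Note",
    "id": "https://home.example/comment/1",
    "attributedTo": "https://home.example/u/alice",
    "content": "Hello, fediverse",
    "published": "2025-01-01T00:00:00Z"
  }
}
""")

# Connection-level details that only the server a user connects to directly
# can observe (hypothetical field names, chosen for this example).
CONNECTION_METADATA = {"ip_address", "user_agent"}


def federated_view(activity: dict) -> dict:
    """Return the fields a remote instance could read from a delivered activity."""
    obj = activity.get("object", {})
    return {
        "actor": activity.get("actor"),
        "object_id": obj.get("id"),
        "content": obj.get("content"),
        "published": obj.get("published"),
    }


view = federated_view(SAMPLE_ACTIVITY)
# Check whether any connection-level field shows up in the federated data.
leaked = CONNECTION_METADATA & set(view)

print(view)
print("connection metadata present:", sorted(leaked))
```

The federated copy carries the public content and the author's profile URL, which is roughly what a scraper could read off the web page anyway. It says nothing about the author's network connection.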
Good point
Credit where due, it is just my best guess. I have no evidence.
I simply think if you have custom code on a machine to ingest data, creating a federation interface may be more suitable and stable in the long run than a scraper. A scraper’s extra server load may draw attention or run afoul of security policies designed to block scrapers.
But that is certainly an option.