Just an explorer in the threadiverse.

  • 15 Posts
  • 140 Comments
Joined 1 year ago
Cake day: June 4th, 2023



  • Like helping to find a bug, discussing about how to setup an application for a certain use case or anything like that? Answering questions on Stack overflow is an example but is that the best way?

    Generally the best way to help out is to do a thing that’s needed and that you can figure out how to do. Your list includes a bunch of good options, and I’ve been thanked for doing all those things at one point or another. Some common growth paths include:

    1. Using the software
    2. Encountering bugs, problems, or small opportunities for improvement.
    3. Discussing those informally in forums and helping people find workarounds.
    4. Identifying some of those issues as common things other people experience as well, then filing bugs for them with clear explanations and links to related forum discussions.
    5. Reading source code to better understand bugs.
    6. Discussing potential fixes in developer bug threads (or in GitHub or whatever).
    7. Submitting small fixes for simple bugs as pull requests.

    Another path might be:

    1. Using the software and reading forums/docs for help.
    2. Answering basic questions on forums, looking to old threads and relevant docs.
    3. Learning about common questions.
    4. Writing blogs or forum posts about common questions.
    5. Submitting improvements to official docs to clarify common areas of confusion.

    There are other paths as well. The main thing is to use a thing so you learn about it, and then use that knowledge to make it a little easier for the next person. Good luck!






  • You misunderstand what the Hot rank is doing. It’s not balancing newness vs hotness, it’s scaling hotness according to community size. This might feel like newness if you’re focused on vote counts as a proxy for post age, but it’s a different approach. See https://github.com/LemmyNet/lemmy/issues/3622 for details.

    There are a couple of ways to think about this:

    1. There are a handful of Lemmy communities that are just WAY more active than everything else. The main feeds are kind of lame if you have to scroll past 300 posts to find anything other than a shit post from the same 3 communities. Scaled Hot rank shows a greater variety of communities by making it easier for small communities to get ranked hotly.
    2. Or you can consider Hotness to be a rough measure of what percentage of people who have seen the post interacted with it. A post with 500 upvotes in a community with 10,000 active users is kind of popular, but only 5% of the people likely to have scrolled past it cared about it. A post with 50 upvotes in a community with 200 active members is much MORE popular relatively (25%) even though the absolute numbers are smaller.
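
    To make the arithmetic in the second point concrete, here is a minimal sketch. It is NOT Lemmy's actual ranking formula (see the linked issue for that); `relative_engagement` is a hypothetical helper that just divides upvotes by active users, which is the rough intuition described above.

```python
def relative_engagement(upvotes: int, active_users: int) -> float:
    """Fraction of a community's active users who upvoted a post.

    Hypothetical illustration only: this is a plain ratio, not the
    scaling Lemmy applies in its Hot rank.
    """
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return upvotes / active_users


# The two posts from the example above.
big_community = relative_engagement(500, 10_000)  # 500 of 10,000 users
small_community = relative_engagement(50, 200)    # 50 of 200 users

print(f"big community:   {big_community:.0%}")
print(f"small community: {small_community:.0%}")
```

    By this measure the small-community post wins even though it has a tenth of the upvotes, which is the kind of outcome a community-size-scaled rank is meant to allow.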

    At any rate, this preference toward smaller communities in hot is a recent change and deliberate. While they might further tweak the scaling factors, I wouldn’t expect it to be drastically different. It sounds to me like what you want is Top, Active, or Most Comments. All these are unscaled according to community size and will get you top posts by their absolute metric rather than posts that are doing well relative to their community size.


  • PriorProject@lemmy.world to Selfhosted@lemmy.world: WoL through Wireguard (1 year ago)

    This is a very strong explanation of what’s going on. And as a follow-up, I believe that ZeroTier presents a single Ethernet broadcast domain, and so WoL tricks are more likely to work naturally there than with Wireguard. I haven’t used ZeroTier, and I do use Wireguard via Tailscale/Headscale. I’ve never missed the Ethernet features of ZeroTier and they CAN result in a very chatty WAN if you’re not careful. But I think ZT would make this straightforward.

    Though as other people note… the simplest/least-disruptive change is probably to expose some scripty thing on the rpi that can be triggered over a routed protocol and then have the rpi emit the Ethernet broadcast packets from the physical network.
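
    As a rough sketch of the "scripty thing" idea, something like the following could run on the rpi. It builds a standard WoL magic packet (6 bytes of 0xFF followed by the target MAC repeated 16 times) and sends it as a UDP broadcast on the local network. The function names, the default broadcast address, and port 9 are my assumptions for illustration; how you trigger it remotely (SSH over the tunnel, a tiny web endpoint, etc.) is up to you.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return a Wake-on-LAN magic packet for a MAC like 'aa:bb:cc:dd:ee:ff'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Magic packet layout: 6 x 0xFF, then the MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet on the local network (assumed defaults)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcast sends must be explicitly enabled on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


# Example (run on the rpi, with the sleeping machine's real MAC):
# send_magic_packet("aa:bb:cc:dd:ee:ff")
```

    The point of running it on the rpi is that the broadcast originates on the physical LAN, so nothing has to cross the routed Wireguard tunnel except whatever you use to kick off the script.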


  • I don’t think titles directly transfer between companies, and yet the industry allows it. It’s a very useful tool for advancement.

    This may be true in some corners of the industry, but at the more competitive end (both in terms of competitive pay, and a competitive pool of candidates)… I believe it’s common to relevel on hire. I’ve seen folks go from director to senior and from senior to junior at my org. The candidates being offered those seemingly big “demotions” often seem to be somewhere between unfazed and enthusiastic about the change. Presumably that’s because the compensation package we offer at the lower level beats what they were getting with an inflated title, because they know their inflated title is nonsense, and because they’re frustrated with the other aspects of organizational dysfunction that accompany title inflation at their current company.

    What you say is real, and sometimes a promotion in one org can help bridge you into an org that would have been hard to get hired into as a junior, or harder to get promoted in. It’s not without risk though. All things being equal, I’d much rather spend my time working on a strong team and learning a lot and being challenged than to be in a weaker org that’s handing out inflated titles. Getting gud isn’t a guarantee of advancement, but it’s at least as reliable over the long haul as title inflation.



  • I dunno how to hotlink, but if you scroll to the active users graph at https://fedidb.org/software/lemmy you can see there’s been like a 25% dropoff in active users since the peak in July. Lemmy has still grown 50x since May, and it’s much MUCH more active than it was then. But we’ve definitely crested a peak and not everyone who gave Lemmy a shot then is sticking around on a monthly basis.

    This isn’t necessarily bad. Lemmy is still young and has many rough edges, so it wasn’t realistic to win all the users that tried it on ease-of-use in a head to head with reddit. And Mastodon has had multiple growth waves interspersed with periods of declining usage, but across those spikes it has grown, or at least remained stable, overall. Early-stage commercial social media have big ups and downs in engagement and growth as well, and just like lemmy those ups and downs are often externally driven… when competitors mess up, when a big global news story hits, when a major sporting event happens… these can all be catalysts for one-time growth. It’s not a straight line.

    Time will tell what user level we stabilize at in the short-term and what events spur new growth, but it’s normal to have a big expansion be followed by some degree of contraction.





  • You were banned from the community and are no longer allowed to post or comment there. There’s a public record of this in the modlog: https://lemmy.world/modlog?page=1&userId=29397

    The best practice is for the mod to put a comment in when they ban someone about why they did so, but there’s no such comment in your case. You’d have to look back through your post and comment history to try to guess what you did in that community around 2mo ago when the ban happened.

    It’s also a good practice IMO to do temporary bans for first offenses, but the mod in this case appears to have issued a permanent ban, so you’re done interacting in that community unless you can message a mod to request being unbanned.

    Some mods tell you when they take action, but many don’t. It would be cool if Lemmy itself notified you, but it doesn’t… you have to search the modlog to see.


  • I don’t think this is a thing and I’m not sure it reasonably can be.

    • Maybe if someone properly crossposted, Lemmy could know which posts are identical and skip dupes. Though it would still be a crapshoot which community got displayed… you might end up seeing the comments from the original post in some tiny/dead community while a crosspost to a huge community blows up with its own comments.
    • But for non-crossposted duplicate posts… there’s no relationship between them as far as lemmy is concerned. They’re separate posts to separate communities that just happen to look very similar. Detecting such a scenario is very sticky.
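
    To illustrate why the second case is sticky, here’s a naive sketch of the obvious heuristic: group link posts by a normalized URL. Everything here is hypothetical (the `Post` shape, the `utm_` tracking-parameter rule, the function names); it is not how Lemmy or any client actually works, and it does nothing for text posts, reuploaded images, or different articles about the same story.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed rule for illustration: drop query parameters with these prefixes.
TRACKING_PREFIXES = ("utm_",)


@dataclass
class Post:
    community: str
    title: str
    url: Optional[str]  # None for text-only posts


def normalize_url(url: str) -> str:
    """Lowercase scheme/host, drop tracking params, fragment, trailing slash."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query)
             if not k.startswith(TRACKING_PREFIXES)]
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path.rstrip("/"), urlencode(query), ""))


def group_duplicates(posts: list[Post]) -> list[list[Post]]:
    """Return groups of two or more posts that share a normalized URL."""
    groups: dict[str, list[Post]] = defaultdict(list)
    for post in posts:
        if post.url:  # text-only posts can't be matched this way
            groups[normalize_url(post.url)].append(post)
    return [group for group in groups.values() if len(group) > 1]


posts = [
    Post("news@a.example", "Big story", "https://Example.com/story/?utm_source=x"),
    Post("world@b.example", "Big story!!", "https://example.com/story"),
    Post("chat@c.example", "Thoughts on the big story", None),
]
duplicates = group_duplicates(posts)
```

    Even when this works, you’re still left with the question from the first bullet: which of the grouped posts, and whose comment thread, should actually be shown.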


  • I use Headscale, but Tailscale is a great service and what I generally recommend to strangers who want to approximate my setup. The tradeoffs are pretty straightforward:

    • Tailscale is going to have better uptime than any single-machine Headscale setup, though not better uptime than the single-machine services I use it to access… so not a big deal to me either way.
    • Tailscale doesn’t require you to wrestle with certs or the networking setup required to do NAT traversal. And they do it well, you don’t have to wonder whether you’ve screwed something up that’s degrading NAT traversal only in certain conditions. It just works. That said, I’ve been through the wringer already on these topics so Headscale is not painful for me.
    • Headscale is self-hosted, for better and worse.
    • In the default config (and in any reasonable user-friendly, non professional config), Tailscale can inject a node into your network. They don’t and won’t. They can’t sniff your traffic without adding a node to your tailnet. But they do have the technical capability to join a node to your tailnet without your consent… their policy to not do that protects you… but their technology doesn’t. IMO, the tailscale security architecture is strong. I’d have no qualms about trusting them with my network.
    • Beyond 3 devices, Tailscale costs money… about $6 US in that geography for over 100 devices. It’s a pretty reasonable cost for the service, and proportional in the grand scheme of what most self-hosters spend on their setups annually. IMO, it’s good value and I wouldn’t feel bad paying it.

    Tailscale is great, and there’s no compelling reason that should prevent most self-hosters that want it from using it. I use Headscale because I can and I’m comfortable doing so… But they’re both awesome options.


  • I replied to the parent comment here to say that governments HAVE set up CSAM detection services. I linked a review of them in my original comment.

    • They’ve set them up through commercial partnerships with technology companies… but that’s no accident. CSAM fighting orgs don’t have the tech reach of a major tech company so they ask for help there.
    • Those partnerships are limited to major/successful orgs… which makes it hard to participate as an OSS dev. But again, that’s on-purpose as the same access that would empower OSS devs to improve detection would enable CSAM producers to improve evasion. Secrecy is useful in this race, even if it has a high cost.

    Plus with the flurry of hugely privacy-invading or anti-encryption legislation that shows up every few months under the guise of “protecting the children online”, it seems like that should be a top priority for them, right?! Right…?

    This seems like inflammatory bait but I’ll bite once.

    • Improving CSAM detection is absolutely a top priority of these orgs, and in the last 10y the detection tools they’ve created with partners have expanded in scope and reach from scanning zero images to scanning hundreds of millions or billions of images annually. It’s a fairly massive success story even if it’s nowhere near perfect.
    • Building global internet infrastructure to scan all/most images posted to the internet is itself hugely privacy invading even if it’s for a good cause. Nothing prevents law-makers from coopting such infrastructure for less noble goals once it’s been created. Lemmy is in desperate need of help here, and CSAM detection tools are necessary in some form, but they are also very much scary privacy invading tools that are subject to “think of the children” abuse.

  • I’m not sure I follow the suggestion.

    • NCMEC, the US-based organization tasked with fighting CSAM, has already partnered with a list of groups to develop CSAM detection tools. I’ve already linked to an overview of the resulting toolsets in my original comment.
    • The datasets used to develop these tools are private, but that’s not an oversight. The datasets are… well… full of CSAM. Distributing them openly and without restriction would be contrary to NCMEC’s mission and to US law, so they limit the downside by partnering only with serious/capable partners who are able to commit to investing significant resources to developing and long-term maintaining detection tools, and who can sign onerous legal paperwork promising to handle appropriately the access they must be given to otherwise illegal material to do so.
    • CSAM detection tools are necessarily a cat and mouse game of CSAM producers attempting to evade detection vs detection experts trying to improve detection. In such a race, secrecy is a useful… if costly… tool. But as a result, NCMEC requires a certain amount of secrecy from their partners about how the detection tools work and who can run them in what circumstances. The goal of this secrecy is to prevent CSAM producers from developing test suites that allow them to repeatedly test image manipulation strategies that retain visual fidelity but thwart detection techniques.

    All of which is to say…

    … seems like law enforcement would have such a data set and seems they should of course allow tools to be trained on it. seems but who knows? might be worth finding out.)

    Law enforcement DOES have datasets, and DOES allow tools to be trained on them… I’ve linked the resulting tools. They do NOT allow randos direct access to the data or tools, which is a necessary precaution to prevent attackers from winning the circumvention race. A Red Hat or Mozilla scale organization might be able to partner with NCMEC or another organization to become a detection tooling partner, but db0, sunaurus, or the Lemmy devs likely cannot without the support of a large technology org with a proven track record of delivering and maintaining successful/impactful technology products. This has the big downside of making a true open-source detection tool more or less impossible… but that’s a well-understood tradeoff that CSAM-fighting orgs are not likely to change, as the same access that would empower OSS devs would empower CSAM producers. I’m not sure there’s anything more to find out in this regard.



  • It’s worth considering some commercially developed options as well: https://prostasia.org/blog/csam-filtering-options-compared/

    The Cloudflare tool in particular is freely and widely available: https://blog.cloudflare.com/the-csam-scanning-tool/

    I am no expert, but I’m quite skeptical of db0’s tool:

    • It repurposes a library designed for preventing the creation of synthetic CSAM using stable diffusion. This library is typically used in conjunction with prompt scanning and other inputs into the generation process. When run outside its normal context on non-ai images, it will lack all this input context, which I speculate reduces its effectiveness relative to the conditions under which it’s tested and developed.
    • AI techniques live and die by the quality of the dataset used to train them. There is not and cannot be an open-source test dataset of CSAM upon which to train such a tool. One can attempt workarounds, like training classifiers separately and then trying to detect coexisting features related to youth (trained from dataset A using non sexualized images including children) and sexuality (trained separately from dataset B using images containing only adult performers)… but the efficacy of open source solutions is going to be hamstrung by the inability to train, test, and assess effectiveness of the open tools. Developers of major commercial CSAM scanners are better able to partner with NCMEC and other groups fighting CSAM to assess the effectiveness of their tools.

    I’m no expert, but my belief is that open tools are likely to be hamstrung permanently compared to the tools developed by big companies and the most effective solutions for Lemmy must integrate big company tools (or gov/nonprofit tools if they exist).

    PS: Really impressed by your response plan. I hope the Lemmy world admins are watching this post, I know you all communicate and collaborate. Disabling image uploads is, I think, a very effective temporary response until detection and response tooling can be improved.