• 0 Posts
  • 271 Comments
Joined 3 months ago
Cake day: March 23rd, 2025



  • 590–620 nm. Identical to orange.

    The difference between brown and orange is the brightness level, and since the eyes have an automatic brightness adjustment, brightness levels only appear in context.

    Light becomes a darker variant if there’s brighter light around and vice versa. Shine brown/orange light into a dark room, and it will appear orange. Shine the same light into a brighter context, and it will be brown.

    It’s exactly the same thing as e.g. dark blue or light blue. Both share the exact same wavelength, and their brightness becomes apparent in context.

    If you’ve ever been to a cinema and you saw anything brown or orange on screen, you have seen the effect. If you have ever seen a dim conventional light bulb in a bright room, you have seen it too.

    Brown has just as much of a wavelength as orange, because it’s the same color.
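
    A rough way to check this on a screen is to compare the two colours in HSV. This is only a sketch: the RGB values are illustrative picks rather than measurements, and HSV hue is just a stand-in for the dominant wavelength, not the wavelength itself.

```python
import colorsys

# Illustrative RGB values (assumptions, not measurements), scaled to 0..1.
orange = (1.0, 0.5, 0.0)
brown = (0.5, 0.25, 0.0)  # the same triple at half intensity


def describe(rgb):
    """Return (hue in degrees, saturation, brightness) for an RGB triple."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return round(h * 360, 1), round(s, 2), round(v, 2)


orange_hsv = describe(orange)
brown_hsv = describe(brown)

print("orange:", orange_hsv)
print("brown: ", brown_hsv)
```

    Hue and saturation come out the same for both. Only the brightness value differs, which is the part the eye judges from context.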





  • Terrible idea for a few reasons.

    • The example in the OP does not need anything but the country. GPS coordinates are less efficient than ISO codes.
    • GPS coordinates don’t map 1:1 to countries or even street addresses. There are infinitely many different coordinates for each address, and it’s very non-trivial to match one to another. Comparing whether two records with country codes are in the same country is trivial. Doing the same with two GPS coordinates is very difficult.
    • GPS coordinates might be more exact than accurate. This is a surprisingly common issue: you start out only needing a country, so you put some arbitrary GPS position (e.g. the center of the country) into the GPS coordinates. Later a new requirement arises that means you now need street addresses. Now all old entries point to some random house in the middle of the country, and there’s no easy way to differentiate these false locations from real ones.
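
    To make the points above concrete, here is a minimal sketch of a record that stores the country code and leaves the coordinates empty until a real position is known. The `Location` class, its field names, and the sample coordinates are hypothetical and only there for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Location:
    # ISO 3166-1 alpha-2 country code, e.g. "US".
    country: str
    # (latitude, longitude); stays None unless a real position is known,
    # so no placeholder point in the middle of the country is ever stored.
    coords: Optional[Tuple[float, float]] = None


def same_country(a: Location, b: Location) -> bool:
    """Comparing countries is a plain string comparison."""
    return a.country == b.country


def has_precise_position(loc: Location) -> bool:
    """A missing position is explicit, not a fake coordinate."""
    return loc.coords is not None


# Hypothetical records: one with only a country, one with a known position.
unknown_address = Location("US")
known_address = Location("US", (40.7128, -74.0060))

print(same_country(unknown_address, known_address))
print(has_precise_position(unknown_address))
print(has_precise_position(known_address))
```

    With this layout, a later requirement for street-level positions can tell real coordinates from missing ones, because the missing ones were never filled in with a guess.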

    I guess you meant that as a joke, but people are really doing this and it leads to actual problems.

    I saw a news report a while ago about something like that being done in a database for people with outstanding debt. If the address of the debtor wasn’t known, they just put “US” in the form, and the program automatically entered the centre of the US as the coordinates.

    Sucks for the family that lives there because they constantly get threatening mail and even house visits from angry lenders who want their money back. People even vandalized their house and car because they believed that their debtors lived in that house.


  • You did not read your source. Some quotes you apparently missed:

    Scraping to violate the public’s privacy is bad, actually.

    Scraping to alienate creative workers’ labor is bad, actually.

    Please read your source before posting it and claiming it says something it doesn’t actually say.

    Now why does Doctorow distinguish between good scraping and bad scraping, and even between good LLM training and bad LLM training in his post?

    Because the good applications are actually covered by fair use while the bad parts aren’t.

    Because fair use isn’t actually about what is done (scraping, LLM training, …) but about who does it (researchers, non-profit vs. companies, for-profit) and for what purpose (research, critique, teaching, news reporting vs. making a profit by putting original copyright owners out of work).

    That’s the whole point of fair use. It’s even in the name. It’s about the use, and the use needs to be fair. It’s not called “Allowed techniques, don’t care if it’s fair”.


  • Tbh, this is not a question about scraping at all.

    Scraping is just a rather neutral tool that can be used for all sorts of purposes, legal and illegal.

    Neither does the technique justify the purpose nor does outlawing the technique fix the actual problem.

    Fair use only applies to a certain set of use cases and has a strict set of restrictions applied to it.

    The permitted use cases are: “criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research”.

    And the two relevant restrictions are:

    • “the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;”
    • “the effect of the use upon the potential market for or value of the copyrighted work.”

    (Quoted from 17 U.S.C. § 107)

    And here the differences between archive.org and AI become obvious. While archive.org can be abused as some kind of file sharing system or to circumvent paywalls or ads, its intended purpose is for research, and it’s firmly non-profit and doesn’t compete with copyright holders.

    AI, on the other hand, is almost always commercial, and its main purpose is to replace human labour, specifically of the copyright owners. It might not be an actual problem for Disney’s bottom line, but it’s a massive problem for smaller artists, stock photographers, translators, and many other professions.

    So it clearly falls outside the permitted use cases for fair use while also violating the restrictions.

    And for that, it doesn’t matter if the training data is acquired using scraping (without permission) or some other way (without permission to use it for AI training).


  • Brown is on the colour spectrum; it does have a wavelength. Specifically, it has the same wavelength as orange, because brown is dark orange and orange is light brown.

    What’s not on the colour spectrum are multi-wavelength mixed colours, e.g. red and blue light combining to something that looks like spectral violet. And while these multi-wavelength colours are physically different from a pure spectral colour, the sensation to a human is identical, because both trigger the cone cells in the eyes in an identical way. That is why we can have screens that only emit three colours and still trigger the same sensations as millions of different spectral colours.


  • The same is true for English too.

    Brown and orange are different brightness levels of the same colour. Brown is dark orange and orange is light brown. Yet people experience brown and orange as separate colours, because we have separate words for them, while we experience light blue and dark blue as different brightness levels of the same colour, because both are called “blue”.