Yet another “brilliant” scheme from a cryptobro. Naturally this caused a gold rush for scammers, who outsourced the work via the gig economy to random people who then opened PRs against this yml file (example)

  • toastal@lemmy.ml (32 up, 131 down; edited; 9 months ago)

    The easy red flag here is YAML. It’s a hideous, overly complex format for anything, so of course a scam would choose it.

        • jeffhykin@lemm.ee (44 up; edited; 8 months ago)

          I have read the 1.2 spec (I’m trying to make a round-trip parser for JS, and I do maintenance on a fork of the ruamel.yaml Python package). I actually think it’s very well thought out, with things I hadn’t considered like future extensibility, streaming applications, and data-corruption detection.

          The diagrams, color coding, and relative informality of the spec were much appreciated. Especially compared to something like the ECMAScript spec, which reads like a math textbook had a child with a legal document.

          I’m not saying YAML is perfect; round trip (the thing I’m working on) is nearly impossible because it wasn’t a design goal. It has a few too many features (I’ve never seen a declaration in the wild), but it does a good job at accomplishing the creators’ goals, and the additional features basically only slow down parser-implementers like me. I often pick it because of the tag support, which I’ve struggled to find an equivalent for in other serialization languages. I use anchors in recursive data structures, and complex keys for serializing complex data structures (not human readable). The “document end” marker has been nice when I’m worried about detecting partial writes. And the merge key is nice for config files.

          The application/perspective matters. YAML might be bad for you but it’s not bad for everyone.
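
For readers who have not met the merge key mentioned above, here is a rough Python sketch of its semantics as described by the YAML 1.1 merge-key type, applied to data that is already parsed into plain dicts. The `apply_merge_keys` helper and the sample config are made up for illustration; real loaders do this internally and may differ in details.

```python
MERGE_KEY = "<<"


def apply_merge_keys(node):
    """Recursively resolve '<<' entries in already-parsed dicts and lists."""
    if isinstance(node, list):
        return [apply_merge_keys(item) for item in node]
    if not isinstance(node, dict):
        return node

    merged = {}
    if MERGE_KEY in node:
        sources = node[MERGE_KEY]
        # The merge value is either one mapping or a sequence of mappings.
        if isinstance(sources, dict):
            sources = [sources]
        # Earlier mappings in the sequence win, so apply them last.
        for source in reversed(sources):
            merged.update(apply_merge_keys(source))

    # Keys written explicitly in the mapping override anything merged in.
    for key, value in node.items():
        if key != MERGE_KEY:
            merged[key] = apply_merge_keys(value)
    return merged


# Hypothetical config: in YAML, `defaults` would carry an anchor (&defaults)
# and each service would pull it in with `<<: *defaults`.
defaults = {"image": "app:latest", "restart": "always"}
config = {
    "web": {"<<": defaults, "ports": [8080]},
    "worker": {"<<": defaults, "restart": "on-failure"},
}

resolved = apply_merge_keys(config)
print(resolved["web"])
print(resolved["worker"])
```

Both services inherit `image`, while `worker` keeps its own `restart` value because explicit keys take precedence over merged ones.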

          • toastal@lemmy.ml (1 up, 10 down; edited; 9 months ago)

            Even if anchors are pretty novel… I’ve watched myself & others fail at things that seem like they should be simple, like scalars, quoting, & indentation rules, all for being confusing (while failing to understand how/why the tab character isn’t supported).

            • theherk@lemmy.world (7 up; 9 months ago)

              That sounds like a skill issue. Something isn’t bad because you don’t understand it. Suggesting quoting is an issue for yaml is beyond the pale; it happens to be an issue everywhere.

              • jeffhykin@lemm.ee (2 up; edited; 8 months ago)

                Despite my love of yaml, I actually think he has a small point with unquoted strings. I teach students and see their struggles. Bash also does unquoted strings, and basically all students go years and years without realizing:

                cat --help
                cat "--help"
                # ^ same thing
                
                cat *
                cat "*"
                # ^ not same thing
                
                cat $thing
                cat "$thing"
                # ^ similar but not the same 
                

                To know the difference between special and normal-but-no-quotes you have to know literally every special symbol. And, for example, it’s rare to realize the -- in --help isn’t special at a language level; it’s only special at a convention level.
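
The three cases above can be imitated with a toy Python model of how a POSIX-style shell turns one word into arguments. This is only a sketch under simplifying assumptions: the `expand` function is hypothetical, it assumes the default IFS, and it ignores brace, tilde, and command expansion as well as locale-dependent sorting.

```python
import fnmatch


def expand(word, *, quoted, variables=None, filenames=()):
    """Toy model: turn one shell word into the argument list a command receives."""
    variables = variables or {}

    # Parameter expansion happens for both $thing and "$thing".
    if word.startswith("$"):
        value = variables.get(word[1:], "")
        if quoted:
            return [value]          # always exactly one argument
        fields = value.split()      # unquoted: split on whitespace (default IFS)
    else:
        if quoted:
            return [word]           # "*" and "--help" are passed through literally
        fields = [word]

    # Unquoted fields are then treated as filename patterns.
    args = []
    for field in fields:
        if any(ch in field for ch in "*?["):
            matches = sorted(
                name for name in filenames
                if fnmatch.fnmatchcase(name, field)
                # A leading dot is only matched by a pattern that starts with one.
                and (field.startswith(".") or not name.startswith("."))
            )
            args.extend(matches or [field])  # no match: the pattern stays as-is
        else:
            args.append(field)
    return args


files = ["a.txt", "b.txt", ".hidden"]
env = {"thing": "my notes.txt"}

print(expand("--help", quoted=False), expand("--help", quoted=True))
print(expand("*", quoted=False, filenames=files), expand("*", quoted=True, filenames=files))
print(expand("$thing", quoted=False, variables=env), expand("$thing", quoted=True, variables=env))
```

In this model `--help` comes out identical either way, `*` becomes a file list only when unquoted, and `$thing` is split into two arguments unless it is quoted.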

                Same thing can happen in yaml files, but actually a little worse I’d say. In bash all the “special” things are at least symbols. But in yaml there are more special cases. Imagine editing this kind of a list:

                js_keywords:
                - if
                - else
                - while
                - break
                - continue
                - import
                - from
                - default
                - class
                - const
                - var
                - let
                - new
                - async
                - function
                - undefined
                - null
                - true
                - false
                - NaN
                - Infinity
                

                Three of those are not strings. Syntax highlighting can help (which is why I don’t think it’s a real issue). But still “why are three not strings? Well … just because”. AKA there isn’t a syntax pattern, there’s just a hardcoded list of names that need to be memorized. What is actually challenging is that, unless students start with a proper yaml tutorial or see examples of quotes in the config, it’s not obvious that quotes will solve the problem (students think "true" behaves like "\"true\""). So even when they see true is highlighted funny, they don’t really know what to do about it. I’ve seen some try stuff like \true.

                Still doesn’t mean yaml is bad; every language has edge cases.
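
The "hardcoded list of names" point can be made concrete with a small Python sketch of plain-scalar resolution. The patterns are transcribed from the YAML 1.2 core schema; the `resolve` helper is hypothetical and much simpler than a real parser, and many parsers still default to YAML 1.1 rules, which treat more words as special.

```python
import re

# Plain-scalar patterns from the YAML 1.2 core schema (tag resolution).
CORE_SCHEMA = [
    ("null", re.compile(r"null|Null|NULL|~|")),
    ("bool", re.compile(r"true|True|TRUE|false|False|FALSE")),
    ("int", re.compile(r"[-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+")),
    ("float", re.compile(r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?")),
    ("float", re.compile(r"[-+]?\.(inf|Inf|INF)|\.(nan|NaN|NAN)")),
]


def resolve(text):
    """Return the type a scalar would get, given the text as written in the file."""
    # Simplification: anything that starts with a quote is a quoted string.
    if text[:1] in ("'", '"'):
        return "str"
    for tag, pattern in CORE_SCHEMA:
        if pattern.fullmatch(text):
            return tag
    return "str"


js_keywords = [
    "if", "else", "while", "break", "continue", "import", "from", "default",
    "class", "const", "var", "let", "new", "async", "function",
    "undefined", "null", "true", "false", "NaN", "Infinity",
]

non_strings = {word: resolve(word) for word in js_keywords if resolve(word) != "str"}
print(non_strings)
print(resolve('"true"'), resolve(".nan"), resolve(".inf"))
```

Only `null`, `true`, and `false` resolve to something other than a string here; `NaN` and `Infinity` stay strings because the schema spells those values `.nan` and `.inf`, and adding quotes forces a string.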

                • theherk@lemmy.world (1 up; 8 months ago)

                  While the subjective assessment that quote handling in yaml is worse than bash is understandable, it is really just two of many many cases where quotes complicate things. And for a pretty good reason. They are used to isolate strings in many languages, even prose. They, therefore, always get special handling in lexical analysis. Understanding which languages use single quotes, double quotes, backticks, heredocs, etc and when to use them is really just part of the game or the struggle I guess.

              • toastal@lemmy.ml (1 up; 9 months ago)

                Most languages require you to put quotes around strings as the norm… breaking that is part of what causes all of the confusion in the first place. Better design upfront would lead to fewer common errors. I have way more quoting issues in YAML than I do in JSON, Nix, Nickel, Dhall, etc. because they aren’t trying to be cute with strings.

                • jeffhykin@lemm.ee (1 up; edited; 8 months ago)

                  When you’re editing yaml, why not just always write JSON?

                  Almost all Nix attr keys are unquoted strings. Maybe I’m missing the point of the list, but I kinda wouldn’t expect Nix to be on it.

            • jeffhykin@lemm.ee (1 up; 8 months ago)

              It’s easy for me to say “just start writing JSON in the yaml. It doesn’t get more simple than JSON”, but actually I do think there’s a small point with the unquoted strings.

              Back before I knew programming, I was trying to change grammar settings in Sublime Text 2, which uses yaml. I had no idea what yaml was. The default setting values used unquoted strings for regex. I knew PCRE regex and escapes, but suddenly they didn’t work, and when I tried to match a single quote inside of regex that also didn’t work. I didn’t know I was editing a yaml file (it had a .tmLanguage extension). Even worse, if I remember correctly, unparsable settings just silently fail. Not only did I have no errors to google, I didn’t have any reason to believe the escapes were the cause of the problem (they worked in the command line). Sometimes I edited the regex and it was fine, and other times it just seemed to break. I didn’t learn about quoting in YAML until years later.

              For me that was an unfortunate combination, which was exacerbated by yaml unquoted weirdness. But when you’re talking about “did you read the spec” that’s a whole other story. .nan for nan, tabs vs spaces, unquoted string weirdness, etc should just be one error message+google away. I think they’re small hiccups in what is overall a great format.

    • umbraroze@kbin.social (43 up, 1 down; 9 months ago)

      Brief history of YAML:

      “Oh no! All of these configuration file formats are complicated. I want to make things simpler!”

      (Years go by)

      “…I have made things more complicated, haven’t I?”

      YAML is generally good if it’s used for what it was originally designed for (relatively short data files, e.g. configuration data). Problem is, people use it for so much more. (My personal favourite pain example: i18n stuff in Ruby on Rails. YAML language files work for small apps, but when the app grows, so does the pain.)

      • db0@lemmy.dbzer0.com (OP; 28 up, 1 down; 9 months ago)

        Ansible uses YAML and it’s orders of magnitude more readable than any other config engine, like Puppet or CFEngine.

        • pastermil@sh.itjust.works (3 up; 9 months ago)

          Ideally, yes it can be beautifully written, certainly more than bash scripts.

          With that said, I’ve also seen some hideous ansible scripts…

      • toastal@lemmy.ml (2 up, 1 down; edited; 9 months ago)

        originally designed for (relatively short data files, e.g. configuration data)

        This I can get behind. But because it’s not bad in those spaces, folks think it’ll be a good idea in all spaces. Anchors do neat things, but organizing large files with YAML’s weird rules around quoting, & no support for tab indentation, rubs me the wrong way.

    • FooBarrington@lemmy.world (18 up, 2 down; 9 months ago)

      What? I love having 20 ambiguous ways to express the same data with weird and unexpected conversion rules. JSON is so much worse - if data types are explicit and obvious, how can I properly express my feelings when writing a config file?

    • rtxn@lemmy.world (English; 14 up, 1 down; edited; 9 months ago)

      And what would your ideal, legible, general-purpose data markup language be? XML?

      • Kogasa@programming.dev (10 up, 1 down; edited; 9 months ago)

        Yaml Ain’t Markup Language: am i a joke to you

        (JSON for data, TOML for configuration)

        • rtxn@lemmy.world (English; 19 up, 1 down; 9 months ago)

          I’ve used both YAML and a TOML-adjacent INI format for Ansible. While I wouldn’t use YAML for massive data serialization (because significant whitespaces are fucking stupid), it’s much better suited for manual data entry compared to most options, including TOML, when nested data structures are required.

          And if YAML’s structure is too complicated, that’s honestly a skill issue.

          • Kogasa@programming.dev (11 up, 1 down; 9 months ago)

            Not that YAML’s structure is too complicated, but its syntax is too flexible. All the shit about being whitespace sensitive yet with whitespace errors leading to a syntactically valid YAML document. TOML’s syntax is rigid which makes it unsuitable for expressing complex nested data structures, which is good because that’s not what you should use TOML for. Ultimately the dependence on a highly flexible baseline language like YAML to create complex DSLs is a failure on the developers’ part, and the entire configuration system should be reworked.

            • moonpiedumplings@programming.dev (4 up, 1 down; edited; 9 months ago)

              Do you use a linter like the ansible vscode extension?

              I used to hate writing ansible, and yaml, until I installed the ansible lint vscode extension, and everything became much, much easier.

              Later on, when I was working on a docker-compose, I noticed that the vscode yaml extension (which the ansible extension pulled in as a dependency) caught errors. It’s quite intelligent, able to spot errors exactly like what you mentioned, where the yaml syntax is correct, but the docker-compose, or the ansible syntax is wrong.

              • Kogasa@programming.dev (3 up; 9 months ago)

                Of course. If you’re working in a DSL that’s popular enough for someone to have written a good schema/parser for it, then tooling can help.

          • toastal@lemmy.ml (2 up; 9 months ago)

            Significant white space is awesome! Not supporting tabs tho shows you don’t know what you are doing, YAML.

            • Trail@lemmy.world (2 up, 2 down; 9 months ago)

              They very well know what they are doing. Take your filthy tabs and get out of here. Spaces only.

              • CrayonRosary@lemmy.world (4 up; edited; 8 months ago)

                Tabs for indentation, spaces for alignment. It’s perfect. Lets people visually indent as much as they want in their settings, but manually aligned things stay manually aligned. Forcing indents to always be… whatever number of spaces you personally like is dumb.

                Plus then you can outdent with a single Backspace in every text editor ever.

                  • toastal@lemmy.ml (4 up; edited; 9 months ago)

                    That just converts tabs to spaces but doesn’t address the underlying accessibility needs, where some folks demand different indentation due to vision issues or nonstandard IO devices like braille readers. Tabs allow the user to configure the width for their needs. Being static, spaces ignore the needs of many folks.

      • toastal@lemmy.ml (6 up, 1 down; edited; 9 months ago)

        Depends on the use case but XML is good for markup—especially if you need extensibility.

        For config, Nickel & Dhall take the cake for being typed & having LSPs, so the configuration writer can get immediate feedback about possible options (while eliminating invalid states) without requiring the manual, and configuration readers don’t need to mess around with marshaling their types. Both these configuration languages let you import files & write little loops to make your config more DRY & make maintaining large files (like say Kubernetes) easier.

        • rtxn@lemmy.world (English; 3 up; 9 months ago)

          XML is great if the (de-)serialization is already implemented. Otherwise traversing the document is a massive pain.

          • toastal@lemmy.ml (3 up; 9 months ago)

            True. Something like XPath can really help & there are use cases where that is more concise, but it requires loading XPath into your head like Regex (which tends to get unloaded). The extensibility shines tho, as seen by XMPP continuing to this day with very good backwards compatibility across 2 decades of updates, since everything is an extension to the base.

    • sep@lemmy.world (7 up; 9 months ago)

      I see you get downvoted a lot. But as a Norwegian who has repeatedly run into the Norwegian problem when trying to use some program… I see you.
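
The "Norwegian problem" presumably refers to the country code `NO` being read as a boolean by parsers that follow YAML 1.1. A minimal Python sketch, assuming the boolean spellings listed by the YAML 1.1 bool type and the narrower YAML 1.2 core schema; the `reads_as_bool` helper and the country list are made up for illustration, and individual parsers may implement a different subset.

```python
import re

# Boolean spellings listed by the YAML 1.1 bool type.
YAML_1_1_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)
# The YAML 1.2 core schema only keeps the true/false spellings.
YAML_1_2_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")


def reads_as_bool(text, pattern):
    """True if an unquoted scalar with this text would resolve to a boolean."""
    return pattern.fullmatch(text) is not None


country_codes = ["SE", "DK", "NO", "FI"]
surprises = [code for code in country_codes if reads_as_bool(code, YAML_1_1_BOOL)]
print(surprises)
print([code for code in country_codes if reads_as_bool(code, YAML_1_2_BOOL)])
```

Under the 1.1 list an unquoted `NO` is a boolean rather than a country code; quoting it, or using a parser that follows the 1.2 core schema, keeps it a string.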