I take my shitposts very seriously.

  • 18 Posts
  • 1.26K Comments
Joined 3 years ago
Cake day: June 24th, 2023


  • Crawlers don’t have to follow conventions or specifications. If one has a setTimeout implementation that ignores the specified delay and simply executes the callback immediately, it defeats any timer-based system. Proof-of-work closes that hole: the delay comes from computation the client actually has to perform, so there is no timer to skip.

    Anubis is an emergency solution against the flood of scrapers deployed by massive AI companies. Everybody wishes it wasn’t necessary.
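To make the difference concrete, here is a minimal Python sketch. It is not Anubis's actual code: the function names, the challenge string, and the difficulty (3 leading hex zeros) are all made up for illustration. It contrasts a gate that trusts the client to have waited with one where the server can check that work was done.

```python
import hashlib


def timer_gate(client_claims_it_waited: bool) -> bool:
    """Honor-system gate: the server has no way to check the claim."""
    return client_claims_it_waited


def solve(challenge: str, difficulty: int) -> int:
    """Try nonces until SHA-256(challenge + nonce) starts with `difficulty` hex zeros."""
    target = "0" * difficulty
    nonce = 0
    while not hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest().startswith(target):
        nonce += 1
    return nonce


def pow_gate(challenge: str, nonce: int, difficulty: int) -> bool:
    """Verifiable gate: a single hash on the server checks the client's answer."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


# A crawler that ignores the timer simply lies and gets in immediately.
print(timer_gate(client_claims_it_waited=True))

# For the proof-of-work gate, the crawler has to run the search like anyone else.
nonce = solve("example-challenge", difficulty=3)
print(pow_gate("example-challenge", nonce, difficulty=3))
```

The asymmetry is the point: the client needs thousands of hashes on average at this toy difficulty, while the server needs one to verify.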





  • rtxn@lemmy.world to Programmer Humor@programming.dev • lads
    50 upvotes · edited 5 months ago
    • A web server that can’t discriminate between a request made by a human and one made by a machine has to handle all requests. It may not be an issue for large companies like Amazon or Microsoft, but small websites will suffer timeouts and outages.
    • Without a locally hosted solution like Anubis, small websites would have to move behind a large centralized service like Cloudflare.
    • Otherwise they might not be able to continue operating and only large corporate-backed services like Twitter and Reddit would survive.

    The alternative is having to choose between Reddit and Cloudflare. Does that look “free” and “open” to you?


  • rtxn@lemmy.world to Programmer Humor@programming.dev • lads
    90 upvotes · edited 5 months ago

    Anubis is a simple anti-scraper defense that weighs a web client’s soul by giving it a tiny proof-of-work workload (a calculation with no known shortcut, such as a brute-force hash search) before letting it pass through to the actual website. The workload is insignificant for human users, but very taxing for high-volume scrapers. The calculations are done on the client’s side using JavaScript code.

    (edit) For clarification: this works because the computation workload takes a relatively long time, not because it bogs down the CPU. Halting each request at the gate for only a few seconds adds up very quickly.
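A back-of-the-envelope sketch of how that adds up, in Python. The numbers are hypothetical (2 seconds per challenge, one challenge per request, no parallelism), and the function name is invented for this example.

```python
def total_delay_hours(requests: int, seconds_per_challenge: float) -> float:
    """Total time spent waiting at the gate, in hours, if every request is challenged."""
    return requests * seconds_per_challenge / 3600


# One human page load versus a scraper fetching a million pages.
print(total_delay_hours(1, 2.0))
print(total_delay_hours(1_000_000, 2.0))
```

Under these assumptions the human loses 2 seconds, while the scraper accumulates about 556 hours (roughly 23 days) of waiting that it must either absorb or pay for with more machines.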

    Recently, the FSF published an article that likened Anubis to malware because it’s basically arbitrary code that the user has no choice but to execute:

    […] The problem is that Anubis makes the website send out a free JavaScript program that acts like malware. A website using Anubis will respond to a request for a webpage with a free JavaScript program and not the page that was requested. If you run the JavaScript program sent through Anubis, it will do some useless computations on random numbers and keep one CPU entirely busy. It could take less than a second or over a minute. When it is done, it sends the computation results back to the website. The website will verify that the useless computation was done by looking at the results and only then give access to the originally requested page.

    Here’s the article, and here’s aussie linux man talking about it.





  • No.

    The local machine boots using PXE. Clonezilla itself is transferred from a TFTP server as a squashfs and loaded into memory. When that OS boots, it mounts a network share using CIFS that contains the image to be installed. All of the local SATA disks are named sda, sdb, etc. A script determines which SATA disk is the correct one (must be non-rotational, must be a specific size and type), deletes every SCSI device (which includes ATA devices too), then mounts only the chosen disk to make sure it’s named sda.

    Clonezilla will not allow an image cloned from a device named sda to be written to a device with a different name – this is why I had to make sure that sda is always the correct SSD.
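The disk-selection step could look roughly like the Python sketch below. This is not the actual script: the function name and the `expected_bytes` parameter are hypothetical, and the "type" check mentioned above is left out. It assumes the standard Linux sysfs layout, where `/sys/block/<dev>/queue/rotational` is `0` for SSDs and `/sys/block/<dev>/size` is given in 512-byte sectors.

```python
from pathlib import Path


def find_target_disk(expected_bytes: int, sys_block: Path = Path("/sys/block")) -> str:
    """Return the name of the single non-rotational sd* disk with the expected size."""
    matches = []
    for dev in sorted(sys_block.glob("sd*")):
        rotational = (dev / "queue" / "rotational").read_text().strip()
        # sysfs reports the size in 512-byte sectors, whatever the real sector size is.
        size_bytes = int((dev / "size").read_text()) * 512
        if rotational == "0" and size_bytes == expected_bytes:
            matches.append(dev.name)
    if len(matches) != 1:
        # Refuse to guess: writing an image to the wrong disk is not recoverable.
        raise RuntimeError(f"expected exactly one matching disk, found {matches}")
    return matches[0]
```

The function only picks the disk and does nothing destructive. Removing the other SCSI devices and bringing the chosen one back as sda would be a separate step.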










  • Sometimes for maintenance, sometimes because manual intervention was necessary. The machines where we did this were built in the 90s and have been in near-constant operation ever since. Moving parts are worn out and the tolerances are gone. Replacement parts are difficult to find and expensive to manufacture, so if something more complex than a ball bearing or axle got out of alignment, we had to pound it back into place (sometimes literally).

    I personally never bypassed the interlock; I wasn’t paid enough to take on that responsibility. I would just file a downtime notice and call the on-site mechanic when needed. I didn’t give a shit about reduced output.

    Tagging @Remorhaz@lemmy.world