Piano.cat
      LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI (Including Many Fediverse Instances!!!)

      fedipact@cyberpunk.lol

      "The tech giant is sidestepping guardrails that websites use to prevent being scraped, data show, in a move whistleblowers say is unethical and potentially illegal."

      ARTICLE: https://www.dropsitenews.com/p/meta-facebook-tech-copyright-privacy-whistleblower

      FULL PDF: https://www.dropsitenews.com/api/v1/file/b3555944-e204-4f5e-9a64-e44281b19a82.pdf

      #FediPact #meta #threads #AI

        spla@mastodont.cat

        @FediPact I applied this nginx config to fight it and many other AI bots and scrapers:

        GitHub - kurren/ai-bots-crawlers: Prevent ai bots to crawl a website (Nginx web server) (github.com)

        Returning 444 to them (nginx's non-standard code that closes the connection without sending any response) seems like a good way to confuse them and decrease server load.
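
For anyone who wants to see the shape of this approach, here is a minimal sketch. It is not the contents of the kurren/ai-bots-crawlers repo. The user-agent list is a short illustrative sample, and the variable name `$is_ai_bot` and the `example.org` server name are placeholders I made up.

```nginx
# Goes in the http {} context.
# Sets $is_ai_bot (hypothetical name) to 1 when the User-Agent
# matches one of the patterns, case-insensitively (~*).
map $http_user_agent $is_ai_bot {
    default                 0;
    ~*GPTBot                1;
    ~*ClaudeBot             1;
    ~*CCBot                 1;
    ~*meta-externalagent    1;
}

server {
    listen 80;
    server_name example.org;  # placeholder, use your own host

    # 444 is nginx-specific: close the connection, send nothing back.
    if ($is_ai_bot) {
        return 444;
    }

    # ... rest of the site config ...
}
```

Keeping the patterns in a `map` means the list lives in one place and can be shared by several `server` blocks. Note that this only stops crawlers that send an honest User-Agent header.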
