• 0 Posts
  • 123 Comments
Joined 2 years ago
Cake day: June 15th, 2023

  • That can’t be good. But I guess it was inevitable. It never seemed like Arc had a sustainable business model.

    It was obvious from the get-go that their ChatGPT integration was a money pit that would eventually need to be monetized, and…I just don’t see end users paying money for it. They’ve been giving it away for free hoping to get people hooked, I guess, but I know what the ChatGPT API costs and it’s never going to be viable. If they built a local-only backend then maybe. I mean, at least then they wouldn’t have costs that scale with usage.

    For Atlassian, though? Maybe. Their enterprise customers are already paying out the nose. Usage-based pricing is a much easier sell. And they’re entrenched deeply enough to enshittify successfully.



  • Yeah, that’s true for a subset of code. But for the rest, the hardest parts happen in the brain, not in the files. Writing readable code is very, very important, especially when you are working with larger teams. Lots of people cut corners here and elsewhere in coding, though. Including, like, every startup I’ve ever seen.

    There’s a lot of gruntwork in coding, and LLMs are very good at the gruntwork. But coding is also an art and a science and they’re not good at that at high levels (same with visual art and “real” science; think of the code equivalent of seven deformed fingers).

    I don’t mean to hand-wave the problems away. I know that people are going to push the limits far beyond reason, and I know it’s going to lead to monumental fuckups. I know that because it’s been true for my entire career.


  • If I’m verifying anyway, why am I using the LLM?

    Validating output should be much easier than generating it yourself. P≠NP.

    This is especially true in contexts where the LLM provides citations. If the AI is good, then all you need to do is check the citations. (Most AI tools are shit, though; avoid any that can’t provide good, accurate citations when applicable.)

    Consider that all scientific papers go through peer review, and any decent-sized org will have regular code reviews as well.

    From the perspective of a senior software engineer, validating code that could very well be ruinously bad is nothing new. Validation and testing are required whether the code was written by an LLM or some dude who spent two weeks at a coding “boot camp”.
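
    To make the verify-versus-generate point concrete, here is a minimal sketch using subset sum, a classic NP problem. The problem choice, function names, and sample numbers are mine, purely for illustration. It only shows the intuition: checking a proposed answer takes one pass, while finding one by brute force may try every subset. It proves nothing about P vs NP, which is still an open question.

```python
from itertools import combinations


def verify(nums, indices, target):
    """Check a proposed answer: distinct, in-range indices that sum to target."""
    return (
        len(set(indices)) == len(indices)
        and all(0 <= i < len(nums) for i in indices)
        and sum(nums[i] for i in indices) == target
    )


def generate(nums, target):
    """Find an answer from scratch by brute force (up to 2**len(nums) subsets)."""
    for size in range(1, len(nums) + 1):
        for indices in combinations(range(len(nums)), size):
            if sum(nums[i] for i in indices) == target:
                return list(indices)
    return None


# Hypothetical sample data, not taken from any real workload.
nums = [3, 34, 4, 12, 5, 2]
answer = generate(nums, 9)       # slow path: search for a solution
print(answer, verify(nums, answer, 9))  # fast path: check the solution
```

    Reviewing LLM output is closer to `verify` than to `generate`, as long as the thing you are handed is actually checkable.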






  • Nobody should feel a strong need to upgrade after only two generations. Same deal with most tech like GPUs and CPUs.

    I use my phone a lot and my Pixel 7 is fine. The primary factors driving my last couple of upgrades were battery degradation and software support. Neither should be a big problem with a Fairphone.

    I’m also trying to decide whether to stick with the Pixel/GrapheneOS ecosystem or go for Fairphone.

    How hard/expensive was it to replace your battery? I looked on iFixIt and it seemed a lot harder than my previous phones.






  • SEO (search engine optimization) has dominated search results for almost as long as search engines have existed. The entire field of SEO is about gaming the system at the expense of users, and often also at the expense of search platforms.

    The audience for an author’s gripping life story in every goddamn recipe was never humans, either. That was just for Google’s algorithm.

    Slop is not new. It’s just more automated now. There are two new problems for users, though:

    1. Google no longer gives a shit. They used to play the cat-and-mouse game, and while their victories were never long-lasting, at least their defeats were not permanent. (Remember ExpertsExchange? It took years before Google brought down the hammer on that. More recently, think of how many results you’ve seen from Pinterest, Forbes, or Medium, and think of how few of those deserved even a second of your time.)
    2. Companies that still do give a shit face a much more rapid exploitation cycle. The cats are still plain ol’ cats, but the mice are now Borg.

  • Well I’m sorry, but most PDF distillers since the 90s have come with OCR software that can extract text from the images and store it in a way that preserves the layout AND the meaning

    The accuracy rate of even the best OCR software is far, far too low for a wide array of potential use cases.

    Let’s say I have an archive of a few thousand scientific papers. These are neatly formatted digital documents, not even scanned images (though “scanned images” would be within scope of this task and should not be ignored). Even for that, there’s nothing out there that can produce reliably accurate results. Everything requires painstaking validation and correction if you really care about accuracy.

    Even arXiv can’t do a perfect job of this. They launched their “beta” HTML converter a couple years ago. Improving accuracy and reliability is an ongoing challenge. And that’s with the help of LaTeX source material! It would naturally be much, much harder if they had to rely solely on the PDFs generated from that LaTeX. See: https://info.arxiv.org/about/accessible_HTML.html

    As for solving this problem with “AI”…uh…well, it’s not like “OCR” and “AI” are mutually exclusive terms. OCR tools have been using neural networks for a very long time already; it just wasn’t a buzzword back then, so nobody called it “AI”. However, in the current landscape of “AI” in 2025, “accuracy” is usually just a happy accident. It doesn’t need to be that way, and I’m sure the folks behind commercial and open-source OCR tools are hard at work implementing new technology in a way that Doesn’t Suck.

    I’ve played around with various VL models and they still seem to be in the “proof of concept” phase.


  • For instance, Mozilla said it may have removed blanket claims that it never sells user data because the legal definition of “sale of data” is now “broad and evolving,” Mozilla’s blog post stated.

    Uh huh.

    The company pointed to the California Consumer Privacy Act (CCPA) as an example of why the language was changed, noting that the CCPA defines “sale” as the “selling, renting, releasing, disclosing, disseminating, making available, transferring, or otherwise communicating orally, in writing, or by electronic or other means, a consumer’s personal information by [a] business to another business or a third party” in exchange for “monetary” or “other valuable consideration.”

    Yes. That’s what “sale of data” means. Everybody understood that. That’s exactly what we don’t want you to do.