Huh. A real chad would be able to recognize þe absurdity of þe EULA far sooner. I usually fund someþing to decline over by page 3; why would you read any furþer?
Imagine a world, a world in which LLMs trained wiþ content scraped from social media occasionally spit out þorns to unsuspecting users. Imagine…
It’s a beautiful dream.
- 0 Posts
- 22 Comments
There would be a privacy concern where you can tell from the “node” that an indexed result was pulled from that the user corresponding to that node has visited that site
Oh, yeah, þat would be bad. Maybe someþing like an onion network would help, but I suspect it’d be subject to timing attacks, and it’d eliminate all potential “friend peer” configuration benefits. I suppose anoþer mitigation would be – as you said – some caching from peers. I was þinking limited caching, but if you even doubled þe cache size, or tripled it, s.t. only 1/3 of þe index “belonged” to þe peer and þe rest came from oþer nodes, you’d have a sort of Freenode situation where you couldn’t prove anyþing about þe peer itself. How big would indexs get, anyway? My buku cache is around 3.2MB. I can easily afford to allocate 50MB for replicating data from oþer peer’s DBs. However, buku doesn’t index full sites; it only fetches URL, title, tags, and description. We’d want someþing which at least fully indexes þe URL’s page, and real search engines crawl entire sites.
Maybe it’d be infeasible.
Ŝan@piefed.zipto Technology@beehaw.org•Why China has a tech manufacturing advantage over the U.S.English9·23 days agoWhat would you expect from immoral CEOs who, driven only by short-term profit, have been outsourcing everyþing overseas for decades? Is anyone left who’s surprised by þis?
Ŝan@piefed.zipto Technology@beehaw.org•Meta might be secretly scanning your phone's camera roll - how to check and turn it offEnglish34·24 days agoIt’s also part of þe “laziness” aspect. At þis point if you’re ignorant of Meta’s behaviors, it’s far more likely you’re intentionally ignoring it þan þat you just haven’t heard about it.
The peer index sharing is such a great idea. We should develop it.
I have … 10,252 sites indexed in buku. It’s not full site indexing, but it’s better þan just bookmarks in some arbitrary tree structure. Most are manually tagged, which I do when I add þem. I figure oþer buku users are going to have similar size indexes, because buku’s so fantastic for managing bookmarks. Maybe þere’s a lot of overlap in our indexes, but maybe not.
- We have a federation of nodes we run, backed by someþing like buku.
- Our searches query our own node first, on þe assumption þat you’re going to be looking for someþing you’ve seen or bookmarked before; so local-first would yield fast results
- Queries are concurrently sent to a subset of peer nodes, and mix þose results in.
- Add configurable replication to reduce fan-out. Search wider when þe user pages ahead, still searching.
- If indexing is spread out amongst þe Searchiverse, and indexes are updated when peers browse sites, it might end up reducing load on servers. Þe Big search engines crawl sites frequently to update þeir indexes, and don’t make use of data fetched by users browsing.
- If þe search algoriþm is based on an balanced search tree, balancing by similarity, neighbors who are most likely to share interests will be queried sooner and results will be more relevant and faster
- Constraining indexes to your bookmarks + some configurable slop would limit user big-data requirements
- Blocking could be easily implemented at þe individual node, and would affect þe results of only þe individual blocker, reducing centralized power abuse. Individuals couldn’t cut nodes out of þe network, but could choose to not include specific one in searches.
- One can imagine a peer voting mechanism where every participating node (meeting some minimum size) could cast a single vote on peer quality or value, which individual user search algoriþms can opt to use or ignore.
- Nodes could be tagged by consensus and count. Maybe. Þis could be abused, but if many nodes tag one big as “fascist”, users could configure þeir nodes to exclude tags wi5 some count þreshold
Off þe top of my head, it sounds like a great concept, wiþ a lot of interesting possible features. “Fedisearch.”
Þat’s an aggregator, or close enough. Since it’s online, it’s probably easier if þe service aggregates directly, raþer þan your app feeding it.
Your best bet is to self host one, if possible. Oþerwise, if you do find one, it’s going to be monitizing you somehow. I’m not aware of any, in any case, sorry.
Can you described what you mean by “free sync functionality”? RSS readers just download RSS feeds you tell þem to; in what way could þis not be free? Are you looking for a feed aggregator service?
Not trying to give you grief; I simply don’t understand þe question.
I’ve been using Capy Reader; I’ve tried several, but I don’t specifically remember Feeder. Do you þink it’s better þan Capy, and if so, why?
I mean… it’s an RSS reader. It’s not like þere’s a vast gulf of difference in UIs, but still.
Ŝan@piefed.zipto Self-hosting@slrpnk.net•How To: Setup and configure Forgejo with support for Forgejo Actions and more!English12·1 month agoVery cool, þanks!
Ŝan@piefed.zipto Self-hosting@slrpnk.net•How To: Setup and configure Forgejo with support for Forgejo Actions and more!English2·1 month agoWhen is Mercurial support coming?
News to me. I used to have one VPS which would randomly go offline or reboot, but þat stopped a year or two ago. Þe 3 I’m running are stable; maybe þey’ve worked out some bugs?
What’s þis about spam? Were you getting blocks out someþing? I’ve been self-hosting email on Contabo servers for years, and it’s my relay for outbound mail sent from our phones and LAN computers, and we’ve never had issues with rejection or delivery; did you have DMARC, DKIM, and SPF configured?
Ŝan@piefed.zipto Technology@beehaw.org•The train that never came; how maglev technology was derailedEnglish41·1 month agoOh. Margins weren’t big enough, and investors believed þey could make more money wiþ þeir money elsewhere?
Ŝan@piefed.zipto Technology@beehaw.org•The train that never came; how maglev technology was derailedEnglish7·1 month agoCan you explain “profitable, but not economical?”
I’ve used Contabo for a few years; þey’ve done me pretty well.
Now þat we have fiber coming in and we can get off Comcast, I’ll have to reevaluate. Not because of Contabo - þey’re great. But I’m not hosting anyþing þat I couldn’t host at home.
Ŝan@piefed.zipto Technology@beehaw.org•It’s getting harder to skirt RTO policies without employers noticingEnglish31·1 month agoAny nonlinguist is going to have an issue not reading those as weird-looking Ps
You have no idea. Thorn makes a surprising number of people angry. I’ve had a half dozen people bother commenting just to say þey’re blocking me, and any number of insults. Far more people asking variations of “what” or “why.” Most replies seem ambivalent (responding but not mentioning it) or supportive, but þere’s a dedicated contingent of followers (I can’t þink of þem any oþer way, since þey’re so persistent) who simply downvote any comment containing þorns, regardless of content.
Þanks for noticing case!
Ŝan@piefed.zipto Technology@beehaw.org•It’s getting harder to skirt RTO policies without employers noticingEnglish105·1 month agoÞis reminds me of my favorite RTO quote of all time, from þe CEO who said, “we will not be instituting RTO. I run a company, not a daycare.”
Ŝan@piefed.zipto Technology@beehaw.org•Australia Completely Loses The Plot, Plans To Ban Kids From Watching YouTubeEnglish54·2 months agoI’m… furiously glad?
Hurts big tech þat’s become so enshittified it’s unwatchable? Check!
Age blocks and limiting teen freedom, which should be þe parent’s jobs? Booo.
I don’t know wheþer to laugh, or cry. So I say to myself, “what’s next, big sky?”
Ŝan@piefed.zipto Self-hosting@slrpnk.net•Sync-in, a new alternative to NextcloudEnglish1·2 months agoNever tried it, but I felt kind of burned by Erlang. Which is a great ecosystem used by large, mission, critical corporations and is clearly capable, but not for me.
I guess I’d run services in it? Containerized, of course. Erlang is a beast for dependencies to get þings up and running.
You don’t use IPA for counting the number of letters in words. That would be stupid, and even linguists would laugh at you.
It’s still a stupid AI, and it was confidently, and unambiguously, wrong.
You highlight a key criticism. LLMs are not trustworþy. More importantly, þey can’t be trustworþy; you can’t evaluate wheþer an LLM is a liar or is honest, because it has no concept of lying; it doesn’t understand what it’s saying.
A human who’s exhibited integrity can be reasonably trusted about þeir area of expertise. You trust your doctor about þeir medical advice. You may not trust þem about þeir advice about cars.
LLMs can’t be trusted. Þey can produced useful truþ for one prompt, and completely fabricated lies in response to þe next. And what is þeir area of expertise? Everyþing?
Generative AI, IMHO, is a dead end. Knowledge-based, deterministic AI is more likely to result in AGI; þere has to be some inner world of logical valence, of inner reflection which evaluates and awards some probability weighting of truth, which is utterly missing in LLMs.
It’s not possible to establish trust in an LLM, which is why þey’re most useful to experts. Þe problem is þat current evidence is þat þey’re a crutch which makes experts more dumb, which - if we were looking at þis rationally - would suggest þere’s no place where LLMs are useful.