@infinitepcg

infinitepcg@lemmy.world · 3 months ago

This looks like an embarrassing mistake. If someone were to try to “tank” Twitter, it wouldn’t really make sense to do this on purpose.

infinitepcg@lemmy.world · edit-2 3 months ago

If the live version is already broken, there isn’t much to lose by deploying the fix as soon as possible. Not sure what else they could have done here.

infinitepcg@lemmy.world · 4 months ago

The article just says that the account is suspended. There is no official statement from Twitter and no indication that they suspended the account on purpose. The most likely reason is that the account was mass reported by trolls and got suspended automatically.

infinitepcg@lemmy.world · 5 months ago

I think it’s reasonable to not short stocks. I just find it a bit weird to see people confidently proclaim that a company is overvalued, but then not short the stock, which would be the rational thing to do.

infinitepcg@lemmy.world · 5 months ago

It’s hard to tell how much a platform is worth. Arguably the value of Twitter was 44B, since someone was willing to pay that.

The good news is, if you’re really certain that Reddit is overvalued, you’ll soon be able to short it and get rich if you end up being right!

infinitepcg@lemmy.world · 5 months ago

I don’t think the number of bots matters much; there are many more real people on Twitter than on Mastodon. It’s not an issue for Twitter because they already are the platform where everyone else is. I’m optimistic about Mastodon: it already has the better UX and the better business model, and I think it will slowly attract more users over time and eventually reach the relevance that Twitter had at its peak.

infinitepcg@lemmy.world · 5 months ago

The difficult thing is gaining users, not writing the code.

infinitepcg@lemmy.world · edit-2 5 months ago

I’ve been on Mastodon for over a year and I never experienced anything that could be classified as a technical glitch. From a tech / UI perspective it feels very polished to me.

I guess the only exception would be that old posts are sometimes missing on profiles from different servers.

infinitepcg@lemmy.world · edit-2 7 months ago

This article is full of errors!

At its core, an LLM is a big (“large”) list of phrases and sentences

Definitely not! An LLM is the combination of an architecture and its model parameters. It’s just a bunch of numbers, no list of sentences, no database. (Seems like the author confused the word “LLM” with the dataset of the LLM???)

an LLM is a storage space (“database”) containing as many sample documents as possible

Nope. This applies to the dataset, not the model. I guess you can argue that memorization happens sometimes, so it might have some features of a database. But it isn’t one.

Additional data (like the topic, mood, tone, source, or any number of other ways to categorize the documents) can be provided

LLMs are trained in an unsupervised fashion. Just sequences of tokens, no labels.
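To make the "no labels" point concrete, here is a minimal sketch of how training examples can be derived from nothing but a raw token sequence (this setup is often called self-supervised). The function name `next_token_pairs`, the `context_size` parameter, and the token IDs are made up for illustration; real training pipelines are more involved.

```python
def next_token_pairs(token_ids, context_size):
    """Build (context, target) pairs from a raw token sequence.

    The "label" for each position is simply the next token in the text,
    so no human annotation (topic, mood, tone, ...) is needed.
    """
    pairs = []
    for i in range(1, len(token_ids)):
        # Use up to `context_size` preceding tokens as the input.
        context = token_ids[max(0, i - context_size):i]
        pairs.append((context, token_ids[i]))
    return pairs


# Hypothetical token IDs standing in for a tokenized sentence.
tokens = [12, 7, 99, 3]
pairs = next_token_pairs(tokens, context_size=2)
for context, target in pairs:
    print(context, "->", target)
```

Every target here comes straight out of the sequence itself, which is why the pretraining data doesn't need to be categorized.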

Typically, an LLM will cover a single context, e.g. only social media

I’m not aware of any LLM that does this. What’s the “context” of GPT-4?

software developers have gone to great lengths to collect an unfathomable number of sample texts and meticulously categorize those samples in as many ways as possible

The closest real thing is the RLHF process that is used to fine-tune an existing LLM for a specific application (like ChatGPT). The dataset for the LLM is not annotated or categorized in any way.

a GPT uses the words and proximity data stored in LLMs

This is confusing. “GPT” is the architecture of the LLM.

it is impossible for it to create something never seen before

This isn’t accurate. Depending on the temperature setting, an LLM can output literally any word at any time with a non-zero probability. It can absolutely produce things it hasn’t seen.
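A rough sketch of what the temperature does, assuming plain softmax sampling over the model's output scores (no top-k or top-p truncation, which would cut some tokens to exactly zero). The vocabulary and logits below are invented for illustration and don't come from any real model.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(vocab, logits, temperature, rng=random):
    """Draw one token according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Hypothetical three-word vocabulary with made-up scores.
vocab = ["cat", "dog", "xylophone"]
logits = [5.0, 4.0, -3.0]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [f"{p:.6f}" for p in probs])

print(sample_token(vocab, logits, temperature=2.0))
```

Even the very unlikely "xylophone" keeps a probability above zero, and raising the temperature makes it more likely to be picked.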

Also I think it’s too simple to just assert that LLMs are not intelligent. It mostly depends on your definition of intelligence and there are lots of philosophical discussions to be had (see also the AI effect).

infinitepcg@lemmy.world · 7 months ago

I’m surprised that a Kreistag (district council) can decide on the validity at all. Does that mean that all the other district councils opted for the ticket? I thought the point of a nationwide ticket was that it’s valid everywhere and you don’t have to research beforehand whether a specific line is covered or not.

infinitepcg@lemmy.world · 7 months ago

Well, it’s not particularly surprising that the CDU and FDP vote against it. Another comment here says that the vote was decided 13:14. If that’s true, the two Greens would indeed have significantly influenced the result.

Is there actually a legitimate reason why representatives wouldn’t show up to a vote? I thought that voting in the district council was a central duty of the representatives.

infinitepcg@lemmy.world · 7 months ago

Based on your post history, you probably know how to do it ;)

Just for fun, I pasted your request into ChatGPT and it did indeed produce a function that passes the tests, I’m impressed.

infinitepcg@lemmy.world · 8 months ago

Well, the Grundgesetz (Basic Law) contains rights that you have vis-à-vis the state; it’s not something you “abide by”.

infinitepcg@lemmy.world · edit-2 8 months ago

Whether something is derivative or not is one of the key questions used to determine whether the free use of someone else’s copyrighted work is fair, as in fair use.

I think training an AI model is not fair use. It’s either derivative work and needs a license or it’s not derivative work and can be used without a license. In both cases it’s not fair use (in the legal sense of “fair use”).

I’m not sure if you’re making an argument about what the law currently says or what it should say. In my opinion the law should be updated to clarify if you need a license to use copyrighted material as training data.

The amount that artists would be paid would be determined by negotiation between the artist (the rights holder) and the entity using their work

Sure, my point is such an agreement will never be made. It’s a good deal for AI companies to use the data for free, but if they can’t do that, they will not be interested.

Either way, I think there is no way for artists to win this. It’s completely possible to train large image generators without copyrighted material. These datasets are so large that paying artists per image will never be feasible.

infinitepcg@lemmy.world · 8 months ago

The problem is that it’s a fraction of a fraction of a cent per image used during training, over the lifetime of the model.
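A back-of-the-envelope sketch of that arithmetic. Both numbers below are purely hypothetical placeholders, not figures from any real company or dataset; plug in your own.

```python
def payout_per_image(total_budget_usd, num_images):
    """Split a fixed licensing budget evenly across all training images."""
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    return total_budget_usd / num_images


# Hypothetical inputs, chosen only to illustrate the order of magnitude.
budget_usd = 1_000_000        # assumed total licensing budget
num_images = 2_000_000_000    # assumed number of images in the training set

per_image = payout_per_image(budget_usd, num_images)
print(f"{per_image * 100:.4f} cents per image")
```

An artist with a few hundred images in such a dataset would still end up with pocket change under these assumptions.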

infinitepcg@lemmy.world · edit-2 8 months ago

If you look at a hundred paintings of faces and then make your own painting of a face, you’re not expected to pay all the artists that you used to get an understanding of what a face looks like.

Even if AI companies were to pay the artists and had billions of dollars to do it, each individual artist would receive a tiny amount, because these datasets are so large.

Much more realistically, they would just retrain their models using data they can use for free.

Btw, I don’t think this is a fair use question, it’s really a question of whether the generated images are derivatives of the training data.

infinitepcg@lemmy.world · 9 months ago

A debit card should be sufficient, and it seems that 72% of the people in Saudi Arabia have a debit card, probably even more among those who would use social media.

infinitepcg@lemmy.world · edit-2 9 months ago

That exists in England too. You have to wait twice as long, but it applies to all trains.