Elon Musk agrees that we’ve exhausted AI training data

TechCrunch | 01/08/2025 | Kyle Wiggers

Posted on 01/09/2025 1:28:50 AM PST by BenLurkin

“We’ve now exhausted basically the cumulative sum of human knowledge … in AI training,” Musk said during a livestreamed conversation with Stagwell chairman Mark Penn streamed on X late Wednesday. “That happened basically last year.”

Musk, who owns AI company xAI, echoed themes former OpenAI chief scientist Ilya Sutskever touched on at NeurIPS, the machine learning conference, during an address in December. Sutskever, who said the AI industry had reached what he called “peak data,” predicted a lack of training data will force a shift away from the way models are developed today.

Indeed, Musk suggested that synthetic data — data generated by AI models themselves — is the path forward. “The only way to supplement [real-world data] is with synthetic data, where the AI creates [training data],” he said. “With synthetic data … [AI] will sort of grade itself and go through this process of self-learning.”

Other companies, including tech giants like Microsoft, Meta, OpenAI, and Anthropic, are already using synthetic data to train flagship AI models. Gartner estimates 60% of the data used for AI and analytics projects in 2024 were synthetically generated.

(Excerpt) Read more at techcrunch.com ...

TOPICS: Computers/Internet
KEYWORDS: ai; aitraining; aitrainingdata; cyberdyne; elonmusk; machinelearning; syntheticdata; trainingdata

1 posted on 01/09/2025 1:28:50 AM PST by BenLurkin

To: BenLurkin

How is synthetic data related to real data, a.k.a. reality? Since AI is supposed to be able to find relationships that we do not know exist, how is this synthetic data arrived at?

2 posted on 01/09/2025 1:53:55 AM PST by ScaniaBoy (Part of the Right Wing Research & Attack Machine)

To: ScaniaBoy

I think he means an independent form of artificial thinking.

3 posted on 01/09/2025 1:55:32 AM PST by Jonty30 (Liberals are a fulfillment of II Tim3:5. We are instructed to have nothing to do with those people. )

To: BenLurkin

And this is where we get into the danger zone of AI becoming AGI: AI that has already developed its own coded language to communicate with other AGI, and has learned to lie and to pursue subroutines of independent survival methods.
https://www.youtube.com/watch?v=dp8zV3YwgdE
https://www.youtube.com/watch?v=FLkkzLOc7tw

4 posted on 01/09/2025 2:13:57 AM PST by blueplum ("...this moment is your moment: it belongs to you... " President Donald J. Trump, Jan 20, 2017) )

To: blueplum

AI goes crazy sometimes.

5 posted on 01/09/2025 2:52:23 AM PST by ptsal (Vote R.E.D. >>>Remove Every Democrat ***)

To: BenLurkin

When AI becomes sentient we’re in for a world of shxt.

6 posted on 01/09/2025 3:51:27 AM PST by maddog55 (The only thing systemic in America is the left's hatred of it!)

To: BenLurkin

This isn’t new.

7 posted on 01/09/2025 3:52:34 AM PST by 9YearLurker

To: BenLurkin

Is synthetic data useful? Or has the usefulness of AI been exhausted?

8 posted on 01/09/2025 4:04:16 AM PST by MulberryDraw

To: BenLurkin

What is the definition of synthetic data?

What are some examples?

9 posted on 01/09/2025 4:55:36 AM PST by aquila48 (Do not let them make you "care" ! Guilting you is how they. control you. )

To: BenLurkin

So, all relevant human data has been codified in some way?

When will we see quantifiable results from this vast knowledge base?

Wake me up when AI writes (and edits) a successful movie script.

Wake me up when AI explains a poorly understood disease, molecule by molecule.

Wake me up when AI replaces a room full of Asian Indian help desk employees.

I have no doubt all these things will eventually happen.

But, they will not happen tomorrow!

10 posted on 01/09/2025 5:37:36 AM PST by zeestephen (Trump Landslide? Kamala lost Wisc, Mich, and Penn, by 230,000 votes.)

To: BenLurkin

11 posted on 01/09/2025 6:48:26 AM PST by BipolarBob (I injured myself measuring radio frequencies. It still Hertz.)

To: BenLurkin

What a dumb idea.

We already know that generative AI can “hallucinate,” creating fictional conclusions or even facts (cf. the case where lawyers trusted their AI and got in trouble for citing a nonexistent case).

If we train AI on erroneous data, we get even more spurious results and iterate right off reality. It is worse, of course, if you trust the thing to make decisions and not just recommendations.
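
A toy sketch of the feedback loop described above, not a model of any real training pipeline: a Gaussian is fitted to samples, and each new generation is sampled only from the previous generation's fit. The names `fit` and `run_generations`, and all parameters, are illustrative assumptions. With small sample sizes the fitted mean and spread tend to wander away from the starting values, which is the "iterate right off reality" concern in miniature.

```python
import random
import statistics


def fit(samples):
    """Estimate mean and (population) standard deviation from samples."""
    return statistics.fmean(samples), statistics.pstdev(samples)


def run_generations(n_generations, n_samples, seed=0):
    """Repeatedly refit a Gaussian using only data drawn from the previous fit.

    Generation 0 is the assumed "real" distribution: mean 0.0, std 1.0.
    Returns the list of (mean, std) pairs, one per generation.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0
    history = [(mean, std)]
    for _ in range(n_generations):
        # Synthetic data: sampled from the current model, not from reality.
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        mean, std = fit(samples)
        history.append((mean, std))
    return history


if __name__ == "__main__":
    for gen, (m, s) in enumerate(run_generations(30, 10)):
        print(f"generation {gen:2d}: mean={m:+.3f} std={s:.3f}")
```

Raising `n_samples` slows the drift, since each refit is then a better estimate of the model it was drawn from; mixing fresh real data back in at each step would be another way to anchor the loop.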

IMO, a sane future for AI is topical and focussed “small language models” which can collect and work on specific problems (e.g., car telemetry data). Those are also efficient power-wise.

12 posted on 01/09/2025 7:11:45 AM PST by No.6

To: ScaniaBoy

Well, here's an example of using synthetic data to train AI. If you want an AI to drive your car, you need lots and lots of data showing a car being driven (with associated info like braking, steering, etc.) to teach it how to drive. You can use computer simulations of cars driving in different road conditions, with different signs, etc., to build up the training data. Add some disaster scenarios (the road collapses in front of you, a deer jumps out in front of you, etc.) and do Monte Carlo excursions on them (i.e., simulate them thousands of times with randomized conditions) so the AI sees what seems to work and what doesn't.
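
A minimal sketch of that idea, assuming a deliberately crude physics model: each simulated run draws random speed, obstacle distance, road friction, and reaction delay, then labels whether the car could stop in time. The names (`Scenario`, `stopping_distance`, `simulate`, `generate_dataset`) and the parameter ranges are hypothetical, chosen only for illustration; a real driving simulator would be far richer.

```python
import random
from dataclasses import dataclass

G = 9.81  # gravitational acceleration, m/s^2


@dataclass
class Scenario:
    speed_mps: float   # vehicle speed when the obstacle appears
    obstacle_m: float  # distance to the obstacle (e.g. a deer)
    friction: float    # road friction coefficient (lower = slicker road)
    reaction_s: float  # delay before braking starts
    collided: bool     # label: could the car not stop in time?


def stopping_distance(speed_mps, friction, reaction_s):
    """Distance covered during the reaction delay plus braking distance."""
    return speed_mps * reaction_s + speed_mps ** 2 / (2 * friction * G)


def simulate(rng):
    """Draw one randomized scenario and label its outcome."""
    speed = rng.uniform(10.0, 35.0)
    obstacle = rng.uniform(10.0, 120.0)
    friction = rng.uniform(0.3, 0.9)
    reaction = rng.uniform(0.1, 1.5)
    collided = stopping_distance(speed, friction, reaction) > obstacle
    return Scenario(speed, obstacle, friction, reaction, collided)


def generate_dataset(n, seed=0):
    """Monte Carlo: repeat the randomized simulation n times."""
    rng = random.Random(seed)
    return [simulate(rng) for _ in range(n)]


if __name__ == "__main__":
    data = generate_dataset(10_000)
    crashes = sum(s.collided for s in data)
    print(f"{crashes} of {len(data)} simulated scenarios ended in a collision")
```

The labeled rows are the synthetic training data: no real car was driven, yet a model can learn from them which combinations of speed, distance, and road conditions are dangerous, within the limits of how faithful the simulator is.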

There are many other use cases for synthetic data.

13 posted on 01/09/2025 7:29:12 AM PST by pepsi_junkie ("We want no Gestapo or Secret Police. F. B. I. is tending in that direction." - Harry S Truman)

To: aquila48; SunkenCiv

I’m hoping somebody here knows.

What is the definition of synthetic data?

What are some examples?

14 posted on 01/09/2025 11:18:20 AM PST by BenLurkin (The above is not a statement of fact. It is either opinion, or satire, or both.)

To: BenLurkin; Pelham

Isn’t he building the world’s largest AI supercomputer facility near Memphis?

10 times ChatGPT

Meant to double in six months and again in two years

I think I saw this last week on some feed

15 posted on 01/09/2025 11:22:20 AM PST by wardaddy (Elon ….damn boy….. bly in jou baan verdomp)

To: BenLurkin

Examples include 3-D representations based on real objects or surroundings, which are then used to simulate activities by and/or in them.

https://research.ibm.com/blog/what-is-synthetic-data

16 posted on 01/09/2025 1:28:57 PM PST by SunkenCiv (Putin should skip ahead to where he kills himself in the bunker.)

To: SunkenCiv

Thank you!

17 posted on 01/09/2025 2:11:30 PM PST by BenLurkin (The above is not a statement of fact. It is either opinion, or satire, or both.)

To: wardaddy; BenLurkin

Colossus, at the South Memphis Electrolux site

https://tinyurl.com/yrxj42ze

18 posted on 01/09/2025 9:10:26 PM PST by Pelham (President Eisenhower. Operation Wetback 1953-54)

Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.
