Impertinent

All Things AI

Big Data, Small Prompt

AGI has brought out the worst in us. We want to take the lazy road to analytics.

Bill French
May 16, 2023


“I should be able to load up all my time-series data and use natural language to ask it statistical questions and other analyses.”

I hear this often. It’s a nice dream. However, stating what you “should” be able to do is only a hypothesis, and a hypothesis must also account for practical boundaries.

Time-series data is typically raw and voluminous. That is intentional: IoT signals are collected so that real-time perturbations can be detected and corrected. This is especially important for mission-critical processes, where even a few missed events can indicate a big problem. Time-series data is also valuable for machine learning. The idyllic goal described above is neither real-time monitoring nor machine learning - it’s the lazy pathway to analytics. And I’m okay with that - I love a good lazy approach - it’s how great innovations are made.

IN THIS CASE, the AI “fit” is out of reach (based on my skill set, known approaches, and financial practicalities). Practically speaking, this is a round hole and giant earth mover problem. Putting a mega earth mover in a small hole has one challenge - physics.

AI interfaces (UIs and APIs) are presently limited; they’re tiny holes.

There are indications we’ll soon see 100k-token prompt capabilities. But that’s nothing compared to the volume of extremely granular time-series data - at least the volume needed to produce valid assessments.

Anthropic’s Claude is capable of 100,000-token context windows. But analyzing 100,000 tokens of text differs greatly from analyzing a comparably sized time-series data stream.
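To put rough numbers on that gap, here is a back-of-the-envelope sketch. Only the 100,000-token window comes from the text above; the tokens-per-reading figure, the sensor count, and the sample rate are assumptions chosen purely for illustration, so treat the result as an order-of-magnitude guess.

```python
# Rough capacity estimate: how much raw time-series fits in one context window.
# TOKENS_PER_READING is an assumed figure, not a measured one.

CONTEXT_TOKENS = 100_000   # context window size cited in the text
TOKENS_PER_READING = 15    # assumption: timestamp + sensor id + value as text


def readings_that_fit(context_tokens: int, tokens_per_reading: int) -> int:
    """Whole readings that fit in the window, ignoring the question itself."""
    return context_tokens // tokens_per_reading


def seconds_covered(readings: int, sensors: int, hz: float) -> float:
    """Wall-clock seconds of data covered by `readings` samples across all sensors."""
    return readings / (sensors * hz)


fit = readings_that_fit(CONTEXT_TOKENS, TOKENS_PER_READING)
# Hypothetical plant: 50 sensors, each sampled once per second.
span = seconds_covered(fit, sensors=50, hz=1.0)
print(f"{fit} readings fit, covering about {span:.0f} seconds of data")
```

Under these assumptions the whole window holds a little over two minutes of raw signal, which is far too short a slice for most statistical questions.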

It’s also challenging to take a slice of the series and expect your analytics to be valid; the entire point of analytics is to factor in lots of data. As such, the only rational pathway I can see is to aggregate extremely detailed data sets, then expose the aggregated summaries to the AI model in discrete learner prompts.
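One way to picture the aggregate-then-prompt idea is the minimal sketch below. It is not the author's implementation: the hourly window, the choice of summary statistics, the CSV layout, the 24-rows-per-prompt limit, and the function names `aggregate` and `build_prompts` are all assumptions, and the readings are synthetic.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate(readings, window_seconds=3600):
    """Group (epoch_seconds, value) pairs into fixed windows and summarize each."""
    buckets = defaultdict(list)
    for ts, value in readings:
        # Floor each timestamp to the start of its window.
        buckets[ts // window_seconds * window_seconds].append(value)
    return [
        {
            "window_start": start,
            "count": len(vals),
            "min": min(vals),
            "max": max(vals),
            "mean": round(mean(vals), 3),
            "stdev": round(pstdev(vals), 3),
        }
        for start, vals in sorted(buckets.items())
    ]


def build_prompts(summaries, rows_per_prompt=24):
    """Render summaries as CSV text, split into prompts of at most rows_per_prompt rows."""
    columns = ["window_start", "count", "min", "max", "mean", "stdev"]
    header = ",".join(columns)
    prompts = []
    for i in range(0, len(summaries), rows_per_prompt):
        rows = [
            ",".join(str(s[c]) for c in columns)
            for s in summaries[i:i + rows_per_prompt]
        ]
        prompts.append("Sensor summaries per window:\n" + header + "\n" + "\n".join(rows))
    return prompts


# Synthetic data for illustration only: one reading per minute for two days.
readings = [(t * 60, 20.0 + (t % 60) / 10) for t in range(2 * 24 * 60)]
summaries = aggregate(readings)
prompts = build_prompts(summaries)
print(len(readings), "readings ->", len(summaries), "summaries ->", len(prompts), "prompts")
```

Each prompt stays small and self-contained, so the model sees statistics computed over the full series rather than an arbitrary slice of raw readings. The trade-off is that anything the summaries leave out, such as a short spike inside a window, is invisible to the model.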

Like this:

© 2025 Bill French