
What 18,219 Speaking Sessions Taught Us About How People Actually Speak

William Burden, Founder @ Elqo
7 min read

One of the strange privileges of building a speaking-practice product is that, after a while, you stop having to guess what people actually struggle with. You can just look.

This week we analysed the recordings our users completed on Elqo from June 2025 through April 2026: 18,219 recordings from 3,221 users. All of it was anonymised, aggregated, and analysed against the same set of objective signals and against our AI coach's own log of recurring challenges per user.

Three things stood out, and they didn't quite line up with what people say they need help with when they sign up. Here's what the data showed, and what we're doing about it.

The Headline: Three Habits, Not Twenty

It would have been easy to assume the most common issues are dramatic ones — nerves, freezing up, accent, vocabulary. They're not. Across 18,219 sessions and 3,221 users, the things people actually trip on are simpler and more stubborn:

  1. Pacing — almost half of all sessions land outside the natural speaking band, and "too slow" outweighs "too fast" nearly four to one.
  2. Filler words — the cleanest signal in the dataset, confirmed both by per-session counts and by the AI coach's own user-level memory.
  3. Visual delivery — gestures, eye contact, and posture cluster together as a tight band of recurring challenges, even though most users never list them as their reason for signing up.

Everything else — pronunciation, accuracy, vocabulary, fluency — sits well behind these three. Whatever else "speaking better" means, in practice it almost always means fixing one of these three first.

Finding 1: Pacing Is The Most Under-Recognised Issue

Pacing is tracked on every Elqo session via words per minute, and it's been measured since launch — so we have deep coverage on it (10,805 recordings with a valid WPM reading). The distribution looked like this:

Pacing band           Sessions   Average WPM
Ideal (100–160 WPM)      5,635         124.5
Too slow (<100 WPM)      3,960          72.8
Too fast (>160 WPM)      1,052         305.9

That's 47% of sessions outside the natural conversational band, and the imbalance is striking: too-slow sessions outnumber too-fast ones by nearly 4:1. The average too-slow session sits at 72.8 WPM — well below the 100 WPM floor where listeners start drifting.
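For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes both figures from the table counts. The `pacing_band` helper is hypothetical, not Elqo's actual code, and it assumes that readings of exactly 100 or 160 WPM count as ideal.

```python
# Session counts per pacing band, copied from the table above.
BAND_COUNTS = {"ideal": 5_635, "too_slow": 3_960, "too_fast": 1_052}


def pacing_band(wpm: float) -> str:
    """Classify a words-per-minute reading (hypothetical helper).

    Assumption: the band edges, 100 and 160 WPM, count as ideal.
    """
    if wpm < 100:
        return "too_slow"
    if wpm > 160:
        return "too_fast"
    return "ideal"


total = sum(BAND_COUNTS.values())
off_band = BAND_COUNTS["too_slow"] + BAND_COUNTS["too_fast"]
share_off_band = off_band / total
slow_to_fast = BAND_COUNTS["too_slow"] / BAND_COUNTS["too_fast"]

print(f"sessions in table: {total}")
print(f"outside ideal band: {share_off_band:.1%}")
print(f"too slow : too fast = {slow_to_fast:.2f} : 1")
```

One thing the sketch surfaces: the three bands in the table sum to 10,647 sessions, slightly fewer than the 10,805 quoted earlier, and the 47% figure corresponds to the table total.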

The most interesting part isn't the raw number. It's the gap between how often pacing is off and how often the AI coach flags it as a recurring user-level issue: pacing was logged as a recurring challenge for only 13.5% of users with any tagged challenges, even though 47% of sessions were actually off-target.

That's a real signal. Pacing is the most prevalent, most under-recognised issue in the dataset. Most people don't realise they're doing it — and even our own coaching memory was being too quiet about it.

Finding 2: Filler Words Are The Cleanest Signal We Have

If pacing is the issue people don't realise they have, filler words are the issue everyone agrees on. Two completely different views of the data say the same thing:

  • Per-session view: 34% of sessions contain 3 or more filler words. The all-up average is 2.49 fillers per session.
  • Per-user view (AI memory): 38.6% of users with any tagged challenges have "filler words" listed as a recurring issue — the #1 real challenge in the AI's own log.

Two independent measurements within ~4 percentage points of each other. That's about as clean as user-behaviour signals get. Filler words are the single most defensible "most common speaking issue" claim we can make from this data, and they're squarely in the "noticeable but fixable" range — not so frequent that they signal a deep problem, frequent enough that listeners hear them.

Distribution of fillers per session:

  • 0 fillers: 3,318 sessions
  • 1–2 fillers: 3,790 sessions
  • 3–5 fillers: 2,599 sessions
  • 6–10 fillers: 973 sessions
  • 11+ fillers: 125 sessions

Most people are not catastrophically filler-heavy. But about a third of sessions cross the threshold where listeners start tagging the speaker as hesitant or unprepared, even unconsciously.
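The "about a third" figure can be reproduced from the bucket counts listed above. A small Python sketch, with bucket labels chosen here purely for illustration:

```python
# Sessions per filler-count bucket, copied from the list above.
FILLER_BUCKETS = {
    "0": 3_318,
    "1-2": 3_790,
    "3-5": 2_599,
    "6-10": 973,
    "11+": 125,
}

# Buckets at or above the 3-filler threshold used in the text.
THREE_OR_MORE = ("3-5", "6-10", "11+")

total_sessions = sum(FILLER_BUCKETS.values())
heavy_sessions = sum(FILLER_BUCKETS[bucket] for bucket in THREE_OR_MORE)
share_three_plus = heavy_sessions / total_sessions

print(f"sessions with a filler count: {total_sessions}")
print(f"3+ fillers: {heavy_sessions} ({share_three_plus:.1%})")
```

The buckets sum to 10,805 sessions, and the share with three or more fillers comes out at roughly 34.2%, about 4.4 percentage points below the 38.6% per-user figure.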

Finding 3: Visual Delivery Is Bigger Than Most Users Realise

The third surprise in the data was how often the AI coach's user-level memory ended up tagging visual challenges — gestures, eye contact, posture — as recurring issues, even though almost nobody signs up to Elqo specifically saying "I want to fix my body language."

Among users with any tagged recurring challenges:

Recurring challenge   % of tagged users
Filler words                      38.6%
Visual presence                   36.1%
Gestures                          26.4%
Eye contact                       25.0%
Posture                           22.1%
Speaking pace                     13.5%

Four of the six recurring challenges in this table are visual: visual presence, gestures, eye contact, and posture. Stack them together and visual delivery is, behaviourally, every bit as common a bottleneck as anything verbal. Yet it almost never makes it into the goal users describe at signup: people come for "I want to sound more confident" and leave with feedback on the fact that their hands haven't moved in 90 seconds.

This matters because visual signals carry disproportionate weight in how listeners judge competence. A speaker with neutral verbal delivery and strong visual presence reads as confident; the same speaker with weak visual delivery reads as nervous, even when the words are identical.

Find Your Own Pattern In Under 10 Minutes

Run two short Elqo sessions and you'll see exactly where your version of the dataset above lives — pacing band, filler-word density, eye contact, gestures, posture. Free to start. Works in any browser.


What We're Changing In The Product

Pulling 18,219 sessions of data is only useful if it changes what we ship. Three things are moving in response to this analysis:

1. Stronger pacing rule and a more visible WPM band

Pacing is the most prevalent issue in the dataset and the most under-flagged by the coach. We're tightening the recurring-challenge logic so pacing gets surfaced when sessions trend slow, and the WPM-band feedback is moving higher up in the post-session view so users actually see it. If "too slow" is your pattern, you should know after one session, not after we've watched you for a month.
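To make "surfaced when sessions trend slow" concrete, here is one way such a rule could look. This is an illustrative sketch only, not the logic Elqo ships: the function name, the `min_sessions` and `slow_share` parameters, and their default values are all assumptions. The 100 WPM cut-off is the lower edge of the ideal band quoted earlier.

```python
from typing import Sequence

SLOW_WPM = 100  # lower edge of the ideal band quoted in the post


def trending_slow(
    recent_wpm: Sequence[float],
    min_sessions: int = 1,    # assumption: one session is enough to flag
    slow_share: float = 0.5,  # assumption: illustrative threshold
) -> bool:
    """Return True when enough recent sessions fall below SLOW_WPM."""
    if len(recent_wpm) < max(min_sessions, 1):
        return False  # not enough history to say anything
    slow = sum(1 for wpm in recent_wpm if wpm < SLOW_WPM)
    return slow / len(recent_wpm) >= slow_share


print(trending_slow([72.8]))                # a single slow session
print(trending_slow([72.8, 95.0, 130.0]))   # two of three sessions slow
print(trending_slow([124.5, 140.0, 98.0]))  # one of three sessions slow
```

With these defaults a single slow session is enough to raise the flag, which matches the goal of telling users after one session. Raising `min_sessions` or `slow_share` would trade that speed for fewer false alarms.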

2. Filler-word feedback gets the headline treatment

Two independent views of the data agree this is the cleanest, most universal speaking issue across our user base. We're treating it that way: filler-word counts, density, and the specific words you over-use are getting featured more prominently in the session summary, and we're building a dedicated "filler word reduction" lesson track based on the techniques in our 7 techniques guide.

3. Per-session visual scores

Visual challenges show up everywhere in the AI's user-level memory but don't have a clean per-session number on the recording itself yet. That's the gap closest to the top of the build queue: per-session scores for eye contact, gestures, and posture, sitting alongside pace and filler words in the session summary — so visual delivery becomes something you can track, not just something the coach occasionally mentions.

What You Can Do Tonight

If you don't want to wait for any of the above to ship, the data already points at what's worth practising. Two specific drills, drawn straight from the patterns in the 18,219 sessions:

  1. Record yourself for 90 seconds answering "tell me about a project you're proud of." Then play it back at normal speed. If you can finish the sentence in your head before your recording does, you're probably in the too-slow band, which is where roughly four in five of the off-pace sessions in our data sit. Try the same prompt again, deliberately faster, and notice how much more confident you sound.
  2. Replace one filler word with a pause. Just one. Pick the one you use most (in our data, "um" and "like" dominate) and consciously close your mouth instead. The first three or four pauses will feel eternal. They're not. To listeners, they're invisible. To you, they're how the habit eventually fades.

Both drills work without any tool. Both work better with one. Practising alone with feedback is what closes the gap between "I know what I should do" and "I actually do it" — which, after looking at 18,219 sessions of the gap, is a thing we feel more strongly about than ever.
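If you would rather put a number on both drills yourself, a rough Python sketch like the one below works on any transcript of your recording. It is not how Elqo measures anything: the function names and the filler list are assumptions ("um" and "like" come from our data, "uh" is added for illustration), the word matching is English-only, and the count is naive, so a legitimate "like" is counted as a filler too.

```python
import re

# Assumption: a small illustrative filler list, not Elqo's actual one.
FILLERS = {"um", "uh", "like"}

WORD_PATTERN = re.compile(r"[A-Za-z']+")


def words_per_minute(transcript: str, duration_seconds: float) -> float:
    """Words spoken divided by elapsed minutes."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    words = WORD_PATTERN.findall(transcript)
    return len(words) * 60 / duration_seconds


def count_fillers(transcript: str) -> int:
    """Count filler tokens; naive, so every 'like' is counted."""
    words = WORD_PATTERN.findall(transcript.lower())
    return sum(1 for word in words if word in FILLERS)


sample = "Um, I led a project that, like, cut our build time in half."
print(round(words_per_minute(sample, 6.0)))
print(count_fillers(sample))
```

Compare the pace you get against the 100–160 WPM band, then rerun it after the pause drill and see whether the filler count drops.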

The Bottom Line

The most common speaking issues are not the dramatic ones. After analysing 18,219 recordings, the picture is consistent: pace, fillers, and visual delivery. Energy, fluency, and accuracy show up too, but in narrower, more situational ways, while these three appear nearly everywhere.

If you only have time to work on one thing, work on pacing — it's the most prevalent issue and the easiest to ignore until someone shows you a number. If you have time for two, add filler words. If you have time for three, record on video, not just audio, and watch what your body is doing.

That's what 18,219 sessions said. And it's what's now driving where the product goes next.

See Where You Sit In The Data

Elqo gives you instant AI feedback on pace, filler words, eye contact, gestures, and posture — in a single session. Free to start, no credit card, works in any browser.
