BeeTheory · Foundations · Technical Note X

Anatomy of the Residuals:
A Linear Trend with Disk Size

The 94-galaxy blind test of Note IX showed a systematic residual trend with disk size. This note characterises that trend quantitatively, isolates the largest deviations on each side, and identifies the structural origin of the dispersion.

1. The result first

A linear residual, two opposite populations

The prediction error scales linearly with the disk scale length: $\text{error}\,(\%) \approx -31.7 + 12.8\,R_d$, with Pearson correlation $r = +0.75$. The line crosses zero at $R_d = 2.48$ kpc, essentially the disk size of the Milky Way that anchored the calibration. The two extremes of this regression correspond to two physically distinct outlier populations: large massive spirals (over-predicted) at one end, compact dwarfs (under-predicted) at the other.
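As a quick arithmetic check on the quoted coefficients, setting the fitted error to zero recovers the crossing point, and evaluating the line at $R_d = 0.3$ and $8.5$ kpc (the smallest and largest disk scale lengths in the sample) gives the size of the trend at its two ends:

$$-31.7 + 12.8\,R_d = 0 \;\Longrightarrow\; R_d = \frac{31.7}{12.8} \approx 2.48\ \text{kpc}$$

$$\text{error}(0.3\ \text{kpc}) \approx -31.7 + 12.8 \times 0.3 = -27.9\%, \qquad \text{error}(8.5\ \text{kpc}) \approx -31.7 + 12.8 \times 8.5 = +77.1\%$$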

2. The residual is linear in $R_d$

Plotting the prediction error against $R_d$, with each point coloured by Hubble type, makes the linearity of the trend immediately visible. The red line is the linear regression of the error on $R_d$ over all 94 blind galaxies.

[Figure: prediction error (%) versus disk scale length $R_d$ (kpc, log scale) for the 94 blind galaxies, coloured by Hubble type: S0–Sa ($T = 0$–$2$), Sb–Sbc ($T = 3$–$4$), Sc–Scd ($T = 5$–$6$), Sd–Im ($T = 7$–$10$). A $\pm 30\%$ band is marked, with the over-predicted region above it and the under-predicted region below. Regression line: $\text{err} \approx -31.7 + 12.8\,R_d$, zero at $R_d = 2.48$ kpc, Pearson $r = +0.749$.]
94 blind galaxies plotted versus disk size, coloured by Hubble type. The red line is the linear regression of the error on $R_d$. It crosses zero at $R_d = 2.48$ kpc — essentially the disk size that anchored the original calibration.

Error as a function of disk size

$$\text{error}\,(\%) \;\approx\; -31.7 \;+\; 12.8 \times R_d \,[\text{kpc}]$$

Linear fit on 94 blind galaxies, Pearson $r = +0.75$, RMSE of residuals $= 18.4\%$.
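The fit quantities quoted above (intercept, slope, Pearson $r$, RMSE) can be computed with an ordinary least-squares routine. The following is a minimal sketch in plain Python, not the code used for this note: the function name `linear_fit` is illustrative, the RMSE is normalised by $n$ (the note does not say which normalisation it uses), and the ten data points are a small subset read from the figure labels, so the output will not reproduce the 94-galaxy coefficients.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit y ≈ a + b*x.

    Returns (a, b, r, rmse): intercept, slope, Pearson correlation,
    and root-mean-square of the fit residuals (normalised by n).
    """
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / math.sqrt(sxx * syy)
    rmse = math.sqrt(sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / n)
    return a, b, r, rmse


# Illustrative subset only: R_d in kpc and error in % for ten galaxies
# (M33, NGC0300, NGC2903, NGC7331, NGC0891, NGC5033, UGC00128, UGC02885,
#  NGC6789, UGC05764), read from the figure labels.
rd = [1.40, 1.50, 2.60, 3.20, 4.10, 4.50, 7.50, 8.50, 0.30, 0.40]
err = [-2, 0, 0, 4, 7, 44, 80, 52, -63, -46]

a, b, r, rmse = linear_fit(rd, err)
print(f"err ≈ {a:+.1f} {b:+.1f}·Rd   r = {r:+.3f}   RMSE = {rmse:.1f}%")
print(f"zero crossing at Rd = {-a / b:.2f} kpc")
```

Running the same function on all 94 $(R_d, \text{error})$ pairs is what would be compared against the quoted values.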

Comparison of functional forms

Several alternative parametrisations were compared. The linear form is statistically indistinguishable from log and square-root alternatives:

| Model | Pearson $r$ | RMSE | Comment |
|---|---|---|---|
| $\text{err} = a + b\,R_d$ (linear) | $+0.749$ | $18.4\%$ | Cleanest analytical form |
| $\text{err} = a + b\,\log_{10}R_d$ | $+0.748$ | $18.4\%$ | Statistically equivalent |
| $\text{err} = a + b\,\sqrt{R_d}$ | $+0.768$ | $17.7\%$ | Marginally better, no real gain |
| $\text{err} = a + b\,R_d + c\,R_d^2$ | | $17.8\%$ | Quadratic term very small ($c \approx -1.1$) |

The linear form is therefore adopted as the simplest faithful description of the data.
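The two-parameter forms in this comparison differ only in the transform applied to $R_d$ before the straight-line fit, so they can share one routine. The sketch below is a hedged illustration in plain Python: `fit_transformed`, `MODELS` and `compare_models` are hypothetical names, the RMSE is normalised by $n$ as an assumption, the quadratic form is left out because it needs a three-parameter solve, and the data are a ten-galaxy subset from the figure labels rather than the full sample.

```python
import math


def fit_transformed(x, y, transform):
    """Least-squares fit y ≈ a + b * transform(x).

    Returns (r, rmse): Pearson correlation between transform(x) and y,
    and the root-mean-square residual of the fit (normalised by n).
    """
    t = [transform(xi) for xi in x]
    n = len(t)
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    stt = sum((ti - mean_t) ** 2 for ti in t)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sty = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
    b = sty / stt
    a = mean_y - b * mean_t
    r = sty / math.sqrt(stt * syy)
    rmse = math.sqrt(sum((yi - (a + b * ti)) ** 2 for ti, yi in zip(t, y)) / n)
    return r, rmse


# The three two-parameter forms compared in the text.
MODELS = {
    "linear": lambda v: v,
    "log10": math.log10,
    "sqrt": math.sqrt,
}


def compare_models(x, y):
    """Return {model name: (r, rmse)} for each transform in MODELS."""
    return {name: fit_transformed(x, y, f) for name, f in MODELS.items()}


# Illustrative subset only (R_d in kpc, error in %), read from the figure labels.
rd = [1.40, 1.50, 2.60, 3.20, 4.10, 4.50, 7.50, 8.50, 0.30, 0.40]
err = [-2, 0, 0, 4, 7, 44, 80, 52, -63, -46]

for name, (r, rmse) in compare_models(rd, err).items():
    print(f"{name:7s} r = {r:+.3f}   RMSE = {rmse:.1f}%")
```

Differences of a few tenths of a percent in RMSE between transforms, as in the table, are the kind of gap this comparison would treat as equivalent.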

Hubble type distribution along the line

| Hubble class | $N$ | Median $R_d$ (kpc) | Median error | Position |
|---|---|---|---|---|
| S0–Sa (early-type) | 4 | 2.9 | $+0.0\%$ | Centre, near the zero crossing |
| Sb–Sbc (intermediate) | 23 | 3.2 | $+3.9\%$ | Right of centre; tail in the over-predicted region |
| Sc–Scd (late spiral) | 27 | 2.5 | $+7.7\%$ | Spread across the diagram |
| Sd–Im (dwarf / irregular) | 40 | 1.6 | $-3.2\%$ | Left side; tail in the under-predicted region |

The colour pattern in the figure is not an independent signature from the linear trend — it is the same signature seen through the morphology axis. The Hubble sequence in disk galaxies correlates with disk size: late-type dwarfs are predominantly compact, intermediate spirals are predominantly large. Each colour therefore sits along a different stretch of the regression line, with Sd–Im on the left, Sc–Scd at the centre, and Sb–Sbc on the right.
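The per-class medians in the table above come from grouping the galaxies by Hubble class. A minimal sketch of that grouping, in plain Python, is given here; `SAMPLE` and `class_medians` are illustrative names. The four S0–Sa entries are the complete class as labelled in the figure, while the Sd–Im entries are only a small subset included to show the grouping, so only the S0–Sa result is comparable with the table.

```python
from collections import defaultdict
from statistics import median

# (galaxy, Hubble class, R_d [kpc], error [%]) read from the figure labels.
SAMPLE = [
    ("NGC1705", "S0-Sa", 0.60, -19),
    ("NGC4138", "S0-Sa", 1.30, -44),
    ("UGC02487", "S0-Sa", 7.50, 41),
    ("UGC06614", "S0-Sa", 4.50, 19),
    ("NGC6789", "Sd-Im", 0.30, -63),   # subset of the Sd-Im class only
    ("NGC0300", "Sd-Im", 1.50, 0),
    ("UGC00128", "Sd-Im", 7.50, 80),
]


def class_medians(rows):
    """Return {class: (N, median R_d, median error)} for the given rows."""
    groups = defaultdict(list)
    for _name, hubble_class, r_d, error in rows:
        groups[hubble_class].append((r_d, error))
    return {
        cls: (len(vals), median(v[0] for v in vals), median(v[1] for v in vals))
        for cls, vals in groups.items()
    }


for cls, (n, med_rd, med_err) in class_medians(SAMPLE).items():
    print(f"{cls:6s} N = {n:2d}   median Rd = {med_rd:.1f} kpc   median error = {med_err:+.1f}%")
```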

A structural residual, not random noise

A scatter that depends linearly on a single physical parameter, and crosses zero at the calibration point, is the signature of a missing additive constant in one of the model’s relations, not of random observational scatter. The deviation is correctable: it can be absorbed by a single additional degree of freedom in the coherence-length law.
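One way to put a number on "not random noise" is the standard $t$-statistic for a Pearson correlation. This assumes the 94 residuals are independent and roughly normally distributed, which this note does not test:

$$t = r\,\sqrt{\frac{n-2}{1-r^2}} = 0.75\,\sqrt{\frac{92}{1-0.5625}} \approx 10.9$$

This is far above the value of about 2 that marks the conventional 5% two-sided threshold for 92 degrees of freedom. In the same terms, $r^2 \approx 0.56$, so the linear trend accounts for a little more than half of the variance in the prediction error.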

3. The ten most over-predicted galaxies

These are the galaxies for which BeeTheory predicts a flat rotation velocity higher than observed. Sorted by the size of the residual:

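| Galaxy | Hubble type | $R_d$ (kpc) | $M_\star/10^{10}\,M_\odot$ | $f_\text{gas}$ | $\Sigma_d$ | $V_f$ (km/s) | $V_\text{tot}$ (km/s) | Error |
|---|---|---|---|---|---|---|---|---|
| UGC00128 | Sd-Im | 7.50 | 1.06 | 0.39 | 60 | 135 | 243 | +80.0% |
| NGC0801 | Sb-Sbc | 5.80 | 2.01 | 0.32 | 190 | 208 | 326 | +56.6% |
| NGC2955 | Sb-Sbc | 5.50 | 3.99 | 0.23 | 420 | 266 | 406 | +52.7% |
| UGC02885 | Sc-Scd | 8.50 | 3.40 | 0.41 | 150 | 290 | 441 | +52.0% |
| NGC0925 | Sc-Scd | 3.10 | 0.22 | 0.75 | 72 | 105 | 155 | +48.0% |
| NGC6195 | Sb-Sbc | 5.20 | 3.40 | 0.26 | 400 | 260 | 380 | +46.3% |
| NGC6674 | Sb-Sbc | 5.50 | 3.33 | 0.29 | 350 | 260 | 380 | +46.2% |
| NGC5033 | Sb-Sbc | 4.50 | 1.27 | 0.46 | 200 | 195 | 280 | +43.7% |
| UGC02487 | S0-Sa | 7.50 | 5.30 | 0.23 | 300 | 330 | 465 | +40.8% |
| NGC6503 | Sc-Scd | 2.40 | 0.38 | 0.55 | 210 | 121 | 168 | +38.9% |

| Property | Median value | Range | Comparison |
|---|---|---|---|
| $R_d$ | 4.5 kpc | 2.4 – 8.5 | $2\times$ larger than median |
| $M_\star$ | $1.3 \times 10^{10}\,M_\odot$ | $2.2 \times 10^{9}$ – $5.3 \times 10^{10}$ | $8\times$ more massive |
| $f_\text{gas}$ | $0.41$ | $0.23$ – $0.87$ | Below median (0.64) |
| Hubble $T$ | $5$ (Sbc) | $1$ – $8$ | Concentrated in intermediate spirals |
| $V_f$ | $195$ km/s | $69$ – $330$ | Fastest rotators in the sample |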
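The Error column in the galaxy table above is consistent, to within rounding of the tabulated velocities, with the relative difference between the predicted velocity $V_\text{tot}$ and the observed flat velocity $V_f$. That definition is an inference from the numbers, not something this note states, and the sketch below only checks it on the first five rows; `percent_error` and `ROWS` are illustrative names.

```python
def percent_error(v_pred, v_obs):
    """Relative error of the predicted velocity, in percent.

    Assumed definition: 100 * (V_tot - V_f) / V_f.
    """
    return 100.0 * (v_pred - v_obs) / v_obs


# (galaxy, V_f observed [km/s], V_tot predicted [km/s], tabulated error [%])
ROWS = [
    ("UGC00128", 135, 243, 80.0),
    ("NGC0801", 208, 326, 56.6),
    ("NGC2955", 266, 406, 52.7),
    ("UGC02885", 290, 441, 52.0),
    ("NGC0925", 105, 155, 48.0),
]

# Sort from the largest over-prediction downwards, as in the table.
for name, v_f, v_tot, tabulated in sorted(
    ROWS, key=lambda row: percent_error(row[2], row[1]), reverse=True
):
    computed = percent_error(v_tot, v_f)
    print(f"{name:9s} computed {computed:+6.1f}%   tabulated {tabulated:+6.1f}%")
```

The small differences between computed and tabulated values would be expected if the table velocities were rounded to the nearest km/s after the error was calculated.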

Profile of the over-predicted group

Large, massive, intermediate-type spirals. These galaxies sit on the right side of the regression line, well above the zero crossing. The model’s coherence-length law $\ell = c_\text{disk}\,R_d$ produces values of $\ell$ above 20 kpc in this regime, generating more wave-field mass than the observed rotation requires.

4. The ten most under-predicted galaxies

These are the galaxies for which BeeTheory predicts a flat rotation velocity lower than observed. Sorted by the size of the residual:

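| Galaxy | Hubble type | $R_d$ (kpc) | $M_\star/10^{10}\,M_\odot$ | $f_\text{gas}$ | $\Sigma_d$ | $V_f$ (km/s) | $V_\text{tot}$ (km/s) | Error |
|---|---|---|---|---|---|---|---|---|
| NGC6789 | Sd-Im | 0.30 | 0.01 | 0.53 | 250 | 60 | 22 | -63.0% |
| UGC05764 | Sd-Im | 0.40 | 0.00 | 0.86 | 80 | 57 | 31 | -45.6% |
| UGCA442 | Sd-Im | 1.00 | 0.00 | 0.85 | 15 | 57 | 32 | -44.2% |
| NGC4138 | S0-Sa | 1.30 | 0.13 | 0.33 | 250 | 150 | 85 | -43.6% |
| NGC4389 | Sb-Sbc | 1.20 | 0.07 | 0.37 | 150 | 110 | 62 | -43.4% |
| NGC4085 | Sb-Sbc | 1.20 | 0.09 | 0.42 | 200 | 135 | 79 | -41.1% |
| NGC2915 | Sd-Im | 0.50 | 0.01 | 0.84 | 160 | 85 | 53 | -38.2% |
| NGC2976 | Sb-Sbc | 0.75 | 0.04 | 0.29 | 220 | 80 | 50 | -37.4% |
| NGC4183 | Sc-Scd | 1.60 | 0.03 | 0.81 | 40 | 110 | 70 | -36.3% |
| UGCA281 | Sd-Im | 0.50 | 0.01 | 0.63 | 80 | 40 | 26 | -36.1% |

| Property | Median value | Range | Comparison |
|---|---|---|---|
| $R_d$ | 1.1 kpc | 0.30 – 1.80 | $2\times$ smaller than median |
| $M_\star$ | $2.7 \times 10^{8}\,M_\odot$ | $4 \times 10^{7}$ – $1.3 \times 10^{9}$ | $6\times$ less massive |
| $f_\text{gas}$ | $0.58$ | $0.29$ – $0.86$ | Below median (0.64) |
| Hubble $T$ | $8$ (Sd) | $1$ – $10$ | Concentrated in late-type dwarfs |
| $V_f$ | $82$ km/s | $40$ – $150$ | Slow rotators |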

Profile of the under-predicted group

Compact, low-mass dwarfs and small spirals. These galaxies sit on the left side of the regression line, well below the zero crossing. The coherence-length law $\ell = c_\text{disk}\,R_d$ produces $\ell$ of order $1$–$3$ kpc in this regime, possibly too short to gather the full extent of the wave field.

5. Side-by-side comparison of the three groups

| Property (median) | Over-predicted (err > +30%, $N = 15$) | Well-predicted ($\lvert\text{err}\rvert \le 30\%$, $N = 67$) | Under-predicted (err < -30%, $N = 12$) |
|---|---|---|---|
| $R_d$ (kpc) | 4.5 | 2.4 | 1.1 |
| $M_\star / 10^{10}$ | 1.27 | 0.15 | 0.027 |
| $M_\text{gas} / 10^{10}$ | 0.93 | 0.27 | 0.04 |
| $f_\text{gas}$ | 0.41 | 0.64 | 0.58 |
| $\Sigma_d$ | 200 | 140 | 115 |
| Hubble $T$ | 5 (Sbc) | 6 (Sc) | 8 (Sd) |
| $V_f$ (km/s) | 195 | 113 | 82 |

Every property except the gas fraction varies monotonically from left to right. The over-predicted group is larger, more massive, more star-dominated and faster-rotating; the under-predicted group is smaller, lighter, slower and more gas-rich than the over-predicted group; the well-predicted majority sits in between on every measure but $f_\text{gas}$, where it has the highest median (0.64). The Milky Way ($R_d = 2.6$ kpc, $V_f \approx 230$ km/s) falls naturally within the well-predicted regime where the calibration was anchored.
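The three groups compared in this section are defined by the $\pm 30\%$ thresholds in the table header. A minimal sketch of that split in plain Python follows; `classify` and `SAMPLE_ERRORS` are illustrative names, and the seven errors are a small subset read from the figure labels, so the counts printed are not the 15 / 67 / 12 of the full sample.

```python
from collections import Counter


def classify(error_pct, threshold=30.0):
    """Assign a galaxy to one of the three groups by its prediction error (%)."""
    if error_pct > threshold:
        return "over-predicted"
    if error_pct < -threshold:
        return "under-predicted"
    return "well-predicted"  # |error| <= threshold


# Illustrative subset only: (galaxy, error in %), read from the figure labels.
SAMPLE_ERRORS = [
    ("UGC00128", 80),
    ("NGC5033", 44),
    ("NGC0891", 7),
    ("NGC2903", 0),
    ("M33", -2),
    ("NGC4138", -44),
    ("NGC6789", -63),
]

counts = Counter(classify(error) for _name, error in SAMPLE_ERRORS)
for group in ("over-predicted", "well-predicted", "under-predicted"):
    print(f"{group:15s} {counts[group]}")
```

Galaxies sitting exactly on a threshold fall in the well-predicted band under this convention, matching the $\lvert\text{err}\rvert \le 30\%$ definition.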

6. Interpretation

The model has a single coupling parameter $\lambda$ and three universal geometric constants $(c_\text{disk}, c_\text{sph}, c_\text{arm})$. These were determined on a galaxy of intermediate size (the Milky Way, $R_d = 2.6$ kpc) and validated on twenty-two galaxies in a similar size range. The blind test of Note IX shows that they generalise reasonably well, but with a residual that drifts linearly with disk size.

An affine correction is sufficient

The linearity of the residual in $R_d$, well fit by a single straight line crossing zero at $R_d = 2.48$ kpc, is the signature of a missing additive offset in the coherence-length relation. The current law $\ell = c_\text{disk}\,R_d$ makes the wave-coherence length strictly proportional to the disk scale. Replacing it with an affine relation $\ell = c_\text{disk}\,(R_d - R_0)$, where $R_0$ is an offset of about $2.5$ kpc, would account for a residual that vanishes at the calibration point and grows linearly on either side, which is exactly the pattern observed.
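To make the two candidate laws concrete, the sketch below evaluates both side by side. It is only an illustration of the formulas as written: the value of $c_\text{disk}$ is not quoted in this note, so `C_DISK_PLACEHOLDER` is a hypothetical stand-in, and the function names are illustrative. The one property it demonstrates is that, with $c_\text{disk}$ held fixed, the two laws differ by the constant $c_\text{disk}\,R_0$ at every disk size.

```python
def coherence_length_proportional(r_d, c_disk):
    """Current law: l = c_disk * R_d (lengths in kpc)."""
    return c_disk * r_d


def coherence_length_affine(r_d, c_disk, r_0=2.5):
    """Proposed law: l = c_disk * (R_d - R_0), with R_0 ≈ 2.5 kpc from the text.

    By construction this expression is zero at r_d == r_0.
    """
    return c_disk * (r_d - r_0)


C_DISK_PLACEHOLDER = 1.0  # hypothetical value, NOT taken from the note

# Median R_d of the under-predicted group, R_0 itself, and the
# median R_d of the over-predicted group.
for r_d in (1.1, 2.5, 4.5):
    l_prop = coherence_length_proportional(r_d, C_DISK_PLACEHOLDER)
    l_aff = coherence_length_affine(r_d, C_DISK_PLACEHOLDER)
    print(f"Rd = {r_d:.1f} kpc   proportional = {l_prop:+.2f}   affine = {l_aff:+.2f}")
```

Any recalibration of $c_\text{disk}$ that would accompany the new law in practice is outside the scope of this sketch.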

The well-predicted majority is broadly representative

About 70% of the sample (67 of 94 galaxies) falls in the well-predicted band. These galaxies span the full range of Hubble types and a factor of $\sim 100$ in stellar mass. The model’s domain of validity is not narrow: it covers most of the SPARC population, with deviations concentrated at the two extremes of disk size, exactly as a linear $R_d$-dependent residual would produce.

7. Summary

1. The prediction error of the 94-galaxy blind test follows a clean linear trend in disk scale length: $\text{error}(\%) \approx -31.7 + 12.8\,R_d$, with Pearson $r = +0.75$ and RMSE of residuals $= 18.4\%$.

2. The linear regression crosses zero at $R_d = 2.48$ kpc, essentially the disk size of the Milky Way that anchored the calibration. The two ends of the line correspond to two physically distinct outlier populations.

3. The 15 galaxies over-predicted by more than $+30\%$ are large, massive, intermediate-type spirals: median $R_d = 4.5$ kpc, $M_\star \approx 10^{10}\,M_\odot$, $V_f \approx 200$ km/s.

4. The 12 galaxies under-predicted by more than $30\%$ are compact, low-mass dwarfs: median $R_d = 1.1$ kpc, $M_\star \approx 3 \times 10^{8}\,M_\odot$, $V_f \approx 80$ km/s.

5. The deviation is absorbable by an affine correction to the coherence-length law, $\ell = c_\text{disk}\,(R_d - R_0)$, with $R_0 \approx 2.5$ kpc, which introduces a single new constant.


References. Lelli, F., McGaugh, S. S., Schombert, J. M. — SPARC: Mass Models for 175 Disk Galaxies with Spitzer Photometry and Accurate Rotation Curves, AJ 152, 157 (2016). · de Vaucouleurs, G. et al. — Third Reference Catalogue of Bright Galaxies, Springer (1991). · McGaugh, S. S. — The third law of galactic rotation, Galaxies 2, 601 (2014). · Dutertre, X. — Bee Theory™: Wave-Based Modeling of Gravity, v2, BeeTheory.com (2023).

BeeTheory.com — Wave-based quantum gravity · SPARC residuals · © Technoplane S.A.S. 2026