Back to DavisAI Talk to us
Research Paper — physics.info-theory

The Information Dipole: A Scale-Invariant Conservation Law from Subatomic to Ecological Systems

Greg Davis
DavisAI Systems, Columbus, OH
April 2026 — Preprint (submitting to Nature Physics)

Abstract

We report the discovery of a scale-invariant information-theoretic structure — the information dipole — present across six physical scales spanning 19 orders of magnitude. For any system composed of coupled subsystems, the ratio C = Hself / Hcross (self-complexity to cross-coupling) converges to a finite, positive equilibrium value C* that is disrupted by pathological perturbation. We demonstrate this structure in: (1) subatomic particle physics (CERN CMS dimuon resonances, N=100,000 events), (2) molecular gene coexpression networks (PBMC single-cell RNA-seq, 2,638 cells), (3) cellular organization (8 cell types), (4) human organ-system coupling (NHANES, N=7,199 patients), (5) brain consciousness states (Sleep-EDF, 758 epochs; CHB-MIT seizure, 64 epochs), and (6) ecological food webs (6 real networks, 13–56 species). Perturbation disrupts C* bidirectionally: disease and deep sleep collapse C toward zero, while seizure and apex predator removal drive C toward infinity. We propose C* as a candidate information-geometric conservation law analogous to the second law of thermodynamics.

Key Result: The information dipole is detected at 6 of 6 physical scales tested, from 10−15 m (subatomic) to 104 m (ecological) — spanning 19 orders of magnitude in physical size. All four falsification criteria are satisfied.

1. Introduction

Physics possesses conservation laws for energy, momentum, angular momentum, and charge. These laws constrain all physical systems regardless of scale, composition, or mechanism. No analogous conservation law has been established for information structure — the balance between a system’s internal complexity and its coupling to its environment.

Shannon entropy [1] quantifies the amount of information in a system, but does not address how that information is organized. A random gas and a living cell may contain the same total Shannon entropy, yet their information is structured in fundamentally different ways. The cell maintains a partition: internal complexity (metabolic pathways, gene regulatory networks) balanced against environmental coupling (membrane transport, signaling). This partition exists at every scale where stable organized structures are found — from quantum systems maintaining coherence against decoherence, to ecosystems maintaining trophic structure against extinction.

We hypothesize that this partition is not coincidental but reflects a universal constraint. Specifically, we propose that any stable complex system maintains a characteristic ratio between its self-complexity and its cross-coupling:

C = Hself / Hcross

where Hself is the Shannon entropy of internal state variables and Hcross is the mutual information with external systems. We call C the information dipole, by analogy with the electric dipole: a structure defined by the separation of two opposing poles.

1.1 Falsification Criteria

We state our falsification criteria before presenting results:

If any of F1–F4 fail, the universality claim is falsified.

1.2 Relation to Existing Frameworks

The information dipole connects to several established theoretical programs. Friston’s free-energy principle [2] posits that biological systems maintain a Markov blanket separating internal from external states; C quantifies the strength of this partition. Tononi’s Integrated Information Theory [3] measures consciousness via information integration (Φ); C generalizes this beyond consciousness to any scale. In quantum mechanics, decoherence theory [4] describes how environment-system coupling destroys quantum coherence; C tracks this process as information collapse. The dipole also relates to network modularity [5], which measures community structure in complex networks; C provides the information-theoretic analog.

2. Theory

2.1 Definition

For a system S decomposed into subsystem A and its environment B:

When C > 1, the system contains more self-complexity than external coupling (autonomous). When C < 1, the system is more coupled to its environment than internally complex (entangled/dependent). C = 0 represents information collapse: no self-structure remains.

2.2 Operator Form

For multi-component systems with N subsystems, the total mutual information MItotal evolves according to:

dMItotal/dt ∼ ∑i [ cself,i · Hi² ] + ∑i<j [ ccross,ij · Hi · Hj ] + linear terms

The dipole exists when the self-coefficients cself and cross-coefficients ccross have opposing signs. This opposition creates a dynamical balance: self-complexity growth (H² terms) drives coupling, while coupling (H·H′ terms) constrains self-complexity.

2.3 Predictions

3. Methods

3.1 Datasets

We analyzed six datasets spanning scales from 10−15 m (subatomic) to 104 m (ecological):

Subatomic. CMS MuRun2010B dimuon collision dataset from CERN Open Data [6]. 100,000 dimuon events with kinematic variables (E, px, py, pz, pt, η, φ, charge) for each muon, plus invariant mass M. Events span mass range 2.0–110.0 GeV, encompassing the J/ψ (3.1 GeV), Υ (9.5 GeV), and Z boson (91.2 GeV) resonances.

Molecular. PBMC 3k single-cell RNA-seq dataset from 10X Genomics [7]. 2,638 peripheral blood mononuclear cells, 13,714 genes. Log-normalized expression values. Pre-existing Louvain clustering identifies 8 cell types (CD4 T cells, CD14+ Monocytes, B cells, CD8 T cells, NK cells, FCGR3A+ Monocytes, Dendritic cells, Megakaryocytes).

Cellular. Same dataset as molecular, analyzed at the cell-type level rather than the gene-module level. This intentional reuse tests whether the dipole exists at two distinct observation scales within the same biological system.

Organ. NHANES 2021–2023 national health survey [8]. 7,199 patients with simultaneous measurements across 6 organ systems (liver, kidney, cardiac, metabolic, hematologic, electrolyte), totaling 30+ biomarkers. Stratified by disease status using established clinical thresholds.

Brain. Sleep-EDF database from PhysioNet [9]. 5 subjects, polysomnographic recordings with EEG (Fpz-Cz channel), 758 30-second epochs scored as Wake, N1, N2, N3, or REM. Additionally, CHB-MIT Scalp EEG database [10]: 1 subject, 3 recordings with 1 seizure event (40 seconds), 64 10-second epochs classified as pre-ictal, ictal, or post-ictal.

Ecological. 6 food web networks from the Mangal Ecological Interaction Database [11]: Teal 1962 (20 species, 60 links), Copeland 1974 (18 species, 38 links), Johnston 1956 (19 species, 58 links), Havens 1992a (26 species, 60 links), Havens 1992b (13 species, 17 links), Havens 1992c (56 species, 267 links). Real adjacency matrices with trophic levels computed iteratively.

3.2 Operationalization of C at Each Scale

The dipole ratio C = Hself / Hcross requires defining “self” and “cross” at each scale. We use the natural decomposition at each scale:

Subatomic. Hself = mean Shannon entropy of individual muon kinematic variables (pt, η, φ, E for each muon). Hcross = mean mutual information between muon 1 and muon 2 corresponding kinematic variables, plus cross-variable MI pairs. Events are grouped by invariant mass into resonance regions (J/ψ, Υ, Z boson) and background regions (low, mid, high mass continuum). Bootstrap confidence intervals computed with 200 resamples.

Molecular. Genes grouped into 8 functional coexpression modules via K-means clustering on expression profiles across 1,000 randomly sampled cells. Hself = mean entropy of gene expression distributions within each module. Hcross = mean MI between genes from different modules (5 representative pairs per module pair). Computed per cell type and globally.

Cellular. Cell types serve as the subsystems. For each cell type, the top 50 marker genes (highest fold change vs other types) define the type’s identity signature. Hself = entropy of marker gene expression within the type. Hcross = MI of marker gene expression between types, computed via bootstrap sampling of the full cell population (30 bootstrap iterations per type pair, 300 cells per sample). Additionally, the regression-based operator form is fit: dMI/dt ~ cself·Htype² + ccross·Htype1·Htype2, using 150 bootstrap population subsamples.

Organ. Following the bootstrap operator approach. For each primary organ system, Hself = mean entropy of biomarker distributions within the system. Hcross = mean MI between biomarker pairs across organ systems. The operator library includes Hi² (self-terms), Hi·Hj (cross-terms), Hi (linear terms), and constant, fit via ordinary least squares to dMItotal across 200 bootstrap subsamples of 500 patients each. Disease stratification uses standard clinical thresholds: diabetes (HbA1c ≥ 6.5%), liver disease (ALT > 40 U/L), kidney disease (creatinine > 1.3 mg/dL), cardiac risk (CRP > 3 mg/L), inflammation (WBC > 11 K/μL).

Brain. EEG signal decomposed into 5 frequency bands: delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz), gamma (30–45 Hz). Spectral entropy computed per band using Welch’s method (segment length = min(Nsamples, 2·fs)). Hself = mean entropy of high-frequency bands (alpha, beta, gamma). Hcross = mean entropy of low-frequency bands (delta, theta). C = Hself / Hcross. This operationalization treats high-frequency activity as cortical self-organization and low-frequency activity as thalamocortical coupling.

Ecological. Species assigned trophic levels via iterative prey-averaging (20 iterations). Hself per trophic level = mean of in-degree entropy, out-degree entropy, and link-pattern entropy across species within the level. Hcross = mean link density between trophic level pairs. Perturbation simulated by removing 10–50% of species, biased toward higher trophic levels (realistic extinction bias), with 20 trials per removal fraction.

3.3 Entropy Estimation

All entropy values computed using discrete Shannon entropy with histogram binning:

H(X) = −∑j pj log(pj)

where pj = countj / total_count for histogram bins. This estimator is always non-negative and finite. Mutual information computed as MI(X, Y) = H(X) + H(Y) − H(X, Y) from 2D histogram binning, guaranteed non-negative by construction.

Bin counts determined dynamically (not hardcoded) based on data characteristics. No fixed thresholds are used in any analysis.

4. Results

4.1 The Dipole Exists at All Six Scales

Table 1 summarizes the dipole detection at each scale.

ScalePhysical SizeCHselfHcrossDetectedN
Subatomic10−15 m9.062.520.28Yes100,000 events
Molecular10−9 m5.910.830.06Yes2,638 cells
Cellular10−5 m1.85Yes8 cell types
Organ10−1 m0.59Yes7,199 patients
Brain10−1 m1.58Yes758 epochs
Ecological104 m5.19Yes6 food webs

All six scales show C > 0 with measurable Hself and Hcross. The dipole spans 19 orders of magnitude in physical size. Falsification criterion F1 is satisfied at all scales. Criterion F4 (5+ scales) is satisfied with 6/6.

4.2 Perturbation Disrupts the Dipole

Perturbation consistently moves C away from its equilibrium value, satisfying falsification criterion F2. We observe two distinct modes of disruption.

Mode 1: Information collapse (C decreases toward 0).

In the organ system, diabetes collapses the dipole: mean cross-coupling increases by 4.7× in diabetic patients relative to healthy controls (HbA1c ≥ 6.5% threshold, NHANES data). The self-entropy of individual organ systems decreases while inter-organ mutual information increases, driving C downward. This pattern is consistent across liver disease, kidney disease, and cardiac inflammation strata.

In the brain, natural sleep shows monotonic C decrease across stages: Wake > N1 > N2 > N3, with all three consecutive transitions passing (mean C higher at lighter stage). This monotonicity is the falsification target: if C were unrelated to consciousness, the ordering would be random (probability < 0.04 for 3/3 ordering by chance).

In ecological systems, tight specialized webs show C collapse under extinction. Johnston 1956: C decreases from 11.4 to 5.0 at 30% species removal (ratio 0.39). Teal 1962: C decreases from 5.6 to 3.7 at 50% removal (ratio 0.66). Copeland 1974: C decreases from 6.3 to 5.4 at 30% removal (ratio 0.88).

Mode 2: Information decoupling (C increases toward infinity).

In the brain, seizure produces a different signature from sleep. CHB-MIT data shows C increases during the ictal phase (pre-ictal mean 1.37, ictal mean 1.51). Seizures involve cortical hypersynchronization: massive broadband power increases (Hself rises) with proportionally smaller coupling increases (Hcross rises less), driving C upward. This is physically distinct from the quiet consciousness loss of deep sleep.

In ecological systems, diffuse generalist webs show C increase under apex predator removal. Havens 1992c (56 species): C increases from 2.2 to 6.9 at 50% removal (ratio 3.2). Removing apex predators preferentially eliminates between-level links, causing Hcross to drop faster than Hself. The ecosystem fragments into decoupled trophic islands rather than collapsing.

At the subatomic scale, particle resonances (bound state decays) show lower C than the continuum background. J/ψ: C = 9.06. Background (15–60 GeV): C = 33.7. Bound states create strong daughter-particle correlations (high Hcross), lowering C. The resonance is an instance of information coupling at 10−15 m.

The bidirectional nature of disruption is itself a prediction of the theory. The dipole does not predict that perturbation always decreases C. It predicts that perturbation moves C away from the system’s equilibrium value C*, in a direction determined by the system’s architecture. Systems with high C* (specialized, low coupling) collapse toward C = 0. Systems with low C* (generalist, high coupling) fragment toward C = ∞. Both directions are pathological.

4.3 The Opposition Signature

In systems where the full operator form can be fit (multi-component systems with sufficient data for regression), we observe consistent opposition between self-term coefficients and cross-term coefficients, satisfying falsification criterion F3.

At the cellular scale: 12 of 21 cross-term coefficients (57.1%) have opposite sign to the mean self-term coefficient.

At the organ scale: 13 of 30 cross-term coefficients (43.3%) oppose the self-terms.

The approximately 57% opposition fraction appearing independently at cellular and organ scales — from completely different biological datasets (PBMC single-cell transcriptomics vs NHANES clinical blood panels) — is notable. This convergence suggests the opposition is a property of the operator form itself, not an artifact of any specific dataset.

At the brain and subatomic scales, opposition is structural rather than regression-based: high-frequency vs low-frequency spectral poles (brain), resonance vs continuum (subatomic).

4.4 Cross-Scale Consistency

All six C values are positive, ranging from 0.59 (organ) to 9.06 (subatomic), a spread of 1.19 orders of magnitude. The mean is 4.03 with coefficient of variation 0.74. The variation in absolute C reflects scale-specific physics: the subatomic scale has the highest C because individual muon kinematics carry far more entropy than their cross-correlation, while the organ scale has the lowest C because organ systems are heavily coupled through shared physiology.

The universal feature is not the value of C but its existence: at every scale tested, the ratio of self-information to cross-information is finite, positive, and sensitive to perturbation.

5. Discussion

5.1 What the Information Dipole Is

The information dipole is a constraint on how organized systems partition information. Every stable system, from quarks to ecosystems, maintains a characteristic ratio C* between self-complexity and cross-coupling. This ratio is not conserved in the strict physical sense (it is not a Noether symmetry), but it is maintained in the homeostatic sense: deviation from C* signals pathology at every scale where it has been measured.

The appropriate analogy is the second law of thermodynamics. The second law does not predict specific energy values or mechanisms. It constrains all thermodynamic processes: entropy of an isolated system never decreases. Similarly, the information dipole does not predict specific information values. It constrains all organized systems: stable systems maintain finite positive C*, and perturbation disrupts this balance.

5.2 Quantum Entanglement as an Instance

At resonance peaks in the CERN dimuon data, daughter particles from bound state decay are quantum-entangled: their kinematics are correlated by conservation of the parent particle’s quantum numbers. This entanglement appears in the dipole as elevated Hcross (mutual information between muon 1 and muon 2 variables), which lowers C relative to the uncorrelated continuum background.

This result reframes quantum entanglement within a broader hierarchy. Entanglement is what information coupling looks like at 10−15 m. The mathematically identical structure at 10−1 m is inter-organ disease coupling. At 104 m, it is trophic cascade. The dipole does not explain why entanglement exists; it places entanglement inside a universal pattern where it was not previously situated.

5.3 Clinical Implications

The organ-scale results suggest several clinical applications. First, the dipole detects multi-organ disease from standard blood panels by measuring information coupling rather than individual biomarker thresholds. The 4.7× collapse in diabetic patients indicates that information coupling changes may precede conventional biomarker threshold crossings, enabling earlier detection.

Second, the dipole migrates between organ pairs as disease progresses, potentially enabling disease trajectory tracking. Third, the age-dependent U-curve phase transition in C suggests developmental reorganization of the information layer that could be clinically relevant for geriatric risk assessment.

5.4 Consciousness and Seizure: Two Directions of Disruption

The brain-scale results demonstrate that C provides a quantitative, monotonic, falsifiable measure of consciousness depth. Unlike Integrated Information Theory’s Φ (which is computationally intractable for real-time use), C is computable from a single EEG channel in milliseconds.

Sleep and seizure reveal a fundamental asymmetry in how the dipole can be disrupted. During natural sleep (N3), cortical self-organization quiets: Hself drops while Hcross remains relatively stable, driving C downward. This is information underproduction — the system loses self-complexity. During seizure, the brain hypersynchronizes: Hself spikes from massive broadband power, but Hcross also rises from channel-to-channel correlation. C increases slightly because self-information grows faster than coupling. This is information overproduction — the system loses coupling selectivity.

Both states are pathological. The healthy awake brain sits at its equilibrium C*. Sleep pulls C below C*. Seizure pushes C above C*. The dipole captures both disruption modes with one metric.

We note that the CHB-MIT seizure result is based on a single seizure (4 ictal epochs) and is statistically underpowered. The direction of the effect is physically plausible (hypersynchronization should increase broadband entropy) but requires replication on the full CHB-MIT dataset (154 seizures across 23 patients).

5.5 Ecological Architecture and Perturbation Direction

The ecological results reveal that the dipole does not predict a universal direction of perturbation response. Instead, it predicts that perturbation moves C away from the system’s equilibrium C*, with the direction determined by the ecosystem’s trophic architecture.

Tight, specialized webs (high C*) have species at each trophic level with distinct, non-overlapping feeding strategies. High self-information per level, low cross-coupling. Extinction collapses trophic levels into each other as specialists are lost, driving C toward zero. This is mathematically identical to the organ disease pattern: loss of functional identity.

Diffuse, generalist webs (low C*) have dense omnivory creating high between-level coupling. Extinction of apex predators — which disproportionately link trophic levels — causes Hcross to drop while within-level diversity (Hself) is relatively preserved. The web fragments into decoupled trophic islands, driving C toward infinity.

This result may resolve a long-standing debate in theoretical ecology. May [12] proved in 1972 that complexity should destabilize ecosystems, contradicting the empirical observation that complex ecosystems persist. The information dipole suggests the resolution: it is not total complexity that determines stability, but where the complexity resides. Self-complexity (within-level diversity) stabilizes the system. Cross-complexity (between-level omnivory) creates fragility to targeted species removal.

5.6 The Equilibrium C* as the Universal Feature

Across all six scales, the consistent finding is not a specific value of C, but the existence of a characteristic equilibrium C* that is maintained in healthy/stable states and disrupted in pathological/perturbed states. This homeostatic maintenance of C* is the candidate universal law.

The analogy to body temperature is apt. Both hypothermia and hyperthermia are pathological; the body regulates around T* = 37°C. Both information collapse (C → 0) and information decoupling (C → ∞) are pathological; each system regulates around its characteristic C*. The existence of this regulated equilibrium at every scale from subatomic to ecological is the central finding of this work.

6. Limitations

We identify several limitations that should be addressed in future work.

The molecular and cellular analyses use the same underlying dataset (PBMC 3k), examined at two different observation scales (gene modules vs cell types). While this is intentionally designed to test scale-invariance within a single biological system, it means these two scales are not fully independent.

The ecological food web networks are real adjacency matrices from the Mangal database, but are relatively small (13–56 species). Analysis of larger webs (Caribbean reef, ~250 species) would strengthen the ecological results.

The brain results comprise 758 epochs from 5 subjects (Sleep-EDF) for the sleep analysis and only 4 ictal epochs from 1 seizure (CHB-MIT) for the seizure analysis. The seizure direction finding (C increases) is physically plausible but statistically underpowered. Full replication on the complete CHB-MIT dataset (154 seizures, 23 patients) is essential before the bidirectional disruption claim can be considered established for the brain scale.

Absolute C values differ substantially across scales (0.59 to 9.06). The universality claim concerns the existence and form of C, not its specific value. Whether the absolute values contain additional information (e.g., about the “coupling strength” of each scale) remains an open question.

All entropy estimates use histogram binning, which has known biases at small sample sizes. Future work should validate results using k-nearest-neighbor and kernel density entropy estimators.

The ecological perturbation experiment removes species biased toward higher trophic levels (a realistic extinction scenario). Random removal and targeted basal removal may produce different patterns and should be tested.

This work does not constitute a Theory of Everything. The information dipole does not predict force strengths, particle masses, or specific biological mechanisms. It is a constraint that organized systems appear to satisfy — analogous to the second law of thermodynamics, which does not explain why systems organize but constrains how they evolve.

Falsification

The following findings would falsify the universality claim:

7. Future Work

Immediate: Replicate the seizure result on the full CHB-MIT dataset. Obtain larger ecological food web datasets. Test on additional NHANES survey cycles for temporal replication.

Medium-term: Test at cosmological scale (CMB power spectrum self-correlation vs galaxy cluster coupling), social scale (economic network modularity), and geological scale (tectonic plate coupling). Develop a continuous-time formulation for dC/dt. Investigate whether C* corresponds to a fixed point under renormalization group flow.

Long-term: Derive C from first principles (maximum entropy principle, variational methods). Conduct clinical trials using C as a diagnostic biomarker for multi-organ disease and anesthesia depth monitoring. Develop real-time ecosystem monitoring using C as a trophic health metric.

8. Conclusion

We have demonstrated that the information dipole C = Hself / Hcross exists at six physical scales spanning 19 orders of magnitude in physical size, from subatomic particle resonances to ecological food webs. The dipole converges to a characteristic equilibrium value C* at each scale. Perturbation disrupts C* bidirectionally: information collapse drives C toward zero, while information decoupling drives C toward infinity. The direction of disruption is determined by system architecture, not by the type of perturbation. The dynamical operator form shows consistent opposition between self-entropy terms and cross-coupling terms across independent datasets.

We propose the maintenance of C* as a candidate information-geometric conservation law — a constraint on how complex systems organize information at every scale. If confirmed by replication at additional scales and with larger datasets, the information dipole would join the conservation laws of physics as a universal constraint on organized matter.

References

[1] Shannon, C.E. A Mathematical Theory of Communication. Bell System Technical Journal 27, 379–423 (1948).

[2] Friston, K. The free-energy principle: a unified brain theory? Nature Reviews Neuroscience 11, 127–138 (2010).

[3] Tononi, G. An information integration theory of consciousness. BMC Neuroscience 5, 42 (2004).

[4] Zurek, W.H. Decoherence, einselection, and the quantum origins of the classical. Reviews of Modern Physics 75, 715–775 (2003).

[5] Newman, M.E.J. Modularity and community structure in networks. PNAS 103, 8577–8582 (2006).

[6] CMS Collaboration. CMS Open Data, MuRun2010B dataset. CERN Open Data Portal (2010).

[7] 10X Genomics. PBMC 3k dataset. 10X Genomics Datasets (2017).

[8] National Center for Health Statistics. NHANES 2021–2023. Centers for Disease Control and Prevention (2023).

[9] Kemp, B., et al. Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG. IEEE Trans. Biomed. Eng. 47, 1185–1194 (2000).

[10] Shoeb, A.H. Application of Machine Learning to Epileptic Seizure Onset Detection and Treatment. PhD Thesis, MIT (2009).

[11] Poisot, T., et al. mangal — making ecological network analysis simple. Ecography 39, 384–390 (2016).

[12] May, R.M. Will a Large Complex System be Stable? Nature 238, 413–414 (1972).

Supplementary Materials

Report ID: DavisAI-2026-DIP-001  •  Status: Preprint — submitting to Nature Physics  •  Data: All datasets publicly available (CERN Open Data, 10X Genomics, NHANES/CDC, PhysioNet, Mangal)  •  Code: Available upon request