Scientific methodology, automated.

Formulate falsifiable hypotheses. Design controlled experiments. Test against 28 real databases. Grade the evidence. 91 computational tools across 4 scientific domains. Grounded in Popper, Fisher, and GRADE.

Open source (AGPL-3.0) · Self-hostable · COSS

91 tools · 28 data sources · 4 domains · 17 visualizations

Start Free View on GitHub

How It Works

Six phases. Each grounded in established science.

Every investigation follows a structured scientific protocol. Ehrlich doesn't just search the internet — it formulates hypotheses, designs experiments, tests them against real data, validates with controls, and grades the evidence using peer-reviewed frameworks.

Classification & PICO

Sackett (1996)

Decompose your question into Population, Intervention, Comparison, Outcome. Auto-detect domains. Multi-domain questions merge configs automatically.

Literature Survey

GRADE + AMSTAR-2

Systematic search with citation chasing. GRADE-adapted evidence grading. AMSTAR-2 quality self-assessment. Haiku compresses and classifies.

Hypothesis Formulation

Popper + Platt + Bayes

Falsifiable hypotheses with predictions, null predictions, success/failure criteria, scope, type, and prior confidence. You approve before testing starts.

Experiment Execution

Fisher (1935)

Experiments with independent/dependent variables, controls, confounders, and analysis plans. Two experiments run in parallel. 91 tools across 4 domains.

Validation & Controls

Zhang (1999) + Y-scrambling

Negative controls with known-inactive compounds. Z'-factor assay quality. Permutation significance testing. Scaffold-split vs random-split comparison.
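
The Z'-factor check named above can be sketched in a few lines. This is a minimal illustration of the published formula (Zhang et al., 1999), not Ehrlich's own implementation; the function name `z_prime` and the control readouts are hypothetical.

```python
from statistics import mean, stdev


def z_prime(positive: list[float], negative: list[float]) -> float:
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    # Sample standard deviations of the two control groups.
    spread = stdev(positive) + stdev(negative)
    return 1.0 - 3.0 * spread / separation


# Hypothetical readouts for positive and negative controls.
pos_controls = [98.0, 100.0, 102.0]
neg_controls = [9.0, 10.0, 11.0]
score = z_prime(pos_controls, neg_controls)
print(f"Z'-factor: {score:.2f}")
```

By convention, a Z'-factor between 0.5 and 1 indicates an assay with well-separated controls, which is why the console example reports 0.72 as excellent.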

Synthesis

GRADE synthesis

Certainty grading (5 downgrading + 3 upgrading domains). Priority tiers. Limitations taxonomy. Knowledge gap analysis. Follow-up recommendations.
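
As a rough sketch of how certainty grading with 5 downgrading and 3 upgrading domains can work, the snippet below follows the standard GRADE convention: randomized evidence starts at high, observational evidence at low, and each domain moves the rating by one or two levels. The function name, domain keys, and the clamping rule are assumptions for illustration; Ehrlich's GRADE-adapted logic may differ.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# The five GRADE downgrading domains and three upgrading domains.
DOWNGRADE = {"risk_of_bias", "inconsistency", "indirectness",
             "imprecision", "publication_bias"}
UPGRADE = {"large_effect", "dose_response", "opposing_confounding"}


def grade_certainty(randomized: bool,
                    downgrades: dict[str, int],
                    upgrades: dict[str, int]) -> str:
    """Return a GRADE-style certainty label from per-domain level changes."""
    unknown = (set(downgrades) - DOWNGRADE) | (set(upgrades) - UPGRADE)
    if unknown:
        raise ValueError(f"unknown domains: {sorted(unknown)}")
    start = 3 if randomized else 1  # index into LEVELS
    score = start - sum(downgrades.values()) + sum(upgrades.values())
    # Clamp to the four available levels.
    return LEVELS[max(0, min(len(LEVELS) - 1, score))]


print(grade_certainty(True, {"imprecision": 1}, {}))
```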

Every Hypothesis Carries

statement The core claim

prediction What should be true if correct

null_prediction What to expect if wrong

success_criteria Measurable threshold for support

failure_criteria Measurable threshold for refutation

prior_confidence Bayesian prior (0-1)
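
One way to picture these fields is as a small record type. The sketch below uses the field names listed above; the types, the validation rule, and the example values are assumptions, not Ehrlich's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    statement: str          # the core claim
    prediction: str         # what should be true if correct
    null_prediction: str    # what to expect if wrong
    success_criteria: str   # measurable threshold for support
    failure_criteria: str   # measurable threshold for refutation
    prior_confidence: float  # Bayesian prior in [0, 1]

    def __post_init__(self) -> None:
        if not 0.0 <= self.prior_confidence <= 1.0:
            raise ValueError("prior_confidence must be between 0 and 1")


# Illustrative values only.
h1 = Hypothesis(
    statement="Compound X MIC < 4 µg/mL against MRSA",
    prediction="Measured MIC falls below 4 µg/mL",
    null_prediction="Measured MIC is 4 µg/mL or higher",
    success_criteria="MIC < 4 µg/mL in the screening assay",
    failure_criteria="MIC >= 4 µg/mL in the screening assay",
    prior_confidence=0.6,
)
```

Writing the success and failure criteria before testing is what keeps the hypothesis falsifiable: the threshold for refutation is fixed in advance.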

Every Experiment Carries

independent_var What is being manipulated

dependent_var What is being measured

controls Positive and negative controls

confounders Known confounding variables

analysis_plan Statistical approach + thresholds

sensitivity How robust to parameter changes
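
The experiment fields can be sketched the same way. The field names come from the list above; the types, the requirement that both control kinds are present, and the example values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    independent_var: str             # what is being manipulated
    dependent_var: str               # what is being measured
    controls: dict[str, list[str]]   # "positive" and "negative" controls
    confounders: list[str]           # known confounding variables
    analysis_plan: str               # statistical approach + thresholds
    sensitivity: str                 # robustness to parameter changes

    def __post_init__(self) -> None:
        # Assumed rule: an experiment needs both kinds of control.
        missing = {"positive", "negative"} - set(self.controls)
        if missing:
            raise ValueError(f"missing control types: {sorted(missing)}")


# Illustrative values only.
exp = Experiment(
    independent_var="candidate compound",
    dependent_var="predicted binding affinity",
    controls={"positive": ["known active"], "negative": ["known inactive"]},
    confounders=["molecular weight"],
    analysis_plan="compare candidates against controls; fixed threshold",
    sensitivity="repeat with a different random seed",
)
```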

Console

What you see while it runs.

SSE events stream into the console in real time. Hypotheses update live. Candidates rank as experiments complete. Charts render when visualization tools fire. You approve hypotheses before testing begins.
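
For readers unfamiliar with Server-Sent Events, the sketch below parses the standard `event:` / `data:` wire format into (event name, payload) pairs. The event names and JSON payload fields are hypothetical; the source does not document Ehrlich's actual event schema.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, json_payload) for each event in an SSE stream."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The format strips one leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)


# Hypothetical stream contents.
sample = [
    "event: phase\n",
    'data: {"phase": "PICO", "elapsed_s": 0.8}\n',
    "\n",
    "event: hypothesis\n",
    'data: {"id": "H1", "status": "SUPPORTED"}\n',
    "\n",
]
events = list(parse_sse(sample))
```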

localhost:5173/investigation/inv_8f3a2b

Investigation

Research Question

Find antimicrobial compounds effective against MRSA with low resistance risk and favorable ADMET profiles

Investigation Timeline (~30s total)

PICO

LIT

FORM

TEST

CTRL

SYNTH

[PICO] Domain detected: Molecular Science (0.8s)

[LIT] 23 papers found via Semantic Scholar (3.2s)

[LIT] GRADE assessment: moderate certainty (4.1s)

[FORM] 3 hypotheses formulated (Opus) (8.4s)

[FORM] Awaiting approval... (8.5s)

[TEST] ChEMBL screen → 47 candidates (12.3s)

[TEST] AutoDock Vina → 12 binding hits (18.7s)

[TEST] XGBoost trained (scaffold-split AUC: 0.84) (22.1s)

[CTRL] Z'-factor: 0.72 (excellent) (25.6s)

[SYNTH] GRADE synthesis: ⊕⊕⊕⊖ moderate (30.2s)

Hypothesis Board

H1 · SUPPORTED

Compound X MIC < 4 µg/mL against MRSA

0.89

H2 · REFUTED

Resistance risk via efflux pump mutations is low

0.23

H3 · TESTING

ADMET profile permits oral bioavailability

0.65

localhost:5173/investigation/inv_8f3a2b#candidates

Candidates

Candidate Ranking

ID	Score	Docking (kcal/mol)	ADMET	Lipinski
CMP-1247	0.94	-8.7	Pass	5/5
CMP-0893	0.87	-7.9	Pass	5/5
CMP-2156	0.81	-7.2	Warn	5/5
CMP-0412	0.74	-6.8	Pass	4/5

localhost:5173/investigation/inv_8f3a2b#admet

ADMET Profile

ADMET Radar

Multi-Model Architecture

Choose your team, match the task.

Every investigation assembles a team of three specialized models. Pick the tier that fits your question -- from fast exploration to maximum reasoning power.

Director

Opus 4.6

Formulates hypotheses with predictions, criteria, and Bayesian priors. Designs experiments with controls and confounders. Evaluates evidence and synthesizes findings with GRADE certainty. Streaming with 10K token extended thinking.

WHY: Hypothesis quality requires deep reasoning. No tool access -- pure scientific thinking.

Parallel Execution · 2 experiments per batch

Researcher A

Sonnet 4.5

Executes experiment protocol. Queries databases, trains models, runs statistical tests, screens candidates. Max 10 tool calls per experiment.

WHY: Fast tool execution. Domain-filtered to only relevant tools.

Researcher B

Sonnet 4.5

Independent experiment on a different hypothesis. Cross-references literature, validates controls, computes metrics.

WHY: Parallel execution halves wall-clock time per batch.
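
The batching described above, two experiments at a time, can be sketched with `asyncio.gather`. The function names and the simulated work are placeholders; the source states the batch size of 2 but not how the orchestrator schedules work.

```python
import asyncio

BATCH_SIZE = 2  # two experiments per batch, as described above


async def run_experiment(hypothesis_id: str) -> str:
    # Placeholder for a researcher executing its experiment protocol.
    await asyncio.sleep(0.01)
    return f"{hypothesis_id}: done"


async def run_in_batches(hypothesis_ids: list[str]) -> list[str]:
    results: list[str] = []
    for start in range(0, len(hypothesis_ids), BATCH_SIZE):
        batch = hypothesis_ids[start:start + BATCH_SIZE]
        # Experiments within a batch run concurrently; batches run in order.
        results.extend(await asyncio.gather(*(run_experiment(h) for h in batch)))
    return results


results = asyncio.run(run_in_batches(["H1", "H2", "H3"]))
```

`asyncio.gather` returns results in the order the coroutines were passed, so the output order stays stable even when one experiment finishes first.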

Summarizer

Haiku 4.5

Compresses tool outputs over 2000 characters. PICO decomposition. Domain classification. GRADE evidence grading. Keeps the Director focused on reasoning, not parsing.

WHY: Compression is mechanical, not creative. Haiku is 60x cheaper than Opus.
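
The compression rule can be pictured as a simple length gate in front of the Director. In this sketch the 2000-character threshold comes from the text above, while `route_tool_output` and the stand-in summarizer are hypothetical; a real deployment would call the Summarizer model rather than truncate.

```python
from typing import Callable

COMPRESS_THRESHOLD = 2000  # characters, per the description above


def route_tool_output(output: str, summarize: Callable[[str], str]) -> str:
    """Pass short outputs through unchanged; compress longer ones."""
    if len(output) <= COMPRESS_THRESHOLD:
        return output
    return summarize(output)


def stand_in_summarizer(text: str) -> str:
    # Stand-in only: truncation is not summarization.
    return text[:200] + " [compressed]"


short = route_tool_output("x" * 2000, stand_in_summarizer)
long = route_tool_output("x" * 2001, stand_in_summarizer)
```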

Scientific Domains

Four domains. Domain-agnostic engine.

Each domain brings its own tools, scoring definitions, and prompt examples. The orchestrator, methodology, and persistence work identically across all. Multi-domain questions are auto-detected and merged.

MOLECULAR SCIENCE

22 TOOLS

Drug discovery, antimicrobial resistance, environmental toxicology, agricultural biocontrol.

•Molecular property analysis (structure, 3D shape, fingerprints)

•Drug effectiveness screening across thousands of assays

•3D binding simulation (how molecules fit into proteins)

•Drug safety and absorption prediction (ADMET profiling)

•Environmental toxicity data (EPA CompTox)

•Protein target discovery (200K+ structures)

Visualizations

Binding scatter · ADMET radar · Forest plot · Evidence matrix

Example: Find drug candidates effective against antibiotic-resistant Klebsiella

TRAINING SCIENCE

11 TOOLS

Exercise physiology, protocol optimization, injury risk assessment, clinical trial evidence.

•Combine multiple studies to measure overall training impact

•Side-by-side protocol comparison with evidence ranking

•Injury risk scoring (sport, load, history, age)

•Training load monitoring (cumulative stress, recovery, fatigue)

•Clinical trial + PubMed literature search

•Performance modeling (predict fatigue dips and peak readiness)

Visualizations

Training timeline · Muscle heatmap · Performance chart · Dose-response

Example: Compare periodized vs non-periodized resistance training in trained athletes

NUTRITION SCIENCE

10 TOOLS

Supplement evidence, nutrient adequacy, drug interactions, inflammatory scoring, safety monitoring.

•Supplement label ingredient lookup (120K+ products)

•Nutrient profiling across 1.1M+ foods

•Intake adequacy vs recommended targets (minimum, safe upper limit)

•Drug-supplement interaction screening

•Adverse event reports from FDA database

•Inflammatory index scoring for dietary patterns

Visualizations

Nutrient comparison · Nutrient adequacy · Therapeutic window · Funnel plot

Example: Assess safety and efficacy of vitamin D3 + K2 supplementation at high doses

IMPACT EVALUATION

9 TOOLS

Causal analysis of social programs: education, health, employment, housing, sports. Four causal methods (DiD, PSM, RDD, Synthetic Control), 13 data sources across US and Mexico.

•4 causal methods to measure real program impact (not just correlation)

•Automated checks for hidden biases in study design

•US federal data: spending, education, housing, health, labor

•Mexico data: INEGI, Banxico, datos.gob.mx + indicator quality validation

•World Bank + WHO global indicators (190+ countries)

•Cross-program comparison + cost-effectiveness analysis

Visualizations

Program dashboard · Geographic comparison · Parallel trends

Example: What is the causal effect of conditional cash transfers on school enrollment in Latin America?

Multi-Domain Investigations

Ask a question that spans multiple domains and Ehrlich detects it automatically. DomainRegistry.detect() returns all matching domains. merge_domain_configs() creates a synthetic config with the union of tool tags, concatenated scoring definitions, and joined prompt examples. The researcher sees tools from all relevant domains.

Example: “Evaluate creatine supplementation for resistance training performance and renal safety” → Nutrition + Training domains merged
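
The merge behavior described above (union of tool tags, concatenated scoring definitions, joined prompt examples) can be sketched as follows. `DomainConfigSketch` and `merge_configs` are simplified stand-ins, not the real `DomainConfig` or `merge_domain_configs()`, and the tag and score names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainConfigSketch:
    name: str
    tool_tags: frozenset[str]
    score_definitions: tuple[str, ...] = ()
    prompt_examples: tuple[str, ...] = ()


def merge_configs(configs: list[DomainConfigSketch]) -> DomainConfigSketch:
    """Build one synthetic config from every matching domain."""
    if not configs:
        raise ValueError("need at least one domain config")
    return DomainConfigSketch(
        name=" + ".join(c.name for c in configs),
        # Union of tags: the researcher sees tools from all domains.
        tool_tags=frozenset().union(*(c.tool_tags for c in configs)),
        # Concatenate in the order the domains were given.
        score_definitions=tuple(s for c in configs for s in c.score_definitions),
        prompt_examples=tuple(p for c in configs for p in c.prompt_examples),
    )


nutrition = DomainConfigSketch("Nutrition", frozenset({"nutrition", "safety"}),
                               ("adequacy",), ("Assess vitamin D3 + K2...",))
training = DomainConfigSketch("Training", frozenset({"training"}),
                              ("effect_size",), ("Compare periodized...",))
merged = merge_configs([nutrition, training])
```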

Add Your Domain

Register a DomainConfig with tool tags, data sources, scoring definitions, and prompt examples. The engine handles orchestration, persistence, visualization, and reporting. Connect external tools via MCP servers — community-built domains plug in without modifying the core engine.

Contributing Guide

Visualizations

The system picks the right visualization.

The orchestrator intercepts tool results and renders the matching visualization automatically. 3D molecular viewers for docking results. Statistical plots for meta-analysis. Anatomy diagrams for training. Node graphs for hypothesis tracking. No configuration needed.

3D Molecular Viewers

3Dmol.js WebGL

•Live Lab Viewer — SSE-driven scene: protein targets load, ligands dock, candidates color by score
•3D Conformer Viewer — MMFF94-optimized 3D structures with interactive rotate/zoom
•Docking Viewer — Protein + ligand overlay showing binding pocket and interactions

Statistical Charts

Recharts + Visx

•Forest Plot — Meta-analysis effect sizes with confidence intervals
•Funnel Plot — Publication bias assessment across studies
•Dose-Response Curve — Dose-response with confidence band (Visx)
•Evidence Matrix — Hypothesis-by-evidence heatmap (Visx)

Domain-Specific Charts

Recharts + Custom SVG

•Binding Scatter — Compound binding affinities across targets
•ADMET Radar — Drug-likeness property profiles (6 axes)
•Training Timeline — Training load with ACWR danger zones + brush
•Performance Chart — Banister fitness-fatigue model (CTL/ATL/TSB)
•Muscle Heatmap — Anatomical front/back body diagram with activation intensity
•Nutrient Comparison — Grouped bar chart comparing foods
•Nutrient Adequacy — Horizontal bars showing % RDA with MAR score
•Therapeutic Window — EAR/RDA/UL safety zones per nutrient
•Program Dashboard — Multi-indicator KPI view with target tracking
•Geographic Comparison — Region bar chart with benchmark line
•Parallel Trends — DiD treatment vs control over time

Investigation UI

React Flow + Custom

•Investigation Diagram — Hypothesis/experiment/finding node graph with status colors and revision edges
•Hypothesis Board — Kanban grid with expandable confidence bars and approval cards
•Candidate Table — Thumbnail grid with 2D SVG + expandable 3D viewer + Lipinski badge
•Candidate Comparison — Side-by-side scoring view for 2-4 candidates with best-in-group highlighting
•Investigation Report — 8-section structured report with full audit trail and markdown export

Add Your Own

When you register a new domain, you can create custom visualization components using any rendering library: Recharts, Visx, D3, custom SVG, WebGL, maps, network graphs. Register them in the VizRegistry by viz_type string. The orchestrator auto-intercepts any tool result containing that type and renders it inline. Suspense boundaries, grid layout, and error fallbacks are handled for you.

Ground Truth

Every claim has a source.

Ehrlich queries trusted global databases in real time. Findings link to ChEMBL compound IDs, PDB structure codes, DOIs, and PubChem CIDs. No hallucinated citations. No invented data points.

External APIs

Institutional Memory

Self-Referential Research

Every investigation's findings are indexed in a full-text search database. Future investigations query past findings via search_prior_research. Knowledge compounds over time.

ChEMBL

Bioactivity data for any assay type

2.5M compounds·Free

Semantic Scholar

Literature search + citation chasing

200M+ papers·Free

RCSB PDB

Protein target discovery

200K+ structures·Free

PubChem

Compound search by target/activity

100M+ compounds·Free

EPA CompTox

Environmental toxicity + bioaccumulation

1M+ chemicals·API Key

UniProt

Protein function + disease associations

250M+ sequences·Free

Open Targets

Disease-target associations (scored)

12K+ targets·Free

GtoPdb

Expert pharmacology (pKi, pIC50)

Curated·Free

ClinicalTrials.gov

Exercise/training RCT evidence

500K+ studies·Free

PubMed

Biomedical literature with MeSH

36M+ articles·Free

wger

Exercise database (muscles, equipment)

800+ exercises·Free

NIH DSLD

Supplement label ingredients

180K+ products·Free

USDA FoodData

Nutrient profiles (macro + micro)

300K+ foods·API Key

OpenFDA CAERS

Supplement adverse event reports

Ongoing·Free

RxNav

Drug-nutrient interaction screening

RxNorm DB·Free

World Bank

Development indicators by country (GDP, poverty, education)

16K+ indicators·Free

WHO GHO

Global health statistics (mortality, disease, life expectancy)

2K+ indicators·Free

FRED

US economic time series (GDP, unemployment, CPI)

800K+ series·API Key

Census Bureau

US demographics, poverty, education

ACS 5-year·Free

BLS

US labor statistics (unemployment, CPI, wages)

130K+ series·Free

USAspending

Federal spending awards and grants

All agencies·Free

College Scorecard

US higher education outcomes

6K+ schools·API Key

HUD

Fair Market Rents, income limits

All counties·Free

CDC WONDER

US mortality, natality, public health

National·Free

data.gov

US federal open dataset discovery

300K+ datasets·Free

INEGI

Mexico economic/demographic time series

400K+ series·API Key

Banxico

Mexico central bank series

Financial·API Key

datos.gob.mx

Mexico federal open datasets

1000+ datasets·Free

Ehrlich tsvector · Self-referential

Past findings (institutional memory)

Growing·Internal

Who It's For

Same product at every level.

All 91 tools, all 28 data sources, and the full 6-phase methodology at every tier. The only variable is the Director model quality.

Student

Free Haiku. 3 investigations/month.

Learn scientific methodology by doing it. Every investigation teaches hypothesis design, experimental controls, and evidence evaluation. Same tools the professionals use.

Academic Researcher

Monthly credits. Sonnet for routine, Opus for publications.

Run systematic reviews, test hypotheses across domains, build on prior findings through self-referential search. Full audit trail for reproducibility.

Industry / Government

BYOK. Your Anthropic key, our methodology + tools.

91 computational tools, 28 data sources, structured reporting. Commercial license for private modifications. Self-host or use the hosted instance with your own Anthropic key.

Why Ehrlich

What makes this different

The AI implements the scientific methodology. It doesn't invent it. Tools execute on real data. Findings link to real sources.

Real Computation

91 tools that compute, not summarize.

Ehrlich trains ML models, runs causal inference, executes statistical tests, and validates with controls. Every tool returns structured data from real computation or real APIs -- not summaries.

•Molecular docking + drug-likeness profiling
•ML classifiers on any structured data (train, predict, cluster)
•Causal inference: Difference-in-Differences, Propensity Score Matching, Regression Discontinuity (RDD), Synthetic Control
•Statistical testing (t-test, Mann-Whitney, Fisher, chi-squared)
•Nutrient interaction screening + adverse event monitoring
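
To make "compute, not summarize" concrete, here is the simplest form of one method from the list above: a two-group, two-period Difference-in-Differences estimate. The function name and the enrollment figures are hypothetical, and this sketch omits the standard errors and parallel-trends checks a real analysis needs.

```python
from statistics import mean


def did_estimate(treated_pre: list[float], treated_post: list[float],
                 control_pre: list[float], control_post: list[float]) -> float:
    """DiD = (treated change over time) - (control change over time)."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical school enrollment rates (%), before and after a program.
effect = did_estimate(
    treated_pre=[70.0, 72.0], treated_post=[80.0, 82.0],
    control_pre=[68.0, 70.0], control_post=[72.0, 74.0],
)
print(f"Estimated program effect: {effect:.1f} percentage points")
```

Subtracting the control group's change removes the trend both groups share, which is what separates a causal estimate from a raw before-and-after comparison.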

Open Source, Self-Hostable

COSS. Same code, two paths.

Self-host with your own API key for free -- no limits, no credits, no account. Or use the hosted instance where credits cover Anthropic API costs. A student in Mexico and a pharma company in Boston get the same 91 tools, the same 28 data sources, the same methodology.

•Self-host: clone, bring your API key, no limits
•Hosted: credits cover Anthropic costs (Opus is expensive)
•Credits: Haiku (1), Sonnet (3), Opus (5)
•AGPL-3.0: inspect, modify, extend, contribute
•Commercial license for private modifications

Structured Methodology

Popper, Fisher, GRADE. Not conversation.

Every investigation follows a 6-phase protocol with falsifiable hypotheses, controlled experiments, evidence hierarchies, and GRADE certainty grading. Findings link to real source IDs. You approve hypotheses before testing begins.

•Falsifiable hypotheses with predictions + criteria
•Controlled experiments with confounders + analysis plans
•8-tier evidence hierarchy traced to original sources
•GRADE certainty grading on final synthesis
•User approval gate before experiment execution

Open Source

Ehrlich is COSS -- Commercial Open-Source Software. The same model used by Supabase, PostHog, Cal.com, and GitLab. The entire codebase is open source under AGPL-3.0. There is no proprietary version.

domain_config.py

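# DomainConfig, ScoreDefinition, and registry are provided by the Ehrlich
# codebase; their import lines are omitted here.

# Declare the domain. Tool tags determine which tools the researcher sees.
MATERIALS_SCIENCE = DomainConfig(
    name="Materials Science",
    tool_tags=frozenset({"materials", "simulation"}),
    # Scores reported for candidates in this domain.
    score_definitions=[
        ScoreDefinition(
            name="hardness",
            label="Vickers Hardness",
            unit="HV",
        ),
    ],
    prompt_examples=[
        "Discover alloys with high-temperature stability..."
    ],
)

# Register the domain so the engine can detect and use it.
registry.register(MATERIALS_SCIENCE)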

AGPL-3.0 (Free Use)

Students, academics, and individual researchers use Ehrlich freely. Self-host internally without restrictions. If you offer Ehrlich as a network service, modifications must be open-sourced.

Commercial License

Companies that want private modifications purchase an AGPL exemption. Includes commercial support, SLA, and custom domain development. Precedent: MongoDB, Confluent, GitLab, Spree Commerce.

91 Tools, 4 Domains

Molecular, training, nutrition, and impact evaluation. Each domain brings its own tools, scoring, and visualization. Add a DomainConfig and the engine handles the rest.

github.com/Sequela02/ehrlich

Roadmap

Four domains today. Any domain tomorrow.

The engine is domain-agnostic. Register a DomainConfig with tools, data sources, and scoring definitions. The orchestrator, methodology, and visualization pipeline work identically across all domains.

Planned Domains

Materials Science

Alloy design, polymer properties, crystal structure prediction. ICSD, Materials Project, AFLOW databases.

Genomics

Gene expression analysis, variant interpretation, pathway enrichment. NCBI, Ensembl, UniProt cross-referencing.

Environmental Science

Pollution monitoring, climate data analysis, biodiversity assessment. EPA, NOAA, GBIF integration.

Platform Features

MCP Ecosystem

Connect external MCP servers as tool providers. Community-built domains plug in without code changes to the core engine.

REST API

Programmatic access to investigations. Start, monitor, and retrieve results via API. Webhook notifications on completion.

Multi-Provider

Swap the Director, Researcher, or Summarizer to any LLM provider. OpenAI, Google, open-weight models. Mix providers per role for cost or capability.

Team Collaboration

Shared investigations, commenting, branching hypotheses. Build on each other's findings across your research group.

Public Beta

Hosted instance pricing.

Self-hosting is free with your own API key. The hosted instance uses Pay-as-you-go Credits (Haiku=1, Sonnet=3, Opus=5) to cover Anthropic costs. Alternatively, use Bring Your Own Key (BYOK) for free, unlimited hosted access (subject only to your Anthropic API limits).

Credits

Pay-as-you-go

Haiku (1), Sonnet (3), Opus (5)

Hosted infrastructure with no setup. Credits cover Anthropic API costs.

•Haiku investigation = 1 credit
•Sonnet investigation = 3 credits
•Opus investigation = 5 credits
•Full 6-phase methodology
•Hosted high-performance infrastructure

Buy Credits

BYOK

Free during beta

Unlimited (Subject to Anthropic limits)

Bring Your Own Key. Use your Anthropic API key directly. Ideal for judges and heavy testing.

•Your own Anthropic API key
•No Ehrlich credit limits
•We cover the compute/hosting cost
•Full 91 tool access
•Perfect for hackathon evaluation

Use Own Key

Or self-host.

Clone the repo, add your API key, run the server. No account needed. Full AGPL-3.0 access to everything.

terminal

$ git clone https://github.com/Sequela02/ehrlich
$ cd ehrlich/server && uv sync
$ export ANTHROPIC_API_KEY=sk-...
$ uv run uvicorn ehrlich.api.app:create_app --factory --port 8000

Run your first investigation.

Free tier. No credit card. 3 Haiku investigations per month. Full methodology. All tools.
