Everyone Can Have a Personal Health Adviser Now
For most of history, "personalized medicine" meant being rich enough to afford a doctor who knew you by name. Last week I built a version of it on my laptop, for free, from a file I'd been ignoring for seven years.
Not a metaphor. An actual file — the raw data from a consumer DNA test I'd taken back when the novelty was the ancestry pie chart. I'd read the glossy report once, filed it, and forgotten it. The raw data sat in a folder, untouched, the whole time. Last week I finally opened it, pointed a few AI agents and some open-source bioinformatics tools at it, and by the end of an afternoon I had something the report never gave me: answers to questions I could actually bring to a doctor.
This isn't a story about my DNA. It's a story about a capability that just quietly became available to everyone — and almost nobody has noticed.
The drawer of dead reports
If you've ever spat in a tube for 23andMe or AncestryDNA, you have a folder like mine. Glossy PDFs. Ancestry breakdowns. Trait cards telling you whether you can smell asparagus in your urine. A few health notes, carefully hedged.
Here's the thing I never understood until last week: those PDFs are interpretations frozen at a date. The company took your raw data, ran it against what was known on the day they generated the report, picked the findings they were comfortable showing you, and printed it. That was 2019, or 2016, or whenever. The science has moved a great deal since. The report hasn't.
But underneath the report is the actual asset — the raw data file. For a typical consumer test that's a genotyping array: a few hundred thousand carefully chosen positions in your genome, read directly. Open tools can statistically fill in the gaps between them — a process called imputation — to estimate several million variants. It isn't a full genome sequenced letter by letter. But it's far, far more than the PDF ever surfaced. And critically, that file doesn't expire.
The reports were the product. The raw data was the thing that mattered. I'd been keeping the receipt and throwing away the groceries.
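For a sense of what that raw file actually contains, here is a minimal sketch of reading one. It assumes a 23andMe-style export: tab-separated text with `#` comment lines and four columns (rsid, chromosome, position, genotype). Other vendors lay their files out differently, and the rsIDs in the sample are placeholders, not real variants.

```python
import io
from collections import Counter

def parse_raw_genotypes(handle):
    """Yield (rsid, chromosome, position, genotype) from a tab-separated export."""
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) < 4 or not fields[2].isdigit():
            continue  # skip malformed rows or an un-commented header row
        rsid, chrom, pos, genotype = fields[:4]
        yield rsid, chrom, int(pos), genotype

def summarise(handle):
    """Count total calls, no-calls (marked with '-') and calls per chromosome."""
    calls = list(parse_raw_genotypes(handle))
    no_calls = sum(1 for _, _, _, genotype in calls if "-" in genotype)
    per_chrom = Counter(chrom for _, chrom, _, _ in calls)
    return {
        "total": len(calls),
        "no_calls": no_calls,
        "per_chromosome": dict(per_chrom),
    }

# Tiny made-up sample standing in for a real export.
SAMPLE = """# rsid\tchromosome\tposition\tgenotype
rs0000001\t1\t1000\tAG
rs0000002\t1\t2000\t--
rs0000003\tX\t3000\tC
"""

summary = summarise(io.StringIO(SAMPLE))
print(summary)
```

Nothing here interprets anything. It only shows that the "asset" is an ordinary text file of positions and letters that any tool can read.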
The asset you already own
Three things had to happen for "a doctor who knows your biology" to stop being a luxury good. All three have now happened, more or less at once.
First, the data got cheap, and you already own it. Sequencing the first human genome, through the Human Genome Project, cost roughly $2.7 billion across the full programme — about $300 million just for the draft sequence. By 2022, the production cost of sequencing a genome at a research centre had fallen to around $525, per the US National Human Genome Research Institute's own figures. Consumer whole-genome sequencing now sells for a few hundred dollars on sale. And the SNP-array tests most people have already taken cost about $100 to $200 — the data is already sitting in millions of drawers. The expensive part is finished. It happened years ago, and you paid for it.
Second, the interpretation layer is open, free, and improving. This is the part that surprised me most. The reference databases that clinical labs actually rely on are public goods, free to anyone:
- ClinVar, run by the US National Center for Biotechnology Information, catalogues the clinical significance of genetic variants. It's updated weekly, and as of mid-2026 it holds over 4.5 million variant records from thousands of submitters.
- The PGS Catalog, hosted by EMBL-EBI, is an open database of polygenic scores — the statistical models that estimate genetic loading for common conditions. It crossed 5,300 published scores in 2026, up from around 4,700 a year earlier. It grows continuously.
- PharmGKB and CPIC turn pharmacogenomic research into actual prescribing guidance. CPIC's peer-reviewed, freely available guidelines now span 34 genes and 164 drugs.
And the analysis tools that read your file against those databases are open source: variant annotators like Ensembl VEP and snpEff, pharmacogene callers like PharmCAT, polygenic-risk tools like pgsc_calc and PRSice-2. These are the same categories of tool used in clinical and research genomics. They cost nothing and run on a laptop.
So your raw data is a renewable asset. The file is fixed. What we know about it grows every single week. A variant that meant nothing in 2018 might be tied to a drug response, or a clinical-trial criterion, or a reclassified risk today. The commercial report could only ever show you the curated panel, frozen at print time. The raw file holds the lot — it just needs interpreting, again and again, as the knowledge compounds.
Third — and this is what made it accessible to me, a non-specialist — AI removed the expertise bottleneck. The reason normal people never touched these tools wasn't price. It was that standing up a genomics pipeline is a multi-week slog of dependency hell, reference-genome mismatches, and arcane file formats. That barrier is what AI agents have quietly demolished.
The afternoon
I'll keep this tactile but high-level, because the point isn't the recipe. With AI agents handling the tooling, I set up the open-source stack and ran three kinds of analysis: functional annotation (what are my variants associated with?), pharmacogenomics (how do I metabolise common drugs?), and polygenic risk (what's my genetic loading for a few common conditions?).
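Of the three, polygenic risk is the easiest to show in miniature. At its core a polygenic score is a weighted sum: for each variant in a published score, count how many copies of the effect allele you carry and multiply by that variant's weight. The sketch below assumes simple two-letter genotypes and uses made-up variant IDs and weights. Real tools also handle strand and allele matching, imputed dosages, and ancestry adjustment, none of which appear here.

```python
def effect_allele_dosage(genotype, effect_allele):
    """Count copies (0, 1 or 2) of the effect allele in a genotype such as 'AG'."""
    return sum(1 for allele in genotype if allele == effect_allele)

def polygenic_score(genotypes, weights):
    """Return (score, variants_used, variants_missing).

    genotypes: {variant_id: genotype string}
    weights:   {variant_id: (effect_allele, weight)}
    Variants that are absent or no-calls ('-') are counted as missing, not guessed.
    """
    score = 0.0
    used = 0
    missing = 0
    for variant_id, (effect_allele, weight) in weights.items():
        genotype = genotypes.get(variant_id)
        if genotype is None or "-" in genotype:
            missing += 1
            continue
        score += weight * effect_allele_dosage(genotype, effect_allele)
        used += 1
    return score, used, missing

# Hypothetical score file and genotypes, for illustration only.
weights = {
    "rs0000001": ("A", 0.5),
    "rs0000002": ("C", -0.25),
    "rs0000003": ("T", 0.125),
}
genotypes = {"rs0000001": "AA", "rs0000002": "CT"}

score, used, missing = polygenic_score(genotypes, weights)
print(score, used, missing)
```

The raw number means little on its own. It only becomes informative when compared against the distribution of scores in a reference population, which is one reason to leave the real calculation to the dedicated tools.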
The friction was exactly what you'd expect, and exactly what normally kills this for a non-bioinformatician. Reference-genome version mismatches — my file was built against one coordinate system, a tool expected another. Chromosome-naming differences (chr1 versus 1 — yes, really, that breaks things). Dependency conflicts. Download mirrors that timed out halfway through a multi-gigabyte reference.
Every one of those would once have cost me a day of Stack Overflow archaeology. Instead the agents diagnosed and fixed them in-loop: read the error, identified the mismatch, applied the conversion, moved on. That's the whole story of what AI did here, and it's worth being precise about: AI didn't do the genomics. The open-source tools did. AI removed the expertise barrier that gatekept them. It flattened weeks of specialist friction into an afternoon of supervised iteration.
That distinction matters, because it's the same pattern showing up everywhere AI is actually useful right now. It rarely replaces the expert system. It collapses the on-ramp to the expert system.
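The chromosome-naming mismatch is the simplest of those frictions to show. A minimal sketch, assuming the two common conventions: UCSC-style labels (`chr1`, `chrX`, `chrM`) and Ensembl-style labels (`1`, `X`, `MT`). The function name and the `style` parameter are illustrative, not taken from any particular tool.

```python
def normalise_chromosome(name, style="ucsc"):
    """Convert a chromosome label to 'ucsc' (chr1, chrM) or 'ensembl' (1, MT) style."""
    if style not in {"ucsc", "ensembl"}:
        raise ValueError(f"unknown style: {style!r}")
    core = name.strip()
    if core.lower().startswith("chr"):
        core = core[3:]  # drop any existing prefix before re-applying the style
    core = core.upper()
    if core in {"M", "MT"}:
        # The mitochondrial chromosome is named differently in each convention.
        return "chrM" if style == "ucsc" else "MT"
    return "chr" + core if style == "ucsc" else core

print(normalise_chromosome("1"))                # chr1
print(normalise_chromosome("chrX", "ensembl"))  # X
print(normalise_chromosome("MT"))               # chrM
```

Renaming only fixes the labels. A file built against a different reference-genome version has different coordinates, and that needs a proper coordinate conversion rather than a string substitution.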
What fell out of it
The first thing I checked was caffeine, and I checked it for a laugh. I'd heard someone on a podcast mention CYP1A2 — the gene behind the enzyme that handles the bulk of caffeine metabolism, somewhere around 95% of it — and realised I had no idea which way mine fell. I also have a faintly ridiculous caffeine tolerance: an espresso late in the evening does nothing to my sleep, and pre-workout caffeine reliably works. So I went looking, expecting nothing.
The result matched the lived experience exactly: rapid metaboliser. Caffeine hits, does its job, clears fast. Nothing I didn't already know in my bones — but there's something quietly satisfying about a file confirming a decade of anecdotal self-knowledge.
And then it taught me the real lesson, by accident. My report labelled the result CYP1A2 *1F/*1F. But when I looked the variant up, a good chunk of the academic literature uses *1F to mean the slower allele — the opposite of how my report used the label. The phenotype was unambiguous and matched my life; the naming was a genuine mess, inconsistent across sources. That is the interpretation layer in miniature: powerful, free, improving — and contested enough that you treat any single label as a question to verify, not a verdict to act on. Exactly the right instinct to carry into the findings that actually matter.
Because the most useful output wasn't the fun one. It was a drug-metabolism flag.
For one common, cheaply prescribed class of medication, my genotype carries an elevated risk of a poor response — with a safer alternative available in the same class. Nothing exotic. Nothing frightening. But exactly the kind of thing you'd want on the table before a prescription is written, rather than discovered after a bad reaction.
I'm deliberately not naming the gene or the drug, because my specific biology isn't the point and isn't anyone's business. What matters is the category. This kind of pharmacogenomic flag is not fringe science; it's established clinical practice. The textbook examples are well documented and FDA-labelled: CYP2C19 and clopidogrel (a common cardiovascular drug that stays inactive until the CYP2C19 enzyme converts it into its working form), CYP2D6 and codeine, HLA-B*57:01 and abacavir. Hospitals with the budget already test for these before prescribing.
What struck me is that this flag fell straight out of a free, open-source pharmacogene caller — and it had not been clearly surfaced by the paid consumer reports I'd been sitting on. That's not an indictment of any company; it's the difference between a generic report built for millions of people and asking the data one specific question on your own behalf. The report answers "what's broadly interesting about this person?" The tool answers "what should this person check before their next prescription?" Those are different questions, and only one of them is yours.
The real unlock: from once to ongoing
Here's the shift that turns this from a fun weekend project into something genuinely new.
The genome is static. The science isn't. So the workflow isn't "analyse once and file the result" — the mistake I'd made for seven years. It's re-consult on demand:
- A doctor proposes a new medication → check it against your own data first.
- A clinical trial appears with a genetic eligibility criterion → check whether you'd qualify. This is routine now, especially in oncology and rare disease.
- A new risk study or supplement trend lands → re-run against the updated databases and see whether it actually changes anything for you specifically, rather than following generic advice.
- A carrier finding becomes relevant to family planning, or to a relative's diagnosis.
Each of those used to be a research project requiring an expert. Now each is a ten-minute query against a file you already own, run against databases that are richer this month than they were last month. Same data, more value every year. That's what "renewable" actually means in practice — and it's the part the glossy report can never give you, because the report stopped learning the day it was printed.
[Diagram: the workflow, ending in a loop from "re-run later" back to "analysis"]
The loop from "re-run later" back to "analysis" is the whole point. Everything above it is a one-time setup. The loop is the asset.
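One way to sketch that loop in code: record which database release each analysis last ran against, and re-run only the analyses whose sources have moved on. Everything below is hypothetical, including the release labels and the mapping from analyses to databases. How you look up the current release of each database is left out.

```python
def stale_analyses(last_run, current, depends_on):
    """Return the analyses whose source databases changed since the last run.

    last_run / current: {database_name: release_label}
    depends_on:         {analysis_name: [database_name, ...]}
    """
    stale = []
    for analysis, databases in depends_on.items():
        # An analysis is stale if any database it reads has a different release label.
        if any(last_run.get(db) != current.get(db) for db in databases):
            stale.append(analysis)
    return sorted(stale)

# Illustrative mapping only; a real pipeline may draw on more sources per analysis.
depends_on = {
    "annotation": ["ClinVar"],
    "pharmacogenomics": ["CPIC"],
    "polygenic_risk": ["PGS Catalog"],
}
last_run = {"ClinVar": "release-A", "CPIC": "release-A", "PGS Catalog": "release-A"}
current = {"ClinVar": "release-B", "CPIC": "release-A", "PGS Catalog": "release-B"}

to_rerun = stale_analyses(last_run, current, depends_on)
print(to_rerun)
```

The genome file never appears in this check. Only the knowledge around it changes, which is exactly why the re-run is cheap.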
The honest boundary
Let me be unambiguous, because this is the part that matters most and the part most likely to be misread: none of this is a substitute for a doctor. It is a preparation layer.
AI surfaces candidates. Clinicians verify and decide. A polygenic risk score is a population-level statistical estimate, not a diagnosis. A pharmacogenomic flag is a question to ask, not an instruction to follow. Imputed variants carry uncertainty. Plenty of associations in the literature are weak, contested, or won't replicate. If you treat a free afternoon's output as medical fact, you've not democratized medicine — you've just given yourself worse advice than your GP, with more confidence.
The recurring truth across everything I write about AI applies here too: the bottleneck has moved from "can we get the insight?" to "can a human validate it before acting?" For genomics, that human is a qualified clinician. The value isn't walking out of the consult with a prescription you talked your way into. It's walking in with better questions — specific to your own biology, pre-loaded, instead of the vague "is there anything I should know?" that gets you nowhere.
That's a genuinely better patient. Not a replaced doctor.
Why local-and-free is the headline, not the footnote
There's one more reason this matters, and it's the one I feel most strongly about. Everything I described runs locally. The file never leaves the laptop. No upload, no third-party server, no terms of service.
It's worth being precise about how that squares with using AI at all, because it's the first objection people raise: didn't you just send your DNA to a cloud model? No. The AI agents orchestrated the tools — they wrote and debugged the Python, untangled the pipeline, fixed the reference-genome mismatches. The genome itself was crunched by open-source binaries running on my own machine; the raw file's contents never had to be pasted into a model or uploaded anywhere. And you can take it further still: the orchestration itself can run on a local, offline model, so even the scaffolding never touches the network. "Using AI" and "keeping the data local" are only in tension if you let them be — architect it right and the intelligence comes to the data, not the other way round.
For the most sensitive data a person will ever own, that isn't a nice-to-have. It's the entire point. Consider what happened to the people who did hand their genome to someone else's infrastructure. In 2023, a breach exposed the genetic data of nearly seven million 23andMe users. Then in 2025, the company filed for bankruptcy, and a court approved the sale of around 15 million customers' genetic records to a new owner — one nobody had chosen to trust. In the US, federal health-privacy law (HIPAA) covers insurers and providers, not consumer genetics companies. The protection people assumed they had wasn't there.
Your genome is not a password you can rotate after a leak. It's permanent, it implicates your relatives, and once it's on someone else's server you've lost control of it forever. "Runs on your own machine, for free" answers all of that in seven words. The same democratization that makes the analysis cheap also makes it private, provided you choose tools that keep it local. That's not a coincidence worth burying at the bottom of the post. It's the reason to do it this way at all.
The democratization
Personalized medicine stopped being a luxury good and nobody sent out a press release.
The data is cheap and already in your possession. The interpretation tools are free, open, and smarter every week. AI removed the specialist-expertise barrier that gatekept the whole thing. Put those together and the result is something that, a decade ago, took a concierge clinic and serious money: an ongoing, personal health adviser, re-consulted whenever life or science gives you a reason, and run privately, on your own machine.
Not a replacement for your doctor. A way to show up to your doctor informed, proactive, and specific to your own biology — holding a file you've owned all along, finally asking it the right questions.
The genome was always renewable. We just didn't have a cheap enough way to keep asking it things. Now we do, and everyone can.
This is a practitioner's account of a capability, not medical advice. Genetic data is probabilistic and easy to over-read; polygenic scores and pharmacogenomic flags are inputs for a qualified clinician to verify, not conclusions to act on alone. If something here is relevant to a real decision, take it to your doctor — that's the whole idea.