Welcome to Josh Loecker's Personal Site!

My name is Josh Loecker, and I am currently a fifth-year graduate student at the University of Nebraska-Lincoln in Dr. Tomáš Helikar's Lab. I am pursuing a Doctorate in Biochemistry with a Specialization in Bioinformatics; despite this, I consider myself more of a software engineer than a biologist. I am skilled at building and working with high-performance pipelines, and I have a passion for developing robust, maintainable software that is usable by people without computational expertise.

Projects

Mech(AI)nistic (BioRxiv; Manuscript in review)

Genome-scale metabolic modeling is an incredibly useful tool for understanding how diseases work at a cellular scale, but the technical barrier to entry can be quite high because it requires knowledge of both programming and metabolic/mechanistic modeling. To lower this barrier, I led several undergraduate students in developing MechAInistic. It is a multi-agent system that uses Large Language Models to turn natural language questions into executable, model-grounded workflows. By using an "Architect-Reviewer" design, the system builds its own analysis pipelines and returns structured reports on everything from drug-target exploration to pathway comparisons. To validate MechAInistic, we tasked it with identifying a drug therapy for (1) naive B cells in a Rheumatoid Arthritis context, and (2) CD4+ Th17 cells in a Multiple Sclerosis context. For the naive B cells, MechAInistic nominated Devimistat, and for the Th17 cells, it recommended Ivosidenib.
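For readers curious what an "Architect-Reviewer" design can look like in general, here is a minimal Python sketch of the pattern. It is not MechAInistic's actual code: the function names, the Review type, and the toy stand-ins for the LLM-backed agents are all hypothetical and exist only to illustrate one agent proposing a workflow and another either approving it or sending feedback back.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    """Hypothetical reviewer verdict: approve the plan or explain what is missing."""
    approved: bool
    feedback: str = ""


def architect_reviewer_loop(
    question: str,
    architect: Callable[[str, str], List[str]],
    reviewer: Callable[[List[str]], Review],
    max_rounds: int = 3,
) -> List[str]:
    """Ask the architect for a plan, revising it until the reviewer approves."""
    feedback = ""
    for _ in range(max_rounds):
        plan = architect(question, feedback)
        review = reviewer(plan)
        if review.approved:
            return plan
        feedback = review.feedback  # passed to the architect on the next round
    raise RuntimeError("No approved plan within max_rounds")


# Toy stand-ins for the two agents (a real system would call an LLM here).
def toy_architect(question: str, feedback: str) -> List[str]:
    plan = ["load_model", "run_simulation"]
    if "report" in feedback:
        plan.append("write_report")
    return plan


def toy_reviewer(plan: List[str]) -> Review:
    if "write_report" not in plan:
        return Review(approved=False, feedback="add a report step")
    return Review(approved=True)


plan = architect_reviewer_loop(
    "Which pathways differ between two conditions?", toy_architect, toy_reviewer
)
print(plan)
```

In this toy run the first plan is rejected for lacking a report step, and the second round is approved once the architect acts on that feedback.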

COMO (Briefings in Bioinformatics, GitHub)

Constraint-based Optimization of Metabolic Objectives (COMO) is a comprehensive, user-friendly pipeline designed to streamline the integration of multi-omics data for genome-scale metabolic network (GSMN) modeling and drug discovery. By combining heterogeneous datasets (including bulk and single-cell RNA-seq, microarrays, and proteomics) with GSMNs, COMO allows researchers to efficiently construct context-specific models for various cell and tissue types within a unified Docker or Conda environment. The pipeline automates complex tasks such as data processing, simulation, and drug perturbation analysis to identify potential therapeutic targets and repurposable drugs. COMO was validated through a study on B cell metabolism in rheumatoid arthritis and systemic lupus erythematosus, and it offers a robust computational solution for accelerating the discovery of low-cost, effective disease treatments.

AutoRNAseq (BioRxiv, GitLab)

AutoRNAseq is a highly parallelized Snakemake workflow designed to automate the processing of bulk RNA-seq data on high-performance computing (HPC) clusters. Tailored to interface with NCBI's Gene Expression Omnibus Database, the pipeline requires only a list of SRR codes as input. It handles parallel downloading and unpacking of raw data, performs quality control (using FastQC and MultiQC), optionally trims reads (Trim Galore!), aligns sequences to the genome using STAR, and performs gene quantification with Salmon. While optimized to format inputs for metabolic drug discovery and repurposing packages (specifically COMO), AutoRNAseq also serves as a robust, standalone solution for anyone who needs to efficiently generate standardized gene count matrices and quality metrics for downstream RNA-seq analysis.
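The order of stages described above can be sketched in a few lines of Python. This is only an illustration of the flow, not AutoRNAseq's Snakemake rules: the plan_steps function, the step labels, and the placeholder accessions are hypothetical, and running MultiQC once over all samples at the end is an assumption about how the summary report is produced.

```python
from typing import List


def plan_steps(srr_codes: List[str], trim: bool = False) -> List[str]:
    """Return the ordered stage labels for each SRR code (labels are illustrative)."""
    steps: List[str] = []
    for srr in srr_codes:
        steps.append(f"download:{srr}")
        steps.append(f"unpack:{srr}")
        steps.append(f"fastqc:{srr}")
        if trim:
            # Trimming is optional, mirroring the description above.
            steps.append(f"trim_galore:{srr}")
        steps.append(f"star_align:{srr}")
        steps.append(f"salmon_quant:{srr}")
    # Assumed here: one MultiQC summary across all samples at the end.
    steps.append("multiqc:all")
    return steps


# Placeholder accessions, not real datasets.
for step in plan_steps(["SRR0000001", "SRR0000002"], trim=True):
    print(step)
```

A real workflow manager runs independent samples in parallel rather than in this flat order; the sketch only shows which stages each sample passes through.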

Hobbies

Home Lab

I have a home server (running this site!) that I enjoy using for various tech-related experiments. Some of my current favorites are:

Tailscale for easy, VPN-based access to private services
Linkwarden for a "read later" list of links
Paperless-ngx for document scanning and OCR searching
Home Assistant for interacting with smart lights, switches, etc.
Open WebUI + Llama-Swap for private LLM usage, reached through Tailscale for easy and secure access from anywhere in the world