Melanie Lou, Ph.D.

Passionate about making sense of complicated data, strong appetite for knowledge, & quite gritty

Who is Melanie?

  • Data analyst, programmer, and researcher:

    • Develop, design and implement a processing and analysis pipeline for single-cell sequencing data and somatic variant calling in paired tumor/normal data
    • Develop a computational pipeline that calls an initial set of transcriptome/genome mismatch sites and filters these calls for alignment bias, sequencing depth, strand bias, and genomic variation to identify accurate RNA editing sites in RNA-seq and whole genome sequencing (WGS) data sets; perform analyses of the distribution of RNA edits and of conserved and sample-specific editing patterns;
    • Introduced a Bayesian identification method that can assign an unknown DNA sequence to one of several known species groups and conducted simulations to assess and benchmark the performance of the method;
    • Developed a web-based tool to visualize genetic changes in multiple sequence alignments

  • Personal initiative and leadership:

  • Well-rounded and adaptable: I enjoy Latin dance and being in the kitchen, which satisfy a strong desire for constant learning; fitness training keeps my mind clear for creativity and problem solving and keeps stress levels low!

Short skill summary

  • Knowledge of genetics (molecular evolution, phylogenetics), population genetics (infinite sites, structure and migration, Watterson's theta), coalescent theory, Bayesian probability
  • Programming languages: Python, Perl, C, Java, Octave (like Matlab)
  • Statistical analysis: R, Bioconductor
  • Knowledge of NGS: whole genome sequencing (WGS), whole exome sequencing (WES), RNA-seq, single-cell DNA sequencing, FastQC, GSNAP, Bowtie, BWA, TopHat, Cufflinks, SnpEff, SAMtools, Picard, VarScan2, Mutect, IGV, etc.
  • Understanding of statistical concepts: correlation, regression, null hypothesis significance testing vs. Bayesian inference, machine learning (linear & logistic regression, k-means, PCA)
  • Parallel processing & cloud computing: implemented a MapReduce method in Python; set up and used virtual servers in the Amazon cloud with Pig

Experience

Present Computational Biologist at Maverix Biomics:

  • Designed and implemented a pipeline to process low-coverage single-cell DNA that generates variant calls and hierarchical-clustered dendrograms that can distinguish breast cancer and matched normal tissues
  • In conjunction with a customer, helped implement a non-invasive prenatal bioinformatics pipeline to detect chromosomal aberrations in three chromosomes from WES data
  • Building a somatic variant calling pipeline based on paired tumor/normal WES data from lung and kidney carcinoma cell lines
  • Work in cross-functional teams with computational biologists, software engineers, and application scientists to solve customer projects and build Maverix's proprietary cloud-based analytic platform

Nov 2013-April 2014 Scientific Computing Consultant and Analyst (Telecommute) at Genentech:

  • Developed a computational pipeline that applies appropriate QC filters to identify accurate variants in RNA-seq (across several in-house and external data sets) and WGS data
  • Performed analyses to determine conserved and sample-specific patterns of variation; the results are part of a manuscript submitted to BMC Genomics

2012 R&D Genentech Intern:

  • Identified reliable variants in large (~TB) NGS data (considering biases with RNA-seq data) and interesting mutation patterns (at the molecular, gene, and tissue levels)
  • Collaborated through the use of software version control system (SVN) and weekly meetings to discuss progress and brainstorm new project goals
  • Gave a formal presentation of results to the Bioinformatics and Computational Biology (B&CB) department at Genentech

2005-2012 Teaching Assistant: Communicated complex information to non-experts in 6 different biological topics over 7 years
2005 Database and User Interface Developer: Integrated additional microarray data into an existing genomic database, created a web interface, and developed back-end code to generate data visualizations

Education

2012 Doctor of Philosophy (Ph.D.), Computational Biology

1. Barcoding life: Statistical methods that need neither phylogenetic trees nor fixed thresholds were lacking for assigning an unknown sample (e.g., a fish fillet from a market) to the species it could come from. Introduced a method that:

  • Is fast (10,000 identifications in about 3 seconds).
  • Returns well-supported (high-probability), correct identifications.
  • Works well for species that are difficult to tell apart by the naked eye.
PDF: This paper introduces the new identification method that provides a statistical measure of confidence and does not rely on phylogenetic trees or fixed thresholds.

Source code can be found here: http://info.mcmaster.ca/TheAssigner.
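
As a rough illustration of the general idea of assigning an unknown sequence to one of several known species groups with a probability attached, here is a minimal sketch. It is not the method from the paper or the linked source code: it swaps in a simple per-site base-frequency (naive Bayes) model with a uniform prior. The function names, the pseudocount, and the toy sequences are all hypothetical.

```python
import math
from collections import Counter


def log_likelihood(query, references, pseudocount=1.0):
    """Log-probability of `query` under per-site base frequencies of `references`.

    Assumption: all sequences are aligned and of equal length, and sites are
    treated as independent (a simplification, not the published model).
    """
    total = 0.0
    for site, base in enumerate(query):
        counts = Counter(ref[site] for ref in references)
        # Smoothed frequency of the observed base among the 4 nucleotides.
        prob = (counts[base] + pseudocount) / (len(references) + 4 * pseudocount)
        total += math.log(prob)
    return total


def assign(query, groups, pseudocount=1.0):
    """Return {group name: posterior probability}, assuming a uniform prior."""
    log_liks = {
        name: log_likelihood(query, refs, pseudocount)
        for name, refs in groups.items()
    }
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    peak = max(log_liks.values())
    weights = {name: math.exp(ll - peak) for name, ll in log_liks.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}


# Toy, made-up reference sequences for two hypothetical species groups.
groups = {
    "species_A": ["ACGTACGT", "ACGTACGA"],
    "species_B": ["TCGAACCT", "TCGAACCT"],
}
posterior = assign("ACGTACGT", groups)
best = max(posterior, key=posterior.get)
```

The posterior plays the role of the "statistical measure of confidence": an identification is reported together with its probability, with no tree and no fixed distance threshold involved.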

2. Informative data matters: The species group that an unknown sample comes from (e.g., a salmon species) should include samples from different regions of its range (e.g., Atlantic and Pacific salmon). Including even just one sample from another region produces more correct identifications.

PDF: This paper shows improved performance of the method using informative data; here is a graphical abstract.

3. Original model was too simple: Extended and improved a set of probability (recursion) equations that calculates the chance of seeing a number of genetic changes or variants (considering geographic location and substitution rates).

Work completed with Dr. G. Brian Golding at Dept of Biology at McMaster University.

2007 Master of Science (M.Sc.), Computational Biology

Informative graphics. Designed and built a web visualization tool that can:
  • Quickly identify interesting genetic changes in a set of aligned sequences.
  • Identify misalignments.
  • Provide high quality graphics for publications.

PDF: This paper introduces the web-based tool.

Program available here: http://evol.mcmaster.ca/fingerprint.

Work completed with Dr. G. Brian Golding at Dept of Biology at McMaster University.

2005 Bachelor of Science with Honours (B.Sc. Hons.), Biology and Computer Science
Beijing
With Dr. Deng-Ke Niu's lab at Beijing Normal University

Where was Melanie?

Learning traveller: fulfilled a long-time dream of studying language abroad.
Dr. Tom Edge, a Scientist at Environment Canada, and audience at a career seminar

Proud accomplishments!

  • Scholarships: Five academic (provincial) and five travel (national and international)
  • Created career seminar series: Invited professionals to speak about a career path outside of academia and how to get there
  • Created bioinformatics workshop: To show how biology, computer science and math can be used together to solve problems, designed and developed a task in which senior high school students manage, process and interpret DNA sequence data
Last updated: Oct 2015