Content provided by Roman Cheplyaka. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Roman Cheplyaka or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://it.player.fm/legal.

#32 Deep tensor factorization and a pitfall for machine learning methods with Jacob Schreiber

1:15:14
 

In this episode, we hear from Jacob Schreiber about his algorithm, Avocado.

Avocado uses deep tensor factorization to break a three-dimensional tensor of epigenomic data into three orthogonal dimensions corresponding to cell types, assay types, and genomic loci. Avocado can extract a low-dimensional, information-rich latent representation from the wealth of experimental data from projects like the Roadmap Epigenomics Consortium and ENCODE. This representation allows you to impute genome-wide epigenomics experiments that have not yet been performed.
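To make the idea of factorizing a cell type × assay × locus tensor concrete, here is a minimal sketch in NumPy. It is not Avocado's implementation: it uses a plain trilinear factorization fitted by gradient descent and leaves out the "deep" part entirely. The function name `fit_factors`, the rank, the learning rate, and the synthetic data are all assumptions made for illustration. One whole experiment (a cell type and assay pair) is hidden during fitting and then imputed from the learned factors.

```python
import numpy as np

def fit_factors(tensor, observed, rank=4, lr=0.1, steps=5000, seed=0):
    """Fit tensor[c, a, l] ~ sum_k cells[c, k] * assays[a, k] * loci[l, k],
    using only the entries where `observed` is True (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n_cells, n_assays, n_loci = tensor.shape
    cells = rng.normal(scale=0.5, size=(n_cells, rank))
    assays = rng.normal(scale=0.5, size=(n_assays, rank))
    loci = rng.normal(scale=0.5, size=(n_loci, rank))
    n_obs = observed.sum()
    losses = []
    for _ in range(steps):
        pred = np.einsum("ck,ak,lk->cal", cells, assays, loci)
        # Unobserved entries contribute no error and therefore no gradient.
        err = np.where(observed, pred - tensor, 0.0)
        losses.append(0.5 * np.sum(err ** 2) / n_obs)
        g_cells = np.einsum("cal,ak,lk->ck", err, assays, loci) / n_obs
        g_assays = np.einsum("cal,ck,lk->ak", err, cells, loci) / n_obs
        g_loci = np.einsum("cal,ck,ak->lk", err, cells, assays) / n_obs
        cells -= lr * g_cells
        assays -= lr * g_assays
        loci -= lr * g_loci
    return cells, assays, loci, losses

# Synthetic low-rank "epigenomic" tensor: 6 cell types, 5 assays, 40 loci.
rng = np.random.default_rng(1)
true_cells = rng.normal(size=(6, 2))
true_assays = rng.normal(size=(5, 2))
true_loci = rng.normal(size=(40, 2))
tensor = np.einsum("ck,ak,lk->cal", true_cells, true_assays, true_loci)

# Pretend assay 1 was never run in cell type 0.
observed = np.ones(tensor.shape, dtype=bool)
observed[0, 1, :] = False

cells, assays, loci, losses = fit_factors(tensor, observed)

# Impute the missing track from the learned factors.
imputed = np.einsum("k,k,lk->l", cells[0], assays[1], loci)
print("training loss:", losses[0], "->", losses[-1])
print("imputed track shape:", imputed.shape)
```

The imputation works only because the hidden experiment shares its cell type factor with other assays and its assay factor with other cell types; a cell type or assay with no observed data at all would have nothing to learn from.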

Jacob also talks about a pitfall he discovered when trying to predict gene expression from a mix of genomic and epigenomic data. As you increase the complexity of a machine learning model, its performance may improve for the wrong reason: instead of learning something biologically interesting, your model may simply be using the nucleotide sequence to memorize each gene's average expression across your training cell types.
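One way to see why this matters is to score a trivial baseline that predicts, for every gene, its mean expression across the training cell types. The sketch below uses synthetic data in which expression is mostly a per-gene effect; the variable names and the noise levels are assumptions chosen for illustration, not figures from the episode. If a complex model does not clearly beat this baseline on a held-out cell type, its score may reflect memorized gene averages rather than learned biology.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_train_cells = 500, 8

# Assumed toy model: expression = strong per-gene effect + weak cell-specific noise.
gene_effect = rng.normal(scale=2.0, size=n_genes)
train_expr = gene_effect[:, None] + rng.normal(scale=0.5, size=(n_genes, n_train_cells))
heldout_expr = gene_effect + rng.normal(scale=0.5, size=n_genes)

# Baseline: each gene's average expression across the training cell types.
baseline_pred = train_expr.mean(axis=1)

# Pearson correlation between the baseline and the held-out cell type.
baseline_corr = np.corrcoef(baseline_pred, heldout_expr)[0, 1]
print("baseline correlation on held-out cell type:", baseline_corr)
```

Reporting a model's score next to this baseline, or evaluating on the deviation of each gene from its training average, are two hedged ways to check what the model has actually learned.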

If you enjoyed this episode, please consider supporting the podcast on Patreon.

70 episodes
