SMiLE-seq is a new, effective experimental method for inferring transcription factor (TF) binding site sequences. Still, some TFs remain challenging to analyze. We hope to improve the method by applying modern statistical and deep learning approaches to both the experiment design and the subsequent data analysis.
Deliverables:
- a tool for inferring binding motifs that cover the sequence space representatively
- a GUI for running the analysis and for improving it
- “denoisifier” – a tool to use prior to the HMM-based analysis
Milestones:
- become familiar with the SMiLE-seq method and with the current workflow in detail (mainly the HMM-based analysis workflow)
- prototype a tool for motif inference
- test the tool for designing sequences
- identify noise sources (or rather, estimate the number of noise sources); candidate approaches: statistical methods, an autoencoder architecture?
- prototype “real” TF binding site extraction
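
As a rough illustration of what the motif inference prototype might compute at its simplest, the sketch below builds a position frequency matrix from pre-aligned, equal-length binding sites and scores each position by information content. This is not the SMiLE-seq workflow or the HMM-based analysis; the function names, the pseudocount, and the toy sequences are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def position_frequency_matrix(sites, pseudocount=1.0):
    """Per-position base frequencies for equal-length, pre-aligned sites."""
    if not sites:
        raise ValueError("need at least one site")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    # Denominator includes one pseudocount per base so no frequency is zero.
    total = len(sites) + pseudocount * len(ALPHABET)
    matrix = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        matrix.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return matrix


def information_content(column):
    """Information content of one column in bits (at most 2 for DNA)."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return math.log2(len(ALPHABET)) - entropy


def consensus(matrix):
    """Most frequent base at each position."""
    return "".join(max(col, key=col.get) for col in matrix)


# Hypothetical toy sites, not real SMiLE-seq data.
toy_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]
pfm = position_frequency_matrix(toy_sites)
ic_per_position = [information_content(col) for col in pfm]
```

Positions with high information content are strongly conserved across the sites, while positions near zero bits carry little sequence preference. A real prototype would also need to handle unaligned reads and background composition, which this sketch deliberately leaves out.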
Nodes involved
People involved