Towards analysis of SMiLE-seq raw data with the ultimate goal of identifying binding sites of poorly characterized transcription factors

SMiLE-seq is a new, effective experimental method for inferring transcription factor (TF) binding site sequences. Still, some TFs remain challenging to analyze. We hope to improve the method by applying modern statistical and deep learning approaches to both the experimental design and the subsequent data analysis.

Deliverables:

a tool for inferring binding motifs that cover the sequence space representatively
a GUI for running the analysis and for improving it
“denoisifier” – a tool to use prior to the HMM-based analysis

Milestones:

become familiar with the SMiLE-seq method and with the current workflow in detail (mainly the HMM-based analysis workflow)
prototype a tool for motif inference
test the tool for designing sequences
identify noise sources (or rather, estimate the number of noise sources) using statistical methods, or possibly an autoencoder architecture?
prototype “real” TF binding site extraction

Nodes involved

ELIXIR Czech Republic

ELIXIR Switzerland

Platform/Community

People involved

Jiri Vondrasek, Kateřina Faltejsková