Many methods exist for predicting the pathogenic impact of single-nucleotide variants (SNVs)1,2,3,4,5,6, indels7 and other genomic alterations, including epigenetic features. Predictions for variants in coding regions of the genome tend to be more accurate than those for variants in non-coding regions, partly because more useable sources of data are available for coding regions. Such data can include measures of a variant's possible effect on protein structure or function, or of its functional impact more generally. For example, a variant may be classifiable as non-synonymous (where an amino acid is substituted) or synonymous (where the amino acid is not modified), or it may create a stop codon. These types of variant are expected to differ substantially in their effects, which makes the accurate prediction of pathogenic impact more tractable. Non-coding regions of the genome, by contrast, contain a multiplicity of functional elements, such as enhancers, promoters, untranslated regions, splice sites and non-coding genes expressing microRNA, in addition to other sites that can become functional, such as pseudogenes. Predictors for non-coding regions are therefore frequently optimized for these elements, particularly regulatory elements that mediate the transcription of non-coding RNAs (ncRNAs). Now writing in Nature Biomedical Engineering, Chikashi Terao and colleagues8 report a machine-learning model for cell-type-specific prediction of the effects of genomic mutations on the expression of ncRNAs.
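
The distinction drawn above between synonymous, non-synonymous and stop-creating coding variants can be made concrete with a short sketch. It is an illustration only, not the method of any predictor cited here. It assumes the standard genetic code, DNA codons written on the coding strand, and a single in-frame codon as input. The function name `classify_codon_change` and the extra `stop-loss` label are hypothetical additions for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon substitution by its effect on the encoded residue.

    Hypothetical helper for illustration: both codons are assumed to be
    in-frame DNA triplets from the coding strand.
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"      # encoded amino acid is unchanged
    if alt_aa == "*":
        return "stop-gain"       # variant creates a stop codon
    if ref_aa == "*":
        return "stop-loss"       # label not used in the text; added here
    return "non-synonymous"      # one amino acid replaced by another


if __name__ == "__main__":
    for ref, alt in [("GAA", "GAG"), ("GAA", "GTA"), ("TAC", "TAA")]:
        print(ref, "->", alt, classify_codon_change(ref, alt))
```

Even this crude three-way labelling gives a coding-region predictor a strong prior on a variant's likely effect. Most non-coding variants have no comparably simple label, which is part of why they are harder to score.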