Towards Algorithmic Analytics for Large-scale Datasets.
Bzdok D., Nichols TE., Smith SM.
The traditional goals of quantitative analytics cherish simple, transparent models to generate explainable insights. Large-scale data acquisition, enabled for instance by brain scanning and genomic profiling with microarray-type techniques, has prompted a wave of statistical inventions and innovative applications. Modern analysis approaches 1) tame large variable arrays capitalizing on regularization and dimensionality-reduction strategies, 2) are increasingly backed up by empirical model validations rather than justified by mathematical proofs, 3) will compare against and build on open data and consortium repositories, as well as 4) often embrace more elaborate, less interpretable models in order to maximize prediction accuracy. Here we review these trends in learning from "big data" and illustrate examples from imaging neuroscience.