Selected recent publications:
Predictive Insights, From Linear Regression to Machine Learning (monograph/professional book), Chapman and Hall, to appear 2016
A Closer Look at What We Should Mean by “Big” in “Big Data,” Handbook of Big Data (invited book chapter), Peter Bühlmann and Michael Kane (eds.), Chapman and Hall, to appear 2015
Improved Estimation of Class Probabilities through Unlabeled Data, arXiv:1510.01422, 2015
Parallel Computation for Data Science (professional book), Chapman and Hall, 2015
A Different Approach to the Problem of Missing Data, with Xiao Gu, JSM 2015
A New Approach to the Parallel Coordinates Method for Large Data Sets, with Yingkang Xie, JSM 2014
Long Live (Big Data-fied) Statistics!, invited paper, JSM 2013, 98-108
Efficient Parallel R Loops on Long-Latency Platforms, invited paper, Proceedings of Interface 2012 – Future of Statistical Computing: Internet Scale Data, Flexible Modeling, and Visualization, Scott, Wickham and Morris (eds.), 2012
The Art of R Programming (professional book), No Starch Press, 2011
A New Method for Rule Finding Via Bootstrapped Confidence Intervals. SIAM Conference on Data Mining, 2008, 547-552
Using Soft-line Recursive Response to Improve Query Aggregation in Wireless Sensor Networks (with X. Lu, M. Spear, K. Levitt, F. Wu). 2008 IEEE International Conference on Communications, 2309-2316
Availability-Aware Provisioning Strategies for Differentiated Protection Services in Wavelength-Convertible WDM Mesh Networks (with B. Mukherjee, H. Zang, J. Zhang and K. Zhu). IEEE/ACM Transactions on Networking, 2007, 15, 5, 1177-1190.
Estimation of Internet File-Access/Modification Rates, ACM Transactions on Modeling and Computer Simulation, 2005, 15, 3, 233-253.