with Applications in R

Author: Gareth James,Daniela Witten,Trevor Hastie,Robert Tibshirani

Publisher: Springer Science & Business Media

ISBN: 1461471389

Category: Mathematics

Page: 426

View: 9183

An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.
Read More

with Applications in R

Author: Gareth James,Daniela Witten,Trevor Hastie,Robert Tibshirani

Publisher: Springer

ISBN: 9781461471370

Category: Mathematics

Page: 426

View: 5212

An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.
Read More

Author: Richard A. Berk

Publisher: Springer

ISBN: 3319440489

Category: Mathematics

Page: 347

View: 8193

This textbook considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. This fully revised new edition includes important developments over the past 8 years. Consistent with modern data analytics, it emphasizes that a proper statistical learning data analysis derives from sound data collection, intelligent data management, appropriate statistical procedures, and an accessible interpretation of results. As in the first edition, a unifying theme is supervised learning that can be treated as a form of regression analysis. Key concepts and procedures are illustrated with real applications, especially those with practical implications. The material is written for upper undergraduate level and graduate students in the social and life sciences and for researchers who want to apply statistical learning procedures to scientific and policy problems. The author uses this book in a course on modern regression for the social, behavioral, and biological sciences. All of the analyses included are done in R with code routinely provided.
Read More

A Concise Course in Statistical Inference

Author: Larry Wasserman

Publisher: Springer Science & Business Media

ISBN: 0387217363

Category: Mathematics

Page: 442

View: 1535

Taken literally, the title "All of Statistics" is an exaggeration. But in spirit, the title is apt, as the book does cover a much broader range of topics than a typical introductory book on mathematical statistics. This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like non-parametric curve estimation, bootstrapping, and classification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. Statistics, data mining, and machine learning are all concerned with collecting and analysing data.
Read More

Data Mining, Inference, and Prediction

Author: Trevor Hastie,Robert Tibshirani,Jerome Friedman

Publisher: Springer Science & Business Media

ISBN: 0387216065

Category: Mathematics

Page: 536

View: 4581

During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book’s coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting---the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression & path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for “wide” data (p bigger than n), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.
Read More

Understanding Why and How

Author: F.M. Dekking,C. Kraaikamp,H.P. Lopuhaä,L.E. Meester

Publisher: Springer Science & Business Media

ISBN: 1846281687

Category: Mathematics

Page: 488

View: 2904

Suitable for self study Use real examples and real data sets that will be familiar to the audience Introduction to the bootstrap is included – this is a modern method missing in many other books
Read More

Author: Simon Sheather

Publisher: Springer Science & Business Media

ISBN: 0387096078

Category: Mathematics

Page: 393

View: 9387

This book focuses on tools and techniques for building regression models using real-world data and assessing their validity. A key theme throughout the book is that it makes sense to base inferences or conclusions only on valid models. Plots are shown to be an important tool for both building regression models and assessing their validity. We shall see that deciding what to plot and how each plot should be interpreted will be a major challenge. In order to overcome this challenge we shall need to understand the mathematical properties of the fitted regression models and associated diagnostic procedures. As such this will be an area of focus throughout the book. In particular, we shall carefully study the properties of resi- als in order to understand when patterns in residual plots provide direct information about model misspecification and when they do not. The regression output and plots that appear throughout the book have been gen- ated using R. The output from R that appears in this book has been edited in minor ways. On the book web site you will find the R code used in each example in the text.
Read More

An Introduction

Author: David Ruppert

Publisher: Springer

ISBN: 1441968768

Category: Business & Economics

Page: 474

View: 3551

This book emphasizes the applications of statistics and probability to finance. The basics of these subjects are reviewed and more advanced topics in statistics, such as regression, ARMA and GARCH models, the bootstrap, and nonparametric regression using splines, are introduced as needed. The book covers the classical methods of finance and it introduces the newer area of behavioral finance. Applications and use of MATLAB and SAS software are stressed. The book will serve as a text in courses aimed at advanced undergraduates and masters students. Those in the finance industry can use it for self-study.
Read More

Author: Brian S. Everitt

Publisher: Springer Science & Business Media

ISBN: 1846281245

Category: Mathematics

Page: 221

View: 1857

Applied statisticians often need to perform analyses of multivariate data; for these they will typically use one of the statistical software packages, S-Plus or R. This book sets out how to use these packages for these analyses in a concise and easy-to-use way, and will save users having to buy two books for the job. The author is well-known for this kind of book, and so buyers will trust that he’s got it right.
Read More

Author: Douglas A. Wolfe,Grant Schneider

Publisher: Springer

ISBN: 3319560727

Category: Mathematics

Page: 976

View: 6200

This textbook is designed to give an engaging introduction to statistics and the art of data analysis. The unique scope includes, but also goes beyond, classical methodology associated with the normal distribution. What if the normal model is not valid for a particular data set? This cutting-edge approach provides the alternatives. It is an introduction to the world and possibilities of statistics that uses exercises, computer analyses, and simulations throughout the core lessons. These elementary statistical methods are intuitive. Counting and ranking features prominently in the text. Nonparametric methods, for instance, are often based on counts and ranks and are very easy to integrate into an introductory course.​ The ease of computation with advanced calculators and statistical software, both of which factor into this text, allows important techniques to be introduced earlier in the study of statistics. This book's novel scope also includes measuring symmetry with Walsh averages, finding a nonparametric regression line, jackknifing, and bootstrapping​. Concepts and techniques are explored through practical problems. Quantitative reasoning is at the core of so many professions and academic disciplines, and this book opens the door to the most modern possibilities.
Read More

Author: Michael W. Trosset

Publisher: CRC Press

ISBN: 9781584889489

Category: Mathematics

Page: 496

View: 7408

Emphasizing concepts rather than recipes, An Introduction to Statistical Inference and Its Applications with R provides a clear exposition of the methods of statistical inference for students who are comfortable with mathematical notation. Numerous examples, case studies, and exercises are included. R is used to simplify computation, create figures, and draw pseudorandom samples—not to perform entire analyses. After discussing the importance of chance in experimentation, the text develops basic tools of probability. The plug-in principle then provides a transition from populations to samples, motivating a variety of summary statistics and diagnostic techniques. The heart of the text is a careful exposition of point estimation, hypothesis testing, and confidence intervals. The author then explains procedures for 1- and 2-sample location problems, analysis of variance, goodness-of-fit, and correlation and regression. He concludes by discussing the role of simulation in modern statistical inference. Focusing on the assumptions that underlie popular statistical methods, this textbook explains how and why these methods are used to analyze experimental data.
Read More

An Intermediate Course with Examples in R

Author: Richard M. Heiberger,Burt Holland

Publisher: Springer

ISBN: 1493921223

Category: Mathematics

Page: 898

View: 7765

This contemporary presentation of statistical methods features extensive use of graphical displays for exploring data and for displaying the analysis. The authors demonstrate how to analyze data—showing code, graphics, and accompanying tabular listings—for all the methods they cover. They emphasize how to construct and interpret graphs. They discuss principles of graphical design. They identify situations where visual impressions from graphs may need confirmation from traditional tabular results. All chapters have exercises. The authors provide and discuss R functions for all the new graphical display formats. All graphs and tabular output in the book were constructed using these functions. Complete R scripts for all examples and figures are provided for readers to use as models for their own analyses. This book can serve as a standalone text for statistics majors at the master’s level and for other quantitatively oriented disciplines at the doctoral level, and as a reference book for researchers. In-depth discussions of regression analysis, analysis of variance, and design of experiments are followed by introductions to analysis of discrete bivariate data, nonparametrics, logistic regression, and ARIMA time series modeling. The authors illustrate classical concepts and techniques with a variety of case studies using both newer graphical tools and traditional tabular displays. The Second Edition features graphs that are completely redrawn using the more powerful graphics infrastructure provided by R's lattice package. There are new sections in several of the chapters, revised sections in all chapters and several completely new appendices. New graphical material includes: • an expanded chapter on graphics • a section on graphing Likert Scale Data to build on the importance of rating scales in fields from population studies to psychometrics • a discussion on design of graphics that will work for readers with color-deficient vision • an expanded discussion on the design of multi-panel graphics • expanded and new sections in the discrete bivariate statistics capter on the use of mosaic plots for contingency tables including the n×2×2 tables for which the Mantel–Haenszel–Cochran test is appropriate • an interactive (using the shiny package) presentation of the graphics for the normal and t-tables that is introduced early and used in many chapters The new appendices include discussions of R, the HH package designed for R (the material in the HH package was distributed as a set of standalone functions with the First Edition of this book), the R Commander package, the RExcel system, the shiny package, and a minimal discussion on writing R packages. There is a new appendix on computational precision illustrating and explaining the FAQ (Frequently Asked Questions) about the differences between the familiar real number system and the less-familiar floating point system used in computers. The probability distributions appendix has been expanded to include more distributions (all the distributions in base R) and to include graphs of each. The editing appendix from the First Edition has been split into four expanded appendices—on working style, writing style, use of a powerful editor, and use of LaTeX for document preparation.
Read More

An Introduction to Statistical Learning Methods with R

Author: Daniel D. Gutierrez

Publisher: Technics Publications

ISBN: 1634620984

Category: Computers

Page: 282

View: 7141

A practitioner’s tools have a direct impact on the success of his or her work. This book will provide the data scientist with the tools and techniques required to excel with statistical learning methods in the areas of data access, data munging, exploratory data analysis, supervised machine learning, unsupervised machine learning and model evaluation. Machine learning and data science are large disciplines, requiring years of study in order to gain proficiency. This book can be viewed as a set of essential tools we need for a long-term career in the data science field – recommendations are provided for further study in order to build advanced skills in tackling important data problem domains. The R statistical environment was chosen for use in this book. R is a growing phenomenon worldwide, with many data scientists using it exclusively for their project work. All of the code examples for the book are written in R. In addition, many popular R packages and data sets will be used.
Read More

Author: Peter Dalgaard

Publisher: Springer Science & Business Media

ISBN: 0387790543

Category: Mathematics

Page: 364

View: 5256

This book provides an elementary-level introduction to R, targeting both non-statistician scientists in various fields and students of statistics. The main mode of presentation is via code examples with liberal commenting of the code and the output, from the computational as well as the statistical viewpoint. Brief sections introduce the statistical methods before they are used. A supplementary R package can be downloaded and contains the data sets. All examples are directly runnable and all graphics in the text are generated from the examples. The statistical methodology covered includes statistical standard distributions, one- and two-sample tests with continuous data, regression analysis, one-and two-way analysis of variance, regression analysis, analysis of tabular data, and sample size calculations. In addition, the last four chapters contain introductions to multiple linear regression analysis, linear models in general, logistic regression, and survival analysis.
Read More

Author: Peter D. Hoff

Publisher: Springer Science & Business Media

ISBN: 9780387924076

Category: Mathematics

Page: 272

View: 9957

A self-contained introduction to probability, exchangeability and Bayes’ rule provides a theoretical understanding of the applied material. Numerous examples with R-code that can be run "as-is" allow the reader to perform the data analyses themselves. The development of Monte Carlo and Markov chain Monte Carlo methods in the context of data analysis examples provides motivation for these computational methods.
Read More

Author: Jim Pitman

Publisher: Springer Science & Business Media

ISBN: 1461243742

Category: Mathematics

Page: 560

View: 5239

This is a text for a one-quarter or one-semester course in probability, aimed at students who have done a year of calculus. The book is organised so a student can learn the fundamental ideas of probability from the first three chapters without reliance on calculus. Later chapters develop these ideas further using calculus tools. The book contains more than the usual number of examples worked out in detail. The most valuable thing for students to learn from a course like this is how to pick up a probability problem in a new setting and relate it to the standard body of theory. The more they see this happen in class, and the more they do it themselves in exercises, the better. The style of the text is deliberately informal. My experience is that students learn more from intuitive explanations, diagrams, and examples than they do from theorems and proofs. So the emphasis is on problem solving rather than theory.
Read More

With Exercises, Solutions and Applications in R

Author: Christian Heumann,Michael Schomaker,Shalabh

Publisher: Springer

ISBN: 3319461621

Category: Mathematics

Page: 456

View: 4408

This introductory statistics textbook conveys the essential concepts and tools needed to develop and nurture statistical thinking. It presents descriptive, inductive and explorative statistical methods and guides the reader through the process of quantitative data analysis. In the experimental sciences and interdisciplinary research, data analysis has become an integral part of any scientific study. Issues such as judging the credibility of data, analyzing the data, evaluating the reliability of the obtained results and finally drawing the correct and appropriate conclusions from the results are vital. The text is primarily intended for undergraduate students in disciplines like business administration, the social sciences, medicine, politics, macroeconomics, etc. It features a wealth of examples, exercises and solutions with computer code in the statistical programming language R as well as supplementary material that will enable the reader to quickly adapt all methods to their own applications.
Read More

With Applications in the Life Sciences

Author: Thomas Haslwanter

Publisher: Springer

ISBN: 3319283162

Category: Computers

Page: 278

View: 4898

This textbook provides an introduction to the free software Python and its use for statistical data analysis. It covers common statistical tests for continuous, discrete and categorical data, as well as linear regression analysis and topics from survival analysis and Bayesian statistics. Working code and data for Python solutions for each test, together with easy-to-follow Python examples, can be reproduced by the reader and reinforce their immediate understanding of the topic. With recent advances in the Python ecosystem, Python has become a popular language for scientific computing, offering a powerful environment for statistical data analysis and an interesting alternative to R. The book is intended for master and PhD students, mainly from the life and medical sciences, with a basic knowledge of statistics. As it also provides some statistics background, the book can be used by anyone who wants to perform a statistical data analysis.
Read More

Author: Sanjeev Kulkarni,Gilbert Harman

Publisher: John Wiley & Sons

ISBN: 9781118023464

Category: Mathematics

Page: 288

View: 2306

A thought-provoking look at statistical learning theory and its role in understanding human learning and inductive reasoning A joint endeavor from leading researchers in the fields of philosophy and electrical engineering, An Elementary Introduction to Statistical Learning Theory is a comprehensive and accessible primer on the rapidly evolving fields of statistical pattern recognition and statistical learning theory. Explaining these areas at a level and in a way that is not often found in other books on the topic, the authors present the basic theory behind contemporary machine learning and uniquely utilize its foundations as a framework for philosophical thinking about inductive inference. Promoting the fundamental goal of statistical learning, knowing what is achievable and what is not, this book demonstrates the value of a systematic methodology when used along with the needed techniques for evaluating the performance of a learning system. First, an introduction to machine learning is presented that includes brief discussions of applications such as image recognition, speech recognition, medical diagnostics, and statistical arbitrage. To enhance accessibility, two chapters on relevant aspects of probability theory are provided. Subsequent chapters feature coverage of topics such as the pattern recognition problem, optimal Bayes decision rule, the nearest neighbor rule, kernel rules, neural networks, support vector machines, and boosting. Appendices throughout the book explore the relationship between the discussed material and related topics from mathematics, philosophy, psychology, and statistics, drawing insightful connections between problems in these areas and statistical learning theory. All chapters conclude with a summary section, a set of practice questions, and a reference sections that supplies historical notes and additional resources for further study. An Elementary Introduction to Statistical Learning Theory is an excellent book for courses on statistical learning theory, pattern recognition, and machine learning at the upper-undergraduate and graduate levels. It also serves as an introductory reference for researchers and practitioners in the fields of engineering, computer science, philosophy, and cognitive science that would like to further their knowledge of the topic.
Read More