Data science is an ever-evolving field that is growing rapidly in popularity. It draws on techniques and theories from statistics and computer science and, most importantly, from machine learning, databases, and data visualization, among others.

This book takes you through an entire journey of statistics, from knowing very little to becoming comfortable using various statistical methods for data science tasks. It starts off with simple statistics and then moves on to the statistical methods used in data science algorithms. The R programs for statistical computation are clearly explained, along with the logic behind them. You will come across various mathematical concepts, such as variance, standard deviation, probability, matrix calculations, and more. You will learn only what is required to implement statistics in data science tasks such as data cleaning, mining, and analysis. You will learn the statistical techniques required to perform tasks such as linear regression, regularization, model assessment, boosting, SVMs, and working with neural networks.

By the end of the book, you will be comfortable performing various statistical computations for data science programmatically.

What You Will Learn

  • Analyze the transition from a data developer to a data scientist mindset
  • Get acquainted with the R programs and the logic used for statistical computations
  • Understand mathematical concepts such as variance, standard deviation, probability, matrix calculations, and more
  • Learn to implement statistics in data science tasks such as data cleaning, mining, and analysis
  • Learn the statistical techniques required to perform tasks such as linear regression, regularization, model assessment, boosting, SVMs, and working with neural networks
  • Get comfortable with performing various statistical computations for data science programmatically

James D. Miller

An IBM-certified expert, creative innovator, and accomplished Director, Sr. Project Leader, and Application/System Architect with 35+ years of extensive application and system design and development experience across multiple platforms and technologies. Experience includes introducing customers to new and sometimes disruptive technologies and platforms; integrating with IBM Watson Analytics, Cognos BI, and TM1; Web architecture design; systems analysis; GUI design and testing; database modelling; and the design and development of OLAP, Client/Server, Web, and Mainframe applications and systems utilizing IBM Watson Analytics, IBM Cognos BI & TM1 (TM1 rules, TI, TM1Web and Planning Manager), Cognos Framework Manager, dynaSight - ArcPlan, ASP, DHTML, XML, IIS, MS Visual Basic and VBA, Visual Studio, PERL, SPLUNK, WebSuite, MS SQL Server, ORACLE, SYBASE Server, etc.

Table of Contents

1: TRANSITIONING FROM DATA DEVELOPER TO DATA SCIENTIST

2: DECLARING THE OBJECTIVES

3: A DEVELOPER'S APPROACH TO DATA CLEANING

4: DATA MINING AND THE DATABASE DEVELOPER

5: STATISTICAL ANALYSIS FOR THE DATABASE DEVELOPER

6: DATABASE PROGRESSION TO DATABASE REGRESSION

7: REGULARIZATION FOR DATABASE IMPROVEMENT

8: DATABASE DEVELOPMENT AND ASSESSMENT

9: DATABASES AND NEURAL NETWORKS

10: BOOSTING YOUR DATABASE

11: DATABASE CLASSIFICATION USING SUPPORT VECTOR MACHINES

12: DATABASE STRUCTURES AND MACHINE LEARNING

 
