**Author: Murtaza Haider**

Number of pages: 608

ISBN: 9780133991024

Publisher: WILEY

Year of publication: 2015

Proposed for translation

*Harvard Business Review* recently called data science “The Sexiest Job of the 21st Century.” It’s not just sexy: for millions of managers, analysts, and students who need to solve real business problems, it’s indispensable. Unfortunately, there has been nothing easy about learning data science, until now.

**Getting Started with Data Science** takes its inspiration from worldwide best-sellers like *Freakonomics* and Malcolm Gladwell’s *Outliers*: it teaches through a powerful narrative packed with unforgettable stories.

Murtaza Haider offers informative, jargon-free coverage of basic theory and technique, backed by plenty of vivid examples and hands-on practice opportunities. The material is software- and platform-agnostic, so you can learn data science whether you work with R, Stata, SPSS, or SAS. Best of all, Haider teaches a crucial skill set most data science books ignore: how to tell powerful stories using graphics and tables. Every chapter is built around real research challenges, so you’ll always know why you’re doing what you’re doing.

You’ll master data science by answering fascinating questions, such as:

• Are religious individuals more or less likely to have extramarital affairs?

• Do attractive professors get better teaching evaluations?

• Does the higher price of cigarettes deter smoking?

• What determines housing prices more: lot size or the number of bedrooms?

• How do teenagers and older people differ in the way they use social media?

• Who is more likely to use online dating services?

• Why do some people purchase iPhones and others BlackBerry devices?

• Does the presence of children influence a family’s spending on alcohol?

For each problem, you’ll walk through defining your question and the answers you’ll need; exploring how others have approached similar challenges; selecting your data and methods; generating your statistics; organizing your report; and telling your story. Throughout, the focus is squarely on what matters most: transforming data into insights that are clear, accurate, and actionable.

Preface

**Chapter 1 The Bazaar of Storytellers**

Data Science: The Sexiest Job in the 21st Century

Storytelling at Google and Walmart

Getting Started with Data Science

Do We Need Another Book on Analytics?

Repeat, Repeat, Repeat, and Simplify

Chapters’ Structure and Features

Analytics Software Used

What Makes Someone a Data Scientist?

Existential Angst of a Data Scientist

Data Scientists: Rarer Than Unicorns

Beyond the Big Data Hype

Big Data: Beyond Cheerleading

Big Data Hubris

Leading by Miles

Predicting Pregnancies, Missing Abortions

What’s Beyond This Book?

Summary

Endnotes

**Chapter 2 Data in the 24/7 Connected World**

The Liberated Data: The Open Data

The Caged Data

Big Data Is Big News

It’s Not the Size of Big Data; It’s What You Do with It

Free Data as in Free Lunch

FRED

Quandl

U.S. Census Bureau and Other National Statistical Agencies

Search-Based Internet Data

Google Trends

Google Correlate

Survey Data

PEW Surveys

ICPSR

Summary

Endnotes

**Chapter 3 The Deliverable**

The Final Deliverable

What Is the Research Question?

What Answers Are Needed?

How Have Others Researched the Same Question in the Past?

What Information Do You Need to Answer the Question?

What Analytical Techniques/Methods Do You Need?

The Narrative

The Report Structure

Have You Done Your Job as a Writer?

Building Narratives with Data

“Big Data, Big Analytics, Big Opportunity”

Urban Transport and Housing Challenges

Human Development in South Asia

The Big Move

Summary

Endnotes

**Chapter 4 Serving Tables**

2014: The Year of Soccer and Brazil

Using Percentages Is Better Than Using Raw Numbers

Data Cleaning

Weighted Data

Cross Tabulations

Going Beyond the Basics in Tables

Seeing Whether Beauty Pays

Data Set

What Determines Teaching Evaluations?

Does Beauty Affect Teaching Evaluations?

Putting It All on (in) a Table

Generating Output with Stata

Summary Statistics Using Built-In Stata

Using Descriptive Statistics

Weighted Statistics

Correlation Matrix

Reproducing the Results for the Hamermesh and Parker Paper

Statistical Analysis Using Custom Tables

Summary

Endnotes

**Chapter 5 Graphic Details**

Telling Stories with Figures

Data Types

Teaching Ratings

The Congested Lives in Big Cities

Summary

Endnotes

**Chapter 6 Hypothetically Speaking**

Random Numbers and Probability Distributions

Casino Royale: Roll the Dice

Normal Distribution

The Student Who Taught Everyone Else

Statistical Distributions in Action

Z-Transformation

Probability of Getting a High or Low Course Evaluation

Probabilities with Standard Normal Table

Hypothetically Yours

Consistently Better or Happenstance

Mean and Not So Mean Differences

Handling Rejections

The Mean and Kind Differences

Comparing a Sample Mean When the Population SD Is Known

Left Tail Between the Legs

Comparing Means with Unknown Population SD

Comparing Two Means with Unequal Variances

Comparing Two Means with Equal Variances

Worked-Out Examples of Hypothesis Testing

Best Buy–Apple Store Comparison

Assuming Equal Variances

Exercises for Comparison of Means

Regression for Hypothesis Testing

Analysis of Variance

Significantly Correlated

Summary

Endnotes

**Chapter 7 Why Tall Parents Don’t Have Even Taller Children**

The Department of Obvious Conclusions

Why Regress?

Introducing Regression Models

All Else Being Equal

Holding Other Factors Constant

Spuriously Correlated

A Step-By-Step Approach to Regression

Learning to Speak Regression

The Math Behind Regression

Ordinary Least Squares Method

Regression in Action

This Just In: Bigger Homes Sell for More

Does Beauty Pay? Ask the Students

Survey Data, Weights, and Independence of Observations

What Determines Household Spending on Alcohol and Food

What Influences Household Spending on Food?

Advanced Topics

Homoskedasticity

Multicollinearity

Summary

Endnotes

**Chapter 8 To Be or Not to Be**

To Smoke or Not to Smoke: That Is the Question

Binary Outcomes

Binary Dependent Variables

Let’s Question the Decision to Smoke or Not

Smoking Data Set

Exploratory Data Analysis

What Makes People Smoke: Asking Regression for Answers

Ordinary Least Squares Regression

Interpreting Models at the Margins

The Logit Model

Interpreting Odds in a Logit Model

Probit Model

Interpreting the Probit Model

Using Zelig for Estimation and Post-Estimation Strategies

Estimating Logit Models for Grouped Data

Using SPSS to Explore the Smoking Data Set

Regression Analysis in SPSS

Estimating Logit and Probit Models in SPSS

Summary

Endnotes

**Chapter 9 Categorically Speaking About Categorical Data**

What Is Categorical Data?

Analyzing Categorical Data

Econometric Models of Binomial Data

Estimation of Binary Logit Models

Odds Ratio

Log of Odds Ratio

Interpreting Binary Logit Models

Statistical Inference of Binary Logit Models

How I Met Your Mother? Analyzing Survey Data

A Blind Date with the Pew Online Dating Data Set

Demographics of Affection

High-Techies

Romancing the Internet

Dating Models

Multinomial Logit Models

Interpreting Multinomial Logit Models

Choosing an Online Dating Service

Pew Phone Type Model

Why Some Women Work Full-Time and Others Don’t

Conditional Logit Models

Random Utility Model

Independence From Irrelevant Alternatives

Interpretation of Conditional Logit Models

Estimating Logit Models in SPSS

Summary

Endnotes

**Chapter 10 Spatial Data Analytics**

Fundamentals of GIS

GIS Platforms

Freeware GIS

GIS Data Structure

GIS Applications in Business Research

Retail Research

Hospitality and Tourism Research

Lifestyle Data: Consumer Health Profiling

Competitor Location Analysis

Market Segmentation

Spatial Analysis of Urban Challenges

The Hard Truths About Public Transit in North America

Toronto Is a City Divided into the Haves, Will Haves, and Have Nots

Income Disparities in Urban Canada

Where Is Toronto’s Missing Middle Class? It Has Suburbanized Out of Toronto

Adding Spatial Analytics to Data Science

Race and Space in Chicago

Developing Research Questions

Race, Space, and Poverty

Race, Space, and Commuting

Regression with Spatial Lags

Summary

Endnotes

**Chapter 11 Doing Serious Time with Time Series**

Introducing Time Series Data and How to Visualize It

How Is Time Series Data Different?

Starting with Basic Regression Models

What Is Wrong with Using OLS Models for Time Series Data?

Newey–West Standard Errors

Regressing Prices with Robust Standard Errors

Time Series Econometrics

Stationary Time Series

Autocorrelation Function (ACF)

Partial Autocorrelation Function (PCF)

White Noise Tests

Augmented Dickey Fuller Test

Econometric Models for Time Series Data

Correlation Diagnostics

Invertible Time Series and Lag Operators

The ARMA Model

ARIMA Models

Distributed Lag and VAR Models

Applying Time Series Tools to Housing Construction

Macro-Economic and Socio-Demographic Variables Influencing Housing Starts

Estimating Time Series Models to Forecast New Housing Construction

OLS Models

Distributed Lag Model

Out-of-Sample Forecasting with Vector Autoregressive Models

ARIMA Models

Summary

Endnotes

**Chapter 12 Data Mining for Gold**

Can Cheating on Your Spouse Kill You?

Are Cheating Men Alpha Males?

UnFair Comments: New Evidence Critiques Fair’s Research

Data Mining: An Introduction

Seven Steps Down the Data Mine

Establishing Data Mining Goals

Selecting Data

Preprocessing Data

Transforming Data

Storing Data

Mining Data

Evaluating Mining Results

Rattle Your Data

What Does Religiosity Have to Do with Extramarital Affairs?

The Principal Components of an Extramarital Affair

Will It Rain Tomorrow? Using PCA For Weather Forecasting

Do Men Have More Affairs Than Females?

Two Kinds of People: Those Who Have Affairs, and Those Who Don’t

Models to Mine Data with Rattle

Summary

Endnotes

**Index**

