2  Foundational Skills

2.1 Introduction

This section covers the foundational skills of Business Science. It spans the entire Business Science process: data importing, cleaning, wrangling, exploratory data analysis (EDA), feature engineering, data splitting, model building and evaluation, and the reporting and communication of results.

2.1.1 Business Science Workflow in R

  • IMPORT: readr, readxl, tidyquant, rvest

  • TIDY: tidyr, tidytext, tibble

  • VISUALIZE: ggplot2, plotly

  • TRANSFORM: lubridate, forcats, dplyr, stringr

  • MODEL: tidymodels

  • COMMUNICATE: Rmarkdown, Shiny
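
To make the stages concrete, here is a minimal end-to-end sketch. It is only an illustration under several assumptions: the built-in Titanic table stands in for a file read with readr, base R's glm() stands in for a tidymodels specification, and the object names (raw, passengers, by_class, fit) are hypothetical.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# IMPORT: a built-in table stands in for readr::read_csv() on a real file
raw <- as.data.frame(Titanic)

# TIDY: expand the Freq count column so that each row is one passenger
passengers <- uncount(raw, Freq)

# VISUALIZE: share of survivors within each class
p <- ggplot(passengers, aes(x = Class, fill = Survived)) +
  geom_bar(position = "fill")

# TRANSFORM: survival rate by class
by_class <- passengers |>
  group_by(Class) |>
  summarise(survival_rate = mean(Survived == "Yes"), .groups = "drop")

# MODEL: logistic regression (base R here; tidymodels would wrap this step)
fit <- glm(Survived ~ Class + Sex + Age, data = passengers, family = binomial)

# COMMUNICATE: objects such as p, by_class and fit would then be
# rendered in an R Markdown report or a Shiny app
```

Printing p draws the chart, and summary(fit) shows the fitted coefficients.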


2.2 Data Cleaning

Data cleaning involves:

  • removing duplicates,

  • checking for missing data and imputing values, if necessary,

  • verifying that data types match the data dictionary,

  • dropping irrelevant columns.
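
The four steps above can be sketched with dplyr. This is a minimal illustration, not a prescribed recipe: the toy data frame raw, its column names, the choice of median imputation, and the idea that age should be an integer according to the data dictionary are all assumptions made for the example.

```r
library(dplyr)

# Hypothetical toy data: one duplicated row, a missing income,
# a numeric column stored as text, and an irrelevant column
raw <- tibble(
  id     = c(1, 2, 2, 3),
  age    = c("34", "29", "29", NA),
  income = c(52000, NA, NA, 61000),
  notes  = c("a", "b", "b", "c")
)

# Check missing data before deciding how to handle it
colSums(is.na(raw))

clean <- raw |>
  distinct() |>                       # remove duplicate rows
  mutate(
    age = as.integer(age),            # align type with the (assumed) data dictionary
    income = if_else(is.na(income),   # median imputation (one option among many)
                     median(income, na.rm = TRUE),
                     income)
  ) |>
  select(-notes)                      # drop an irrelevant column
```

Whether to impute at all, and with which statistic, depends on the data and on why values are missing; the median is used here only because it is simple.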

2.2.1 Libraries

The skimr package scans a data set and returns a compact overview: its overall structure plus the most important descriptive statistics for each variable.
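
A call along the following lines produces this kind of summary. The data set is an assumption: the column names, row count, and factor levels in the output below are consistent with R's built-in Titanic table converted to a data frame, but the original code is not shown.

```r
library(skimr)

# Assumption: the built-in Titanic contingency table, flattened to a data frame
# with columns Class, Sex, Age, Survived and Freq
data <- as.data.frame(Titanic)

# One block of summary statistics per variable type (factor, numeric)
skim(data)
```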

Data summary

Name                      data
Number of rows            32
Number of columns         5
_______________________
Column type frequency:
  factor                  4
  numeric                 1
________________________
Group variables           None

Variable type: factor

skim_variable   n_missing   complete_rate   ordered   n_unique   top_counts
Class           0           1               FALSE     4          1st: 8, 2nd: 8, 3rd: 8, Cre: 8
Sex             0           1               FALSE     2          Mal: 16, Fem: 16
Age             0           1               FALSE     2          Chi: 16, Adu: 16
Survived        0           1               FALSE     2          No: 16, Yes: 16

Variable type: numeric

skim_variable   n_missing   complete_rate   mean    sd    p0   p25    p50    p75   p100
Freq            0           1               68.78   136   0    0.75   13.5   77    670

2.3 Data Wrangling

2.4 Visualization

2.5 Exploratory Data Analysis

2.6 Machine Learning

  • Clustering

  • Reporting

  • Programming