Category Archives: Data Science Done Right

Data Science Done Right (the Kitchen Style) #13

On possible criteria for partitioning PCA eigen space into the “error” and “linear modeling” subspaces

As we mentioned in previous chapters, by performing PCA, we implicitly find the least lossy linear models of the hidden variables that describe our data, …

Posted in Data Science Done Right

Data Science Done Right (the Kitchen Style) #12

Example of using PCA for a signal separation problem

Before going into the details of more sophisticated multidimensional scaling techniques and the criteria for selecting principal components/factors, let’s take a look at data that has the same measurement units in all their …

Data Science Done Right (the Kitchen Style) #11

Meanings of Covariance Metrics and Principal Component Analysis

The Covariance Family metrics (including the coefficient of correlation and r squared) may be interpreted in various ways, which indicates that they may be poorly understood and, hence, inappropriately used. For example, the usual …

Data Science Done Right (the Kitchen Style) #10

Adding Principal Component Analysis into our cooking mix…

If we stop our stepwise regression from the previous chapter at the last two dimensions (don’t look at the eigen parameter yet :)), and draw a scatter plot (Fig.10.1): load(url("http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/diabetes.sav")) dim <- c("glyhb", …

Data Science Done Right (the Kitchen Style) #9

Picking up loose ends…

When we previously talked about the transformation matrix N⁻¹ (the change-of-basis matrix from the original basis to a basis of the element of the quotient space onto which we project our dataset), we …

Data Science Done Right (the Kitchen Style) #8

Using linear regression for dimensionality reduction

In our linear model, designed and developed (in R) in the previous chapters, we calculate not only new images of the data, projected onto the chosen element of the quotient space that satisfy …

Data Science Done Right (the Kitchen Style) #7

In the previous post, we created a simple univariate, multivariable regression function, ks_lm0, that calculates the regression slopes and intercept. However, it would be interesting not only to find out what our quotient (or factor) space X/Y would be (which …
