Let’s implement univariate multivariable linear regression (see the derivations in the previous chapter) in R the way it is usually used: mapping our dataset A not just to an element of the quotient space Y*l*, but then mapping it back into the original space X. Our **ks_lm0** R function is presented in Appendix 1.

Let’s run it against a dataset we used in one of the Statistical Methods classes (statistical hypothesis testing for Diabetes Type II risk factors). For comparison, we also run the same data through the standard *lm* function to make sure that the results match:

*> load(url("http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/diabetes.sav"))*

*> ks_lm0( diabetes, c("chol"), c("glyhb","ratio","bp.1s","bp.1d") )*

*[,1]*

*[1,] 1.5868593*

*[2,] 11.2724797*

*[3,] 0.1593560*

*[4,] 0.3243256*

*[5,] 98.7703565*

*> cols <- c("chol","glyhb","ratio","bp.1s","bp.1d")*

*> clean2.diabetes <- diabetes[complete.cases(diabetes[,cols]),cols]*

*> View(clean2.diabetes)*

*> lmodel <- lm("chol ~ glyhb + ratio + bp.1s + bp.1d", clean2.diabetes)*

* > lmodel*

*Call:*

*lm(formula = "chol ~ glyhb + ratio + bp.1s + bp.1d", data = clean2.diabetes)*

*Coefficients:*

*(Intercept) glyhb ratio bp.1s bp.1d*

*98.7704 1.5869 11.2725 0.1594 0.3243*

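The results do match, of course (no rocket science there, just simple matrix arithmetic) 🙂 The only difference is the ordering: **ks_lm0** appends the column of ones as the last column of its design matrix, so the intercept (98.7704) appears last in its output rather than first.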
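The same agreement can also be checked programmatically rather than by eye. The following is a minimal sketch, assuming R's built-in `mtcars` data instead of the diabetes data (so that it runs without a download); the column choice and the names `beta_ne`, `beta_lm` and `fitted_ne` are illustrative and not part of the original code. It builds the design matrix with the column of ones last, as **ks_lm0** does, and compares the normal-equation solution with *lm*.

```r
# Sketch: compare the normal-equation solution with lm() on built-in data.
# mtcars and the chosen columns are assumptions made for illustration only.
yname  <- "mpg"
xnames <- c("wt", "hp", "qsec")

# Design matrix: predictors first, column of ones (intercept) last.
X <- cbind(as.matrix(mtcars[, xnames]), intercept = 1)
y <- mtcars[[yname]]

# Closed-form least-squares solution (X'X)^(-1) X'y.
beta_ne <- drop(solve(t(X) %*% X) %*% t(X) %*% y)

# Reference fit; reorder its coefficients so the intercept comes last.
fit     <- lm(mpg ~ wt + hp + qsec, data = mtcars)
beta_lm <- coef(fit)[c(xnames, "(Intercept)")]

# Compare values only (the names differ), up to floating-point tolerance.
print(all.equal(unname(beta_ne), unname(beta_lm)))

# Fitted values are the coefficients mapped back through X.
fitted_ne <- drop(X %*% beta_ne)
print(all.equal(unname(fitted_ne), unname(fitted(fit))))
```

Comparing with `all.equal` rather than `==` is deliberate: the two computations take different numerical routes, so they should agree only up to rounding error.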

However, let’s not stop here: we will continue visualizing and playing with our data further…

**Appendix 1**

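**ks_lm0** <- function (df, yname, xnames){

*# Keep only the response and predictor columns, and only rows without missing values*

*all.names <- c( unlist(yname), unlist(xnames) )*

*clean.df <- df[complete.cases( df[, all.names] ), all.names]*

*# Predictors: drop the response column, then append a column of ones for the intercept*

*x <- clean.df*

*x[, yname] <- NULL*

*x[, ncol(clean.df)] <- rep_len(1, nrow(clean.df))*

*# Response vector*

*y <- clean.df[, yname]*

*# Design matrix, with the column of ones last*

*#X <- matrix(unlist(x), nrow(x))*

*X <- matrix(do.call(cbind,x), nrow(x))*

*# Normal equations: (X'X)^(-1) X'y*

*solve(t(X) %*% X) %*% t(X) %*% y*

*}*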
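Forming and inverting `t(X) %*% X` explicitly is fine for a small example like this one, but it squares the condition number of X, which can hurt accuracy when predictors are nearly collinear. As a hedged alternative, the sketch below solves the same least-squares problem through a QR decomposition using base R's `qr()` and `qr.coef()`. The function name `ks_lm_qr` is hypothetical and not part of the original code; it swaps in a different numerical technique that should give the same coefficients.

```r
# Sketch of a QR-based variant of ks_lm0 (the name ks_lm_qr is hypothetical).
ks_lm_qr <- function(df, yname, xnames) {
  all.names <- c(yname, xnames)
  # Keep only the needed columns and drop rows with missing values.
  clean.df <- df[complete.cases(df[, all.names]), all.names]
  # Predictors first, column of ones (intercept) last, as in ks_lm0.
  X <- cbind(as.matrix(clean.df[, xnames, drop = FALSE]), intercept = 1)
  y <- clean.df[[yname]]
  # Least-squares coefficients from the QR decomposition of X.
  qr.coef(qr(X), y)
}

# Usage on built-in data (an assumption; the text above uses the diabetes data).
b_qr <- ks_lm_qr(mtcars, "mpg", c("wt", "hp"))
print(b_qr)
```

The argument order and the intercept-last layout mirror **ks_lm0**, so the two functions can be called interchangeably on the same data frame.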