How to view full regression summary table from SparkR::spark.logit


If I fit a logistic regression model with SparkR::spark.logit() and call summary() on it ...

library(SparkR)
sparkR.session()  # start (or attach to) a Spark session before creating Spark data frames
mtcars_sdf <- createDataFrame(mtcars)
model <- spark.logit(mtcars_sdf, am ~ wt, family = 'binomial')
summary(model)

The summary of the model displays only the coefficient estimates:

$coefficients
            Estimate
(Intercept) 12.04037
wt          -4.02397

How can I view the additional model statistics like standard errors and p-values as I do with glm()?

base_glm <- glm(am ~ wt, family = 'binomial', mtcars)
summary(base_glm)
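For reference, the table I am after is the coefficient matrix that base R exposes through coef(summary(...)). A minimal sketch on the local mtcars data (the name coef_table is only illustrative):

```r
# Base R only: fit on the local data frame and pull out the coefficient table.
base_glm <- glm(am ~ wt, family = binomial, data = mtcars)

# Matrix with columns: Estimate, Std. Error, z value, Pr(>|z|)
coef_table <- coef(summary(base_glm))
print(coef_table)
```

This is the layout (estimate, standard error, test statistic, p-value per term) that I would like to get back from the spark.logit() model.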

I know that I can fit glm() to my Spark data frame, which does return the full table, but fitting takes about 3x as long with glm() as with spark.logit() on the same Spark data frame.

model_s <- glm(am ~ wt, family = 'binomial', mtcars_sdf)
summary(model_s)
Deviance Residuals: 
(Note: These are approximate quantiles with relative error <= 0.01)
     Min        1Q    Median        3Q       Max  
-1.70604   0.72898   0.72898   0.83613   0.83613  

Coefficients:
                Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  -2.6851e+00  2.42620590  -1.1067   0.26841
recipe_id     7.6825e-05  0.00004985   1.5411   0.12329

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 520.25  on 499  degrees of freedom
Residual deviance: 568.59  on 498  degrees of freedom
AIC: 572.6

Number of Fisher Scoring iterations: 10
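For a small example like mtcars, the missing columns can be reconstructed by hand from the spark.logit() estimates using the usual Wald formulas, where the covariance is the inverse of X'WX with W = p(1 - p). This is only a sketch in base R: it assumes the data fit in local memory and that the model was fit without regularization, and the variable names (beta, info, wald) are illustrative, not part of SparkR.

```r
# Estimates as printed by summary(model) for the spark.logit() fit above.
beta <- c(`(Intercept)` = 12.04037, wt = -4.02397)

# Design matrix built from the local copy of the data (assumption: it fits in memory).
X <- model.matrix(am ~ wt, data = mtcars)

# Fitted probabilities and logistic-regression weights p * (1 - p).
p <- as.vector(plogis(X %*% beta))
W <- p * (1 - p)

# Fisher information X' W X; its inverse approximates the coefficient covariance.
info <- crossprod(X, X * W)
se <- sqrt(diag(solve(info)))

# Wald z statistics and two-sided p-values.
z <- beta / se
pval <- 2 * pnorm(-abs(z))

wald <- cbind(Estimate = beta, `Std. Error` = se, `z value` = z, `Pr(>|z|)` = pval)
print(wald)
```

This does not scale to a large Spark data frame, since it needs the design matrix locally, so it is more a check of what the table should contain than a replacement for a built-in summary.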