Netherlands: Software

Introduction to Microsoft SQL Server 2016

Issue link: http://hub-nl.insight.com/i/692679


Chapter 6 | More analytics

Example 6-9: Predicting values

    # Define the SQL Server table that receives the scored rows
    scoredOutput <- RxSqlServerData(
      connectionString = connStr,
      table = "taxiScoreOutput")

    # Score the data with the logistic regression model
    rxPredict(modelObject = logitObj,
      data = modelDataSource,
      outData = scoredOutput,
      predVarNames = "Score",
      type = "response",
      writeModelVars = TRUE,
      overwrite = TRUE)

Figure 6-20 shows the results of the output stored in the table. A value below 0.5 in the Score column indicates a tip is not likely.

Figure 6-20: Viewing the output of the rxPredict function in the taxiScoreOutput table in SSMS.

Note: For simplicity in this example, the data used to train the model is also used to test the model. Typically, you partition the data, using one set to train the model and one set to test the model. To learn more about the rxPredict function, see http://www.rdocumentation.org/packages/RevoScaleR/functions/rxPredict.

Model accuracy

After you create a model, you can use R functions to test its accuracy. ROCR is a useful package for testing the performance of classification models. Example 6-10 shows the code to install and load this library by using the install.packages and library functions.

Example 6-10: Testing a model's accuracy

    # Install ROCR only if it is not already installed, then load it
    if (!('ROCR' %in% rownames(installed.packages()))){
      install.packages('ROCR')
    }
    library(ROCR)

    # Bring the scored rows from the server into a local data frame
    scoredOutput <- rxImport(scoredOutput)

    # Pair each score with the observed outcome, then plot TPR against FPR
    pr <- prediction(scoredOutput$Score, scoredOutput$tipped)
    prf <- performance(pr, measure = "tpr", x.measure = "fpr")
    plot(prf)

To use the functions in ROCR, you must bring data from the server into your local environment by using the rxImport function. Next, you load the results of your predictions into a prediction object by using the prediction function, which takes the score from your model as its first argument and the observed value (here, the tipped column) as the second argument. Notice that each argument is written as the name of the data source, then a $ symbol, then the name of the column in that data source.
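
The two inputs to prediction and the tpr and fpr measures requested in Example 6-10 can be illustrated without ROCR or a server connection. The following is a minimal base-R sketch, assuming a small made-up data frame (toyScores is a hypothetical name; its Score and tipped columns only mirror those of taxiScoreOutput, and the values are invented for illustration). It computes the two rates at the single 0.5 cutoff mentioned for Figure 6-20.

```r
# Hypothetical stand-in for the imported taxiScoreOutput table;
# the values are made up for illustration only.
toyScores <- data.frame(
  Score  = c(0.91, 0.72, 0.40, 0.65, 0.15, 0.30),
  tipped = c(1,    1,    1,    0,    0,    0)
)

# The $ notation selects one column of the data frame as a vector.
scores <- toyScores$Score
labels <- toyScores$tipped

# Classify each trip with the 0.5 cutoff: a score below 0.5 means
# a tip is not likely, so scores at or above 0.5 count as "tip".
predictedTip <- as.integer(scores >= 0.5)

# True positive rate: share of actual tips that were predicted as tips.
tpr <- sum(predictedTip == 1 & labels == 1) / sum(labels == 1)

# False positive rate: share of non-tips that were predicted as tips.
fpr <- sum(predictedTip == 1 & labels == 0) / sum(labels == 0)

print(c(tpr = tpr, fpr = fpr))
```

With these invented values, two of the three actual tips are caught (a true positive rate of 2/3) and one of the three non-tips is flagged (a false positive rate of 1/3). ROCR computes this pair of rates at every cutoff rather than only at 0.5, which is what yields the curve drawn by plot(prf).
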
The performance function takes the prediction object as the first argument and then you specify measures to return. In this case, tpr and fpr represent true positive rate and false positive rate and are
