Title: | Variable Selection for Joint Modeling of Mean and Dispersion |
---|---|
Description: | A package for selecting variables for the joint modeling of mean and dispersion (including models for mixture experiments) based on hypothesis testing and the quality of the model's fit. In each iteration of the selection process, a goodness-of-fit criterion is used as a filter to choose the terms that will be evaluated by a hypothesis test. Pinto & Pereira (2021) <arXiv:2109.07978>. |
Authors: | Leandro A. Pereira [aut, cre], Edmilson R. Pinto [aut] |
Maintainer: | Leandro A. Pereira <[email protected]> |
License: | GPL-3 |
Version: | 0.0.1 |
Built: | 2024-11-13 03:57:57 UTC |
Source: | https://github.com/cran/stepjglm |
Data from a bread-making mixture experiment conducted to investigate and evaluate the final quality of flour.
data(bread_mixture)
A data frame containing 90 rows and 6 variables.
The response variable is the loaf volume after baking, with a target value of 530 ml.
Control variables:
x1: Tjalve
x2: Folke
x3: Hard Red Spring
Process variables:
z1: mixing time
z2: proofing (resting) time of the dough
The bread-making problem was originally presented by Faergestad and Naes (1997). According to Naes et al. (1998), the experiment involved three mixture ingredients and two noise variables, and its objective was to investigate and evaluate the final quality of flour, composed of different mixtures of wheat flour, for bread production.
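In a mixture experiment the ingredient proportions sum to one, which is why mixture models are usually written without an intercept. The sketch below illustrates this with simulated proportions (not the bread_mixture data); the data frame `sim` and the coefficients used to generate `y` are hypothetical.

```r
# Simulated three-component mixture data (NOT the bread_mixture data).
set.seed(1)
n <- 30
w <- matrix(rexp(3 * n), ncol = 3)
w <- w / rowSums(w)                     # proportions in each row sum to 1
sim <- data.frame(x1 = w[, 1], x2 = w[, 2], x3 = w[, 3])
sim$y <- 500 * sim$x1 + 450 * sim$x2 + 600 * sim$x3 + rnorm(n, sd = 5)

# Because x1 + x2 + x3 = 1, an intercept would be collinear with the
# linear blending terms, so the model is fitted without it.
fit <- lm(y ~ -1 + x1 + x2 + x3, data = sim)
coef(fit)

# With the intercept included, one coefficient is not estimable (NA).
coef(lm(y ~ x1 + x2 + x3, data = sim))
```

The no-intercept linear part `-1 + x1 + x2 + x3` has the same form as the string passed as the starting model in the bread-making example of stepjglm.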
Faergestad, E. M., Naes, T. (1997). Evaluation of baking quality of wheat flours: I: small scale straight dough baking test of hearth bread with variable mixing time and proofing time. In: Report MATFORSK, As, Norway.
Naes, T., Faergestad, E. M., Cornell, J. A. (1998). A comparison of methods for analyzing data from a three component mixture experiment in the presence of variation created by two process variables. Chemometrics and Intelligent Laboratory Systems, v. 41, pp. 221-235.
data(bread_mixture)
head(bread_mixture)
The experiment was performed to study the influence of seven controllable factors and three noise factors on the mean value and the variation in the percentage of shrinkage of products made by injection molding.
data(injection_molding)
A data frame containing 32 rows and 11 variables.
The responses were percentages of shrinkage of products made by injection molding (Y).
Controllable factors:
A: cycle time
B: mould temperature
C: cavity thickness
D: holding pressure
E: injection speed
F: holding time
G: gate size
At each setting of the controllable factors, four observations were obtained from a 2^(3-1) fractional factorial with three noise factors:
M: percentage regrind
N: moisture content
O: ambient temperature
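Four observations from three noise factors correspond to a half fraction of a two-level design. The sketch below builds such a half fraction in base R; the two-level coding (-1/+1) and the defining relation `O = M * N` are assumptions made for illustration and are not taken from the data set documentation.

```r
# Full 2^3 factorial in coded units (-1 / +1) for the three noise factors.
full <- expand.grid(M = c(-1, 1), N = c(-1, 1), O = c(-1, 1))

# Half fraction: keep the runs satisfying the (assumed) relation O = M * N.
half <- full[full$O == full$M * full$N, ]
rownames(half) <- NULL
half
nrow(half)   # 4 noise settings per setting of the controllable factors
```

With 32 rows in the data frame and four noise settings per run, the controllable factors take 32 / 4 = 8 distinct settings.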
The data set is well known in the literature on industrial experiments and has been analyzed by several authors, such as Engel (1992), Engel and Huele (1996) and Lee and Nelder (1998). Noise factors are held fixed during the experiment but are expected to vary randomly outside the experimental context.
The aim of the experiment was to determine the process parameter settings so that the shrinkage percentage was close to the target value and robust against environmental variations.
Engel, J. (1992). Modeling variation in industrial experiments. Applied Statistics, 41, 579-593.
Engel, J. and Huele, A. F. (1996). A generalized linear modeling approach to robust design. Technometrics, 38, 365-373.
Lee, Y. and Nelder, J.A. (1998). Generalized linear models for analysis of quality improvement experiments. The Canadian Journal of Statistics, 26, 95-105.
data(injection_molding)
head(injection_molding)
A procedure for selecting variables in the joint modeling of mean and dispersion (JMMD), including mixture models, based on hypothesis testing and the quality of the model's fit.
stepjglm(model,alpha1,alpha2,datafram,family,lambda1=1,lambda2=1,startmod=1, interations=FALSE)
model |
an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. if |
alpha1 |
significance level for testing the addition of new terms to the mean model. |
alpha2 |
significance level for testing the addition of new terms to the dispersion model. |
datafram |
a data frame containing the data. |
family |
a character string naming a family function or the result of a call to a family function. For |
lambda1 |
some function of the sample size to calculate the |
lambda2 |
some function of the sample size to calculate the |
startmod |
if |
interations |
if |
The function implements a method for selecting variables for both the mean and the dispersion models in the joint modeling of mean and dispersion (JMMD) introduced by Nelder and Lee (1991), using the Adjusted Quasi Extended Likelihood introduced by Lee and Nelder (1998). The selection procedure is based on hypothesis testing and the quality of the model's fit: in each iteration, a goodness-of-fit criterion acts as a filter that chooses the terms to be evaluated by a hypothesis test. For more details on the selection algorithms, see Pinto and Pereira (in press).
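The idea of filtering by goodness of fit and then testing can be sketched as a single forward step in base R. This is only an illustration on simulated data: AIC stands in for the filtering criterion and a nested-model F test stands in for the hypothesis test, and neither is claimed to be what stepjglm uses internally. The names `sim`, `candidates` and `alpha` are hypothetical.

```r
set.seed(42)
n <- 100
sim <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
sim$y <- 1 + 2 * sim$a + rnorm(n)       # only `a` truly affects the mean

alpha      <- 0.05                       # plays the role of alpha1
current    <- glm(y ~ 1, family = gaussian, data = sim)
candidates <- c("a", "b", "c")

# Filter: fit each one-term extension and keep the one with the smallest AIC.
fits <- lapply(candidates, function(term)
  update(current, as.formula(paste(". ~ . +", term))))
best <- which.min(sapply(fits, AIC))

# Test: the filtered term enters only if the nested-model F test is significant.
p_value <- anova(current, fits[[best]], test = "F")[2, "Pr(>F)"]
if (p_value < alpha) current <- fits[[best]]
formula(current)
```

Repeating this step until no filtered term passes the test gives a simple forward selection; in the joint setting the same idea would be applied to the mean and dispersion models in turn.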
model.mean |
a glm object containing the fitted mean model. |
model.disp |
a glm object containing the fitted dispersion model. |
EAIC |
a numeric object containing the Extended Akaike Information Criterion. For details, see Wang and Zhang (2009). |
EQD |
a numeric object containing the Extended Quasi Deviance. For details, see Nelder and Lee (1991). |
R2m |
a numeric object containing the standard correction for the R² of the mean model. For details, see Pinto and Pereira (in press). |
R2d |
a numeric object containing the standard correction for the R² of the dispersion model. For details, see Pinto and Pereira (in press). |
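The two glm objects in the returned value correspond to the two linked models of a JMMD. As a rough sketch of how such a pair can be fitted, the code below alternates between a weighted gaussian mean model and a Gamma log-link model for the squared residuals, on simulated data. It is a simplified illustration under stated assumptions (no leverage adjustment, no variable selection, hypothetical names `sim` and `phi_hat`) and not the package's implementation.

```r
set.seed(123)
n <- 500
sim <- data.frame(x = runif(n), z = runif(n))
mu  <- 2 + 3 * sim$x                     # true mean
phi <- exp(-1 + 2 * sim$z)               # true dispersion (gaussian variance)
sim$y <- rnorm(n, mean = mu, sd = sqrt(phi))

phi_hat <- rep(1, n)                     # start from constant dispersion
for (it in 1:20) {
  # Mean model: gaussian GLM weighted by the inverse of the current dispersion.
  sim$w <- 1 / phi_hat
  fit_mean <- glm(y ~ x, family = gaussian, data = sim, weights = w)
  # Deviance components of the gaussian mean model are squared residuals.
  sim$d <- (sim$y - fitted(fit_mean))^2
  # Dispersion model: Gamma GLM with log link fitted to those components.
  fit_disp <- glm(d ~ z, family = Gamma(link = "log"), data = sim)
  phi_new <- fitted(fit_disp)
  if (max(abs(phi_new - phi_hat)) < 1e-6) break
  phi_hat <- phi_new
}
coef(fit_mean)   # estimates for the mean model
coef(fit_disp)   # estimates for the dispersion model (log scale)
```

Each model feeds the other: the dispersion fit supplies the weights for the mean model, and the mean fit supplies the response for the dispersion model.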
Leandro Alves Pereira, Edmilson Rodrigues Pinto.
Hu, B. and Shao, J. (2008). Generalized linear model selection using R². Journal of Statistical Planning and Inference, 138, 3705-3712.
Lee, Y., Nelder, J. A. (1998). Generalized linear models for analysis of quality improvement experiments. The Canadian Journal of Statistics, v. 26, n. 1, pp. 95-105.
Nelder, J. A., Lee, Y. (1991). Generalized linear models for the analysis of Taguchi-type experiments. Applied Stochastic Models and Data Analysis, v. 7, pp. 107-120.
Pinto, E. R., Pereira, L. A. (in press). On variable selection in joint modeling of mean and dispersion. Brazilian Journal of Probability and Statistics. Preprint at https://arxiv.org/abs/2109.07978 (2021).
Wang, D. and Zhang, Z. (2009). Variable selection in joint generalized linear models. Chinese Journal of Applied Probability and Statistics, v. 25, pp. 245-256.
Zhang, D. (2017). A coefficient of determination for generalized linear models. The American Statistician, v. 71, 310-316.
library(stepjglm)

# Application to the bread-making problem:
data(bread_mixture)
Form = as.formula(y ~ x1:x2 + x1:x3 + x2:x3 + x1:x2:(x1-x2) + x1:x3:(x1-x3) +
  x1:z1 + x2:z1 + x3:z1 + x1:x2:z1 +
  x1:x3:z1 + x1:x2:(x1-x2):z1 +
  x1:x3:(x1-x3):z1 +
  x1:z2 + x2:z2 + x3:z2 + x1:x2:z2 +
  x1:x3:z2 + x1:x2:(x1-x2):z2 + x1:x3:(x1-x3):z2)
object = stepjglm(Form, 0.1, 0.1, bread_mixture, gaussian, sqrt(90), "AIC", "-1+x1+x2+x3")
summary(object$modelo.mean)
summary(object$modelo.disp)
object$EAIC   # Print the EAIC for the final model

# Application to the injection molding data:
form = as.formula(Y ~ A*M + A*N + A*O + B*M + B*N + B*O + C*M + C*N + C*O + D*M + D*N + D*O +
  E*M + E*N + E*O + F*M + F*N + F*O + G*M + G*N + G*O)
data(injection_molding)
obj.dt = stepjglm(form, 0.05, 0.05, injection_molding, gaussian, sqrt(nrow(injection_molding)), "AIC")
summary(obj.dt$modelo.mean)
summary(obj.dt$modelo.disp)
obj.dt$EAIC   # Print the EAIC for the final model
obj.dt$EQD    # Print the EQD for the final model
obj.dt$R2m    # Print the R2m for the final model
obj.dt$R2d    # Print the R2d for the final model