Data Imputation Methods and Technologies

We introduce a class of linear quantile regression estimators for panel data. Our framework contains dynamic autoregressive models, models with general predetermined regressors, and models with multiple individual effects as special cases. We follow a correlated random-effects approach, and rely on additional layers of quantile regressions as a flexible tool to model conditional distributions. Conditions are given under which the model is nonparametrically identified in static or Markovian dynamic models. We develop a sequential method-of-moment approach for estimation, and compute the estimator using an iterative algorithm that exploits the computational simplicity of ordinary quantile regression in each iteration step. Finally, a Monte-Carlo exercise and an application to measure the effect of smoking during pregnancy on children's birthweights complete the paper.

K-means and K-medoids clustering algorithms are widely used for many practical applications. The original k-means and k-medoids algorithms select initial centroids and medoids randomly, which affects the quality of the resulting clusters and sometimes generates unstable and empty clusters, which are meaningless. The original k-means and k-medoids algorithms are computationally expensive and require time proportional to the product of the number of data items, the number of clusters, and the number of iterations. The new approach for the k-means algorithm eliminates the deficiency of the existing k-means. It first calculates the initial centroids k as per the requirements of users and then gives better, effective and stable clusters. It also takes less execution time because it eliminates unnecessary distance computation by using the previous iteration. The new approach for k-medoids [...]

International Journal of Trend in Scientific Research and Development (IJTSRD), International Open Access Journal | www.ijtsrd.com | Volume 2 | Issue 4 | May-Jun 2018
Dr Asha Ambhaikar, Ph.D., Dept. of CSE, Kalinga Naya Raipur, Chhattisgarh

INTRODUCTION
Nonlinear panel data models are central to applied research. However, despite some recent progress, it is fair to say that we are still short of answers for panel versions of many models commonly used in empirical work. In this paper we focus on one particular nonlinear model for panel data: quantile regression.

Since Koenker and Bassett (1978), quantile regression has become a prominent methodology for examining the effects of explanatory variables across the entire outcome distribution. Extending the quantile regression approach to panel data has proven challenging, however, mostly because of the difficulty of handling individual-specific heterogeneity. Starting with Koenker (2004), most panel data approaches to date proceed in a quantile-by-quantile fashion, and include individual dummies as additional covariates in the quantile regression. As shown by some recent work, however, this fixed-effects approach faces special challenges when applied to quantile regression. Galvao, Kato and Montes-Rojas develop the large-N, T analysis of the fixed-effects quantile regression estimator, and show that it may suffer from large asymptotic biases. It has also been shown that the fixed-effects model for a single quantile is not point-identified.

We depart from the previous literature by proposing a random-effects approach for quantile models from panel data. This approach treats individual unobserved heterogeneity as time-invariant missing data. To describe the model, let $i = 1, ..., N$ denote individual units, and let $t = 1, ..., T$ denote time periods. The random-effects quantile regression (REQR) model specifies the $\tau$-specific conditional quantile of an outcome variable $Y_{it}$, given a sequence of strictly exogenous covariates $X_i = (X_{i1}', ..., X_{iT}')'$ and unobserved heterogeneity $\eta_i$, as follows:

$$Q(Y_{it} \mid X_i, \eta_i, \tau) = X_{it}'\beta(\tau) + \eta_i\gamma(\tau). \tag{1}$$

Note that $\eta_i$ does not depend on the percentile value $\tau$. Were data on $\eta_i$ available, one could use a standard quantile regression package to recover the parameters $\beta(\tau)$ and $\gamma(\tau)$.
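To make the remark about a standard quantile regression package concrete, the sketch below writes out the check-function objective that such packages minimize. It assumes a linear specification with scalar regressor and scalar individual effect, and it pretends that the individual effect is observed, which is exactly the infeasible case described above. The names `check_loss` and `qr_objective` and the toy data are illustrative assumptions, not part of the paper.

```python
def check_loss(u, tau):
    """Koenker-Bassett check function: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def qr_objective(y, x, eta, beta, gamma, tau):
    """Average check loss of the residuals y - x*beta - eta*gamma.

    Scalar x and eta are assumed for simplicity. If eta were observed,
    minimizing this over (beta, gamma) would be an ordinary quantile regression.
    """
    residuals = [yi - xi * beta - ei * gamma for yi, xi, ei in zip(y, x, eta)]
    return sum(check_loss(r, tau) for r in residuals) / len(residuals)


# Toy check: with x = 1 and eta = 0 the model is intercept-only, and the
# minimizer over candidate intercepts is a sample quantile of y.
y = [float(v) for v in range(1, 11)]
x = [1.0] * len(y)
eta = [0.0] * len(y)
best = min(y, key=lambda b: qr_objective(y, x, eta, b, 0.0, tau=0.25))
print(best)
```

In practice one would hand the minimization to a linear-programming based routine rather than search over candidate values; the grid search here only serves to show what is being minimized.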
Model (1) specifies the conditional distribution of $Y_{it}$ given $X_{it}$ and $\eta_i$. In order to complete the model, we also specify the conditional distribution of $\eta_i$ given the sequence of covariates $X_i$. For this purpose, we introduce an additional layer of quantile regression and specify the $\tau$-th conditional quantile of $\eta_i$ given covariates as follows:

$$Q(\eta_i \mid X_i, \tau) = X_i'\delta(\tau). \tag{2}$$

This modelling allows for a flexible conditioning on strictly exogenous regressors (and on initial conditions in dynamic settings) that may also be of interest in other panel data models. Together, equations (1)-(2) provide a fully specified semiparametric model for the joint distribution of outcomes given the sequence of strictly exogenous covariates. The aim is then to recover the model's parameters: $\beta(\tau)$, $\gamma(\tau)$, and $\delta(\tau)$, for all $\tau \in (0, 1)$.

Our identification result for the REQR model is nonparametric. In particular, identification holds even if the conditional distribution of individual effects is left unrestricted. Recent research has emphasized the identification content of nonlinear panel data models with continuous outcomes (Bonhomme, 2012), as opposed to discrete outcomes models where parameters of interest are typically set-identified (Honoré and Tamer, 2006; Chernozhukov, Fernández-Val, Hahn and Newey, 2011). Pursuing this line of research, our analysis provides conditions for nonparametric identification of REQR in panels where the number of time periods $T$ is fixed, possibly very short (e.g., $T = 3$). One of the required conditions to apply Hu and Schennach (2008)'s result is a completeness assumption. Although completeness is a high-level assumption, recent papers have provided primitive conditions in specific models, including a special case of model (1).

Our analysis is most closely related to Wei and Carroll (2009), who proposed a consistent estimation method for cross-sectional linear quantile regression subject to covariate measurement error.
In particular, we rely on the approach in Wei and Carroll to deal with the continuum of model parameters indexed by $\tau \in (0, 1)$. As keeping track of all parameters in the algorithm is not feasible, we build on their insight and use interpolating splines to combine the quantile-specific parameters in (1)-(2) into a complete likelihood function that depends on a finite number of parameters. Our proof of consistency, in a panel data asymptotics where $N$ tends to infinity and $T$ is kept fixed, also builds on theirs. As the sample size increases, the number of knots, and hence the accuracy of the spline approximation, increase as well. A key difference with Wei and Carroll is that, in our setup, the conditional distribution of individual effects is unknown, and needs to be estimated along with the other parameters of the model.
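The spline idea can be illustrated with a small sketch. It assumes piecewise-linear interpolation between a finite set of knots, which is one simple choice of interpolating spline and not necessarily the one used in the paper. Under that assumption the implied density between two knots is the reciprocal of the slope of the quantile function. The function names and the numbers are hypothetical, and the tails beyond the outer knots are deliberately left unmodelled.

```python
from bisect import bisect_right


def interpolate_coef(tau, knots, values):
    """Piecewise-linear interpolation of a coefficient function between knots."""
    if not knots[0] <= tau <= knots[-1]:
        raise ValueError("tau lies outside the knot range; tails are not modelled here")
    # Index of the knot interval [knots[j], knots[j + 1]] that contains tau.
    j = min(bisect_right(knots, tau) - 1, len(knots) - 2)
    weight = (tau - knots[j]) / (knots[j + 1] - knots[j])
    return (1.0 - weight) * values[j] + weight * values[j + 1]


def implied_density(y, knots, quantiles):
    """Density implied by a piecewise-linear quantile function.

    quantiles[j] is the conditional quantile at knots[j], assumed strictly
    increasing in j. Between two knots the density equals the reciprocal of
    the slope of the quantile function. Outside the outer knots this sketch
    returns 0.0, which is a simplification.
    """
    for j in range(len(knots) - 1):
        if quantiles[j] < y <= quantiles[j + 1]:
            return (knots[j + 1] - knots[j]) / (quantiles[j + 1] - quantiles[j])
    return 0.0


knots = [0.1, 0.5, 0.9]
beta_at_knots = [1.0, 5.0, 9.0]  # hypothetical values of beta(tau) at the knots
print(interpolate_coef(0.3, knots, beta_at_knots))
print(implied_density(3.0, knots, beta_at_knots))
```

Because the density depends only on the coefficient values at the knots, the likelihood is a function of finitely many parameters, which is the point made in the paragraph above.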

Model and identification
In this section and the next we focus on the static version of the random-effects quantile regression (REQR) model. Section 6 will consider various extensions to dynamic models. We start by presenting the model along with several examples, and then provide conditions for nonparametric identification.

2.1. Model
Let $Y_i = (Y_{i1}, ..., Y_{iT})'$ denote a sequence of $T$ scalar outcomes for individual $i$, and let $X_i = (X_{i1}', ..., X_{iT}')'$ denote a sequence of strictly exogenous regressors, which may contain a constant. In addition, let $\eta_i$ denote a $q$-dimensional vector of individual-specific effects, and let $U_{it}$ denote a scalar error term. The model specifies the conditional quantile response function of $Y_{it}$ given $X_{it}$ and $\eta_i$ as follows: [...] for all $\tau \in (0, 1)$.
Assumption 1 (outcomes) (i) $U_{it}$ follows a standard uniform distribution conditional on $X_i$ and $\eta_i$.
(iii) $U_{it}$ is independent of $U_{is}$ for each $t \neq s$ conditional on $X_i$ and $\eta_i$.
Assumption 1 (i) contains two parts. First, $U_{it}$ is assumed independent of the full sequence $X_{i1}, ..., X_{iT}$ and independent of individual effects. This assumption of strict exogeneity rules out predetermined or endogenous covariates. Second, the marginal distribution of $U_{it}$ is normalized to be uniform on the unit interval. Part (ii) guarantees that outcomes [...]
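The uniform normalization has a simple consequence that can be checked by simulation: if outcomes are generated by evaluating a quantile function at a standard uniform rank, then the share of outcomes at or below the $\tau$-th conditional quantile is about $\tau$. The sketch below assumes a hypothetical linear quantile function with made-up coefficient functions, chosen so that it is strictly increasing in $\tau$. Nothing about these functional forms or parameter values comes from the paper.

```python
import random


def q_y(x, eta, tau):
    """Hypothetical linear quantile function x*beta(tau) + eta*gamma(tau).

    beta(tau) = 1 + tau and gamma(tau) = 0.5 + tau are made-up choices; with
    x > 0 and eta > 0 they make q_y strictly increasing in tau.
    """
    return x * (1.0 + tau) + eta * (0.5 + tau)


def simulate_panel(n, t, seed=0):
    """Draw (x_it, eta_i, y_it) with U_it i.i.d. standard uniform."""
    rng = random.Random(seed)
    rows = []
    for _unit in range(n):
        eta = rng.uniform(0.5, 1.5)        # individual effect, fixed over time
        for _period in range(t):
            x = rng.uniform(1.0, 2.0)      # strictly exogenous regressor
            u = rng.random()               # rank U_it, independent of (x, eta)
            rows.append((x, eta, q_y(x, eta, u)))
    return rows


def coverage(rows, tau):
    """Share of observations with y_it at or below the tau-th conditional quantile."""
    hits = sum(1 for x, eta, y in rows if y <= q_y(x, eta, tau))
    return hits / len(rows)


rows = simulate_panel(n=5000, t=3)
print(round(coverage(rows, 0.3), 2))
```

The check works because, for a quantile function that is strictly increasing in its last argument, the event that the outcome is at or below the $\tau$-th quantile coincides with the event that the rank is at or below $\tau$.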

Identification
In this section we study nonparametric identification in model (3)-(4). We start with the case where there is a single scalar individual effect (i.e., $q = \dim \eta_i = 1$), and we set $T = 3$.

REQR estimation
This section considers estimation in the static model (6)-(7). We start by describing the moment restrictions that our estimator exploits, and then present the sequential estimator. In the next two sections we will study the asymptotic properties of the estimator and discuss implementation issues in turn.
In order to derive the main moment restrictions, we start by noting that, for all $\tau \in (0, 1)$, the following infeasible moment restrictions hold, as a direct implication of Assumptions 1 [...] Indeed, (6) is the first-order condition associated with the infeasible population quantile regression of $Y_{it}$ on $X_{it}$ and $\eta_i$. Similarly, (5) corresponds to the infeasible quantile regression of $\eta_i$ on $X_i$.
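For reference, the first-order condition of a population quantile regression has a standard form. The display below is a sketch under the assumption that the static model is linear, with conditional quantiles $X_{it}'\beta(\tau) + \eta_i\gamma(\tau)$ for outcomes and $X_i'\delta(\tau)$ for individual effects. The symbol $\psi_\tau$ and the layout are introduced here for illustration and need not match the paper's numbered equations.

```latex
\psi_\tau(u) = \tau - \mathbf{1}\{u < 0\},
\qquad
\mathbb{E}\left[ \sum_{t=1}^{T}
  \begin{pmatrix} X_{it} \\ \eta_i \end{pmatrix}
  \psi_\tau\bigl( Y_{it} - X_{it}'\beta(\tau) - \eta_i\gamma(\tau) \bigr) \right] = 0,
\qquad
\mathbb{E}\left[ X_i \, \psi_\tau\bigl( \eta_i - X_i'\delta(\tau) \bigr) \right] = 0.
```

Both conditions are infeasible in the sense used above, since they involve the unobserved $\eta_i$.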

CONCLUSION
Random-effects quantile regression (REQR) provides a flexible approach to nonlinear panel data modelling. In our approach, quantile regression is used as a versatile tool to model the dependence between individual effects and exogenous regressors or initial conditions, and to model feedback processes in models with predetermined covariates. The empirical application illustrates the benefits of having a flexible approach to allow for heterogeneity and nonlinearity within the same model in a panel data context.
The analysis of the asymptotic properties of the REQR estimator requires an approximation argument. However, while our consistency proof allows the quality of the approximation to increase with the sample size, at this stage in our characterization of the asymptotic distribution we keep the number of knots $L$ fixed as the number of observations $N$ increases. Assessing the asymptotic behavior of the quantile estimates as both $L$ and $N$ tend to infinity is an important task for future work.
Lastly, note that our quantile-based modelling of the distribution of individual effects could be of interest in other models as well. For example, one could consider semiparametric likelihood panel data models, where the conditional likelihood of the outcome $Y_i$ given $X_i$ and $\eta_i$ depends on a finite-dimensional parameter vector $\alpha$, and the conditional distribution of $\eta_i$ given $X_i$ is left unrestricted. The approach of this paper is easily adapted to this case, and delivers a semiparametric likelihood of the form:

$$f(y_i \mid x_i; \alpha, \delta(\cdot)) = \int f(y_i \mid x_i, \eta_i; \alpha)\, f(\eta_i \mid x_i; \delta(\cdot))\, d\eta_i,$$

where $\delta(\cdot)$ is a process of quantile coefficients.
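One way to see how a quantile representation of the individual effects enters such a likelihood is to change variables from $\eta_i$ to its rank, so that the integral runs over the unit interval. The sketch below does this numerically with a midpoint rule. The Gaussian functional forms, the parameter values, and all function names are hypothetical; they are chosen only because this toy case has a closed-form answer to compare against.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def q_eta(x, v, delta):
    """Hypothetical quantile function of eta given x: delta*x + Phi^{-1}(v)."""
    return delta * x + STD_NORMAL.inv_cdf(v)


def f_y_given_x_eta(y, x, eta, alpha):
    """Hypothetical outcome density: y | x, eta ~ Normal(alpha*x + eta, 1)."""
    return STD_NORMAL.pdf(y - alpha * x - eta)


def integrated_likelihood(y, x, alpha, delta, n_points=2000):
    """Approximate the integral over eta by averaging over a grid of ranks v.

    Substituting eta = q_eta(x, v) turns the integral against f(eta | x) into
    an integral over v in (0, 1), evaluated here with the midpoint rule.
    """
    total = 0.0
    for m in range(n_points):
        v = (m + 0.5) / n_points
        total += f_y_given_x_eta(y, x, q_eta(x, v, delta), alpha)
    return total / n_points


approx = integrated_likelihood(y=1.5, x=0.5, alpha=1.0, delta=2.0)
# Closed form in this toy case: y | x ~ Normal((alpha + delta) * x, variance 2).
exact = NormalDist(mu=1.5, sigma=2 ** 0.5).pdf(1.5)
print(approx, exact)
```

The same change of variables applies whatever form the quantile function of the individual effects takes, which is why leaving that distribution unrestricted does not prevent the likelihood from being evaluated.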
As another example, our framework naturally extends to models with time-varying unobservables, such as:

$$Y_{it} = Q_Y(X_{it}, \eta_{it}, U_{it}), \qquad \eta_{it} = Q_\eta(\eta_{i,t-1}, V_{it}),$$

where $U_{it}$ and $V_{it}$ are i.i.d. and uniformly distributed. It seems worth assessing the usefulness of our approach in these other contexts.
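A short simulation can make the recursive structure of this extension explicit. The sketch below assumes particular functional forms for the two quantile functions (a Gaussian first-order autoregression for the latent process and an additive outcome equation). These forms, the parameter values, and the function names are hypothetical and serve only to show how the two uniform ranks drive the system.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def draw_rank(rng):
    """Standard uniform draw on the open interval (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def q_eta(eta_prev, v, rho=0.8):
    """Hypothetical transition: rho * eta_prev + Phi^{-1}(v), a Gaussian AR(1)."""
    return rho * eta_prev + STD_NORMAL.inv_cdf(v)


def q_y(x, eta, u):
    """Hypothetical outcome equation: x + eta + 0.5 * Phi^{-1}(u)."""
    return x + eta + 0.5 * STD_NORMAL.inv_cdf(u)


def simulate_individual(x_path, eta_0, rng):
    """Simulate (eta_it, Y_it) for one individual over the periods in x_path."""
    eta, path = eta_0, []
    for x in x_path:
        v, u = draw_rank(rng), draw_rank(rng)  # independent uniform ranks
        eta = q_eta(eta, v)                    # eta_it = Q_eta(eta_{i,t-1}, V_it)
        path.append((eta, q_y(x, eta, u)))     # Y_it = Q_Y(X_it, eta_it, U_it)
    return path


rng = random.Random(0)
path = simulate_individual(x_path=[0.0, 1.0, 2.0], eta_0=0.0, rng=rng)
print(len(path))
```

Replacing `q_eta` and `q_y` with other functions that are increasing in their last argument changes the dynamics without changing the simulation loop.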