Taking Melitz Seriously: A Simple Approach for Identifying Structural Parameters of Trade Models with Firm Heterogeneity
Saad Ahmad
Zeynep Akgul
Abstract
Quantitative trade models based on the theory of firm heterogeneity generally assume that the distributions of firm productivity and firm size follow a power law, especially in the upper tail of the distribution where rare events such as exporting occur. The distribution most frequently used in such models is the Pareto distribution. While power laws are widely used in the literature, a power law may not fit the entire distribution accurately because of fluctuations in the tails. It is therefore essential to identify a minimum threshold above which the power law provides a good fit to the data. This is especially important when estimating the structural parameters of the firm heterogeneity model for use in policy analysis, as biased estimates may distort trade volume and welfare responses. In this paper, as in Clauset et al. (2009), we combine maximum likelihood and the Kolmogorov-Smirnov (KS) statistic to estimate both the minimum threshold for truncating the data and the shape parameter, under a power law, of the firm size and productivity distributions. We then impute the elasticity of substitution across varieties that is appropriate to use in firm heterogeneity models of trade.
Introduction
While there are a number of methodologies prevalent in the current literature for obtaining these parameters, they often lack consistency with the underlying firm heterogeneity theory, indicating a clear need for continued efforts towards theory-consistent parameterization of firm heterogeneity models (Akgul et al., 2015; Ahmad and Akgul, 2017). Indeed, the lack of a general and theoretically sound approach for obtaining parameters in firm heterogeneity models remains one of the main challenges in advancing their widespread adoption for policy analysis.
A recent approach in Ahmad and Akgul (2017) proposes estimating the structural parameters of firm heterogeneity models by using the theoretical relationship between the distribution of firm size and the distribution of firm productivity. In firm heterogeneity models, the common assumption is that firm productivity follows the Pareto distribution, which is a power-law model with an exponent $\gamma$ equal to the shape parameter of the productivity distribution. If firm productivity follows a power-law model, then firm size also follows a power-law model, but with a different exponent, $\alpha = \frac{\gamma}{\sigma - 1}$, where $\sigma$ is the elasticity of substitution across varieties. Since $\alpha$ can be estimated directly from firm-level data, it provides a useful way to infer the ratio of structural parameters in the firm heterogeneity model. Based on this methodology, Ahmad and Akgul (2017) use firm-level ORBIS data for the motor vehicles and parts (MVH) sector to fit the total factor productivity and firm size distributions to power-law models and estimate the structural parameters of the firm heterogeneity model.
We start with this approach and improve the methodology in two key respects. First, there are a limited number of observations for US firms in the MVH sector, and this limits the efficiency of the estimates. We address sample size concerns by using this methodology on all US manufacturing firms in ORBIS as well as considering other countries that have more firm observations in ORBIS, such as Japan and the EU. These changes allow us to vastly increase our sample for subsequent estimations.
Second, one of the principal criticisms of assuming a power-law model for a given empirical dataset is that real-world variables do not follow the power law over their entire range. In fact, the data typically follow the power law only above a lower bound. In particular, the power-law function $p(x) = C x^{-\alpha}$ diverges as $x$ approaches 0 for any positive value of $\alpha$. This means that there is a minimum value $x_{\min}$ (the lower bound) below which the distribution deviates from the power law. To obtain an unbiased estimate of the shape parameter of the productivity distribution, it is important to focus on the upper tail of the sample rather than the entire range of firm observations. This in turn requires an accurate estimate of the lower bound. If the lower bound is too low, the sample includes observations that do not follow a power law, so we are fitting non-power-law data to a power-law distribution and the estimate of the shape parameter is biased. If the lower bound is too high, we discard relevant observations, which may increase both the statistical error of the parameter estimates and the bias from finite-size effects (Clauset et al., 2009).
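To make the need for a strictly positive lower bound concrete, consider the normalization of the density over $[x_{\min}, \infty)$ when $\alpha > 1$. The following is a standard calculation added for illustration; it is not taken from the cited papers.

```latex
\int_{x_{\min}}^{\infty} C\, x^{-\alpha}\, dx
  \;=\; \frac{C}{\alpha - 1}\, x_{\min}^{-(\alpha - 1)}
  \;\longrightarrow\; \infty
  \qquad \text{as } x_{\min} \to 0 \quad (\alpha > 1).
```

No finite constant $C$ can therefore normalize the density on $(0, \infty)$, which is why the model is only defined above some $x_{\min} > 0$.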
A common method for obtaining a lower bound is visual inspection, which can be done in two ways. The first is to plot the estimated shape parameter as a function of the lower bound, i.e. the truncation point, identify where the estimate fluctuates and where it becomes stable, and choose the point at which it stabilizes. The second is to draw a log-log plot and identify the point beyond which the PDF or the CDF of the distribution becomes approximately straight. The second approach is adopted in Ahmad and Akgul (2017).
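The first visual diagnostic can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in Ahmad and Akgul (2017): the function names `alpha_mle` and `alpha_by_threshold`, the synthetic sample, and the candidate thresholds are all assumptions made for the example. The exponent is re-estimated by maximum likelihood at each candidate truncation point so that the stability of the estimate can be inspected.

```python
import math

def alpha_mle(values, x_min):
    """Continuous power-law MLE of the exponent, using observations >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0.0:
        raise ValueError("need at least one observation strictly above x_min")
    return 1.0 + len(tail) / log_sum

def alpha_by_threshold(values, thresholds):
    """Return (x_min, alpha_hat, n_tail) for each candidate truncation point."""
    rows = []
    for x_min in thresholds:
        n_tail = sum(1 for x in values if x >= x_min)
        rows.append((x_min, alpha_mle(values, x_min), n_tail))
    return rows

# Hypothetical data: exact Pareto quantiles with exponent 2.5 and lower bound 1.
n = 2000
sample = [(1.0 - (i + 0.5) / n) ** (-1.0 / 1.5) for i in range(n)]
for x_min, alpha_hat, n_tail in alpha_by_threshold(sample, [1.0, 1.5, 2.0, 3.0]):
    print(f"x_min={x_min:4.1f}  alpha_hat={alpha_hat:.3f}  n_tail={n_tail}")
```

On data that follow a power law everywhere, as in this synthetic sample, the estimate is flat across thresholds. On real data one would look for the threshold beyond which the estimate stops drifting, which is exactly the judgment call that makes the visual method subjective.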
To avoid the subjectivity of this visual method, Clauset et al. (2009) offer a more robust and methodical approach to choosing a lower bound. In this paper, we adopt their approach, which is based on minimizing the “distance” between the power-law model and the empirical data. They suggest choosing the value of $x_{\min}$ that makes the model and the distribution of the empirical data as similar as possible. If the chosen $x_{\min}$ is higher than the true value, the sample size is reduced, which may cause statistical fluctuations and make the probability distributions a poor match. If instead the chosen $x_{\min}$ is lower than the true value, the data and the model are fundamentally different, which causes the distributions to differ. To quantify the distance between the two distributions, they use the Kolmogorov-Smirnov (KS) statistic, which is defined as the maximum distance between the CDFs of the data and the fitted model. Clauset et al. (2009) state that $n = 1000$ or more observations are sufficient to obtain good results with this approach. Lastly, they estimate the power-law exponent on the truncated sample using maximum likelihood.
To summarize our approach in this paper, we combine the methodologies in Ahmad and Akgul (2017) and Clauset et al. (2009) to estimate the structural parameters of firm heterogeneity, namely the shape parameter of the productivity distribution and the elasticity of substitution across varieties. We analyze the power-law model in three steps following Clauset et al. (2009). Then, in the fourth step, we impute the value of the elasticity of substitution across varieties based on the power-law model.
1. We estimate the lower bound parameter $x_{\min}$ by minimizing the distance between the empirical distribution and the model distribution based on the Kolmogorov-Smirnov (KS) statistic.
2. Based on the value of $x_{\min}$, we estimate the power-law exponent of firm size and firm productivity using the method of maximum likelihood.
3. We compare the fit of the power-law model with alternative distributions, such as the exponential and lognormal distributions, using a likelihood ratio (LR) test.
4. We use the estimates of the power-law exponents in firm size and firm productivity to impute the elasticity of substitution across varieties in the sector with heterogeneous firms.
We use the ORBIS firm-level database and focus on the manufacturing sector of the US, Japan, and the EU for the years 2012-2016. We then compare the results for this aggregated sector with those for a more disaggregated sector, MVH, in the same regions. For firm size, we use two variables: firm operating revenue and number of employees. To calculate firm productivity levels, we use labor productivity, defined as the firm’s operating revenue divided by its number of employees.
The results show that the power law provides reasonably good fits for the empirical data and returns exponents above unity, satisfying the theoretical constraint that $\gamma > \sigma - 1$. The likelihood ratio tests suggest that the power-law model is a better fit for firm size and firm productivity in manufacturing and the MVH sector than the exponential distribution. However, the likelihood ratio tests against the lognormal distribution are mostly inconclusive, with associated p-values larger than the target value.
The resulting values of elasticity of substitution for the manufacturing sector are found to be in the range of 2.28 – 2.71 when operating revenue is used as a proxy for firm size. There is a slight variation in elasticity values across regions, which may result in variation in trade volume and welfare responses to trade policies. Elasticity values are relatively lower when the number of employees is used as a proxy for firm size. The range is 2.26 – 2.55, reflecting slight variation across regions.
When we focus on the MVH sector alone, we observe that $\sigma$ values vary slightly across regions and are in the range of 2.88-2.92 when operating revenue is used for firm size, and in the range of 2.72-2.97 when the number of employees is used for firm size. Overall, these values are slightly lower than what is found in the literature, suggesting that the estimation strategy is important in obtaining the appropriate values for the theoretical model in question.
Empirical Methodology
A continuous random variable follows a power law if its probability density function (PDF) has the form
$p(x)\,dx = \Pr(x \le X < x + dx) = C x^{-\alpha}\,dx$
where $C$ is a constant, $X$ is the observed value, $x$ is the data that are modeled by the distribution, and $\alpha$ is the corresponding power-law exponent, i.e. the shape parameter of the distribution. As discussed in Clauset et al. (2009), this PDF does not hold for all $x$. In fact, it may diverge as $x \to 0$. Therefore, the power-law model applies only above a lower bound, which is denoted as $x_{\min}$. The resulting PDF of a continuous power-law model is given as
$p(x) = \frac{\alpha - 1}{x_{\min}} \left(\frac{x}{x_{\min}}\right)^{-\alpha}$
where $x_{\min}$ is the lower bound for the power-law model, the data follow a power law for $x \ge x_{\min}$, and $\alpha$ is the corresponding power-law exponent. The associated complementary cumulative distribution function (CCDF, i.e. 1 - CDF) is given as
$P(x) = \Pr(X \ge x) = \int_{x}^{\infty} p(x')\,dx' = \left(\frac{x}{x_{\min}}\right)^{-\alpha + 1}$
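Because the CCDF has a closed form, it can be inverted to draw synthetic observations, which is a convenient way to check an implementation against known parameters. The sketch below is illustrative only: the function names, the seed, and the parameter values ($\alpha = 2.5$, $x_{\min} = 1$) are assumptions chosen for the example and are not estimates from this paper.

```python
import random

def power_law_ccdf(x, alpha, x_min):
    """P(X >= x) for a continuous power law with exponent alpha above x_min."""
    if x < x_min:
        return 1.0
    return (x / x_min) ** (-(alpha - 1.0))

def sample_power_law(n, alpha, x_min, rng):
    """Inverse-transform sampling: solve u = P(X >= x) for x, with u in (0, 1]."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the illustration is reproducible
draws = sample_power_law(20000, alpha=2.5, x_min=1.0, rng=rng)
for x in (1.0, 2.0, 5.0):
    empirical = sum(1 for d in draws if d >= x) / len(draws)
    model = power_law_ccdf(x, 2.5, 1.0)
    print(f"x={x:3.1f}  empirical={empirical:.4f}  model={model:.4f}")
```

The empirical shares should track the model CCDF closely at each evaluation point, up to sampling noise.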
We analyze the power-law model in three steps following Clauset et al. (2009). For the implementation of this methodology, we rely on a power-law fitting library in Python that was developed by Alstott et al. (2014). We now turn to the description of the methodology for each of these three steps.
Estimating the Lower Bound Parameter:
Following Clauset et al. (2009), a numerical method is used to select the $x_{\min}$ that yields the best power-law model for the data. Specifically, for each $x_{\min}$ over some reasonable range, the Kolmogorov-Smirnov (KS) statistic is used to quantify the distance between the empirical distribution and the model distribution. While other measures can also be used for quantifying distance, the KS statistic has been shown by Clauset et al. (2009) to perform well in these estimations. The KS statistic is computed as the maximum distance between the CDFs of the data and the power-law model:
$D = \max_{x \ge x_{\min}} \left| S(x) - P(x) \right|$
where $S(x)$ is the CDF of the data and $P(x)$ is the CDF for the theoretical power-law model, both for observations with value $x \ge x_{\min}$. The estimated $x_{\min}$ is the one that provides the best fit to the data by minimizing the distance $D$.
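The selection rule can be sketched as a direct scan over candidate lower bounds. This is a simplified illustration in pure Python, not the implementation in the fitting library: the function names, the `min_tail` cutoff (a hypothetical guard against fitting on very few observations), and the synthetic data are assumptions. All values are assumed to be strictly positive.

```python
import math

def fit_alpha(tail, x_min):
    """Maximum likelihood exponent for observations already restricted to >= x_min."""
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)

def ks_distance(tail, alpha, x_min):
    """Largest gap between the empirical CDF of the tail and the fitted power-law CDF."""
    n = len(tail)
    worst = 0.0
    for i, x in enumerate(sorted(tail)):
        model_cdf = 1.0 - (x / x_min) ** (-(alpha - 1.0))
        # The empirical CDF jumps from i/n to (i+1)/n at x, so check both sides.
        worst = max(worst, abs(model_cdf - i / n), abs((i + 1) / n - model_cdf))
    return worst

def select_x_min(values, min_tail=10):
    """Scan observed values as candidate lower bounds; keep the one with the smallest D."""
    ordered = sorted(values)
    best = None
    for x_min in sorted(set(ordered)):
        tail = [x for x in ordered if x >= x_min]
        if len(tail) < min_tail or tail[-1] == x_min:
            continue  # too few observations, or no variation above x_min
        alpha = fit_alpha(tail, x_min)
        d = ks_distance(tail, alpha, x_min)
        if best is None or d < best[2]:
            best = (x_min, alpha, d, len(tail))
    return best

# Hypothetical data: 100 evenly spaced "body" values plus 300 exact Pareto
# quantiles (exponent 2.5) above a true lower bound of 2.
body = [0.1 + 0.9 * i / 99 for i in range(100)]
tail_part = [2.0 * (1.0 - (j + 0.5) / 300) ** (-1.0 / 1.5) for j in range(300)]
best = select_x_min(body + tail_part)
print("x_min=%.3f  alpha=%.3f  D=%.4f  n_tail=%d" % best)
```

The scan is quadratic in the number of observations, which is acceptable for a sketch but would be slow on samples of the size used in this paper; a production implementation would restrict or vectorize the candidate set.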
Estimating the Power-Law Exponent:
The first step of estimating the exponent in a power-law model requires the correct identification and estimation of the lower bound parameter, $x_{\min}$. Once the value of $x_{\min}$ is estimated based on the methodology described in (i) above, the power-law exponent is estimated using the method of maximum likelihood.
The log-likelihood function $L$ is given as
$L = \ln p(x \mid \alpha) = \ln \prod_{i=1}^{n} \frac{\alpha - 1}{x_{\min}} \left(\frac{x_i}{x_{\min}}\right)^{-\alpha}$
When we maximize $L$ with respect to $\alpha$ by setting $\frac{\partial L}{\partial \alpha} = 0$, the maximum likelihood estimator is:
$\alpha = 1 + n \left(\sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}}\right)^{-1}$
where $x_i$ for $i = 1, 2, \dots, n$ are the observed values of $x$ such that $x_i \ge x_{\min}$.
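For completeness, the intermediate step between the log-likelihood and the estimator can be written out as follows. This is the standard derivation for the continuous power law with a fixed lower bound.

```latex
\begin{aligned}
L &= n \ln(\alpha - 1) - n \ln x_{\min} - \alpha \sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}}, \\
\frac{\partial L}{\partial \alpha}
  &= \frac{n}{\alpha - 1} - \sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}} = 0
  \quad \Longrightarrow \quad
  \alpha = 1 + n \left( \sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}} \right)^{-1}.
\end{aligned}
```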
Likelihood Ratio Tests for Alternative Distributions:
If the power law is a good fit for the dataset, we should also investigate whether alternative distributions provide a better fit than the power law. In order to make that evaluation, we follow Clauset et al. (2009) and use the likelihood ratio test, which computes the logarithm of the ratio of the likelihoods of the data between two distributions. We compare the power-law model to the exponential and lognormal distributions. A positive value of the likelihood ratio indicates that the power-law model is a better fit compared to the alternative, while a negative value indicates that the power-law model is a worse fit compared to the alternative.
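A minimal sketch of such a test against the exponential alternative is given below. It follows the general form of the test in Clauset et al. (2009), in which the p-value comes from a normal approximation to the sum of pointwise log-likelihood differences, but it is not the fitting library's implementation. The function names and the synthetic tail are assumptions, and the exponential is taken to be shifted so that its support starts at $x_{\min}$.

```python
import math

def power_law_loglik(x, alpha, x_min):
    """Pointwise log-density of the continuous power law above x_min."""
    return math.log((alpha - 1.0) / x_min) - alpha * math.log(x / x_min)

def exponential_loglik(x, rate, x_min):
    """Pointwise log-density of an exponential distribution starting at x_min."""
    return math.log(rate) - rate * (x - x_min)

def likelihood_ratio_test(tail, x_min):
    """Return (R, p): R > 0 favors the power law; p is a two-sided normal p-value."""
    n = len(tail)
    alpha = 1.0 + n / sum(math.log(x / x_min) for x in tail)  # power-law MLE
    rate = 1.0 / (sum(tail) / n - x_min)                      # shifted-exponential MLE
    diffs = [power_law_loglik(x, alpha, x_min) - exponential_loglik(x, rate, x_min)
             for x in tail]
    r = sum(diffs)
    mean = r / n
    variance = sum((d - mean) ** 2 for d in diffs) / n
    p_value = math.erfc(abs(r) / math.sqrt(2.0 * n * variance))
    return r, p_value

# Hypothetical tail: exact Pareto quantiles with exponent 2.5 above x_min = 1.
pareto_tail = [(1.0 - (k + 0.5) / 500) ** (-1.0 / 1.5) for k in range(500)]
r, p = likelihood_ratio_test(pareto_tail, 1.0)
print(f"R={r:.2f}  p={p:.4f}")
```

The same structure applies to the lognormal alternative, except that the lognormal parameters truncated at $x_{\min}$ have no closed-form maximum likelihood solution and must be found numerically.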
Computing the Elasticity of Substitution:
In the firm heterogeneity literature, when firm productivity follows a power-law model, in this case the Pareto distribution, firm size also follows a power-law model with a Pareto distribution, but with a different power-law exponent. di Giovanni et al. (2011) show that these two exponents are connected such that we can infer the value of the elasticity of substitution across varieties from the two estimated power-law exponents. More specifically, if firm productivity follows a power-law model with exponent $\gamma$, then firm size also follows a power-law model with exponent $\zeta = \frac{\gamma}{\sigma - 1}$, where $\sigma$ is the elasticity of substitution. Since $\zeta$ and $\gamma$ can both be estimated directly from firm-level data, this theoretical relationship can be used to infer a value for $\sigma$.
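Rearranging the relationship between the two exponents gives $\sigma = 1 + \frac{\gamma}{\zeta}$. The short sketch below applies this rearrangement; the function name `impute_sigma` and the input values are illustrative assumptions and are not estimates reported in this paper.

```python
def impute_sigma(gamma, zeta):
    """Elasticity of substitution implied by zeta = gamma / (sigma - 1)."""
    if gamma <= 0 or zeta <= 0:
        raise ValueError("both exponents must be positive")
    return 1.0 + gamma / zeta

# Illustrative values only: productivity exponent 3.0, firm-size exponent 2.0.
sigma = impute_sigma(gamma=3.0, zeta=2.0)
print(sigma)  # 2.5
```

Note that the imputed $\sigma$ exceeds 2 exactly when $\gamma > \zeta$, so the relative size of the two estimated exponents determines where the elasticity falls.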
Firm-Level Data
In this paper, we use the ORBIS database to obtain annual firm-level financial data on the manufacturing sector in the US, Japan, and the EU. We restrict the time frame of our study to the 2012-2016 period. While we consider an aggregated manufacturing sector, we also recognize that a more disaggregated sectoral analysis may result in different power-law exponents and higher dispersion. Thus, for comparison, we also conduct our power-law estimation on a more disaggregated sector. We select the MVH sector in GTAP for this analysis, and 4-digit NACE codes are used for sectoral identification. Sector codes include: (i) 2910 Manufacture of motor vehicles, (ii) 2920 Manufacture of bodies (coachwork) for motor vehicles, manufacture of trailers and semi-trailers, and (iii) 2930 Manufacture of parts and accessories for motor vehicles.
ORBIS uses both administrative and public data to provide firm-level information for over 200 million companies worldwide. Several procedures have been undertaken in ORBIS to verify the quality of reported data, including an indexation strategy to ensure the uniqueness of individual firms as well as an analysis to detect unusual variations in financial values between years.
Results
We estimate both the lower bound on the power-law behavior and the power-law exponent for firm size as well as for productivity. As a proxy for firm size, we consider firm-level information on operating revenue and the number of employees. As a proxy for productivity, we use the ratio of operating revenue to number of employees, a standard measure of labor productivity. We also use goodness-of-fit tests to compare the power-law distribution to alternative distributions such as the exponential and lognormal distributions.
In the first section of the results, we focus on an aggregated analysis with the complete manufacturing sector data, where we group the data into four regions: (i) firm-level data for the manufacturing sector in the US, (ii) firm-level data for the manufacturing sector in Japan, (iii) firm-level data for the manufacturing sector in Europe, and (iv) a pooled version of all firms in the manufacturing sector of the US, Japan and Europe. In the second section of the results, we focus on a more disaggregated analysis where only firms in the MVH sector are considered. As in the first section of the results, we again group the data for these firms into four regions and see if this results in different estimates.
Results for the Manufacturing Sector
We first discuss the power-law fits and estimates of power-law exponents in the manufacturing sector. For firm size, we use two proxies: operating revenue and the number of employees.
Table 1 presents the estimates of the lower bound $x_{\min}$ of the power-law distribution for operating revenue in the manufacturing sector of the US, Japan, the EU, and the pooled data. Once the value of $x_{\min}$ is calculated based on the KS statistic in each region, the data are truncated at that lower bound and the power-law exponents are estimated. Table 1 also presents the number of observations in each region before and after truncation. We observe that the lower bound for operating revenue in the US manufacturing sector is dramatically higher than that for the rest of the regions. While the $x_{\min}$ values for Japan, the EU and the pooled data are close ($63,204, $79,146, and $60,393, respectively), it is $4,200,836 for the US. This difference stems from the fact that there are fewer observations in ORBIS for the US (4852 before truncation). According to our estimates, 19% of all available observations in US manufacturing are above the $x_{\min}$ value (926 after truncation). This percentage drops to 11% for Japan, 2% in the EU and around 3% in the pooled region. Overall, only a small share of observations in all regions follows a power law, yet the corresponding firms are much larger in the US compared to other regions.
Table 1: Lower Bound for Pareto Law in the Manufacturing Sector for Operating Revenue





















The fit of the power law to our data sets for operating revenue in the manufacturing sector is shown in Figure 1. The complementary cumulative distribution function (1 - CDF; $p(X_i \ge x)$) in each region is reported based on the $x_{\min}$ values in Table 1. Firm operating revenue seems to follow a power-law model up until the tail of the distribution in each region. At the tail, the power-law fit diverges slightly from the empirical distribution. This may result from the fact that we use all the firms in the database and cannot distinguish between exporters and non-exporters. As discussed in di Giovanni et al. (2011), international trade has a systematic effect on the firm size distribution, such that power-law exponents differ between exporters and non-exporters in their French firm-level data. Since the right tail of the firm size distribution is often associated with exporting firms, the divergence observed in Figure 1 can be explained by the right tail having a different exponent because of differences between exporters and non-exporters. Since the ORBIS database does not provide the exporting status of firms, we pool all of the firm-level information in the database without taking exporting activity into account.
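For readers who wish to reproduce this kind of comparison, the points of an empirical complementary CDF and the matching model values can be computed as in the sketch below. The helper names `empirical_ccdf` and `model_ccdf` and the toy data are assumptions made for illustration; plotting the resulting pairs on log-log axes is left to any plotting tool.

```python
import math
from bisect import bisect_left

def empirical_ccdf(values, x_min):
    """Return (x, share of tail observations >= x) for each distinct tail value."""
    tail = sorted(v for v in values if v >= x_min)
    n = len(tail)
    return [(x, (n - bisect_left(tail, x)) / n) for x in sorted(set(tail))]

def model_ccdf(x, alpha, x_min):
    """Power-law CCDF above x_min for exponent alpha."""
    return (x / x_min) ** (-(alpha - 1.0))

# Toy data (hypothetical): one value below the lower bound and one tie above it.
points = empirical_ccdf([0.5, 1.0, 2.0, 2.0, 3.0, 10.0], x_min=1.0)
for x, share in points:
    # On log-log axes a power law appears as a straight line with slope -(alpha - 1).
    print(f"log10(x)={math.log10(x):.3f}  log10(share)={math.log10(share):.3f}  "
          f"model={model_ccdf(x, 2.0, 1.0):.3f}")
```

A visible gap between the empirical and model values at the largest $x$ corresponds to the kind of tail divergence described above.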
Table 2 reports estimates of the power-law exponents as well as the likelihood ratio tests in each region. We follow Clauset et al. (2009) in choosing the exponential and lognormal as the alternative distributions. The power-law exponent is estimated based on the $x_{\min}$ values reported in Table 1. The resulting $\alpha$ values are found to be above one for each region; since $\alpha = \frac{\gamma}{\sigma - 1}$, this satisfies the mathematical constraint for firm heterogeneity parameters, $\gamma > \sigma - 1$. Specifically, $\alpha$ is 1.913 for the US, 1.841 for Japan, 1.936 for the EU, and 1.822 for all the regions. The results show that the power-law exponents are similar across the regions for this particular sector.
Table 2: Power-Law Estimates and LR Tests for Manufacturing Sector (Operating Revenue)































We also test the fits against alternative distributions and report the log-likelihood ratio (LR) as well as the corresponding p-value in Table 2. Positive values of the LR mean that the power-law model provides a better fit than the alternative distribution, while negative values mean that it provides a worse fit. The reported p-values indicate the significance of the test: a small p-value indicates that the sign of the LR is reliable, so the model with the worse fit can be rejected in favor of the other. In this paper we choose a p-value threshold of 0.1 following Brzezinski (2014), such that if the reported p-value is larger than 0.1, it is not possible to choose between the two models.
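The reading of each (LR, p-value) pair can be summarized as a small decision rule. The function name `interpret_lr` and the returned labels are illustrative assumptions; the 0.1 threshold is the one adopted in this paper.

```python
def interpret_lr(lr, p_value, threshold=0.1):
    """Classify a likelihood ratio test result using the sign of LR and its p-value."""
    if p_value > threshold:
        return "inconclusive"          # the sign of LR is not statistically reliable
    if lr > 0:
        return "power law favored"     # the alternative can be ruled out
    return "alternative favored"       # the power law is the worse fit

# Hypothetical inputs, not values from the tables in this paper.
for lr, p in [(12.4, 0.01), (-3.1, 0.02), (0.8, 0.45)]:
    print(lr, p, interpret_lr(lr, p))
```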
Positive LR values in Column 2 of Table 2 indicate that the power-law model is a better fit than the exponential distribution for all regions. The associated p-values are low, so the exponential distribution can be ruled out as a plausible model for the operating revenue data in manufacturing. However, the LR values for the lognormal distribution are negative for the US, Japan and All regions, which suggests that the power law does not fit better than the lognormal in these data. For the EU, the LR value is positive, but the test is inconclusive since the corresponding p-value is large. Therefore, for the EU and also for the pooled sample, the power law and the lognormal are not distinguishable.
A similar analysis for firm size is conducted with the number of employees in manufacturing for the same four regions. Table 3 reports the estimates of the lower bound $x_{\min}$ of the power law and the number of observations for the number of employees data. The average number of observations above $x_{\min}$ for the number of employees in the manufacturing sector is 1345 (27%), 14459 (14%), 68442 (4%), and 110844 (5%) for the US, Japan, the EU, and All, respectively. Similar to the operating revenue data, we observe that a relatively larger fraction of observations for number of employees is above the lower bound in the US compared to the other regions.
Table 3: Lower Bound for Pareto Law in the Manufacturing Sector for Number of Employees





















The power-law exponent, based on the $x_{\min}$ values, is reported in Table 4. The resulting $\alpha$ values are found to be around 2 for each region. These results satisfy the mathematical constraint for firm heterogeneity parameters, $\gamma > \sigma - 1$. Specifically, $\alpha$ is 1.936 for the US, 2.030 for Japan, 2.036 for the EU, and 1.933 for all the regions. Similar to the operating revenue data, we observe little variation in the power-law exponents across regions.
Table 4: Power-Law Estimates and LR Tests for Manufacturing Sector (Number of Employees)































Comparison against alternative distributions for the number of employees leads to a conclusion similar to the operating revenue case. Positive LR values are observed when the power law is compared against the exponential distribution, which suggests that the power-law model is a better fit for every region. On the other hand, comparison against the lognormal distribution does not produce a systematic conclusion. The LR value is negative for the US with a low p-value, which suggests that the power law is a worse fit than the lognormal. For Japan and All, the LR value is still negative; however, the p-values are large, so the test is inconclusive. In comparison, for the EU, the LR value is positive with a low p-value, suggesting that the power-law model fits these data better than the lognormal.
In order to impute the elasticity of substitution, we also require the power-law exponent for firm productivity. In this paper, we use a standardized measure of firm productivity, labor productivity, calculated as operating revenue divided by the number of employees.
Table 5 reports the estimates of the lower bound $x_{\min}$ of the power law and the number of observations for firm productivity. We observe that the average number of observations above $x_{\min}$ for firm productivity in the manufacturing sector is 3103 (65%), 5598 (6%), 167709 (9%), and 183138 (9%) for the US, Japan, the EU, and All, respectively. A substantially larger fraction of observations for firm productivity is above the lower bound in the US compared to the other regions. This stems from the fact that $x_{\min}$ for the US is lower than in the rest of the regions. Firm productivity in the Japanese manufacturing sector seems to have the largest $x_{\min}$ value.
Table 5: Lower Bound for Pareto Law in the Manufacturing Sector for Productivity





















The resulting $\gamma $ values, shown in Table 6, are in the range of 2.442 – 3.146. These are the values that we use for the shape parameter of Pareto distribution in the firm heterogeneity model.
When we compare the power-law model against alternative distributions in Table 6, LR values are found to be positive for both the exponential and lognormal distributions, suggesting that the power-law model is a better fit than the alternatives. P-values are low for all regions in the exponential case and also for the EU and All in the lognormal case. More generally, we can rule out the exponential distribution as a plausible model for productivity in all regions. We can also rule out the lognormal for the EU and All regions.
Table 6: Power-Law Estimates and Tests for Manufacturing Sector (Productivity)































Table 7 reports the imputed values of the elasticity of substitution across manufacturing varieties. We use the power-law exponent in firm size $\left(\alpha = \frac{\gamma}{\sigma - 1}\right)$ and in productivity, $\gamma$, to impute the elasticity of substitution.
Table 7: Imputed Values of Elasticity of Substitution for the Manufacturing Sector
















Results for the Motor Vehicles and Parts (MVH) Sector





















Similar to the manufacturing sector, Table 9 shows that the exponential distribution can be ruled out as a plausible fit for operating revenue in the MVH sector. LR values are positive for all regions with low p-values. On the other hand, the lognormal distribution cannot be ruled out. While the LR values are negative, indicating that the power-law model is a worse fit than the lognormal for the operating revenue data, the p-values are large, which makes the test inconclusive. Therefore, the power law and the lognormal are not distinguishable in these data. The resulting estimates of the power-law exponents are above 1 for all regions, within the range of 1.348 – 1.571. This range satisfies the mathematical constraint.
Table 9: Power-Law Estimates and LR Tests for the MVH Sector (Operating Revenue)































Table 10 presents the estimates of the lower bound $x_{\min}$ of the power law for the number of employees in the motor vehicles and parts sector. The available data for the US are quite limited, with only 9 observations before truncation, 6 of which are above the lower bound of 1300 employees.
Table 10: Lower Bound for Pareto Law in the MVH Sector (Number of Employees)





















Power-law fits for the number of employees are shown in Figure 5. They are similar to those for the operating revenue data, where the power-law model seems to provide a better fit in the EU and All regions than in the US and Japan. However, only 18 observations are available for the US MVH sector, which may not be enough for a meaningful analysis of the power-law fit.