GE_Gravity Example

This example RMD uses the example data TradeData0014, which is included with this package. The data set consists of a panel of 44 countries trading with one another over the years 2000-2014. The trade data uses aggregated trade flows based on WIOD and information on FTAs is from the NSF-Kellogg database maintained by Scott Baier and Jeff Bergstrand. To find out more about the trade data, see “Timmer, Dietzenbacher, Los, Stehrer, and de Vries (2015).

data(TradeData0014) # loads data included with package
head(TradeData0014)
#>   exporter importer expcode impcode year        trade eu_enlargement other_fta
#> 1      AUS      AUS       1       1 2000 7.047775e+05              0         1
#> 2      AUS      AUS       1       1 2005 1.334877e+06              0         1
#> 3      AUS      AUS       1       1 2010 2.210350e+06              0         1
#> 4      AUS      AUS       1       1 2014 2.436575e+06              0         1
#> 5      AUS      AUT       1       2 2000 6.195446e+01              0         0
#> 6      AUS      AUT       1       2 2005 1.407591e+02              0         0
#>   FTA
#> 1   1
#> 2   1
#> 3   1
#> 4   1
#> 5   0
#> 6   0

Suppose the researcher wishes to use this data set to quantify general equilibrium trade and welfare effects of the EU enlargements that took place between 2000-2014. To first obtain the partial effects of these enlargements on trade flows, a PPML (Poisson pseudo-maximum likelihood) gravity specification may be used. Specifically, to obtain the “partial” estimates of the effects of EU enlargements on trade, we can use the following three-way gravity specification:

\[X_{ijt} = \exp\bigg[\ln(\alpha_{it}) + \ln(\alpha_{jt}) + \ln(\alpha_{ij}) + \beta \cdot FTA_{ijt}\bigg] + e_{ijt}\]

The Stata equivalent of this package uses the ppmlhdfe command created by Correia, Guimarães, & Zylkin (2019) to achieve this. In our example, we will use the ‘alpaca’ library by Amrei Stammann to do the same via the (Fixed-Effect Generalized Linear Model) function.

library(alpaca)  # Needed for partial coefficients

# Generate foreign trade subset
f_trade <- TradeData0014[TradeData0014$exporter != TradeData0014$importer,]

# classify FEs for components to be absorbed (finding variable interactions)
f_trade$exp_year <- interaction(f_trade$expcode, f_trade$year)
f_trade$imp_year <- interaction(f_trade$impcode, f_trade$year)
f_trade$pair     <- interaction(f_trade$impcode, f_trade$expcode)

# Fit gravity model using PPML with exporter-time, importer-time, and exporter-importer FEs
partials <- feglm(
  formula = trade ~ eu_enlargement + other_fta | exp_year + imp_year + pair,
  data    = f_trade,
  family  = poisson()
)$coefficient  # We just need the coefficients for computation

print(round(partials, 3))
#> eu_enlargement      other_fta 
#>          0.224         -0.037

In addition to estimating the effects of EU enlargements on new EU pairs, this example also controls for any other FTAs signed during the period. Each of these variables is coded as a dummy variable that becomes 1 when the agreement goes into effect for a given pair. The estimated coefficient for is 0.224, implying that the expansion of the EU had an average partial effect of \(e^{0.224} − 1 = 25.1\%\) on trade between new EU members and existing members. With clustered standard errors, this estimate is statistically significant at the \(p < .01\) significance level.

Before proceeding, some further pre-processing is needed. Specifically, we need to supply the function with the partial effects of joining the EU estimated above for each of the appropriate pairs. To do this, we create a new variable whose value is either the eu_enlargement partial computed above or 0. We will use the year 2000 as the baseline for the general equilibrium counterfactual, so we assign these partial effects in the year 2000.

For Stata users, the Stata equivalent of this next step is:

sort exporter importer year
by exporter importer: gen new_eu_pair = (eu_enlargement[_N]-eu_enlargement[1])                   
by exporter importer: gen eu_effect = _b[eu_enlargement] * new_eu_pair

In R, we do the following:

# Sort matrix to make it easier to find imp/exp pairs
t_trade <- TradeData0014[order(
  TradeData0014$exporter,
  TradeData0014$importer,
  TradeData0014$year),
]

t_trade$new_eu_pair <- NA
t_trade$eu_effect   <- NA   # this creates a new column that will contain partial effect of EU membership for new EU pairs
i <- 1
# Effect of EU entrance on country based on partial, if entry happened
invisible(by(t_trade, list(t_trade$expcode, t_trade$impcode), function(row) {
  # Was a new EU pair created within time span?
  t_trade[i:(i+nrow(row)-1), "new_eu_pair"] <<- diff(row$eu_enlargement, lag=nrow(row)-1)
  i <<- i + nrow(row)
}))

# If added to EU, give it the computed partial eu_enlargement coefficient (0.224) as the effect
t_trade$eu_effect = t_trade$new_eu_pair * partials[1]

To finalize the data for the counterfactual, we will use the year 2000 as the baseline year. The data we will feed to the ‘ge_gravity’ command looks like this:

# Data to be finally fed to the function (we base the counterfactual on the year 2000.)
ge_baseline_data <- t_trade[t_trade$year == 2000,]   # In example, 1892 Entries, 5676 removed

# head(data) # First 10 rows
ge_baseline_data[sample(1:nrow(ge_baseline_data), 10),]  # 10 random rows
#>      exporter importer expcode impcode year        trade eu_enlargement
#> 6749      SVK      FRA      39      16 2000 3.525703e+02              0
#> 2361      EST      HRV      14      19 2000 4.696484e-01              0
#> 4373      JPN      RUS      25      38 2000 7.188399e+02              0
#> 2877      GBR      FRA      17      16 2000 3.085328e+04              1
#> 2801      FRA      SWE      16      41 2000 5.576677e+03              1
#> 4781      LUX      CHN      28       8 2000 1.050690e+02              0
#> 7553      TWN      SWE      43      41 2000 6.227414e+02              0
#> 2205      ESP      ITA      13      24 2000 1.172864e+04              1
#> 7165      SWE      NLD      41      32 2000 3.927551e+03              1
#> 5913      POL      LTU      34      27 2000 3.654052e+02              0
#>      other_fta FTA new_eu_pair eu_effect
#> 6749         1   1           1  0.224249
#> 2361         0   0           1  0.224249
#> 4373         0   0           0  0.000000
#> 2877         0   1           0  0.000000
#> 2801         0   1           0  0.000000
#> 4781         0   0           0  0.000000
#> 7553         0   0           0  0.000000
#> 2205         0   1           0  0.000000
#> 7165         0   1           0  0.000000
#> 5913         1   1           1  0.224249

Now that we have the data we need, we can run the ‘ge_gravity’ function as follows:

ge_results <- ge_gravity(
  exp_id = ge_baseline_data$expcode,    # Origin country associated with each observation
  imp_id = ge_baseline_data$impcode,    # Destination country associated with each observation
  flows  = ge_baseline_data$trade,      # Observed trade flows in the data for the year being used as the baseline
  beta   = ge_baseline_data$eu_effect,  # “Partial” change in trade, obtained as coefficient from gravity estimation
  theta  = 4,               # Trade elasticity
  mult   = FALSE,           # Assume trade balance is an additive component of national expenditure
  data   = ge_baseline_data
)
ge_results[sample(1:nrow(ge_results), 10),c(1:2,5:6,12:16)] # 10 random rows
#>      exporter importer year      trade  new_trade   welfare real_wage  nom_wage
#> 1913      DEU      SVK 2000 2723.95149 3269.19970 1.0007249 1.0007204 1.0002367
#> 3685      IDN      TUR 2000  206.00501  205.85536 0.9999865 0.9999931 0.9998933
#> 2501      FIN      CZE 2000  217.43389  264.63294 1.0004615 1.0004125 1.0008138
#> 4709      LTU      POL 2000  169.35396  207.88813 1.0065244 1.0064727 0.9993151
#> 3053      GRC      FRA 2000  271.52443  271.85529 1.0002792 1.0002622 0.9998333
#> 949       CAN      GRC 2000  207.26002  206.92770 0.9999912 0.9999941 0.9999349
#> 4301      JPN      HUN 2000  890.71506  862.04180 0.9999971 0.9999982 0.9999063
#> 2973      GBR      SVN 2000  283.02320  333.17404 1.0001393 1.0001392 1.0001324
#> 5749      NOR      MEX 2000   96.39469   96.31901 1.0002589 1.0002422 1.0001253
#> 6025      PRT      DEU 2000 3186.03737 3182.00629 1.0000813 1.0000725 0.9998933
#>      price_index
#> 1913   0.9995167
#> 3685   0.9999002
#> 2501   1.0004012
#> 4709   0.9928884
#> 3053   0.9995712
#> 949    0.9999408
#> 4301   0.9999082
#> 2973   0.9999932
#> 5749   0.9998831
#> 6025   0.9998208

This assumes a standard trade elasticity value of \(\theta = 4\). The input for \(\beta\) is given by the variable we created called eu_effect, which is equal to 0.224 for new EU pairs formed during the period and equal to 0 otherwise. Because of the small size of the sample, it solves almost instantly. Sample results for the first 10 rows in the data are shown above for the counterfactual trade level and the associated changes in welfare, real wages, nomimal wages, and the local price index. Click on the right arrow to scroll to the right if they are not all displayed above. For the latter four variables, the number shown is the result computed for the exporting country. Unsurprisingly, the new EU members (Bulgaria, Croatia, Czech Republic, Estonia, Hungary, Latvia, Lithuania, Malta, Poland, Romania, Slovakia, and Slovenia) realize the largest welfare gains from their joining the EU, with existing EU countries also gaining. All countries not included in the agreement experience small losses due to trade diversion, with the largest losses accruing to Russia.

We can also change how trade imbalances enter the model. The default (exibited above with mult = FALSE) is to assume that they enter expenditure additively (i.e., \(E_j = Y_j + D_j\)), but one can also change the model so that expenditure is instead a fixed multiple of income (i.e., let \(E_j = \delta_j Y_j\).) by setting mult = TRUE. While using multiplicative imbalances instead of additive balances changes the results slightly, they are still qualitatively very similar.

An important point about the above exercises is that the initial partial effect is estimated with some uncertainty. The GE results that were calculated may paint a misleading picture because they do not take this uncertainty into account. For this reason, it is considered good practice to use a bootstrap method to construct confidence intervals for the GE calculations. This type of procedure can be made easier using ge_gravity function:

library(data.table)  # needed below for bootstrap procedure

# Helper function for shuffling the data with replacement.
# This allows us to shuffle by pair rather than treating each observation as independent.
get_bootdata <- function(data=list(),id) {
  uniq_id <- levels(factor(data[,id]))
  draw    <- sort(sample(uniq_id,replace=TRUE))
  draw  <- data.table(draw)[,.N,by=draw]  # count duplicate pairs in bootstrap sample
  colnames(draw)[1] <- id
  boot_data  <- merge(data,draw,"pair",all.x=FALSE,all.y=FALSE)
  boot_index <- rep(row.names(boot_data), boot_data$N)  # replicates pairs drawn multiple times after merge
  boot_data  <- boot_data[matrix(boot_index), ]
  boot_data$rep <- (as.numeric(rownames(boot_data)) %% 1)*10+1
  return(data.frame(boot_data))
}

# set up for pair bootstrap
set.seed(12345)
bootreps      <- 20
TradeData0014[,"pair"] <- interaction(TradeData0014$expcode,TradeData0014$impcode) # This is the ID we will use for resampling.

# Initialize matrices for saving results
x             <- TradeData0014[,c("eu_enlargement","other_fta")]
save_partials <- matrix(nrow = bootreps, ncol = ncol(x), dimnames = list(1:bootreps,colnames(x)))
GE_effects    <- ge_results[,11:15]
save_GE       <- matrix(nrow = bootreps, ncol = ncol(GE_effects), dimnames = list(1:bootreps,colnames(GE_effects)))

# generate bootstrapped gravity estimates using alpaca's feglm function (20 boot reps)
library(alpaca)
for (b in 1:bootreps) {
    
    # This step shuffles the data using the get_bootdata() function defined above
    boot_data <- get_bootdata(TradeData0014,"pair")
    
    # These next few steps are exactly the same as the ones we used above to estimate the partial effects
    
    # Generate foreign trade subset
    f_trade <- boot_data[boot_data$exporter != boot_data$importer,]

    # Normalize trade data to unit interval
    f_trade$trade <- f_trade$trade / max(f_trade$trade)

    # classify FEs to be absorbed
    f_trade$exp_year <- interaction(f_trade$expcode, f_trade$year)
    f_trade$imp_year <- interaction(f_trade$impcode, f_trade$year)
    f_trade$pair     <- interaction(f_trade$impcode, f_trade$expcode)
    
    # Estimate and save partial effects
    save_partials[b,] <- feglm(
    formula = trade ~ eu_enlargement + other_fta | exp_year + imp_year + pair,
    data    = f_trade,
    family  = poisson()
    )$coefficient  # We just need the coefficients for computation
}


# Obtain bootstrapped GE results based on bootstrapped partial effects
bootstrap_GE_results <- ge_baseline_data
for (b in 1:bootreps) {
  
  # set up baseline data using the estimated partial effect from bootstrap b
  boot_ge_data <- ge_baseline_data
  boot_ge_data$eu_effect <- save_partials[b,1] * boot_ge_data$new_eu_pair 
  
  # run GE_gravity
  temp <- ge_gravity(
  exp_id = boot_ge_data$expcode,    # Origin country associated with each observation
  imp_id = boot_ge_data$impcode,    # Destination country associated with each observation
  flows  = boot_ge_data$trade,      # Observed trade flows in the data for the year being used as the baseline
  beta   = boot_ge_data$eu_effect,  # “Partial” change in trade, obtained as coefficient from gravity estimation
  theta  = 4,               # Trade elasticity
  mult   = FALSE,           # Assume trade balance is an additive component of national expenditure
  data   = boot_ge_data
  )
  
  # store results
  bootstrap_GE_results[,paste0("welf",b)] <- temp[,"welfare"]
  bootstrap_GE_results[,paste0("trade",b)] <- temp[,"new_trade"]
}

# get bootstrapped means, SDs, and 95% CIs for partial effects
colMeans(save_partials)
#> eu_enlargement      other_fta 
#>     0.24045195    -0.03465284
apply(save_partials, 2, sd)
#> eu_enlargement      other_fta 
#>     0.07753165     0.05578904
apply(save_partials, 2, function(x) quantile(x, probs = .975))
#> eu_enlargement      other_fta 
#>     0.34988358     0.05572457
apply(save_partials, 2, function(x) quantile(x, probs = .025))
#> eu_enlargement      other_fta 
#>     0.09867022    -0.11837670

# Get 95% CIs for GE effects
temp <- bootstrap_GE_results[,12:51]
temp <- temp[,order(colnames(temp))]

bootstrap_GE_results[,"lb_welf"] <- apply(temp[,21:40], 1, function(x) quantile(x, probs = .025))
bootstrap_GE_results[,"ub_welf"] <- apply(temp[,21:40], 1, function(x) quantile(x, probs = .975))
bootstrap_GE_results[,"lb_trade"] <- apply(temp[,1:20], 1, function(x) quantile(x, probs = .025))
bootstrap_GE_results[,"ub_trade"] <- apply(temp[,1:20], 1, function(x) quantile(x, probs = .975))

disp_cols <- c("exporter","importer","year","trade","lb_welf","ub_welf","lb_trade","ub_trade")
bootstrap_GE_results[sample(1:nrow(ge_results), 10),disp_cols] # 10 random rows; GE welfare effects are for *exporter*
#>      exporter importer year        trade   lb_welf   ub_welf     lb_trade
#> 1149      CHE      ITA 2000 7.749205e+03 0.9999020 0.9999746 7.752873e+03
#> 4017      IRL      ROW 2000 1.639069e+04 1.0000977 1.0003829 1.635859e+04
#> 1557      CYP      RUS 2000 1.165478e+01 1.0035637 1.0141773 1.174513e+01
#> 3605      IDN      IND 2000 1.156623e+03 0.9999779 0.9999943 1.156626e+03
#> 5197      MEX      ITA 2000 3.174025e+02 0.9999928 0.9999981 3.175410e+02
#> 5449      MLT      TWN 2000 9.318681e-01 1.0063494 1.0255347 9.040972e-01
#> 1093      CHE      CZE 2000 3.707880e+02 0.9999020 0.9999746 3.564777e+02
#> 497       BEL      ROW 2000 2.142915e+04 1.0001526 1.0006022 2.138667e+04
#> 7581      USA      BGR 2000 1.377859e+02 0.9999968 0.9999992 1.225898e+02
#> 6737      SVK      ESP 2000 7.114036e+01 1.0038431 1.0153599 7.871396e+01
#>          ub_trade
#> 1149 7.761856e+03
#> 4017 1.638199e+04
#> 1557 1.193969e+01
#> 3605 1.156631e+03
#> 5197 3.178676e+02
#> 5449 9.254744e-01
#> 1093 3.670091e+02
#> 497  2.141733e+04
#> 7581 1.334064e+02
#> 6737 1.012978e+02

Note again that the displayed bounds on welfare estimates refer to welfare changes for the exporting country.

Tom Zylkin, Vadim Kudlay