Frequently Asked Questions (FAQs) about Data

INTERNATIONAL TRADE AND PRODUCTION DATABASE FAQ

How to reduce the size of the ITPD-E?

Release 2 of the ITPD-E is available for download without country or industry names, which significantly reduces the size of the data file. You can also download data from Release 2 by industry. Industry-specific files are small enough to be opened in Microsoft Excel.

For Release 1, you can remove the long string variables to reduce the size of the database in memory and on disk. For example, the complete database in Stata's .dta file format is about 6.8 GB. If the long string variables (country names and industry names) are removed, the resulting .dta file can be as small as 1.1 GB. Countries and industries remain identified by their codes.
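
For users working outside Stata, the same idea can be sketched in Python using only the standard library. The column names exporter_name, importer_name, and industry_descr are hypothetical placeholders for the long string variables, and the sample row is made up; check the header of your own file before adapting this.

```python
import csv
import io

# Hypothetical names for the long string columns; check your file's header.
LONG_STRING_COLUMNS = {"exporter_name", "importer_name", "industry_descr"}

def drop_columns(src, dst, drop=LONG_STRING_COLUMNS):
    """Stream rows from src to dst, omitting the columns listed in drop."""
    reader = csv.DictReader(src)
    kept = [name for name in reader.fieldnames if name not in drop]
    # extrasaction="ignore" silently skips the dropped keys in each row.
    writer = csv.DictWriter(dst, fieldnames=kept, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return kept

# Tiny in-memory stand-in for the real file (values are made up).
sample = io.StringIO(
    "exporter_iso3,exporter_name,importer_iso3,importer_name,"
    "year,industry_id,industry_descr,trade\n"
    "USA,United States,CAN,Canada,2010,1,Wheat,12.5\n"
)
slim = io.StringIO()
kept_columns = drop_columns(sample, slim)
print(slim.getvalue())
```

Because rows are streamed one at a time, the full file never has to fit in memory. With real files, pass handles opened with open(path, newline="") in place of the StringIO objects.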

How to merge the ITPD-E with DGD Version 2?

The steps are (1) read in DGD for 2000-2016, (2) merge ITPD-E with DGD on exporter_iso3==iso3_o, importer_iso3==iso3_d, and year.
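
The merge in step (2) can be sketched in plain Python as a left join on the three key columns. The key names (exporter_iso3, importer_iso3, iso3_o, iso3_d, year) come from the steps above; industry_id, trade, gravity_var, the country code QQQ, and all values are made-up placeholders.

```python
def merge_itpde_with_dgd(itpde_rows, dgd_rows):
    """Left-join ITPD-E rows to DGD rows on exporter, importer, and year."""
    # Index DGD by its identifying key (iso3_o, iso3_d, year).
    dgd_by_key = {(r["iso3_o"], r["iso3_d"], r["year"]): r for r in dgd_rows}
    merged = []
    for row in itpde_rows:
        key = (row["exporter_iso3"], row["importer_iso3"], row["year"])
        match = dgd_by_key.get(key)
        # Copy the DGD variables other than the key columns onto the trade row.
        extra = {k: v for k, v in (match or {}).items()
                 if k not in ("iso3_o", "iso3_d", "year")}
        merged.append({**row, **extra, "matched": match is not None})
    return merged

# Made-up rows; "QQQ" is a deliberately unmatched placeholder code.
itpde = [
    {"exporter_iso3": "USA", "importer_iso3": "CAN", "year": 2010,
     "industry_id": 1, "trade": 12.5},
    {"exporter_iso3": "USA", "importer_iso3": "CAN", "year": 2010,
     "industry_id": 2, "trade": 3.0},
    {"exporter_iso3": "USA", "importer_iso3": "QQQ", "year": 2010,
     "industry_id": 1, "trade": 0.1},
]
dgd = [{"iso3_o": "USA", "iso3_d": "CAN", "year": 2010, "gravity_var": 1}]

merged = merge_itpde_with_dgd(itpde, dgd)
unmatched = [r for r in merged if not r["matched"]]
print(len(merged), "rows,", len(unmatched), "unmatched")
```

The join is many-to-one: several industry-level ITPD-E rows for the same country pair and year pick up the same DGD record. Inspecting the unmatched rows after the merge is a quick way to spot code mismatches.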

How to merge the ITPD-E with DGD?

The procedure is easy for the most recent releases of the ITPD-E and DGD since both identify countries by dynamic country codes. Merge the two datasets on exporter (origin), importer (destination), and year.

For the ITPD-E-R01 and DGD Version 1, the steps are (1) read in DGD for 1993-2004, (2) append DGD for 2005-2016, (3) rename all instances of ROM to ROU in DGD, (4) merge ITPD-E with DGD on exporter_iso3==iso3_o, importer_iso3==iso3_d, and year.
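
Steps (1) to (3) for DGD Version 1 can be sketched as follows, with a small key lookup standing in for the merge in step (4). This is an illustration on made-up rows, not the procedure's official implementation; gravity_var is a hypothetical placeholder for any DGD variable.

```python
ROMANIA_RECODE = {"ROM": "ROU"}

def prepare_dgd_v1(block_1993_2004, block_2005_2016):
    """Append the two DGD year blocks and rename ROM to ROU on both sides."""
    combined = list(block_1993_2004) + list(block_2005_2016)
    return [
        {**r,
         "iso3_o": ROMANIA_RECODE.get(r["iso3_o"], r["iso3_o"]),
         "iso3_d": ROMANIA_RECODE.get(r["iso3_d"], r["iso3_d"])}
        for r in combined
    ]

# Made-up rows standing in for the two DGD Version 1 files.
early = [{"iso3_o": "ROM", "iso3_d": "DEU", "year": 2001, "gravity_var": 7}]
late = [{"iso3_o": "ROU", "iso3_d": "DEU", "year": 2010, "gravity_var": 8}]
dgd = prepare_dgd_v1(early, late)

# Step (4): look up an ITPD-E row by (exporter, importer, year).
dgd_by_key = {(r["iso3_o"], r["iso3_d"], r["year"]): r for r in dgd}
itpde_row = {"exporter_iso3": "ROU", "importer_iso3": "DEU", "year": 2001}
match = dgd_by_key.get(
    (itpde_row["exporter_iso3"], itpde_row["importer_iso3"], itpde_row["year"])
)
print(match)
```

Without the rename in step (3), the 2001 row would stay keyed under ROM and the lookup for ROU would come back empty.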

DYNAMIC GRAVITY DATASET FAQ

How many countries and years does the dataset cover?

The dataset provides annual records from 1948 through 2019 for a total of 286 countries and territories. Within each year, however, the number of countries varies with historical changes to the global landscape, such as the merging or separation of countries. For example, Slovakia does not appear in the data until the year in which it gained independence.

Does it include trade data?

No. The dataset was designed to be a companion for trade data but does not include any trade data itself. While constructing the Dynamic Gravity dataset, we worked to ensure that it would match easily to standard sources of bilateral trade data, such as Comtrade. In most cases, the iso3_o and iso3_d codes should match without issue to bilateral trade data (see below for the exceptions). Additionally, it combines seamlessly with the ITPD-E, which includes international and domestic trade flows.

Will the dataset be updated in the future?

We intend to update the dataset regularly with recently reported data. Updates for some variables are contingent on the availability of third-party sources.

How do I keep track of the dataset versions?

The initial release of the dataset is version 1.0. The documentation associated with that release is documentation version 1.00.
Minor updates to the documentation associated with the dataset version 1.0 will receive version numbers 1.01, 1.02, etc.
Minor updates to the dataset version 1.0 will receive version numbers 1.1, 1.2, etc. Documentation associated with these updates to the dataset will be numbered version 1.10, 1.11 ... 1.20, 1.21, etc.
Major updates to the dataset will be released as version 2.0, 3.0, etc., with the associated documentation numbered version 2.00, 2.01, etc., followed by version 3.00, 3.01, etc.
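
Read literally, the scheme appends one documentation revision digit to the dataset version. A minimal sketch of that reading follows; it assumes the minor version and the documentation revision each stay within a single digit (0 to 9), which the scheme above does not state explicitly. The function names are illustrative only.

```python
def documentation_version(dataset_version: str, revision: int) -> str:
    """Build a documentation version from a dataset version and a revision."""
    major, minor = dataset_version.split(".")
    # Assumption: single-digit minor version and revision.
    if len(minor) != 1 or not 0 <= revision <= 9:
        raise ValueError("sketch assumes single-digit minor version and revision")
    return f"{major}.{minor}{revision}"

def dataset_version_of(doc_version: str) -> str:
    """Recover the dataset version that a documentation version belongs to."""
    major, digits = doc_version.split(".")
    if len(digits) != 2:
        raise ValueError("expected a documentation version such as 1.21")
    return f"{major}.{digits[0]}"

print(documentation_version("1.0", 0))  # 1.00
print(documentation_version("1.2", 1))  # 1.21
print(dataset_version_of("3.01"))       # 3.0
```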

Can I use the dataset with Stata?

Yes. The dataset is stored in comma-delimited format, which Stata's import delimited command reads directly.

How do I merge your Version 2 data with Comtrade or WITS trade data?

In most cases, the iso3_o and iso3_d codes should match without issue to bilateral trade data. The combination of iso3_o, iso3_d, and year uniquely identifies an observation in the dataset. However, the table below lists the countries we are aware of whose codes differ across the Dynamic Gravity, Comtrade, and WITS datasets.

Country name	Dynamic Gravity	Comtrade	WITS
Congo, Dem. Rep. of the	COD	COD	ZAR
East Timor	TLS	TLS	TMP
Montenegro	MNE	MNE	MNT
Neutral Zone	NTZ	n/a	NZE
Pacific Islands	PCI	PCI	PCE
Romania	ROU	ROU	ROM
Serbia	SRB	SRB	SER
Sikkim	SKM	n/a	SIK
Sudan	SDN	SDN	SUD
Taiwan	TWN	490	OAS
U.S. Miscellaneous Pacific Islands	PUS	n/a	USP
Vietnam, South	VNM	VNM	SVR
Yemen, South	YMD	YMD	YDR
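
One way to apply the table is to recode the trade data to the Dynamic Gravity codes before merging. The sketch below copies the differing entries from the table; the function name and the source labels "wits" and "comtrade" are illustrative, not part of either dataset.

```python
# WITS code -> Dynamic Gravity code, copied from the table above.
WITS_TO_DYNAMIC_GRAVITY = {
    "ZAR": "COD", "TMP": "TLS", "MNT": "MNE", "NZE": "NTZ", "PCE": "PCI",
    "ROM": "ROU", "SER": "SRB", "SIK": "SKM", "SUD": "SDN", "OAS": "TWN",
    "USP": "PUS", "SVR": "VNM", "YDR": "YMD",
}
# Comtrade differs only for Taiwan in the table above.
COMTRADE_TO_DYNAMIC_GRAVITY = {"490": "TWN"}

def to_dynamic_gravity(code: str, source: str) -> str:
    """Map a WITS or Comtrade country code to its Dynamic Gravity code."""
    tables = {"wits": WITS_TO_DYNAMIC_GRAVITY,
              "comtrade": COMTRADE_TO_DYNAMIC_GRAVITY}
    # Codes that are not listed in the table are returned unchanged.
    return tables[source].get(code, code)

print(to_dynamic_gravity("ZAR", "wits"))      # COD
print(to_dynamic_gravity("490", "comtrade"))  # TWN
print(to_dynamic_gravity("USA", "wits"))      # USA
```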

How do I merge your Version 1 data with Comtrade or WITS trade data?

Country name	Dynamic Gravity	Comtrade	WITS
Congo, Dem. Rep. of the	ZAR (1971-1997) COD (1998-2016)	COD	ZAR
East Timor	TLS	TLS	TMP
Montenegro	MNE	MNE	MNT
Neutral Zone	NTZ	n/a	NZE
Pacific Islands	PCI	PCI	PCE
Romania	ROM (1948-2001) ROU (2002-2016)	ROU	ROM
Serbia	SRB	SRB	SER
Sikkim	SKM	n/a	SIK
Sudan	SDN	SDN	SUD
Taiwan	TWN	490	OAS
U.S. Miscellaneous Pacific Islands	PUS	n/a	USP
Vietnam, South	VNM	VNM	SVR
Yemen, South	YMD	YMD	YDR

How do I merge this dataset with other datasets?

The Dynamic Gravity dataset uses ISO alpha-3 codes to identify countries. A list of country names, ISO alpha-3 codes, and years of existence of each country in the dataset can be downloaded here. Note that many international organizations, such as the International Monetary Fund, the World Bank, and the United Nations, use their own country codes, which may not always match those in the Dynamic Gravity dataset.

What is the difference between the identifiers iso3 and dynamic_code?

Both codes identify countries within the dataset, and both iso3 and dynamic_code (combined with year) uniquely identify records. The iso3 variables use the standard International Organization for Standardization (ISO) alpha-3 codes and are likely the most useful identifier for most purposes, such as merging with Comtrade data. The dynamic_code variables provide a slight refinement to the ISO codes by tracking situations in which countries undergo a significant alteration to their geography and composition but retain the same ISO code. In these cases (such as Pakistan's split into Pakistan and Bangladesh), the dynamic_code records the change by altering the code (post-split Pakistan is given the dynamic_code "PAK.X").

(This question is only applicable to Version 1) What countries should be given special consideration when merging with other data?

Democratic Republic of the Congo: The Democratic Republic of the Congo was known as Zaire between 1971 and 1997 and used the ISO alpha-3 code "ZAR" during that period. In 1998, the country took its current name and changed its ISO alpha-3 code to "COD". When combining with other data, it is worthwhile to check which code(s) the other dataset uses to ensure that the Democratic Republic of the Congo is able to match.

Romania: Romania has used two different ISO alpha-3 codes over time: "ROM" and "ROU". The first code, "ROM", was used until 2001. Beginning in 2002, the code "ROU" was used instead. The Dynamic Gravity dataset follows this standard, using "ROM" for 1948-2001 and "ROU" for 2002-present. When combining with other data, it is worthwhile to check which code(s) the other dataset uses to ensure that Romania is able to match.
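
When the other dataset uses a single code for all years (as Comtrade does with "COD" and "ROU"), the year-dependent Version 1 codes can be reproduced with a small lookup. The sketch below encodes only the year ranges stated above and leaves every other code and year unchanged; the function name is illustrative.

```python
# (current code) -> (Version 1 code, first year, last year), from the text above.
V1_HISTORICAL_CODES = {
    "COD": ("ZAR", 1971, 1997),
    "ROU": ("ROM", 1948, 2001),
}

def dynamic_gravity_v1_code(code: str, year: int) -> str:
    """Return the code Dynamic Gravity Version 1 uses for this country and year."""
    if code in V1_HISTORICAL_CODES:
        old_code, first_year, last_year = V1_HISTORICAL_CODES[code]
        if first_year <= year <= last_year:
            return old_code
    # Outside the listed ranges, or for any other country, keep the code as is.
    return code

print(dynamic_gravity_v1_code("COD", 1990))  # ZAR
print(dynamic_gravity_v1_code("ROU", 2002))  # ROU
```

Applying this function to both the exporter and importer columns of the other dataset before merging lets the Democratic Republic of the Congo and Romania match in every year.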

How does the Dynamic Gravity Dataset differ from the one provided by CEPII?

The Dynamic Gravity dataset seeks to make several improvements over the commonly used dataset made available by CEPII. Specifically, we have aimed to closely follow the ways in which countries and borders changed between 1948 and 2019, to provide more information about countries and their relationships, and to offer greater transparency about the sources and methods underlying the construction of the variables. For more details, please see the paper, which discusses the differences and similarities between the two datasets in more depth.
