EOIR Processed Case Data codebook

We produce a processed removal cases dataset derived from the EOIR Case Data, which EOIR publishes monthly in its FOIA Library. The goal is to provide a simplified, analysis-ready version of the dataset in formats compatible with standard data tools. We aim to update this dataset promptly after each monthly EOIR release.

We provide a description of data processing steps and a link to the R code below.

Differences between ICE and EOIR data

We highlight key differences between this dataset, which comes from the EOIR immigration courts, and the Immigration and Customs Enforcement (ICE) datasets. Both ICE and EOIR record information on removal proceedings, but the datasets cover partly different populations. Where the populations overlap, the datasets cannot be matched at the individual level, and they record different pieces of information.

ICE arrests, detains, and deports many people whose cases never reach immigration court. When ICE conducts expedited removals (until recently limited to border removals, but now potentially extending to people who have lived in the United States for up to two years), those cases generally do not reach immigration court unless the noncitizen passes a credible fear interview. In addition, when ICE arrests people who already have a final order of removal, those arrests, detentions and deportations leave no trace in immigration court (although there may be older records of the immigration court proceedings that led to that order). There are two limited exceptions. First, if the noncitizen passes a reasonable fear interview in reinstatement of removal proceedings, so-called withholding-only proceedings result. These are rare relative to ordinary removal proceedings, and we exclude them from this simplified dataset. Second, the noncitizen might file a motion to reopen or rescind their removal order. These motions are relatively rare as well, but the cases involved are not excluded from the dataset; advanced users can identify them by joining the motions table in the raw dataset.

The EOIR dataset also includes cases of people who never have contact with ICE. Both Customs and Border Protection and US Citizenship and Immigration Services can initiate removal proceedings, and if those proceedings do not result in arrest, detention or deportation, the ICE dataset will contain no records of them.

Even where the populations in the two datasets overlap, the datasets contain differing information. Most important, although both datasets include some location information, the information is not analogous. ICE tracks the states in which its arrests take place, and EOIR also tracks the state of the noncitizen. But if the noncitizen is detained and transferred to a detention center in a different state, the EOIR dataset will likely contain the state of that subsequent detention center rather than the state of the noncitizen’s residence. And the ICE detention dataset records the detention center in which each noncitizen is held, whereas the EOIR dataset records the immigration court and hearing location of the removal proceeding, which may not always correspond to the detention center.

There are other key differences in which pieces of information each dataset records. ICE records criminal convictions of noncitizens; EOIR does not. EOIR records the immigration charges (the civil grounds for deportation) for each person, but ICE records these charges only for people it actually deports, and records only one charge per person.

Finally, ICE records actual removals, whereas EOIR records immigration court outcomes. When a person is detained, a removal decision in the EOIR data almost always leads to an actual removal, but when a person is not detained, a removal order does not necessarily result in an actual removal. And although EOIR records detention date information, that information is often missing even when EOIR’s dataset indicates that a person was detained throughout their proceedings.

Notes on how we processed the EOIR Case Data

The EOIR database is spread across multiple separate tables, and as a result it can be hard to use. We have therefore prepared a processed version that pulls limited information from several tables and combines it in a single table (spreadsheet) with one row for each removal case. For simplicity, we exclude case types other than removal cases. The single table makes possible a range of analyses that would otherwise have required joining tables in the original dataset. For example, each case row now indicates whether an appeal was filed and what the outcome of that appeal was; previously, this information was available only in the separate appeals table. The composite table includes information on bond outcomes, applications for relief, immigration charges (the grounds on which the government is seeking deportation), appeals, and detention.
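As a rough illustration of what attaching appeal information to each case row involves, here is a minimal Python sketch of a left join. It is not the project's R code: the toy records and the field names (case_id, appeal_outcome, appeal_filed) are hypothetical, and it assumes the appeals have already been reduced to at most one row per case.

```python
# Toy stand-ins for the case table and the appeals table.
# All field names and values here are hypothetical.
cases = [{"case_id": 1}, {"case_id": 2}, {"case_id": 3}]
appeals = [{"case_id": 2, "appeal_outcome": "Dismissed"}]

# Index appeals by case so each case can look up its own appeal, if any.
appeal_by_case = {a["case_id"]: a for a in appeals}


def add_appeal_columns(case):
    """Return the case row with appeal columns attached (left join)."""
    appeal = appeal_by_case.get(case["case_id"])
    return {
        **case,
        "appeal_filed": appeal is not None,
        "appeal_outcome": appeal["appeal_outcome"] if appeal else None,
    }


# Every case keeps exactly one row, whether or not an appeal exists.
flat = [add_appeal_columns(c) for c in cases]
```

A left join is used so that cases without an appeal stay in the table, with appeal_filed set to False rather than being dropped.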

In addition, we have tried to make this dataset more user-friendly by renaming fields in order to make their meanings clearer and by joining the dataset’s so-called lookup tables, which decode EOIR’s difficult-to-understand abbreviations (for example, for court cities and nationalities).
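Decoding an abbreviation with a lookup table amounts to a keyed lookup. The Python sketch below shows the idea under stated assumptions: the codes, the names, and the field names are made up for illustration and are not taken from EOIR's lookup tables.

```python
# Hypothetical lookup table mapping court city codes to readable names.
court_lookup = {"NYC": "New York", "MIA": "Miami"}

# Hypothetical case rows carrying only the abbreviated code.
rows = [
    {"case_id": 1, "base_city_code": "NYC"},
    {"case_id": 2, "base_city_code": "ZZZ"},  # code absent from the lookup
]


def decode(rows, lookup, code_field, name_field):
    """Add a readable name column; unmatched codes get None."""
    return [{**r, name_field: lookup.get(r[code_field])} for r in rows]


decoded = decode(rows, court_lookup, "base_city_code", "base_city_name")
```

Keeping the original code alongside the decoded name lets users spot codes that have no match in the lookup table.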

In this simplified version of the data, we have necessarily omitted important information; we encourage users to consult the original data, the codebook for the original data, and the code that documents every step we took to create the simplified data from the original. We welcome feedback on these choices.

Each row in the processed dataset is a case, which corresponds to what an immigration lawyer would call a removal proceeding; a Notice to Appear initiates a case. In order to produce a single row for each case, we begin with A_TblCase, which itself has one row per case. We keep all cases that have corresponding records in the B_TblProceeding table and a Notice to Appear date on or after October 1, 1997 (the beginning of fiscal year 1998, the first full year after the Illegal Immigration Reform and Immigrant Responsibility Act of 1996, known as IIRIRA, took effect). Because we include only cases initiated after IIRIRA, the dataset includes only removal case types and not the legacy exclusion/deportation case types. For each row, we keep several variables from A_TblCase, and we then add variables from B_TblProceeding, B_TblProceedCharges, D_TblAssociatedBond, tbl_Court_Appln, tbl_CustodyHistory, and tblAppeal.
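The inclusion rule can be sketched in a few lines of Python. This is an illustration only, not the project's R code: the field names (case_id, nta_date, proceeding_id) are hypothetical, and placing the Notice to Appear date on the case record is an assumption made to keep the example short.

```python
from datetime import date

CUTOFF = date(1997, 10, 1)  # start of fiscal year 1998

# Toy stand-ins for A_TblCase and B_TblProceeding (hypothetical fields).
cases = [
    {"case_id": 1, "nta_date": date(1996, 5, 2)},    # before the cutoff
    {"case_id": 2, "nta_date": date(1997, 10, 1)},   # exactly on the cutoff
    {"case_id": 3, "nta_date": date(2019, 3, 14)},   # no proceeding record
    {"case_id": 4, "nta_date": None},                # missing date
]
proceedings = [
    {"case_id": 1, "proceeding_id": 10},
    {"case_id": 2, "proceeding_id": 11},
    {"case_id": 2, "proceeding_id": 12},
    {"case_id": 4, "proceeding_id": 13},
]


def filter_cases(cases, proceedings, cutoff=CUTOFF):
    """Keep cases with at least one proceeding and a date on/after cutoff."""
    ids_with_proceedings = {p["case_id"] for p in proceedings}
    return [
        c for c in cases
        if c["case_id"] in ids_with_proceedings
        and c["nta_date"] is not None
        and c["nta_date"] >= cutoff
    ]


kept_ids = [c["case_id"] for c in filter_cases(cases, proceedings)]
```

In this toy data only case 2 survives. Dropping cases with a missing date is a choice made in the sketch, not something the text above specifies.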

In every table apart from A_TblCase, there can be more than one row for each case. For example, some cases have more than one bond hearing or more than one application for relief. We address this problem with different decision rules for each variable. For example, we keep the first and last immigration court base cities in each case (drawn from B_TblProceeding), and we keep information about the last appeal in each case if there was more than one appeal. In the codebook below, we display the name of each variable in the processed dataset, the name of the original variable and table from which it was derived, and the decision rule we applied to pick which row of that table to draw from (e.g., first, last, or at least one).
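The "first", "last", and "at least one" rules can be illustrated with a short Python sketch. It makes several assumptions: the field names are hypothetical, the ordering column seq stands in for whatever the real code sorts on, and the bond rule shown here is only an example of an "at least one" indicator.

```python
from collections import defaultdict

# Toy stand-ins for B_TblProceeding and D_TblAssociatedBond.
# Field names (case_id, seq, base_city) are hypothetical.
proceedings = [
    {"case_id": 1, "seq": 2, "base_city": "NYC"},
    {"case_id": 1, "seq": 1, "base_city": "MIA"},
    {"case_id": 2, "seq": 1, "base_city": "LOS"},
]
bonds = [{"case_id": 1}, {"case_id": 1}]


def group_by_case(rows):
    """Collect rows into lists keyed by case_id."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["case_id"]].append(row)
    return groups


def summarize(proceedings, bonds):
    """Collapse many rows per case into one summary row per case."""
    bond_case_ids = {b["case_id"] for b in bonds}
    summary = {}
    for case_id, rows in group_by_case(proceedings).items():
        ordered = sorted(rows, key=lambda r: r["seq"])
        summary[case_id] = {
            "first_base_city": ordered[0]["base_city"],     # rule: first
            "last_base_city": ordered[-1]["base_city"],     # rule: last
            "any_bond_hearing": case_id in bond_case_ids,   # rule: at least one
        }
    return summary


summary = summarize(proceedings, bonds)
```

When a case has a single proceeding, the first and last values coincide, as for case 2 in the toy data.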

Fields (variables)

We describe the fields (a.k.a. variables or columns) below. We include the name of each field, a description, and the type of data in the field (e.g., string, numeric, date). Expanding a row will show an indicator for whether the field is available in each table and the proportion missing.
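For readers who want to reproduce a proportion-missing figure on their own extract, a minimal Python sketch follows. It assumes that a missing value appears as None or an empty string, which may not match how the processed files encode missingness, and the field name detention_date is hypothetical.

```python
def proportion_missing(rows, field):
    """Share of rows whose value for `field` is None or an empty string."""
    if not rows:
        return None  # undefined for an empty table
    missing = sum(1 for r in rows if r.get(field) in (None, ""))
    return missing / len(rows)


# Toy rows with a hypothetical date field; two of four values are missing.
rows = [
    {"case_id": 1, "detention_date": "2020-01-05"},
    {"case_id": 2, "detention_date": ""},
    {"case_id": 3, "detention_date": None},
    {"case_id": 4, "detention_date": "2021-07-19"},
]

share = proportion_missing(rows, "detention_date")
```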

Inspect code