ICE core data codebook

Authors and affiliations

Graeme Blair, University of California–Los Angeles
David Hausman, University of California–Berkeley School of Law
Phil Neff

Published

December 10, 2025

We provide a codebook for the main ICE data tables and fields. The codebook is a work in progress; there are many things we do not understand in the data, and some of our educated guesses here may be mistaken. We will continue to update the codebook as we learn more, and we welcome feedback and corrections.

This codebook reproduces parts of a preprint article that describes the ICE data we have posted.

Data structure

In the arrests, detainers, encounters, and removals tables, each row represents a single enforcement action taken by ICE for a particular noncitizen. In some cases, noncitizens appear more than once in the data, because some are arrested, encountered, or removed more than once (if they are removed and reenter the country).

The detentions table is more complex. Each row in the original ICE detentions data represents a single “stint” in detention at a particular detention center. Most noncitizens have only one stint per stay in detention, but some are transferred between detention centers during a single stay.

To illustrate the data structure of the detentions table, we take two noncitizens as examples:

  • Noncitizen A is detained from March 19 to March 21. They have a single stint in Otero County Processing Center. They have one stay and one stint within the stay.
  • Noncitizen B is detained once, from June 6 to July 9, but during their stay in detention they are transferred between three facilities. They have a first stint at the Los Angeles ICE hold room (“LOS CUST CASE”) from June 6 to June 7, then they are transferred to Adelanto ICE Processing Center and held there from June 7 to June 25, and finally they are transferred to Desert View Annex and held there from June 25 to July 9. Noncitizen B has one stay with three stints.

These two noncitizens’ stays in ICE detention would be represented in the stint-level dataset provided by ICE like this:

| Stay   | Noncitizen ID | Stint ID | Stay book-in | Stay book-out | Stint book-in | Stint book-out | Detention facility             |
| Stay 1 | A             | 1        | Mar. 19      | Mar. 21       | Mar. 19       | Mar. 21        | OTERO CO PROCESSING CENTER     |
| Stay 2 | B             | 1        | Jun. 6       | Jul. 9        | Jun. 6        | Jun. 7         | LOS CUST CASE                  |
| Stay 2 | B             | 2        | Jun. 6       | Jul. 9        | Jun. 7        | Jun. 25        | ADELANTO ICE PROCESSING CENTER |
| Stay 2 | B             | 3        | Jun. 6       | Jul. 9        | Jun. 25       | Jul. 9         | DESERT VIEW ANNEX              |

The Deportation Data Project provides a processed version of the detentions data at the stay level. The two noncitizens’ stays in ICE detention would be represented in the stay-level dataset as two rows, like this:

| Stay   | Noncitizen ID | Stay book-in | Stay book-out | # of stints |
| Stay 1 | A             | Mar. 19      | Mar. 21       | 1           |
| Stay 2 | B             | Jun. 6       | Jul. 9        | 3           |
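The relationship between the two layouts can be sketched in a few lines of plain Python. This is only an illustration of the data structure, not the project's processing code: the field names (`person_id`, `stay_book_in`, and so on) are hypothetical, and the year 2025 is assumed because the example above gives only months and days.

```python
from datetime import date

# Stint-level rows for noncitizens A and B from the example above.
# Field names are hypothetical; the year 2025 is an assumption.
stints = [
    {"person_id": "A", "stay_book_in": date(2025, 3, 19), "stay_book_out": date(2025, 3, 21),
     "stint_book_in": date(2025, 3, 19), "stint_book_out": date(2025, 3, 21),
     "facility": "OTERO CO PROCESSING CENTER"},
    {"person_id": "B", "stay_book_in": date(2025, 6, 6), "stay_book_out": date(2025, 7, 9),
     "stint_book_in": date(2025, 6, 6), "stint_book_out": date(2025, 6, 7),
     "facility": "LOS CUST CASE"},
    {"person_id": "B", "stay_book_in": date(2025, 6, 6), "stay_book_out": date(2025, 7, 9),
     "stint_book_in": date(2025, 6, 7), "stint_book_out": date(2025, 6, 25),
     "facility": "ADELANTO ICE PROCESSING CENTER"},
    {"person_id": "B", "stay_book_in": date(2025, 6, 6), "stay_book_out": date(2025, 7, 9),
     "stint_book_in": date(2025, 6, 25), "stint_book_out": date(2025, 7, 9),
     "facility": "DESERT VIEW ANNEX"},
]


def stints_to_stays(rows):
    """Collapse stint rows into one row per (person, stay book-in date)."""
    stays = {}
    for row in rows:
        key = (row["person_id"], row["stay_book_in"])
        # Create the stay row the first time this stay is seen.
        stay = stays.setdefault(key, {
            "person_id": row["person_id"],
            "stay_book_in": row["stay_book_in"],
            "stay_book_out": row["stay_book_out"],
            "n_stints": 0,
        })
        stay["n_stints"] += 1
    return list(stays.values())


stays = stints_to_stays(stints)
for stay in stays:
    print(stay["person_id"], stay["stay_book_in"], stay["stay_book_out"], stay["n_stints"])
```

The four stint rows collapse to two stay rows, one for A with a single stint and one for B with three.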

Data processing

To simplify analysis, we post four lightly processed versions of the original ICE data: arrests, detainer requests, and detentions at both the stint and stay level. The stay-level detention dataset is the most heavily processed of these; the others involve little more than adding a variable to flag likely duplicates and renaming variables for clarity.

Arrests, detainer requests, and detention stints

We post data with minimal changes to facilitate analysis using standard tools, including R, Python, Stata, SPSS, and Excel. We do not add or drop any rows.

For these minimally processed datasets, we make the following changes:

  • Add flag for likely duplicates. We add an indicator for rows that are possibly or likely duplicates. For arrests, we flag rows for the same noncitizen that fall within 24 hours of each other. For detainers, which have no time information, we flag rows with the same noncitizen ID and request date. Most flagged rows are duplicates recorded for administrative reasons, such as when a record is corrected after initial entry; in rare cases, they may reflect multiple enforcement actions within a 24-hour period. Analysts can use this flag to drop likely duplicates if desired.
  • Drop blank or fully redacted columns. The column names can be found in the raw data, also available on the ICE data page.
  • Convert date-time fields to date when there is no time information. In some cases, date fields appear to have time information but every time is recorded as “00:00:00”. For ease of analysis, we convert these to date-only fields, dropping the time information.
  • Add an arrest date variable. The column is simply the date portion of the arrest date-time field, to facilitate analysis that does not require time information.
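The two date-handling changes in the list above can be illustrated with a short sketch in plain Python. It is not the project's code: the column names (`final_order`, `arrest_datetime`) and the values are made up for illustration, and the rule used here (convert only when every non-missing time is exactly midnight) is one reading of the description above.

```python
from datetime import datetime, time


def to_date_if_no_time(values):
    """Return dates if every non-missing timestamp is at 00:00:00;
    otherwise return the values unchanged."""
    present = [v for v in values if v is not None]
    if present and all(v.time() == time(0, 0, 0) for v in present):
        return [v.date() if v is not None else None for v in values]
    return list(values)


# Hypothetical columns with made-up values.
final_order = [datetime(2025, 1, 5), None, datetime(2025, 2, 9)]  # all midnight
arrest_datetime = [datetime(2025, 3, 19, 14, 30), datetime(2025, 3, 20, 0, 0)]

final_order_date = to_date_if_no_time(final_order)   # converted to dates
arrest_kept = to_date_if_no_time(arrest_datetime)    # left as date-times

# Separate date-only variable derived from the arrest date-time field.
arrest_date = [v.date() for v in arrest_datetime]
```

Because one arrest has a real time of day, the arrest field keeps its date-time values and only gains a companion date-only variable.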

In a few cases, variable names are shortened to enable saving in Stata or SPSS format.
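The duplicate flag described in the list above can be sketched as follows, in plain Python. This rests on assumptions: the field names (`person_id`, `arrest_datetime`, `request_date`, `likely_duplicate`) are hypothetical, and the sketch flags only the later record of each pair so that dropping flagged rows keeps one record per cluster. The flag in the posted data may be defined differently.

```python
from datetime import date, datetime, timedelta


def flag_arrest_duplicates(rows, window=timedelta(hours=24)):
    """Flag an arrest if the same person's immediately preceding arrest
    occurred no more than `window` earlier. Returns rows sorted by person and time."""
    last_seen = {}
    flagged = []
    for row in sorted(rows, key=lambda r: (r["person_id"], r["arrest_datetime"])):
        prev = last_seen.get(row["person_id"])
        is_dup = prev is not None and row["arrest_datetime"] - prev <= window
        flagged.append({**row, "likely_duplicate": is_dup})
        last_seen[row["person_id"]] = row["arrest_datetime"]
    return flagged


def flag_detainer_duplicates(rows):
    """Flag a detainer if an earlier row has the same person ID and request date."""
    seen = set()
    flagged = []
    for row in rows:
        key = (row["person_id"], row["request_date"])
        flagged.append({**row, "likely_duplicate": key in seen})
        seen.add(key)
    return flagged


# Made-up records for illustration.
arrests = [
    {"person_id": "A", "arrest_datetime": datetime(2025, 3, 19, 9, 0)},
    {"person_id": "A", "arrest_datetime": datetime(2025, 3, 19, 17, 30)},  # 8.5 hours later
    {"person_id": "A", "arrest_datetime": datetime(2025, 6, 1, 8, 0)},
    {"person_id": "B", "arrest_datetime": datetime(2025, 3, 19, 10, 0)},
]
detainers = [
    {"person_id": "A", "request_date": date(2025, 3, 19)},
    {"person_id": "A", "request_date": date(2025, 3, 19)},  # same person and date
    {"person_id": "A", "request_date": date(2025, 3, 20)},
]

flagged_arrests = flag_arrest_duplicates(arrests)
flagged_detainers = flag_detainer_duplicates(detainers)

# Analysts who want to drop likely duplicates can filter on the flag.
deduplicated_arrests = [r for r in flagged_arrests if not r["likely_duplicate"]]
```

Keeping the flag as a separate column, rather than deleting rows, leaves the choice to the analyst, which matters in the rare cases where two records within 24 hours are distinct enforcement actions.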

Detention stays

The detentions data are provided by ICE in a more complex format than the other tables. Our goal in posting this simplified dataset is to make analysis of ICE detention data more straightforward. In the original ICE data, there is a row for every book-in to a particular detention center, but most questions about detention concern what ICE calls an individual’s “stay” in detention — a single period of detention for a single person that often includes transfers between detention centers.

We create a dataset that has a single row for each stay in detention, preserving most (but not all) of the information in the original dataset. Note that individuals can also have more than one stay in detention if they are released and later detained again; in that case, this dataset includes a row for each stay, and those repeated stays are (anonymously) identifiable by the unique IDs in the data, which correspond to individuals’ A-numbers.

We describe the processing steps in general terms here:

  • Drop duplicate stints. We first identify duplicate detention stints in the data: stints with the same book-in date/time at the same detention center. There are a few thousand of these, a small number relative to the full dataset of over 1.3 million records. Nearly all of these reflect duplicated stints where an individual’s “initial bond set amount” changed. To collapse these duplicates into a single row, we create a new variable called “lowest initial bond set amount” that reflects the lowest initial bond amount associated with a given stint.
  • Keep data from last stint. Then we preserve only the last stint in each stay. For most fields, this does not cause any loss of information because values in these fields (e.g. final order date, most serious conviction, and citizenship country) typically do not change within detention stays.
  • Join data from first, last, and longest stint. Finally, we add back in limited information from the stint level: for the first stint, the last stint, and the longest stint, we include the book-in date, book-out date, and detention facility. If a stay includes only one stint, then these are all identical.
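The three steps above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the project's processing code: all field names are hypothetical, duplicates are matched on person ID as well as book-in time and facility, every stint is assumed to have a book-out time, and the bond amounts and the year 2025 are invented for the example.

```python
from datetime import datetime

STINT_FIELDS = ("stint_book_in", "stint_book_out", "facility")


def build_stays(stints):
    # Step 1: collapse duplicate stints, keeping the lowest non-missing bond.
    unique = {}
    for s in stints:
        key = (s["person_id"], s["stint_book_in"], s["facility"])
        bond = s["initial_bond"]
        if key not in unique:
            unique[key] = {**s, "lowest_initial_bond": bond}
        elif bond is not None:
            current = unique[key]["lowest_initial_bond"]
            unique[key]["lowest_initial_bond"] = bond if current is None else min(current, bond)

    # Group the remaining stints into stays.
    by_stay = {}
    for s in unique.values():
        by_stay.setdefault((s["person_id"], s["stay_book_in"]), []).append(s)

    stays = []
    for group in by_stay.values():
        group.sort(key=lambda s: s["stint_book_in"])
        picks = {
            "first": group[0],
            "last": group[-1],
            "longest": max(group, key=lambda s: s["stint_book_out"] - s["stint_book_in"]),
        }
        # Step 2: start from the last stint, minus stint-specific fields.
        stay = {k: v for k, v in group[-1].items()
                if k not in STINT_FIELDS and k != "initial_bond"}
        stay["n_stints"] = len(group)
        # Step 3: add book-in, book-out, and facility for first, last, longest.
        for label, stint in picks.items():
            for field in STINT_FIELDS:
                stay[f"{label}_{field}"] = stint[field]
        stays.append(stay)
    return stays


def d(month, day):
    return datetime(2025, month, day)  # year is an assumption


def stint(s_in, s_out, facility, bond):
    return {"person_id": "B", "stay_book_in": d(6, 6), "stay_book_out": d(7, 9),
            "stint_book_in": s_in, "stint_book_out": s_out,
            "facility": facility, "initial_bond": bond}


rows = [
    stint(d(6, 6), d(6, 7), "LOS CUST CASE", None),
    stint(d(6, 7), d(6, 25), "ADELANTO ICE PROCESSING CENTER", None),
    stint(d(6, 25), d(7, 9), "DESERT VIEW ANNEX", 10000),
    stint(d(6, 25), d(7, 9), "DESERT VIEW ANNEX", 5000),  # duplicate stint, new bond
]
stays = build_stays(rows)
```

For noncitizen B this produces a single stay row with three stints: the first stint is at LOS CUST CASE, the last at DESERT VIEW ANNEX, and the longest (18 days) at ADELANTO ICE PROCESSING CENTER.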

Tables

We describe the main ICE data tables below.

Fields (variables) in latest data release

We describe the fields (a.k.a. variables or columns) in the latest ICE data release below. The table includes the name of each field, a description, and the type of data in the field (e.g., string, numeric, date). Expanding a row will show an indicator for whether the field is available in each table and the proportion missing.

Fields (variables) in previous data releases

We also provide a table of fields (a.k.a. variables or columns) that were available in previous ICE data releases but are not included in the most recent data. This table includes the name of each field, a description, and the type of data in the field (e.g., string, numeric, date).

Citation

BibTeX citation:
@article{blair2025,
  author = {Blair, Graeme and Hausman, David and Neff, Phil},
  title = {Immigration and Customs Enforcement Individual-Level Data: An
    Introduction},
  date = {2025-12-10},
  url = {https://deportationdata.org/docs/ice/ice-data-preprint.pdf},
  langid = {en}
}
For attribution, please cite this work as:
Blair, Graeme, David Hausman, and Phil Neff. 2025. Immigration and Customs Enforcement Individual-Level Data: An Introduction. accepted, December 10. https://deportationdata.org/docs/ice/ice-data-preprint.pdf.