Pipe-Delimited Flat Files Instructions

Overview

Pipe-delimited flat files are text files that contain AACT data in a format where fields are separated by the pipe character ("|"). These files are designed for users who cannot connect to the live PostgreSQL database or who prefer working with conventional text-based formats. The flat files can also be used to populate non-PostgreSQL databases or for analysis with statistical software like R, SAS, or SPSS.

The AACT system provides access to the 30 most recent daily snapshots along with permanent monthly archives that are created on the first day of each month. This ensures you have both current data and historical reference points.

General Information About Flat Files

File Structure: Each flat file corresponds to a single table in the AACT database schema. For example, the studies.txt file contains all data from the studies table in the database. To understand how these tables relate to each other, refer to the schema diagram. These relationships are essential for correctly joining data across multiple files.
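As an illustration of joining data across files, the following minimal Python sketch links two small in-memory samples that stand in for studies.txt and interventions.txt. The column names (nct_id, brief_title, name) and the sample rows are assumptions made up for the example; consult the schema diagram for the actual keys of each table.

```python
import csv
import io

# Made-up stand-ins for studies.txt and interventions.txt.
# Column names are illustrative assumptions; check the schema diagram.
studies_txt = (
    "nct_id|brief_title\n"
    "NCT00000001|Example Study A\n"
    "NCT00000002|Example Study B\n"
)
interventions_txt = (
    "id|nct_id|name\n"
    "1|NCT00000001|Drug X\n"
    "2|NCT00000001|Placebo\n"
    "3|NCT00000002|Device Y\n"
)


def read_rows(text):
    """Parse pipe-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="|"))


# Index the parent table by its key so each child row can look up its study.
studies = {row["nct_id"]: row for row in read_rows(studies_txt)}

# Keep only interventions whose study is present (an inner join).
joined = [
    {
        "nct_id": iv["nct_id"],
        "brief_title": studies[iv["nct_id"]]["brief_title"],
        "intervention": iv["name"],
    }
    for iv in read_rows(interventions_txt)
    if iv["nct_id"] in studies
]

for row in joined:
    print(row)
```

With real files, the same approach applies after replacing the in-memory samples with files opened using UTF-8 encoding.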

File Format: The files use UTF-8 encoding with pipe characters ("|") separating each field. Consecutive pipes indicate a null or missing value. Each record ends with a line feed (LF) character.

Header Row: The first row of each file contains the column names, which match the field names in the database. The order of these names corresponds to the order of data in each row.

Data Format: Fields are typically not enclosed in quotes. However, text fields may contain embedded quotes (single or double) as part of their content.

Modifications to Source Content: In rare cases, the actual content of the study data contains an embedded pipe character ("|"). To prevent software from interpreting the embedded pipe as a delimiter, the entire string containing the pipe has been enclosed in double quotes. Additionally, line feed and paragraph break characters within fields have been removed to maintain record integrity.
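The format rules above can be sketched in Python with the standard csv module. This is a minimal illustration rather than an official parser: the sample rows and column names are made up, and it assumes that a field starting with a double quote is one of the quoted, pipe-containing strings described above.

```python
import csv
import io

# Made-up sample covering the special cases: an empty field,
# a quoted field with an embedded pipe, and embedded quotes in text.
sample = (
    "id|name|description\n"
    "1|Aspirin|Low dose\n"
    "2||No name recorded\n"
    '3|"Drug A|Drug B"|Combination arm\n'
    '4|Placebo|Described as "inactive" tablet\n'
)


def parse_flat_file(text):
    """Return one dict per record, mapping empty fields to None."""
    reader = csv.reader(io.StringIO(text), delimiter="|", quotechar='"')
    header = next(reader)  # first row holds the column names
    records = []
    for fields in reader:
        # Consecutive pipes produce an empty string; treat it as null.
        values = [value if value != "" else None for value in fields]
        records.append(dict(zip(header, values)))
    return records


records = parse_flat_file(sample)
for record in records:
    print(record)
```

In this sketch the quoted value in record 3 is read as a single field containing a pipe, while the quotes inside record 4 are kept as ordinary characters because the field does not begin with a quote.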

Using the Flat Files

1. Download Snapshot File

Download the current snapshot from the Downloads page, or choose an earlier daily snapshot or monthly archive if you need older data.

2. Extract Contents of Zip File

After downloading, locate the ZIP file in your system’s Downloads folder and unzip it into a folder of your choice. When extraction completes, you will find a set of pipe-delimited text files, each named after its corresponding database table (for example, studies.txt and interventions.txt). Use these files directly with your analysis tools.
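If you prefer to script the extraction, one possible approach in Python uses the standard zipfile module. The function name extract_snapshot and the file and folder names in the usage comment are hypothetical placeholders, not names used by AACT.

```python
import zipfile
from pathlib import Path


def extract_snapshot(zip_path, dest_dir):
    """Unzip a snapshot archive and return the sorted names of its .txt files."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Search recursively in case the archive contains a subfolder.
    return sorted(path.name for path in dest.rglob("*.txt"))


# Example usage (hypothetical file and folder names):
# table_files = extract_snapshot("aact_snapshot.zip", "aact_data")
# print(table_files)
```

The returned list gives a quick check that the expected table files are present before you start importing them.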

3. Access Content with Your Analysis Tool

The flat files can be read with many software tools. Statistical analysis packages are particularly well-suited for working with these files.

The most common tools for analyzing AACT flat files are R (a free, open-source statistical package) and SAS (widely used in clinical research). To learn how to import and analyze your data, see our detailed R instructions or SAS instructions.

Other analysis tools such as Python (pandas), SPSS, and Stata also have functions for importing pipe-delimited files. When importing, always specify the pipe character ("|") as the delimiter and indicate that the first row contains header information.
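For Python users, a minimal pandas sketch of these two import settings might look like the following. The in-memory sample and its column names are made up for illustration; with a real file you would pass its path (for example "studies.txt") and encoding="utf-8" instead of the StringIO object.

```python
import io

import pandas as pd

# Made-up sample standing in for a flat file; the second record has a
# missing final value, shown by the trailing pipe.
sample = io.StringIO(
    "nct_id|brief_title|enrollment\n"
    "NCT00000001|Example Study A|120\n"
    "NCT00000002|Example Study B|\n"
)

# sep="|" sets the delimiter; header=0 marks the first row as column names.
# dtype=str keeps every value as text so identifiers are not reinterpreted.
df = pd.read_csv(sample, sep="|", header=0, dtype=str)

print(df)
print(df.isna().sum())  # count of missing values per column
```

Reading everything as text is a conservative choice for a first look; columns can be converted to numeric or date types afterwards once you know what each one contains.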

Note about Excel: Although these files can be opened in Excel (by specifying the pipe character as the delimiter during import), this is not recommended for AACT data files. Many of the files exceed Excel's limit of 1,048,576 rows per worksheet, and Excel may not properly handle all special cases in the data.
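To check in advance whether a particular file would fit in an Excel worksheet, a short Python sketch like the one below can count its lines. The function name exceeds_excel_limit is a hypothetical helper; the check relies on each record occupying exactly one line, as described in the file format notes.

```python
EXCEL_MAX_ROWS = 1_048_576  # row limit of a single worksheet in current Excel


def exceeds_excel_limit(path):
    """Return True if the file has more lines (header included) than Excel allows."""
    with open(path, encoding="utf-8") as handle:
        line_count = sum(1 for _ in handle)
    return line_count > EXCEL_MAX_ROWS


# Example usage (hypothetical path):
# print(exceeds_excel_limit("aact_data/studies.txt"))
```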