Release Notes


AACT Site (April 28, 2022)

aact-admin 5.0.6

Bug Fixes

Features


AACT 6.2.2 (March 31, 2022)

aact-core release


AACT 6.2.0 (February 17, 2022)

aact-core release


AACT 6.1.1 (February 02, 2022)

aact-core release


AACT 6.0.0 (November 15, 2021)

Added Airbrake notifications to StudyJsonRecord model

Set up code to check study statistics each time a load is done

Set up code to only use the Beta API when doing a load

Set up code to compare our data against the Beta API study statistics endpoint for data validation purposes


AACT 5.0.2.1 (October 28, 2021)

Bug Fixes
Features

AACT 5.0.1 (October 10, 2021)

Hotfix - Ability to process beta in parallel

Adds the ability to process the Beta API in parallel to speed up processing

Currently using 32 threads to speed processing

Quick estimates show a processing time of around 5 hours for the entire database
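
The parallel processing described above can be sketched roughly as follows. This is a minimal Python illustration, not the AACT code itself (AACT is a Ruby on Rails application). The names `process_study` and `process_all` are hypothetical; only the worker count of 32 comes from the note above.

```python
from concurrent.futures import ThreadPoolExecutor


def process_study(nct_id):
    """Hypothetical stand-in for downloading and saving one study."""
    return f"processed {nct_id}"


def process_all(nct_ids, worker=process_study, max_workers=32):
    """Run `worker` over every NCT ID using a pool of threads.

    Results come back in the same order as `nct_ids`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, nct_ids))
```

Threads suit this kind of work when each study spends most of its time waiting on network or database I/O rather than on computation.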


AACT 5.0.0 (September 29, 2021)

Created a migration that renames ctgov_beta_group_code to ctgov_group_code

Updated the README so the setup steps are clearer

Added ability for the StudyJsonRecord.data_verification method to write out the results from the comparison to a file

Added a method that loads a small number of studies for development


AACT 4.7.1 (September 18, 2021)

Renamed ctgov_beta_group_code to ctgov_group_code

Added mesh_type column for browse_conditions and browse_interventions tables

Edited code to save more MeSH types to the database

Changed factory_girl to the factory_bot gem

Added Airbrake for tracking errors

Added database explanation and visualization to the README

Edited how the downcase mesh terms are created so they would be updated regularly

Edited the way the loads are run to better switch between schemas

Added environment variables for the ctgov_beta schema and database


AACT 4.6.1 (April 7, 2021)

Categories table renamed to SearchResults

We renamed "ctgov.categories" to "ctgov.search_results" to better reflect that the table is a collection of search results saved from queries in the "ctgov.study_searches" table. We also added a categories view so any queries you have written for categories should still work. Please reach out if you have any issues.

The saved queries in the "ctgov.study_searches" table are used to search ClinicalTrials.gov.


AACT 4.6.0 (March 20, 2021)

Upgrade to Rails 6

We upgraded Rails to version 6 and updated all dependencies.

The source code was also updated to reflect this and security was strengthened.


AACT 4.5.0 (January 8, 2021)

Built alternate code that is compatible with the ClinicalTrials.gov Beta API

ClinicalTrials.gov has a beta API they will be changing over to in the future. We built alternate code to process information the way that API delivers it. That API also provides additional information, which we are capturing as well.

The code that uses the regular API is still in use. The regular and the beta code run in parallel in AACT but store data in different databases.

How the ctgov_beta schema diverges from the ctgov schema in this release:

AACT 4.4.1 (December 18, 2019)

Adapt RSS Reader to Correctly Handle a Change Introduced in the ClinicalTrials.gov RSS Feed

Typically, two to four thousand studies are updated each day in ClinicalTrials.gov. On Nov 20, 2019, the ClinicalTrials.gov RSS Feed implemented a feature to restrict the number of studies returned per RSS request; the maximum number of studies returned is now 1,000. Therefore, since the full load was run on Dec 1, 2019, AACT has been importing info for only 1,000 modified studies per night. (All new studies have been correctly imported into AACT; this problem only applies to studies that have changed.)

The AACT process that imports studies via the RSS Feed has been modified to send repeated requests to the RSS Feed until we have collected all studies that have been modified during the past specified number of days.
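
The repeated-request approach can be sketched as a simple loop. This is a minimal Python illustration under stated assumptions: `fetch_page(offset, limit)` is a hypothetical callable standing in for one RSS request, and the idea that the feed can be asked for results starting at a given offset is an assumption, not something these notes confirm. Only the 1,000-study cap comes from the text above.

```python
def collect_modified_studies(fetch_page, page_size=1000):
    """Request pages from the feed until a short (or empty) page arrives.

    `fetch_page(offset, limit)` is a hypothetical callable that returns a
    list of at most `limit` NCT IDs starting at `offset`.
    """
    collected = []
    seen = set()
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        for nct_id in page:
            if nct_id not in seen:  # guard against overlap between requests
                seen.add(nct_id)
                collected.append(nct_id)
        if len(page) < page_size:  # a short page means nothing is left
            break
        offset += len(page)
    return collected
```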


AACT 4.4.0 (December 8, 2019)

Refactor to Make it Easier for Others to Implement the AACT Application

We have continued to refactor code to make it easier for others to replicate the Ruby on Rails application that retrieves data from ClinicalTrials.gov to populate the AACT database. We replaced hard-coded references for database names & file locations with environment variables. We also removed some obsolete & unused code.


AACT 4.3.0 (July 2, 2019)

Upgrade to CentOS Linux release 7.6.1810 (Core)
Facilitate Replication of the AACT Application

To make it easier for others to replicate the AACT component that creates a relational database of ClinicalTrials.gov data, we've refactored code, simplified environment variables & modified the README file in the AACT git repository.

Fix Process that Collects User Activity Statistics

When we upgraded to PostgreSQL 11.1, the format of the database log files changed slightly. As a result, the process that parses the log files failed to gather/summarize information. This has been fixed.

Block Nuisance IP Addresses

Someone set up unproductive bots that were consuming AACT database resources and dramatically slowing response time for others. We've implemented a feature that allows us to block IP addresses when we detect this type of activity.

Increase Maximum Database Connection Count

We increased the maximum number of connections allowed to the live database due to a recent, significant increase in the number of people accessing it. We continue to investigate other ways to support this open database model as activity increases.

Rename AACT Projects to AACT Shared Data

We have renamed the AACT Projects feature to better describe the type of information it provides. We modified some navigation flows on the Data Share page to hopefully make it less confusing.

Use pg_dump v11.1 to Create Static Copies of Database

Since April 2019, when we upgraded to PostgreSQL 11.1, the process that creates a static copy of the database has been unreliable. The static database copy serves two purposes: 1) it's used to refresh the live version of the AACT database that is available to the public and 2) it is made available for download so that others may use it to create their own personal copy of the database. The pg_restore command used to refresh the public database has frequently failed since the 11.1 upgrade. Until now, we've been using the 9.6 version of the pg_dump command to create a copy of the database each night. To solve the problem with the nightly restore to the public database, we started using the 11.1 version of pg_dump to create the static database copy.


AACT 4.2.2 (May 27, 2019)

Upgrade jsgrid to 1.5.3

The table that lists & defines all AACT database tables/columns suddenly stopped appearing on the Data Dictionary page. We upgraded the jQuery plugin jsgrid so this data definition table would again appear on the page.


AACT 4.2.1 (April 17, 2019)

Upgrade to Ruby 2.4.5

The 3 AACT applications (AACT, AACT-Admin & AACT-Proj) have been upgraded to ruby v2.4.5.

Improve AACT Installation Documentation

For technical people who would like to create their own instance of the AACT core application (the part that pulls data from ClinicalTrials.gov and populates a relational database), we have updated documentation in the ReadMe file that appears on the AACT main page in github. This documentation describes how to clone AACT from github and create a local, working copy. The documentation has been improved, but it is a work-in-progress. We will add more information in the next release of AACT.


AACT 4.2.0 (April 14, 2019)

Added 'Study Characteristics' Project

AACT now includes the AACT-based research project: Characteristics of Clinical Trials Registered in ClinicalTrials.gov, 2007-2010. Datasets for this project are presented as tables in the proj_tag_study_characteristics schema. The tagged_terms table lists 2010 MeSH & free text terms that were determined to be associated with one (or more) of three clinical specialties: Mental Health, Oncology & Cardiovascular. (Each term is tagged with each related clinical specialty.) The new schema also includes a table of analyzed studies for each of the clinical specialties: mental_health_studies, oncology_studies & cardiovascular_studies. More information & downloadable datasets can be found on the Projects page.

Added Individual Project Pages to Provide More Details

To help us present more detailed information about each project, we've added a per-project page that may include the following information:

Click here to see the individual page for the newly added project: Characteristics of Clinical Trials Registered in ClinicalTrials.gov, 2007-2010.

Upgraded to PostgreSQL v11.1

The PostgreSQL public database was upgraded from version 9.6.6 to 11.1.

Added Database Constraints

To make the database more robust and easier to understand, foreign key constraints have been added for all table relationships and unique constraints have been added for the NCT ID in tables that should only have one row per study:

Upgraded the Tool We Use to Generate Schema Diagrams: pgModeler

We use pgModeler to produce the schema diagrams that appear on the AACT website. Because the previous version didn't work with PostgreSQL 11.1, we upgraded this tool to version 0.9.2-alpha1 & refreshed the schema diagrams. As a result, schema diagrams (such as the one for the main ctgov schema) now display additional information about primary keys, indexes, constraints, etc.

Loaded 2019 MeSH Thesaurus

The mesh_terms table now contains the 2019 set of MeSH terms. (We used the 11/1/18 version of mtrees2019.bin retrieved from NLM's FTP server: ftp://nlmpubs.nlm.nih.gov/online/mesh/MESH_FILES/meshtrees/) The 2018 MeSH terms are now available in the mesh_archive schema (table name: Y2018_mesh_terms).

Removed Leading Space in Calculated_Values.minimum_age_unit & maximum_age_unit Columns

The values in calculated_values.minimum_age_unit & maximum_age_unit all had a leading space, so 'Years' was presented as '_Years' & 'Months' as '_Months'. This has been fixed so that these values will no longer include leading spaces.


AACT 4.1.1 (March 16, 2019)

Resolve Security Issues Identified by GitHub

To address security vulnerabilities detected by github, we upgraded the sprockets, loofah & rubyzip gems. Several other unused gems were removed.

Add Provided_Documents table

On February 6, 2019, the National Library of Medicine (NLM) started making information about the documents provided with the study available through their API. AACT now includes this information in a table named provided_documents. The NLM briefly describes this information as follows:

"Data providers can submit documents including the Study Protocol, Informed Consent Form (ICF), and Statistical Analysis Plan (SAP), possibly all in the same pdf document. These documents are archived, made available through the ClinicalTrials.gov site, and are now described in the Public XML."

Provide More Detailed Description about CDEK Standard Orgs Project

The CDEK Standard Orgs project documentation that appears on the Projects page did not note that the set of organizations included in the AACT database are those identified as the sponsor, overall official or responsible party of interventional drug trials. We have modified this documentation to better describe this project.

Revise Data Dictionary Documentation
Describe Different Schemas

We modified summary information on the Data Dictionary page to better describe the schemas recently added to the AACT database.

Document Set of 'all_' Views

We added a table that defines a set of views in the AACT database that had previously been undocumented, all of which have the prefix 'all_'. Information about these views is now available on the Data Definitions page. These 'all_ views' provide concatenated value strings for various one-to-many study relationships. The values are delimited with a bar character (|). For example, the study NCT00000146 has 3 rows in the browse_conditions table, so one row for this study is in the all_browse_conditions view that provides this value: 'Multiple Sclerosis|Neuritis|Optic Neuritis'. These views are useful for those who need to export a spreadsheet of studies where each row represents one study, and the row includes one-to-many data values. More information about these views can be found near the bottom of the Data Definitions page.

To be consistent, each of these 'all_' views now has a column called 'names' which presents the concatenated list of the values from the table represented by that view. In short, the name of the column containing this concatenated list of values is now 'names' in every one of these views. Previously, some of these views used other names for these columns; for example, the all_conditions view previously used the column name 'conditions' instead of 'names'. If people referenced these undocumented views in the past, they need to be aware that some of these column names have changed.
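
To illustrate the kind of value an 'all_' view provides, here is a small Python sketch of the same concatenation. The real views are defined in the database; `build_all_view` is a hypothetical helper used only to show how several rows for a study collapse into one bar-delimited 'names' string. The sample rows come from the NCT00000146 example above.

```python
def build_all_view(rows):
    """Collapse (nct_id, name) rows into one pipe-delimited string per study.

    Values keep the order in which they appear in `rows`.
    """
    names_by_study = {}
    for nct_id, name in rows:
        names_by_study.setdefault(nct_id, []).append(name)
    return {nct_id: "|".join(names) for nct_id, names in names_by_study.items()}


browse_conditions = [
    ("NCT00000146", "Multiple Sclerosis"),
    ("NCT00000146", "Neuritis"),
    ("NCT00000146", "Optic Neuritis"),
]
all_browse_conditions = build_all_view(browse_conditions)
```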

Add Enumeration for studies.plan_to_share_ipd

We'd like to see what percent of studies plan to share individual patient data (IPD), and how it might change over time, so we added the studies.plan_to_share_ipd attribute to our 'enumerations list'. After each nightly update, we recalculate the ratio/distribution of values found in each attribute in the 'enumerations list' and display this info in the Enumerations column of the Data Dictionary table. (You need to scroll to the right to see this column.) Twice a month we save the enumerations data so we can monitor how the values in these attributes might change over time. For more information about the Enumerations feature, see v3.0.2 release notes.

User Table Restore Instructions: Include Note to Escape Single Quotes

Some database users have a single quote in their last name. When restoring the Users table, the restore function will become confused and give up unless these single quotes are escaped. We have updated the instructions to remind the administrator to escape all single quotes that appear in user names before running the restore command.
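
As an illustration of the escaping step, the sketch below doubles each single quote, which is the standard way to write a quote inside a SQL string literal. The helper name `escape_single_quotes` is hypothetical, and whether the AACT restore instructions use exactly this form is an assumption.

```python
def escape_single_quotes(value):
    """Double each single quote so the value is safe inside a SQL string literal."""
    return value.replace("'", "''")
```

For example, a last name such as O'Brien would be written as O''Brien in the restore script.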


AACT 4.1.0 (January 9, 2019)

Implement New Feature: AACT Projects

The AACT database now includes a set of supplemental schemas that present datasets collected & curated during previous AACT-based research. By including these data within the AACT database, the public can benefit from work that has been performed by other investigators. Since the information is directly accessible, it may be incorporated into queries on current clinical trials. It also serves to make previous research more transparent and help AACT users better understand assertions made by the previous investigators.

Database schemas are used to differentiate project-related data from ClinicalTrials.gov data. Data from ClinicalTrials.gov remain available in the ctgov schema and each project has a database schema in which the datasets for that project are available. All project schemas are prefixed with 'proj_'. With the release of AACT 4.1.0, all users of the live AACT database have immediate access to this information.

Datasets from the following three AACT-based research projects have been made available in this release:

Projects are described on the AACT website Projects Page. Definitions for each project's tables & columns are also defined in the Data Dictionary.

This feature will continue to be developed and your feedback is appreciated. Please email the AACT team with questions and suggestions.

This feature has been implemented as a separate Ruby on Rails application. AACT is now comprised of 3 applications: 1) AACT Core, 2) AACT Admin & 3) AACT Projects. All code for these three components is publicly available in github. Note: Implementing this feature required some changes to the way ClinicalTrials.gov data is loaded into AACT. Details about these changes are available upon request.

2010 & 2016 MeSH Terms Saved to Live Database in 'mesh_archive' Schema

The National Library of Medicine (NLM) updates the MeSH thesaurus each year. To facilitate access to the set of terms used by previous research projects, a new schema named mesh_archive has been added to the live AACT database. Tables in this schema are named yYYYY_mesh_terms where YYYY identifies the version of that set of terms. For example, the 2010 set of MeSH terms is available in mesh_archive.y2010_mesh_terms.

Three New Columns in Calculated_Values Table Provide 'Number of Intended Outcome Measures'

The Calculated_Values table has 3 new columns that provide the number of primary, secondary & other outcome measures:

These are integer columns. Values are calculated by summing the number of rows in the design_outcomes table per study where outcome_type is primary/secondary/other.
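
The calculation can be sketched as a per-study row count. This is a minimal Python illustration, not the AACT implementation; the function name, the input shape, and the lowercase outcome_type labels are assumptions made for the example.

```python
from collections import Counter


def count_outcomes(design_outcomes):
    """Count design_outcomes rows per study for each outcome_type.

    `design_outcomes` is an iterable of (nct_id, outcome_type) pairs.
    Returns {nct_id: {"primary": n, "secondary": n, "other": n}}.
    """
    counts = {}
    for nct_id, outcome_type in design_outcomes:
        counts.setdefault(nct_id, Counter())[outcome_type] += 1
    return {
        nct_id: {kind: c[kind] for kind in ("primary", "secondary", "other")}
        for nct_id, c in counts.items()
    }
```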

Changes Were Made to the Data Dictionary
Redesign 'Registered Users' Page so Info can be Sorted, Exported, Paginated, & Presents More Concisely

For AACT Admins Only: The AACT website page which lists information about all users has been enhanced; the information is now sortable & the table includes pagination. When the option to download user information as a CSV or Excel file is selected, the download contains only information that is of potential interest; attributes containing encrypted values no longer appear in the file. (This page is only accessible to AACT administrators.)

Assign All Database Users to a Read_Only Role

To simplify the management of user database accounts, we have created a role named 'read_only' in the AACT database and now assign all AACT users to this role. With this change, we are able to grant/revoke privileges to/from this one role rather than having to do it for each individual database user. (The search path must be specified for each individual user however, since it is not inheritable via the associated role.)


AACT 4.0.2 (September 12, 2018)

Monitor Database Activity

A process now records the total number of times each username submits a call to the public database. Currently, the process only collects information about the number of times a user makes a call to the database; it does not track the actual queries. The process uses a shell script to parse the public database logs every Sunday, counts the number of times each user posted a database event and saves this information to the db_user_activities table in the aact_admin database.
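
The counting step can be sketched as follows. The release uses a shell script; this Python version only illustrates the idea. The log layout is an assumption: the sketch expects each event line to contain a `user=<name>` field, which in practice depends on the server's log_line_prefix setting. The names `USER_FIELD` and `count_user_events` are hypothetical.

```python
import re
from collections import Counter

# Assumed log format: each event line contains "user=<name>". The real
# layout depends on the server's log_line_prefix setting.
USER_FIELD = re.compile(r"\buser=(\w+)")


def count_user_events(log_lines):
    """Count how many log lines are attributed to each database user."""
    counts = Counter()
    for line in log_lines:
        match = USER_FIELD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Lines without a user field, such as server housekeeping messages, are skipped, so only user activity is counted and no query text is retained.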

Four New Individual Participant Data (IPD) Columns Added to the Studies Table

Until now, the AACT database has provided two of the six data elements related to individual participant data (IPD) sharing: 1) a yes/no value indicating whether the study planned to share this information and 2) a description of the plan. On August 24, 2018, the National Library of Medicine (NLM) added the other four IPD-related attributes to the ClinicalTrials.gov API, so they are now available in the Studies table of the AACT database.

CalculatedValues.has_us_facility Now Includes US Territories

The 'has_us_facility' value saved to the CalculatedValues table is now set to 'true' for studies that have at least one facility in the United States or a US Territory. The decision to include US Territories was based on NIH's 'Checklist for Evaluating Whether a Clinical Trial or Study is an Applicable Clinical Trial (ACT) Under 42 CFR 11.22(b) for Clinical Trials Initiated on or After January 18, 2017'. (A country is considered a US Territory if it is one of those defined by the World Atlas.)
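
A rough Python sketch of the rule follows. The territory set is an illustrative subset, not the World Atlas list the note refers to, and the exact country strings stored in AACT are an assumption.

```python
# Illustrative subset only; the release note defers to the World Atlas list,
# and the exact country strings used in AACT are an assumption here.
US_AND_TERRITORIES = {
    "United States",
    "Puerto Rico",
    "Guam",
    "American Samoa",
    "Northern Mariana Islands",
    "Virgin Islands (U.S.)",
}


def has_us_facility(facility_countries):
    """True if at least one facility is in the US or a listed US territory."""
    return any(country in US_AND_TERRITORIES for country in facility_countries)
```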

Automatically Delete List of Daily Database Copies on the First of Each Month

After the full database refresh that happens on the first of each month, we delete the previous month's set of daily static database copies and pipe-delimited file sets. Until now, this process has been manual. A process has been implemented to automatically remove these files on the first of the month.
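
The cleanup rule can be sketched as a date filter. This is a minimal Python illustration with a hypothetical function name; it assumes each copy is identified by its creation date and that copies dated the first of a month are the permanent monthly archives described in the 3.0.2 notes.

```python
from datetime import date


def copies_to_delete(copy_dates, today):
    """Pick the daily copies from earlier months that can be removed.

    Copies dated the first of a month are the monthly archives and are kept.
    """
    current_month = (today.year, today.month)
    return [
        d for d in copy_dates
        if d.day != 1 and (d.year, d.month) < current_month
    ]
```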


AACT 4.0.1 (August 8, 2018)

Unnecessary Code Removed from AACT & AACT-ADMIN

With the release of AACT 4.0.0, we divided AACT into two separate applications: AACT & AACT-ADMIN. Some unnecessary code was left over in both apps. We've gone through and cleaned up the apps to remove superfluous code.

Reimplement Process to Backup User Info

Previously, user information was backed up as one of the final steps in the nightly data load process. Now administrative tasks are performed by the AACT-ADMIN application, so user info backups are no longer a part of the data load process. (The AACT-ADMIN application is now responsible for backing up user info and the AACT application for loading the database.) A cron job has been set up to back up user information every morning at 4am.

Improve Instructions for Recovering User Information

When user information is backed up each morning, AACT administrators receive an email message that includes the backup file attachments and instructions about how to recover info from these files. The instructions in this email have been improved.

Implement Scripts to Grant/Revoke Access to the Public Database

To simplify the process to grant/revoke user access to the public database, shell scripts have been created that can be quickly run to perform these tasks. The scripts are also used by rspec tests to confirm user maintenance functionality.

Fix Error in Instructions About Creating a Local Static Copy of the AACT Database

A user noted a critical error in the website documentation that describes how to create a local copy of the AACT database. The command to restore the database from a dump file downloaded from the website identified the default database 'postgres' rather than the aact database. The command has been corrected:

      

-> pg_restore -e -v -O -x --dbname=aact --no-owner --clean --create ~/Downloads/postgres_data.dmp


AACT 4.0.0 (July 29, 2018)

Divide AACT into Two Applications

AACT has been divided into two applications: one solely dedicated to populating the AACT relational database with data from ClinicalTrials.gov and the other to manage all other supporting functionality such as maintaining user accounts and hosting this website. Both applications use Ruby on Rails and PostgreSQL, and are publicly available on github:

Users will not be directly affected by this change; it simply makes it easier to support the system and positions AACT to be more easily replicated by other organizations/people.

Ensure that 'Removed Users' are Completely Removed

To comply with Article 17 of the General Data Protection Regulation (aka 'The Right to be Forgotten'), we have verified that AACT does not save any information about a user who has chosen to be removed from AACT.

Correct Data Dictionary

Tables added to AACT in version 3.1.2 (Documents & Pending_Results) are now defined in the Table Definition table on the Data Dictionary page of the AACT website.

Prevent Mixed-Case Database Usernames

PostgreSQL recognizes mixed-case objects and requires double quotes when managing such objects. To avoid confusion and complexity, we now prevent the creation of mixed-case database usernames.
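
A check of this kind can be sketched in a line or two. PostgreSQL folds unquoted identifiers to lowercase, so an all-lowercase name never needs quoting. The function name below is hypothetical, and the actual validation in AACT may apply additional rules.

```python
def is_valid_db_username(username):
    """Return True only for a non-empty name with no uppercase characters."""
    return bool(username) and username == username.lower()
```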

Add Technical Documentation

Added a page for technical documentation. (Accessible to AACT administrators only)

Add Instructions for Installing PostgreSQL on Windows 10

Added a page of instructions to stand up an instance of AACT on a Windows 10 machine.


AACT 3.1.2 (May 28, 2018)

Pending_Results Table Added

On May 9, 2018, the National Library of Medicine (NLM) added data about 'pending results' to the ClinicalTrials.gov API. A Pending_Results table has been added to the AACT database to present this new information.

NLM provides result submission date(s) for studies that have results awaiting quality control (QC) review. The results themselves are not publicly posted until the review is complete. The dates for three types of events related to results submission are reported in the Pending_Results table:

The NLM reports that the following updates occur to this information when a study passes the quality control review:

Documents Table Added

The ClinicalTrials.gov API provides information about & links to documents related to a study. NLM provides the following information about these data:

The full study protocol and statistical analysis plan must be uploaded as part of results information submission, for studies with a Primary Completion Date on or after January 18, 2017. The protocol and statistical analysis plan may be optionally uploaded before results information submission and updated with new versions, as needed. Informed consent forms may optionally be uploaded at any time.

AACT now saves this information to the Documents table. Please refer to NLM Results Data Element Definitions and the AACT Data Dictionary for more detailed information about study documents.

Deprecated Date Attributes Removed from AACT

On May 3, 2018, NLM posted this comment to their API schema documentation:

As promised in 08/30/2017 entry above, old redundant date names have been retired and their tags removed. Please update systems to stop using the date on the left in favor of the date on the right.

Obsolete tag                               Replacement tag
<firstreceived_date>                       <study_first_submitted>
<firstreceived_results_date>               <results_first_submitted>
<firstreceived_results_disposition_date>   <disposition_first_submitted>
<lastchanged_date>                         <last_update_submitted>

All these date attributes are stored in the Studies table. On January 22, 2018, the obsolete date tags/columns were identified as deprecated and new columns were added that mimic the new labels defined by NLM. The columns are:

Deprecated Column                      Replacement Column
first_received_date                    study_first_submitted_date
first_received_results_date            results_first_submitted_date
first_received_results_disposit_date   disposition_first_submitted_date
last_changed_date                      last_update_submitted_date

With this release, the deprecated columns have been removed.

Automated Tests Revised to Use Most Current Data from ClinicalTrials.gov

ClinicalTrials.gov has made changes to the API (adding new tags; removing deprecated tags), so we needed to update the studies used by automated test scripts; the tests need to use data that accurately represents the current structure of the ClinicalTrials.gov API. The latest versions of all test studies were downloaded and test scripts were updated to address all changes.

Automatically Email the User Restoration SQL to AACT Administrators

To ensure we're able to recover user account information if necessary, we have added a step to the nightly update process that extracts all data from user-related tables and user account information and emails this to AACT Administrators along with instructions about how to run the scripts to restore the information.

Web Page Listing Users Added (Admin Access Only)

A page to display all registered users has been added. It is only accessible to AACT administrators.

SAS Connection Documentation Modified

The documentation that explains how to use SAS to connect to AACT needed to be tweaked. The sample script was missing the line that identifies the user's password. We also fixed some awkward-looking fonts.

Process to Create 'ctgov' Schema Has Been Automated

All data retrieved from ClinicalTrials.gov is saved into a schema named 'ctgov'. Before, when standing up a new instance of the AACT database, we needed to manually create the ctgov schema, grant privileges to the database administrator and define 'ctgov' as the default schema. We have now modified the database initialization process so that the ctgov schema is created automatically and the tables, views and indexes are saved there without requiring any extra manual steps.


AACT 3.1.1 (May 9, 2018)

Fix bug in 'Forgot your password?' Feature

If a user forgot their password and clicked the link to receive an email to reset it, the process raised an error after they entered their password and confirmation password. This bug has been fixed.

Backup User Account Information

Prior to Version 3.1.0, the AACT database did not own any data; all information in AACT was retrieved from ClinicalTrials.gov. The database could be (and frequently was) wiped out and recreated from this data source.

With the introduction of a user registration feature, AACT is now the system of record for user account information and must therefore ensure copies of user-related information are backed up and can be restored if necessary. We've set up a daily pg_dump process to create copies of the admin database (which contains a table of Users), and a pg_dumpall --globals-only process to save the database accounts (username/password/access rights) created in the publicly accessible AACT database.

As noted, the only reason to back up the public AACT database is to ensure we have restorable copies of user accounts. Since the actual content of the database can be recovered from ClinicalTrials.gov, only account usernames, encrypted passwords and ACL information are backed up.


AACT 3.1.0 (May 7, 2018)

Implement User Registration Feature

With this release, users of the live AACT database will need to register and receive an individual user account to access the database. Individual accounts will replace the single common login-name/password (aact/aact) that has been used until now. To register and get a database account, please visit the AACT website and click Sign-Up in the upper right corner of any page.


The registration process is automated, using standard methods to verify the email address you provide. This should take about 5 minutes. If you have questions or encounter problems, please send email with the word 'registration' in the subject line to [email protected].

While your login-name & password will change from aact/aact to the login-name/password you define, all other connection information (hostname, database name, and port number) will remain the same.

The previous login-name/password (aact/aact) will remain active for several weeks while people become aware of this new requirement and have the chance to create and test their new database account.

User registration will allow us to contact people about scheduled downtimes and other events. It also helps us monitor and manage database activity.

You can download static copies of the database and the pipe-delimited flat file sets without creating an account; if you only use these resources, you need not register unless you wish to receive email notifications.

All AACT tables are now available in 'ctgov' schema (No longer in 'public' schema)

In preparation for future enhancements that will provide supplemental information to enhance/annotate ClinicalTrials.gov data, all current AACT tables (i.e., tables containing only data retrieved from ClinicalTrials.gov) have been moved to a schema named 'ctgov'. All database user accounts will define 'ctgov' as the default schema, so SQL queries need not specify this.

Queries created to run against the previous version of AACT that do not explicitly prefix table names with 'public.' should continue to run without needing any change. If, however, your queries have prepended 'public.' to the table names, you will need to either remove these prefixes or change them to 'ctgov.'
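
For saved queries that do use the 'public.' prefix, the change can be applied mechanically. The following Python sketch is one hedged way to do it; `use_ctgov_schema` is a hypothetical helper, and because it is a plain text substitution it would also rewrite 'public.' inside string literals or comments, so results should be reviewed.

```python
import re

# Matches the schema prefix "public." only when "public" starts a word,
# so an identifier such as "republic.x" is left alone.
PUBLIC_PREFIX = re.compile(r"\bpublic\.")


def use_ctgov_schema(sql):
    """Replace explicit 'public.' table prefixes with 'ctgov.'."""
    return PUBLIC_PREFIX.sub("ctgov.", sql)
```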

Note: This change has no impact on users of the pipe-delimited flat file extracts.


AACT 3.0.2 (April 2, 2018)

Post Daily Downloadable Versions of Static Database Copy and Pipe-Delimited File Set

Until now, downloadable copies of the AACT database (a static pg_dump copy and a set of 40 pipe-delimited flat files) have been created once a month and made available on the download page of the AACT website. Several people have expressed interest in getting these downloadable resources more frequently. As of this release, a static copy of the database and a set of pipe-delimited files are created & published to the download page after each nightly load.

To prevent the accumulation of hundreds of copies of the database through the year, these daily copies will be available for download only until the end of the month. Downloadable copies made on the first of the month will continue to be archived and made permanently available via the website. Both daily and monthly downloadable files can be retrieved from the download page of the AACT website.

(Prior to January, 2018, downloadable copies were created monthly, but not on the first. Going forward, these should be consistently created and dated on the first of each month.)

Publish the Database Update Schedule

A page displaying the schedule for updating the database has been added to the AACT website; it is accessible from a display card at the bottom of the 'Learn More' section of the site.

Retain History of Enumerations

Some columns contain a limited number of possible values; several such columns are enumerated in the Data Dictionary, displaying the total number of rows with each value and the percent distribution. For example, on February 28, 2018 the enumeration summary for Designs.primary_purpose was:

Sample enumeration

We are now saving enumeration information to an administrative table so that trends can be identified with the passage of time. This information will also help us verify the accuracy of updates by comparing current percent distributions to previous distributions. If values change dramatically, an alert is sent to AACT administrators.
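
As a rough illustration of this kind of check, the Python sketch below computes a percent distribution for a column and lists the values whose share moved by more than a threshold. The function names, the 5-percentage-point threshold, and the sample values are all assumptions made for the example; the notes do not state what AACT treats as a dramatic change.

```python
from collections import Counter


def percent_distribution(values):
    """Map each distinct value to its share of rows, in percent."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: 100.0 * n / total for value, n in counts.items()}


def changed_values(previous, current, threshold=5.0):
    """Return values whose share moved by more than `threshold` points.

    The threshold is a hypothetical setting, not AACT's actual rule.
    """
    keys = set(previous) | set(current)
    return sorted(
        (k for k in keys
         if abs(current.get(k, 0.0) - previous.get(k, 0.0)) > threshold),
        key=str,
    )


# Invented sample data, for illustration only.
previous = percent_distribution(["Treatment"] * 8 + ["Prevention"] * 2)
current = percent_distribution(["Treatment"] * 5 + ["Prevention"] * 5)
print(changed_values(previous, current))
```

A non-empty result is the condition under which an alert would be raised in this sketch.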


AACT 3.0.1 (March 12, 2018)

Made Improvements to Database Updater

We have improved the process that updates the AACT database by making the following changes:

Indexed NCT_ID Columns

Every table has an NCT_ID column that serves as the foreign key to the Studies table. These columns need to be indexed so that queries run within a reasonable amount of time. Until now, these indexes were missing.

Increased Diskspace on Database Server

The database server's 60 GB of disk space was inadequate, with usage exceeding 90%. We have upgraded the server's resources as follows:

Enhanced Release Notes Page

This 'Release Notes' page has been enhanced to include past release notes and facilitate documentation of future updates.

Switched Month/Year Date Conversion to Use Last Day of Month

If a date value includes only the month & year (no day), we save that value as a string in a column with the suffix 'month_year'. The value is also saved as a date-type value in a column with a _date suffix (example: Studies.start_month_year & Studies.start_date). Until now, we set the day to the first day of the month in these date-type conversions. A user noted that the last day of the month is a preferred value: “these dates (start, completion, primary_completion) define when registration & results are due. A missing day value that defaults to the 1st of the month is the most restrictive and the last of the month is the most generous – for the purposes of compliance assessments.” To be consistent, we made this change for all data elements that can provide just month/year.
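
The conversion can be sketched in Python as follows. This is not the AACT implementation (which is written in Ruby on Rails); the function name is hypothetical, and the sketch assumes the month/year strings look like 'February 2018' and that month names are parsed in an English locale.

```python
import calendar
from datetime import date, datetime


def month_year_to_date(month_year: str) -> date:
    """Convert a string such as 'February 2018' to the last day of that month."""
    # Assumption: the input format is '<full month name> <4-digit year>'.
    parsed = datetime.strptime(month_year.strip(), "%B %Y")
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(parsed.year, parsed.month)[1]
    return date(parsed.year, parsed.month, last_day)


print(month_year_to_date("February 2018"))
```

Using the calendar module rather than a fixed table of month lengths keeps leap years correct without extra logic.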

Added anticipated_posting_date to Outcomes Table

While changing the data value for month_year data elements, we noticed that the date-type value for Outcomes.anticipated_posting_date was not being provided. We have added this column to the Outcomes table.

Removed Previous Implementation of AACT

On February 9th, we decommissioned the AACT database hosted on Amazon Web Services and the AACT website hosted on Heroku.


AACT 3.0 (January 22, 2018)

The primary objective for this release is to move the website, database and related code to servers hosted by Duke University and DigitalOcean in order to provide users with a static IP address for the database and to reduce monthly costs for hosting platforms. Below is a more detailed list of changes.

New Server Platforms

The previous version of the AACT database was hosted on the Amazon Web Services (AWS) Relational Database Service (RDS); the AACT website and data processes were hosted on a Heroku server. As of January 22, 2018, the AACT public database is hosted on a DigitalOcean server, and the website, supporting databases and all system software reside on virtual Linux servers maintained by Duke University's Office of Information Technology.

Website and Data Processing Server:

Database Server:

Advantages of this configuration:

Updates Performed as Background Process

In the previous version of AACT, the public database was taken down each evening for about one hour to apply all the changes that had been made in ClinicalTrials.gov that day. Periodically, a full refresh of the database was conducted; this process took approximately 15 hours, during which time the database was inaccessible. To minimize such downtime, the load process has been reconfigured so that a background database is updated while the publicly accessible AACT database remains available. When the process completes, the publicly accessible version of the database is restored (via pg_restore), which takes less than 5 minutes. This model also allows us to verify that the load process was successful before the public database is updated.

New Data Elements

On August 30, 2017, the National Library of Medicine (NLM) began providing a new set of dates for each clinical trial via the ClinicalTrials.gov API. The Studies table in AACT has been adapted to include these new date-type data elements:

String-type data elements added:

NLM deprecated four date elements (displayed in the left column of the table below) and recommended that users start using the alternative date element (on the right). NLM wrote: "Some existing dates are now redundant. They will be kept for some time to provide an opportunity for users of the XML to update their systems before being removed at a later date, probably in 2018."

Deprecated element                      Recommended replacement
first_received_date                     study_first_submitted_date
first_received_results_date             results_first_submitted_date
first_received_results_disposit_date    disposition_first_submitted_date
last_changed_date                       last_update_submitted_date

AACT continues to provide the deprecated data elements. They will continue to be available in AACT until NLM removes them from their API.
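
For code that still refers to the deprecated elements, the four pairs listed above can be kept in a simple lookup. The Python sketch below is only an illustration; the dictionary and function names are hypothetical, while the column names are the ones given in these notes.

```python
# Deprecated date element -> replacement recommended by NLM (from the list above).
DEPRECATED_TO_RECOMMENDED = {
    "first_received_date": "study_first_submitted_date",
    "first_received_results_date": "results_first_submitted_date",
    "first_received_results_disposit_date": "disposition_first_submitted_date",
    "last_changed_date": "last_update_submitted_date",
}


def recommended_column(column: str) -> str:
    """Return the recommended replacement, or the column itself if not deprecated."""
    return DEPRECATED_TO_RECOMMENDED.get(column, column)


print(recommended_column("last_changed_date"))
```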

Software Upgrade

AACT has been upgraded to Ruby 2.4.0 & Rails 4.2.9 (Previously: Ruby 2.2.3 & Rails 4.2.7.1)


AACT 2.0.5 (July 26, 2017)

We now reboot the database before launching the full load to disconnect user connections. Previously, the full load would hang if active sessions were running, waiting for a quiet database before it would start.


AACT 2.0.4 (April 14, 2017)

A Use Case Gallery has been added to the AACT website.

References on the website to static copies of the AACT database are now called 'static database copies' instead of 'snapshots'. Using 'snapshots' to refer to static copies of the database was confusing because this term has always been used to refer to the annual set of visualizations that summarize (snapshot) the 'state of clinical trials'.

The database refresh failed when executing the final step, which retrieves logging information from AWS. When it tried to read the log file error/postgresql.log.2017-03-08-20, AWS raised the error: This file contains binary data and should be downloaded instead of viewed. (Service: AmazonRDS; Status Code: 400; Error Code: InvalidParameterValue; Request ID: c3ff20fc-05a1-11e7-96d9-2dc5508b92a3) We now catch this error and skip over it.


AACT 2.0.3 (March 4, 2017)

We reviewed database activity to identify suspicious activity and created a preliminary instance of the AWS suppression list to block potential hackers.

The footer on each page of the AACT website includes: 'Read our Citation Policy here', but the actual link (https://www.ctti-clinicaltrials.org/briefing-room/citation-policy) was missing. This has been fixed.


AACT 2.0.2 (February 17, 2017)

A Public Announcement feature has been added to provide AACT administrators with the ability to dynamically publish temporary information on the AACT website. For example, when the database is temporarily down because it's being refreshed, we now notify users by posting a public announcement for the duration of the downtime.

A feature to interrogate AWS database log files has been added which saves information about database activity to an administrative table in AACT. We are now better able to monitor database use.

All administrative tables have been moved out of the public AACT database and into a separate database (aact_admin) which is accessible to AACT administrators only. Admin tables are:

CalculatedValue.has_us_facility was incorrectly set to false during incremental/nightly loads. This has been fixed.


AACT 2.0.1 (February 7, 2017)

The nightly incremental load was not finding all the added & changed trials from the ClinicalTrials.gov RSS feed. We now send 2 RSS calls to ClinicalTrials.gov to get them all. Also, if a call to the ClinicalTrials.gov API times out, it now tries 5 times before giving up.
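
The retry behavior can be sketched in Python as below. This is not the AACT code: the helper name call_with_retries is hypothetical, the callable passed in stands for whatever function performs the API request, and the sketch assumes that a timeout surfaces as Python's built-in TimeoutError.

```python
def call_with_retries(fetch, attempts=5):
    """Call `fetch()` up to `attempts` times, retrying only on timeouts."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except TimeoutError as error:
            # Remember the failure and try again until attempts run out.
            last_error = error
    # Every attempt timed out: give up and surface the last error.
    raise last_error


def example_fetch():
    # Placeholder for a real API request; always succeeds in this sketch.
    return "response"


print(call_with_retries(example_fetch))
```

Errors other than timeouts are not retried here and propagate immediately, so unrelated failures are not hidden by the retry loop.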

The set of pipe-delimited files was not being generated as expected because the process aborted when it tried to create an index on a non-existent column: Calculated_Values.sponsor_type. This problem has been fixed.

We have added a table to the Data Dictionary page to summarize all AACT database tables and provide their current row counts.

An enhancement has been made to the Data Dictionary page: the enumerations column in the table now displays the percentage for each element in the dropdown.

The Guide for Researchers now provides the effective date (January 18, 2017) for the NIH's recently published policy.

Mailgun was re-configured to belong to CTTI. It had previously been registered under StudyCo.


AACT 2.0 (January 31, 2017)

This release represents a significant upgrade that aims to make AACT easier to access and use. Since 2010, the AACT database has been published twice a year as a package that would be current as of a particular date: March 27 for the first annual installation, and September 27 for the second.

The package contained the content of ClinicalTrials.gov as 1) an Oracle database instance, 2) a set of SAS cport files & 3) a set of pipe-delimited files. It also included documentation in spreadsheets. Each package was made available to the public on the CTTI website. These packages remain available here. Until now, the use of AACT involved download/setup that required relatively sophisticated technical skills. AACT users also reported that the information was not current enough and the documentation difficult to use.

Until now, the code that generates the database has been proprietary and inaccessible to others who might want to replicate the process. The code used to create the AACT database and website is now publicly available on GitHub. In summary, we have rewritten AACT to make it easier to access and understand, and to encourage others to replicate and make use of any aspect of it.

Direct Access to Cloud-based Version of the Database

The AACT database is immediately accessible in the cloud, eliminating the need for users to download and install the data.

Static Database Copies

Each month, a static copy of the AACT database is saved and made available for download. The database platform, Postgres, is a popular, free, open-source database that requires less technical know-how to set up than larger platforms such as Oracle.

Simplified Schema Design

The database schema has been simplified and employs consistent naming and design conventions.

Online Documentation

Documentation has been moved from spreadsheets to this website, and provides instructions on how to access and use AACT with a variety of popular desktop applications including SAS, R, Tableau, and PostgreSQL tools.

Calculated Values Table

A 'Calculated Values' table provides commonly-used, pre-computed values for each study such as total number of facilities and number of months to report results.

All Open-Source

The public is free to download and recreate the full system or any part of it. All related code (Ruby on Rails) is available on GitHub. This includes the processes that pull data from ClinicalTrials.gov and populate the PostgreSQL database.

Providing the public with direct, query-able access to a database in the cloud is not a common model, and we have yet to determine how well it will serve hundreds or thousands of simultaneous users; however, AWS cloud services provide the most promising alternative for scalable solutions. Another notable challenge has been the time required (~15 hours) to load 220,000+ studies. With recent regulatory changes, it’s likely the amount of data in ClinicalTrials.gov will grow at a faster rate; therefore, CTTI continues to investigate ways to improve performance and reliability.

A beta version was released on October 1, 2016. Existing AACT users were asked to test the new version and their advice/suggestions were considered and implemented through the end of 2016. The official launch occurred January 31, 2017, just in time for the HHS ‘final rules’ to take effect.