REDCapDM - Data reading and processing
This vignette summarizes the most common, straightforward ways of using REDCapDM to work with REDCap data.
Read data
To import data from REDCap, you can use the redcap_data() function, which provides two primary methods: importing data from local files or establishing an API connection.
Local files
Before starting, ensure you have the required R and CSV files exported from REDCap, including the instrument-event mappings file. All these files should be in the same directory for the package to work correctly.
Use the data_path and dic_path arguments to indicate the paths to your R data file and your REDCap project’s dictionary file, respectively. If your REDCap project is longitudinal, you’ll additionally need to supply the event-form mapping file using the event_path argument.
dataset <- redcap_data(data_path = "C:/Users/username/example.r",
                       dic_path = "C:/Users/username/example_dictionary.csv",
                       event_path = "C:/Users/username/events.csv")
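Since the import depends on all exported files being present, it can help to verify the paths before calling redcap_data(). The following is a minimal sketch in base R; check_redcap_files() is a hypothetical helper, not part of REDCapDM, and the paths in the usage comment are the placeholders from the example above.

```r
# Hypothetical helper (not part of REDCapDM): stop early if any exported
# file is missing; otherwise return the checked paths invisibly.
check_redcap_files <- function(data_path, dic_path, event_path = NULL) {
  # A NULL event_path is dropped by c(), so it is simply not checked
  paths <- c(data = data_path, dictionary = dic_path, events = event_path)
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("File(s) not found: ", paste(missing, collapse = ", "))
  }
  invisible(paths)
}

# Intended use (not run here, the paths are placeholders):
# check_redcap_files("C:/Users/username/example.r",
#                    "C:/Users/username/example_dictionary.csv",
#                    "C:/Users/username/events.csv")
```

Failing with the list of missing paths should make it easier to tell a wrong path apart from a problem inside the import itself.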
API connection
If you opt for an API connection, you can provide the uri (uniform resource identifier) and token (user-specific password) for your REDCap project. This method will automatically retrieve the event-form mapping if your project is longitudinal.
Use both arguments to set up the API connection and import the data:
dataset_api <- redcap_data(uri = "https://redcap.idibell.cat/api/",
                           token = "55E5C3D1E83213ADA2182A4BFDEA")
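Because the token works like a password, you may prefer not to write it directly in a script. A minimal sketch, assuming the token has been stored in an environment variable beforehand; the variable name REDCAP_TOKEN and the helper get_redcap_token() are illustrative choices, not something REDCapDM requires.

```r
# Hypothetical helper: read the API token from an environment variable
# (the name "REDCAP_TOKEN" is an assumption) and fail clearly if it is unset.
get_redcap_token <- function(var = "REDCAP_TOKEN") {
  token <- Sys.getenv(var, unset = "")
  if (!nzchar(token)) {
    stop("Environment variable '", var, "' is not set.")
  }
  token
}

# Intended use (not run here, it needs access to a REDCap server):
# dataset_api <- redcap_data(uri = "https://redcap.idibell.cat/api/",
#                            token = get_redcap_token())
```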
Output
The redcap_data() function returns a list with three elements:
Imported data: Contains the data from your REDCap project.
Dictionary: Provides information about variables and their associated labels.
Event-form mapping (only available for longitudinal projects): Describes the correspondence between events and forms in your project.
Process data
Having successfully imported your data into R, you can now use the rd_transform() function to start processing it.
This function performs several transformations:
Elimination of selected variables
Elimination of variables containing certain patterns such as '_complete' and '_timestamp'
Recalculation of REDCap calculated fields
Checkbox transformation by changing their names to the names of their options
Replacement of the original variables with their factor version
Branching logic transformation, converting REDCap logic to R logic
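To make two of these steps more concrete, here is a rough base R sketch of what pattern-based deletion and branching logic conversion amount to. It is an illustration only, not the code rd_transform() runs internally; drop_by_pattern() and redcap_to_r() are hypothetical helpers, and the logic converter only handles simple expressions (bracketed field names, =, <>, and, or).

```r
# Hypothetical helper: drop every column whose name contains one of the patterns
drop_by_pattern <- function(data, patterns = c("_complete", "_timestamp")) {
  keep <- !grepl(paste(patterns, collapse = "|"), names(data))
  data[, keep, drop = FALSE]
}

# Hypothetical helper: rewrite a simple REDCap logic string as R syntax
redcap_to_r <- function(logic) {
  out <- gsub("\\[([a-z0-9_]+)\\]", "\\1", logic, perl = TRUE)  # [field] -> field
  out <- gsub("(?<![<>!=])=(?!=)", "==", out, perl = TRUE)      # =  -> ==
  out <- gsub("<>", "!=", out, fixed = TRUE)                    # <> -> !=
  out <- gsub("\\band\\b", "&", out, perl = TRUE)               # and -> &
  out <- gsub("\\bor\\b", "|", out, perl = TRUE)                # or  -> |
  out
}

# Made-up example data, for illustration only
df <- data.frame(record_id = 1:2, demographics_complete = c(2, 2), age = c(30, 40))
names(drop_by_pattern(df))
redcap_to_r("[sex] = '1' and [age] >= 18")
```

The string replacement is deliberately naive: it would also rewrite the words "and" or "or" if they appeared inside a quoted value, which is one reason to rely on the package's own conversion for real projects.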
Standard
The only essential elements that must be supplied are the dataset to be transformed and the corresponding dictionary. For a longitudinal project, it is advisable to also specify the event-form mapping to take full advantage of this function. These elements can be passed directly as the list returned by the redcap_data() function or separately using distinct arguments:
# Option A: list object
covican_transformed <- rd_transform(covican)
# Option B: separately with different arguments
covican_transformed <- rd_transform(data = covican$data,
                                    dic = covican$dictionary,
                                    event_form = covican$event_form)
This function returns a list containing the transformed dataset, dictionary, event_form and the results of each transformation. To retrieve the results of the transformation, use the following code block:
#Print the results of the transformation
covican_transformed$results
1. Removing selected variables
2. Deleting variables that contain some patterns
3. Recalculating calculated fields and saving them as '[field_name]_recalc'
| Total calculated fields | Non-transcribed fields | Recalculated different fields |
|:-----------------------:|:----------------------:|:-----------------------------:|
| 2 | 0 (0%) | 1 (50%) |
| field_name | Transcribed? | Is equal? |
|:-------------------:|:------------:|:---------:|
| age | Yes | FALSE |
| screening_fail_crit | Yes | TRUE |
4. Transforming checkboxes: changing their values to No/Yes and changing their names to the names of its options. For checkboxes that have a branching logic, when the logic is missing their values will be set to missing
Table: Checkbox variables advisable to be reviewed
| Variables without any branching logic |
|:-------------------------------------:|
| type_underlying_disease |
5. Replacing original variables for their factor version
6. Converting every branching logic in the dictionary into R logic
By event
If the REDCap project is longitudinal, you can further adjust the structure of the transformed dataset. For example, it can be split by event:
dataset <- rd_transform(covican,
                        final_format = "by_event")
The transformed dataset is then a tibble containing one data frame for each event in the REDCap project.
dataset$data
#> # A tibble: 2 × 3
#> events vars df
#> <chr> <list> <list>
#> 1 baseline_visit_arm_1 <chr [31]> <df [190 × 36]>
#> 2 follow_up_visit_da_arm_1 <chr [4]> <df [152 × 9]>
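To work with a single event, you can pull its data frame out of the df list column. A minimal sketch in base R, assuming the column names events and df shown in the printed output; get_event_data() is a hypothetical helper, not a REDCapDM function.

```r
# Hypothetical helper: return the data frame stored for one event in the
# nested table (columns `events` and `df`, as in the printed output).
get_event_data <- function(nested, event) {
  idx <- match(event, nested$events)
  if (is.na(idx)) {
    stop("Unknown event: ", event)
  }
  nested$df[[idx]]
}

# Intended use with the by-event object (not run here):
# baseline <- get_event_data(dataset$data, "baseline_visit_arm_1")
```

Looking the event up by name rather than by row position keeps the code valid if the order of events changes.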
By form
Alternatively, it can be split by form:
dataset <- rd_transform(covican,
                        final_format = "by_form")
Here the tibble is composed of data frames, one for each form in the REDCap project.
dataset$data
#> # A tibble: 7 × 4
#> form events vars df
#> <chr> <list> <list> <list>
#> 1 inclusionexclusion_criteria <chr [1]> <glue [7]> <df [190 × 11]>
#> 2 demographics <chr [1]> <glue [4]> <df [190 × 9]>
#> 3 comorbidities <chr [1]> <glue [5]> <df [190 × 10]>
#> 4 cancer <chr [1]> <glue [11]> <df [190 × 16]>
#> 5 vital_signs <chr [2]> <glue [2]> <df [342 × 7]>
#> 6 laboratory_findings <chr [2]> <glue [2]> <df [342 × 7]>
#> 7 microbiological_studies <chr [1]> <glue [1]> <df [190 × 6]>
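If you prefer to address forms by name, the nested table can be turned into a named list of data frames. This is a sketch in base R that assumes the form and df columns shown in the printed output; forms_as_list() is a hypothetical helper, not part of REDCapDM.

```r
# Hypothetical helper: name each data frame in the `df` list column after
# the corresponding entry of the `form` column.
forms_as_list <- function(nested) {
  stats::setNames(nested$df, nested$form)
}

# Intended use with the by-form object (not run here):
# forms <- forms_as_list(dataset$data)
# forms$demographics             # data frame for one form
# vapply(forms, nrow, integer(1)) # number of rows per form
```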
For more information, consult the complete vignette available at: https://bruigtp.github.io/REDCapDM/articles/REDCapDM.html