The purpose of this package is to simplify the reprocessing of accelerometer data for the project "The relationship between sleep and physical activity across the lifespan". The package wraps functions from GGIR and adds functions that make collating the results easier.
This page will guide you through the process. How you reprocess the data will depend on a number of factors.
The project is led by Dr Timothy Hartwig, and this package is developed by Dr Taren Sanders. Please reach out if you need any support.
One of the goals for this package was to make reprocessing data as easy as possible. If your study meets all of the criteria below, you should be able to reprocess your data with just a single call to `reprocess()`.
Criteria
You have the raw accelerometer files (e.g., GENEActiv `.bin` or `.csv`, Actigraph raw `.csv`, or Axivity `.cwa`).
`reprocess()`
First, install the package. Since the package is not on CRAN, you will need `devtools::install_github()` or `remotes::install_github()`. You will need to install one of these packages if you do not already have one. I recommend `devtools`.
# install.packages("devtools") # uncomment if needed
devtools::install_github("Motivation-and-Behaviour/sleepIPD")
Next, provide sleepIPD with the information about how your data are stored, and information about the sample. The essential information is:
Parameter | Description |
---|---|
datadir | Where are the data stored? |
outputdir | Where can we safely store the outputs? Note: do not use the same location as data previously processed in GGIR, or you risk overwriting your existing outputs. A new output folder just for this study would be sensible. |
studyname | What should we refer to this sample as? You can always use your last name. |
ages | How old is the sample? Can be either `"children"` (for those aged 18 or under), `"adults"` (for those over 18), or both (i.e., `c("children", "adults")`). If you do not specify, both are used. |
device | What device did you use? Can be either `"geneactiv"`, `"actigraph"`, or (in the unlikely event two devices were used) both (i.e., `c("geneactiv", "actigraph")`). If you do not specify, both are used. Note that you can use `"geneactiv"` if your device was Axivity. |
wear_location | How did the sample wear the device? Can be either `"wrist"` (the default) or `"hip"`. In the unlikely event multiple locations were used within the same study, you need to run the function twice, once with each location. You would also need to give the two runs different study names or output locations so they don't collide. |
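Before starting a long run, it can help to check the arguments against the allowed values in the table above. The following is a minimal sketch in base R. `check_reprocess_args()` is a hypothetical helper written for this page, not a function exported by sleepIPD, and the checks it performs are assumptions about what is worth verifying.

```r
# Hypothetical helper (not part of sleepIPD): checks arguments against the
# values listed in the parameter table before a long run is started.
check_reprocess_args <- function(datadir, outputdir, studyname,
                                 ages = c("children", "adults"),
                                 device = c("geneactiv", "actigraph"),
                                 wear_location = c("wrist", "hip")) {
  # several.ok = TRUE allows both values, mirroring the documented default
  ages <- match.arg(ages, c("children", "adults"), several.ok = TRUE)
  device <- match.arg(device, c("geneactiv", "actigraph"), several.ok = TRUE)
  # Only one wear location per run; the first choice ("wrist") is the default
  wear_location <- match.arg(wear_location, c("wrist", "hip"))

  if (!dir.exists(datadir)) {
    stop("datadir does not exist: ", datadir)
  }
  if (!nzchar(studyname)) {
    stop("studyname must not be empty")
  }

  list(datadir = datadir, outputdir = outputdir, studyname = studyname,
       ages = ages, device = device, wear_location = wear_location)
}
```

The returned list has the same names as the `reprocess()` arguments described above, so it could be passed on with `do.call()` once the checks succeed.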
An example call might look like:
library(sleepIPD)
datadir <- "C:\\Documents\\My Study\\Data"
outputdir <- "C:\\Documents\\My Study\\Output"
studyname <- "Hartwig 2021"
ages <- "children"
device <- "geneactiv"
wear_location <- "wrist"
reprocess(datadir = datadir,
outputdir = outputdir,
studyname = studyname,
ages = ages,
device = device,
wear_location = wear_location)
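For the less common case mentioned in the parameter table, where one study used more than one wear location, the two runs need distinct study names or output folders. The sketch below builds one argument list per location. It assumes the raw files for each location sit in their own subfolder; the folder layout and study names are illustrative only.

```r
# Illustrative values only; the folder layout and study names are assumptions.
base_dir <- "C:/Documents/My Study"
locations <- c("wrist", "hip")

# One argument list per wear location, each with its own study name and
# output folder so the two runs cannot overwrite each other.
runs <- lapply(locations, function(loc) {
  list(
    datadir = file.path(base_dir, "Data", loc),
    outputdir = file.path(base_dir, "Output", loc),
    studyname = paste("Hartwig 2021", loc),
    ages = "children",
    device = "geneactiv",
    wear_location = loc
  )
})
names(runs) <- locations

# Each run could then be started with, for example:
# do.call(sleepIPD::reprocess, runs$wrist)
```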
Note that if you discover you have made a mistake and need to run this again, you'll need to additionally set `overwrite = TRUE`.
What Does `reprocess()` Do?
`reprocess()` performs three steps, passing the data from one step to the next.
The first step is the largest. Here, sleepIPD calls `process_files()`, which, among other things, uses `GGIR::g.shell.GGIR()` to reprocess the data. This is where >99% of the time will be spent.
GGIR can produce a lot of output files, but we only really need the summaries. Even so, depending on the configuration, there could be more than 1000 summary files. In this second step, sleepIPD calls `collate_outputs()`, which finds all of the output files and collapses them into just three files. These three files are stored in a new folder called `sleepIPD_output`, which will be located under the folder you specify in `outputdir`.
For example, if `outputdir` is `"C:\\Documents\\My Study\\Output"`, the collated files will be stored in `"C:\\Documents\\My Study\\Output\\sleepIPD_output"`.
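As a small sketch of the path rule just described, the collated folder can be located from R with `file.path()`. The `outputdir` below is a temporary stand-in, so substitute the folder you actually passed to `reprocess()`.

```r
# Stand-in for your real outputdir
outputdir <- file.path(tempdir(), "Output")

# The collated files live in a sleepIPD_output folder under outputdir
collated_dir <- file.path(outputdir, "sleepIPD_output")

if (dir.exists(collated_dir)) {
  print(list.files(collated_dir))
} else {
  message("No collated output found yet at: ", collated_dir)
}
```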
To perform the integrated analysis, we will need additional information about the sample, such as demographics. To make this simpler, sleepIPD calls `build_meta()`, which creates the metadata template file and pre-fills it with the names of the files and the wear location. You can then add the remaining information, either by matching it to an existing spreadsheet or by filling it in manually.
The metadata template will be located in the folder you specify as the `outputdir`, and will be named `<studyname>-metadata.xlsx`. The Metadata sheet contains the information to fill out, and the other sheets explain the expected format.
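If you take the "match to an existing spreadsheet" route, a left join on the file name is one way to do it. The sketch below uses small stand-in data frames rather than the real template. The column names (`filename`, `age`, `sex`) are assumptions for illustration, so check the Metadata sheet for the actual names.

```r
# Stand-in for the pre-filled template; these column names are assumptions.
template <- data.frame(
  filename = c("p001.bin", "p002.bin", "p003.bin"),
  wear_location = "wrist",
  stringsAsFactors = FALSE
)

# Stand-in for an existing study spreadsheet keyed on the same file names.
demographics <- data.frame(
  filename = c("p001.bin", "p002.bin"),
  age = c(9, 11),
  sex = c("F", "M"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every accelerometer file, so files with no demographic
# match show up as NA rather than silently disappearing.
filled <- merge(template, demographics, by = "filename", all.x = TRUE)

# Files that still need their details filled in by hand
unmatched <- filled$filename[is.na(filled$age)]
print(unmatched)
```

Listing the unmatched files before submitting makes it easier to spot naming mismatches early.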
`sleepIPD` wraps `GGIR::g.shell.GGIR()` for the processing. If you are manually making calls to GGIR (i.e., not using this package), the settings are provided below. You must use GGIR Version 2.5-1 for this project. There is some evidence that different versions of GGIR can produce slightly different results. Please check your version before you begin (`utils::packageVersion("GGIR")`).
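The version check can also be scripted. This is a minimal sketch: `ggir_version_ok()` is a hypothetical helper name, and the rest uses only base R and `utils`.

```r
# R treats "2.5-1" and "2.5.1" as the same version
required <- as.package_version("2.5-1")

# Hypothetical helper: TRUE only when the supplied version matches exactly.
ggir_version_ok <- function(installed) {
  as.package_version(installed) == required
}

if (requireNamespace("GGIR", quietly = TRUE)) {
  installed <- utils::packageVersion("GGIR")
  if (!ggir_version_ok(installed)) {
    warning("GGIR ", as.character(installed),
            " is installed, but this project requires 2.5-1")
  }
} else {
  message("GGIR is not installed")
}
```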
The full `GGIR::g.shell.GGIR()` call used in `sleepIPD` is:
GGIR::g.shell.GGIR(
mode = c(1, 2, 3, 4, 5),
datadir = datadir,
outputdir = outputdir,
do.report = c(2, 4, 5),
overwrite = overwrite,
do.enmo = TRUE,
do.hfen = TRUE,
do.enmoa = TRUE,
studyname = studyname,
# =====================
# Part 2
# =====================
do.imp = TRUE,
strategy = 2,
maxdur = 0,
includedaycrit = 16,
qwindow = c(0, 6, 12, 18, 24),
mvpathreshold = threshold_mod,
bout.metric = 4,
excludefirstlast = FALSE,
includenightcrit = 16,
MX.ig.min.dur = 10,
iglevels = 50,
ilevels = seq(0, 4000, 50),
epochvalues2csv = FALSE,
winhr = c(16, # Most active 16 hours
c(60 / 60), # Most active hour
c(30 / 60), # Most active half hour
c(15 / 60), # Most active 15 min
c(10 / 60), # Most active 10 min
c(5 / 60) # Most active 5 min
),
qlevels = c(
960 / 1440, # 2/3rds of the day
1320 / 1440, # Top 120min
1380 / 1440, # Top 60min
1410 / 1440, # Top 30min
1425 / 1440, # Top 15min
1435 / 1440 # Top 5min
),
# =====================
# Part 3 + 4
# =====================
def.noc.sleep = 1,
outliers.only = TRUE,
criterror = 4,
do.visual = TRUE,
# =====================
# Part 5
# =====================
threshold.lig = threshold_lig,
threshold.mod = threshold_mod,
threshold.vig = threshold_vig,
boutcriter = 0.8, boutcriter.in = 0.9, boutcriter.lig = 0.8,
boutcriter.mvpa = 0.8, boutdur.in = c(1, 10, 30), boutdur.lig = c(1, 10),
boutdur.mvpa = c(1),
includedaycrit.part5 = 2 / 3,
# =====================
# Visual report
# =====================
timewindow = c("WW", "MM"),
visualreport = FALSE,
...
)
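The `qlevels` fractions in the call above follow a simple pattern: each is the share of the 1440 minutes in a day that falls below a "top N minutes" window. The short sketch below reproduces them from the window lengths; the variable names are for illustration only.

```r
minutes_per_day <- 24 * 60  # 1440

# Length of each "most active" window, in minutes.
# 480 min is the top third of the day, which leaves 2/3rds below it.
top_minutes <- c(480, 120, 60, 30, 15, 5)

# Fraction of the day that lies below each window
qlevels <- (minutes_per_day - top_minutes) / minutes_per_day

print(round(qlevels, 4))
```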
Note that `datadir`, `outputdir`, `overwrite`, and `studyname` are set elsewhere in `sleepIPD`. The thresholds (i.e., `threshold_lig`, `threshold_mod`, and `threshold_vig`) will depend on the device you are using, the age of the sample, and where the accelerometer was worn. See details below.
Age¹ | Device² | Worn on | Light threshold | Moderate threshold | Vigorous threshold | Sources |
---|---|---|---|---|---|---|
children | geneactiv | wrist | 51.6 | 191.6 | 695.8 | Light: Hurter et al., 2018 |
children | actigraph | wrist | 48.1 | 201.4 | 707 | Light: Hurter et al., 2018 |
children | geneactiv | hip | 64.1 | 152.8 | 514.3 | Light: Hildebrand et al., 2016 |
children | actigraph | hip | 32.6 | 142.6 | 464.6 | Light: Hurter et al., 2018 |
adults | geneactiv | wrist | 45.8 | 93.2 | 418.3 | Light: Hildebrand et al., 2016 |
adults | actigraph | wrist | 44.8 | 100.6 | 428.8 | Light: Hildebrand et al., 2016 |
adults | geneactiv | hip | 46.9 | 68.7 | 266.8 | Light: Hildebrand et al., 2016 |
adults | actigraph | hip | 47.4 | 61.1 | 258.7 | Light: Hildebrand et al., 2016 |
¹ Children: ≤18 years; Adults: >18 years.
² Geneactiv thresholds can also be applied to Axivity devices.
If you have data that would fall into more than one row (e.g., your sample includes both children and adults), you can add multiple cut-points and we will extract the right values later based on the metadata.
As an example, if your study included children wearing geneactiv devices on their wrists, you would include the following lines in your GGIR call:
...
threshold.lig = c(51.6),
threshold.mod = c(191.6),
threshold.vig = c(695.8),
...
If your study included both children and adults wearing geneactiv devices on their wrists, your call would include:
...
threshold.lig = c(45.8, 51.6),
threshold.mod = c(93.2, 191.6),
threshold.vig = c(418.3, 695.8),
...
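If you are assembling these arguments by hand, a small lookup can reduce transcription errors. The sketch below copies the cut-points from the thresholds table on this page into a data frame and collects the values for every matching row. `ggir_thresholds()` is a hypothetical helper, not part of sleepIPD, and sorting the values in ascending order is an assumption based on the two examples above.

```r
# Cut-points transcribed from the thresholds table on this page.
cutpoints <- data.frame(
  age      = rep(c("children", "adults"), each = 4),
  device   = rep(c("geneactiv", "actigraph"), times = 4),
  worn_on  = rep(rep(c("wrist", "hip"), each = 2), times = 2),
  light    = c(51.6, 48.1, 64.1, 32.6, 45.8, 44.8, 46.9, 47.4),
  moderate = c(191.6, 201.4, 152.8, 142.6, 93.2, 100.6, 68.7, 61.1),
  vigorous = c(695.8, 707.0, 514.3, 464.6, 418.3, 428.8, 266.8, 258.7),
  stringsAsFactors = FALSE
)

# Hypothetical helper: gathers the cut-points for every matching table row.
ggir_thresholds <- function(ages, devices, worn_on) {
  rows <- cutpoints[cutpoints$age %in% ages &
                      cutpoints$device %in% devices &
                      cutpoints$worn_on %in% worn_on, ]
  list(
    threshold.lig = sort(unique(rows$light)),
    threshold.mod = sort(unique(rows$moderate)),
    threshold.vig = sort(unique(rows$vigorous))
  )
}

# Children and adults wearing geneactiv devices on the wrist
str(ggir_thresholds(c("children", "adults"), "geneactiv", "wrist"))
```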
`sleepIPD` Helper Functions
Even if you do not use `sleepIPD` for the reprocessing, you may still find the helper functions useful. The two main helper functions are `collate_outputs()` and `build_meta()`.
collate_outputs()
`collate_outputs()` will find all of the Part 2, 4, and 5 files, collapse them so that there is one file for each part, and move that file to the top level of the output directory. This reduces the number of files you need to submit for the analysis. You can use `collate_outputs()` like so:
collate_outputs(
outputdir = "C:/MyFiles", # The location you used in GGIR as outputdir
studyname = "Hartwig 2016" # A study name. Author surname and year is fine
)
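To give a feel for what "collapsing" many summary files into one involves, here is a self-contained base R sketch. It does not reproduce the package's implementation: the folder layout, file names, and columns are invented for the demonstration, and everything is written to a temporary folder.

```r
# Build a throwaway folder tree with two per-recording summary files.
# The layout, file names, and columns are invented for this demonstration.
root <- file.path(tempdir(), "collate_demo")
dir.create(file.path(root, "rec1"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "rec2"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(ID = "p001", mean_ENMO = 31.2),
          file.path(root, "rec1", "part2_summary.csv"), row.names = FALSE)
write.csv(data.frame(ID = "p002", mean_ENMO = 27.5),
          file.path(root, "rec2", "part2_summary.csv"), row.names = FALSE)

# Find every matching file below the root, read each one, and stack the rows.
files <- list.files(root, pattern = "^part2_summary\\.csv$",
                    recursive = TRUE, full.names = TRUE)
collated <- do.call(rbind, lapply(files, read.csv))

# Write a single combined file at the top level of the folder.
write.csv(collated, file.path(root, "part2_collated.csv"), row.names = FALSE)
```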
build_meta()
`build_meta()` will create a metadata template for you. Additionally, it will search the output of `collate_outputs()` for the names of files, and pre-fill the metadata template with this information. The goal is to reduce the rate of unmatched data, which was a problem in previous pooled analyses. You must use `collate_outputs()` before using `build_meta()`.
build_meta(
outputdir = "C:/MyFiles", # Must be the same as that used in collate_outputs()
studyname = "Hartwig 2016" # Must be the same as that used in collate_outputs()
)
If you have only a single wear location, you can (optionally) provide this (e.g., `wear_location = "wrist"`) in order to pre-fill it.
If you have previously used `build_meta()` and need to run it again, you need to specify `overwrite = TRUE`. Be warned that this will replace the existing metadata template.
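Because overwriting replaces the template, you may want to copy any partly completed file first. The sketch below makes a timestamped backup using the `<studyname>-metadata.xlsx` naming described earlier. The paths are temporary stand-ins, and the placeholder file exists only so the example runs on its own.

```r
# Stand-in values; replace with the outputdir and studyname you actually used.
outputdir <- file.path(tempdir(), "meta_demo")
studyname <- "Hartwig 2016"
dir.create(outputdir, showWarnings = FALSE)

meta_file <- file.path(outputdir, paste0(studyname, "-metadata.xlsx"))

# For the demonstration only: create a placeholder if no template exists yet.
if (!file.exists(meta_file)) file.create(meta_file)

# Copy the current template to a timestamped backup before it is replaced.
stamp <- format(Sys.time(), "%Y%m%d-%H%M%S")
backup_file <- sub("\\.xlsx$", paste0("-backup-", stamp, ".xlsx"), meta_file)
copied <- file.copy(meta_file, backup_file)
```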