Study setting
The DISTANCE study is designed as a multicentre observational cohort study. It is planned to run for four academic years (2022/2023–2025/2026) in multiple childcare centres in Jiangsu Province, China. Childcare centres are to be included in the study on a rolling basis and are expected to participate for at least one academic year (which consists of two semesters: the autumn semester, from September to January, and the spring semester, from February to June). We work collaboratively with local centres for disease control and prevention to screen and select potentially suitable childcare centres that have prior experience of participating in disease surveillance programmes or large-scale research projects. The study was initiated in September 2022 in one childcare centre in Wuxi City, Jiangsu; data collection and follow-up are ongoing. One more childcare centre in Nanjing City will start recruitment in September 2023.
There are three grades in a typical childcare centre in China: the lower grade (age of attendees: 3–4 years), the middle grade (4–5 years) and the upper grade (5–6 years). Childcare centres typically provide childcare services during the daytime on weekdays; boarding childcare centres are not common in China and are not considered in the study. Depending on the overall size of a childcare centre, there can be as many as five classes per grade. The typical class size ranges from 20 to 40 and can vary by grade (the lower grade tends to have smaller classes), with 2–3 full-time teachers on duty per class. In DISTANCE, we expect to include at least one class from each grade (selected based on convenience and cooperation) to ensure all grades are represented.
Each class has a designated classroom for daily learning activities, which is equipped with dining, afternoon nap and toileting facilities. Classrooms with special functions (such as the ‘stage acting’ room) and outdoor playgrounds are shared but not often simultaneously occupied by different classes. Children attend childcare centres between 08:30 and 16:00 hours, although occasionally, some children do not attend the afternoon session. Neither children nor staff are required to wear face masks in participating childcare centres.
Participants
Study participants that form the study cohort are childcare attendees (referred to as children hereafter), teaching staff and study onsite observers (see the contact behaviours section). For children, the following eligibility criteria need to be fulfilled for inclusion in the study:
Written informed consent is sought from the legal guardians of eligible children and from teaching staff and onsite observers for participating in the study.
For the childcare centre in Wuxi City, a total of 104 children and 12 teaching staff from 4 classes (2 in the lower grade, 1 in the middle grade and 1 in the upper grade) were enrolled. We expect to include 100 children and 10 teaching staff for the participating childcare centre in Nanjing City.
Data collection
For all study participants, basic personal information including date of birth and sex is collected. Detailed data collection specific to the study objectives is shown in figure 1, including the following four parts: (1) attendance, (2) behaviour, (3) testing of respiratory viruses in study subjects and environment and (4) follow-up of testing positives.
Figure 1. Schematic figure showing data collection and follow-up of the study. A, attendance; Ba, behaviours through active observation; Bp, behaviours through passive observation; Pf, positive cases follow-up; Ve, viruses detected from environment; Vr, viruses detected from respiratory tract.
Attendance
Information on attendance from both children and teaching staff is obtained from Monday to Friday (excluding public holidays) and separately recorded for morning and afternoon, at the beginning of the session. Where available, reasons for absence are recorded.
Contact behaviours
We adopt an observation method for documenting contact behaviours of study participants. This requires onsite investigators, referred to as onsite observers, to attend the class throughout school hours and to observe and record contact behaviours. The onsite observers should fulfil the criteria set out by the local participating childcare centres before being allowed to attend the class. During the study period, the onsite observers are managed by local childcare centres as full-time staff and are introduced as ‘assistant teachers’ to children. Each participating class has one designated onsite observer for the entire academic year. When a new observer is allocated to a class at the beginning of the academic year, the first 2 weeks are regarded as the ‘burn-in’ period when the new observer is expected to recognise all study participants in the class. At the end of the ‘burn-in’ period, a test is conducted to assess whether the onsite observer can recognise every study participant of the class. An additional 1 week of ‘burn-in’ might be applied if deemed necessary by the study team.
On each day, one child from each class is selected as the subject under active observation, from a predefined ordered list generated by computer-based random sampling without replacement, until all children in the class have been sampled once (ie, one round of observation); the remaining children and all teaching staff are defined as the subjects under passive observation. After the last child on the list is selected, a new random order is applied for the next round. The onsite observer is expected to observe the contact behaviours of the study subject under active observation with other study subjects (under passive observation) in the childcare centre during the entire attendance of the day. If the subject under active observation only attends the morning session, then the next subject on the predefined list will be regarded as the subject under active observation for the afternoon session. If the subject under active observation is absent, then the subject will be moved to the end of the list, with the next subject on the list being regarded as the subject under active observation for the day.
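The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's actual software: the function name, the seed and the attendance format are hypothetical, the morning-only rule is left out for brevity, and a day on which every remaining child in the round is absent is simply recorded as `None`, a case the protocol does not address.

```python
import random
from collections import deque


def active_observation_schedule(children, attendance_by_day, seed=2022):
    """Pick one subject under active observation per day for one class.

    children:          list of child IDs in the class
    attendance_by_day: list with one set of present child IDs per study day
    Returns a list with the selected child ID (or None) for each day.
    """
    rng = random.Random(seed)
    queue = deque()
    schedule = []
    for present in attendance_by_day:
        if not queue:
            # Start a new round: random order, sampled without replacement.
            queue.extend(rng.sample(children, len(children)))
        chosen = None
        for _ in range(len(queue)):
            candidate = queue.popleft()
            if candidate in present:
                chosen = candidate
                break
            # Absent today: move this child to the end of the list.
            queue.append(candidate)
        schedule.append(chosen)
    return schedule
```

With full attendance, each block of `len(children)` consecutive days contains every child exactly once; a child absent on their scheduled day is observed later in the same round.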
During the observation, the onsite observer is expected to stand or sit at one corner of the education setting (eg, classroom), with a clear sight of the subject under active observation. The onsite observer is instructed to avoid drawing excessive attention from the subject under active observation or interrupting the normal learning activities; when interaction with the subject becomes inevitable, the onsite observer is instructed to direct the attention of the subject to other teachers or children (not recording any contact between onsite observers and others). This helps to ensure objective documentation of the contact characteristics with minimal impact by the onsite observer.
A contact recording form is used to document contact behaviours of the subject under active observation, which consists of two parts (online supplemental text S1 and S2): part A collects information on the activities undertaken (eg, reading books, riding bicycles, singing and dancing) by the class during the day, including the time, location, contents, number of attendees and whether a face mask is worn by the subject under active observation, as well as whether the subject under active observation has shown any respiratory symptoms during the day; part B collects information on contact behaviours for each activity, including the identity of the contact, contact mode and number of contacts. Here, we consider three contact modes: verbal non-physical contact, defined as face-to-face conversation of at least two words within 1 m from the other person; physical non-skin contact, defined as body-to-body contact without direct skin-to-skin contacts (eg, grabbing the other person’s sleeve, pushing the back of the other person); and skin-to-skin contact. For each contact mode, we also consider the direction of the contact, that is, whether the subject under active observation actively contacts the other person or is contacted by the other person. As contacts can be of instantaneous nature (eg, several brief physical contacts during chasing games) or continuous nature (eg, extensive physical contacts or verbal conversation), we decided to consolidate the different natures of contacts by defining that any number of instantaneous contacts that occur within 30 s is considered as one contact and that the duration of any continuous contacts divided by 30 s gives the number of contacts (rounded up).
When a contact is made from or towards a child who is not eligible for inclusion or for whom informed consent has not been provided, or a person who is not a study participant (eg, cleaning staff and external visitors), we do not identify that person in the recording form but document that contact as a contact with ‘others’.
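As an illustration of the 30 s consolidation rule, the sketch below counts contacts from hypothetical timing data. The function names and inputs are assumptions made for the example (the recording form captures counts, not timestamps), and anchoring each 30 s window at the first contact it contains is one possible reading of the rule rather than something the protocol specifies.

```python
import math

WINDOW_S = 30  # consolidation window in seconds, as defined in the protocol


def count_instantaneous(timestamps_s):
    """Count instantaneous contacts: all contacts falling within the same
    30 s window (anchored at the first contact in the window) count once."""
    count = 0
    window_start = None
    for t in sorted(timestamps_s):
        if window_start is None or t - window_start >= WINDOW_S:
            count += 1
            window_start = t
    return count


def count_continuous(duration_s):
    """Count a continuous contact: duration divided by 30 s, rounded up."""
    return math.ceil(duration_s / WINDOW_S)


# Brief touches at 0, 5 and 12 s form one contact; 40 and 41 s a second;
# 100 s a third. A 75 s conversation counts as three contacts.
print(count_instantaneous([0, 5, 12, 40, 41, 100]))  # 3
print(count_continuous(75))                           # 3
```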
Testing for respiratory viruses
Every Monday morning, licensed health practitioners of the childcare centre sample throat swabs of the children, teaching staff and onsite observers. On the same afternoon after the dismissal of the class and before the daily cleaning and disinfection of classrooms by cleaning staff, onsite observers sample touch surfaces in the childcare centre. We refer to existing guidelines12 13 on prevention of respiratory infectious diseases in schools by health authorities and previous studies11 14 to select a list of commonly touched surfaces for sample collection, including various items in the classroom and surfaces in the communal areas and playground that are commonly touched by children as well as control surfaces that are believed to be less commonly touched by children. The exact list of touch surface samples will depend on the participating childcare centres. For the childcare centre in Wuxi, a total of 28 samples were taken from these touch surfaces every week (online supplemental table S1). The process of surface sampling is carried out with reference to the Guideline for Environmental Monitoring of SARS-CoV-2 in Agriculture Product Markets and Trade Markets of China (No. WS/T 776-2021).9 Briefly, we saturate the swab well with the virus-preserving solution and scrub the swab on the surface, and repeat the process three or more times. Swabbed surfaces are wiped using tissues after sample collection to remove any nucleic acid remnants.
Both respiratory samples and environment samples are transported to the same central laboratory at 2°C–8°C in viral transport medium within 6 hours and subsequently stored at −80°C until nucleic acid extraction. Multiplex PCR is applied to test for influenza virus, respiratory syncytial virus, SARS-CoV-2, rhinovirus, parainfluenza virus, human metapneumovirus, endemic coronavirus and adenovirus. Viral RNA is extracted using a kit (Zongkang Bio, China) on the Roche MagNA Pure LC2.0 platform in accordance with the manufacturer’s instructions. A respiratory virus nucleic acid test kit is used for real-time fluorescent quantitative PCR to detect respiratory viruses. Testing is conducted within 48 hours of receipt of samples.
Follow-up of testing positives
We expect to receive respiratory viral testing results by Wednesday of the week. Every Wednesday (occasionally Thursday if the testing results arrive late on Wednesday), parents or other legal guardians of children who test positive for any respiratory viruses will be contacted by telephone. We will record whether the subject has shown any respiratory or systemic symptoms between the previous week and present, and if so when the first symptom appeared; whether healthcare was sought; and whether any self-medication was used. Teaching staff and onsite observers who test positive will be followed up in the same way as described above. Testing positive for any respiratory viruses per se in any of the children, teaching staff or onsite observers does not determine whether they should attend the childcare centre; such a decision is expected to be made by the childcare centre in accordance with its regulations.
Data analysis plan
Overall, data analysis consists of three parts, each focusing on detection of respiratory viruses, contact behaviours and transmission risk. A specific data analysis plan for each part will be developed in mid-2023 when the data from the first semester of 2022–2023 are collected and cleaned.
Part 1: detection of respiratory viruses from study subjects and environmental samples
We will calculate the incidence rates of virus-specific infections, virus-specific symptomatic infections and medically attended virus-specific infections, by grade, to understand the burden and spectrum of respiratory viral infections in the study population. We will also compare the proportion of symptomatic infections of different respiratory viruses as well as the distribution of the time lag between onset of symptoms and date of specimen collection. We hypothesise that the incidence of respiratory viral infections originating from non-childcare transmission should be distributed equally among different classes and that any systematic differences across classes could be attributable to within-class transmission; based on the hypothesis, we could use a generalised linear regression model with the number of infections as the dependent variable and class as the independent variable, with an offset term of class size, and then estimate the proportion of variations in the model that could be explained by class, as the proxy for the proportion of within-class transmission.
We will calculate the proportion of touch-surface samples in which respiratory viruses are detected. We will explore the cross-correlation between the proportion of positive samples from study subjects and the proportion of positive touch-surface samples, allowing for time lags, through Pearson’s correlation analysis. We will identify the optimal time lag that maximises the absolute correlation coefficient to understand the temporal order of detection of respiratory viruses from respiratory specimens and from touch surfaces.
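One way to make the "proportion of variation explained by class" concrete is sketched below. It rests on assumptions the protocol does not state: a Poisson model with a log link, an offset of log(class size), one record per class per period, and deviance explained as the measure of variation. For a model with class as the only factor, the fitted rates have a closed form (infections divided by exposure within each class), so no model-fitting library is needed. All names are hypothetical.

```python
from collections import defaultdict
from math import log


def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        term = y * log(y / mu) if y > 0 else 0.0
        total += 2.0 * (term - (y - mu))
    return total


def deviance_explained_by_class(records):
    """records: iterable of (class_id, infections, class_size) per period.

    Compares a null model (one common infection rate) with a class model
    (one rate per class); both use class size as the exposure (offset).
    """
    records = list(records)
    total_y = sum(y for _, y, _ in records)
    total_n = sum(n for _, _, n in records)
    per_class = defaultdict(lambda: [0, 0])
    for c, y, n in records:
        per_class[c][0] += y
        per_class[c][1] += n
    observed = [y for _, y, _ in records]
    null_fit = [total_y / total_n * n for _, _, n in records]
    class_fit = [per_class[c][0] / per_class[c][1] * n for c, _, n in records]
    d_null = poisson_deviance(observed, null_fit)
    d_class = poisson_deviance(observed, class_fit)
    return 1.0 - d_class / d_null  # undefined when d_null is 0
```

A value near 0 indicates that classes share a common infection rate; larger values indicate systematic between-class differences.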
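The lagged correlation search can be sketched as follows. This is a minimal example assuming two equally long weekly series of positivity proportions; the function names, the maximum lag and the sign convention (a positive lag means the surface series trails the respiratory series) are choices made for the illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (assumes non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_lag(respiratory, surface, max_lag=3):
    """Correlate respiratory[t] with surface[t + lag] for each lag and
    return the lag with the largest absolute coefficient, plus all values."""
    n = len(respiratory)
    by_lag = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = respiratory[:n - lag], surface[lag:]
        else:
            x, y = respiratory[-lag:], surface[:n + lag]
        by_lag[lag] = pearson(x, y)
    optimal = max(by_lag, key=lambda k: abs(by_lag[k]))
    return optimal, by_lag
```

Because each additional week of lag shortens the overlapping series, coefficients at large lags rest on fewer data points and should be read with caution.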
Part 2: contact behaviours
In this part, we focus on the behavioural aspect and aim to understand how the contact behaviours vary by individual, activity and time. Given that degree of contacts is quantified as the number of contacts, we will use a quasi-Poisson model that explicitly accounts for the person-to-person contact vectors (ie, heterogeneity in contacts), activity and time as model covariates, separately by class, grade and mode of contact. Where applicable, further stratified analysis by the category of contacts (eg, classmates, teachers and others) will be conducted. Based on the model results, we expect to understand the role of different daily activities in shaping the behavioural patterns of childcare attendees.
From the samples of participants whose contact patterns are actively observed, we will reconstruct an activity-specific individual-level contact matrix based on the model results by predicting the contact patterns for those not under active observation for each class and grade. The resulting individual-level contact matrix based on empirical data can be used as an important input for mathematical modelling to help understand the transmission of infectious diseases (not limited to respiratory viral infections) in childcare centres.
Part 3: transmission risk
This part builds on the first two parts and focuses on understanding the risk of transmission of respiratory viral infections in childcare settings.
We will develop generalised linear mixed-effects models to assess the role of different transmission modes in infections with any and each of the respiratory viruses, separately. We consider testing positive for any or a specific respiratory virus as the dependent variable. We consider the variable childcare centre as a random effect in the model to allow between-centre variations in the risk of infections. For the model independent variables, the following will be considered: the number of daily physical and verbal contacts made to the individual study subject (regardless of whether the contact person is infected or not), separately as two variables; the number of daily physical and verbal contacts made to the individual study subject from known infected person(s) confirmed within 1 week, separately as two variables; and a binary variable indicating whether the same respiratory virus was detected from any of the environment samples (the time lag allowed would be dependent on the cross-correlation results in part 1). In addition, we will account for common confounders including calendar month (as a proxy for seasonality), age and sex in the model. Separate subgroup analyses by grade will also be conducted to explore whether any observed effects differ by grade; age will not be included in these subgroup analyses as it is expected to correlate highly with grade.
As exploratory analysis, we will also conduct stochastic micro-simulations based on binomial chains to understand the role of contact heterogeneity in the transmission risk of infectious diseases (compared with homogeneity) in childcare centres, and further assess the effectiveness of different interventions (eg, vaccination, self-quarantine, face mask, class splitting). This is based on the individual-level contact matrix constructed in part 2. We will assume different probabilities of transmission per contact per day and simulate what would happen in the next 14 days under different scenarios regarding the initial number of infected cases on day 0 (ie, seeds), class size and implementation of any interventions, and compare the cumulative attack rate, peak attack rate and duration of epidemics among different simulations.
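A chain-binomial micro-simulation of this kind can be sketched as below. It is an illustration under assumptions that the protocol leaves open: infected individuals become infectious the day after infection and stay infectious for a fixed number of days (`infectious_days`, a hypothetical parameter), recovered individuals are immune, and each contact transmits independently with probability `p_transmit`. The contact matrix holds the daily number of contacts between each pair of individuals.

```python
import random


def simulate_outbreak(contacts, seeds, p_transmit, days=14,
                      infectious_days=3, rng=None):
    """Run one stochastic chain-binomial outbreak in a single class.

    contacts[i][j]: daily number of contacts between individuals i and j
    seeds:          indices of individuals infectious on day 0
    Returns (cumulative attack rate including seeds, daily new cases).
    """
    rng = rng or random.Random()
    n = len(contacts)
    days_left = {i: infectious_days for i in seeds}  # currently infectious
    ever_infected = set(seeds)
    daily_new = []
    for _ in range(days):
        new_cases = []
        for i in range(n):
            if i in ever_infected:
                continue
            # Probability of escaping infection from every infectious contact.
            escape = 1.0
            for j in days_left:
                escape *= (1.0 - p_transmit) ** contacts[i][j]
            if rng.random() < 1.0 - escape:
                new_cases.append(i)
        # Age existing infections by one day, then add today's new cases.
        days_left = {j: d - 1 for j, d in days_left.items() if d > 1}
        for i in new_cases:
            days_left[i] = infectious_days
            ever_infected.add(i)
        daily_new.append(len(new_cases))
    return len(ever_infected) / n, daily_new
```

Running the function many times with the empirical matrix and again with a matrix in which every pair has the class-average number of contacts gives one way to compare heterogeneous and homogeneous mixing; interventions such as class splitting can be represented by setting the relevant matrix entries to zero.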
Power calculation
As described above, sample size in this study is determined based on assessment of feasibility and not based on prespecified statistical power. Therefore, we conducted several preliminary power calculations to help readers appreciate the anticipated statistical power of this study.
In the first calculation, we focus on understanding the viral positivity rate at one single follow-up. We set the sample size to 100 (ie, the number of children in one childcare centre). When assuming the viral positivity rate of 0.1, 0.2 and 0.3, the 95% CI for the viral positivity rate will be 0.05–0.18, 0.13–0.29 and 0.21–0.40, respectively.
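The protocol does not name the interval method, but exact binomial (Clopper-Pearson) limits reproduce the intervals quoted above, so the sketch below assumes that method. It finds the limits by bisection on the binomial distribution using only the standard library; the function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion x/n."""

    def solve(f, target):
        # f is monotonically increasing in p on [0, 1]; bisect for f(p) = target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= x | p) = alpha / 2.
    lower = 0.0 if x == 0 else solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: P(X <= x | p) = alpha / 2, i.e. P(X > x | p) = 1 - alpha / 2.
    upper = 1.0 if x == n else solve(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


for positives in (10, 20, 30):
    low, high = clopper_pearson(positives, 100)
    print(f"{positives}/100: {low:.2f}-{high:.2f}")
```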
In the second calculation, we focus on assessing the correlation between the viral positivity rate in respiratory samples and in environment samples. We set the sample size to 40 (ie, a total of 40 follow-ups in 1 academic year in 1 childcare centre). When applying an alpha of 0.05 and assuming the correlation coefficient of 0.3, 0.4 and 0.5, the power is estimated to be 0.48, 0.74 and 0.92, respectively.
In the third calculation, we focus on exploring the risk factor for infection of respiratory viruses in children. We set the overall sample size to 100 (ie, the number of children in one childcare centre) and the size of the exposure group to 30. When applying an alpha of 0.05 and proportion of infection in the non-exposure group of 0.2, and assuming the OR of 2, 3 and 4, the power is estimated to be 0.27, 0.61 and 0.83, respectively. If the overall sample size is increased to 200 (ie, by pooling two childcare centres), the power increases to 0.57, 0.95 and 0.997, respectively.
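For readers who wish to approximate this calculation, the sketch below uses the Fisher z-transformation with a two-sided test. This is an assumed method, not necessarily the one used for the figures above; it gives values within about 0.01 of those quoted (roughly 0.47, 0.73 and 0.92), the small gap being expected from the normal approximation.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate power of a two-sided test of zero correlation,
    using the Fisher z-transformation (standard error 1 / sqrt(n - 3))."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = atanh(r) * sqrt(n - 3)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


for r in (0.3, 0.4, 0.5):
    print(f"r = {r}: power = {correlation_power(r, 40):.2f}")
```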
Patient and public involvement
None.
Access to data
The study principal investigator and statistician(s) will have access to the complete study dataset. The local investigator(s) (eg, study investigator from local centre for disease control and prevention) will have access to the study dataset of their corresponding childcare centre.
Confidentiality
We take several precautions to ensure confidentiality of the study participants. At the stage of selection of participating childcare centres and classes, we choose to include those with prior experience of disease surveillance programmes or large-scale research projects to ensure the best compliance with the study protocol (including protection of confidentiality). While identification of study subjects is inevitable (and required by the study design) during the contact behaviour observation stage, such identifiable data are only accessible by onsite study observers. During the respiratory specimen collection and testing stage, a pseudo personal ID is used so any personnel other than onsite observers (eg, those who conduct laboratory testing) do not hold identifiable data of the study subjects; only study investigators have access to the mapping data of the pseudo personal ID. For the observation of contact behaviours, all onsite observers are strictly prohibited from disclosing any personal information of the subjects under observation, and from recording, in any form, any personal information of the subjects under observation beyond the scope of the contact recording form; in particular, onsite observers are prohibited from taking any image, audio or video recordings of study participants. We communicate testing results privately to study participants who test positive for respiratory viruses; we will share the viral testing results with the local public health authorities when legally required (eg, during an outbreak investigation).
Dissemination plan
We plan to disseminate study results at local and international conferences, and primarily in peer-reviewed journals. The study findings will be communicated to the public to raise awareness of prevention and control of respiratory viral infections. Aggregated anonymised research data will be shared in an open-access data repository.