with the square root of the total variance, which gives an estimate of the standard deviation for use in the classical formula for the Bland-Altman limits of agreement. Using a worked example, we compare and contrast the five agreement methods mentioned above and give guidance on choosing between them.

Our example consists of measurements of respiratory rate (breaths per minute) from 21 subjects with COPD, measured by six devices (including a gold standard) worn simultaneously. This was the dataset used in the study by Parker and colleagues [15], which was made public through data sharing [15]. Respiratory rate was measured several times in each patient, so each participant contributed repeated observations. Participants carried out eleven different activities during a laboratory-based protocol lasting 57 minutes. These were sitting, lying down, standing, slow walking, brisk walking, sweeping, lifting objects, standing and walking, climbing stairs, treadmill (flat walking) and treadmill (4% slope). The mix of activities was chosen to be representative of activities of daily living [16]. Not everyone performed exactly the same number of activities, as some tasks were too difficult for some participants (e.g. the treadmill), and so this is an example of an unbalanced study design. Most activities had only one respiratory rate measurement per participant, but "sitting" and "standing and walking" had 6-7 and 1-3 observations per participant respectively (see Figure 1 in the supplementary file), and so observations were clustered within activities and participants. Eight of the participants (38%) were women; the mean age was 69 years (standard deviation (SD) 8) and the mean body mass index (BMI) was 26 (SD 6). Full details of the study are reported elsewhere [16].

For simplicity, in this article we consider only the comparison of one of the devices (the chest band) with the gold standard (Oxycon mobile, Carefusion). Of the six devices used in the study, the chest band and the gold standard were the only two that had no missing data. The chest band was also one of the devices that showed the best agreement with the gold standard.

The five statistical methods for assessing agreement in repeated measures data are described below, together with the corresponding model formulae. As described above, linear mixed effects models are particularly suitable for analysing data from unbalanced, non-symmetrical designs, because they include random effects. The linear mixed effects model has the form:
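As a generic illustration only, a random-intercept model for the paired differences, and the way its total variance feeds the classical limits of agreement formula, could be written as follows. The symbols and the choice of random effects are assumptions made for this sketch and are not necessarily those of the model the authors define. Here $d_{ijk}$ denotes the difference between the chest band and the gold standard for participant $i$, activity $j$ and observation $k$.

```latex
% Assumed notation: random intercepts for participant (a_i) and activity (b_j)
d_{ijk} = \mu + a_i + b_j + \varepsilon_{ijk}, \qquad
a_i \sim N(0, \sigma_a^2), \quad
b_j \sim N(0, \sigma_b^2), \quad
\varepsilon_{ijk} \sim N(0, \sigma_e^2)

% Total standard deviation: square root of the summed variance components
\hat{\sigma}_{\mathrm{total}} = \sqrt{\hat{\sigma}_a^2 + \hat{\sigma}_b^2 + \hat{\sigma}_e^2}

% Classical 95% Bland-Altman limits of agreement using that estimate
\hat{\mu} \pm 1.96\,\hat{\sigma}_{\mathrm{total}}
```

Under these assumptions, the estimated mean difference $\hat{\mu}$ is the bias, and the limits widen as any of the variance components grows, whether the extra variability arises between participants, between activities or within them.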

