BITSS Reproducibility Training: London Trip Report!
The Berkeley Initiative for Transparency in the Social Sciences (BITSS) works to strengthen the integrity of social science research and evidence used for policy-making. From time to time, BITSS runs training programmes in reproducible research practices. I applied for a place on their first RT2 programme for Europe and was delighted to be successful!
The BITSS team provided us with a training manual highlighting important reading to do in advance of the course. Some of the papers I had come across already, but many I had not. I spent the week beforehand reading up on meta-analysis, replication as a teaching tool, and the p-value debate (should we use p < .005 instead?) before arriving in London. On the first day, I walked the few minutes to the “International Workplace” and met the facilitators and the other 30+ participants, drawn from economics, psychology, accounting, sociology, philosophy (and more!). Sean Grant (RAND) introduced us to the roadmap for the training, before talks on scientific misconduct, pre-registration, replication, and the Open Science Framework.
First up on Day 2 was Meta-Lab’s Arnaud Vaganay with an “emerging methods” talk on reproducible literature reviews. This topic was completely new to me, so I was coming to the session blind, so to speak. Arnaud opened by asking us: how many times do we read a research paper introduction that slowly crafts an argument, drawing on particular theories, particular lines of evidence, and particular elements of study findings? I often find myself reading (and writing, and frankly, teaching others to write) this type of introduction – but in reality we are only presenting the narrative that suits us (and that the reviewers and editor want to read). This in itself is a significant source of bias that hasn’t really been tackled yet. So, it was fascinating to hear about Arnaud’s development of a protocol for conducting reproducible literature reviews that present the totality of relevant evidence. Arnaud gave examples from his own work showing how effective this is in providing useful information, and how it isn’t such a giant leap from our existing skill sets. This struck me as an excellent technique for PhD students getting a grasp on the research literature relating to their question. I also feel it is potentially a fantastic way to train early career scientists (from undergraduate onwards) in the principles of reproducibility. So, expect more on this at the University of Limerick next semester!
Each training session was facilitated by an expert in that particular area, and it was clear that each brought their own experiences to the session. This was especially clear in the Data Management and De-identification session with Danae Roumis, Program Evaluator with Social Impact. In academia, we are (hopefully) striving to be more open about data sharing; for Danae, however, this openness is often required by those who commission the research. So, her perspectives and experience across a wide range of research projects were especially valuable for this session. I have to admit, when I saw this session in the programme, I thought, OK, this should be easy, right? Use an anonymous ID number, don’t include personal information, collapse categories where a characteristic is rare (e.g., where a person has a particular rare disease), upload the data, all fine! In fact, data de-identification is a much more complex process, and over the course of the talk we were introduced to some serious de-identification blunders, as well as the steps involved and the pros and cons of de-identification. There was certainly a lot of food for thought here to take back to Ireland with me.
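To make my naive “easy version” concrete, here is a minimal sketch in Python with pandas of the steps I had in mind: swap direct identifiers for an anonymous ID and collapse rare categories. This is purely my own illustration (the column names and threshold are hypothetical), not the approach taught in the session.

```python
import pandas as pd

# Toy survey data with direct identifiers and one rare characteristic (hypothetical columns)
df = pd.DataFrame({
    "name":      ["A. Byrne", "B. Chen", "C. Daly", "D. Eze"],
    "email":     ["a@x.ie", "b@x.ie", "c@x.ie", "d@x.ie"],
    "diagnosis": ["asthma", "asthma", "very-rare-disease", "asthma"],
    "score":     [12, 15, 9, 14],
})

# 1. Replace direct identifiers with an anonymous participant ID
df["participant_id"] = range(1, len(df) + 1)
df = df.drop(columns=["name", "email"])

# 2. Collapse rare categories so nobody is identifiable by an unusual value
MIN_CELL = 2  # hypothetical minimum group size
counts = df["diagnosis"].value_counts()
rare = counts[counts < MIN_CELL].index
df["diagnosis"] = df["diagnosis"].replace(dict.fromkeys(rare, "other"))

print(df)
```

As the session made clear, real de-identification also has to deal with combinations of quasi-identifiers, re-identification risk, and whether the data should be shared at all – none of which a few lines of pandas can capture.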
BITSS Senior Associate Katie Hoeberling introducing BITSS preprints
The programme also included some practical software sessions, and most of the presenters wore a t-shirt with the name of the software on it (think I heart R, JASP, etc.). I found myself wondering – does SPSS have a t-shirt, and would anyone even wear it? Somehow, I don’t think so. I’m definitely considering a move to the free software JASP, introduced to us by EJ Wagenmakers, who described it as how SPSS would be if it were good. It supports both traditional (frequentist) statistics and Bayesian approaches, and is much more intuitive than SPSS. I can see students feeling a lot less anxious about statistics using JASP rather than SPSS. It also looks enough like SPSS for established users to feel OK about making the change. Beyond the software itself, EJ’s discussion of p-values was especially interesting, given I’d been following the ongoing Twitter debate on that topic. You can read more on his blog.
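For anyone who prefers scripts to point-and-click, the same side-by-side idea can be sketched in Python. This is my own illustration (assuming the pingouin package is installed), not JASP itself, which is a graphical application; the simulated scores are made up for the example.

```python
import numpy as np
import pingouin as pg

rng = np.random.default_rng(42)
group_a = rng.normal(loc=100, scale=15, size=30)  # simulated scores, hypothetical
group_b = rng.normal(loc=110, scale=15, size=30)

# One call returns both the traditional frequentist result (t, p)
# and a default Bayes factor (BF10), much as JASP reports them side by side.
result = pg.ttest(group_a, group_b)
print(result[["T", "dof", "p-val", "BF10", "cohen-d"]])
```

Seeing the p-value and the Bayes factor for the same comparison in one output is, I think, part of what makes the JASP approach so appealing for teaching.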
We also had some hands-on GitHub practice with Garret Christensen. When W. B. Yeats wrote “The Second Coming”, I’m sure he couldn’t have imagined it would be used in a lesson on version control.
Finally, the three days of training came to a close. Some of the BITSS team opened the floor to any constructive feedback on the training: the content, pacing, delivery, and so on. It says a lot about the quality of this programme that the request for feedback was initially met with silence, followed by a “well done” from the back, and then applause from everyone! Overall, it was a fantastic programme. I’m grateful to BITSS for offering me the chance to participate in the training and for financially supporting my attendance. And thanks to all of the participants, for the discussions about everything we were learning, of course, but also simply for the chance to talk with so many people across disciplines and countries. If you have the opportunity to do this training, take it!