What we learned from our first pilot

This summer the journal Addiction became the first to pilot Penelope. Within a month, Penelope had carried out 3,000 checks on 53 articles and made 500 suggestions. You can see a summary of this post in this infographic.

We were excited to see what authors thought about it, so we asked them: “How likely would you be to recommend Penelope to a friend or colleague?” Eighteen responded: 5 gave us 10/10, and 12 rated it 8 or higher. Authors were impressed by the speed (they get feedback within 2 minutes) and were grateful for the mistakes Penelope caught.

When building Penelope, we trained it on a set of manuscripts mostly written by experienced UK researchers. Addiction is a well-known journal that attracts submissions from all around the world, so we double-checked half of the manuscripts by hand (n = 27) to see how Penelope’s algorithms perform when faced with non-native English authors writing in different styles, formats and versions of MS Word. The articles submitted ranged from reviews to clinical trials.

Penelope correctly warned authors that disclosures of interest, acknowledgements and running heads were missing from many of the manuscripts, and that 15 of them didn’t name a funder.

Of the research that involved animals or people, Penelope correctly warned around a third of authors that they hadn’t addressed ethical approval or informed consent, or hadn’t named an ethics board.

When checking statistics, Penelope found 7 incorrect ANOVAs, 4 incorrect Pearson correlations and 2 incorrect t-tests, where the test statistic didn’t match the reported p-value. It also caught all 8 tables and 6 figures that were missing legends.
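To give a flavour of how this kind of consistency check can work, here’s a minimal sketch in Python using SciPy. This is our illustration rather than Penelope’s actual code, and the function name and tolerance are our own: it recomputes the p-value implied by a reported t statistic and its degrees of freedom, then flags any disagreement with the p-value the author reported.

```python
# Minimal sketch of a statistic/p-value consistency check
# (illustrative only; not Penelope's actual code).
from scipy import stats

def t_test_is_consistent(t, df, reported_p, tol=0.01):
    """Check whether a reported two-tailed p-value matches t(df)."""
    implied_p = 2 * stats.t.sf(abs(t), df)  # p-value implied by the statistic
    return abs(implied_p - reported_p) <= tol

# t(28) = 2.05 implies p ≈ 0.05, so a reported p of 0.32 would be flagged.
print(t_test_is_consistent(t=2.05, df=28, reported_p=0.32))  # False
```

The same idea extends to ANOVAs and Pearson correlations by swapping in the appropriate distribution.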

Penelope correctly found that 7 of the 27 papers used the wrong referencing style. There were 850 references across the articles, and Penelope correctly found 52 instances where a listed reference was never cited in the text (or vice versa).
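For the curious, here’s a toy sketch of how a citation cross-check like this can work: collect the citation numbers that appear in the text, compare them against the numbered reference list, and report the differences in both directions. Again, this is our illustration, not Penelope’s algorithm, and it assumes simple bracketed numeric citations like [12].

```python
# Toy cross-check of in-text citations against a numbered reference list
# (illustrative only; assumes numeric citations such as [12]).
import re

def cross_check(body_text, num_references):
    cited = set(re.findall(r"\[(\d+)\]", body_text))
    listed = {str(n) for n in range(1, num_references + 1)}
    never_cited = listed - cited  # in the reference list but not in the text
    not_listed = cited - listed   # cited in the text but missing from the list
    return never_cited, not_listed

text = "Previous trials [1] and [3] reported similar effects."
print(cross_check(text, num_references=2))  # ({'2'}, {'3'})
```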

We found some areas that we can definitely improve. Penelope didn’t check statistics that appeared outside of the results section, and it was sometimes too strict when checking citations and legends: it would ask authors to double-check something that was already done correctly.

That said, authors seemed tolerant of its mistakes; they know that machines aren’t perfect. Their main concern was a desire for more checks covering different report types.

Overall, we were happy with the results. Authors liked the tool and found it easy to use, and Penelope made many useful suggestions. Addiction will continue to offer the tool to its authors, and we’ll keep making tweaks and fixes over the coming weeks, with the intention of running a similar exercise this autumn.

Any other journals wishing to get involved should email james@peneloperesearch.com. It’s super easy to add Penelope to your website: all you need is a link.