PaperQA2: Superhuman scientific literature search

Research
By Sam Rodriques
Published December 8th, 2023

Today, we are announcing PaperQA2, the first AI agent to achieve superhuman performance on a variety of scientific literature search tasks. PaperQA2 is an agent optimized for retrieving and summarizing information from the scientific literature. It has access to tools that allow it to find papers, extract useful information from those papers, explore the citation graph, and formulate answers. PaperQA2 achieves higher accuracy than PhD- and postdoc-level biology researchers at retrieving information from the scientific literature, as measured using LitQA2, a component of the LAB-Bench evaluation set that we released earlier this summer. In addition, WikiCrow, an agent built on top of PaperQA2, produces Wikipedia-style summaries of scientific information that are more accurate on average than actual Wikipedia articles written and curated by humans, as judged by blinded PhD- and postdoc-level biology researchers. PaperQA2 is described in our paper, https://paper.wikicrow.ai, and the code is available at https://github.com/Future-House/paper-qa.
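
To make the four kinds of tools concrete, here is a toy sketch of how they could fit together. It is not the PaperQA2 implementation or the paper-qa API: the corpus, the tool names (`paper_search`, `citation_traversal`, `gather_evidence`, `generate_answer`), and the fixed order in which they are called are all assumptions made for illustration, and plain keyword matching stands in for the language-model steps.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    paper_id: str
    sentences: list[str]
    cites: list[str] = field(default_factory=list)


# Hypothetical three-paper corpus; the sentences are placeholders, not quotations.
CORPUS = {
    "p1": Paper("p1", ["Gene X is required for process Y."], cites=["p2"]),
    "p2": Paper("p2", ["Loss of gene X slows process Y.", "Methods were standard."]),
    "p3": Paper("p3", ["Gene Z is unrelated to process Y."]),
}


def paper_search(keyword: str, limit: int = 1) -> list[str]:
    """Tool 1 (assumed): return up to `limit` papers that mention the keyword."""
    hits = [
        pid
        for pid in sorted(CORPUS)
        if any(keyword.lower() in s.lower() for s in CORPUS[pid].sentences)
    ]
    return hits[:limit]


def citation_traversal(paper_ids: list[str]) -> list[str]:
    """Tool 2 (assumed): return papers cited by the given papers, excluding them."""
    cited = {c for pid in paper_ids for c in CORPUS[pid].cites}
    return sorted(cited - set(paper_ids))


def gather_evidence(paper_ids: list[str], keyword: str) -> list[tuple[str, str]]:
    """Tool 3 (assumed): pull (paper_id, sentence) pairs that mention the keyword."""
    return [
        (pid, s)
        for pid in paper_ids
        for s in CORPUS[pid].sentences
        if keyword.lower() in s.lower()
    ]


def generate_answer(evidence: list[tuple[str, str]]) -> str:
    """Tool 4 (assumed): join the evidence into an answer with inline citations."""
    if not evidence:
        return "I cannot answer."
    return " ".join(f"{sentence} [{pid}]" for pid, sentence in evidence)


def answer_question(keyword: str) -> str:
    """Run the tools in a fixed order; a real agent would choose the order itself."""
    found = paper_search(keyword)
    candidates = found + citation_traversal(found)
    return generate_answer(gather_evidence(candidates, keyword))


print(answer_question("gene X"))
```

In this toy run the search is deliberately limited to one paper, so the second cited source only enters the answer through the citation-graph step.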

WikiCrow

ABCC8
ACAD10
ACOX2
ADH7
AHI1
ANGPT2
ANKLE1
ATP5PO
ATP6AP2
C1QL3
CAPN2
CD276
CD7
CDH10
CDK5RAP3
CFAP44
CHRNB4
CHTOP
CPM
CPQ
CPT1C
CTIF
CXXC4
CYP4F3
DHRS3
DRG1
EMP1
FBH1
FSTL4
GGT1
GPAT4
HBG2
HDGF
HMGN5
HUNK
INSL6
IYD
JOSD1
JPH2
KLHL41
KLK12
KRT15
LAP3
LGMN
LMOD1
MDM1
MIEF2
MKKS
MRNIP
MRPS27
MSL1
MT1B
MT1M
MTCL2
MTF1
MTMR6
MTRES1
NARS2
NBEAL1
NECAP1
NKIRAS2
NME7
NMU
NR2C1
NUP37
NXPH3
OSBPL5
PADI6
PPP1R13L
PRAMEF7
RASGRP2
REL
REM2
RGL2
RNF186
RSPH1
RXFP2
SAMD9L
SAR1A
SCAMP2
SCGB1A1
SLC25A51
SOX30
STOML2
SYCP2
SYT9
TAF12
TEX15
TFAM
TIMM10B
TMEM258
TMEM79
TTLL5
UBE2E2
UBXN6
UNC5D
USP12
VCF2
WDR47
WDR48

PaperQA2 allows us to analyze the literature at a scale that is currently out of reach for humans. Previously, we showed that we could use an older version (PaperQA) to generate a Wikipedia-style article for each of the 20,000 genes in the human genome by combining information from 1 million distinct scientific papers. However, those articles were less accurate on average than existing articles on Wikipedia. Now that the articles we can generate are significantly more accurate than Wikipedia articles, one can imagine generating Wikipedia-style summaries on demand, or even regenerating Wikipedia from scratch with more comprehensive and recent information. In the coming weeks, we will use WikiCrow to generate Wikipedia-style articles for all 20,000 genes in the human genome, and will release them at wikicrow.ai. In the meantime, wikicrow.ai contains a preview of 240 articles used in the paper.

In addition, we are very interested in how PaperQA2 could allow us to generate new hypotheses. One approach to that problem is to identify contradictions between published scientific papers, which can point the way to new discoveries. In our paper, we describe how ContraCrow, an agent built on top of PaperQA2, can evaluate every claim in a scientific paper to identify any other papers in the literature that disagree with it. We grade these contradictions on a Likert scale so that trivial contradictions can be removed. In a random subset of biology papers, we find an average of 2.34 statements per paper that are contradicted by other papers elsewhere in the literature. Exploring these contradictions in detail may allow agents like PaperQA2 and ContraCrow to generate new hypotheses and propose new pivotal experiments.
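
The arithmetic behind a "contradicted statements per paper" figure can be sketched as follows. This is only an illustration under stated assumptions, not ContraCrow's code: the `Contradiction` record, the 1 to 5 scale, the threshold of 4, and the sample data are all hypothetical. A claim contradicted by several papers is counted once, and papers with no contradicted claims still count toward the average.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Contradiction:
    paper_id: str  # paper whose claim was checked
    claim: str     # the claim being contradicted
    likert: int    # hypothetical grade, 1 (trivial) to 5 (clear contradiction)


def contradicted_claims_per_paper(
    records: list[Contradiction], paper_ids: list[str], threshold: int
) -> float:
    """Average number of distinct claims per paper graded at or above `threshold`."""
    # Deduplicate: a claim contradicted by several papers counts once.
    kept = {(r.paper_id, r.claim) for r in records if r.likert >= threshold}
    counts = {pid: 0 for pid in paper_ids}
    for pid, _claim in kept:
        counts[pid] += 1
    return mean(counts.values())


# Hypothetical grades for three papers; paper "C" has no contradicted claims.
records = [
    Contradiction("A", "claim 1", 5),
    Contradiction("A", "claim 1", 4),  # same claim, contradicted by a second paper
    Contradiction("A", "claim 2", 2),  # trivial, filtered out by the threshold
    Contradiction("B", "claim 3", 4),
]
average = contradicted_claims_per_paper(records, ["A", "B", "C"], threshold=4)
print(f"{average:.2f} contradicted claims per paper")
```

The threshold is the step that removes trivial contradictions; lowering it to 2 in this example would also count "claim 2" and raise the average.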