This information is for general educational purposes and is not a substitute for personalized medical advice. Your individual situation may differ; consult your physician for guidance specific to you.

About this Evidence Review

Generated 2026-06-05 · Last reviewed by Dr. Margarita Krasnova, MD

  • Articles identified: 1165
  • Open-access studies retrieved: 534
  • Studies included in this review: 80
  • Relevance rate: 45.8%
  • PubMed: 25
  • OpenAlex: 709
  • ClinicalTrials.gov: 288
  • Author's reference collection: 143

This is a structured review of currently accessible medical studies, NOT a Cochrane review.

Authors

Margarita Krasnova, MD

Abstract

Background: Rising rates of adult ADHD diagnosis have prompted debate about whether the disorder is overdiagnosed, underdiagnosed, or both, with significant implications for clinical resource allocation, patient welfare, and the credibility of diagnostic practice.

Objective: This scoping review mapped the available evidence on the accuracy of contemporary adult ADHD diagnostic practices, examining whether diagnostic errors favor false positives or false negatives across different clinical contexts, assessment methods, and population subgroups.

Methods: A systematic search of electronic databases identified studies of adults aged 18 years and older evaluated for suspected ADHD, comparing routine diagnostic practices against rigorous or validation-oriented assessment methods with respect to diagnostic accuracy and related outcomes. A single reviewer screened, extracted, and synthesized the evidence narratively. Eighty articles spanning cross-sectional diagnostic accuracy studies, longitudinal cohorts, psychometric validations, systematic reviews, and other designs met inclusion criteria.

Results: Most included studies supported the conclusion that diagnostic accuracy varies substantially by clinical context and method. Self-report screeners consistently showed high sensitivity but poor specificity, with screen-positive rates far exceeding expected prevalence, particularly among university students and adults with comorbid personality or mood disorders. Structured diagnostic interviews such as the DIVA demonstrated strong accuracy across settings. Retrospective childhood symptom recall proved unreliable, and symptom validity testing identified high rates of noncredible presentation among self-referred populations. Women were diagnosed later and with greater severity. Studies in forensic, anxiety-disorder, and substance-use settings pointed toward underdiagnosis, while self-referred and primary care populations showed elevated false-positive risk. No included study examined telehealth-based assessment, social media influences, or commercial diagnostic platforms.

Conclusions: Available evidence suggests that adult ADHD is simultaneously overdiagnosed in some clinical pathways and underdiagnosed in others, with the direction of error depending on the assessment method, population characteristics, and setting. Converging findings across the literature suggest that multi-method, multi-informant assessment with embedded validity testing may reduce misclassification, although direct comparative trials establishing optimal diagnostic configurations are lacking.

Keywords

Attention Deficit Disorder with Hyperactivity, Adult, Diagnosis, Diagnostic Errors, Neuropsychological Tests

1. Introduction

Attention-deficit/hyperactivity disorder in adults has moved from a contested diagnostic category to a widely recognized condition over the past two decades, yet the clinical landscape surrounding its identification remains deeply uneven. Adults presenting for ADHD evaluation span a broad range of settings — outpatient psychiatric clinics, primary care offices, telehealth platforms, university health centers, neuropsychological practices, and occupational health services — and they arrive with equally varied profiles. Women whose inattentive symptoms were overlooked in childhood, high-functioning professionals compensating through structure and effort, college students facing new executive demands, individuals with layered histories of anxiety, depression, trauma, or substance use, and adults seeking a first diagnosis in midlife all converge on the same diagnostic question. Getting that question wrong carries real consequences in both directions: a missed diagnosis leaves functional impairment untreated, while a false-positive diagnosis exposes a patient to unnecessary stimulant medication and diverts attention from the condition actually driving their difficulties.

Current diagnostic practice for adult ADHD relies on a patchwork of methods whose rigor varies enormously. At one end sit structured diagnostic interviews, multi-informant corroboration, developmental history confirmation, and neuropsychological testing; at the other sit brief clinical impressions, self-report checklists used in isolation, retrospective childhood symptom reconstruction without collateral verification, and rapid telehealth or commercial online assessments. Most evaluations fall between these poles: a clinician applies DSM-based criteria with variable depth of inquiry and variable access to informants. Compounding this heterogeneity, social media content about ADHD and self-diagnosis have reshaped referral patterns, increasing demand while potentially priming symptom endorsement. Whether these diverse practices converge on the same diagnostic conclusions, and which approach best approximates a defensible ground truth, remains uncertain. No consensus exists on the degree to which adult ADHD is overdiagnosed, underdiagnosed, or both simultaneously across different contexts, and no synthesis has systematically mapped the evidence bearing on that question.

This review aims to evaluate whether adult ADHD is being overdiagnosed, underdiagnosed, or both, depending on clinical context, diagnostic methodology, and population characteristics, among adults aged 18 years and older evaluated for suspected ADHD. It compares contemporary diagnostic practices — including unstructured interviews, self-report scales, telehealth evaluations, primary care screening, and commercial platforms — against rigorous validation-oriented methods such as structured interviews, multi-informant assessment, and longitudinal symptom verification. The primary outcomes of interest are diagnostic accuracy, false-positive and false-negative rates, inter-method agreement, and the influence of social media on diagnostic demand. The ultimate objective is to identify evidence-based principles for maximizing diagnostic accuracy across the settings and populations where these evaluations occur.

2. Methods

2.1 Search Strategy

A systematic search of PubMed, OpenAlex, ClinicalTrials.gov, and the Author's reference collection was conducted using a structured Boolean query assembled from Population, Intervention/Exposure, and Outcome keyword sets derived from the study's PICO framework. The population of interest was: Adults aged 18 years and older evaluated for suspected ADHD across outpatient psychiatric, primary care, telehealth, neuropsychological, university health, occupational, and community settings, including subgroups such as women, high-functioning adults, those with comorbid anxiety, depression, trauma, or substance use, college students, self-referred individuals, and adults diagnosed later in life. The intervention/exposure of interest was: Contemporary adult ADHD diagnostic practices including unstructured clinical interviews, DSM-based symptom checklists, self-report scales, telehealth-based evaluations, brief psychiatric assessments, primary care screening, online diagnostic services, retrospective childhood symptom reconstruction, and clinician judgment without structured assessment; additional exposures include social media ADHD content, self-diagnosis, and commercial assessment platforms. The primary outcome was: Diagnostic accuracy of adult ADHD assessment, including false positive and false negative diagnoses, diagnostic stability, sensitivity and specificity of assessment methods, inter-method diagnostic agreement, symptom overlap misclassification, functional impairment, academic and occupational functioning, treatment response, stimulant response patterns, psychiatric comorbidity burden, quality of life, executive functioning, long-term symptom persistence, and influence of social media on ADHD self-identification and diagnostic demand. The search was conducted on 2026-06-05. Records were deduplicated before screening (see §2.3 for counts).

In addition to the database searches described above, 131 articles were identified from the reviewer's personal reference library, matched by PubMed ID, DOI, and PubMed Central ID. ClinicalTrials.gov NCT identifiers were not included in the library match. Personal reference library articles requiring reviewer verification are listed in Appendix B.

Boolean Query (verbatim)

PubMed

((("adult onset ADHD"[Title/Abstract] OR "adult ADHD"[Title/Abstract] OR "ADHD assessment"[Title/Abstract] OR "ADHD evaluation"[Title/Abstract] OR "suspected ADHD"[Title/Abstract] OR "high-functioning ADHD"[Title/Abstract] OR "late-diagnosed ADHD"[Title/Abstract] OR "ADHD referral"[Title/Abstract] OR "women with ADHD"[Title/Abstract] OR "female ADHD"[Title/Abstract] OR "Attention Deficit Disorder with Hyperactivity"[MeSH Terms] OR "attention deficit hyperactivity disorder"[Title/Abstract] OR "adult-onset attention deficit"[Title/Abstract] OR "attention deficit adults"[Title/Abstract] OR "attention deficit disorder"[Title/Abstract] OR "psychiatric comorbidity"[Title/Abstract] OR "aged 18 and older"[Title/Abstract] OR "hyperkinetic disorder"[Title/Abstract]) AND ("young adult"[Title/Abstract] OR "adults"[Title/Abstract])) AND ("Adult ADHD Self-Report Scale"[Title/Abstract] OR "ADHD screening"[Title/Abstract] OR "ADHD self-report"[Title/Abstract] OR "ADHD content"[Title/Abstract] OR "ADHD Rating Scale"[Title/Abstract] OR "Conners Adult ADHD Rating Scale"[Title/Abstract] OR "Brown Attention-Deficit Disorder Scales"[Title/Abstract] OR "brief psychiatric assessment"[Title/Abstract] OR "online diagnosis"[Title/Abstract] OR "Diagnostic and Statistical Manual of Mental Disorders"[MeSH Terms] OR "Psychiatric Status Rating Scales"[MeSH Terms] OR "retrospective childhood symptom"[Title/Abstract] OR "self-diagnosis"[Title/Abstract] OR "diagnostic practice"[Title/Abstract] OR "online diagnostic service"[Title/Abstract] OR "telehealth diagnosis"[Title/Abstract]) AND ("Sensitivity and Specificity"[MeSH Terms] OR "Diagnostic Errors"[MeSH Terms] OR "Misdiagnosis"[Title/Abstract] OR "Negative Predictive Value"[Title/Abstract] OR "Positive Predictive Value"[Title/Abstract] OR "diagnostic accuracy"[Title/Abstract] OR "diagnostic validity"[Title/Abstract] OR "overdiagnosis"[Title/Abstract] OR "over-diagnosis"[Title/Abstract] OR "underdiagnosis"[Title/Abstract] OR "under-diagnosis"[Title/Abstract] OR "false positive"[Title/Abstract] OR "false negative"[Title/Abstract] OR "diagnostic error"[Title/Abstract] OR "diagnostic agreement"[Title/Abstract] OR "diagnostic concordance"[Title/Abstract] OR "ADHD diagnosis rate"[Title/Abstract] OR "ADHD awareness"[Title/Abstract] OR "functional impairment"[Title/Abstract])) AND 2023:2026[dp]

OpenAlex

(("adult onset ADHD" OR "adult ADHD" OR "ADHD assessment" OR "ADHD evaluation" OR "suspected ADHD" OR "high-functioning ADHD" OR "late-diagnosed ADHD" OR "ADHD referral" OR "women with ADHD" OR "female ADHD" OR "Attention Deficit Disorder with Hyperactivity" OR "attention deficit hyperactivity disorder" OR "adult-onset attention deficit" OR "attention deficit adults" OR "attention deficit disorder" OR "psychiatric comorbidity" OR "aged 18 and older" OR "hyperkinetic disorder") AND ("young adult" OR adults)) AND ("Adult ADHD Self-Report Scale" OR "ADHD screening" OR "ADHD self-report" OR "ADHD content" OR "ADHD Rating Scale" OR "Conners Adult ADHD Rating Scale" OR "Brown Attention-Deficit Disorder Scales" OR "brief psychiatric assessment" OR "online diagnosis" OR "Diagnostic and Statistical Manual of Mental Disorders" OR "Psychiatric Status Rating Scales" OR "retrospective childhood symptom" OR self-diagnosis OR "diagnostic practice" OR "online diagnostic service" OR "telehealth diagnosis") AND ("Sensitivity and Specificity" OR "Diagnostic Errors" OR Misdiagnosis OR "Negative Predictive Value" OR "Positive Predictive Value" OR "diagnostic accuracy" OR "diagnostic validity" OR overdiagnosis OR over-diagnosis OR underdiagnosis OR under-diagnosis OR "false positive" OR "false negative" OR "diagnostic error" OR "diagnostic agreement" OR "diagnostic concordance" OR "ADHD diagnosis rate" OR "ADHD awareness" OR "functional impairment")

ClinicalTrials.gov

(("adult onset ADHD" OR "adult ADHD" OR "ADHD assessment" OR "ADHD evaluation" OR "suspected ADHD" OR "high-functioning ADHD" OR "late-diagnosed ADHD" OR "ADHD referral" OR "women with ADHD" OR "female ADHD" OR "Attention Deficit Disorder with Hyperactivity" OR "attention deficit hyperactivity disorder" OR "adult-onset attention deficit" OR "attention deficit adults" OR "attention deficit disorder" OR "psychiatric comorbidity" OR "aged 18 and older" OR "hyperkinetic disorder") AND ("young adult" OR adults)) AND ("Adult ADHD Self-Report Scale" OR "ADHD screening" OR "ADHD self-report" OR "ADHD content" OR "ADHD Rating Scale" OR "Conners Adult ADHD Rating Scale" OR "Brown Attention-Deficit Disorder Scales" OR "brief psychiatric assessment" OR "online diagnosis" OR "Diagnostic and Statistical Manual of Mental Disorders" OR "Psychiatric Status Rating Scales" OR "retrospective childhood symptom" OR self-diagnosis OR "diagnostic practice" OR "online diagnostic service" OR "telehealth diagnosis") AND ("Sensitivity and Specificity" OR "Diagnostic Errors" OR Misdiagnosis OR "Negative Predictive Value" OR "Positive Predictive Value" OR "diagnostic accuracy" OR "diagnostic validity" OR overdiagnosis OR over-diagnosis OR underdiagnosis OR under-diagnosis OR "false positive" OR "false negative" OR "diagnostic error" OR "diagnostic agreement" OR "diagnostic concordance" OR "ADHD diagnosis rate" OR "ADHD awareness" OR "functional impairment")
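The verbatim strings above follow one pattern: terms inside a keyword set are OR-joined, and the sets are then AND-joined. The following is a minimal Python sketch of that assembly, not the tool actually used for this review. The helper names `or_group` and `and_join` are hypothetical, and the keyword lists are abbreviated to two terms each, copied from the queries above. Unlike the OpenAlex and ClinicalTrials.gov strings, the sketch quotes every term, including single words.

```python
def or_group(terms, field=None):
    """OR-join the terms of one keyword set inside parentheses.

    If `field` is given, each term gets a PubMed-style field tag,
    e.g. "adult ADHD"[Title/Abstract].
    """
    if field:
        tagged = [f'"{term}"[{field}]' for term in terms]
    else:
        tagged = [f'"{term}"' for term in terms]
    return "(" + " OR ".join(tagged) + ")"


def and_join(groups):
    """AND-join keyword sets that are already parenthesised."""
    return " AND ".join(groups)


# Abbreviated keyword sets (illustration only; the full lists are in the
# verbatim queries).
population = ["adult ADHD", "suspected ADHD"]
exposure = ["ADHD screening", "self-diagnosis"]
outcome = ["diagnostic accuracy", "overdiagnosis"]

query = and_join([or_group(population), or_group(exposure), or_group(outcome)])
print(query)

# PubMed-style variant with field tags on one keyword set.
print(or_group(population, field="Title/Abstract"))
```

Keeping each keyword set as a plain list makes it straightforward to regenerate the string for each database from the same source terms.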

The Boolean query is assembled as (Population) AND (Intervention/Exposure) only. Comparator and Outcome terms are not AND-joined into the search query, in line with the Cochrane Handbook §4.4.4: outcome and comparator filters belong in inclusion criteria and post-retrieval relevance assessment, not in the search query. This convention prevents premature exclusion of articles that report the outcome or comparator using vocabulary not in the keyword list.

Retrieved records were screened for relevance against the study goal and the prespecified PICO before full-text review and synthesis.

2.1.1 Scope Note (Scoping Review Framing)

This document is presented as a scoping review of available evidence, framed in accordance with the Arksey and O'Malley (2005) scoping review framework and the JBI scoping review methodology. The reviewer assessed the alignment of the retrieved evidence with the prespecified PICO at the end of the retrieval stage and identified the following limitations:

  • the included studies showed only partial overlap with the target population's prespecified PICO characteristics

The methodological implication of scoping-review framing is that broader inclusion criteria are accepted: heterogeneous study designs, a wider range of population overlap, and variation in intervention delivery are all tolerated rather than excluded. Pooled effect estimates are not pursued; the synthesis approach is charting and narrative mapping of the available evidence.

2.2 Inclusion and Exclusion Criteria

Population: Adults aged 18 years and older evaluated for suspected ADHD across outpatient psychiatric, primary care, telehealth, neuropsychological, university health, occupational, and community settings, including subgroups such as women, high-functioning adults, those with comorbid anxiety, depression, trauma, or substance use, college students, self-referred individuals, and adults diagnosed later in life

Intervention/Exposure: Contemporary adult ADHD diagnostic practices including unstructured clinical interviews, DSM-based symptom checklists, self-report scales, telehealth-based evaluations, brief psychiatric assessments, primary care screening, online diagnostic services, retrospective childhood symptom reconstruction, and clinician judgment without structured assessment; additional exposures include social media ADHD content, self-diagnosis, and commercial assessment platforms

Comparator: Rigorous or validation-oriented assessment methods including structured diagnostic interviews, multi-informant assessments, collateral history, longitudinal symptom verification, developmental history confirmation, neuropsychological testing, impairment-based validation, expert consensus diagnosis, blinded reassessment, and long-term functional outcome validation; also comparisons between specialist versus non-specialist settings, telehealth versus in-person evaluation, and comprehensive versus rapid diagnostic models

Outcome: Diagnostic accuracy of adult ADHD assessment, including false positive and false negative diagnoses, diagnostic stability, sensitivity and specificity of assessment methods, inter-method diagnostic agreement, symptom overlap misclassification, functional impairment, academic and occupational functioning, treatment response, stimulant response patterns, psychiatric comorbidity burden, quality of life, executive functioning, long-term symptom persistence, and influence of social media on ADHD self-identification and diagnostic demand

Articles were included if they met the prespecified PICO criteria and open-access full text was available for synthesis. Full-text availability was assessed against PubMed Central and additional open-access repositories.

Eligibility Constraints

  • Search executed 2026-06-05; no publication-year filter was applied, apart from the 2023:2026[dp] date limit that appears in the verbatim PubMed query.
  • No language filter applied at retrieval; all results returned by the database query were considered for screening.
  • Study design: no design filter was applied at retrieval; study designs were classified post-retrieval (see §2.3 Study Selection).

2.3 Study Selection

A total of 1165 records were identified across the searched sources (PubMed: 25, OpenAlex: 709, ClinicalTrials.gov: 288, Author's reference collection: 143). After removal of 163 duplicates, 1002 records remained. Title and abstract screening excluded 468 records, leaving 534 for eligibility assessment, all of which met the inclusion criteria. Of these 534, 80 had retrievable full text and were included in the narrative synthesis; the remaining 454 were excluded because no full text could be retrieved.
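The record flow can be checked with simple arithmetic. The sketch below recomputes each stage from the counts reported in this section. The variable names are illustrative, and reading the 45.8% relevance rate in the review header as records passing screening divided by records identified is an assumption, not something the review states.

```python
# Counts per source, as reported in Section 2.3.
identified = {
    "PubMed": 25,
    "OpenAlex": 709,
    "ClinicalTrials.gov": 288,
    "Author's reference collection": 143,
}

total_identified = sum(identified.values())
after_deduplication = 1002   # reported
passed_screening = 534       # reported
included_in_synthesis = 80   # reported

# Derived counts for each stage of the flow.
duplicates_removed = total_identified - after_deduplication
excluded_at_screening = after_deduplication - passed_screening
excluded_no_full_text = passed_screening - included_in_synthesis

# Assumed definition: share of identified records that passed screening.
relevance_rate_pct = round(100 * passed_screening / total_identified, 1)

print("Identified:", total_identified)
print("Duplicates removed:", duplicates_removed)
print("Excluded at screening:", excluded_at_screening)
print("Excluded, no full text:", excluded_no_full_text)
print("Relevance rate (%):", relevance_rate_pct)
```

The derived values agree with the PRISMA 2020 flow diagram counts (163 duplicates, 468 excluded at screening, 454 without retrievable full text).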

2.4 Data Extraction

Data were extracted from each of the 80 included articles by a single reviewer using a structured extraction template. For every article the reviewer recorded the main finding, the reported direction of effect, study design, population characteristics, follow-up duration where applicable, sample size, effect-magnitude language as stated by the original authors, and author-stated limitations. Extraction fields were kept as close to the source phrasing as practicable so that hedged or qualified language in the original reports was preserved rather than reinterpreted. No second reviewer independently verified the extractions; this single-reviewer approach is disclosed as a methodological characteristic of the review.

2.5 Risk of Bias Assessment

Risk of bias was assessed using QUADAS-2 and ROBINS-I.

Article | Instrument | Overall judgment
Drew Erhardt 1999 | ROBINS-I | Unclear
Erlend J. Brevik 2018 | ROBINS-I | Unclear
Elena von Wirth 2020 | ROBINS-I | Some concerns
Minha Hong 2020 | QUADAS-2 | Some concerns
Lenard A. Adler 2009 | ROBINS-I | High
André Høberg 2024 (PMC 11210094) | ROBINS-I | Some concerns
Sébastien Weibel 2017 | QUADAS-2 | High
Annie Stewart 2012 | ROBINS-I | High
Lena Nylander 2011 | ROBINS-I | Unclear
Margaret H. Sibley 2016 | ROBINS-I | High
Debjani Das 2016 | ROBINS-I | High
Hanna Christiansen 2011 | ROBINS-I | Some concerns
Allyson G. Harrison 2019 | ROBINS-I | High
Eugenia I. Gorlin 2016 | ROBINS-I | Some concerns
Michael Van Ameringen 2010 | ROBINS-I | High
Tianhua Chen 2021 | QUADAS-2 | High
Aldo Pereira 2024 | ROBINS-I | Some concerns
Joseph Biederman 2012 | ROBINS-I | High
Stephen V. Faraone 2008 | ROBINS-I | High
Páll Magnússon 2006 | ROBINS-I | High
Katie Grogan 2017 | ROBINS-I | High
Hong 2020 | QUADAS-2 | Some concerns
Daniel P. Notzon 2016 | ROBINS-I | Unclear
Margaret H. Sibley 2017 | ROBINS-I | Some concerns
Leo Bastiaens 2017 | QUADAS-2 | High
Ernest F. Johnson 2019 | ROBINS-I | High
Jasmine Hines 2012 | QUADAS-2 | Some concerns
Stéphanie Baggio 2020 | QUADAS-2 | High
Belén Roselló Miranda 2020 | ROBINS-I | Some concerns
Laura M. Garnier-Dykstra 2010 | ROBINS-I | High
Nina E Calmenson 2021 | ROBINS-I | Unclear
Lucy Riglin 2021 | QUADAS-2 | Some concerns
Staffan Söderström 2013 | QUADAS-2 | Some concerns
Mariano Gabriel Scandar 2021 | ROBINS-I | High
Mathias Luderer 2018 | QUADAS-2 | High
Brooke C. Schneider 2019 | ROBINS-I | High
Josep Antoni Ramos-Quiroga 2016 | QUADAS-2 | High
Margaret H. Sibley 2021 | ROBINS-I | High
Marios Adamou 2026 (PMC 13219367) | QUADAS-2 | High
Lida Zamani 2020 | QUADAS-2 | Some concerns
Shanel Chandra 2016 | ROBINS-I | Some concerns
R. Mayes 2015 | ROBINS-I | High
Morgan B. Drake 2017 | QUADAS-2 | Some concerns
Saima Jehanzeb 2025 | ROBINS-I | High
Johanna Waltereit 2025 (PMC 11868344) | ROBINS-I | High
Paulo Mattos 2018 | QUADAS-2 | High
Joel Paris 2015 | QUADAS-2 | High
Richard Pettersson 2015 | QUADAS-2 | Some concerns
Sharon Suganthi Caroline S 2024 | ROBINS-I | High
Yu-Ju Lin 2015 | ROBINS-I | Some concerns
Berk Ustun 2017 | QUADAS-2 | Some concerns
Jan Loney 2007 | ROBINS-I | High
Margaret H. Sibley 2012 | ROBINS-I | High
Hui Dong 2023 | ROBINS-I | Some concerns
Catherine M. McCormick-Deaton 2018 | ROBINS-I | High
Friederike Blume 2025 (PMC 11571603) | ROBINS-I | Some concerns
Geurt van de Glind 2013 | ROBINS-I | High
Sasa L. Kivisaari 2008 | ROBINS-I | Unclear
Stephen L. Able 2006 | ROBINS-I | High
L. I. Birtalan 2024 | ROBINS-I | High
Josep Antoni Ramos-Quiroga 2012 | QUADAS-2 | Some concerns
Michael Rösler 2006 | ROBINS-I | High
Stacy Jean Graves 2022 | QUADAS-2 | High
Salvatore Mannuzza 2002 | QUADAS-2 | Some concerns
Mary V. Solanto 2004 | ROBINS-I | Some concerns
Joel Young 2023 | QUADAS-2 | High
Bhathika Perera 2019 | QUADAS-2 | High
Susana Farcas 2018 | ROBINS-I | High
Miriam Becke 2022 | QUADAS-2 | High
Akiko Nishikawa 2024 | ROBINS-I | Some concerns
Erlend J. Brevik 2020 | QUADAS-2 | High
Allyson G. Harrison 2016 | ROBINS-I | High
Allyson G. Harrison 2013 | ROBINS-I | High
Myriam J. Sollman 2010 | ROBINS-I | High
Marius Grandjean 2025 | ROBINS-I | High
Noa Givon-Schaham 2026 | ROBINS-I | Some concerns
Andrew C. Hale 2020 | ROBINS-I | Some concerns

2.6 Synthesis Methods

This document is presented as a scoping review of available evidence, synthesized following the Arksey and O’Malley (2005) scoping review framework and the JBI scoping review methodology. The synthesis approach is charting and narrative mapping rather than meta-analysis or pooled effect estimation. Studies were charted according to direction of reported effect, study design, and population characteristics; findings are reported descriptively across the structured narrative subsections of Section 3.2. The scoping framework was selected because the reviewer judged the retrieved studies to have insufficient overlap with the prespecified PICO to support a standard systematic review synthesis. Broader inclusion criteria are therefore accepted, accommodating heterogeneous study designs and a wider range of population overlap than a tightly-specified systematic review would require.

PRISMA 2020 Flow Diagram

The flow of records through identification, screening, eligibility, and inclusion is summarized below (PRISMA 2020 Item 16a).

Identification

1165 records were identified.

  • PubMed: 25
  • OpenAlex: 709
  • ClinicalTrials.gov: 288
  • Author's reference collection: 143

Screening

Records removed as duplicates: 163

Records after deduplication: 1002

Records screened: 1002

Records excluded during screening: 468

Eligibility

Records assessed for eligibility: 534

Records excluded at eligibility: 0

Included

Records excluded (no retrievable full text): 454

Studies included in synthesis: 80

3. Results

Coverage and Validity Scorecard

Field | Value
Databases searched | PubMed, OpenAlex, ClinicalTrials.gov, Author's reference collection
Articles identified per database | PubMed: 25, OpenAlex: 709, ClinicalTrials.gov: 288, Author's reference collection: 143
Total identified (pre-dedup) | 1165
Articles after deduplication | 1002
Records after title and abstract screening | 534
Records after eligibility assessment | 534
Relevance filter | None applied
Articles in synthesis | 80
Fulltext retrieved per source | Not reported
Fulltext retrieved (total) | 80
Fulltext not retrieved | 454
PRISMA 2020 checklist coverage | 26 ADDRESSED · 0 PARTIALLY · 1 NOT ADDRESSED
Risk-of-bias distribution | Unclear: 6, Some concerns: 26, High: 45
GRADE final certainty | Very Low

All values above are counts reflecting the records retrieved and screened for this review. No composite quality or validity scores are computed; reviewers are expected to interpret these counts in context.

3.1 Study Characteristics

[Study characteristics table — to be completed]

Per-stage exclusion reasons are summarized below (PRISMA 2020 Item 16b).

Stage | Reason | Count
Title and abstract screening | Below threshold | 468

Included studies are stratified by study design below.

Study type | Count
Randomized controlled trials | 1
Systematic reviews and meta-analyses | 3
Observational studies | 62
Other study designs | 14

3.2 Synthesis of Findings

Note on evidence hierarchy. This review's included set mixes primary studies (RCTs and/or observational designs) with secondary evidence (systematic reviews and/or meta-analyses). Reviewers should consider double-counting risk and weight primary and secondary evidence accordingly.

1. Direction-of-Effect Summary

Of the 80 included articles, 55 (68.8%) reported findings that support the study goal of evaluating whether adult ADHD is being overdiagnosed, underdiagnosed, or both, and identifying evidence-based principles for maximizing diagnostic accuracy. Thirteen articles (16.3%) yielded mixed or inconclusive findings. The remaining 12 articles (15.0%) lacked extractable main findings (null extractions) and were classified as mixed or inconclusive by default, bringing the total mixed or inconclusive count to 25 (31.3%). No article reported findings that directly contradicted the study goal.

Among the 55 supportive studies, reported significance metrics were as follows:

  • von Wirth 2020: p = .003
  • Hong 2020 (two entries): p < 0.05
  • Stewart 2012: p < .0001
  • Das 2016: p = 0.002
  • Gorlin 2016: p < 0.001 for internal consistency item-total correlations; Cohen's kappa = 0.88 for interrater reliability; all Steiger's z > 2 with all p < 0.05 for convergent versus discriminant validity comparisons
  • Van Ameringen 2010: P ≤ 0.05
  • Pereira 2024: p < 0.001 for sex difference in age of ADHD diagnosis; p < 0.001 for sex difference in ADHD-RS; p = 0.001 for sex difference in WHODAS; p = 0.015 for sex × ADHD subtype interaction on WHODAS
  • Harrison 2019 (two entries): p < .001
  • Luderer 2018: χ²(3) = 51.5, p < .0001 for main effect between four modalities; combination of ASRS and CAARS-S-SR superior to single questionnaires (ASRS dichotomized p = .0286, ASRS sum score p = 0.0009, CAARS-S-SR p = 0.0043)
  • Schneider 2019: p < 0.05
  • Ramos-Quiroga 2016: p < .0001
  • Zamani 2020: p = .001 for test-retest kappa of 0.857
  • Mattos 2018: p < 0.05
  • Adamou 2026: sensitivity 100.0% (95% CI 61.0%–100.0%); specificity 45.2% (95% CI 31.2%–59.9%)
  • Drake 2017 (three entries): p < .001 for omnibus model; TOVA RT variability p = .023 at final step
  • Roselló Miranda 2020: p < .001 for MANOVAs on executive functions, ADHD associated behaviors, and functional impairments
  • Hines 2012: P = .33 for site-specific differences in ASRS-V1.1 results; P = .007 for site-specific differences in CAARS-S:S E-ADHD Index
  • Riglin 2021: 95% CI = 0.87–0.93
  • Able 2006: p < 0.001
  • Pettersson 2015: p < .001
  • Solanto 2004: p < 0.001
  • Young 2023: p < .001
  • Perera 2019: p < 0.001 for sensitivity comparison between clinical opinion and DSM-5 criteria
  • van de Glind 2013: p < 0.001 for correlation between ADHD and autism trait scores; p < 0.001 for relationship between staff category and ADHD screening outcome; p = 0.04 for relationship between offence type and ADHD
  • Christiansen 2011: differences on all subscales were highly significant between patients and controls
  • Bastiaens 2017: both screening instruments appear to perform equally without significant difference between them, no matter which scoring system was used
  • Farcas 2018: p < 0.001; p = .009 for gender and ADHD symptom association on 18-item checklist; p < .001 (reported as p = .000) for 6-item screener
  • Brevik 2020: 95% CI 0.946–0.965 for WURS AUC; 95% CI 0.888–0.921 for ASRS AUC; 95% CI 0.955–0.973 for combined AUC
  • Scandar 2021: p < .001 for ASRS-18 model
  • Blume 2025: p < .001 for SWAN-DE-SB
  • Givon-Schaham 2026: all adjusted p < .002 for the five DIF items; Steiger's test t(597) = −4.85, p < .001 for divergence of Part A and Part B correlations with age
  • Hale 2020: p < 0.001 for prevalence and incidence trends
  • Sibley 2012: p < 0.01 for teacher-report odds ratios
  • Dong 2023: association between symptom overreporting and cognitive underperformance was non-significant (χ² = 1.196, df = 1, p = 0.274), while CAARS-SR-index was a significant predictor of EVI failures (p = 0.0492)
  • Kraut 2025: grouping OR 1.25, 95% CI 0.98–1.58, p = 0.07; shading OR 0.88, 95% CI 0.69–1.12, p = 0.29; both not statistically significant
  • Becke 2022: p < 0.01 for simulation versus ADHD comparison
  • Loney 2007: p < 0.05 for retrospective self-rating validity
  • Lin 2015: p < 0.001 for DSM-5 ADHD quality of life and functional impairment comparisons

Seventeen of the 55 supportive studies did not report a significance metric: Sibley 2016, Biederman 2012, Johnson 2019, Mayes 2015, Rösler 2006, Graves 2022, Jehanzeb 2025, Waltereit 2025, Caroline 2024, McCormick-Deaton 2018, Lee 2025, Ustun 2017, Birtalan 2024, Grandjean 2025, Harrison 2013, Sollman 2010, and Chen 2021.

Among the 25 articles classified as mixed or inconclusive, reported significance metrics were as follows:

  • Adler 2009: p = 0.01
  • Høberg 2024: likelihood ratio tests with PGS versus base model p < 0.001; ASRS + PGS versus ASRS p = 0.0381; WURS + PGS versus WURS p = 0.0048; ASRS + WURS + PGS versus ASRS + WURS p = 0.0103
  • Nishikawa 2024: p < 0.05 for impulsivity/emotional instability p < 0.001, DSM-IV hyperactive-impulsive p < 0.05, and ADHD index p < 0.001
  • Chandra 2016: for adolescents, PRS-hyperactivity β = 0.10, p = 7.52E-4 and PRS-inattention β = 0.09, p = 0.02; for adults, PRS-hyperactivity β = 0.19, p = 3.58E-03 and PRS-inattention β = 0.06, p = 0.15 (not significant); all p values FDR corrected
  • Harrison 2016: p = 0.773 for prevalence difference of probable ADHD between groups; p = 0.003 for Inattention/Memory subscale difference; p = 0.001 for Self-Concept subscale difference
  • Grogan 2017: p < 0.01 for advanced forms of generalized periodontitis; p < 0.05 for inflammatory diseases of periodontal tissues
  • Söderström 2013: p < 0.05 for eye vergence classification
  • Magnússon 2006: p < .001 for stimulant misuse comparisons

Fifteen of the 25 mixed or inconclusive articles did not report a significance metric: Erhardt 1999, Brevik 2018, Nylander 2011, Calmenson 2021, Notzon 2016, Barceló 2016, Kivisaari 2008, Faraone 2008, Weibel 2017, and six additional null-extraction articles.

2. Consistency vs. Contradiction Analysis

Across the included studies, several broad areas of agreement emerge, though important tensions exist — often traceable to differences in population, setting, assessment method, or the specific diagnostic question being asked.

Areas of broad consistency

The most robust convergence concerns the inadequacy of any single assessment modality for adult ADHD diagnosis. Self-report screening instruments such as the ASRS consistently demonstrated high sensitivity but limited specificity, producing screen-positive rates far exceeding expected population prevalence. In a primary care factorial trial, 32% of adults screened positive on the ASRS despite an estimated population prevalence of 2–7%. Among medical students, ASRS-based prevalence estimates reached 38.9%, whereas structured diagnostic interviews in the same populations yielded rates closer to 4–8%. A stepwise study of Brazilian medical students showed that self-report screening (ASRS) identified 37% as probable cases, semi-structured interview reduced this to 7.9%, and additional probing for real-life examples of DSM symptoms nearly halved that rate, to 4.5%. These findings align with the observation that the ASRS 6-item screener identified 37.3% of a Hungarian community sample as highly likely to have ADHD, compared with 4.5% identified by the full 18-item symptom checklist. Taken together, these studies consistently indicate that brief self-report tools overestimate ADHD prevalence when used without confirmatory assessment.
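The gap between screen-positive rates and true prevalence follows from simple arithmetic: the expected screen-positive fraction is sensitivity × prevalence + (1 − specificity) × (1 − prevalence). The Python sketch below plugs in the primary care ASRS operating characteristics reported elsewhere in this review (sensitivity 1.0, specificity 0.71) together with the 2–7% prevalence range. Combining figures from different studies is an illustrative assumption rather than a pooled estimate, and the function name is hypothetical.

```python
def expected_screen_positive_rate(sensitivity, specificity, prevalence):
    """Expected fraction of a screened population that screens positive."""
    true_positive_fraction = sensitivity * prevalence
    false_positive_fraction = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_fraction + false_positive_fraction


# Assumed inputs: sensitivity 1.0 and specificity 0.71 (primary care ASRS study),
# prevalence bounds of 2% and 7% (estimated population prevalence).
for prevalence in (0.02, 0.07):
    rate = expected_screen_positive_rate(1.0, 0.71, prevalence)
    print(f"prevalence {prevalence:.0%}: expected screen-positive rate {rate:.1%}")
```

Under these assumptions, a screen-positive rate of roughly 30–34% would be expected even when true prevalence is only 2–7%, which is in line with the 32% observed in the factorial trial.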

Semi-structured diagnostic interviews — particularly the DIVA 2.0 and DIVA-5 — showed consistently strong psychometric properties across multiple validation studies. The DIVA 2.0 achieved perfect agreement (Kappa = 1.0) with the CAADID in a Spanish specialist sample, while the Korean DIVA-5 demonstrated 92% diagnostic accuracy with sensitivity of 91.3% and specificity of 93.6%. The Farsi DIVA-5 showed high specificity (98.2%) but more modest sensitivity (68.2%), a discrepancy likely attributable to the self-referred, treatment-naïve nature of that sample and the absence of accompanying informants in most interviews. Across these studies, structured interviews consistently outperformed unstructured clinical assessment and self-report scales in discriminative validity.

Neuropsychological testing, by contrast, was consistently found to have poor discriminative validity for adult ADHD when used in isolation. In a Swedish psychiatric sample, neuropsychological tests achieved classification accuracy of only 53–66% — barely above chance — whereas the DIVA 2.0 achieved sensitivity of 90.0% and specificity of 72.9%. A multi-method study of adults with ADHD, depression, and healthy controls similarly found that single neuropsychological test measures performed poorly in identifying ADHD, though combining self- and informant-report symptom ratings with family history and a reaction-time variability measure from the TOVA correctly classified 87% of cases. A Hungarian clinical observation likewise noted significant discrepancies between neuropsychological assessment results and other diagnostic tools. These findings converge on the conclusion that neuropsychological testing may supplement but cannot replace structured clinical assessment.

Studies also consistently documented that retrospective self-report of childhood ADHD symptoms is unreliable. A longitudinal cohort reassessed approximately 18 years after childhood treatment found that 79% of adults underreported their childhood symptoms relative to parent ratings collected during childhood, with 17% falsely denying the presence of at least several childhood symptoms (Cohen's d = 1.15 for the discrepancy). A separate longitudinal study of boys referred in childhood found that retrospective self-ratings correlated only weakly with judges' chart ratings (median r = 0.16) and follow-up examiners' ratings (median r = 0.14). These findings reinforce the importance of collateral informant data and developmental history verification.

Areas of tension and their sources

The most prominent area of disagreement concerns whether adult ADHD is predominantly overdiagnosed, underdiagnosed, or both. An editorial commentary argued that the sharp rise in Scandinavian ADHD diagnoses reflects increased recognition of a previously underidentified disorder rather than overdiagnosis. A managed-care study found that 6.2% of adults screened positive for ADHD without a formal diagnosis and exhibited significantly greater functional impairment and comorbidity than non-ADHD controls, consistent with underdiagnosis. Screening of London police custody detainees identified 50% of arrestees without an existing diagnosis as warranting further ADHD assessment. A study of an anxiety disorders clinic found a 27.9% ADHD prevalence rate, substantially higher than general population estimates, suggesting underrecognition in comorbid populations. College students without ADHD diagnoses showed a 10.3% screen-positive rate on the ASRS, suggesting a pool of potentially undiagnosed individuals. Pointing the other way, a narrative review of adolescent and young adult populations estimated that 25–48% of self-referred students exaggerate symptoms. A book review concluded that both overdiagnosis and underdiagnosis occur simultaneously, driven by societal, educational, and policy forces.

These seemingly contradictory findings become more coherent when the clinical context is considered. Studies conducted in specialist referral settings, forensic populations, and comorbid psychiatric samples such as anxiety-disorder and substance-use clinics generally pointed toward underdiagnosis, with genuine ADHD going unrecognized amid complex presentations. The exception among comorbid samples was patients with bipolar disorder or borderline personality disorder, in whom false-positive screens were common. Studies conducted in self-referred populations, university health settings, and primary care consistently raised concerns about overdiagnosis or false-positive identification, particularly when brief screening tools were used without structured follow-up. The direction of diagnostic error thus appears to depend heavily on the clinical pathway through which individuals arrive at assessment.

The performance of the ASRS illustrates this context-dependence. In primary care, the ASRS-v1.1 achieved an inconsistency-adjusted sensitivity of 1.0 and specificity of 0.71, with a positive predictive value of only 0.52. Among patients with borderline personality disorder, the ASRS-v1.1 had a positive predictive value of just 38.5%, with 62% of positive screens representing false alarms. In alcohol-dependent patients, established ASRS cut-off values produced unacceptably low sensitivity (57.1%), requiring lower thresholds to detect the majority of ADHD cases. In a dually diagnosed correctional population, only dimensional scoring with a lowered cut-off of 12 provided acceptable sensitivity above 80%. The French ASRS-5 achieved an outstanding AUC of 0.945 in a general clinical sample but showed notably lower sensitivity (63.5%) for low-severity ADHD and high false-positive rates (45.9%) among patients with comorbid bipolar disorder or borderline personality disorder. These variations are not contradictions. Sensitivity and specificity shift with case mix, including symptom severity and the prevalence of symptom-mimicking comorbidities, while predictive values additionally depend on the base rate of ADHD in the population being screened.
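The dependence of positive predictive value (PPV) on base rate can be made concrete with Bayes' rule. The sketch below is illustrative only: it takes the primary care figures quoted above (sensitivity 1.0, specificity 0.71, PPV 0.52) and assumes sensitivity and specificity stay fixed while prevalence varies. The function names are hypothetical, and the implied sample prevalence it computes is an inference from those three numbers, not a value reported by the source study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from Bayes' rule: true positives over all positive screens."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_prevalence(sensitivity, specificity, ppv):
    """Invert the PPV formula to find the prevalence that yields a given PPV."""
    false_pos_rate = 1.0 - specificity
    return (ppv * false_pos_rate) / (sensitivity * (1.0 - ppv) + ppv * false_pos_rate)


# Assumed inputs taken from the primary care ASRS-v1.1 figures quoted above.
SENSITIVITY, SPECIFICITY, REPORTED_PPV = 1.0, 0.71, 0.52

sample_prevalence = implied_prevalence(SENSITIVITY, SPECIFICITY, REPORTED_PPV)
print(f"prevalence implied by the reported PPV: {sample_prevalence:.1%}")

for prevalence in (0.02, 0.07, sample_prevalence):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.1%} -> PPV {ppv:.1%}")
```

On these assumptions, the reported PPV of 0.52 corresponds to a sample prevalence of roughly 24%, whereas the same screener applied at a 2–7% base rate would yield a PPV of only about 7–21%.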

Persistence estimates for ADHD from childhood into adulthood varied dramatically (from 4% to 77%), but a systematic review demonstrated that this range is largely methodological. Sole reliance on self-report and strict six-symptom DSM thresholds produced the lowest estimates, while recommended methods incorporating self- and informant-report, impairment requirements, and age-appropriate symptom thresholds yielded persistence rates of 40–50%. A prospective longitudinal study from the MTA cohort found that approximately 95% of individuals who initially screened positive for late-onset ADHD were excluded after comprehensive, multi-informant, longitudinal stepped assessment, with most false positives attributable to heavy substance use or comorbid mental disorders. These findings are consistent rather than contradictory: the apparent disagreement in persistence rates largely dissolves when assessment methodology is held constant.

Sex-related diagnostic disparities were documented in a large Spanish clinical sample, where females with ADHD were diagnosed significantly later than males (Cohen's d = 0.323), exhibited greater ADHD severity, higher depression and anxiety, poorer functioning, and greater disability. A psychometric study of the ASRS-18 found that gender influences response patterns, with women requiring a greater latent trait to score higher on inattention items and a lower trait for hyperactivity items — a measurement property that could systematically affect symptom counting for diagnostic criteria. These findings are consistent with each other and with the broader literature suggesting that current diagnostic instruments may be less sensitive to female presentations of ADHD.

The question of symptom validity and feigning produced consistent findings across multiple studies. Symptom validity measures showed divergent classification results in clinically diagnosed ADHD patients, with failure rates ranging from 8% to 49% depending on the measure used. A retrospective study of consecutive referrals for adult ADHD assessment found that 71% of patients making suspect effort would be misdiagnosed with ADHD based on interview alone, and that suspect-effort and ADHD groups were nearly indistinguishable on behavior rating scales and continuous performance tests without dedicated validity testing. A narrative review estimated that 25–48% of self-referred students exaggerate ADHD symptoms. An analogue study found that combining multiple validity indicators embedded in the CAARS could improve detection of feigned ADHD, though no single combination markedly outperformed individual indicators. These studies consistently support the inclusion of performance and symptom validity testing in adult ADHD evaluations, particularly in contexts where secondary gain is plausible.

A smaller number of studies addressed emerging or adjunctive diagnostic approaches. An ADHD polygenic risk score contributed statistically significant but clinically negligible additional variance (0.57–1.52 percentage points) beyond self-report scales and family history, and the authors concluded it is not currently a clinically useful diagnostic aid. A machine-learning study using clinical assessment data from an NHS trust achieved 85.5% accuracy in a sample of only 69 patients, a result the authors described as promising but preliminary. An eye-vergence classification study achieved high accuracy in children but has not been validated in adults. A study of systematic quantitative analysis of primary school reports achieved AUC up to 0.97 for retrospective identification of childhood ADHD, offering a potential alternative to unreliable retrospective self-report. These studies do not contradict the main body of evidence but rather represent early-stage investigations whose clinical utility remains to be established.

Finally, a clinician survey revealed that diagnostic practices frequently diverge from guideline recommendations: only 31.1% of surveyed clinicians used a structured or semi-structured interview, and almost half indicated they never or rarely obtained collateral reports. This finding is consistent with the observation that ADHD diagnoses among VA patients increased 258% from 2009 to 2016 while the proportion of new diagnoses accompanied by neuropsychological evaluation decreased. Together, these studies suggest that the rising rate of adult ADHD diagnosis is occurring alongside — and perhaps partly because of — declining rigor in assessment methodology, a pattern that could simultaneously produce both overdiagnosis in some settings and underdiagnosis in others.

One included study examined periodontal disease in Ukrainian military personnel with PTSD and did not address adult ADHD diagnostic accuracy; its findings are not synthesized here.

3. Study Limitations (Aggregated)

Author-stated limitations were available for the majority of included studies, though a subset of articles — primarily editorials, narrative reviews, opinion papers, and studies with truncated text — did not report formal limitations. The limitations reported across the extraction set cluster around several recurring themes, each of which bears directly on the confidence with which diagnostic accuracy conclusions can be drawn.

Reliance on self-report and absence of collateral information

The most frequently cited limitation was dependence on self-report measures without corroboration from informants, collateral records, or structured clinical interviews. At least fifteen studies explicitly acknowledged that ADHD symptoms, childhood history, or functional impairment were assessed solely through participant self-report, raising concerns about recall bias, social desirability, and the inability to verify symptom endorsement. Several authors noted that retrospective recall of childhood symptoms is particularly vulnerable to distortion; one longitudinal study found that 79% of participants underreported childhood symptoms relative to parent ratings collected during childhood, and 17% falsely denied the presence of at least several childhood symptoms. Multiple studies that used screening instruments such as the ASRS or WURS acknowledged that these tools are not diagnostic and cannot rule out comorbid conditions or detect inconsistent responding.

Sample composition and generalizability

Restricted or unrepresentative samples were cited as limitations in at least twenty studies. Common concerns included recruitment from single clinical sites or specialty ADHD clinics, overrepresentation of particular demographic groups (male participants in some studies, female participants in others, predominantly white or highly educated samples), and geographic restriction to a single country or region. Several studies noted that their clinical samples — drawn from tertiary referral centers or specialized ADHD programs — may not reflect the broader population of adults presenting with attentional complaints in primary care or community settings. Studies conducted among college students acknowledged that findings may not extend to the general adult population, and studies in correctional or substance-use-disorder populations noted that the high comorbidity burden in those settings may limit applicability elsewhere.

Cross-sectional design and inability to infer causation

A substantial number of studies employed cross-sectional designs, and their authors acknowledged that this precludes causal inference and limits the ability to track diagnostic stability over time. This limitation is particularly relevant to the question of whether ADHD is being over- or underdiagnosed, because cross-sectional snapshots cannot distinguish transient symptom elevations from persistent disorder. Only a small number of included studies used prospective longitudinal follow-up, and even among those, attrition and non-random dropout were acknowledged as threats to validity.

Small sample sizes and limited statistical power

At least ten studies explicitly cited small sample sizes as a limitation, with several noting that subgroup analyses (by sex, ADHD presentation, or comorbidity profile) were underpowered. One diagnostic validation study included only 40 participants; another machine-learning study used data from 69 patients. Authors of these studies appropriately characterized their findings as preliminary.

Absence of a true gold standard for ADHD diagnosis

Several studies acknowledged the circular difficulty of validating one diagnostic method against another when no universally accepted gold standard exists for adult ADHD. In some validation studies, the reference diagnosis was itself based on clinical interview without independent confirmation, and authors noted that this may introduce confirmation bias. One vignette-based study acknowledged that the "correct" diagnosis assigned by the study authors was subjective. Studies comparing screening instruments to structured interviews recognized that the structured interview itself may not perfectly capture the construct.

Comorbidity and differential diagnosis

Multiple studies noted that high rates of psychiatric comorbidity — including depression, anxiety, bipolar disorder, borderline personality disorder, substance use disorders, and trauma-related conditions — complicate the interpretation of ADHD screening and diagnostic results. Authors of studies in anxiety-disorder clinics, substance-use-treatment settings, and correctional populations acknowledged that overlapping symptom profiles may inflate false-positive rates on ADHD-specific instruments. The ASRS screening tool was specifically noted in several studies to lack the capacity to distinguish ADHD symptoms from those attributable to mood, anxiety, or personality disorders.

Diagnostic criteria and classification system differences

Several studies were conducted using DSM-IV criteria rather than DSM-5, and their authors acknowledged that the more restrictive earlier criteria (requiring onset before age 7 and a higher symptom threshold) may have influenced prevalence estimates and diagnostic concordance. One systematic review noted that variations in childhood diagnostic criteria, decade of study, and participant demographics all influence persistence estimates in ways that are difficult to disentangle.

Analogue and simulation study limitations

Studies that used instructed simulators to evaluate symptom validity measures acknowledged that analogue designs overestimate classification accuracy and effect sizes relative to real-world malingering. Authors noted that the motivations and strategies of instructed simulators may differ from those of individuals feigning ADHD for genuine secondary gain, and that findings from simulation paradigms require cautious extrapolation to clinical practice.

Truncated or unavailable limitations sections

For a small number of studies, the full limitations section was not available in the extracted text. In these cases, partial limitations were reported where available, but the completeness of the limitation profile for those studies cannot be assured.

4. Population Heterogeneity

The included studies spanned a broad range of the populations specified in the review's eligibility criteria, yet coverage was uneven, and findings varied meaningfully across several subgroup axes that emerged from the extraction data.

Clinical Setting

The largest cluster of studies drew participants from specialist psychiatric or ADHD outpatient clinics. Within this setting, structured diagnostic interviews such as the DIVA 2.0 and DIVA-5 consistently demonstrated high diagnostic accuracy — 92% overall accuracy in Korean psychiatric outpatients and perfect agreement (Kappa = 1.0) with the CAADID in a Spanish university hospital sample. Self-report screeners performed less consistently in specialist settings: the ASRS showed high sensitivity but variable specificity depending on the clinical population being screened, and the BAARS-IV demonstrated poor discrimination between ADHD and healthy controls in a community mental health clinic sample.

Primary care was represented by fewer studies, but those available pointed toward both underrecognition and screening limitations. One quality-improvement project found that validated ADHD screening was used in only 3% of eligible primary care encounters before a structured implementation effort, rising to 87% afterward. A diagnostic accuracy study of the ASRS in primary care reported an inconsistency-adjusted sensitivity of 1.0 and specificity of 0.71, while a factorial trial in a Canadian family medicine clinic found a screen-positive rate of 32% — far exceeding the estimated population prevalence of 2–7% — raising questions about the specificity of brief screening in unselected primary care attendees.

Correctional and forensic settings were addressed by two studies. In a dually diagnosed correctional population, the ASRS produced unacceptably low sensitivity (36%) at recommended categorical cut-offs, and only a lower dimensional threshold restored acceptable detection. Screening of arrestees in London police custody identified 50% of those without an existing diagnosis as warranting further ADHD assessment, suggesting substantial unrecognized ADHD in criminal-justice populations.

A single NHS-based prospective validation study evaluated a criterion-based triage pathway administered by non-specialist clinicians, achieving 100% sensitivity but only 45.2% specificity. The pathway therefore produced no false negatives in this sample, making it safe for ruling out ADHD, but it permitted a high rate of onward referral for individuals who ultimately did not receive an ADHD diagnosis.

Sex and Gender

Sex-stratified findings were reported in a minority of studies but consistently pointed toward later diagnosis and greater clinical burden in women. In a large Spanish clinical cohort of 900 adults with ADHD, females were diagnosed significantly later than males (Cohen's d = 0.323), reported greater ADHD severity, higher depression and anxiety, and more pronounced disability, with a significant sex-by-subtype interaction showing that women with combined-presentation ADHD experienced the most marked functional impairment. A Brazilian psychometric study of the ASRS-18 found that gender influenced response patterns: women required a higher latent trait level to endorse inattention items and a lower trait level for hyperactivity items, which may systematically affect symptom counting when a fixed threshold is applied. One study of non-referred university students also found that females scored higher on sensory sensitivity, sensation avoiding, inattention, and impulsivity measures. These findings collectively suggest that standard assessment instruments and thresholds may underdetect ADHD in women or mischaracterize its presentation.

Age Across the Adult Lifespan

Most included studies enrolled younger adults, and only a few examined age-related variation in diagnostic performance. A differential item functioning analysis of the ASRS across adults aged 20 to 80 found that five of 18 items showed significant age-related bias: the six-item Part A screener systematically underestimated ADHD severity in older adults because hyperactivity items were endorsed less frequently with age, while Part B items capturing inattention-related difficulties were endorsed more frequently. The authors interpreted this as a phenotypic redistribution of symptoms rather than a true decline in prevalence, implying that age-invariant cut-offs may produce false negatives in older adults. A validation study of the SDQ hyperactivity subscale at age 25 found high accuracy (AUC = 0.90) but recommended a lower cut-point than that used for younger ages. No included study specifically examined diagnostic accuracy in adults over 65, leaving this segment of the PICO-specified population unaddressed.

University and College Students

Several studies focused on college or university populations, and findings within this subgroup consistently highlighted the risk of overestimation when brief self-report instruments are used without further validation. Among Brazilian medical students, self-report screening (ASRS) yielded a 37% prevalence estimate, semi-structured interview reduced this to 7.9%, and additional probing for real-life examples of DSM symptoms nearly halved it, to 4.5%. A systematic review of ADHD prevalence in medical students across 17 countries found rates ranging from 1.7% (self-report) to 38.9% (ASRS screener), underscoring the dependence of prevalence estimates on assessment method. In a U.S. college sample, 10.3% of students without a prior ADHD diagnosis scored in the clinical range of the ASRS, suggesting a pool of potentially undiagnosed individuals, though the absence of confirmatory diagnostic interviews limits interpretation. A study of non-clinical, non-treatment-seeking college students without ADHD symptoms found that clinically significant levels of self-reported impairment were common (e.g., 20.2% reported difficulty interacting with friends), suggesting that experiences of difficulty in domains commonly associated with ADHD may be normative in college populations and not specific to the disorder.

A narrative review of adolescents and college-age young adults reported that 25–48% of self-referred students exaggerate ADHD symptoms, and nearly 44% of a simulated-ADHD group was incorrectly classified as having ADHD by a psychologist, highlighting the vulnerability of standard assessment methods to symptom feigning in this population.

Comorbid Psychiatric Conditions

Findings diverged substantially depending on the comorbidity profile of the population studied. In an anxiety disorders clinic, the prevalence of ADHD was 27.9% — substantially higher than general population estimates — suggesting underrecognition of ADHD in anxiety-disordered populations. By contrast, the ASRS showed poor positive predictive value (38.5%) and a 62% false-alarm rate when used to screen for ADHD among adults with borderline personality disorder, and the French ASRS-5 showed notably reduced specificity (54.1%) and a high false-positive rate among patients with comorbid bipolar disorder or borderline personality disorder (AUC dropping from 0.945 overall to 0.801 in this subgroup). In alcohol-dependent patients, self-report ADHD screeners showed low sensitivity at established cut-offs due to underreporting of symptoms, and a negative screening result did not reliably exclude ADHD. The MINI-Plus ADHD module in treatment-seeking substance use disorder patients across seven countries showed moderate agreement with the CAADID (Kappa = 0.60) and overestimated ADHD prevalence (18.0% versus 14.2% by structured interview).
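Cohen's kappa, the agreement statistic quoted above, corrects raw agreement for the agreement expected by chance. The sketch below shows the calculation for a two-by-two table. The cell counts are hypothetical, chosen only so that the marginal rates mirror the reported 18% (MINI-Plus) and 14% (CAADID) per 100 patients; they are not data from the source study, and the function name is an assumption.

```python
def cohens_kappa(both_pos, a_only_pos, b_only_pos, both_neg):
    """Cohen's kappa for two binary raters from the four cells of a 2x2 table."""
    n = both_pos + a_only_pos + b_only_pos + both_neg
    observed = (both_pos + both_neg) / n
    a_pos_rate = (both_pos + a_only_pos) / n
    b_pos_rate = (both_pos + b_only_pos) / n
    # Agreement expected by chance if the two instruments were independent.
    expected = a_pos_rate * b_pos_rate + (1 - a_pos_rate) * (1 - b_pos_rate)
    if expected == 1.0:
        return 1.0  # degenerate table: every case falls in the same category
    return (observed - expected) / (1 - expected)


# Hypothetical counts per 100 patients: instrument A flags 18, instrument B flags 14.
kappa = cohens_kappa(both_pos=11, a_only_pos=7, b_only_pos=3, both_neg=79)
print(f"kappa = {kappa:.2f}")
```

With these illustrative counts, raw agreement is 90% but kappa is only about 0.63, which shows how an instrument can agree with the reference on most patients while still overestimating prevalence.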

These patterns indicate that the direction of diagnostic error shifts with the comorbidity context: among patients with bipolar disorder or borderline personality disorder, false positives predominate, whereas in anxiety-disorder and alcohol-dependent populations, false negatives may be the greater concern. Findings in substance use disorder were mixed, since the MINI-Plus module overestimated rather than underestimated ADHD prevalence in treatment-seeking patients.

Childhood-Diagnosed Versus Adult-Diagnosed Populations

A systematic review of prospective longitudinal studies found that persistence estimates for childhood ADHD into adulthood ranged from 4% to 77%, with the wide range driven almost entirely by methodological choices — sole reliance on self-report and strict DSM symptom thresholds produced the lowest estimates, while recommended multi-informant methods with impairment requirements yielded persistence rates of 40–50%. A longitudinal cohort study from the MTA found that approximately 95% of individuals who initially screened positive for late-onset ADHD were excluded after comprehensive, multi-informant, stepped diagnostic assessment, with most false positives attributable to heavy substance use or comorbid mental disorders. A study of adults who had been diagnosed and treated for ADHD in childhood found that 79% underreported their childhood symptoms on retrospective self-report, and 17% falsely denied the presence of at least several childhood symptoms, with retrospective self-ratings showing no correlation with parent ratings collected during childhood (Cohen's d = 1.15 for the discrepancy). These findings raise concerns about the validity of retrospective childhood symptom reconstruction as a diagnostic requirement in adults.

Adults diagnosed with ADHD under DSM-5 criteria — including those with onset between ages 7 and 12 — showed significantly decreased quality of life and increased functional impairment compared to controls regardless of age of onset, suggesting that the broadened DSM-5 onset criterion does not over-include individuals without meaningful impairment.

Intellectual Disability

A single vignette-based study examined ADHD diagnosis in adults with intellectual disability and found that clinical opinion demonstrated significantly higher sensitivity (0.82) than strict application of DSM-5 criteria (sensitivity 0.23), with perfect specificity and near-perfect inter-rater reliability. This finding, while based on fictional case scenarios rather than live clinical assessment, suggests that standard DSM criteria may substantially underdetect ADHD in this population.

Populations Not Represented

Several subgroups specified in the review's eligibility criteria were not directly addressed by any included study. No study specifically examined diagnostic accuracy in telehealth-based evaluations compared with in-person assessment, and no study directly measured the influence of social media ADHD content on self-identification or diagnostic demand. Adults diagnosed later in life (beyond young adulthood) were not examined as a distinct subgroup in any included study, and occupational settings were not represented. High-functioning adults — a population of particular clinical interest given the potential for symptom masking — were not the focus of any included study, although one study noted that higher educational attainment and intellectual functioning in its sample may have masked executive functioning deficits. Rural populations were explicitly noted as underrepresented in one primary care study whose planned rural recruitment was not completed. These gaps limit the generalizability of the present evidence base to the full range of adults and settings specified in the review question.

5. Effect Magnitude Language (Aggregated)

The language used by study authors to characterize effect magnitudes was heterogeneous, reflecting the diversity of study designs, analytic approaches, and outcome domains represented in this review. Several broad patterns emerged.

Studies reporting diagnostic accuracy metrics — the largest cluster — tended to describe their findings in terms of sensitivity, specificity, area under the receiver operating characteristic curve (AUC), and predictive values. Language ranged from restrained ("acceptable criterion validity") to strongly affirmative ("excellent," "outstanding," "very satisfying"). The Korean DIVA-5 validation described a diagnostic accuracy of 92% as establishing the instrument as "reliable." The French ASRS-5 validation characterized an AUC of 0.945 as "outstanding" overall but noted "notably lower sensitivity" (63.5%) in the low-severity subgroup. The Norwegian WURS/ASRS study described AUC values above 0.90 as indicating "excellent" screening properties. The DSM-5 ASRS screener development study characterized its AUC of 0.94 and sensitivity above 91% as "excellent operating characteristics." By contrast, the BAARS-IV validation described that instrument's ability to separate ADHD from healthy controls as "poor," and the CAARS study in an outpatient mental health sample noted "poor specificity" despite adequate sensitivity. The ASRS screening study in borderline personality disorder patients described a positive predictive value of only 38.5% and noted that "in 62% of cases, the positive screening was a false alarm." These divergent characterizations underscore that the same class of metric — diagnostic accuracy — was interpreted with markedly different evaluative language depending on the clinical context and comparator population.

Studies examining group differences or predictive relationships used conventional effect-size descriptors with somewhat greater consistency. Cohen's d values were reported across several studies, with language calibrated to standard benchmarks: the retrospective childhood symptom recall study described d = 1.15 as "large," the feigned-ADHD simulation study described d = 1.551 as "very large by Rogers' standards and large by Cohen's classification," and the sex-difference study in a Spanish clinical cohort described d = 0.323 as a small-to-medium effect. Partial eta-squared values in the persistence study ranged from 0.08 ("small-medium") to 0.52 ("large"). The polygenic risk score study described its incremental variance explained (0.57–1.52 percentage points) as "clinically negligible," a notably candid characterization that contrasted with the statistical significance of the finding. The ADHD-violence association study described an adjusted odds ratio of 1.75 as "only moderate at the population level."
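One way to compare the Cohen's d values quoted above on a common scale is to convert each to a probability of superiority, the chance that a randomly chosen member of the higher-scoring group exceeds a randomly chosen member of the lower-scoring group. The sketch below assumes two independent, normally distributed groups with equal variance, which the source studies may not satisfy (the recall discrepancy, for example, may come from paired ratings). It is a rough interpretive aid rather than a reanalysis, and the function name is hypothetical.

```python
import math


def probability_of_superiority(d):
    """Probability of superiority for Cohen's d under equal-variance normality.

    Equals Phi(d / sqrt(2)), where Phi is the standard normal CDF.
    """
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2))), so Phi(d / sqrt(2)) = 0.5 * (1 + erf(d / 2)).
    return 0.5 * (1.0 + math.erf(d / 2.0))


# Effect sizes as quoted in this review; the labels are shorthand descriptions.
reported_effects = [
    ("childhood symptom recall discrepancy", 1.15),
    ("feigned-ADHD simulation", 1.551),
    ("sex difference in age at diagnosis", 0.323),
]
for label, d in reported_effects:
    print(f"{label}: d = {d}, probability of superiority = {probability_of_superiority(d):.0%}")
```

Under these assumptions the three effects correspond to probabilities of roughly 79%, 86%, and 59%, which makes the distance between a "large" and a "small-to-medium" effect easier to see than the qualitative labels alone.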

A third group of studies used qualitative or comparative language without formal effect-size metrics. The editorial comment stated that rising Scandinavian ADHD diagnoses "likely reflect increased awareness" rather than overdiagnosis. The narrative review on diagnostic assessment stated that applying Utah criteria instead of DSM criteria reduced the number of persons classified as having ADHD by "about 20%." The stepwise diagnostic study in Brazilian medical students described probing procedures as "almost halving" the ADHD prevalence rate, providing a correction factor of 0.52. The study of suspect effort in ADHD evaluations described the suspect-effort and ADHD groups as "nearly indistinguishable" on behavioral rating scales, a qualitative characterization that conveyed the clinical gravity of the finding more directly than any single statistic.

Several studies explicitly noted that their findings should be interpreted cautiously. The analogue simulation study acknowledged that such designs "overestimate the classification accuracy of validity measures and effect sizes of tests." The machine-learning diagnostic study described its 85.5% accuracy as "very promising" but acknowledged a sample of only 69 patients. The criterion-based triage pathway study reported 100% sensitivity but noted wide confidence intervals (61.0%–100.0%) reflecting its small sample.
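The wide interval around a 100% sensitivity estimate is a direct consequence of small samples. The sketch below computes a Wilson score interval for a proportion. It assumes, purely for illustration, that all 6 of 6 true cases were detected; the triage study's actual case count and interval method are not stated in this review, so both the count and the choice of the Wilson method are assumptions.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (default z gives ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical count: 6 detected out of 6 true cases (an assumption, not a reported figure).
low, high = wilson_interval(6, 6)
print(f"observed sensitivity 100%, interval {low:.1%} to {high:.1%}")
```

With so few cases, even perfect observed sensitivity leaves the lower bound near 61%, which is why a small-sample estimate of this kind cannot rule out clinically important numbers of missed cases.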

Overall, effect-magnitude language was inconsistent across the included studies. Identical AUC values were described as "excellent" in one context and merely "acceptable" in another. Odds ratios of similar magnitude received different qualitative labels depending on the clinical question. This variability limits the reader's ability to compare findings across studies without returning to the numerical values themselves.

Significance Metrics Reported

Of the 80 included articles, 34 reported an explicit p-value (whether exact or as an inequality such as p < 0.001). Nine studies reported confidence intervals, either alone or alongside a p-value. Seven studies provided author-stated significance language without a specific numeric metric (e.g., "highly significant," "significant differences," or "without significant difference"). The remaining 30 articles — comprising editorials, narrative reviews, book reviews, studies with null extraction fields, and several diagnostic accuracy studies that reported operating characteristics without inferential statistics — did not report any formal significance metric.

6. Study Design Profile

The evidence base informing this review is methodologically heterogeneous, spanning a broad range of study designs. This diversity reflects the multifaceted nature of the central question — whether adult ADHD is overdiagnosed, underdiagnosed, or both — which draws on epidemiological, psychometric, clinical-practice, and longitudinal research traditions. The design mix carries direct implications for the strength and type of inferences that can be drawn.

The largest share of included studies employed cross-sectional designs, most commonly structured as diagnostic accuracy or psychometric validation investigations. These studies typically compared one or more screening or diagnostic instruments against a reference standard (often a semi-structured clinical interview) in adults evaluated for suspected ADHD across psychiatric outpatient, primary care, university, forensic, and substance use treatment settings. While such designs are well suited to estimating sensitivity, specificity, and predictive values of assessment methods, they cannot establish temporal relationships between diagnostic practices and downstream outcomes such as functional impairment or treatment response. Several of these cross-sectional studies were conducted in specialized ADHD clinics, where the pre-test probability of ADHD is elevated; their operating characteristics may not transfer directly to lower-prevalence settings such as primary care.
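The caution about transferring operating characteristics across settings can be made concrete with Bayes' theorem: for fixed sensitivity and specificity, positive predictive value falls sharply as prevalence falls. The sketch below uses hypothetical values (sensitivity 0.90, specificity 0.80, and prevalences of 40% and 5%) chosen only for illustration; none of these figures is drawn from an included study.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV from sensitivity, specificity, and pre-test probability (Bayes' theorem)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical instrument and settings (assumed values, for illustration only).
SENS, SPEC = 0.90, 0.80
for setting, prevalence in [("specialty ADHD clinic", 0.40), ("primary care", 0.05)]:
    ppv = positive_predictive_value(SENS, SPEC, prevalence)
    print(f"{setting}: prevalence {prevalence:.0%} -> PPV {ppv:.1%}")
```

Under these assumed inputs the same instrument yields a PPV of 75% in the high-prevalence setting but under 20% in the low-prevalence one, which is the pattern the paragraph above warns about.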

A smaller but informative subset of studies used prospective longitudinal cohort designs. One systematic review synthesized twelve prospective samples of children diagnosed with ADHD and followed into adulthood, finding that persistence estimates ranged from 4% to 77% depending on the diagnostic method applied at follow-up. A prospective cohort from the Multimodal Treatment Study of ADHD demonstrated that approximately 95% of individuals who initially screened positive for late-onset ADHD were excluded after comprehensive, multi-informant, longitudinal stepped assessment — a finding that could only emerge from repeated measurement over time. Another longitudinal study showed that adults' retrospective self-ratings of their own childhood ADHD symptoms were substantially lower than parent ratings collected during childhood, with 79% of participants underreporting symptoms roughly eighteen years later. These longitudinal designs provide the strongest available evidence on diagnostic stability, symptom persistence, and the validity of retrospective recall, but they are few in number and their samples were predominantly male and drawn from metropolitan research cohorts.

Several systematic and narrative reviews were included. One systematic review examined ADHD prevalence among medical students across 29 studies and found that screening tools yielded substantially higher prevalence estimates than structured diagnostic interviews or self-report of prior diagnosis. A narrative literature review of 46 studies on adolescent and young adult populations highlighted the challenge of distinguishing genuine ADHD from feigned or comorbidity-driven symptom presentations. These secondary syntheses aggregate evidence across settings and methods, but their conclusions are bounded by the quality and heterogeneity of their included primary studies.

Case-control designs contributed evidence on the discriminative properties of polygenic risk scores and the psychometric performance of rating scales in distinguishing clinically diagnosed ADHD patients from population controls. One case-control study of over 1,100 Norwegian adults found that the ADHD polygenic score added statistically significant but clinically negligible variance beyond self-report scales, illustrating how a design well powered for group-level discrimination may nonetheless reveal limited individual-level diagnostic utility.

Two case reports described adults whose ADHD had been missed because presenting symptoms — pseudohallucinations, maladaptive daydreaming, treatment-resistant depression — mimicked other conditions. While case reports occupy the lowest tier of the evidence hierarchy, these examples concretely illustrate the clinical consequences of diagnostic misattribution that larger studies describe in aggregate.

A single randomized controlled trial examined whether design features of the ASRS screening form (question grouping and shading) influenced the screen-positive rate in a primary care sample. The trial found no significant effect of these features, but the overall screen-positive rate of 32% far exceeded the estimated population prevalence of adult ADHD, underscoring the gap between screening and diagnosis. One quality improvement project documented the effect of implementing a validated screening tool in primary care, increasing screening-tool use from 3% to 87% of eligible encounters. An editorial comment and a book review rounded out the design spectrum, each offering interpretive perspective rather than primary data.

Notably, analogue simulation studies — in which healthy volunteers were instructed to feign ADHD — contributed evidence on symptom validity testing. These designs are inherently limited by the artificiality of the simulation context, and the investigators themselves cautioned that analogue studies tend to overestimate the classification accuracy of validity measures. Nonetheless, they provided the only controlled data on the distinguishability of genuine and feigned ADHD symptom profiles.

Several implications follow from this design profile. First, the predominance of cross-sectional diagnostic accuracy studies means that most findings describe the operating characteristics of assessment tools at a single point in time rather than the trajectory of diagnostic decisions and their consequences. Second, the small number of longitudinal studies limits the evidence available on diagnostic stability, late-onset ADHD validity, and the long-term functional outcomes associated with different diagnostic approaches. Third, the absence of any randomized trial comparing comprehensive versus abbreviated diagnostic pathways on patient-centered outcomes means that recommendations favoring multi-method assessment rest on convergent cross-sectional evidence rather than experimental demonstration of superiority. Fourth, many studies were conducted in specialty referral settings where ADHD base rates are high; the performance of screening and diagnostic instruments in lower-prevalence primary care or community settings is less well characterized. Readers should weigh the findings of this review with these design-level constraints in mind.

7. Recency of Evidence

The included studies span more than two decades, with publication years ranging from 1999 to 2026. The distribution is weighted toward recent work: roughly half of the articles with extractable data were published from 2019 onward, and a substantial cluster appeared between 2015 and 2018. Earlier contributions — from the late 1990s through the early 2010s — remain represented, supplying foundational psychometric validation work, longitudinal cohort data on ADHD persistence, and early epidemiological estimates that continue to anchor the field.

Several of the most methodologically informative studies are recent. Diagnostic-accuracy investigations of semi-structured interviews such as the DIVA-5 and updated ASRS versions were published between 2017 and 2026, as were studies examining age-related differential item functioning on the ASRS, criterion-based screening pathways in NHS settings, and the performance of validity indicators in detecting noncredible symptom presentation. The systematic reviews addressing ADHD persistence from childhood and ADHD prevalence among medical students were published in 2016 and 2025, respectively. Longitudinal data from the Multimodal Treatment Study of ADHD, reporting on late-onset ADHD false-positive rates after stepped diagnostic assessment, appeared in 2017. Work on sex differences in diagnostic timing and severity, the unreliability of retrospective childhood symptom recall, and clinician adherence to diagnostic guidelines was published between 2019 and 2025.

A smaller number of studies date from before 2010, including early cross-sectional work on ADHD screening in primary care and anxiety-disorder clinics, retrospective self-report validity in longitudinal cohorts, and editorial commentary on rising Scandinavian diagnostic rates. These older studies remain relevant because they address questions — such as the accuracy of retrospective symptom reconstruction and the burden of undiagnosed ADHD in managed-care populations — that have not been superseded by more recent evidence of comparable scope.

Most of the evidence base is recent. The concentration of publications in the most recent decade means that the synthesis reflects contemporary diagnostic criteria (DSM-5 and ICD-10/11), current screening instruments, and a present-day context of rising diagnostic demand, although no included study examined telehealth-based assessment directly. At the same time, the inclusion of studies from earlier periods provides useful temporal depth for evaluating whether diagnostic challenges identified in the early 2000s have been resolved or persist under newer assessment frameworks.

8. Evidence Gaps

The limited quantity and heterogeneity of the available literature are themselves a central finding of this review. Despite intense public and clinical debate surrounding adult ADHD overdiagnosis and underdiagnosis, relatively few high-quality studies directly compare diagnostic pathways or evaluate their long-term consequences.

8a. Gaps Identified from Article Text

Several recurring limitations acknowledged across the included studies point toward areas where the evidence base remains thin or methodologically constrained.

Reliance on self-report without collateral corroboration. Many studies noted that ADHD symptom assessments depended exclusively on self-report, without informant ratings, collateral history, or clinician observation. Von Wirth and colleagues found that 79% of adults retrospectively underreported childhood ADHD symptoms relative to parent ratings collected during childhood, and Loney and colleagues documented minimal agreement between probands' retrospective self-ratings and independent judges' chart ratings (median correlation 0.16). Several diagnostic accuracy studies acknowledged that the absence of collateral data may have inflated or deflated diagnostic estimates, yet few included studies formally compared self-report-only pathways against multi-informant assessment in the same sample.

Cross-sectional designs precluding causal inference and longitudinal validation. The majority of included studies employed cross-sectional designs, and multiple author teams flagged this as a barrier to understanding the temporal stability of ADHD diagnoses, the trajectory of functional impairment, and the causal direction of associations between ADHD and comorbid conditions. Sibley and colleagues' systematic review of persistence estimates highlighted that methodological variation — particularly whether studies required informant corroboration and how they set symptom thresholds — produced persistence rates ranging from 4% to 77%, underscoring the need for longitudinal studies with consistent, rigorous diagnostic methods.

Limited demographic diversity. Several studies acknowledged that their samples were predominantly white, male, or drawn from metropolitan academic medical centers. Pereira and colleagues noted that their clinical sample, while large (N = 900), was recruited from a single university hospital program. The systematic review of ADHD prevalence among medical students observed that most included studies consisted of predominantly white, male, middle-class cohorts. Harrison and colleagues' non-referred university sample was 85% female and 89% white. These constraints limit the generalizability of diagnostic accuracy estimates to the broader adult population.

Insufficient evaluation of comorbidity-driven misclassification. Weibel and colleagues reported that 62% of positive ADHD screens in adults with borderline personality disorder were false alarms, and Baggio and colleagues found that the French ASRS-5 had a false positive rate of nearly 46% among patients with comorbid bipolar disorder or borderline personality disorder. Yet few studies systematically examined how anxiety disorders, mood disorders, trauma-related conditions, or substance use disorders alter the operating characteristics of ADHD screening and diagnostic instruments. Van Ameringen and colleagues found a 27.9% ADHD prevalence in an anxiety disorders clinic but acknowledged that comorbid conditions with overlapping symptoms were not comprehensively assessed.

Absence of formal reliability and validity data for some widely used instruments. Ustun and colleagues noted that the Adult Clinician Diagnostic Scale, despite use in FDA registration trials, had not undergone formal reliability and validity studies. Luderer and colleagues found that established ASRS and CAARS cut-off values performed poorly in alcohol-dependent patients, and Bastiaens and colleagues reported similar findings in a dually diagnosed correctional population, yet alternative validated cut-offs for these subpopulations remain unestablished.

Feigning and symptom validity testing. Multiple studies raised concerns about the vulnerability of standard ADHD assessment tools to symptom exaggeration. McCormick-Deaton and colleagues estimated that 25–48% of self-referred college students exaggerate ADHD symptoms, and Sollman and colleagues found that 71% of patients making suspect effort would be misdiagnosed with ADHD using interview alone. Grandjean and colleagues noted that most widely used adult ADHD rating scales lack embedded validity indexes. Despite these concerns, the evidence base for validated symptom validity testing protocols specific to adult ADHD evaluations remains limited, and analogue simulation designs — which several authors acknowledged overestimate classification accuracy — predominate.

Lack of cost-effectiveness and implementation data. Adamou and colleagues' triage pathway study and the quality improvement project by Caroline and colleagues both demonstrated feasibility of structured screening in clinical settings, but neither provided cost-effectiveness data. Rösler and colleagues emphasized that comprehensive diagnostic assessment requires multiple components, yet no included study formally evaluated the cost or resource implications of comprehensive versus abbreviated diagnostic models.

8b. Gaps Identified by Independent Analysis

Independent analysis of the retrieved evidence set identified the absences listed below. For each one, whether it represents a gap in the broader literature or a limitation of this search strategy is beyond the scope of this review.

  • Telehealth: no studies examined the diagnostic accuracy or clinical outcomes of telehealth-based ADHD evaluations compared with in-person assessment.
  • Social media: no studies examined the influence of social media ADHD content on self-identification, diagnostic demand, or diagnostic accuracy in adults.
  • Commercial platforms: no studies examined commercial online ADHD assessment platforms or direct-to-consumer diagnostic services as an exposure or comparator.
  • Stimulant response: no studies examined stimulant response patterns as a post-hoc validation strategy for diagnostic accuracy, that is, whether treatment response to stimulant medication distinguished correctly diagnosed from incorrectly diagnosed adults.
  • First diagnosis after age 50: no studies examined diagnostic accuracy in adults diagnosed with ADHD for the first time after age 50. The included samples were predominantly younger adults, with most upper age limits at 40–65 years and mean ages in the late twenties to early forties. Givon-Schaham and colleagues demonstrated that the ASRS Part A screener systematically underestimates ADHD severity in older adults, but no study evaluated comprehensive diagnostic accuracy in this age group.
  • Occupational settings: no studies examined ADHD diagnostic accuracy in occupational health or workplace settings.
  • Specialist versus primary care: no studies directly compared specialist ADHD clinics with non-specialist primary care settings using the same diagnostic protocol applied to the same or matched populations. Several studies were conducted in specialist settings and others in primary care, but head-to-head comparisons of diagnostic concordance across these settings were absent.
  • Blinded reassessment: no studies used blinded reassessment, in which clinicians unaware of the initial diagnostic decision independently re-evaluate the same patients, as a method for estimating false positive or false negative rates.
  • Long-term functional outcomes: no studies examined whether adults diagnosed with ADHD through different assessment methods showed divergent trajectories in academic achievement, occupational functioning, or quality of life over follow-up periods exceeding five years.
  • Randomized comparisons of diagnostic models: no randomized controlled trials compared comprehensive multi-component diagnostic assessment against rapid or abbreviated diagnostic models with respect to diagnostic accuracy, patient outcomes, or downstream treatment appropriateness.
  • Geographic coverage: representation of low- and middle-income countries was limited. One Brazilian study was included, and the geographic distribution was otherwise concentrated in Western Europe, Scandinavia, North America, East Asia, and Iran.

9. Clinical Implications

Taken together, the findings from this evidence set carry several practical implications for clinicians evaluating adults for ADHD, though the predominantly cross-sectional designs, heterogeneous populations, and frequent reliance on self-report data warrant caution against definitive prescriptions.

First, no single assessment method appears sufficient for accurate adult ADHD diagnosis. Self-report screening instruments such as the ASRS consistently demonstrated high sensitivity but variable specificity, with false-positive rates that were particularly pronounced in populations with high psychiatric comorbidity, notably borderline personality disorder, bipolar disorder, substance use disorders, and anxiety disorders. In one study of medical students, self-report screening identified a prevalence of 37%, semi-structured interview reduced this to approximately 8%, and additional probing for concrete behavioral examples roughly halved that figure to 4.5%. These stepwise reductions underscore that screening-level tools, while useful for case-finding, should not be treated as diagnostic endpoints.
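The size of each stepwise reduction in the medical-student example can be expressed as a retention factor: the prevalence after a step divided by the prevalence before it. The sketch below uses the rounded percentages quoted above, so the second factor will not exactly reproduce the 0.52 correction factor reported by the study itself; the function name is illustrative.

```python
from typing import List


def retention_factors(prevalences: List[float]) -> List[float]:
    """Ratio of each prevalence estimate to the one from the preceding step."""
    return [later / earlier for earlier, later in zip(prevalences, prevalences[1:])]


# Rounded prevalence estimates (percent) as quoted in this review.
steps = ["self-report screen", "semi-structured interview", "probing for concrete examples"]
estimates = [37.0, 8.0, 4.5]

for label, factor in zip(steps[1:], retention_factors(estimates)):
    print(f"after {label}: {factor:.2f} of the previous estimate retained")
```

On these rounded inputs, the interview step retains a little over one fifth of screen-positive cases and the probing step retains slightly more than half of those, which shows that most of the reduction occurs at the transition from screening to interview.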

Second, structured and semi-structured diagnostic interviews — particularly the DIVA 2.0 and DIVA-5 — showed strong diagnostic accuracy and interrater reliability across multiple validation studies and cultural contexts. When combined with collateral history, informant ratings, and measures of functional impairment, classification accuracy improved substantially over any single-method approach. One study found that integrating self-report and informant symptom ratings with family history and a reaction-time variability measure correctly classified 87% of cases, whereas individual neuropsychological tests performed little better than chance. Clinicians should therefore favor multi-method, multi-informant assessment strategies over reliance on any one instrument.

Third, retrospective recall of childhood symptoms — a cornerstone of current diagnostic practice — may be unreliable. Adults who had been diagnosed and treated for ADHD in childhood substantially underreported their childhood symptoms when asked to recall them years later, with 79% underreporting and 17% falsely denying the presence of at least several childhood symptoms. Systematic analysis of contemporaneous school reports showed high diagnostic accuracy for retrospective confirmation of childhood ADHD, suggesting that archival documentation, when available, may be a more dependable source than patient recall alone.

Fourth, the evidence highlights populations in which ADHD is likely underrecognized. Women with ADHD were diagnosed significantly later than men and exhibited greater severity, higher comorbid depression and anxiety, and more pronounced disability at the time of diagnosis. Adults presenting to anxiety disorder clinics showed an ADHD prevalence of nearly 28%, well above general population estimates. Screening of individuals in police custody identified half of those without an existing diagnosis as warranting further assessment. Adults in managed care who screened positive for ADHD but had never received a formal diagnosis demonstrated significantly greater functional impairment and comorbidity burden than non-ADHD controls. These findings suggest that clinicians in primary care, anxiety and mood disorder services, substance use treatment, and forensic settings should maintain a higher index of suspicion for undiagnosed ADHD.

Fifth, the risk of false-positive diagnosis is not trivial and demands active management. Between 25% and 48% of self-referred college students may exaggerate ADHD symptoms, and symptom validity measures showed divergent classification results in clinical ADHD samples, with failure rates ranging from 8% to 49% depending on the measure used. One study found that 71% of patients making suspect effort on validity testing would have been misdiagnosed with ADHD based on clinical interview alone. Embedding performance and symptom validity tests within the assessment battery — rather than relying solely on face-valid self-report — appears important for reducing misclassification, particularly in settings where secondary gain may be a factor.

Sixth, the age-related properties of commonly used instruments deserve attention. The ASRS Part A screener may systematically underestimate ADHD severity in older adults because of differential item functioning in hyperactivity items, while the Strengths and Difficulties Questionnaire (SDQ) required a lower cut-point at age 25 than at younger ages. Clinicians assessing adults across the lifespan should be aware that standard cut-points may not perform equivalently in all age groups.

Finally, clinician knowledge and practice patterns themselves appear to be a source of diagnostic variability. A survey of clinicians performing adult ADHD assessments found limited consensus on core symptoms, infrequent use of structured interviews, and rare collection of collateral reports — practices that diverge from guideline recommendations and may contribute to both over- and underdiagnosis depending on the clinical context.

These implications should be interpreted in light of the limitations pervading the evidence base: most included studies were cross-sectional, many relied on convenience or clinical samples with limited demographic diversity, and few directly compared comprehensive versus abbreviated diagnostic approaches in the same population using a rigorous reference standard. The findings collectively suggest that diagnostic accuracy in adult ADHD is best served by structured, multi-source assessment with embedded validity checks, but the optimal configuration of such an approach — and its feasibility across diverse clinical settings — remains to be established by adequately powered comparative studies.

Effect Estimates by Study

Study | Measure | Value | 95% CI | p-value
Elena von Wirth, 2020 | Cohen's d | 1.16 | NR | 0.003
Minha Hong, 2020 | sensitivity/specificity | NR | NR | NR
Lenard A. Adler, 2009 | OR | 1.75 | 1.14–2.68 | 0.01
André Høberg, 2024 | Incremental Lee R² (percentage points) | 0.565 | NR | 0.0103
Annie Stewart, 2012 | Wilks's Lambda | 0.565 | NR | < .0001
Debjani Das, 2016 | AUC | 0.84 | 0.81–0.87 | 0.002
Allyson G. Harrison, 2019 | Cohen's d | 2.56 | NR | < .001
Eugenia I. Gorlin, 2016 | Cohen's Kappa | 0.88 | NR | NR
Michael Van Ameringen, 2010 | prevalence rate | 27.9 | NR | NR
Tianhua Chen, 2021 | Accuracy | 85.507 | NR | NR
Aldo Pereira, 2024 | Cohen's d | 0.323 | NR | < 0.001
Hong, 2020 | diagnostic accuracy | 92 | NR | NR
Jasmine Hines, 2012 | Sensitivity | 1.0 | NR | NR
Stéphanie Baggio, 2020 | AUC | 0.945 | NR | NR
Belén Roselló Miranda, 2020 | partial eta squared | 0.3 | NR | < .001
Lucy Riglin, 2021 | AUC | 0.9 | 0.87–0.93 | NR
Staffan Söderström, 2013 | AUC | 0.99 | NR | NR
Mariano Gabriel Scandar, 2021 | RMSEA | 0.075 | 0.068–0.083 | < .001
Josep Antoni Ramos-Quiroga, 2016 | Kappa | 1.0 | NR | NR
Margaret H. Sibley, 2021 | Cohen's d | 2.56 | NR | < .001
Marios Adamou, 2026 | Sensitivity | 100.0 | 61.0–100.0 | NR
Lida Zamani, 2020 | Cohen kappa | 0.642 | NR | NR
Shanel Chandra, 2016 | β (standardized regression coefficient) | 0.19 | NR | 0.00358
Morgan B. Drake, 2017 | OR | 3.1 | 1.1–8.9 | 0.023
Johanna Waltereit, 2025 | AUC | 0.97 | NR | < 0.001
Paulo Mattos, 2018 | ASRS sensitivity | 0.97 | NR | NR
Joel Paris, 2015 | OR | 3.1 | 1.1–8.9 | 0.023
Richard Pettersson, 2015 | Nagelkerke's R² | 0.655 | NR | < .001
Yu-Ju Lin, 2015 | η² | 0.28 | NR | < 0.001
Berk Ustun, 2017 | AUC | 0.94 | NR | NR
Jan Loney, 2007 | median correlation | 0.16 | NR | < .05
Roni Y Kraut, 2025 | OR | 1.25 | 0.98–1.58 | 0.07
Margaret H. Sibley, 2012 | OR | 1755.38 | NR | < 0.01
Hui Dong, 2023 | Cramer's V | 0.051 | NR | 0.274
Friederike Blume, 2025 | AUC | 0.985 | NR | < .001
Geurt van de Glind, 2013 | Pearson correlation (r) | 0.3 | NR | < 0.001
Josep Antoni Ramos-Quiroga, 2012 | Kappa | 0.6 | 0.53–0.66 | NR
Salvatore Mannuzza, 2002 | OR | 3.1 | 1.1–8.9 | 0.023
Joel Young, 2023 | AUC | 0.895 | 0.835–0.954 | NR
Bhathika Perera, 2019 | Sensitivity (clinical opinion) | 0.82 | 0.74–0.89 | < 0.001
Susana Farcas, 2018 | AUC | 1.0 | NR | NR
Miriam Becke, 2022 | Cohen's d | 1.551 | 0.608–2.494 | < 0.01
Erlend J. Brevik, 2020 | AUC | 0.956 | 0.946–0.965 | NR
Allyson G. Harrison, 2016 | phi | 0.046 | NR | 0.773
Myriam J. Sollman, 2010 | Cohen's d | 0.02 | NR | NR
Noa Givon-Schaham, 2026 | Expected Total Score gap (Part A, age 20 vs 80 at θ=0) | -1.36 | NR | < .002
Andrew C. Hale, 2020 | RR (annual prevalence increase in PC) | 1.27 | 1.27–1.27 | < 0.001

NR = not reported.

3.3 Certainty of Evidence

Overall confidence in the available evidence is very low given the heterogeneity of study designs, indirectness relative to the prespecified PICO, imprecision, and risk of bias across the included studies.

4. Discussion

Summary of Evidence

This scoping review mapped the available evidence across 80 included articles addressing whether adult ADHD is overdiagnosed, underdiagnosed, or both, and what assessment practices best support diagnostic accuracy. The evidence base was methodologically diverse — spanning cross-sectional diagnostic accuracy studies, prospective longitudinal cohorts, psychometric validation investigations, systematic and narrative reviews, analogue simulation studies, clinician surveys, and case reports — and this heterogeneity is itself a central finding, reflecting the breadth of the diagnostic question rather than a single answerable hypothesis.

The most consistent pattern to emerge was that brief self-report screening instruments, while sensitive, substantially overestimate ADHD prevalence when used without structured confirmatory assessment. Across university, primary care, forensic, and comorbid psychiatric populations, screen-positive rates routinely exceeded expected prevalence by a factor of three to eight. Structured and semi-structured diagnostic interviews — particularly the DIVA instruments — demonstrated stronger discriminative validity, and multi-method approaches incorporating informant data, developmental history, and validity testing improved classification accuracy beyond any single modality.
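The gap between screen-positive rates and true prevalence follows from how a screener behaves in a low-prevalence population: the expected screen-positive rate is the sum of true positives and false positives, and the false positives dominate when specificity is imperfect. The sketch below uses hypothetical inputs (prevalence 4%, sensitivity 0.90, specificity 0.85) chosen only to illustrate the mechanism; they are not estimates from any included study.

```python
def screen_positive_rate(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Expected fraction of a population that screens positive."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives + false_positives


# Hypothetical values (assumptions for illustration only).
PREVALENCE, SENS, SPEC = 0.04, 0.90, 0.85

rate = screen_positive_rate(SENS, SPEC, PREVALENCE)
inflation = rate / PREVALENCE  # how many times larger than true prevalence
print(f"screen-positive rate {rate:.1%}, {inflation:.1f} times the assumed prevalence")
```

With these assumed inputs, a screener with apparently reasonable operating characteristics flags about 18% of the population, several times the assumed prevalence, which falls within the three-to-eight-fold range described above.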

At the same time, converging evidence from anxiety disorder clinics, managed-care populations, forensic settings, and sex-stratified analyses suggested that ADHD remains underrecognized in women, adults with internalizing comorbidities, and individuals in criminal-justice or substance-use-treatment contexts. The direction of diagnostic error thus appears to depend heavily on the clinical pathway, the population's comorbidity profile, and the rigor of the assessment method employed. Retrospective childhood symptom recall — a requirement of current diagnostic frameworks — was consistently shown to be unreliable, and symptom validity testing identified substantial rates of noncredible presentation in self-referred populations. Because most included studies were cross-sectional and conducted in specialty settings, these findings should be interpreted as mapping the landscape of available evidence rather than establishing definitive prevalence estimates or causal relationships.

Limitations of this Review

Several methodological constraints of this review warrant explicit acknowledgment. Full-text retrieval was restricted to open-access sources, which may introduce coverage bias by excluding relevant studies published in subscription-only journals and may under-represent non-English-language sources and grey literature. Findings from proprietary diagnostic platforms, industry-sponsored validation studies behind paywalls, and regional journals without open-access full text may therefore be absent from the evidence map.

The entire screening, data-extraction, and synthesis process was conducted by a single reviewer, without independent dual screening or dual extraction. This approach increases the risk of selection bias and extraction error relative to multi-reviewer workflows with consensus arbitration.

The review employed narrative synthesis without statistical pooling. No meta-analysis was performed, and no summary effect estimates were calculated. Given the heterogeneity of study designs, populations, comparators, and outcome definitions across the 80 included articles, quantitative pooling would have been inappropriate for most comparisons; nonetheless, the absence of formal meta-analytic synthesis limits the precision with which the strength and direction of evidence can be characterized.

Implications for Future Research

The evidence gaps identified in this review point toward several concrete research priorities anchored to the prespecified population, exposure, comparator, and outcome elements.

No included study compared telehealth-based ADHD evaluations with in-person assessment on diagnostic accuracy or concordance — a striking absence given the rapid expansion of remote diagnostic services. Randomized or quasi-experimental comparisons of comprehensive versus abbreviated diagnostic models, with diagnostic accuracy and downstream functional outcomes as endpoints, are needed to inform service design.

The influence of social media ADHD content on self-identification and diagnostic demand was not examined in any included study, despite its prominence in contemporary clinical discourse. Prospective studies measuring how exposure to online ADHD content shapes symptom endorsement, help-seeking behavior, and diagnostic outcomes would address this gap directly.

Diagnostic accuracy in adults over age 50, in occupational health settings, and in low- and middle-income countries outside Latin America was unrepresented. Validation studies of commonly used instruments in these populations and settings are needed to establish whether existing cut-points and normative data generalize beyond the younger, Western, clinic-based samples that dominate the current literature.

Finally, no study used blinded reassessment or long-term functional outcome validation to estimate false-positive and false-negative rates. Prospective cohort studies incorporating independent blinded re-evaluation and multi-year follow-up of academic, occupational, and quality-of-life trajectories would provide the strongest available test of whether current diagnostic practices identify the right patients.
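
As a sketch of what a blinded re-evaluation design could yield, the example below converts a 2x2 table (routine diagnosis versus blinded reference assessment) into three distinct error measures that are often conflated. The counts are invented for illustration, and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reassessment:
    """Cross-tabulation of routine diagnosis against a blinded reference assessment."""
    confirmed: int            # routine diagnosis positive, reference positive
    unconfirmed: int          # routine diagnosis positive, reference negative
    missed: int               # routine diagnosis negative, reference positive
    correctly_ruled_out: int  # routine diagnosis negative, reference negative

    def unconfirmed_share(self) -> float:
        """Share of routine diagnoses the reference did not confirm (1 - PPV)."""
        return self.unconfirmed / (self.confirmed + self.unconfirmed)

    def false_positive_rate(self) -> float:
        """Share of reference-negative adults who were diagnosed (1 - specificity)."""
        return self.unconfirmed / (self.unconfirmed + self.correctly_ruled_out)

    def false_negative_rate(self) -> float:
        """Share of reference-positive adults who were missed (1 - sensitivity)."""
        return self.missed / (self.confirmed + self.missed)


# Invented counts, for illustration only.
table = Reassessment(confirmed=60, unconfirmed=20, missed=15, correctly_ruled_out=105)
print(f"unconfirmed share of diagnoses: {table.unconfirmed_share():.2f}")
print(f"false-positive rate:            {table.false_positive_rate():.2f}")
print(f"false-negative rate:            {table.false_negative_rate():.2f}")
```

Keeping the three quantities separate matters because the share of diagnoses left unconfirmed depends on the base rate in the referred population, whereas the false-positive and false-negative rates are conditioned on the reference result.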

5. Conclusions

This scoping review of 80 articles indicates that adult ADHD may be simultaneously overdiagnosed in some clinical contexts and underdiagnosed in others, with the direction of diagnostic error shaped by the assessment method employed, the population screened, and the prevalence of symptom-mimicking comorbidities. Available evidence consistently suggests that brief self-report screening instruments, while sensitive, produce unacceptably high false-positive rates when used without structured diagnostic follow-up — particularly among individuals with borderline personality disorder, bipolar disorder, or substance use disorders. Conversely, women, adults in forensic settings, and those presenting primarily with internalizing complaints appear to remain underrecognized. Retrospective recall of childhood symptoms may be unreliable, and symptom validity testing appears underutilized despite evidence that a substantial proportion of self-referred individuals exaggerate symptoms. Adequately powered comparative trials evaluating comprehensive multi-informant diagnostic pathways against abbreviated approaches across diverse clinical settings are needed before firm recommendations regarding optimal assessment configuration can be made.

Pipeline Limitations

  1. Open-access full-text constraint: Full-text retrieval was restricted to freely accessible open-access sources and repositories. Paywalled subscription journals could not be retrieved through this review's framework. Where a reviewer-provided source was used to supplement retrieval for an individual article, that article is flagged in Appendix B for the reviewer's consideration before submission.

  2. Single-reviewer relevance screening: Relevance screening was performed without a second independent reviewer. PRISMA 2020 expects dual-independent screening; the screening procedure used here applied a documented relevance threshold instead. Reviewers should inspect the included set and override it as needed.

  3. No citation chasing: The system does not perform forward or backward citation chasing beyond the initial Boolean retrieval. Articles cited by included studies are not auto-retrieved.

  4. No grey literature: The system does not search grey literature sources (conference abstracts, dissertations, clinical-trial registry results-only entries, regulatory filings) beyond the structured ClinicalTrials.gov registry index.

  5. Retrieval ceilings: Per-database retrieval is capped (PubMed: 10000; OpenAlex: 10000; ClinicalTrials.gov: 500). Queries that return more results than the cap are silently truncated to it.

  6. Boolean query scope: The Boolean query is assembled as (Population) AND (Intervention/Exposure) only, per Cochrane §4.4.4. Comparator and Outcome filtering happens at the relevance-scoring stage, not at the search-query stage.

  7. Personal reference library augmentation: Matching was performed on PubMed ID, DOI, and PubMed Central ID; ClinicalTrials.gov NCT identifiers were not matched. The personal reference library contents reflect the state of the reviewer's storage at the time of retrieval; the retrieval pool is pinned per project after the first retrieval (see Methods §2.3) so re-runs of subsequent analysis stages return the same set, but a fresh retrieval after library changes would yield a different supplemental count. Personal reference library articles bypass the relevance screening that database-retrieved records undergo and require explicit reviewer confirmation of inclusion criteria; see Appendix B.

  8. Retraction screening: PubMed and PMC metadata were inspected for retraction signals at retrieval time. Articles flagged as retracted are annotated in §3.1 Study Characteristics and surfaced for reviewer confirmation in Appendix B. The system does not autonomously exclude retracted articles — inclusion is a reviewer decision.
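
Items 5 and 6 above can be illustrated with a minimal sketch. The search terms, the OR-joining of terms within each block, and the function names are assumptions made for illustration; the verbatim query used for this review is the one in the supplementary materials, not the one constructed here. Only the cap values are taken from item 5.

```python
# Per-database retrieval caps as stated in item 5.
RETRIEVAL_CAPS = {"PubMed": 10000, "OpenAlex": 10000, "ClinicalTrials.gov": 500}


def build_query(population_terms: list[str], exposure_terms: list[str]) -> str:
    """Assemble (Population) AND (Intervention/Exposure).

    Assumption: terms inside each block are quoted and joined with OR.
    Comparator and Outcome are deliberately absent; per item 6 they are
    applied later, at the relevance-scoring stage.
    """
    def block(terms: list[str]) -> str:
        return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"

    return f"{block(population_terms)} AND {block(exposure_terms)}"


def is_truncated(database: str, total_hits: int) -> bool:
    """True when a query matched more records than the database cap allows."""
    return total_hits > RETRIEVAL_CAPS[database]


# Hypothetical terms and hit counts, for illustration only.
query = build_query(
    ["adult", "adults"],
    ["ADHD", "attention deficit hyperactivity disorder"],
)
print(query)
for database, hits in {"PubMed": 750, "ClinicalTrials.gov": 750}.items():
    status = "truncated at cap" if is_truncated(database, hits) else "complete"
    print(f"{database}: {hits} hits, {status}")
```

An explicit check of this kind would turn the silent truncation described in item 5 into a visible warning, so a reviewer could narrow the query rather than work from an incomplete result set.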

Protocol and Registration

This review was not registered in PROSPERO or another registry. A prespecified protocol is available as a supplementary file alongside this manuscript and may be submitted post hoc to PROSPERO or OSF for registration.

Funding

This review received no specific funding from any agency in the public, commercial, or not-for-profit sectors.

Conflicts of Interest

The author declares no competing interests.

Data Availability

All extracted data and the verbatim Boolean query used for retrieval are available in the supplementary materials. Source articles are publicly accessible via PubMed / PubMed Central / open-access providers.

References

  1. Drew Erhardt,Jeffery N. Epstein,C. Keith Conners,James D. A. Parker,Gill Sitarenios Self-ratings of ADHD symptoms in adults II: Reliability, validity, and diagnostic sensitivity Journal of Attention Disorders. 1999. doi:10.1177/108705479900300304.
  2. Erlend J. Brevik Adult Attention Deficit Hyperactivity Disorder. Beyond the Core Symptoms of the Diagnostic and Statistical Manual of Mental Disorders Bergen Open Research Archive (BORA) (University of Bergen). 2018.
  3. Elena von Wirth,Janet Mandler,Dieter Breuer,Manfred Döpfner The Accuracy of Retrospective Recall of Childhood ADHD: Results from a Longitudinal Study Journal of Psychopathology and Behavioral Assessment. 2020. doi:10.1007/s10862-020-09852-1.
  4. Minha Hong,J. J. Sandra Kooij,Bongseog Kim,Yoo‐Sook Joung,Hanik K. Yoo,Eui‐Jung Kim,Soyoung Irene Lee,Soo‐Young Bhang,Seung Yup Lee,Doug Hyun Han,Young Sik Lee,Geon Ho Bahn Validity of the Korean Version of DIVA-5: A Semi-Structured Diagnostic Interview for Adult ADHD Neuropsychiatric Disease and Treatment. 2020. doi:10.2147/ndt.s262995. PMID: 33116536.
  5. Lenard A. Adler,Frank Guida,Shirley Irons,John Rotrosen,Katherine O’Donnell Screening and Imputed Prevalence of ADHD in Adult Patients with Comorbid Substance Use Disorder at a Residential Treatment Facility Postgraduate Medicine. 2009. doi:10.3810/pgm.2009.09.2047. PMID: 19820269.
  6. André Høberg,Berit Skretting Solberg,Tor-Arne Hegvik,Jan Haavik Using polygenic scores in combination with symptom rating scales to identify attention-deficit/hyperactivity disorder. BMC Psychiatry. 2024;24(1):471. doi:10.1186/s12888-024-05925-7. PMID: 38937684.
  7. Sébastien Weibel,Rosetta Nicastro,Paco Prada,Pierre Cole,Eva Rüfenacht,Eléonore Pham,Alexandre Dayer,Nader Perroud Screening for attention-deficit/hyperactivity disorder in borderline personality disorder Journal of Affective Disorders. 2017. doi:10.1016/j.jad.2017.09.027. PMID: 28964997.
  8. Annie Stewart,Laura Liljequist Specificity of the CAARS in Discriminating ADHD Symptoms in Adults From Other Axis I Symptoms Journal of Attention Disorders. 2012. doi:10.1177/1087054712460086. PMID: 23074300.
  9. Lena Nylander Attention-deficit/hyperactivity disorder and autism spectrum disorders in adult psychiatric patients Gothenburg University Publications Electronic Archive (Gothenburg University). 2011.
  10. Margaret H. Sibley,John T. Mitchell,Stephen P. Becker Method of adult diagnosis influences estimated persistence of childhood ADHD: a systematic review of longitudinal studies The Lancet Psychiatry. 2016. doi:10.1016/s2215-0366(16)30190-0. PMID: 27745869.
  11. Debjani Das,Jorge I. Vélez,Maria T. Acosta,Maximilian Muenke,Mauricio Arcos‐Burgos,Simon Easteal Retrospective assessment of childhood ADHD symptoms for diagnosis in adults: validity of a short 8-item version of the Wender-Utah Rating Scale ADHD Attention Deficit and Hyperactivity Disorders. 2016. doi:10.1007/s12402-016-0202-9. PMID: 27510231.
  12. Hanna Christiansen,Bernhard Kis,Oliver Hirsch,Alexandra Philipsen,Johannes Hebebrand,Benno G. Schimmelmann The german version of the conners adult ADHD rating scales (CAARS) European Psychiatry. 2011. doi:10.1016/s0924-9338(11)72900-5.
  13. Allyson G. Harrison,Kathleen A. Harrison,Irene T. Armstrong Discriminating malingered attention Deficit Hyperactivity Disorder from genuine symptom reporting using novel Personality Assessment Inventory validity measures Applied Neuropsychology Adult. 2019. doi:10.1080/23279095.2019.1702043. PMID: 31852281.
  14. Eugenia I. Gorlin,Kristy Dalrymple,Iwona Chelminski,Mark Zimmerman Reliability and validity of a semi-structured DSM-based diagnostic interview module for the assessment of Attention Deficit Hyperactivity Disorder in adult psychiatric outpatients Psychiatry Research. 2016. doi:10.1016/j.psychres.2016.05.020. PMID: 27259136.
  15. Michael Van Ameringen,Catherine Mancini,William Simpson,Beth Patterson Adult Attention Deficit Hyperactivity Disorder in an Anxiety Disorders Population CNS Neuroscience & Therapeutics. 2010. doi:10.1111/j.1755-5949.2010.00148.x. PMID: 20406249.
  16. Tianhua Chen,Grigoris Antoniou,Marios Adamou,Ilias Tachmazidis,Pan Su Automatic Diagnosis of Attention Deficit Hyperactivity Disorder Using Machine Learning Applied Artificial Intelligence. 2021. doi:10.1080/08839514.2021.1933761.
  17. Aldo Pereira,Vanesa Richarte,Christian Fadeuilhe,Montse Corrales,Estela García,Josep Antoni Ramos-Quiroga ADHD Rating Scale (ADHD-RS): Validation in Spanish in adult population according to the DSM-5. Span J Psychiatry Ment Health. 2024;17(1):46-50. doi:10.1016/j.sjpmh.2023.06.002. PMID: 38436988.
  18. Joseph Biederman Is ADHD Overdiagnosed in Scandinavia? Acta Psychiatrica Scandinavica. 2012. doi:10.1111/j.1600-0447.2012.01878.x. PMID: 22764752.
  19. Stephen V. Faraone,Kevin M. Antshel Diagnosing and treating attention‐deficit/hyperactivity disorder in adults World Psychiatry. 2008. doi:10.1002/j.2051-5545.2008.tb00179.x. PMID: 18836579.
  20. Páll Magnússon,Jakob Smári,Dagbjörg Sigurðardóttir,Gísli Baldursson,Jón Sigmundsson,Kristleifur Kristjánsson,Sólveig Sigurðardóttir,Stefán Hreiðarsson,Steingerður Sigurbjörnsdóttir,Ólafur Ó. Guðmundsson Validity of Self-Report and Informant Rating Scales of Adult ADHD Symptoms in Comparison With a Semistructured Diagnostic Interview Journal of Attention Disorders. 2006. doi:10.1177/1087054705283650. PMID: 16481666.
  21. Katie Grogan,Isobel Claire Gormley,Brendan Rooney,Robert Whelan,Hanni Kiiski,Marie Naughton,Jessica Bramham Differential diagnosis and comorbidity of ADHD and anxiety in adults British Journal of Clinical Psychology. 2017. doi:10.1111/bjc.12156. PMID: 28895146.
  22. Minha Hong,J. J. Sandra Kooij,Bongseog Kim,Yoo‐Sook Joung,Hanik K. Yoo,Eui‐Jung Kim,Soyoung Irene Lee,Soo‐Young Bhang,Seung Yup Lee,Doug Hyun Han,Young Sik Lee,Geon Ho Bahn Validity of the Korean Version of DIVA-5: A Semi-Structured Diagnostic Interview for Adult ADHD DOAJ (DOAJ: Directory of Open Access Journals). 2020.
  23. Daniel P. Notzon,Martina Pavlicová,Andrew Glass,John J. Mariani,Amy L. Mahony,Daniel J. Brooks,Frances R. Levin ADHD Is Highly Prevalent in Patients Seeking Treatment for Cannabis Use Disorders Journal of Attention Disorders. 2016. doi:10.1177/1087054716640109. PMID: 27033880.
  24. Margaret H. Sibley,Luís Augusto Rohde,James M. Swanson,Lily Hechtman,Brooke S. G. Molina,John T. Mitchell,L. Eugene Arnold,Arthur Caye,Traci M. Kennedy,Arunima Roy,Annamarie Stehli,for the Multimodal Treatment Study of Children with ADHD (MTA) Cooperative Group Late-Onset ADHD Reconsidered With Comprehensive Repeated Assessments Between Ages 10 and 25 American Journal of Psychiatry. 2017. doi:10.1176/appi.ajp.2017.17030298. PMID: 29050505.
  25. Leo Bastiaens,James Galus Comparison of the Adult ADHD Self Report Scale Screener for DSM-IV and DSM-5 in a Dually Diagnosed Correctional Population Psychiatric Quarterly. 2017. doi:10.1007/s11126-017-9553-4. PMID: 29270886.
  26. Ernest F. Johnson,Julie A. Suhr C-42 Is “Clinical Impairment” Normative in College Populations? Identifying Base Rates of Self-Reported Impairment in a Non-Treatment Seeking Population Archives of Clinical Neuropsychology. 2019. doi:10.1093/arclin/acz034.204.
  27. Jasmine Hines,T. S. King,William Curry The Adult ADHD Self-Report Scale for Screening for Adult Attention Deficit-Hyperactivity Disorder (ADHD) The Journal of the American Board of Family Medicine. 2012. doi:10.3122/jabfm.2012.06.120065. PMID: 23136325.
  28. Stéphanie Baggio,Sophie Bayard,Clémence Cabelguen,Martin Desseilles,M. Gachet,C. Kraemer,Hélène Richard-Lepouriel,Rosetta Nicastro,Stéphanie Bioulac,Anne Sauvaget,Sébastien Weibel,Nader Perroud,Régis Lopez Diagnostic Accuracy of the French Version of the Adult Attention Deficit / Hyperactivity Disorder Self-Report Screening Scale for DSM-5 (ASRS-5) Journal of Psychopathology and Behavioral Assessment. 2020. doi:10.1007/s10862-020-09822-7.
  29. Belén Roselló Miranda,Carmen Berenguer,Inmaculada Baixauli,Álvaro Mira,José Martı́nez-Raga,Ana Miranda Empirical examination of executive functioning, ADHD associated behaviors, and functional impairments in adults with persistent ADHD, remittent ADHD, and without ADHD BMC Psychiatry. 2020. doi:10.1186/s12888-020-02542-y. PMID: 32204708.
  30. Laura M. Garnier-Dykstra,Gillian M. Pinchevsky,Kimberly M. Caldeira,Kathryn B. Vincent,Amelia M. Arria Self-reported Adult Attention-Deficit/Hyperactivity Disorder Symptoms Among College Students Journal of American College Health. 2010. doi:10.1080/07448481.2010.483718. PMID: 20864440.
  31. Nina E Calmenson A New Subscale for the Personality Assessment Inventory (PAI) to Screen Adults for Attention-Deficit/Hyperactivity Disorder (ADHD) doi:10.12794/metadc1833444.
  32. Lucy Riglin,Sharifah Shameem Agha,Olga Eyre,Rhys Bevan Jones,Robyn E. Wootton,Ajay K Thapar,Stephan Collishaw,Evie Stergiakouli,K. Langley,Anita Thapar Investigating the validity of the Strengths and Difficulties Questionnaire to assess ADHD in young adulthood medRxiv. 2021. doi:10.1101/2021.02.02.20248239.
  33. Staffan Söderström,Richard Pettersson,Kent W. Nilsson Quantitative and subjective behavioural aspects in the assessment of attention-deficit hyperactivity disorder (ADHD) in adults Nordic Journal of Psychiatry. 2013. doi:10.3109/08039488.2012.762940. PMID: 23527787.
  34. Mariano Gabriel Scandar Validez y fiabilidad de las escalas ASRS y WURS-25 para el diagnóstico del trastorno por déficit de atención/hiperactividad en población argentina Revista de Neurología. 2021. doi:10.33588/rn.7203.2019381. PMID: 33506485.
  35. Mathias Luderer,Nurcihan Kaplan-Wickel,Agnes Richter,Iris Reinhard,Falk Kiefer,Tillmann Weber Screening for adult attention-deficit/hyperactivity disorder in alcohol dependent patients: Underreporting of ADHD symptoms in self-report scales Drug and Alcohol Dependence. 2018. doi:10.1016/j.drugalcdep.2018.11.020. PMID: 30583265.
  36. Brooke C. Schneider,Daniel Schöttle,Birgit Hottenrott,Jürgen Gallinat,Steffen Moritz Assessment of Adult ADHD in Clinical Practice: Four Letters—40 Opinions Journal of Attention Disorders. 2019. doi:10.1177/1087054719879498. PMID: 31625465.
  37. Josep Antoni Ramos‐Quiroga,Viviana Nasillo,Vanesa Richarte,Montserrat Corrales,Felipe Palma,Pol Ibáñez,Marieke Michelsen,Geurt van de Glind,Miguel Casas,J. J. Sandra Kooij Criteria and Concurrent Validity of DIVA 2.0: A Semi-Structured Diagnostic Interview for Adult ADHD Journal of Attention Disorders. 2016. doi:10.1177/1087054716646451. PMID: 27125994.
  38. Margaret H. Sibley Empirically-informed guidelines for first-time adult ADHD diagnosis Journal of Clinical and Experimental Neuropsychology. 2021. doi:10.1080/13803395.2021.1923665. PMID: 33949916.
  39. Marios Adamou,Tim Fullen,Rebecca Crisp,Emily Hallett,Lauren Spencer,Lydia Wharton,Sarah L Jones Validation of a criterion-based screening and triage pathway for adult ADHD: a prospective observational study of safety and operational efficiency. Front Psychiatry. 2026;17:1789686. doi:10.3389/fpsyt.2026.1789686. PMID: 42221313.
  40. Lida Zamani,Zahra Shahrivar,Javad Alaghband‐Rad,Vandad Sharifi‎,Elham Davoodi,Shadi Ansari,Fatemeh Emari,Dora Wynchank,J. J. Sandra Kooij,Philip Asherson Reliability, Criterion and Concurrent Validity of the Farsi Translation of DIVA-5: A Semi-Structured Diagnostic Interview for Adults With ADHD Journal of Attention Disorders. 2020. doi:10.1177/1087054720930816. PMID: 32486881.
  41. Shanel Chandra,Joseph Biederman,Stephen V. Faraone Assessing the Validity of the Age at Onset Criterion for Diagnosing ADHD in DSM-5 Journal of Attention Disorders. 2016. doi:10.1177/1087054716629717. PMID: 26922806.
  42. R. Mayes The ADHD Explosion: Myths, Medication, Money, and Today's Push for Performance Journal of Health Politics Policy and Law. 2015. doi:10.1215/03616878-2888627.
  43. Morgan B. Drake,Cynthia A. Riccio,Nicole Hale Assessment of Adult ADHD With College Students Journal of Attention Disorders. 2017. doi:10.1177/1087054717698222. PMID: 28355936.
  44. Saima Jehanzeb,Amna Javaid “Overdiagnosed or Misdiagnosed” – A Classic Case of Pseudohallucinations and Maladaptive Daydreaming When a Diagnosis of Adult ADHD Was Missed BJPsych Open. 2025. doi:10.1192/bjo.2025.10734.
  45. Johanna Waltereit,Martin Schulte-Rüther,Veit Roessner,Robert Waltereit Retrospective assessment of ICD-10/DSM-5 criteria of childhood ADHD from descriptions of academic and social behaviors in German primary school reports. Eur Child Adolesc Psychiatry. 2025;34(2):659-673. doi:10.1007/s00787-024-02509-4. PMID: 39046525.
  46. Paulo Mattos,Bruno Palazzo Nazar,Rosemary Tannock By the book: ADHD prevalence in medical students varies with analogous methods of addressing DSM items Brazilian Journal of Psychiatry. 2018. doi:10.1590/1516-4446-2017-2429. PMID: 29451590.
  47. Joel Paris,Venkat Bhat,Brett D. Thombs Is Adult Attention-Deficit Hyperactivity Disorder Being Overdiagnosed? The Canadian Journal of Psychiatry. 2015. doi:10.1177/070674371506000705. PMID: 26175391.
  48. Richard Pettersson,Staffan Söderström,Kent W. Nilsson Diagnosing ADHD in Adults: An Examination of the Discriminative Validity of Neuropsychological Tests and Diagnostic Assessment Instruments Journal of Attention Disorders. 2015. doi:10.1177/1087054715618788. PMID: 26681530.
  49. Sharon Suganthi Caroline S,Paulomi M Sudhir,Urvakhsh Meherwan Mehta,Arun Kandasamy,K Thennarasu,Vivek Benegal Assessing Adult ADHD: An Updated Review of Rating Scales for Adult Attention Deficit Hyperactivity Disorder (ADHD). J Atten Disord. 2024;28(7):1045-1062. doi:10.1177/10870547241226654. PMID: 38369740.
  50. Yu‐Ju Lin,Kuan-Wu Lo,Li-Kuang Yang,Susan Shur‐Fen Gau Validation of DSM-5 age-of-onset criterion of attention deficit/hyperactivity disorder (ADHD) in adults: Comparison of life quality, functional impairment, and family function Research in Developmental Disabilities. 2015. doi:10.1016/j.ridd.2015.07.026. PMID: 26318976.
  51. Berk Ustun,Lenard A. Adler,Cynthia Rudin,Stephen V. Faraone,Thomas Spencer,Patricia A. Berglund,Michael J. Gruber,Ronald C. Kessler The World Health Organization Adult Attention-Deficit/Hyperactivity Disorder Self-Report Screening Scale for DSM-5 JAMA Psychiatry. 2017. doi:10.1001/jamapsychiatry.2017.0298. PMID: 28384801.
  52. Jan Loney,Johannes Ledolter,John R. Kramer,Robert J. Volpe Retrospective ratings of ADHD symptoms made at young adulthood by clinic-referred boys with ADHD-related problems, their brothers without ADHD, and control participants. Psychological Assessment. 2007. doi:10.1037/1040-3590.19.3.269. PMID: 17845119.
  53. Ernesto Barceló,Alexandra León-Jacobus,Omar Cortes-Peña,Stephany Valle-Córdoba,Yuliana Andrea Florez Niño Validación del inventario exploratorio de síntomas de TDAH (IES-TDAH) ajustado al DSM-V DSPACE System - Metalibrary (University of the Coast). 2016.
  54. Roni Y Kraut,Christian Ono,Scott Garrison,Omar Kamal,Marissa L Doroshuk,Ben Vandermeer,Gerard Amanna,Oksana Babenko Does the format of the adult ADHD self-report scale influence screen-positive rates? A randomized controlled trial in primary care. Front Psychiatry. 2025;16:1646293. doi:10.3389/fpsyt.2025.1646293. PMID: 41551194.
  55. Margaret H. Sibley,William E. Pelham,Brooke S. G. Molina,Elizabeth M. Gnagy,James G. Waxmonsky,Daniel A. Waschbusch,Karen J. Derefinko,Brian T. Wymbs,Allison Garefino,Dara E. Babinski,Aparajita B. Kuriyan When diagnosing ADHD in young adults emphasize informant reports, DSM items, and impairment. Journal of Consulting and Clinical Psychology. 2012. doi:10.1037/a0029098. PMID: 22774792.
  56. Hui Dong,Janneke Koerts,Gerdina H. M. Pijnenborg,Norbert Scherbaum,Bernhard Müller,Anselm B. M. Fuermaier Cognitive Underperformance in a Mixed Neuropsychiatric Sample at Diagnostic Evaluation of Adult ADHD Journal of Clinical Medicine. 2023. doi:10.3390/jcm12216926. PMID: 37959391.
  57. N Lee,Melvyn Wei Bin Zhang Systematic review on prevalence of ADHD, possible ADHD or ADHD symptoms in medical students Frontiers in Psychiatry. 2025. doi:10.3389/fpsyt.2025.1684727. PMID: 41409340.
  58. Catherine M. McCormick-Deaton,Sarah Mohiuddin New Onset ADHD Symptoms in Adolescents and College Students: Diagnostic Challenges and Recommendations Adolescent Psychiatry. 2018. doi:10.2174/2210676608666180208162023.
  59. Friederike Blume,Lilly Buhr,Jan Kühnhausen,Rieke Köpke,Lydia A Weber,Andreas J Fallgatter,Thomas Ethofer,Caterina Gawrilow Validation of the Self-Report Version of the German Strengths and Weaknesses of ADHD Symptoms and Normal Behavior Scale (SWAN-DE-SB). Assessment. 2025;32(1):130-146. doi:10.1177/10731911241236699. PMID: 38523357.
  60. Geurt van de Glind,Wim van den Brink,Maarten W.J. Koeter,Pieter J. Carpentier,Katelijne van Emmerik‐van Oortmerssen,Sharlene Kaye,Arvid Skutle,Eli Torild Hellandsjø Bu,Johan Franck,Maija Konstenius,Franz Moggi,Geert Dom,Sofie Verspreet,Zsolt Demetrovics,Máté Kapitány‐Fövény,Mélina Fatseas,Marc Auriacombe,Arild Schillinger,Andrea Seitz,Brian Johnson,Stephen V. Faraone,Josep Antoni Ramos‐Quiroga,Miguel Casas,Steve Allsop,Susan Carruthers,Csaba Barta,Robert A. Schoevers,Frances R. Levin Validity of the Adult ADHD Self-Report Scale (ASRS) as a screener for adult ADHD in treatment seeking substance use disorder patients Drug and Alcohol Dependence. 2013. doi:10.1016/j.drugalcdep.2013.04.010. PMID: 23660242.
  61. Sasa L. Kivisaari Self-rating scales in the assessment of current and childhood symptoms of attention deficit hyperactivity disorder in adults Työväentutkimus Vuosikirja. 2008.
  62. Stephen L. Able,Joseph A. Johnston,Lenard A. Adler,Ralph Swindle Functional and psychosocial impairment in adults with undiagnosed ADHD Psychological Medicine. 2006. doi:10.1017/s0033291706008713. PMID: 16938146.
  63. L. I. Birtalan,S. Bálint,Tünde Kilencz,János Réthelyi New approaches in the neuropsychological evaluation of adult ADHD European Psychiatry. 2024. doi:10.1192/j.eurpsy.2024.787.
  64. Josep Antoni Ramos‐Quiroga,Rosa Bosch,Vanesa Richarte,Sergi Valero,Núria Gómez-Barros,Mariana Nogueira,Glòria Palomar,Montse Corrales,Naia Sáez‐Francàs,M. Corominas,Alberto Real,Raquel Vidal,Pablo J. Chalita,Miguel Casas Validez de criterio y concurrente de la versión española de la Conners Adult ADHD Diagnostic Interview for DSM-IV Revista de Psiquiatría y Salud Mental. 2012. doi:10.1016/j.rpsm.2012.05.004. PMID: 23021295.
  65. Michael Rösler,Wolfgang Retz,Johannes Thome,Marc Schneider,Rolf‐Dieter Stieglitz,Peter Falkai Psychopathological rating scales for diagnostic use in adults with attention-deficit/hyperactivity disorder (ADHD) European Archives of Psychiatry and Clinical Neuroscience. 2006. doi:10.1007/s00406-006-1001-7. PMID: 16977549.
  66. Stacy Jean Graves Validity and Diagnostic Accuracy of an ADHD Symptom Rating Scale for Identifying Adults with ADHD Digital Scholarship - UNLV (University of Nevada, Las Vegas). 2022. doi:10.34917/23469725.
  67. Salvatore Mannuzza,Rachel G. Klein,Donald F. Klein,Abrah Bessler,Patrick E. Shrout Accuracy of Adult Recall of Childhood Attention Deficit Hyperactivity Disorder American Journal of Psychiatry. 2002. doi:10.1176/appi.ajp.159.11.1882. PMID: 12411223.
  68. Mary V. Solanto,Kenneth Etefia,David J. Marks The Utility of Self-Report Measures and the Continuous Performance Test in the Diagnosis of ADHD in Adults CNS Spectrums. 2004. doi:10.1017/s1092852900001929. PMID: 15337862.
  69. Joel Young,Richard N. Powell,Celeste Zabel,Jaime Saal,Lisa L. M. Welling,Jillian Fortain,Ashley Ceresnie Development and Validation of the ADHD Symptom and Side Effect Tracking - Baseline Scale (ASSET-BS): A Novel Short Screening Measure for ADHD in Clinical Populations Research Square. 2023. doi:10.21203/rs.3.rs-2971206/v1.
  70. Bhathika Perera,Ken Courtenay,Solomis Solomou,Aditya Borakati,André Strydom Diagnosis of Attention Deficit Hyperactivity Disorder in Intellectual Disability: Diagnostic and Statistical Manual of Mental Disorder V versus clinical impression Journal of Intellectual Disability Research. 2019. doi:10.1111/jir.12705. PMID: 31808234.
  71. Susana Farcas Psychometric properties of the Hungarian version of the adult ADHD self-report scale (asrs-v1.1) screener and symptom checklist doi:10.33422/3hsconf.2018.09.10.
  72. Miriam Becke,Lara Tucha,Matthias Weisbrod,Steffen Aschenbrenner,Oliver Tucha,Anselm B. M. Fuermaier Joint Consideration of Validity Indicators Embedded in Conners’ Adult ADHD Rating Scales (CAARS) Psychological Injury and Law. 2022. doi:10.1007/s12207-022-09445-1.
  73. Akiko Nishikawa,Dan Nakamura,Nobuyuki Saga,Daisuke Ikuse,Kenji Sanada,Akira Iwanami Influence of autism spectrum disorder tendencies on the clinical symptoms of adult attention-deficit/hyperactivity disorder The Showa University Journal of Medical Sciences. 2024. doi:10.15369/sujms.36.133.
  74. Erlend J. Brevik,Astri J. Lundervold,Jan Haavik,Maj‐Britt Posserud Validity and accuracy of the Adult Attention‐Deficit/Hyperactivity Disorder (ADHD) Self‐Report Scale (ASRS) and the Wender Utah Rating Scale (WURS) symptom checklists in discriminating between adults with and without ADHD Brain and Behavior. 2020. doi:10.1002/brb3.1605. PMID: 32285644.
  75. Allyson G. Harrison,Sylvia Anna Nay,Irene T. Armstrong Diagnostic Accuracy of the Conners’ Adult ADHD Rating Scale in a Postsecondary Population Journal of Attention Disorders. 2016. doi:10.1177/1087054715625299. PMID: 26794674.
  76. Allyson G. Harrison,Sandra J. Alexander,Irene T. Armstrong Higher Reported Levels of Depression, Stress, and Anxiety Are Associated With Increased Endorsement of ADHD Symptoms by Postsecondary Students Canadian Journal of School Psychology. 2013. doi:10.1177/0829573513480616.
  77. Myriam J. Sollman,John D. Ranseen,David T. R. Berry Detection of feigned ADHD in college students. Psychological Assessment. 2010. doi:10.1037/a0018857. PMID: 20528060.
  78. Marius Grandjean,Shachar Hochman,Raja Mukherjee,Roi Cohen Kadosh Malingering in ADHD behavioral rating scales: recommendations for research contexts Frontiers in Psychiatry. 2025. doi:10.3389/fpsyt.2025.1532807. PMID: 39967578.
  79. Noa Givon-Schaham,Nir Shalev Measurement Equivalence of the ASRS Across the Adult Lifespan: A Differential Item Functioning Analysis medRxiv. 2026. doi:10.64898/2026.04.06.26350233.
  80. Andrew C. Hale,Kipling M. Bohnert,Robert J. Spencer,Dara Ganoczy,Paul N. Pfeiffer The Prevalence and Incidence of Attention-deficit/Hyperactivity Disorder in the Veterans Health Administration From 2009 to 2016 Medical Care. 2020. doi:10.1097/mlr.0000000000001287. PMID: 32049948.
Appendix A — PRISMA 2020 Checklist
Appendix B — Reviewer Worklist

Suggested citation

Krasnova M. Evidence Review: Accuracy of Adult ADHD Diagnostic Practices: Over-Diagnosis, Under-Diagnosis, and Optimal Assessment Approaches. margaritakrasnovamd.com. Last reviewed 2026-06-05. Available at: https://margaritakrasnovamd.com/evidence-review/adhd-diagnostic-accuracy-adults.html
