What Is XPath? Querying XML and HTML Documents

Short answer

XPath (XML Path Language) is a query language for selecting nodes in XML or HTML documents. It is similar to CSS selectors but more expressive: you can navigate to parents and siblings along axes and apply complex predicates. Browsers, scraping libraries, and test frameworks such as Selenium and Playwright can all evaluate XPath expressions.

Basic syntax

Expression	Selects
`/html/body/div`	Absolute path from the root: `<div>` children of `<body>`
`//div`	Any `<div>` in the document
`//a[@href]`	Any link with an href attribute
`//a[@class="btn"]`	Links with class exactly "btn"
`//a[contains(@class,"btn")]`	Links whose class attribute contains the substring "btn" (also matches "btn-large")
`//ul/li[1]`	First `<li>` child of each `<ul>`
`//ul/li[last()]`	Last `<li>` child of each `<ul>`
`//div[text()="Hello"]`	Any `<div>` with a text node exactly equal to "Hello"
`//*[@id="main"]`	Any element with id="main"
`//input[@type="email"]`	Email-type inputs
`//button[contains(., "Submit")]`	Button containing "Submit" anywhere in its text
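To try a few of these expressions outside a browser, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`. That module implements only a subset of XPath (no `contains()` and no sibling axes, for example), so only the simpler rows of the table are shown. The sample markup and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; it must be well-formed XML for this parser.
DOC = """
<html>
  <body>
    <div id="main">
      <ul>
        <li>one</li>
        <li>two</li>
        <li>three</li>
      </ul>
      <a href="/home" class="btn">Home</a>
      <a class="btn primary">No link</a>
      <input type="email" name="email"/>
      <input type="text" name="name"/>
    </div>
  </body>
</html>
"""

root = ET.fromstring(DOC)

# ElementTree paths are relative to an element, so "//a" is written ".//a".
links_with_href = root.findall(".//a[@href]")          # //a[@href]
exact_class = root.findall(".//a[@class='btn']")       # //a[@class="btn"]
first_item = root.find(".//ul/li[1]")                  # //ul/li[1]
last_item = root.find(".//ul/li[last()]")              # //ul/li[last()]
main = root.find(".//*[@id='main']")                   # //*[@id="main"]
email_inputs = root.findall(".//input[@type='email']") # //input[@type="email"]

print([a.text for a in links_with_href])
print(first_item.text, last_item.text)
print(main.tag, [i.get("name") for i in email_inputs])
```

Note that the exact-match predicate `@class='btn'` skips the second link, whose class is "btn primary". For the full XPath 1.0 feature set in Python, a third-party library is needed.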

Why XPath beats CSS selectors sometimes

	CSS Selectors	XPath
Match by text	No	Yes (`text()`, `contains()`)
Walk to parent	Only via `:has()`	Yes (`..`, `parent::`)
Sibling navigation	Limited (`+` and `~`, forward only)	Full (preceding/following axes)
Index from end	`:nth-last-child`	`last()`, `last()-1`
Predicates with logic	Limited (comma for OR, `:not()`, `:is()`)	`and`, `or`, `not()`
Speed in browsers	Usually faster	Usually slower
Readability	Cleaner	Verbose

Where XPath shines

Test automation: Selenium and Playwright accept XPath; selectors based on visible text tend to survive rebuilds better than selectors tied to generated (hashed) CSS class names
Web scraping: "find the next sibling of this label" is one XPath expression, compared with several lines of manual DOM walking in JS
XML processing: SOAP envelopes, RSS feeds, sitemaps, and configuration files are all XML, so XPath applies to them directly
Browser DevTools: in the Chrome or Firefox console, `$x("//a")` evaluates an XPath expression and returns the matching nodes, which is useful for debugging selectors
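As an illustration of the XML-processing case, the sketch below pulls URLs out of a sitemap with Python's standard-library `xml.etree.ElementTree`. The sitemap content is invented, and the `sm` prefix is an arbitrary local name chosen here; only the namespace URI is fixed by the sitemap format.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap; real ones use the same namespace URI.
SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-01-01</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>
"""

# Elements in a default namespace must be addressed through a prefix.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

root = ET.fromstring(SITEMAP)

# Every <loc> under every <url>.
all_locs = [loc.text for loc in root.findall("./sm:url/sm:loc", NS)]

# Only <url> entries that have a <lastmod> child (a predicate on a child tag).
dated_locs = [
    url.find("sm:loc", NS).text
    for url in root.findall("./sm:url[sm:lastmod]", NS)
]

print(all_locs)
print(dated_locs)
```

Forgetting the namespace mapping is a common reason an otherwise correct path returns nothing on SOAP, RSS, or sitemap documents.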

Common test selectors

Click the button labeled "Save":
//button[normalize-space()="Save"]

Find the input following a label "Email":
//label[text()="Email"]/following-sibling::input[1]

Find a table row by cell content:
//tr[td[contains(., "Order #1234")]]

Get the cell to the right of the "Total:" label:
//td[text()="Total:"]/following-sibling::td[1]
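To make concrete what the `following-sibling` axis does, here is a rough manual equivalent of the label selector, written against Python's standard-library `xml.etree.ElementTree`, which does not implement that axis. The helper name `first_following_sibling` and the sample form are hypothetical, and the helper simplifies in two ways: it returns only the first match, and it compares only the label's leading text.

```python
import xml.etree.ElementTree as ET

# Hypothetical form markup for illustration.
FORM = """
<form>
  <label>Email</label>
  <span>required</span>
  <input name="email"/>
  <label>Name</label>
  <input name="name"/>
</form>
"""


def first_following_sibling(root, label_text, tag):
    """Approximate //label[text()=label_text]/following-sibling::tag[1].

    Returns the first matching element, or None if there is no match.
    """
    for parent in root.iter():
        children = list(parent)
        for index, child in enumerate(children):
            if child.tag == "label" and (child.text or "") == label_text:
                # Scan later siblings in document order for the wanted tag.
                for sibling in children[index + 1:]:
                    if sibling.tag == tag:
                        return sibling
    return None


root = ET.fromstring(FORM)
email_input = first_following_sibling(root, "Email", "input")
name_input = first_following_sibling(root, "Name", "input")
missing = first_following_sibling(root, "Phone", "input")

print(email_input.get("name"), name_input.get("name"), missing)
```

The XPath version skips the intervening `<span>` the same way, because `following-sibling::input[1]` selects the nearest following sibling that is an `<input>`, not simply the next node.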

Versions

XPath 1.0 (1999): the version browsers support; simple syntax, no regex
XPath 2.0/3.0/3.1: added regex, sequences, and math functions; supported by XSLT and XQuery processors such as Saxon, not by browsers

For browser/Selenium/Playwright work, you're using XPath 1.0.

JSON equivalent

XPath is XML-specific. For querying JSON, use JSONPath — same idea, JSON syntax. Try our JSONPath tester.

Related tools

Test patterns against text with the regex tester (XPath 1.0 has no regex functions, so sometimes you need real regex). Convert HTML entities encountered in scraped text with the HTML encoder/decoder.
