What Is XPath? Querying XML and HTML Documents

XPath is a query language for selecting nodes in XML and HTML — like CSS selectors but more powerful. Plain explanation with practical examples for scraping and testing.

Short answer

XPath (XML Path Language) is a query language for selecting nodes in XML or HTML documents. It's like CSS selectors but more expressive: you can walk to parents and siblings along axes, match on text, and apply complex predicates. Browsers, scraping libraries, and test frameworks such as Selenium and Playwright can all evaluate XPath expressions.

Basic syntax

Expression                         Selects
/html/body/div                     Direct child path from root
//div                              Any <div> in the document
//a[@href]                         Any link with an href attribute
//a[@class="btn"]                  Links with class exactly "btn"
//a[contains(@class,"btn")]        Links whose class contains "btn"
//ul/li[1]                         First list item under any <ul>
//ul/li[last()]                    Last list item
//div[text()="Hello"]              Div containing exact text "Hello"
//*[@id="main"]                    Any element with id="main"
//input[@type="email"]             Email-type inputs
//button[contains(., "Submit")]    Button containing "Submit" anywhere in its text
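Several of these expressions can be tried without a browser. The sketch below uses Python's standard-library xml.etree.ElementTree, which implements only a subset of XPath 1.0 (no text() or contains(), and paths must be relative, so //a is written .//a). The sample markup is a made-up, well-formed page used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; it must be well-formed XML for ElementTree.
PAGE = """
<html>
  <body>
    <div id="main">
      <a href="/home" class="btn">Home</a>
      <a class="btn">No link</a>
      <ul>
        <li>first</li>
        <li>second</li>
        <li>third</li>
      </ul>
      <input type="email" name="contact"/>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# ElementTree paths are relative to the element they are called on,
# so the document-wide "//a" becomes ".//a".
links_with_href = root.findall(".//a[@href]")           # //a[@href]
btn_links = root.findall(".//a[@class='btn']")          # //a[@class="btn"]
first_items = root.findall(".//ul/li[1]")               # //ul/li[1]
last_items = root.findall(".//ul/li[last()]")           # //ul/li[last()]
main = root.find(".//*[@id='main']")                    # //*[@id="main"]
email_inputs = root.findall(".//input[@type='email']")  # //input[@type="email"]

print([a.get("href") for a in links_with_href])    # ['/home']
print(len(btn_links))                              # 2
print(first_items[0].text, last_items[0].text)     # first third
print(main.tag, email_inputs[0].get("name"))       # div contact
```

For expressions that rely on contains(), text(), or axes, a full XPath 1.0 engine (a browser, or a library that wraps one) is needed instead.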

Why XPath beats CSS selectors sometimes

Feature                CSS selectors                               XPath
Match by text          No                                          Yes (text(), contains())
Walk to parent         No (without :has)                           Yes (.., parent::)
Sibling navigation     Limited (+ and ~, following siblings only)  Full (preceding-sibling:: and following-sibling:: axes)
Index from end         :nth-last-child                             last(), last()-1
Predicates with logic  Limited (comma for OR, :not())              and, or, not()
Speed in browsers      Usually faster                              Usually slower
Readability            Cleaner                                     More verbose
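Two of these rows, walking to the parent and indexing from the end, can be sketched with Python's standard-library xml.etree.ElementTree. This is an illustration under assumptions: the form markup is invented, and ElementTree spells the text match as [.='Email'] because it does not implement text().

```python
import xml.etree.ElementTree as ET

# Hypothetical form markup, used only to illustrate the expressions.
FORM = """
<form>
  <div class="field"><label>Name</label><input name="name"/></div>
  <div class="field"><label>Email</label><input name="email"/></div>
  <ol><li>a</li><li>b</li><li>c</li></ol>
</form>
"""

root = ET.fromstring(FORM)

# Match a label by its text, then walk up to its parent with "..".
wrapper = root.find(".//label[.='Email']/..")

# From that parent, step back down to the input inside the same wrapper.
email_input = root.find(".//label[.='Email']/../input")

# Index from the end: the second-to-last <li> of the list.
second_to_last = root.find(".//ol/li[last()-1]")

print(wrapper.get("class"))     # field
print(email_input.get("name"))  # email
print(second_to_last.text)      # b
```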

Where XPath shines

  • Test automation: Selenium and Playwright accept XPath; selectors based on visible text are often more stable than selectors tied to auto-generated CSS class names
  • Web scraping: "find the next sibling of this label" is one XPath expression instead of several lines of JavaScript DOM walking
  • XML processing: SOAP envelopes, RSS feeds, sitemaps, and configuration files are all XML, so XPath applies to them directly
  • Browser DevTools: in the Chrome or Firefox console, $x("//a") evaluates an XPath expression and returns the matching nodes, which is useful for debugging selectors
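As an example of the XML-processing case, the sketch below pulls URLs out of a sitemap with Python's standard-library xml.etree.ElementTree. The sitemap content and the sm prefix are illustrative choices; the namespace URI is the one used by the sitemap protocol. Namespaced documents need a prefix mapping, otherwise the paths match nothing.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap content for illustration.
SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-01-01</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>
"""

# The prefix "sm" is arbitrary; only the URI has to match the document.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

root = ET.fromstring(SITEMAP)

# Every <loc> under every <url>.
locs = [loc.text for loc in root.findall("./sm:url/sm:loc", NS)]

# Only the <url> entries that have a <lastmod> child (a [tag] predicate).
dated = [
    url.find("sm:loc", NS).text
    for url in root.findall("./sm:url[sm:lastmod]", NS)
]

print(locs)   # ['https://example.com/', 'https://example.com/about']
print(dated)  # ['https://example.com/']
```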

Common test selectors

Button labeled "Save" (normalize-space() ignores leading and trailing whitespace):
    //button[normalize-space()="Save"]

Input that follows the label "Email":
    //label[text()="Email"]/following-sibling::input[1]

Table row found by cell content:
    //tr[td[contains(., "Order #1234")]]

Cell to the right of the "Total:" label:
    //td[text()="Total:"]/following-sibling::td[1]
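These selectors need a full XPath 1.0 engine, such as the one Selenium or Playwright drives in the browser. For a quick offline check, the following is a rough approximation using Python's standard-library xml.etree.ElementTree, which has no contains() and no following-sibling:: axis. The table markup is invented; the row lookup uses an exact match instead of contains(), and the sibling step is emulated in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical order table for illustration.
TABLE = """
<table>
  <tr><td>Order #1234</td><td>shipped</td></tr>
  <tr><td>Order #5678</td><td>pending</td></tr>
  <tr><td>Total:</td><td>$42.00</td></tr>
</table>
"""

root = ET.fromstring(TABLE)

# Row by cell content. This is an exact text match, a stricter
# stand-in for //tr[td[contains(., "Order #1234")]].
row = root.find(".//tr[td='Order #1234']")
status = row.findall("td")[1].text

# Emulate //td[text()="Total:"]/following-sibling::td[1]:
# find the label cell's parent row, then take the next cell.
total_row = root.find(".//td[.='Total:']/..")
cells = total_row.findall("td")
label_index = next(i for i, td in enumerate(cells) if td.text == "Total:")
total_value = cells[label_index + 1].text

print(status)       # shipped
print(total_value)  # $42.00
```

The emulation is noticeably longer than the one-line XPath expression it replaces, which is the practical argument for using a full engine where one is available.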

Versions

  • XPath 1.0 (1999): the version browsers implement; simple syntax
  • XPath 2.0/3.0/3.1: added regular expressions, sequences, and more math functions; supported by dedicated XSLT and XQuery processors such as Saxon, not by browsers

For browser/Selenium/Playwright work, you're using XPath 1.0.

JSON equivalent

XPath works on XML and HTML document trees, not on JSON. For querying JSON, use JSONPath, which applies the same idea with a JSON-oriented syntax. Try our JSONPath tester.

Related tools

Test patterns against text (XPath 1.0 has no regex support, so sometimes you need a real regex engine): regex tester. Convert HTML entities encountered in scraped text: HTML encoder/decoder.
