HTML

Syndicate content 6 

I would like to write some code to parse a set of HTML pages from the internet in order to gather information from each web page.

All of the web pages are generated using a template, so the format of each of the web-pages is consistent with one-another and the information that I want to gather is always located in the same logical place within the page.

What is the best way to parse an html page in order to gather information at a specific place?

Can XML XPATH be used here?  Does anyone have any examples of parsing HTML content?

Last reply 15 December 2017
+ 2   0 2
937

views

+ 2

rating