I am using Python with Selenium to pull data from the page below: http://www.worldhospitaldirectory.com/klinik-fur-anaesthesiologie-und-intensivmedizin/info/4181

I want to get the hospital name, category, address, country, phone, website, and email from that page.

But when I inspect the elements, I find that they have no id or class name I could use as a locator:
Category: General Hospitals
Address: .....

I really have no idea how to pull these fields from the page. Please help me or give me some advice.

Peter Cui

1 Answer

You should be able to find the HTML tag that contains the text 'Category' (C# code below):

var category = driver.FindElement(By.XPath("//b[contains(., 'Category')]"));

[edit]

To get the text of that element:

var textOfCategoryField = category.Text;

To grab the other fields, replace the string 'Category' with each field's label:

var textOfAddressField = driver.FindElement(By.XPath("//b[contains(., 'Address')]")).Text;
var textOfCountryField = driver.FindElement(By.XPath("//b[contains(., 'Country')]")).Text;

etc.
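Since the question is about Python, here is a rough Python sketch of the same locator. The helper names `label_xpath` and `read_label_texts` are made up for illustration, `driver` is assumed to be a Selenium WebDriver that has already loaded the page, and the labels are assumed to contain no single quotes. Like the C# above, it reads the text of the `<b>` element itself.

```python
def label_xpath(label):
    """Build the XPath from the answer: a <b> element whose text contains `label`."""
    # Assumption: `label` contains no single quote, which would break the expression.
    return "//b[contains(., '{}')]".format(label)


def read_label_texts(driver, labels):
    """Return {label: text of the matching <b> element} for each label.

    `driver` is assumed to be a Selenium WebDriver with the page already loaded.
    """
    # "xpath" is the locator string behind selenium's By.XPATH constant.
    return {
        label: driver.find_element("xpath", label_xpath(label)).text
        for label in labels
    }
```

With a live driver this would be called as `read_label_texts(driver, ["Category", "Address", "Country"])` after `driver.get(...)`.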

  • Thanks. I tried it, and it pulled 'Category:' from the web page. But I want the text that comes after 'Category:'. What should I do? Can I use XPath to locate it? – Peter Cui Feb 05 '17 at 03:43
  • Use the Text property on the element. – Jakub Obstarczyk Feb 05 '17 at 10:48
  • Thanks Jakub, now I can get only the text "Category:". In the original page source it appears as Category:General Hospitals, and I cannot get the text "General Hospitals".
    – Peter Cui Feb 05 '17 at 14:12
  • Ah, sorry, now I see that 'General Hospitals' is a text node. Try the solution described here: http://stackoverflow.com/questions/8505375/getting-text-from-a-node/ – Jakub Obstarczyk Feb 05 '17 at 16:22
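
Because the value is a bare text node sitting after the `<b>` label, `.text` on the `<b>` element can never return it. One possible workaround in Python (not necessarily what the linked answer does) is to take `driver.page_source` and pair each `<b>` label with the text that follows it, using only the standard library. This is a minimal sketch that assumes the markup looks like `<b>Label:</b> value`; the names `LabelValueParser` and `extract_fields` are hypothetical, and the address in the sample is made up.

```python
from html.parser import HTMLParser


class LabelValueParser(HTMLParser):
    """Collect {label: value} pairs from markup shaped like <b>Label:</b> value."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_bold = False
        self._label = None  # most recent <b> label; None until the first one is seen

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._in_bold = True
            self._label = ""

    def handle_endtag(self, tag):
        if tag == "b" and self._in_bold:
            self._in_bold = False
            # Drop the trailing colon so "Category:" becomes "Category".
            self._label = self._label.strip().rstrip(":").strip()
            self.fields[self._label] = ""

    def handle_data(self, data):
        if self._label is None:
            return  # text before the first label is ignored
        if self._in_bold:
            self._label += data
        else:
            # Everything up to the next <b> is treated as this label's value,
            # including text nested in other tags such as a link.
            self.fields[self._label] += data


def extract_fields(html):
    """Return {label: value} with whitespace collapsed; empty labels are skipped."""
    parser = LabelValueParser()
    parser.feed(html)
    parser.close()
    return {k: " ".join(v.split()) for k, v in parser.fields.items() if k}


# Made-up sample in the assumed shape; with Selenium, pass driver.page_source instead.
sample = "<div><b>Category:</b> General Hospitals<br><b>Address:</b> 1 Example Street<br></div>"
print(extract_fields(sample))
```

On a full page this picks up every `<b>` element, and the value of the last label runs to the end of the document, so it is probably worth feeding it only the HTML of the block that holds the hospital details.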