Selenium with Python: how to pull the data without id and class?

Question

I am using Python with Selenium to pull data from the website below: http://www.worldhospitaldirectory.com/klinik-fur-anaesthesiologie-und-intensivmedizin/info/4181

As the screenshot of the page shows, I want to get the hospital name, category, address, country, phone, website, and email.

But when I inspect the elements, I find that they have no id or class name to select on:
Category: General Hospitals
Address: .....

I really have no idea how to pull them from this website. Please help me or give me some advice.

You could just pull the `outerHTML` and parse it like a string. — gold_cy, Feb 03 '17 at 21:29
If the language does not change, you can use XPaths based on the label text; if the structure is the same every time, you can try locating by position. – lauda, Feb 04 '17 at 10:55
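To illustrate the `outerHTML` suggestion above, here is a minimal sketch that parses such a string with Python's standard `html.parser`. It assumes each label sits in a `<b>` tag and its value is the bare text that follows; the real page layout may differ. The names `LabelValueParser` and `parse_fields` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class LabelValueParser(HTMLParser):
    """Collect label/value pairs, assuming markup like <b>Label:</b> value."""

    def __init__(self):
        super().__init__()
        self.fields = {}       # label -> value
        self._in_bold = False  # True while inside a <b> element
        self._label = None     # label still waiting for its value

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._in_bold = True

    def handle_endtag(self, tag):
        if tag == "b":
            self._in_bold = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_bold:
            # Text inside <b> is treated as a label; drop the trailing colon.
            self._label = text.rstrip(":").strip()
        elif self._label is not None:
            # First non-empty text after the label is treated as its value.
            self.fields[self._label] = text
            self._label = None


def parse_fields(html):
    """Return a dict of label -> value found in the given HTML string."""
    parser = LabelValueParser()
    parser.feed(html)
    return parser.fields


# Made-up sample markup, not copied from the real page.
sample = (
    "<td><b>Category:</b> General Hospitals<br>"
    "<b>Address:</b> 1 Example Street<br></td>"
)
fields = parse_fields(sample)
print(fields)
```

With Selenium, the string passed to `parse_fields` could come from `element.get_attribute("outerHTML")` on whichever element wraps the details.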

Jakub Obstarczyk · Accepted Answer · 2017-02-05T10:49:35.543

0

You should be able to find the HTML tag that contains the 'Category' text (C# code below):

var category = driver.FindElement(By.XPath("//b[contains(., 'Category')]"));

[edit]

To get the text of that element:

var textOfCategoryField = category.Text;

To grab values from the other fields, just replace the string 'Category' for each element:

var textOfAddressField = driver.FindElement(By.XPath("//b[contains(., 'Address')]")).Text;
var textOfCountryField = driver.FindElement(By.XPath("//b[contains(., 'Country')]")).Text;

etc.
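Since the question is about Python, here is a rough Python equivalent of the C# lines above. It is only a sketch: the helper names `label_xpath` and `label_text` are made up, and `driver` is assumed to be an already started Selenium WebDriver. In Python the property is called `.text` rather than `.Text`.

```python
def label_xpath(label):
    """Build the XPath for the <b> element whose text contains `label`."""
    return f"//b[contains(., '{label}')]"


def label_text(driver, label):
    """Return the visible text of the <b> element that holds `label`.

    `driver` is assumed to be a running Selenium WebDriver instance.
    """
    # By.XPATH is just the string "xpath", so the literal is used here to
    # keep the sketch free of imports.
    element = driver.find_element("xpath", label_xpath(label))
    return element.text
```

Usage would look like `label_text(driver, "Category")`, and likewise for 'Address', 'Country', and so on.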

edited Feb 05 '17 at 10:49

answered Feb 04 '17 at 11:04


Thanks. I tried it, and it worked to pull 'Category:' from the web page. But I want to get the text after 'Category:'. What should I do? Can I use XPath to locate it? – Peter Cui Feb 05 '17 at 03:43
Use the Text property on the element. – Jakub Obstarczyk Feb 05 '17 at 10:48
Thanks Jakub, now I can get only the text "Category:". The original page source looks like this: Category:General Hospitals, and I cannot get the text "General Hospitals". – Peter Cui Feb 05 '17 at 14:12
Ah, sorry, now I see that 'General Hospitals' is a text node. Try the solution described here: http://stackoverflow.com/questions/8505375/getting-text-from-a-node/ – Jakub Obstarczyk Feb 05 '17 at 16:22
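One possible way to read such a text node from Python is to ask the browser for the node that follows the label, since `find_element` only returns elements, not text nodes. The sketch below assumes the value is the sibling node directly after the `<b>` label; the helper name `value_after_label` is made up, and `driver` is assumed to be a running Selenium WebDriver.

```python
# JavaScript that steps from the <b> label to the node right after it
# and returns that node's text (or null when there is no sibling).
NEXT_SIBLING_TEXT_JS = """
var node = arguments[0].nextSibling;
return node ? node.textContent : null;
"""


def value_after_label(driver, label):
    """Return the text that directly follows the <b> element holding `label`.

    Assumes markup where the value is the sibling node right after the label.
    """
    # By.XPATH is just the string "xpath", so the literal is used here.
    label_element = driver.find_element("xpath", f"//b[contains(., '{label}')]")
    raw = driver.execute_script(NEXT_SIBLING_TEXT_JS, label_element)
    # Strip surrounding whitespace; fall back to "" when nothing follows.
    return raw.strip() if raw else ""
```

Calling `value_after_label(driver, "Category")` should then give the text after the label rather than the label itself, provided the page really is laid out that way.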
