Pandas: Compare row with all other rows by multiple conditions

Question

I want to compare every row, one by one, with all the other rows in the following extract of my dataframe.

Idx   ECTRL ID   Latitude    Longitude
0     186858227  53.617750   30.866759
1     186858229  40.569012   35.138237
2     186858235  38.915970   38.782447
3     186858295  39.737594   37.005481
4     186858299  48.287601   15.487567

I want to extract "ECTRL ID" combinations (e.g. 186858235, 186858295) where the absolute differences in latitude and in longitude are both less than 2.

e.g.:
abs(df.iloc[2]["Latitude"] - df.iloc[3]["Latitude"]) < 2

If it is true, I want to return the pair as a tuple and append it to a list, e.g. (186858235, 186858295).

It works with a loop, but it's pretty slow:

import numpy as np

pairs = []
for idx, row in data.iterrows():
    for j, row2 in data.iterrows():
        # Keep pairs of distinct IDs whose longitude and latitude are both close.
        if (np.absolute(row["Longitude"] - row2["Longitude"]) < 0.05
                and np.absolute(row["Latitude"] - row2["Latitude"]) < 0.05
                and row["ECTRL ID"] != row2["ECTRL ID"]):
            pairs.append((row["ECTRL ID"], row2["ECTRL ID"]))

Is there any way to make this faster with the built-in pandas functions? I have not found a way without looping.
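
One possible direction, offered only as a sketch: the pairwise comparison can be expressed with NumPy broadcasting instead of nested `iterrows` loops. The sample frame below is copied from the extract above, and `threshold = 2` is an assumption taken from the prose (the loop uses 0.05). The names `threshold`, `close` and `pairs` are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the extract in the question.
data = pd.DataFrame({
    "ECTRL ID": [186858227, 186858229, 186858235, 186858295, 186858299],
    "Latitude": [53.617750, 40.569012, 38.915970, 39.737594, 48.287601],
    "Longitude": [30.866759, 35.138237, 38.782447, 37.005481, 15.487567],
})

threshold = 2  # assumed from the prose; the loop in the question uses 0.05

ids = data["ECTRL ID"].to_numpy()
lat = data["Latitude"].to_numpy()
lon = data["Longitude"].to_numpy()

# Broadcasting a column vector against a row vector gives an (n, n) matrix
# of pairwise absolute differences.
close = (
    (np.abs(lat[:, None] - lat[None, :]) < threshold)
    & (np.abs(lon[:, None] - lon[None, :]) < threshold)
)

# Drop pairs with identical IDs, mirroring the != check in the loop.
close &= ids[:, None] != ids[None, :]

# Row and column positions of the matching pairs, in row-major order.
i, j = np.nonzero(close)
pairs = list(zip(ids[i].tolist(), ids[j].tolist()))
print(pairs)
```

Like the loop, this returns each match in both orders, e.g. (186858235, 186858295) and (186858295, 186858235). Masking with `np.triu(close, k=1)` before `np.nonzero` would keep each unordered pair once. Note that the (n, n) matrix needs memory proportional to n squared, so for a large dataframe it may have to be processed in chunks.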

What happens if the differences for Idx 3-2 and 4-3 are both less than 2 but the difference for 4-2 is greater than 2? What tuples do you want in this scenario? — Emma, Jun 07 '22 at 14:33
I just want a list of tuples of IDs for which the differences in longitude and latitude are both less than 2. Everything else should be discarded. I want to get all tuples that match these conditions. In this example I would receive 3-2 and 3-1 as tuples, because their differences in longitude and latitude are both smaller than 2. — Vorsten, Jun 07 '22 at 14:44
Right. So in the example I gave, do you want `[(id of idx 2, idx3), (id of idx 3, idx4)]` or `[(id of idx2, 3, 4)]`? — Emma, Jun 07 '22 at 14:46
You can take a look at the `haversine` algorithm: https://pypi.org/project/haversine/ or https://stackoverflow.com/questions/29545704/fast-haversine-approximation-python-pandas — Emma, Jun 07 '22 at 14:58
@Emma I want to receive `[(id of idx 2, idx3), (id of idx 3, idx4)]`. I'll look into the haversine algorithm, thank you! — Vorsten, Jun 08 '22 at 05:25
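
For reference, the haversine formula mentioned in the comments can be sketched in plain NumPy as below. This is not the API of the linked `haversine` package; `haversine_km` and `EARTH_RADIUS_KM` are hypothetical names, and a mean Earth radius of 6371 km is assumed. It measures great-circle distance in kilometres, which is a different criterion from the per-axis difference in degrees used in the question.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in decimal degrees.

    Accepts scalars or NumPy arrays, so it can be combined with broadcasting.
    """
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# Distance between the rows at Idx 2 and Idx 3 of the extract.
print(haversine_km(38.915970, 38.782447, 39.737594, 37.005481))
```

Passing `lat[:, None], lon[:, None], lat[None, :], lon[None, :]` would give the full pairwise distance matrix, which could then be compared against a distance limit in kilometres.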
