0

I have two tables. Table A structure is

City | State | Country | Latitude | Longitude | LocPoint(Longitude,Latitude)(spatial index)

I have created spatial index on column LocPoint of datatype Point.

Table B

Latitude | Longitude

I want to Join two tables A and B with City State and Country information of the nearest city to the location of points in table B. I know the formula of getting the nearest point, I am just stuck at the join part where i will join only for the nearest point.

Expected OutPut after running the query

Latitude(TableB) | longitude(TableB)| City(Nearest from lat/Lon from Table A) | State | Country

Any ideas?

AngryLeo
  • 390
  • 4
  • 23

1 Answers1

0

Lets call a function dst to calculate the distance between two point to simplify.

Now you need to calculate every distance between your point in TableB with your cities.

 SELECT TA.*, TB.*, dst(TA,TB) as distance
 FROM TableA TA
 CROSS JOIN TableB TB

Now you need to choose the smallest distance for each TableB entry using variables. Would help if you have a PK for TableB. If you dont have a PK you can use CONCAT (tableB.Latitud,tableB.Longitud).

SELECT *
FROM (
    SELECT *, 
           @rn := IF(@pk = tb_pk,
                     @rn + 1,
                     IF(@pk := tb_pk, 1, 1)
                    ) as rn       
    FROM (
         SELECT TA.*, TB.*, dst(TA,TB) as distance
         FROM TableA TA
         CROSS JOIN TableB TB
         ) T
    CROSS JOIN ( SELECT @rn := 0, @pk := 0 ) as var
    ORDER BY distance DESC -- Find the near city
   ) Y
WHERE Y.rn = 1  -- Only select the first city for group.
Community
  • 1
  • 1
Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118
  • did this help or have more questions? – Juan Carlos Oropeza Apr 11 '17 at 17:23
  • I tried to implement that query but it was taking a long time even after indexing both tables. I have 1000 rows in tableB and 100000 rows in table A. I am thinking about having a different approach.. – AngryLeo Apr 11 '17 at 19:06
  • Of course is slow. that is `100.000.000` calculations. You have to do some optimizations. Like only calculate the distance against the cities in a box around each TableB element. Precalculate the sin / cos values because those are the slower operations. And you dont calculate every TableB, you perform a loop and keep saving the result in a temporary table. Other option is use a different db, I do things like this in postgres, and find the near_link for 4000 records I receive every minute in less than 5s. – Juan Carlos Oropeza Apr 11 '17 at 19:19