Monthly Archives: May 2017

Laravel 5 Distance Spatial Query – Part 2

In a previous post, I played with the Haversine formula to calculate distances between two points (a user’s current location and a list of store locations for example). It got several hits presumably from Laravel (PHP framework) users trying to implement said formula in their projects. The part I left out was the actual implementation using Eloquent – essentially the post became clickbait thus ticking people off. I also eluded to a “better” formula, but never actually tried it so I decided to create a new post with updated information.

I do not possess the math skills to completely understand the nuances involved, but based on these two blog posts (original Haversine formula and the Vicinity great-circle distance formula), I understand the latter is more accurate when distances are closer together.

    protected function queryVicinity($lat, $lon)
    {
    	$radius = 100; // radius of bounding circle in kilometers
	//$EMR = 6371; // earth's mean radius, km
	$EMR = 3959; // earth's mean radius, miles

	// first-cut bounding box (in degrees)
	$maxLat = $lat + rad2deg($radius/$EMR);
	$minLat = $lat - rad2deg($radius/$EMR);
	// compensate for degrees longitude getting smaller with increasing latitude
	$maxLon = $lon + rad2deg($radius/$EMR/cos(deg2rad($lat)));
	$minLon = $lon - rad2deg($radius/$EMR/cos(deg2rad($lat)));

	$sql = '
		SELECT *, (ATAN2(SQRT(POWER(COS(RADIANS(lat)) * SIN(RADIANS(lon-:lon)), 2) + 
		POWER(COS(RADIANS(:lat)) *SIN(RADIANS(lat)) - 
		(SIN(RADIANS(:lat2)) * COS(RADIANS(lat)) * COS(RADIANS(lon-:lon2))), 2)), 
		SIN(RADIANS(:lat3)) * SIN(RADIANS(lat)) + 
		COS(RADIANS(:lat4)) * COS(RADIANS(lat)) * COS(RADIANS(lon-:lon3))) * :EMR) AS distance
	      	FROM (
	        	SELECT * FROM items WHERE lat BETWEEN :minLat 
	        	AND :maxLat AND lon BETWEEN :minLon AND :maxLon 
	        	// any other filters you want to add
	      	) AS FirstFilter
	   	HAVING distance < :radius 
           	ORDER BY distance 
		LIMIT 0, 20';

	$params = [
		'lon' => $lon,
	    	'lat' => $lat,
	    	'lat2' => $lat,// Repeated values not allowed so increment them in the params array
	    	'lon2' => $lon,
	    	'lat3' => $lat,
	    	'lat4' => $lat,
	    	'lon3' => $lon,
		'EMR' => $EMR, 
		'minLat' => $minLat, 
		'maxLat' => $maxLat, 
		'minLon' => $minLon, 
		'maxLon' => $maxLon, 
		'radius' => $radius,
	];

	$results = DB::select(DB::raw($sql), $params);
	return Model::hydrate($results);
    }

Some discussion on the above code:

  • My fancy code snippet plug-in no longer works so I had to paste it in by hand so I apologize in advance for the formatting – you may need to scroll the code to the right to see all of it. Yes, I tried a couple of other plug-ins and they only annoyed me further.
  • The last two lines form the Eloquent query. While I have seen other examples that wrap the SQL in a closure to make it look more Eloquent, this method seems cleaner to me so that’s what I’m using – pass the SQL and params into a RAW query and Eloquent does the necessary guard against SQL injection.
  • Each parameter substitution from the params array is substituted one at a time and cannot be duplicated, hence the numbering and duplication of values in the params array.
  • The center FROM section forms a bounding box to reduce the number of rows included in the calculation so the query will run quicker. Filtering as much as possible here is recommended.
  • Use HAVING instead of WHERE, otherwise you get a SQL error about a missing column ‘distance’. It allows you to use the column alias in the query vs re-running the math.
  • Adjust the bounding circle by changing the radius and use the correct earth-mean-radius for miles or kilometers.
  • Even though the model does not include a distance column by default, I noticed it IS included when hydrating from this query. Bonus!

Results

I ran tests for several pages with both formulas on a table loaded with 1 million records. In all of my tests, the results were consistent. The “better” formula took significantly longer to run (0.76 vs 1.18 seconds) and this was similar no matter which page I used so I’m sticking with the old formula. Here’s the final function:

    protected function queryDistance($lat, $lon)
    {
    	$radius = 100; // radius of bounding circle in kilometers
	//$EMR = 6371; // earth's mean radius, km
	$EMR = 3959; // earth's mean radius, miles

	// first-cut bounding box (in degrees)
	$maxLat = $lat + rad2deg($radius/$EMR);
	$minLat = $lat - rad2deg($radius/$EMR);
	// compensate for degrees longitude getting smaller with increasing latitude
	$maxLon = $lon + rad2deg($radius/$EMR/cos(deg2rad($lat)));
	$minLon = $lon - rad2deg($radius/$EMR/cos(deg2rad($lat)));
	$d2rLat = deg2rad($lat);
	$d2rLon = deg2rad($lon);

	$sql = '
		SELECT *, (acos(sin(:d2rLat)*sin(radians(lat)) + 
		cos(:d2rLat2)*cos(radians(lat))*cos(radians(lon)-:d2rLon)) * 
		:EMR) AS distance
	      	FROM (
	        	SELECT * FROM items WHERE lat BETWEEN :minLat 
	        	AND :maxLat AND lon BETWEEN :minLon AND :maxLon 
	      	) AS FirstFilter
		HAVING distance < :radius 
		ORDER BY distance 
		LIMIT 0, 20'; 

	$params = [
		'd2rLat' => $d2rLat,
		'd2rLat2' => $d2rLat,
		'd2rLon' => $d2rLon,
		'EMR' => $EMR,
		'minLat' => $minLat,
		'maxLat' => $maxLat,
		'minLon' => $minLon,
		'maxLon' => $maxLon,
		'radius' => $radius,
	];

	$results = DB::select(DB::raw($sql), $params);
	return Model::hydrate($results);
    }

I hope this helps.