Tag Archives: SQL

Laravel 5 Distance Spatial Query – Part 2

In a previous post, I played with the Haversine formula to calculate distances between two points (a user’s current location and a list of store locations for example). It got several hits presumably from Laravel (PHP framework) users trying to implement said formula in their projects. The part I left out was the actual implementation using Eloquent – essentially the post became clickbait thus ticking people off. I also eluded to a “better” formula, but never actually tried it so I decided to create a new post with updated information.

I do not possess the math skills to completely understand the nuances involved, but based on these two blog posts (original Haversine formula and the Vicinity great-circle distance formula), I understand the latter is more accurate when distances are closer together.

    protected function queryVicinity($lat, $lon)
    {
    	$radius = 100; // radius of bounding circle in kilometers
	//$EMR = 6371; // earth's mean radius, km
	$EMR = 3959; // earth's mean radius, miles

	// first-cut bounding box (in degrees)
	$maxLat = $lat + rad2deg($radius/$EMR);
	$minLat = $lat - rad2deg($radius/$EMR);
	// compensate for degrees longitude getting smaller with increasing latitude
	$maxLon = $lon + rad2deg($radius/$EMR/cos(deg2rad($lat)));
	$minLon = $lon - rad2deg($radius/$EMR/cos(deg2rad($lat)));

	$sql = '
		SELECT *, (ATAN2(SQRT(POWER(COS(RADIANS(lat)) * SIN(RADIANS(lon-:lon)), 2) + 
		POWER(COS(RADIANS(:lat)) *SIN(RADIANS(lat)) - 
		(SIN(RADIANS(:lat2)) * COS(RADIANS(lat)) * COS(RADIANS(lon-:lon2))), 2)), 
		SIN(RADIANS(:lat3)) * SIN(RADIANS(lat)) + 
		COS(RADIANS(:lat4)) * COS(RADIANS(lat)) * COS(RADIANS(lon-:lon3))) * :EMR) AS distance
	      	FROM (
	        	SELECT * FROM items WHERE lat BETWEEN :minLat 
	        	AND :maxLat AND lon BETWEEN :minLon AND :maxLon 
	        	// any other filters you want to add
	      	) AS FirstFilter
	   	HAVING distance < :radius 
           	ORDER BY distance 
		LIMIT 0, 20';

	$params = [
		'lon' => $lon,
	    	'lat' => $lat,
	    	'lat2' => $lat,// Repeated values not allowed so increment them in the params array
	    	'lon2' => $lon,
	    	'lat3' => $lat,
	    	'lat4' => $lat,
	    	'lon3' => $lon,
		'EMR' => $EMR, 
		'minLat' => $minLat, 
		'maxLat' => $maxLat, 
		'minLon' => $minLon, 
		'maxLon' => $maxLon, 
		'radius' => $radius,
	];

	$results = DB::select(DB::raw($sql), $params);
	return Model::hydrate($results);
    }

Some discussion on the above code:

  • My fancy code snippet plug-in no longer works so I had to paste it in by hand so I apologize in advance for the formatting – you may need to scroll the code to the right to see all of it. Yes, I tried a couple of other plug-ins and they only annoyed me further.
  • The last two lines form the Eloquent query. While I have seen other examples that wrap the SQL in a closure to make it look more Eloquent, this method seems cleaner to me so that’s what I’m using – pass the SQL and params into a RAW query and Eloquent does the necessary guard against SQL injection.
  • Each parameter substitution from the params array is substituted one at a time and cannot be duplicated, hence the numbering and duplication of values in the params array.
  • The center FROM section forms a bounding box to reduce the number of rows included in the calculation so the query will run quicker. Filtering as much as possible here is recommended.
  • Use HAVING instead of WHERE, otherwise you get a SQL error about a missing column ‘distance’. It allows you to use the column alias in the query vs re-running the math.
  • Adjust the bounding circle by changing the radius and use the correct earth-mean-radius for miles or kilometers.
  • Even though the model does not include a distance column by default, I noticed it IS included when hydrating from this query. Bonus!

Results

I ran tests for several pages with both formulas on a table loaded with 1 million records. In all of my tests, the results were consistent. The “better” formula took significantly longer to run (0.76 vs 1.18 seconds) and this was similar no matter which page I used so I’m sticking with the old formula. Here’s the final function:

    protected function queryDistance($lat, $lon)
    {
    	$radius = 100; // radius of bounding circle in kilometers
	//$EMR = 6371; // earth's mean radius, km
	$EMR = 3959; // earth's mean radius, miles

	// first-cut bounding box (in degrees)
	$maxLat = $lat + rad2deg($radius/$EMR);
	$minLat = $lat - rad2deg($radius/$EMR);
	// compensate for degrees longitude getting smaller with increasing latitude
	$maxLon = $lon + rad2deg($radius/$EMR/cos(deg2rad($lat)));
	$minLon = $lon - rad2deg($radius/$EMR/cos(deg2rad($lat)));
	$d2rLat = deg2rad($lat);
	$d2rLon = deg2rad($lon);

	$sql = '
		SELECT *, (acos(sin(:d2rLat)*sin(radians(lat)) + 
		cos(:d2rLat2)*cos(radians(lat))*cos(radians(lon)-:d2rLon)) * 
		:EMR) AS distance
	      	FROM (
	        	SELECT * FROM items WHERE lat BETWEEN :minLat 
	        	AND :maxLat AND lon BETWEEN :minLon AND :maxLon 
	      	) AS FirstFilter
		HAVING distance < :radius 
		ORDER BY distance 
		LIMIT 0, 20'; 

	$params = [
		'd2rLat' => $d2rLat,
		'd2rLat2' => $d2rLat,
		'd2rLon' => $d2rLon,
		'EMR' => $EMR,
		'minLat' => $minLat,
		'maxLat' => $maxLat,
		'minLon' => $minLon,
		'maxLon' => $maxLon,
		'radius' => $radius,
	];

	$results = DB::select(DB::raw($sql), $params);
	return Model::hydrate($results);
    }

I hope this helps.

Laravel 5 Eloquent Stored Procedure Spatial Query

PHP is the internet language I’m most familiar and I’m using it to write an API back-end as a side-project (read that as doing this for fun).

Laravel is a great PHP framework not only for quickly creating front end websites, but also has some nice features for the back end as well. I’ve explored a few other frameworks over the past few months while I sharpened my PHP skills and settled on this stack for it’s completeness, ease of use and support. The documentation is good and there are plenty of helpful results from internet searches. They even have podcasts!

The API needed a route that performs a MySQL Spatial Query to retrieve database records close to a location and return the results as a JSON object. A query of this type is typically used as a store locator – tell me where you are and I’ll show you stores nearby. Getting the query working without special add-on libraries was one goal plus I was curious about performance of various query methods and it took me a while to put all of the pieces together, hence the post.

Outline of My Journey

  1. Start with A Standard PDO SQL Query
  2. Convert the SQL to a Stored Procedure
  3. Fetch Data with Laravel/Eloquent
  4. Convert the Fetched Array to Eloquent Model Objects

Start with A Standard PDO SQL Query

I found what is typically referred to as the Haversine formula though I’m told it’s really the spherical law of cosines. I’ll just call it math stuff.

The nested MySQL SQL statement for this is something like:

$sql = "
   SELECT *,
      acos(sin(:locLat)*sin(radians(lat)) + cos(:locLat)*cos(radians(lat))*cos(radians(lon)-:locLon)) * :EMR AS D
      FROM (
         SELECT *
         FROM items
         WHERE lat BETWEEN :minLat AND :maxLat
         AND lon BETWEEN :minLon AND :maxLon
         // NOTE: I put other filters in here, but they aren't relevant to this example
      ) AS FirstCut
   WHERE acos(sin(:locLat)*sin(radians(lat)) + cos(:locLat)*cos(radians(lat))*cos(radians(lon)-:locLon)) * :EMR < :radius
   ORDER BY D
   LIMIT 0, 20 ";
   

Piece of cake, right? And here’s the PDO query:

$radius = 100; // radius of bounding circle in kilometers
//$EMR = 6371; // earth's mean radius, km
$EMR = 3959; // earth's mean radius, miles

// first-cut bounding box (in degrees)
$maxLat = $lat + rad2deg($radius/$EMR);
$minLat = $lat - rad2deg($radius/$EMR);
// compensate for degrees longitude getting smaller with increasing latitude
$maxLon = $lon + rad2deg($radius/$EMR/cos(deg2rad($lat)));
$minLon = $lon - rad2deg($radius/$EMR/cos(deg2rad($lat)));

$params = array(
   'locLat' => deg2rad($lat),
   'locLon' => deg2rad($lon),
   'minLat' => $minLat,
   'minLon' => $minLon,
   'maxLat' => $maxLat,
   'maxLon' => $maxLon,
   'radius' => $radius,
   'EMR'    => $EMR,
);

$statement = $pdo->prepare($sql);
$statement->execute($params); // Prevent SQL-injection
return $statement->fetchAll(PDO::FETCH_OBJ);

Total Execution Time: 17.015 milliseconds.

This returns a PHP stdClass array of database records.

Brilliant! I found a better formula while working on this so I’ll change it after some sleep, but I’ll proceed with the example using this one. Also, I used Faker to seed the database with 50,000 records to test performance on a large number of rows. Those of you with millions of rows, please stop snickering.

This example has a lot of code to deal with. I’ve never used stored procedures before, wanted to learn how and anticipated a performance increase by using one so lets find out.

Convert the SQL to a Stored Procedure

Following some searches and a little trial and error, I ended up with

    CREATE PROCEDURE `sp_get_all_items`(IN `locLat` FLOAT, IN `locLon` FLOAT, IN `EMR` INT, IN 	`minLat` FLOAT, IN `maxLat` FLOAT, IN `minLon` FLOAT, IN `maxLon` FLOAT, IN `radius` INT)
	DETERMINISTIC

	BEGIN
	SELECT *, 
		acos(sin(locLat)*sin(radians(lat)) + cos(locLat)*cos(radians(lat))*cos(radians(lon)-locLon)) * EMR AS D 
	    FROM (
    	    SELECT * 
        	FROM items 
	        WHERE lat BETWEEN minLat AND maxLat 
    	    AND lon BETWEEN minLon AND maxLon 
	    ) AS FirstCut 
    	WHERE acos(sin(locLat)*sin(radians(lat)) + cos(locLat)*cos(radians(lat))*cos(radians(lon)-locLon)) * EMR < radius 
	    ORDER BY D;
	END
    

It’s not all that different from the standard SQL query above, but it allowed me to replace that massive SQL statement in my code with:

$input = $locLat . ', ' . $locLon . ', ' . $EMR . ', ' . $minLat . ', ' . $maxLat . ', ' . $minLon . ', ' . $maxLon . ', ' . $radius;
$statement = $pdo->prepare('call sp_get_all_items(' . $input . ')');

Total Execution Time: 17.497 milliseconds.

This returns a PHP stdClass array of database records.

OK. Roughly the same performance for this case. The stored procedure type-hints the passed-in values (user’s current latitude and longitude) which avoids a SQL-injection attack where we used a prepared statement in the original example. The code looks a little neater plus this method helps with the next step.

Fetch Data with Laravel/Eloquent

Laravel includes an ORM package called Eloquent that simplifies standard database interactions. It’s still PDO under the hood, but instead of sprinkling this all over your code

$statement = $pdo->prepare('SELECT * FROM items');
$statement->execute();
return $statement->fetchAll(PDO::FETCH_OBJ);

use:

Item->get();

Those two blocks perform the exact same query. The Eloquent version is more… well… eloquent. Building typical queries is much nicer as you can just tag more modifiers onto the same line. RTFM. It’s pretty cool.

Building this nested query completely in Eloquent proved difficult partially ’cause I’m a n00b at this and it’s an interesting query. For this case, we use an Eloquent raw query. My solution is:

$result = DB::select(DB::raw('call sp_get_all_items(' . $input . ')'));

This returns a PHP stdClass array of database records. Same as above. So there's three ways to achieve the same result with similar performance. Pick the one you like.

In order to take full advantage of Eloquent’s capabilities, however,  we need to convert the PHP stdClass array into a Collection of Eloquent Model objects. FOR LOOP! NO! There must be an easier way. There is!

Convert the Fetched Array to Eloquent Model Objects

There are five columns in this table that are foreign keys to other tables. The term is Database Normalization. Fetching all of those associated objects is necessary to generate a proper response from this particular route. Yes, joins are an option, but Eloquent handles this for us with the relationship embedded in the model object.

This little gem was hard to find.

return Item::hydrate($result);

Total Execution Time: 16.328 milliseconds.

Awesome! That one line takes the PHP stdClass array of database records as an input and builds a collection of Item objects including the five joined objects and it is extremely fast.

The code is simplified and the data can be passed directly to Fractal which gives us our JSON object. PHP’s json_encode() also works, but Fractal offers more options via transformer classes for formatting the JSON returned from the API.

Final Thoughts

This example shows three ways of performing SQL queries and for this example, they all performed about the same. The difference is the amount of code required to accomplish the task. The Laravel coding style follows the single responsibility principle resulting in much smaller functions and hopefully fewer bugs. I hope you found this useful.

Update (May 3, 2017): New test & results.