Great circle distance
Great circle distance
Find the distance in kilometres between two points on the surface of the earth. This is just the sort of problem stored functions were made for. For a first order approximation, ignore deviations of the earth's surface from the perfectly spherical. Then the distance in radians is given by a number of trigonometric formulas. ACOS and COS behave reasonably:
COS(lat1-lat2)*(1+COS(lon1-lon2)) - COS(lat1+lat2)*(1-COS(lon1-lon2))
rads = ACOS( --------------------------------------------------------------------- )
2
We need to convert degrees latitude and longitude to radians, and we need to know the length in km of one radian on the earth's surface, which is 6378.388. The function:
set log_bin_trust_function_creators=TRUE;
DROP FUNCTION IF EXISTS GeoDistKM;
DELIMITER |
CREATE FUNCTION GeoDistKM( lat1 FLOAT, lon1 FLOAT, lat2 FLOAT, lon2 FLOAT ) RETURNS float
BEGIN
DECLARE pi, q1, q2, q3 FLOAT;
DECLARE rads FLOAT DEFAULT 0;
SET pi = PI();
SET lat1 = lat1 * pi / 180;
SET lon1 = lon1 * pi / 180;
SET lat2 = lat2 * pi / 180;
SET lon2 = lon2 * pi / 180;
SET q1 = COS(lon1-lon2);
SET q2 = COS(lat1-lat2);
SET q3 = COS(lat1+lat2);
SET rads = ACOS( 0.5*((1.0+q1)*q2 - (1.0-q1)*q3) );
RETURN 6378.388 * rads;
END;
|
DELIMITER ;
-- toronto to montreal (505km):
select geodistkm(43.6667,-79.4167,45.5000,-73.5833);
+----------------------------------------------+
| geodistkm(43.6667,-79.4167,45.5000,-73.5833) |
+----------------------------------------------+
| 505.38836669921875 |
+----------------------------------------------+
You can check and compare sort orders provided by these two collations here:
http://www.collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html
http://www.collation-charts.org/mysql60/mysql604.utf8_unicode_ci.european.html
utf8_general_ci is a very simple collation. What it does - it just
- removes all accents
- then converts to upper case
and uses the code of this sort of "base letter" result letter to compare.
For example, these Latin letters: ÀÁÅåāă (and all other Latin letters "a" with any accents and in any cases) are all compared as equal to "A".
utf8_unicode_ci uses the default Unicode collation element table (DUCET).
The main differences are:
1. utf8_unicode_ci supports so called expansions and ligatures, for example: German letter ß (U+00DF LETTER SHARP S) is sorted near "ss" Letter Œ (U+0152 LATIN CAPITAL LIGATURE OE) is sorted near "OE".
utf8_general_ci does not support expansions/ligatures, it sorts all these letters as single characters, and sometimes in a wrong order.
2. utf8_unicode_ci is *generally* more accurate for all scripts. For example, on Cyrillic block: utf8_unicode_ci is fine for all these languages: Russian, Bulgarian, Belarusian, Macedonian, Serbian, and Ukrainian. While utf8_general_ci is fine only for Russian and Bulgarian subset of Cyrillic. Extra letters used in Belarusian, Macedonian, Serbian, and Ukrainian
are sorted not well.
+/- The disadvantage of utf8_unicode_ci is that it is a little bit slower than utf8_general_ci.
So when you need better sorting order - use utf8_unicode_ci, and when you utterly interested in performance - use utf8_general_ci.
command to determine the version of mysql
optionally you can run the command without the -h