Wednesday, May 30, 2012

R Postgresql

Reading from PostgreSQL DB

  1. Install RPostgreSQL
    :library(RPostgreSQL)
  2. Open a db connection using the dbConnect function
    :con<-dbConnect(PostgreSQL(),user="xxx", password="xxx",host="localhost", client.flag=CLIENT_MULTI_RESULTS)
  3. Use dbGetQuery to initiate a SELECT and return the result sets
    :sql<-"SELECT * FROM table WHERE name='xxx'"
    :rows<- dbGetQuery(con,sql)
  4. Use dbDisconnect to terminate the connection
    :dbDisconnect(con)

Monday, May 28, 2012

tiger geocoder 2011


Settings
  1. Ubuntu 12.04
  2. Postgresql 9.1
  3. Postgis 2.0
  4. Tiger 2011 Data
Installing Postgresql + postgis2.0
  1. sudo apt-get install python-software-properties
  2. sudo apt-add-repository ppa:sharpie/for-science
  3. sudo apt-add-repository ppa:sharpie/postgis-nightly
  4. sudo apt-get update
  5. sudo apt-get install postgresql-9.1-postgis
  6. sudo apt-get install build-essential postgresql-9.1 postgresql-server-dev-9.1 libxml2-dev proj libjson0-dev xsltproc docbook-xsl docbook-mathml gettext postgresql-contrib-9.1 pgadmin3
Updating gdal and geos libraries
  1. sudo apt-add-repository ppa:olivier-berten/geo
  2. sudo apt-get update
  3. sudo apt-get install libgdal-dev
  4. sudo apt-get install g++ ruby ruby1.8-dev swig swig2.0 unzip
  5. wget http://download.osgeo.org/geos/geos-3.3.3.tar.bz2
  6. tar xvfj geos-3.3.3.tar.bz2
  7. cd geos-3.3.3
  8. ./configure --enable-ruby --prefix=/usr
  9. make
  10. sudo make install
Building postgis2.0
  1. wget http://postgis.refractions.net/download/postgis-2.0.0.tar.gz
  2. tar zxvf postgis-2.0.0.tar.gz
  3. cd postgis-2.0.0
  4. ./configure --with-gui
  5. make
  6. sudo make install
  7. sudo ldconfig
  8. sudo make comments-install
  9. sudo ln -sf /usr/share/postgresql-common/pg_wrapper /usr/local/bin/shp2pgsql
  10. sudo ln -sf /usr/share/postgresql-common/pg_wrapper /usr/local/bin/pgsql2shp
  11. sudo ln -sf /usr/share/postgresql-common/pg_wrapper /usr/local/bin/raster2pgsql
Building spatial db template
  1. createdb postgis2
  2. psql -d postgis2
  3. psql>create extension postgis;
  4. psql>create extension postgis_topology;
Building geocoder
  1. createdb -T postgis2 geocoder
  2. wget http://postgis.refractions.net/download/postgis-2.1.0SVN.tar.gz
  3. tar zxvf postgis-2.1SVN.tar.gz
  4. EDIT postgis-2.1SVN/extra/tiger-geocoder/tiger2011/tiger_loader.sql
    export PGBIN=/usr/bin
    export PGPORT=5432
    export PGHOST=localhost
    export PGUSER=postgres
    export PGPASSWORD=XXX
    export PGDATABASE=geocoder
    PSQL=${PGBIN}/psql
    SHP2PGSQL=${PGBIN}/shp2pgsql
  5. EDIT postgis-2.1SVN/extra/create_geocode.sh
    PGPORT=5432
    PGHOST=localhost
    PGUSER=postgres
    PGPASSWORD=XXX
    THEDB=geocoder
    PGSQL_CMD=/usr/bin/psql
    PGCONTRIB=/usr/share/postgresql/9.1/contrib
    //unmark the following two lines
    ${PGSQL_CMD} -d "${THEDB}" -c "CREATE EXTENSION fuzzystrmatch.sql"
    ${PGSQL_CMD} -d "${THEDB}" -c "ALTER DATABASE ${THEDB} SET search_path=public,tiger;"
  6. sh ./create_geocode.sh
Downloading tiger data
  1. sudo mkdir -p /gisdata/temp
  2. sudo chown -R postgres:postgres /gisdata
  3. psql -d geocoder
  4. psql>\a
  5. psql>\t
  6. psql>\o nation.sh
  7. psql>select loader_generate_nation_script('sh');
  8. psql>\q
  9. sh ./nation.sh
  10. psql -d geocoder
  11. psql>\a
  12. psql>\t
  13. psql>\o all_states.sh
  14. psql>select loader_generate_script(ARRAY['AK','AR','AS',...],'sh');
  15. psql>\q
  16. sh ./all_states.sh
  17. psql -d geocoder
  18. psql>select install_missing_indexes();
Testing geocoder
  1. psql -d geocoder
  2. get geocode of "27 Waterview Drive, Shelton, CT 06484)psql>select g.rating,ST_X(geomout) AS lon,ST_Y(geomout) AS lat, (addy).* from geocode('27 Waterview Drive, Shelton, CT 06484') AS g;  
  3. get reverse-geocode of point(-73.0811694362766 41.2931353732512)psql>select pprint_addy(r.addy[1]) as st1, pprint_addy(r.addy[2]) as st2, pprint_addy(r.addy[3]) as st3, array_to_string(r.street,',') as cross_streets from reverse_geocode(st_geomfromtext('POINT(-73.0811694362766 41.2931353732512)',4269),true) as r; 

Thursday, May 24, 2012

Maven

Why maven?
Ans: you want to manage jars in your projects.  you want to manage dependencies and versions in your projects.  you want to manage your project structures.  you want to manage the building processes.

How setup maven
1. export M2_home=/directory/to/maven
    export PATH=/directory/to/maven:${PATH}

How to use

Wednesday, May 23, 2012

R Spatial

1. Load library
    library(rgdal)
    library(maptools)
    library(sp)
2. setwd("directory")  #set default working directory

3.data<-read.table()

4. str(data)

5. coordinates(data) "LAt+Lon"

6. str(data)

7. proje4string(data) = CRS("+init=epsg:4326")

8. spplot()

XX: install rgdal in MacOS.
install.packages('rgdal',repos="http://www.stats.ox.ac.uk/pub/RWin")


Geohashing

Geohash function has been included in Postgis 2.0.  To generate a geohash code, one can use "st_geohash(geom_column, [integer 1-20])".
After inserting/updating geohash code, remember to create b-tree index for the geohash codes.

Thursday, May 17, 2012

Algorithms in Mahout

From IBM:
1. Logistic Regression (Stochastic Gradient Descent)
2. Hidden Markov Models (HMM)
3. Dirichlet Clustering
4. Spectral Clustering
5. Minhash Clustering
6. Numerous Recommender Improvements
7. Collections

JVM tuning

Key JVM

-Xmx                           -Xmx2G    maximum size of java heap.
-Xms                           -Xms1G     minimum size of java heap.
-server/client                 defines type of jvm
-d32/-d64                     specify the type of JVM
-XX:+NewRatio           -XX:+NewRatio=12     only 1/12 of the heap will be reserved for short-lived objects.
-XX:+UseParallelGC    use multi-cores or processor to do garbage collection in parallel or not.
-XX:+UseParallelOldGC


Add new Harddrive to ubuntu

Situation: the old /home directory is too small.  I want to add a new hard drive and mount it as /home.

1. sudo fdisk -l to check whether ubuntu detects the new harddrive and identify the new harddrive identity (the one without mounting point, ex. sdb)

2. sudo fdisk /dev/sdb
     press 'n' to create new partition.
     press 'p' to add primary partition.
     press '1' to name partition number.
     enter the first and the last section
     press 'L' to list available file sysetem type
     choice '83' (linux)
     press 'w' to write to the table and exit

3. sudo partprobe /dev/sdb

4. sudo mkfs /dev/sdb1 -t ext4

5. sudo mkdir /mnt/sdb1

6. sudo chmod 777 /mnt/sdb1

7. sudo mount /dev/sdb1 /mnt/sdb1 -t ext4

8. sudo cp -R /home /mnt/sdb1

9. sudo vi /etc/fstab
    add "/dev/sdb1 /home ext4 defaults 0 0 to the file, save and quit

10. backup the home folder to other places, then delete all in /home

11.  sudo mount -a

12. reboot and done.

Tuesday, May 8, 2012

PostGIS examples

From http://postgis.refractions.net/pipermail/postgis-users/2007-July/016358.html

SQL Commands for the Demonstration

Simple Spatial SQL

 create table points ( pt geometry, name varchar );
 insert into points values ( 'POINT(0 0)', 'Origin' );
 insert into points values ( 'POINT(5 0)', 'X Axis' );
 insert into points values ( 'POINT(0 5)', 'Y Axis' );
 select name, AsText(pt), ST_Distance(pt, 'POINT(5 5)') from points;
 drop table points;


Loading Shape Files

 cd \postgis-workshop\data
 pg_setenv
 shp2pgsql -i -s 3005 bc_pubs.shp bc_pubs > bc_pubs.sql
 notepad bc_pubs.sql
 pg_shpsql
 psql -U postgres -f bc_data.sql -d postgis
   
Creating Spatial Indexes

 create index bc_roads_gidx on bc_roads using gist ( the_geom );
 create index bc_pubs_gidx on bc_pubs using gist ( the_geom );
 create index bc_voting_areas_gidx on bc_voting_areas using gist ( the_geom );
 create index bc_municipality_gidx on bc_municipality using gist ( the_geom );
 create index bc_hospitals_gidx on bc_hospitals using gist ( the_geom );

Using Spatial Indexes

 -- Unindexed query
 select gid, name from bc_roads where _ST_Crosses(the_geom, ST_GeomFromText('LINESTRING(1220446 477473,1220417 477559)', 3005));
 -- Indexed query
 select gid, name from bc_roads where ST_Crosses(the_geom, ST_GeomFromText('LINESTRING(1220446 477473,1220417 477559)', 3005));

Indexes and Query Plans

 -- Unindexed query plan
 explain select gid, name from bc_roads where _ST_Crosses(the_geom, ST_GeomFromText('LINESTRING(1220446 477473,1220417 477559)', 3005));
 -- Indexed query plan
 explain select gid, name from bc_roads where ST_Crosses(the_geom, ST_GeomFromText('LINESTRING(1220446 477473,1220417 477559)', 3005));

PostgreSQL Optimization

  open postgresql.conf

Spatial Analysis in SQL

 -- What is the total length of all roads in the province, in kilometers?
 select sum(ST_Length(the_geom))/1000 as km_roads from bc_roads;

 -- How large is the city of Prince George, in hectares?
 select ST_Area(the_geom)/10000 as hectares from bc_municipality where name = 'PRINCE GEORGE';

 -- What is the largest municipality in the province, by area?
 select name, ST_Area(the_geom)/10000 as hectares from bc_municipality order by hectares desc limit 1;

Data Integrity

 -- How many invalid voting area polygons?
 select count(*) from bc_voting_areas where not ST_IsValid(the_geom);

Distance Queries

 -- Where is the Tabor Arms pub?
 select ST_AsText(the_geom) from bc_pubs where name ilike 'Tabor Arms%';

 -- How many Unity Party supporters live within 2km of Tabor Arms pub?
 select sum(upbc) as unity_voters from bc_voting_areas where ST_DWithin(the_geom, ST_GeomFromText('POINT(1209385 996204)', 3005), 2000);



Spatial Joins

 -- Find all pubs located within 250 meters of a hospital.
 select h.name, p.name from bc_hospitals h, bc_pubs p where ST_DWithin(h.the_geom, p.the_geom, 250);

 -- Summarize the 2000 provincial election results by municipality.
 select m.name, sum(v.ndp) as ndp, sum(v.lib) as liberal, sum(v.gp) as green, sum(v.upbc) as unity, sum(v.vtotal) as total from bc_voting_areas v, bc_municipality m where ST_Intersects(v.the_geom, m.the_geom) group by m.name order by m.name;

Overlays 

 -- Clip the voting areas by the Prince George boundary.
 create table pg_voting_areas as select ST_Intersection(v.the_geom, m.the_geom) as intersection_geom, ST_Area(v.the_geom) as va_area, v.*, m.name from bc_voting_areas v, bc_municipality m where ST_Intersects(v.the_geom, m.the_geom) and m.name = 'PRINCE GEORGE';

 -- What is the area of the clipped features?
 select sum(ST_Area(intersection_geom)) from pg_voting_areas;

 -- How does that compare to the clipping polygon? (Should be the same.)
 select ST_Area(the_geom) from bc_municipality where name = 'PRINCE GEORGE';

Coordinate Projection

 -- What is the proj4 definition of srid 3005?
 select proj4text from spatial_ref_sys where srid=3005;

 -- What are the native coordinate of the first road in the roads table?
 select ST_AsText(the_geom) from bc_roads limit 1;

 -- What do those coordinates look like in lon/lat?
 select ST_AsText(ST_Transform(the_geom,4326)) from bc_roads limit 1;

Exercises

 -- What is the perimeter of the municipality of Vancouver?
 select ST_Perimeter(the_geom) from bc_municipality where name = 'VANCOUVER';
 
 -- What is the total area of all voting areas in hectares?
 select sum(ST_Area(the_geom))/10000 as hectares from bc_voting_areas;
 
 -- What is the total area of all voting areas with more than 100 voters in them?
 select sum(ST_Area(the_geom))/10000 as hectares from bc_voting_areas where vtotal > 100;
 
 -- What is the length in kilometers of all roads named Douglas St?
 select sum(ST_Length(the_geom))/1000 as kilometers from bc_roads where name = 'Douglas St';

Advanced Exercises

 -- What is the length in kilometers of Douglas St in Victoria?
 select sum(ST_Length(r.the_geom))/1000 as kilometers from bc_roads r, bc_municipality m where ST_Contains(m.the_geom, r.the_geom) and r.name = 'Douglas St' and m.name = 'VICTORIA';
 
 -- What two pubs have the most Green Party supporters within 500 meters of them?
 select p.name, p.city, sum(v.gp) as greens from bc_pubs p, bc_voting_areas v where ST_DWithin(p.the_geom, v.the_geom, 500) group by p.name, p.city order by greens desc limit 2;

 -- What is the latitude of the most southerly hospital in the province?
 select ST_Y(ST_Transform(the_geom,4326)) as latitude from bc_hospitals order by latitude asc limit 1;

 -- What were the percentage NDP and Liberal vote within the city limits of Prince George in the 2000 provincial election?
 select 100*sum(v.ndp)/sum(v.vtotal) as ndp, 100*sum(v.lib)/sum(v.vtotal) as liberal from bc_voting_areas v, bc_municipality m where ST_Contains(m.the_geom, v.the_geom) and m.name = 'PRINCE GEORGE';

 -- What is the largest voting area polygon that has a hole?
 select gid, id, ST_Area(the_geom) as area from bc_voting_areas where ST_NRings(the_geom) > 1 order by area desc limit 1;

 -- How many NDP voters live within 50 meters of Simcoe St in Victoria?
 select sum(v.ndp) as ndp from bc_voting_areas v, bc_municipality m, bc_roads r where ST_Contains(m.the_geom, r.the_geom) and r.name = 'Simcoe St' and m.name = 'VICTORIA' and ST_DWithin(r.the_geom, v.the_geom, 50);

Wednesday, May 2, 2012

Hadoop Demo

Create MapReduce application:
  a. specify input and output locations
      Input/Output => <key,value> 
      key and value should implement "Writable",  and key should implement "WritableComparable".
  b. specify map and reduce functions
  c. job client submit the job and configuration to JobTracker.

Example1: WordCount
  public class WordCount{
         public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable{     
                private final static IntWritable one = new IntWritable(1); //value
                private Text word = new Text();//use word as key
                public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException{
                          String line = value.toString();
                          StringTokenizer tokenizer = new StringTokenizer(line);
                          while(tokenizer.hasMoreTokens()){
                                word.set(tokenizer.nextToken()); //set key
                                output.collect(word,one); //value will always be one
                          }
               }
        } 
        public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable>{
               public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException{
                         int sum=0
                         //for values with the same key, sum up the value (1) and output <key, sum>
                         while(values.hasNext()){
                                 sum+=values.next().get(); 
                         }
                         output.collect(key,new IntWritable(sum));
               }
        }

        public static void main(String[] args) throws Exception{
              JobConf conf = new JobConf(WordCount.class);
              conf.setJobName("wordcount");
              conf.setOutputKeyClass(Text.class); //word
              conf.setOutputValueClass(IntWritable.class);//sum
              conf.setMapperClass(Map.class);
              //conf.setOutputKeyComparatorClass(Comparator.class));  //control the grouping of output
              conf.setCombinerClass(Reduce.class);
              conf.setReducerClass(Reduce.class);
              conf.setInputFormat(TextInputFormat.class);//textfile
              conf.setOutputFormat(TextOutputFormat.class);//textfile
              FileInputFormat.setInputPaths(conf, new Path(args[0]));//location of input file
              FileOutputFormat.setOutputPath(conf, new Path(args[1]));//location of output file
              JobClient.run(conf);
        }
   }


PostgreSQL+PostGIS cheat sheet

1. Distinct Count
    solution 1:
    select count(*) from (select distinct * from table where column='') as dis

2. Check redundant tuple
    soluction 1 (self join)
       SELECTp.place_id, p.href,s.place_id,s.href
       FROM place p
       INNER JOIN place s ON p.href=s.href
      WHERE p.place_id &s s.place_id
      ORDER BY p.place_id, s.place_id;

3. Materialized View (http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views)
    a. snapshot (view definition is never changed)
        1) create a table to store view info
            CREATE TABLE matviews(
                      mv_name NAME NOT NULL PRIMARY KEY,
                      b_name NAME NOT NULL,
                      last_refresh TIMESTAMP WITH TIME ZONE
            );


        2)create a function to insert a row in Mat_VIEWS and create a materialized view
            CREATE OR REPLACE FUNCTION create_matview(NAME,NAME)
                   RETURNS VOID
                   SECURITY DEFINER
                   LANGUAGE plpgsql AS '
                         DECLARE
                                 matview ALIAS FOR $1;
                                 view_name ALIAS FOR $2;
                                 entry matviews%ROWTYPE;
                         BEGIN
                                 SELECT * INTO entry FROM matviews WHERE mv_name=matview;
                   
                                 IF FOUND THEN
                                           RAISE EXCEPTION ''Materialized view ''''%'''' already exists.'', matview;
                                 END IF;


                                 EXECUTE ''REVOKE ALL ON '' || view_name || ''FROM PUBLIC'';
                                 EXECUTE "GRANT SELECT ON '' || view_name || '' TO PUBLIC'';
                                 EXECUTE ''CREATE TABLE '' || matview|| AS SELECT * FROM ''||view_name;
                                 EXECUTE ''REVOKE ALL ON '' || matview || ' FROM PUBLIC'';
                                 EXECUTE ''GRANT SELECT ON '' || matview || '' TO PUBLIC'';


                                 INSERT INTO matviews (mv_name, b_name,last_refresh)
                                 VALUES(matview, view_name, CURRENT_TIMESTAMP);


                                 RETURN;
                         END
                  ';
         3) create a function to drop a materialized view
             CREATE OR REPLACE FUNCTION drop_matview(NAME) RETURN VOID
                   SECURITY DEFINER
                   LANGUAGE plpgsql AS '
                         DECLARE
                              matview ALIAS FOR $1;
                              entry matviews%ROWTYPE;
                         BEGIN
                              SELECT * INTO entry FROM matview WHERE mv_name=matview;
                              IF NOT FOUND THEN
                                        RAISE EXCEPTION ''Materialized view % does not exist.'',matview;
                              END IF;
                              EXECUTE ''DROP TABLE'' || matview;
                              DELETE FROM matviews WHERE mv_name=matview;
                              RETURN;
                         END
                  ';


          4) create a function to refresh a materialized view
              CREATE OR REPLACE FUNCTION refresh_matview(NAME) RETURNS VOID
                    SECURITY DEFINER
                    LANGUAGE plpgsql AS '
                         DECLARE
                               matview ALIAS $1;
                               entry matviews%ROWTYP;
                         BEGIN
                               SELECT * INTO entry FROM matviews WHERE mv_name=matview;
                               IF NOT FOUND THEN
                                         RAISE EXCEPTION ''Materialized view % does not exist.'',matview;
                               END IF;
                               EXECUTE '' DELETE FROM '' || matview;
                               EXECUTE ''INSERT INTO ''||matview || '' SELECT * FROM '' || entry.b_name;
                               UPDATE matviews
                                        SET last_refresh = CURRENT_TIMESTAMP
                                        WHERE mv_name=matview;
                               RETURN;
                         END


                   ';


   b. eager (update whenever the base view is changed)
4. delete existing duplicate rows in a table

    Before execute the following statement, make sure you have enough buffers for the session (refer to cheat 5 and 6).
 
    solution1
    DELETE FROM table USING table ta
    WHERE table.key=ta.key AND table.id>ta.id;


    solution2
    DELETE FROM table
    WHERE ctid NOT IN
    (SELECT MAX(s.ctid) FROM table s GROUP BY s.column_has_be_distinct);


    solution3
    CREATE OR REPLACE FUNCTION remove_duplicates(text,text) RETURN VOID AS $$
         DECLARE
                tablename ALIAS FOR $1;
                duplicate_column ALIAS FRO $2;
         BEGIN
                EXECUTE 'CREATE TEMPORARY TABLE TABLE_DISTINCT_' ||tablename||'
                           AS (SELECT DISTINCT ON ('||duplicate_column||') * FROM ' ||tablename||
                                  ORDER BY '||duplicate_column||' ASC);';
               EXECUTE 'DELETE FROM ' || tablename||';';
               EXECUTE 'INSERT INTO ' || tablename||' (SELECT * FROM _DISTINCT_' || tablename||');';
               EXECUTE 'DROP TABLE_DISTINCT_'|| tablename||';';
               RETURN;
         END;
   $$LANGUAGE plpgsql;


   solution4
   CREATE TEMPORARY TABLE t_temp ON COMMIT DROP AS
   SELECT DISTINCT ON ('column_name') * FROM table;
   TRUNCATE table;
   INSERT INTO table SELECT * FROM t_temp;


5. find out the size of table
    solution:
           SELECT pg_size_pretty(pg_relation_size('tablename'));

6. set temporary buffers
    solution:
          SET temp_buffers =' XXXMB';

7. delete table rows which some columns in that table (a) don't exist in the other table (b).
    solution:
          DELETE FROM a WHERE a.column_name NOT IN (SELECT column_name FROM b);

8. Backup
   Solution1:
          pg_dump
          EX.  login db at localhost as postgres and dump database poi2 into poi.backup.gz
                  pg_dump -u postgres -h localhost poi2 | gzip>poi.backup.gz


   Solution2:
         file system level backup
         Ex. 
                tar -cf poi2.backup.tar /usr/local/pgsql/data
         PS.  The database server must be shutdown before doing file system level backup and restore.


   Solutioin3:
        Continuous archiving
        EX. 
               a. enable wal in postgresql.conf
                   wal_level=archive
                   archive_mode=on
                   archive_command='test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
               b.SELECT pg_start_backup('label',ture);
               c. perform file system level backup, like tar
               d. SELECT pg_stop_backup();
touch /var/lib/pgsql/backup_in_progress
psql -c "select pg_start_backup('hot_backup');"
tar -cf /var/lib/pgsql/backup.tar /var/lib/pgsql/data/
psql -c "select pg_stop_backup();"
rm /var/lib/pgsql/backup_in_progress
tar -rf /var/lib/pgsql/backup.tar /var/lib/pgsql/archive/

   Solution4: 
        pg_basebackup
        EX.
               a. enable local replicate permission for posgres @localhost
               b. enable max_wal_senders=1
               c. pg_basebackup -U postgres -D /backup/data -F tar -z -l label


9. PostGIS import shape files into one table
   Solution 1:
     shp2pgsql -s 915 -a[-c] -g geom shapfile.shp schema.table | psql -U username -W dbname
     -s: srid, -g: geometry_column_name


10. Test if geometry column is valid
  Solution 1:
      select ST_IsValid(geom) from table;
  Solution 2:
      select ST_IsValidDetail(geom) from table;
  Solution 3:
      select ST_IsEmpty(geom) from table;
11. Test the geometry type
  Solution 1:
      select ST_GeometryType(geom) from table;


12. Mapping points in table a to a polygon in a table b
   Solution1:
       select * from table1
       where st_within(geom, polygon);


13. get a boundary from a collection of geometries
   Solution 1:
       select st_extent(geom) from table;


14. Performance tuning:
      Install pgtune (apt-get update; apt-get install pgtune)  -> pgtune -i postgresql.conf -o psql.conf


      create dedicate index for query (considering the order of fields, and the join condition)
      Use multi-fields index
      the order of fields in multi-fields index (A, B, C can be used to find AB, but not BC)
      Vacuum the tables after delete, update, insert, or create index.
      Order of fields and order of joint have a big impact on performance.
      Explicit casting: "where x=3"   v.s. "where x=cast(3 as smallint)",  "abs(dest) = cast(111 as smallint)" v.s. "dest =111 or dest=-111"  have huge different.
      set enable_seqscan=no
      Avoid nested loop, merge and sort is faster.
      Use temp table
      Add sequence in query to monitor the progress.  select nextval('seq_progress');   select currval('seq_progress');
      Enable stats_start_collector and stats_row_level to get auto vacuum
      Partial index:  create index index_name on table1(column_name) where column_nam like 'America%'
      Specify sort_memory, temp_buffers for each connections or query.