Tuesday, August 14, 2012

JMS - glassfish

JMS (Java Message Service) allows applications to create, send, receive, and read messages.  With JMS, applications can communicate with each other in a loosely coupled, asynchronous, and reliable way.

It is useful in the following use cases:
1) The inventory component can send a message to the factory component when inventory is low.
2) The factory component can send a message to the parts component to get parts assembled.
3) The parts component can send a message to the inventory or supplier component to get parts ready.
4) The factory and parts components can send messages to the accounting component to update their budgets.
5) The business component can publish updated information to the sales force.

JMS architecture
1) A JMS provider (provided by Glassfish JEE container)
2) JMS clients (any JEE application component can be JMS client)
3) Messages (objects that communicate information between JMS clients)
4) Administered objects (Connection factory and destinations)

Messaging domains
1) The point-to-point domain contains queues, senders, and receivers.  Each message has only one consumer.  Senders and receivers have no timing dependency: a receiver can fetch a message whether or not it was running when the message was sent.  The receiver acknowledges the successful processing of each message.

2) The publish-subscribe domain contains publishers, subscribers, and topics.  Consumers subscribe to the topics they are interested in.  Each message can have multiple consumers.  Publishers and subscribers have a timing dependency: a subscriber can consume only messages published after it subscribed.
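
The difference between the two domains can be sketched in plain Java without a JMS provider. This is only a toy in-memory model, not the JMS API: the class names PointToPointQueue and Topic are made up for illustration, and the topic models only the non-durable case described above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Toy in-memory model of the two messaging domains.
// Hypothetical classes for illustration only; this is NOT the javax.jms API.
class MessagingDomainsSketch {

    // Point-to-point: messages wait in the queue until one receiver takes them.
    static class PointToPointQueue {
        private final Queue<String> messages = new ArrayDeque<>();

        void send(String message) {
            messages.add(message);
        }

        // Returns the oldest pending message, or null if the queue is empty.
        String receive() {
            return messages.poll();
        }
    }

    // Publish-subscribe: a message goes to every subscriber registered
    // at the time of publishing; nothing is kept for later subscribers.
    static class Topic {
        private final List<Consumer<String>> subscribers = new ArrayList<>();

        void subscribe(Consumer<String> subscriber) {
            subscribers.add(subscriber);
        }

        void publish(String message) {
            for (Consumer<String> subscriber : subscribers) {
                subscriber.accept(message);
            }
        }
    }

    public static void main(String[] args) {
        PointToPointQueue queue = new PointToPointQueue();
        queue.send("inventory low");            // sent before any receiver exists
        System.out.println(queue.receive());    // the first receiver takes the message
        System.out.println(queue.receive());    // nothing is left for a second receiver

        Topic topic = new Topic();
        List<String> early = new ArrayList<>();
        List<String> late = new ArrayList<>();
        topic.subscribe(early::add);
        topic.publish("price update 1");
        topic.subscribe(late::add);             // subscribed too late for the first message
        topic.publish("price update 2");
        System.out.println(early);
        System.out.println(late);
    }
}
```

In the queue case the message survives until somebody receives it, and only one receiver gets it. In the topic case the late subscriber never sees the first message.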

Programming model
1) administered objects (provided by Glassfish): connection factory and destination.
2) connection factory (an object that clients use to connect to a JMS provider)
Ex.
@Resource(lookup = "/jms/ConnectionFactory")
private static ConnectionFactory connectionFactory;

3) destination
Ex.
@Resource(lookup = "/jms/Queue")
private static Queue queue;

4) connections
Ex.
Connection connection = connectionFactory.createConnection();

5) sessions: sessions are used to create producers, consumers, messages, and queue browsers.
Ex.
Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);  // first argument: false = not transacted

6) message producers
Ex.
MessageProducer producer = session.createProducer(dest);

7) message receivers
Ex.
MessageConsumer consumer = session.createConsumer(dest);

8) message listener
Ex.
Listener myListener = new Listener();  // Listener is your own class that implements MessageListener
consumer.setMessageListener(myListener);

9) messages (header, property and body)
Message types: TextMessage (a String), MapMessage (name-value pairs), BytesMessage (a stream of uninterpreted bytes), StreamMessage (a stream of primitive values), ObjectMessage (a Serializable object), Message (no body).

10) Queue Browser
Ex. QueueBrowser browser = session.createBrowser(queue);



Thursday, June 28, 2012

Running R script on AWS EMR


  • Install Amazon EMR Command Line Interface
    1. Install Ruby (1.8 up)
    2. Download and unzip CLI (http://aws.amazon.com/developertools/2264)
    3. Configure credentials.json
                     {
                        "access_id": "AWS Access Key ID",
                        "private_key":"AWS Secret Access Key",
                        "keypair": "EC2 keypair name",
                        "key-pair-file":"pem location",
                        "log_uri":"s3n://log-location",
                        "region":"us-east-1"
                      }

  • Job Flow Essentials
    1. Creating a Job Flow  (./elastic-mapreduce --create --alive)
    2. Listing all Job Flows (./elastic-mapreduce --list)
    3. Retrieving information about a specific Job Flow (./elastic-mapreduce --describe --jobflow ID)
    4. Adding a step using default parameter values to a Job Flow (./elastic-mapreduce -j ID --stream)
    5. Terminating a Job Flow (./elastic-mapreduce --terminate ID)
    6. Listing all active Job Flows (./elastic-mapreduce --list --active)
  • Streaming Job Flow
                ./elastic-mapreduce --create --stream \
                                     --mapper s3n://[mapper-location] \
                                     --input s3n://[input-location] \
                                     --output s3n://[output-location] \
                                     --reducer s3n://[reducer-location]


Tuesday, June 19, 2012

geohash adjacent codes

An algorithm for finding the neighbors of a geohash code.

  1. Base32: 0123456789bcdefghjkmnpqrstuvwxyz
  2. Neighbors (direction type)
    1. right even: bc01fg45238967deuvhjyznpkmstqrwx
    2. left even: 238967debc01fg45kmstqrwxuvhjyznp
    3. top even: p0r21436x8zb9dcf5h7kjnmqesgutwvy
    4. bottom even: 14365h7k9dcfesgujnmqp0r2twvyx8zb
    5. right odd= top even (p0r21436x8zb9dcf5h7kjnmqesgutwvy)
    6. left odd= bottom even (14365h7k9dcfesgujnmqp0r2twvyx8zb)
    7. top odd= right even (bc01fg45238967deuvhjyznpkmstqrwx)
    8. bottom odd= left even (238967debc01fg45kmstqrwxuvhjyznp)
  3. Borders (direction type)
    1. right even: bcfguvyz
    2. left even: 0145hjnp
    3. top even: przx
    4. bottom even: 028b
    5. right odd= top even (przx)
    6. left odd= bottom even (028b)
    7. top odd= right even (bcfguvyz)
    8. bottom odd= left even (0145hjnp)
  4. function calculateAdjacent(String srcHashCode, String direction)
              srcHashCode = srcHashCode.toLowerCase();
              char lastCharacter = srcHashCode.charAt(srcHashCode.length() - 1);
              String type = (srcHashCode.length() % 2 == 1) ? "odd" : "even";
              String base = srcHashCode.substring(0, srcHashCode.length() - 1);

              // the last character sits on the border of its parent cell,
              // so the parent cell has to move in the same direction first
              if (Borders[direction][type].indexOf(lastCharacter) != -1)
                       base = calculateAdjacent(base, direction);

              return base + BASE32[Neighbors[direction][type].indexOf(lastCharacter)];

 Ex. find the right neighbor of a geohash code a
       calculateAdjacent(a, "right");
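
The lookup above can be written as runnable Java roughly as follows. This is a sketch under a few assumptions: the class name GeohashNeighbors and the "direction type" map keys are my own choices, a one-character hash on a border simply wraps around instead of recursing on an empty prefix, and input validation is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the neighbor lookup described above, using the same tables.
class GeohashNeighbors {
    static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Keys are "<direction> <type>", e.g. "right even" (a naming choice of this sketch).
    static final Map<String, String> NEIGHBORS = new HashMap<>();
    static final Map<String, String> BORDERS = new HashMap<>();

    static {
        NEIGHBORS.put("right even", "bc01fg45238967deuvhjyznpkmstqrwx");
        NEIGHBORS.put("left even", "238967debc01fg45kmstqrwxuvhjyznp");
        NEIGHBORS.put("top even", "p0r21436x8zb9dcf5h7kjnmqesgutwvy");
        NEIGHBORS.put("bottom even", "14365h7k9dcfesgujnmqp0r2twvyx8zb");
        BORDERS.put("right even", "bcfguvyz");
        BORDERS.put("left even", "0145hjnp");
        BORDERS.put("top even", "przx");
        BORDERS.put("bottom even", "028b");
        // Each odd table reuses the even table of another direction.
        String[][] oddFromEven = {
            {"right", "top"}, {"left", "bottom"}, {"top", "right"}, {"bottom", "left"}
        };
        for (String[] pair : oddFromEven) {
            NEIGHBORS.put(pair[0] + " odd", NEIGHBORS.get(pair[1] + " even"));
            BORDERS.put(pair[0] + " odd", BORDERS.get(pair[1] + " even"));
        }
    }

    static String calculateAdjacent(String srcHashCode, String direction) {
        srcHashCode = srcHashCode.toLowerCase();
        char last = srcHashCode.charAt(srcHashCode.length() - 1);
        String key = direction + ((srcHashCode.length() % 2 == 1) ? " odd" : " even");
        String base = srcHashCode.substring(0, srcHashCode.length() - 1);

        // On a cell border the parent cell changes too, so recurse on the prefix.
        // Assumption: with an empty prefix there is no parent, so the cell wraps around.
        if (BORDERS.get(key).indexOf(last) != -1 && !base.isEmpty()) {
            base = calculateAdjacent(base, direction);
        }
        return base + BASE32.charAt(NEIGHBORS.get(key).indexOf(last));
    }

    public static void main(String[] args) {
        System.out.println(calculateAdjacent("s", "right"));   // no border involved
        System.out.println(calculateAdjacent("sz", "right"));  // crosses into the next parent cell
    }
}
```

Calling the function once per direction gives the four direct neighbors. The diagonal ones can be had by chaining two calls, for example the top neighbor of the right neighbor.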

Tuesday, June 5, 2012

speed up st_within query in postgresql

1. Create index
CREATE INDEX idx_tablename_columnname ON tablename USING GIST(columnname);

This creates a spatial index on the geometry column [columnname] of [tablename].  According to the PostGIS manual, the index stores the bounding box (bbox) of each geometry, so a query can discard non-matching rows cheaply before running the exact test.

2. Cluster
CLUSTER tablename USING idx_tablename_columnname;

After the index is created, cluster the table on it so that rows that are close together in space are also stored close together on disk.

3. Optional: simplify the geometries.
SELECT ST_NPOINTS(geom_column) AS npoints FROM tablename ORDER BY npoints DESC LIMIT 25;

SELECT ST_SIMPLIFY(geom_column, number_scale) AS simpgeom FROM tablename;

Check the number of points in the 25 largest geometries.  If they have too many points, ST_Within queries will be slow.  Where the loss of detail is acceptable, try simplifying the geometries in the table; simplifying means reducing the number of points in each geometry.

4. Test
EXPLAIN UPDATE locationtable SET columnname = (
SELECT regiontable.columnname FROM regiontable
WHERE ST_WITHIN(locationtable.geom, regiontable.geom)
);

EXPLAIN breaks the query down into its execution plan and shows whether an index is used.  From the plan you can judge whether the query is efficient enough.

5. Dissolve multi-polygon to polygon
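
As a rough illustration of why the bounding-box index from step 1 and the simplification from step 3 help, here is a small plain-Java sketch. It is not how PostGIS is implemented; the class and method names are hypothetical, and a ray-casting point-in-polygon test stands in for the exact ST_Within check.

```java
// Sketch of a bounding-box prefilter in front of an exact containment test.
// All names are hypothetical; PostGIS does the equivalent inside the index and ST_Within.
class BBoxPrefilterSketch {

    // A simple polygon given by its vertices, with a precomputed bounding box.
    static class Region {
        final double[] xs, ys;
        final double minX, maxX, minY, maxY;

        Region(double[] xs, double[] ys) {
            this.xs = xs;
            this.ys = ys;
            double loX = xs[0], hiX = xs[0], loY = ys[0], hiY = ys[0];
            for (int i = 1; i < xs.length; i++) {
                loX = Math.min(loX, xs[i]);
                hiX = Math.max(hiX, xs[i]);
                loY = Math.min(loY, ys[i]);
                hiY = Math.max(hiY, ys[i]);
            }
            minX = loX; maxX = hiX; minY = loY; maxY = hiY;
        }

        // Cheap test: four comparisons, independent of the vertex count.
        boolean bboxContains(double x, double y) {
            return x >= minX && x <= maxX && y >= minY && y <= maxY;
        }

        // Exact test (ray casting): the cost grows with the number of vertices,
        // which is why simplifying large geometries also helps.
        boolean contains(double x, double y) {
            if (!bboxContains(x, y)) {
                return false; // rejected without touching the vertices
            }
            boolean inside = false;
            for (int i = 0, j = xs.length - 1; i < xs.length; j = i++) {
                boolean crosses = (ys[i] > y) != (ys[j] > y);
                if (crosses && x < (xs[j] - xs[i]) * (y - ys[i]) / (ys[j] - ys[i]) + xs[i]) {
                    inside = !inside;
                }
            }
            return inside;
        }
    }

    public static void main(String[] args) {
        // A triangle with corners (0,0), (4,0), (0,4).
        Region triangle = new Region(new double[] {0, 4, 0}, new double[] {0, 0, 4});
        System.out.println(triangle.contains(1, 1));   // inside the triangle
        System.out.println(triangle.contains(3, 3));   // inside the bbox, outside the triangle
        System.out.println(triangle.contains(9, 9));   // outside the bbox, rejected early
    }
}
```

Most candidate rows fail the cheap bbox check, so the expensive per-vertex loop runs only for the few that pass it.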




Wednesday, May 30, 2012

R Postgresql

Reading from PostgreSQL DB

  1. Install and load RPostgreSQL
    :install.packages("RPostgreSQL")
    :library(RPostgreSQL)
  2. Open a db connection using the dbConnect function
    :con<-dbConnect(PostgreSQL(),user="xxx", password="xxx",host="localhost", dbname="xxx")
  3. Use dbGetQuery to initiate a SELECT and return the result sets
    :sql<-"SELECT * FROM table WHERE name='xxx'"
    :rows<- dbGetQuery(con,sql)
  4. Use dbDisconnect to terminate the connection
    :dbDisconnect(con)

Monday, May 28, 2012

tiger geocoder 2011


Settings
  1. Ubuntu 12.04
  2. Postgresql 9.1
  3. Postgis 2.0
  4. Tiger 2011 Data
Installing Postgresql + postgis2.0
  1. sudo apt-get install python-software-properties
  2. sudo apt-add-repository ppa:sharpie/for-science
  3. sudo apt-add-repository ppa:sharpie/postgis-nightly
  4. sudo apt-get update
  5. sudo apt-get install postgresql-9.1-postgis
  6. sudo apt-get install build-essential postgresql-9.1 postgresql-server-dev-9.1 libxml2-dev proj libjson0-dev xsltproc docbook-xsl docbook-mathml gettext postgresql-contrib-9.1 pgadmin3
Updating gdal and geos libraries
  1. sudo apt-add-repository ppa:olivier-berten/geo
  2. sudo apt-get update
  3. sudo apt-get install libgdal-dev
  4. sudo apt-get install g++ ruby ruby1.8-dev swig swig2.0 unzip
  5. wget http://download.osgeo.org/geos/geos-3.3.3.tar.bz2
  6. tar xvfj geos-3.3.3.tar.bz2
  7. cd geos-3.3.3
  8. ./configure --enable-ruby --prefix=/usr
  9. make
  10. sudo make install
Building postgis2.0
  1. wget http://postgis.refractions.net/download/postgis-2.0.0.tar.gz
  2. tar zxvf postgis-2.0.0.tar.gz
  3. cd postgis-2.0.0
  4. ./configure --with-gui
  5. make
  6. sudo make install
  7. sudo ldconfig
  8. sudo make comments-install
  9. sudo ln -sf /usr/share/postgresql-common/pg_wrapper /usr/local/bin/shp2pgsql
  10. sudo ln -sf /usr/share/postgresql-common/pg_wrapper /usr/local/bin/pgsql2shp
  11. sudo ln -sf /usr/share/postgresql-common/pg_wrapper /usr/local/bin/raster2pgsql
Building spatial db template
  1. createdb postgis2
  2. psql -d postgis2
  3. psql>create extension postgis;
  4. psql>create extension postgis_topology;
Building geocoder
  1. createdb -T postgis2 geocoder
  2. wget http://postgis.refractions.net/download/postgis-2.1.0SVN.tar.gz
  3. tar zxvf postgis-2.1.0SVN.tar.gz
  4. EDIT postgis-2.1SVN/extra/tiger-geocoder/tiger2011/tiger_loader.sql
    export PGBIN=/usr/bin
    export PGPORT=5432
    export PGHOST=localhost
    export PGUSER=postgres
    export PGPASSWORD=XXX
    export PGDATABASE=geocoder
    PSQL=${PGBIN}/psql
    SHP2PGSQL=${PGBIN}/shp2pgsql
  5. EDIT postgis-2.1SVN/extra/create_geocode.sh
    PGPORT=5432
    PGHOST=localhost
    PGUSER=postgres
    PGPASSWORD=XXX
    THEDB=geocoder
    PGSQL_CMD=/usr/bin/psql
    PGCONTRIB=/usr/share/postgresql/9.1/contrib
    # uncomment the following two lines
    ${PGSQL_CMD} -d "${THEDB}" -c "CREATE EXTENSION fuzzystrmatch"
    ${PGSQL_CMD} -d "${THEDB}" -c "ALTER DATABASE ${THEDB} SET search_path=public,tiger;"
  6. sh ./create_geocode.sh
Downloading tiger data
  1. sudo mkdir -p /gisdata/temp
  2. sudo chown -R postgres:postgres /gisdata
  3. psql -d geocoder
  4. psql>\a
  5. psql>\t
  6. psql>\o nation.sh
  7. psql>select loader_generate_nation_script('sh');
  8. psql>\q
  9. sh ./nation.sh
  10. psql -d geocoder
  11. psql>\a
  12. psql>\t
  13. psql>\o all_states.sh
  14. psql>select loader_generate_script(ARRAY['AK','AR','AS',...],'sh');
  15. psql>\q
  16. sh ./all_states.sh
  17. psql -d geocoder
  18. psql>select install_missing_indexes();
Testing geocoder
  1. psql -d geocoder
  2. get geocode of "27 Waterview Drive, Shelton, CT 06484"
    psql>select g.rating,ST_X(geomout) AS lon,ST_Y(geomout) AS lat, (addy).* from geocode('27 Waterview Drive, Shelton, CT 06484') AS g;
  3. get reverse-geocode of point(-73.0811694362766 41.2931353732512)
    psql>select pprint_addy(r.addy[1]) as st1, pprint_addy(r.addy[2]) as st2, pprint_addy(r.addy[3]) as st3, array_to_string(r.street,',') as cross_streets from reverse_geocode(st_geomfromtext('POINT(-73.0811694362766 41.2931353732512)',4269),true) as r;

Thursday, May 24, 2012

Maven

Why maven?
Ans: You want to manage the jars, dependencies, and versions in your projects, keep a consistent project structure, and manage the build process.

How to set up Maven
1. export M2_HOME=/directory/to/maven
    export PATH=${M2_HOME}/bin:${PATH}

How to use