
Tuesday, August 14, 2012

JMS - glassfish

JMS allows applications to create, send, receive, and read messages. Using JMS, an application can communicate with other applications in a loosely coupled, asynchronous, and reliable way.

It is useful in use cases such as the following:
1) An inventory component can send a message to the factory component when inventory is low.
2) The factory component can send a message to the parts component to get parts assembled.
3) The parts component can send a message to the inventory or supplier component to get parts ready.
4) The factory and parts components can send messages to the accounting component to update their budgets.
5) A business component can publish updated information to its sales force.

JMS architecture
1) A JMS provider (provided by the GlassFish Java EE container)
2) JMS clients (any Java EE application component can be a JMS client)
3) Messages (objects that communicate information between JMS clients)
4) Administered objects (connection factories and destinations)

Messaging domains
1) The point-to-point domain is built on queues, senders, and receivers. Each message has only one consumer. Senders and receivers have no timing dependency: a receiver can fetch a message whether or not it was running when the message was sent. The receiver acknowledges the successful processing of each message.

2) The publish-subscribe domain is built on topics, publishers, and subscribers. Consumers subscribe to the topics they are interested in, and each message can have multiple consumers. Publishers and subscribers have a timing dependency: a consumer can consume only messages published after it subscribed.
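
The difference between the two domains can be sketched with a small in-memory model. This is plain Java, not the JMS API: the class names MessagingDomains, SimpleQueue, and SimpleTopic are made up for illustration, and the model ignores acknowledgement, persistence, and threading.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical plain-Java model of the two messaging domains (not the JMS API).
class MessagingDomains {

    // Point-to-point: a message waits in the queue until exactly one receiver takes it.
    static class SimpleQueue {
        private final Queue<String> messages = new ArrayDeque<>();

        void send(String message) {
            messages.add(message);
        }

        // Returns the next waiting message, or null when the queue is empty.
        String receive() {
            return messages.poll();
        }
    }

    // Publish-subscribe: every subscriber registered at publish time gets its own copy.
    static class SimpleTopic {
        private final List<List<String>> inboxes = new ArrayList<>();

        // Returns an inbox that collects messages published from now on.
        List<String> subscribe() {
            List<String> inbox = new ArrayList<>();
            inboxes.add(inbox);
            return inbox;
        }

        void publish(String message) {
            for (List<String> inbox : inboxes) {
                inbox.add(message);
            }
        }
    }

    public static void main(String[] args) {
        SimpleQueue queue = new SimpleQueue();
        queue.send("order-1");                // sent before any receiver asks for it
        System.out.println(queue.receive());  // the first receiver gets the message
        System.out.println(queue.receive());  // a second receiver finds the queue empty

        SimpleTopic topic = new SimpleTopic();
        topic.publish("price-1");             // nobody is subscribed yet, so it is lost
        List<String> first = topic.subscribe();
        List<String> second = topic.subscribe();
        topic.publish("price-2");             // both subscribers receive a copy
        System.out.println(first + " " + second);
    }
}
```

In the queue case the message survives until someone receives it, and only one receiver ever sees it. In the topic case the early message is never delivered, while the later one reaches every subscriber.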

Programming model
1) administered objects (provided by GlassFish): connection factories and destinations.
2) connection factory (an object that a client uses to connect to a JMS provider)
Ex.
@Resource(lookup = "/jms/ConnectionFactory")
private static ConnectionFactory connectionFactory;

3) destination
Ex.
@Resource(lookup = "/jms/Queue")
private static Queue queue;

4) connections
Ex.
Connection connection = connectionFactory.createConnection();

5) sessions: sessions are used to create producers, consumers, messages, and queue browsers.
Ex.
Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE); // first argument: true for a transacted session

6) message producers
Ex.
MessageProducer producer = session.createProducer(dest);

7) message consumers
Ex.
MessageConsumer consumer = session.createConsumer(dest);

8) message listener
Ex.
// Listener is your own class that implements javax.jms.MessageListener
Listener myListener = new Listener();
consumer.setMessageListener(myListener);

9) messages (header, properties, and body)
Message types: TextMessage (a String), MapMessage (name-value pairs), BytesMessage (a stream of uninterpreted bytes), StreamMessage (a stream of primitive values), ObjectMessage (a serializable object), Message (no body).

10) Queue Browser
Ex. QueueBrowser browser = session.createBrowser(queue);

Thursday, June 28, 2012

Running R script on AWS EMR

Install Amazon EMR Command Line Interface

Install Ruby (1.8 or later)
Download and unzip the CLI (http://aws.amazon.com/developertools/2264)
Configure credentials.json

{
"access_id": "AWS Access Key ID",
"private_key":"AWS Secret Access Key",
"keypair": "EC2 keypair name",
"key-pair-file":"pem location",
"log_uri":"s3n://log-location",
"region":"us-east-1"
}

Job Flow Essentials

Creating a Job Flow (./elastic-mapreduce --create --alive)
Listing all Job Flows (./elastic-mapreduce --list)
Retrieving information about a specific Job Flow (./elastic-mapreduce --describe --jobflow ID)
Adding a step using default parameter values to a Job Flow (./elastic-mapreduce -j ID --stream)
Terminating a Job Flow (./elastic-mapreduce --terminate ID)
Listing all active Job Flows (./elastic-mapreduce --list --active)

Streaming Job Flow

./elastic-mapreduce --create --stream \
--mapper s3n://[mapper-location] \
--input s3n://[input-location] \
--output s3n://[output-location] \
--reducer s3n://[reducer-location]

Tuesday, June 19, 2012

geohash adjacent codes

An algorithm for finding the neighbors of a geohash code.

Base32: 0123456789bcdefghjkmnpqrstuvwxyz
Neighbors (direction type)

right even: bc01fg45238967deuvhjyznpkmstqrwx
left even: 238967debc01fg45kmstqrwxuvhjyznp
top even: p0r21436x8zb9dcf5h7kjnmqesgutwvy
bottom even: 14365h7k9dcfesgujnmqp0r2twvyx8zb
right odd= top even (p0r21436x8zb9dcf5h7kjnmqesgutwvy)
left odd= bottom even (14365h7k9dcfesgujnmqp0r2twvyx8zb)
top odd= right even (bc01fg45238967deuvhjyznpkmstqrwx)
bottom odd= left even (238967debc01fg45kmstqrwxuvhjyznp)

Borders (direction type)

right even: bcfguvyz
left even: 0145hjnp
top even: przx
bottom even: 028b
right odd= top even (przx)
left odd= bottom even (028b)
top odd= right even (bcfguvyz)
bottom odd= left even (0145hjnp)

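// BASE32 is the alphabet string above. Neighbors and Borders are the tables above,
// held as Map<String, Map<String, String>> keyed by direction, then by type.
static String calculateAdjacent(String srcHashCode, String direction) {
    srcHashCode = srcHashCode.toLowerCase();
    char lastCharacter = srcHashCode.charAt(srcHashCode.length() - 1);
    String type = (srcHashCode.length() % 2 == 1) ? "odd" : "even";
    String base = srcHashCode.substring(0, srcHashCode.length() - 1);
    // On a border the neighbor lies in the adjacent parent cell, so move the prefix first.
    if (Borders.get(direction).get(type).indexOf(lastCharacter) != -1 && !base.isEmpty())
        base = calculateAdjacent(base, direction);
    return base + BASE32.charAt(Neighbors.get(direction).get(type).indexOf(lastCharacter));
}

Ex. find the right neighbor of a geohash code a

calculateAdjacent(a, "right");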
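
To try the tables and the function end to end, here is a self-contained Java sketch. It uses the standard alphabet 0123456789bcdefghjkmnpqrstuvwxyz and the neighbor and border strings listed above. The class name GeohashNeighbors, the method name adjacent, and the "direction type" map keys are choices made for this example, not part of any library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; only the lookup strings come from the tables above.
class GeohashNeighbors {
    static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";
    static final Map<String, String> NEIGHBORS = new HashMap<>();
    static final Map<String, String> BORDERS = new HashMap<>();

    static {
        NEIGHBORS.put("right even", "bc01fg45238967deuvhjyznpkmstqrwx");
        NEIGHBORS.put("left even", "238967debc01fg45kmstqrwxuvhjyznp");
        NEIGHBORS.put("top even", "p0r21436x8zb9dcf5h7kjnmqesgutwvy");
        NEIGHBORS.put("bottom even", "14365h7k9dcfesgujnmqp0r2twvyx8zb");
        BORDERS.put("right even", "bcfguvyz");
        BORDERS.put("left even", "0145hjnp");
        BORDERS.put("top even", "przx");
        BORDERS.put("bottom even", "028b");
        // Each odd table reuses the even table of another direction:
        // right->top, left->bottom, top->right, bottom->left.
        String[][] pairs = {{"right", "top"}, {"left", "bottom"}, {"top", "right"}, {"bottom", "left"}};
        for (String[] pair : pairs) {
            NEIGHBORS.put(pair[0] + " odd", NEIGHBORS.get(pair[1] + " even"));
            BORDERS.put(pair[0] + " odd", BORDERS.get(pair[1] + " even"));
        }
    }

    // Returns the geohash of the cell next to hash in the given direction
    // ("right", "left", "top", or "bottom"), at the same precision.
    static String adjacent(String hash, String direction) {
        hash = hash.toLowerCase();
        char last = hash.charAt(hash.length() - 1);
        String key = direction + (hash.length() % 2 == 1 ? " odd" : " even");
        String base = hash.substring(0, hash.length() - 1);
        // A border character means the neighbor is in the adjacent parent cell.
        if (BORDERS.get(key).indexOf(last) != -1 && !base.isEmpty()) {
            base = adjacent(base, direction);
        }
        return base + BASE32.charAt(NEIGHBORS.get(key).indexOf(last));
    }

    public static void main(String[] args) {
        String east = adjacent("u4pruy", "right");
        System.out.println(east);                    // prints u4prvn
        System.out.println(adjacent(east, "left"));  // prints u4pruy
    }
}
```

In the first call the last character y is on the right border, so the prefix u4pru is moved right first (to u4prv) and y is then mapped to n. The empty-prefix check stops the recursion at a single-character hash, where the result wraps around.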

Tuesday, June 5, 2012

speed up st_within query in postgresql

1. Create index
CREATE INDEX idx_tablename_columnname ON tablename USING GIST(columnname);

This creates a spatial index on the geometry column [columnname] of [tablename]. According to the PostGIS manual, the index stores the bounding box (bbox) of each geometry, so a query can discard most non-matching rows with a cheap bbox comparison before running the exact test.

2. Cluster
CLUSTER tablename USING idx_tablename_columnname;

After the index is created, cluster the table on it so that rows that are close together spatially are stored close together on disk.

3. Optional: simplify the geometries.
SELECT ST_NPOINTS(geom_column) AS npoints FROM tablename ORDER BY npoints DESC LIMIT 25;

SELECT ST_SIMPLIFY(geom_column, number_scale) AS simpgeom FROM tablename;

Check the number of points in the 25 largest geometries. If they contain too many points, ST_Within queries will be slow. If some loss of detail is acceptable, simplify the geometries, that is, reduce the number of points in each one. The second argument of ST_Simplify is the tolerance, in the units of the geometry's coordinate system.
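
ST_Simplify is based on the Douglas-Peucker algorithm. To show what "reducing the points" means, here is a minimal Java sketch of that algorithm for a single line. It is not PostGIS code: the class and method names are made up, and points are plain {x, y} arrays.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of Douglas-Peucker simplification; not the PostGIS implementation.
class SimplifySketch {

    // Perpendicular distance from point p to the line through a and b.
    static double distance(double[] p, double[] a, double[] b) {
        double dx = b[0] - a[0];
        double dy = b[1] - a[1];
        double length = Math.hypot(dx, dy);
        if (length == 0) {
            return Math.hypot(p[0] - a[0], p[1] - a[1]);
        }
        return Math.abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / length;
    }

    // Keeps the two endpoints and, recursively, the farthest point of every
    // stretch that deviates from its chord by more than the tolerance.
    static List<double[]> simplify(List<double[]> points, double tolerance) {
        if (points.size() < 3) {
            return new ArrayList<>(points);
        }
        double[] first = points.get(0);
        double[] last = points.get(points.size() - 1);
        int farthest = 0;
        double maxDistance = 0;
        for (int i = 1; i < points.size() - 1; i++) {
            double d = distance(points.get(i), first, last);
            if (d > maxDistance) {
                maxDistance = d;
                farthest = i;
            }
        }
        List<double[]> result = new ArrayList<>();
        if (maxDistance > tolerance) {
            List<double[]> left = simplify(points.subList(0, farthest + 1), tolerance);
            List<double[]> right = simplify(points.subList(farthest, points.size()), tolerance);
            result.addAll(left.subList(0, left.size() - 1)); // the split point is added once, from the right half
            result.addAll(right);
        } else {
            result.add(first);
            result.add(last);
        }
        return result;
    }

    public static void main(String[] args) {
        List<double[]> line = List.of(
                new double[]{0, 0}, new double[]{1, 0.1}, new double[]{2, 0},
                new double[]{3, 4}, new double[]{4, 0});
        List<double[]> simplified = simplify(line, 0.5);
        System.out.println(simplified.size() + " of " + line.size() + " points kept");
    }
}
```

A larger tolerance removes more points, which speeds up ST_Within but moves the boundary, so points near the edge of a region may be classified differently.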

4. Test
EXPLAIN UPDATE locationtable SET columnname = (
SELECT columnname FROM regiontable
WHERE ST_WITHIN(locationtable.geom, regiontable.geom)
);

EXPLAIN shows the plan PostgreSQL chooses for the query, including whether it uses the index. From the plan you can judge whether the query is efficient enough.

5. Dissolve multi-polygon to polygon

Wednesday, May 30, 2012

R Postgresql

Reading from PostgreSQL DB

Install the RPostgreSQL package and load it
:install.packages("RPostgreSQL")
:library(RPostgreSQL)
Open a db connection using the dbConnect function
:con <- dbConnect(PostgreSQL(), user="xxx", password="xxx", host="localhost", dbname="xxx")
Use dbGetQuery to run a SELECT and return the result set
:sql <- "SELECT * FROM table WHERE name='xxx'"
:rows <- dbGetQuery(con, sql)
Use dbDisconnect to terminate the connection
:dbDisconnect(con)

Monday, May 28, 2012

tiger geocoder 2011

Settings

Ubuntu 12.04
Postgresql 9.1
Postgis 2.0
Tiger 2011 Data

Installing Postgresql + postgis2.0

sudo apt-get install python-software-properties
sudo apt-add-repository ppa:sharpie/for-science
sudo apt-add-repository ppa:sharpie/postgis-nightly
sudo apt-get update
sudo apt-get install postgresql-9.1-postgis
sudo apt-get install build-essential postgresql-9.1 postgresql-server-dev-9.1 libxml2-dev proj libjson0-dev xsltproc docbook-xsl docbook-mathml gettext postgresql-contrib-9.1 pgadmin3

Updating gdal and geos libraries

sudo apt-add-repository ppa:olivier-berten/geo
sudo apt-get update
sudo apt-get install libgdal-dev
sudo apt-get install g++ ruby ruby1.8-dev swig swig2.0 unzip
wget http://download.osgeo.org/geos/geos-3.3.3.tar.bz2
tar xvfj geos-3.3.3.tar.bz2
cd geos-3.3.3
./configure --enable-ruby --prefix=/usr
make
sudo make install

Building postgis2.0

wget http://postgis.refractions.net/download/postgis-2.0.0.tar.gz
tar zxvf postgis-2.0.0.tar.gz
cd postgis-2.0.0
./configure --with-gui
make
sudo make install
sudo ldconfig
sudo make comments-install
sudo ln -sf /usr/share/postgresql-common/pg_wrapper /usr/local/bin/shp2pgsql
sudo ln -sf /usr/share/postgresql-common/pg_wrapper /usr/local/bin/pgsql2shp
sudo ln -sf /usr/share/postgresql-common/pg_wrapper /usr/local/bin/raster2pgsql

Building spatial db template

createdb postgis2
psql -d postgis2
psql>create extension postgis;
psql>create extension postgis_topology;

Building geocoder

createdb -T postgis2 geocoder
wget http://postgis.refractions.net/download/postgis-2.1.0SVN.tar.gz
tar zxvf postgis-2.1.0SVN.tar.gz
cd postgis-2.1.0SVN/extras/tiger_geocoder/tiger_2011
EDIT tiger_loader.sql
export PGBIN=/usr/bin
export PGPORT=5432
export PGHOST=localhost
export PGUSER=postgres
export PGPASSWORD=XXX
export PGDATABASE=geocoder
PSQL=${PGBIN}/psql
SHP2PGSQL=${PGBIN}/shp2pgsql
EDIT create_geocode.sh
PGPORT=5432
PGHOST=localhost
PGUSER=postgres
PGPASSWORD=XXX
THEDB=geocoder
PGSQL_CMD=/usr/bin/psql
PGCONTRIB=/usr/share/postgresql/9.1/contrib
# uncomment the following two lines
${PGSQL_CMD} -d "${THEDB}" -c "CREATE EXTENSION fuzzystrmatch"
${PGSQL_CMD} -d "${THEDB}" -c "ALTER DATABASE ${THEDB} SET search_path=public,tiger;"
sh ./create_geocode.sh

Downloading tiger data

sudo mkdir -p /gisdata/temp
sudo chown -R postgres:postgres /gisdata
psql -d geocoder
psql>\a
psql>\t
psql>\o nation.sh
psql>select loader_generate_nation_script('sh');
psql>\q
sh ./nation.sh
psql -d geocoder
psql>\a
psql>\t
psql>\o all_states.sh
psql>select loader_generate_script(ARRAY['AK','AR','AS',...],'sh');
psql>\q
sh ./all_states.sh
psql -d geocoder
psql>select install_missing_indexes();

Testing geocoder

psql -d geocoder
Get the geocode of "27 Waterview Drive, Shelton, CT 06484"
psql>select g.rating,ST_X(geomout) AS lon,ST_Y(geomout) AS lat, (addy).* from geocode('27 Waterview Drive, Shelton, CT 06484') AS g;
Get the reverse geocode of POINT(-73.0811694362766 41.2931353732512)
psql>select pprint_addy(r.addy[1]) as st1, pprint_addy(r.addy[2]) as st2, pprint_addy(r.addy[3]) as st3, array_to_string(r.street,',') as cross_streets from reverse_geocode(st_geomfromtext('POINT(-73.0811694362766 41.2931353732512)',4269),true) as r;

Thursday, May 24, 2012

Maven

Why maven?
Ans: Maven manages the jars in your projects, their dependencies and versions, the project structure, and the build process.

How to set up Maven
1. export M2_HOME=/directory/to/maven
export PATH=${M2_HOME}/bin:${PATH}

How to use
