Vengatc technology logs

OPOA and RIA – What do they mean?

Posted in Architecture by vengatc on June 14, 2011

Recently there was a situation at my workplace where we needed to standardize on, or adopt, patterns for the UI. OPOA and RIA were taken as guidelines after my recommendation, but adopting these patterns literally is an offense against the development effort unless the team fully understands what these terminologies convey…

The following was my explanation; I am documenting it so it can be of use to other people who hit the same road…

[Explanation]

What we are actually trying to do with terminologies like RIA and OPOA is to get away from the traditional perspective of web development. We use these terminologies as a guideline to help us validate that we are doing the right thing.

The crux is that web development strategies and technologies have evolved, and hence the way we think about building interfaces should also change.

The early terminologies the industry used were thick and thin client. Desktop applications were considered thick because of their richness and the amount of business logic they carried. Thin clients were the ones with page-by-page flow, where we sacrificed the richness a desktop application would offer for the sake of distributability. Then the market wanted distributability and also the richness that the desktop client could offer; hence technologies like Flex/Flash and Silverlight were born.

But Flex and Flash were proprietary technologies, and their users were at the mercy of the platform implementers (Safari or iOS). People wanted to stick with standards that would work across all hardware and operating systems. Hence effort was spent on achieving richness without sacrificing distributability, and frameworks like YUI (Yahoo User Interface) were a result of it. YUI was built on top of standardized JavaScript and HTML, which work in all browsers. ExtJS is a matured descendant of YUI, which helps us build desktop-like richness on the web without sacrificing distributability, and it stands on standards (HTML & JS).

ExtJS was built to help people develop RIAs (Rich Internet Applications, a.k.a. desktop-like applications in the browser) as opposed to traditional Web 1.0 pages. ExtJS becomes a terrible, counter-productive tool if we look at it from the perspective of Web 1.0 page flow…

Hence Rich Internet Application / One Page One Application are umbrella terms meant to make UI designers/engineers think of web development in terms of a desktop application: put multiple functionalities and actions on a single page and bring back the richness that desktop applications offered.

> It means we should have only one form (meaning one browser window) for the whole application. Venky, am I right?

So in OPOA, what we call an application is subjective. You may choose to call the entire IRIS4 one application, but someone else can choose to call CnC alone an application, and someone else can choose to call a group of commands one application.

The more granular we go, the closer we get to Web 1.0; at the coarsest level we may lose the ability to multitask. So we need to strike a balance between the two, and that balance is totally subjective.

So if we are able to do multiple things on a single page without a refresh, and the look and feel is close to that of a desktop application, we are abiding by RIA and OPOA.

If we need a guideline/rule to check our actions, we can ask this question every time we are in doubt: “Will a desktop application do it this way?”

So in this case of opening multiple windows: in that particular scenario, would a desktop application open a new window to show that functionality? If yes, then I think we are good.

 [/Explanation]

Service oriented architecture (SOA) Governance

Posted in Architecture by vengatc on August 22, 2009

SOA Governance

Service-oriented architecture (SOA), the latest advancement in software architecture and technology, enables architects to embrace the best of distributed computing and open service architecture. It allows disparate teams to operate independently and publish their software artifacts for others to reuse.

The ultimate business objective is achieved by the orchestration of independent artifacts talking to each other to meet the overall system's objective.

Governance

Downside of SOA

Though this orchestration between systems increases reuse and modularization, just like any other technology it has brought its own disadvantages with it.

The distributed nature allows the software to scale to an extent that it becomes unmanageable. Moreover, when the software artifacts are highly distributed, the interconnections these artifacts have with each other become complex. As the system grows in the SOA paradigm, the coupling between independent systems also grows. The result is that systems which were originally intended to be independent become dependent on each other because of a newly introduced problem.

The provider & consumer issue: because the provider system is an open architecture, it allows unidentified consumers to consume its services. Even within the same organization, the provider cannot modify or discontinue a service until it is sure there are no consumers of the service it intends to modify. An architecture that intends to advocate independence is actually hypocritical if it does not also have a way to govern the interconnections.

This brings in the need for a new area called SOA governance: a way to maintain a repository of the sharable assets that providers expose, and of the consumers of those assets. Governance should also give the provider the ability to maintain contracts for consumption with the consumers.
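To make this concrete, here is a minimal sketch in Java of the kind of registry such governance needs (all class names here are hypothetical illustrations, not any real governance product): providers publish assets, consumers register contracts, and a provider checks for live contracts before retiring a service.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical governance registry: assets, contracts, and a retirement check.
public class GovernanceRegistry {

    public static class ServiceAsset {
        final String name;
        final String providerTeam;
        public ServiceAsset(String name, String providerTeam) {
            this.name = name;
            this.providerTeam = providerTeam;
        }
    }

    public static class ConsumerContract {
        final String consumerTeam;
        final String assetName;
        public ConsumerContract(String consumerTeam, String assetName) {
            this.consumerTeam = consumerTeam;
            this.assetName = assetName;
        }
    }

    private final Map<String, ServiceAsset> assets = new HashMap<String, ServiceAsset>();
    private final List<ConsumerContract> contracts = new ArrayList<ConsumerContract>();

    // Providers publish the sharable assets they expose.
    public void publish(ServiceAsset asset) {
        assets.put(asset.name, asset);
    }

    // Consumers record a contract for each asset they consume.
    public void register(ConsumerContract contract) {
        contracts.add(contract);
    }

    // A provider may modify or discontinue a service only when
    // no consumer still holds a contract against it.
    public boolean canRetire(String assetName) {
        if (!assets.containsKey(assetName)) {
            return true; // nothing to retire
        }
        for (ConsumerContract c : contracts) {
            if (c.assetName.equals(assetName)) {
                return false;
            }
        }
        return true;
    }
}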

Governance is not achievable unless the needed processes are introduced into the regular development process, and it should be supported with the tools needed to make it happen in the organization.

This is an exciting challenge that I am designing and driving to completion for my organization. Once it is fully embraced by the organization and its people, governance becomes an intangible asset, enabling the organization to fully realize the power of service-oriented software design.

Framework for Business Intelligence over RRD files

Posted in Architecture, framework, java, myideas by vengatc on October 31, 2008

None of Pentaho, Kettle, or Talend supports RRD as a data source for Business Intelligence (BI). I have designed an Enterprise Information Integration framework/layer over multiple RRD data sources. This layer allows users of EII-RRD (the solution) to aggregate data across multiple RRD files and run BI functions (average/max/min) on them.


Problem statement:


I will attempt to give a brief description of the problem and the constraints to be considered before deciding on a solution.

A Round Robin Database (RRD) is a file-based database used to store time/value pairs. It is heavily used in network management solutions, which need to record values against time, e.g. network latency every 5 minutes. The database allows you to store averages for various time intervals, so that when you update it every 5 minutes, it automatically updates the hourly or daily averages. Pretty useful in the performance-management portion of NMS solutions.
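To illustrate the consolidation idea (a sketch only, not the rrdtool API), each fine-grained update can fold into a running coarser-grained average like this:

import java.util.HashMap;
import java.util.Map;

// RRD-style consolidation sketch: every 5-minute sample also
// updates the running average of the hour it falls into.
public class RrdConsolidationSketch {

    // hour bucket -> { running sum, sample count }
    private final Map<Long, double[]> hourly = new HashMap<Long, double[]>();

    public void update(long epochSeconds, double value) {
        long hourBucket = epochSeconds / 3600;
        double[] acc = hourly.get(hourBucket);
        if (acc == null) {
            acc = new double[2];
            hourly.put(hourBucket, acc);
        }
        acc[0] += value; // sum
        acc[1] += 1;     // count
    }

    // The hourly average is always up to date after each update.
    public double hourlyAverage(long epochSeconds) {
        double[] acc = hourly.get(epochSeconds / 3600);
        return acc == null ? Double.NaN : acc[0] / acc[1];
    }
}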

Now, what is the problem? When you talk of Business Intelligence, it is a matter of aggregating data across multiple sources and trying to correlate it to obtain some kind of information that is useful for decision making or analysis.

So here comes the problem statement: you have RRD files recording a protocol's response time for an IP on a particular network, and you have multiple networks like that. The BI task here is: given a time frame, grab the average response time of a particular protocol across all machines in all networks.


Constraints to consider before designing.


The problem is daunting and computationally intensive when you consider the time and space complexity. The solution's main focus is to address memory complexity; the second is time. Memory complexity is a must-solve, and the time complexity should be reduced to the point where horizontal scalability kicks in when resources are limited.


Crux of the solution:


Memory: I used the virtual-memory concept of design here. Consider a user who queries for the average graph from 1970 to 2008 at every 5-minute interval: imagine the memory that is going to be allocated, i.e., one slot for every 5-minute interval between 1970 and 2008.


In my design the processing unit reads the time/value pairs from the RRD files and hands them over to a virtual-memory layer. This layer promises the processing unit that it has the memory to store all the data (similar to the way the VM in an OS does), but it allocates memory only if data is actually available for that time interval. Regardless of the user's request range, the data will certainly be crowded around the 2008 time frame, so the effective memory use will be very small. This virtual-memory-like design (promise more, but do the work for less) is something new brought into my design catalogue. It brought the memory usage down from GBs to 1 or 2 MB.
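A minimal sketch of such a layer, under my own assumptions (the class and method names are illustrative, not the actual framework):

import java.util.SortedMap;
import java.util.TreeMap;

// "Virtual memory" for a time series: the caller may address any
// 5-minute slot in a huge range, but backing memory is allocated
// only for slots that actually receive data.
public class SparseTimeSeries {

    private static final long STEP = 300; // 5 minutes, in seconds

    private final SortedMap<Long, Double> slots = new TreeMap<Long, Double>();

    // Any timestamp is accepted; nothing is pre-allocated.
    public void put(long epochSeconds, double value) {
        slots.put(Long.valueOf(epochSeconds / STEP), Double.valueOf(value));
    }

    // Absent slots cost no memory and read back as NaN (no data).
    public double get(long epochSeconds) {
        Double v = slots.get(Long.valueOf(epochSeconds / STEP));
        return v == null ? Double.NaN : v.doubleValue();
    }

    // Aggregations iterate only over the slots that exist in the range.
    public double average(long fromEpoch, long toEpoch) {
        double sum = 0;
        int count = 0;
        for (Double v : slots.subMap(Long.valueOf(fromEpoch / STEP),
                                     Long.valueOf(toEpoch / STEP + 1)).values()) {
            sum += v.doubleValue();
            count++;
        }
        return count == 0 ? Double.NaN : sum / count;
    }
}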


This virtual-memory style of design really helped solve the memory problem; I loved it and will be using it in my future designs.


The time complexity and other aspects of the problem are nothing interesting, as I solved them with my usual design experience; no new learnings there.


What's in it for you?

Blog readers/architects: whenever you have a design which allocates a huge chunk of memory in proportion to the input (or to some system parameter which has no bounds), and you end up using only a portion of it to actually solve the problem because of the distribution characteristics of the input data, consider this virtual-memory concept of design in your bouquet of design principles. It might help.


For people who are tired of hunting for open-source or proprietary solutions that do BI on RRD: if you wish to get more insight into my solution or want to discuss any aspect of it, contact vengateswaran.c@gmail.com. Only technical questions encouraged.

Java Memory and Garbage Collection [GC] – Internals

Posted in Architecture, java, JVM, Performance by vengatc on October 14, 2008

Java 5 gives architects the ability to scale an application's memory behaviour based on the characteristics of the application's memory usage pattern.

Java Garbage collector basics

           The default GC of Java is a serial collector, i.e., when Java decides to do a GC, your application threads are suspended until the GC thread finishes.

Implications 

           On a single-processor machine this type of GC is fine, but on a multiprocessor machine it is a killer. Imagine your JBoss or IBM WebSphere running a banking project; for sure there would be a high hardware investment in multiple processors (not less than a 12-processor machine). With this dedicated setup and serial collection, your application that ran on 12 processors stops, and only one processor is used for the GC activity. Your application is at a halt. So the throughput of the application is directly impacted by your GC, and it worsens as the number of processors increases.

So it is a must to customize the GC. But remember: until you understand the intricacies of the Java heap and GC, don't meddle with the GC configuration; leave it at the default, because a non-expert is more likely to hurt the throughput than to improve it.

[Graph: throughput distribution]

Java garbage collection design

     What would you do if you were given the chance to design the GC? If you have a serial algorithm that sweeps all the objects in memory and then deallocates the unreferenced ones, then the big-O of the algorithm you design is directly proportional to the number of objects in memory. So the time complexity of your algorithm will worsen for larger systems.

How did Sun Microsystems get around this time-complexity issue?

As far as memory consumption is concerned, research has identified that young objects have the highest probability of dying first. That means an object created recently is more likely to die before an object that has survived for a while. Current GC algorithms efficiently use this principle of memory usage to produce better big-O numbers.

The entire Java heap is segregated into multiple segments to take advantage of this young-die-first fact.

The heap is separated into the Young, Tenured, and Perm spaces.

The GC algorithm is split into minor and major runs.

A minor run does GC only in the Young space, and a major run does GC on both the Young and Tenured spaces. That is, a major run is the maximum time a GC could take, and we don't want it to run that often. To avoid major runs, the Java GC uses the young-die-first fact and runs GC on the Young space; if an object survives the run, it is moved to Tenured. When Tenured fills up, a major run is triggered. This way major runs are mostly avoided.

What does this imply for architects?

          By intelligently manipulating the Young and Tenured sizes, we can impact various characteristics of the application:

1. Frequency of GC runs.

2. Time taken for the GC to complete its run.

3. Throughput of the application.

I'm not going to explain why each of these is impacted; readers are expected to understand the relationship at this point of the tutorial.

Java 5 provides you the ability to manipulate the relative sizes of the memory segments.
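For illustration, these are the standard HotSpot sizing options (a sketch; verify the exact names against your JVM version):

-Xms512m -Xmx512m                        (initial/maximum total heap)
-XX:NewSize=128m -XX:MaxNewSize=128m     (absolute Young generation size)
-XX:NewRatio=3                           (Tenured:Young size ratio of 3:1)
-XX:SurvivorRatio=8                      (Eden:Survivor ratio within Young)

For example:

java -Xms512m -Xmx512m -XX:NewRatio=3 -XX:SurvivorRatio=8 -jar xyz.jar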

What next?

      Yes, I agree the throughput problem of the serial collector is still open. Java 5 allows us to tackle this by providing two alternative GC algorithms to the traditional serial collector:

1. Throughput collector

2.  Concurrent Low Pause Collector

I will attempt to give a short description of the above collectors.

1. Throughput collector

        The throughput collector is a generational collector similar to the serial collector but with multiple threads used to do the minor collection. The major collections are essentially the same as with the serial collector. By default on a host with N CPUs, the throughput collector uses N garbage collector threads in the minor collection. The number of garbage collector threads can be controlled with a command line option (see below). On a host with 1 CPU the throughput collector will likely not perform as well as the serial collector because of the additional overhead for the parallel execution (e.g., synchronization costs). On a host with 2 CPUs the throughput collector generally performs as well as the serial garbage collector and a reduction in the minor garbage collector pause times can be expected on hosts with more than 2 CPUs.
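For example, the throughput collector is selected, and its thread count pinned, with the standard HotSpot flags:

java -XX:+UseParallelGC -XX:ParallelGCThreads=4 -jar xyz.jar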

2. Concurrent Low Pause Collector

The concurrent low pause collector is a generational collector similar to the serial collector. The tenured generation is collected concurrently with this collector. This means the pause in the application is close to nil.
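It is selected with the standard HotSpot flag:

java -XX:+UseConcMarkSweepGC -jar xyz.jar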


Aspiring memory manipulators :)

To start with, just observe the memory consumption of your software system.

java -verbose:gc -jar xyz.jar

[GC 325407K->83000K(776768K), 0.2300771 secs]
[GC 325816K->83372K(776768K), 0.2454258 secs]
[Full GC 267628K->83769K(776768K), 1.8479984 secs]

Each line reads: heap occupancy before the collection -> occupancy after it (total heap size in parentheses), followed by the pause time; "Full GC" marks a major run.

Conclusion

Gather enough understanding of the GC behaviour of your system against the hardware. Just remember: a system that is best on a single processor will be a pain on a multiprocessor, and a system that is good on a multiprocessor will be a killer on a single one. The efficiency also differs with the application's characteristics. So leave it at the default until you are comfortable with the details.

So GC tuning by architects is an ever-on task throughout the lifecycle of the project. And it requires practice.

Reference: http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html

As always, only technical queries regarding GC tuning accepted at vengateswaran.c@gmail.com

Framework – Real-time data replication for Postgres across a firewall

Posted in Architecture, database, java, myideas, replication by vengatc on October 6, 2008

Just wanted to write about a recent framework I developed for database replication with Postgres. For readers, it may give you an idea to think in this direction if you come across the same problem.

Requirement

We had a requirement to replicate data in real time from Postgres databases on multiple machines residing inside a firewall to a cloud server on the internet. We evaluated various technologies and tools available on the market; every solution we came across required us to open up a port in the firewall, and most of them were not real-time. Most of the tools we saw on the market were ETL-style tools where you take the data in a batch and replicate it, and moreover they do not work across a firewall. I was architecting this product and had to come up with a solution no matter what, so I opted to write my own framework.

I am a strong believer in building the solution in your mind/on paper before writing the code. So I had to design a replication system that would run on various machines and replicate data to the central server.

I am not going to describe the thought process behind each design decision I took; I am just going to describe the end result.

Step 1: I cracked open the JDBC library of Postgres. I took the source code from the Postgres open-source repository and read the code flow of the JDBC driver.

The static statement vs. prepared statement issue.

A Java program uses the JDBC library to construct either a static SQL statement or a prepared statement. When it is a static SQL query, you have the query in hand. But when it is a prepared statement, it is actually inside the JDBC driver code that the actual query is assembled before being sent through the native methods to Postgres.

I figured out the place where the complete query leaves the JDBC driver for the native functions of the database. There I wrote a queue to sniff all the queries that leave the system.

For technical queries regarding sniffing the query from the driver, write to vengateswaran.c@gmail.com

Step 2: Now that I have a queue of sniffed queries, I have to ship them across to the server on the other side of the firewall. Web services come to the rescue here: I published a web service at the server to accept a query and the client identifier, create the connection, and issue the query against the database.

Step 3: I wrote an engine that takes the queued queries at the client end and ships them to the server across the firewall through the web service, and the server end of the web service fires the same query on the server-side database.
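Here is a minimal sketch of the client-side pipeline described in the three steps above, under my own assumptions: the sniff() hook and the ShipmentService interface are hypothetical illustrations, and the real interception point lives inside the patched Postgres JDBC driver.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the client side: a queue filled by a hook in the JDBC
// driver, drained by a shipping engine that replays each query on
// the server through an outbound web-service call (no inbound port).
public class ReplicationClient {

    private static final BlockingQueue<String> SNIFFED =
            new LinkedBlockingQueue<String>();

    // Called from inside the driver at the point where the fully
    // assembled SQL (static or prepared) leaves for the native layer.
    public static void sniff(String finalSql) {
        SNIFFED.offer(finalSql);
    }

    // Whatever transport carries the query over the firewall.
    public interface ShipmentService {
        void ship(String clientId, String sql) throws Exception;
    }

    // The shipping engine of Step 3.
    public static void startShipper(final String clientId, final ShipmentService ws) {
        Thread shipper = new Thread(new Runnable() {
            public void run() {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        String sql = SNIFFED.take(); // blocks until a query arrives
                        ws.ship(clientId, sql);      // server end replays the query
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } catch (Exception e) {
                        // the real framework would retry or requeue here
                    }
                }
            }
        });
        shipper.setDaemon(true);
        shipper.start();
    }
}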

Multiple client (master) Postgres databases were able to replicate data in real time to a single database cluster on the server.

[Diagram: very high-level design]

After a round of rigorous design, implementation, and performance testing, the framework I designed and implemented efficiently replicates databases from multiple machines into the cloud server across the firewall. It really scales up well… to my delight.

Feel free to contact me [vengateswaran.c@gmail.com] if you need more insight into the technical aspects of the framework. Only technical queries invited.


JVM instrumentation – Performance tuning

Posted in Architecture, java, JVM, Performance by vengatc on October 3, 2008

Performance monitoring

Sun has done excellent work integrating remote management into the JVM. They have built SNMP-agent-like capability into the JVM: an SNMP OID is analogous to an MBean, and the SNMP MIB is analogous to the MBean server.

This beautiful capability allows you to connect to the JVM and monitor the crimes we have committed in the code with regard to runtime memory, CPU, and thread utilization. I became a lover of this feature, and I have always thought of building this capability into the applications we build. This feature is a real bliss for architects.
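The same instrumentation is also reachable from inside your own code. A minimal sketch using the standard java.lang.management API (available since Java 5):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

// Reads the platform MBeans that Jconsole itself displays.
public class SelfMonitor {
    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        // Heap occupancy, as shown on Jconsole's Memory tab.
        System.out.println("Heap used: "
                + memory.getHeapMemoryUsage().getUsed() + " bytes");

        // Live thread count and monitor-deadlock detection.
        System.out.println("Live threads: " + threads.getThreadCount());
        long[] deadlocked = threads.findMonitorDeadlockedThreads();
        System.out.println("Deadlocked threads: "
                + (deadlocked == null ? 0 : deadlocked.length));
    }
}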

Head on Jump

To have first-hand experience with JVM instrumentation, just follow these simple steps…

1. Have Java 5.0 installed on your system and create a Java program that runs until you forcefully stop it. If you have a framework with thread pools, resource management, etc., that is a good example.

2. Run your java program with this additional parameter.

java -Dcom.sun.management.jmxremote -jar xyz.jar

This command publishes the MBean server in the JVM as an RMI resource for Jconsole to connect to.

3. Start Jconsole and connect to the program in your connection dialogue box.

Jconsole will allow you to monitor the memory and threads used, identify deadlocks, etc.

See the performance scaling for an Enterprise Information Integration framework that I recently wrote. This is a heavy ETL and EII kind of tool that aggregates data from multiple databases, and you would definitely need some performance numbers for its production run. The Jconsole tool helps you prove the robustness of your framework by showing the memory and thread occupancy, and how controlled they stay amidst heavy load on the framework.


Flex and Cairngorm Architecture

Posted in Architecture, Flex, java by vengatc on October 3, 2008

Developing RIAs with Flex is fun…

What and why is RIA?

Like the fashion industry, technology also swings back and forth. Initially, when client-server architecture was introduced, the industry started moving in the direction of developing desktop-based rich applications that talk to the server. But later the architecture matured and the industry wanted many clients to access the application, so it (somewhat ridiculously) relabelled the desktop application the thick client, and the industry started moving in the direction of thin-client approaches like HTML and other dynamic web technologies. Now the industry has started missing the richness of the traditional desktop-based thick client, and it has also grown bored with the HTML-based request/response model. This gave birth to RIA: Flash and Silverlight.

Their aim is to deliver a thin/thick client that can run in the browser and still hide all the boring request/response stuff that web clients were experiencing.

Flex was the forerunner in this market.

Flex development nightmares?

Developing a sample Flex program with Flex Builder is an easygoing task. But the real nightmare kicks in when the application grows big, when you want to build a real production system out of it. The front-end MXML becomes really messy: you end up with a single monolithic MXML file where you have to keep writing ActionScript and MXML. You can separate the ActionScript into a separate file and the MXML into multiple files, but you will hit a point where you cry for a framework.

Cairngorm comes to your rescue.

What is Cairngorm?

Cairngorm tries to bring the traditional software-engineering best practices into Flex application development.

It is a way to fit your

1. Business Delegate

2. Service locator

3. Command

4. Model Locator

5. Front controller

kinds of patterns into your Flex development. Adopting the Cairngorm framework takes a bit more time, but once it is done and the pieces start fitting in, it really pays off.

[Diagram: architecture of a Cairngorm-based Flex application]

For more details on Cairngorm, follow some of my favourite links below.

http://www.cairngormdocs.org/cairngormDiagram/index.html

http://www.onjava.com/pub/a/onjava/2003/02/26/flash_remoting.html?CMP=AFC-ak_article&ATT=Flash+Remoting+for+J2EE+Developers

http://sujitreddyg.wordpress.com/2008/01/14/invoking-java-methods-from-adobe-flex/

http://livedocs.adobe.com/blazeds/1/blazeds_devguide/blazeds_devguide.pdf

http://sujitreddyg.wordpress.com/2008/05/16/session-data-management-in-flex-remoting/

http://www.brucephillips.name/blog/index.cfm/2008/6/23/Using-BlazeDS-to-Send-UserDefined-Data-Types-Data-Tranfer-Objects-from-Java-to-Flex

http://renaun.com/blog/2006/07/04/55/

http://www.adobe.com/devnet/flex/articles/cairngorm_pt4_05.html

http://examples.adobe.com/flex3/componentexplorer/explorer.html