Monday, September 15, 2014

Parsing/Loading huge XML & managing memory in Perl XML::Simple

Purpose: 
To parse a "huge" XML file into a Perl data structure using XMLin(). This article does not intend to teach XML or XMLin(); it focuses only on memory management and efficiency when using XMLin() and XML::Twig. The same ideas can be applied in Java or C++ too. 


Here we take a 235 MB data file named "prices.xml" as the example, cut down from the original 900 MB file, because with the full file the mighty Unix system ran out of memory and terminated the process, so it cannot be used as a benchmark for comparison. Since XMLin() could not complete parsing the 900 MB file at all, this study is not just an improvement but an "essential" fine tuning of the code, without which production will fail.

Intro: Perl's XML::Simple is a simple module (as the name suggests) whose XMLin() function parses an XML file and converts it into a very easy to operate Perl hash data structure. Though the hash is easy to use once returned, XMLin() takes a long time and a lot of memory before returning. So let's get our hands dirty and analyze right away:
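
For reference, a minimal sketch of this whole-file approach, the one that later balloons to 1.8 GB. The attribute names (Name, Price) are only assumptions, since the real records are shown only in abbreviated form further below; %Prices is the hash discussed later in this post:

use strict;
use warnings;
use XML::Simple qw(XMLin);

# Baseline: parse the whole 235 MB file in one call.
# KeyAttr => [] and ForceArray => ['Sym'] keep the shape predictable:
# $data->{Sym} is always an array ref with one hash per record.
my $data = XMLin('prices.xml', KeyAttr => [], ForceArray => ['Sym']);

my %Prices;
for my $rec (@{ $data->{Sym} }) {
    $Prices{ $rec->{Name} } = $rec->{Price};   # attribute names are assumed
}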

When run with this 235 MB prices.xml file, XMLin() did not return. After waiting for a while, instead of interrupting with ^C, I ran memory and system monitoring commands to see whether it was really running or hanging. The 'top' command showed two things. One good. The other bad.
The good: it clearly showed most activity happening in my Perl program (so maybe not hanging):

The bad: the process started at a few MB, reached 320 MB in 1 minute, 630 MB in 2 minutes and so on. In about 6 minutes, XMLin() had consumed about 1.8 GB of resident and virtual memory and was still running for a few more seconds. This is not just a memory problem: swapping that much virtual memory in and out (since Unix is a time-sharing system) is time consuming and can slow down the entire Unix system.

So this article is about how to make this efficient. Interested? Read on..

The prices.xml file has several million rows. Loading this into memory using XMLin() is not easy because of the several tasks performed during loading: 

  1. parsing and splitting each of the millions of data points
  2. defining sibling/child/parent relationships and annotations among them
  3. storing everything in a tree-like structure with no loss of data, with the ability to reproduce the original file when used in XMLout(), etc.
So it needed 1.8 GB of virtual memory, as the top command showed. When the function returned, the memory usage dropped to normal. Let us investigate the data in prices.xml:

The file looks like this:
<MktData Date="2014-09-17" … … … … ….. >
<Sym="IBM"  MatDt="2015-01-17" Price="52.80" … … … … ….. >
<Sym="AAPL" MatDt="2015-01-17" Price="101.53" … … … … ….. >
….   Several millions of rows like this.. and then
<Sym="TSLA" MatDt="2015-01-17" Price="282.17" … … … … ….. >
</MktData >

As we can see from this snapshot of the data file, it has an almost flat, simple structure. Except for the first and last lines, every line is linear, representing exactly one element, and all the data of that single record/element is specified on the same line. This qualifies the file to be processed line by line, or element by element. Let us see how it works out:


one at a time please:

So we observed that, since every line contains one "xml-record", the file can be processed line by line. I was almost sure that line-by-line processing would use less memory even before testing it. Reason: when XMLin() parses, it must store each and every element as full key and value strings (Sym => "AAPL", MatDt => "2015-01-17", Price => "101.53"), along with the other XML bookkeeping discussed in the last paragraph. When we process the file line by line, the same thing happens, but only for a micro subset of the data, a few KB per line compared to GBs for the whole file. As well, after each per-line XMLin() call returns, I can copy just the needed data into a Perl hash named %Prices, which stores only values (like "AAPL" => 101.53), a few bytes per line once processed and stored. When we move on to the next line, the previous line's structures become unreferenced and are released, so the memory doesn't grow by more than a few bytes per line.

So I changed the code to open prices.xml as a flat file and read it one line at a time in a loop. Inside the loop, I processed that line using the same XMLin() call, picked up the structure returned, stored only the needed values and stepped to the next iteration, printing the number of rows processed once every 1000 lines. This was really memory efficient: from beginning to end, the memory stayed stable and low (35 MB vs 1.8 GB):
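
A rough sketch of that loop, assuming each row is a self-contained element such as <Sym Name="IBM" MatDt="2015-01-17" Price="52.80"/> (the snippet above is abbreviated, so the attribute names are a guess):

use strict;
use warnings;
use XML::Simple qw(XMLin);

my %Prices;
open my $fh, '<', 'prices.xml' or die "cannot open prices.xml: $!";
my $count = 0;
while (my $line = <$fh>) {
    next unless $line =~ /<Sym/;                 # skip the <MktData> wrapper lines
    my $rec = XMLin($line, KeyAttr => []);       # one tiny parse per record
    $Prices{ $rec->{Name} } = $rec->{Price};     # keep only the values we need
    print "processed $count rows\n" if ++$count % 1000 == 0;
}
close $fh;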




However, the program did not finish in 6 minutes. It kept going and took 17 minutes to finish processing. Apparently we traded speed for memory. Fair enough? Maybe; at least we avoided slowing down the entire system and stalling other programs. Maybe not; there may be better ways. I started looking at the code.


somewhere in the middle:

It did not take much time for me to grasp what was happening. With this new code we call XMLin() more than a million times (i.e. once per line), whereas earlier we called it only once. Each XMLin() call requires some basic setup, initialization, stack-ups and stack-downs, and all of that added up. So I decided to balance it out with fewer XMLin() calls: I changed the code to read 1000 lines in a loop and call XMLin() once with all 1000 lines. I chose the number 1000 because the XML records are rather small and a batch fits in a few blocks of memory (considering a block is 8 KB). To make 1000 individual XML records into a single valid structure, they need an enclosing header and footer tag, so I wrapped them in a <Thou> element. So instead of every line like <Sym="IBM" MatDt="2015-01-17" Price="52.80" … … … … ….. >
being fed to XMLin(), 1000 lines were read from the prices file and fed at once like this, with first and last lines added:
<Thou >
<Sym="IBM"  MatDt="2015-01-17" Price="52.80" … … … … ….. >
<Sym="AAPL" MatDt="2015-01-17" Price="101.53" … … … … ….. >
….   1000 lines like this
<Sym="TSLA" MatDt="2015-01-17" Price="282.17" … … … … ….. >
</Thou >


And after XMLin() returned, I would run through all the elements in a loop and store them one by one (1000 times). But would this run faster? I kept my fingers crossed, ran it, and watched the memory growth again with our faithful top command:
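
A sketch of this batched version, under the same assumptions about the record layout as before; the <Thou> wrapper is the one described above:

use strict;
use warnings;
use XML::Simple qw(XMLin);

my %Prices;
open my $fh, '<', 'prices.xml' or die "cannot open prices.xml: $!";

my @batch;
my $flush = sub {
    return unless @batch;
    # wrap 1000 records in a single <Thou> element so XMLin() sees one valid document
    my $doc = XMLin("<Thou>\n" . join('', @batch) . "</Thou>",
                    KeyAttr => [], ForceArray => ['Sym']);
    $Prices{ $_->{Name} } = $_->{Price} for @{ $doc->{Sym} };
    @batch = ();
};

while (my $line = <$fh>) {
    next unless $line =~ /<Sym/;     # skip the <MktData> wrapper lines
    push @batch, $line;
    $flush->() if @batch == 1000;    # one XMLin() call per 1000 records
}
$flush->();                          # leftover records at the end of the file
close $fh;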


This time it took just 6 MB more than the line-by-line version and returned in 6 minutes 20 seconds: very close to the original run's speed and close to the line-by-line version's memory usage. Awesome. I should have stopped here, but I didn't.


Multi-threading failure:

Carried away by an obsessive compulsion to speed the program up, I spent more than a day implementing a multi-threaded version to make the parsing faster. I used threads to parse each batch of a thousand lines in parallel, with 10 threads running. It finished in just 2 minutes 27 seconds. With complicated memory handling and thread synchronization, all I achieved was about a 2.5x speed-up, or a 60% decrease in run time. But that brought memory usage back to the original level (actually it exceeded it), since every thread keeps its own memory, variables and data structures, and the code to handle multi-threading had already bloated the program. Surely not worth it. I also observed a few facts and learned a few things from this attempt, which are out of scope for this article. 
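
For the curious, a much-simplified sketch of the idea, using Perl's core threads and Thread::Queue modules and the same assumed record layout as before; the real attempt had more bookkeeping and synchronization:

use strict;
use warnings;
use threads;
use Thread::Queue;
use XML::Simple qw(XMLin);

my $queue = Thread::Queue->new;

# 10 workers, each parsing <Thou>-wrapped batches of 1000 records
my @workers = map {
    threads->create(sub {
        my %partial;
        while (defined(my $chunk = $queue->dequeue)) {
            my $doc = XMLin("<Thou>\n$chunk</Thou>",
                            KeyAttr => [], ForceArray => ['Sym']);
            $partial{ $_->{Name} } = $_->{Price} for @{ $doc->{Sym} };
        }
        return \%partial;               # cloned back to the main thread on join
    });
} 1 .. 10;

# main thread: read the file and hand out batches of 1000 lines
open my $fh, '<', 'prices.xml' or die "cannot open prices.xml: $!";
my @batch;
while (my $line = <$fh>) {
    next unless $line =~ /<Sym/;
    push @batch, $line;
    if (@batch == 1000) { $queue->enqueue(join '', @batch); @batch = (); }
}
$queue->enqueue(join '', @batch) if @batch;
close $fh;
$queue->end;                            # dequeue() returns undef once drained

# merge the per-thread results; the unbounded queue plus a private hash per
# thread is part of why memory shot right back up in this version
my %Prices;
for my $thr (@workers) {
    my ($partial) = $thr->join;
    @Prices{ keys %$partial } = values %$partial;
}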


The Harmonious end - XML::Twig

The purpose of XML::Twig, as the name suggests, is to let us process just the little twigs of a huge XML tree instead of building the entire tree. This straight away eliminates the memory issue by loading only a small tree of a few twigs for each Sym element (instead of loading the million branches of the entire XML in the reproducible manner we discussed earlier), provided we use it correctly and purge the memory after using each element. With XML::Twig, we write a function to handle a single XML element and have Twig call that function for every such element in the XML file. The code is something like this:

my $twig_var = XML::Twig->new(
    twig_handlers => { 'Sym' => \&process_one_symbol_record },  # whenever a 'Sym' element is complete, call process_one_symbol_record
);
$twig_var->parsefile("Prices.xml");
..
sub process_one_symbol_record
{
    my ($tree, $sym) = @_;                 # the twig object and the current 'Sym' element
    my $expiry_date  = $sym->att('MatDt');
    my $option_price = $sym->att('Price');
    ..
    $tree->purge_up_to($sym);              # release the memory used so far
}

Since Twig processes one element at a time, and our handler purges the memory as it goes, there is no memory problem. There is also none of the read-a-line-and-pass-it-to-XMLin() overhead we had in the line-by-line processing. This version, while using low memory, also finished faster than all the other single-threaded versions. This is the only case where both speed and memory improved.

Let me tabulate the results:

Program                 Memory hogged    % Mem vs. whole    Time taken (secs.)    % Time vs. whole
Whole file parsing      1.8 GB           100.00%            370                   100.00%
Line by line parsing    56 MB            3.11%              1125                  304.40%
1000 lines at a time    62 MB            3.44%              380                   102.71%
Multithreaded           > 1.9 GB         110.00%            147                   39.71%
XML::Twig               59 MB            3.28%              250                   67.56%


Conclusion:

  • So with 1000-lines-at-a-time parsing, we gave up just 2.71% of speed but saved 96.56% of memory. Very balanced and very satisfactory, since the program runs fast and, by saving memory, lets the system and other programs run fast too.
  • The 1000-lines-at-once version is the most balanced overall for XML::Simple, but XML::Twig is THE BEST. Reason: faster, easier and less chance of introducing bugs. 
  • Multi-threading is useful when we gather data from different resources (like parsing two different XML files), but not very suitable when a common data structure / resource is operated on by all the threads. 
Thank you for reading. Please post your comments so I can correct myself, my English and the article.

Monday, July 7, 2014

How I made a Perl program run 10 times faster using just 25% of the disk space it originally used

When I joined the company iTrade (name changed for legal reasons) as a consultant, I was asked to fix a Perl program that stopped running halfway because the data had grown too large. The program had run for 13 hours before hiccuping and collapsing after filling a Unix disk partition to 100% and suffocating for lack of disk space. This article describes how I fine-tuned the Perl program to make it both fast and space efficient, whereas in most tuning you trade one for the other.

When I spoke to the manager I understood that there were TWO problems:


  • It takes him only 10 minutes to download a 1 GB file from the FTP server to his PC, but the same file took 1 1/2 hours to download during the program's execution. There are 6-7 zip files like that, so together they took close to 10 hours.
  • The program somehow consumed the 27 GB of disk space available and stopped running. It needed more space to make further progress.


Studying the program, I was initially confused because it was doing misleading and meaningless things, but I later understood that it was simply very poorly written and I should not read too much into it. It downloads 7 GB of zip files containing several thousand documents. These documents are unzipped (extracted) into a single folder, then grouped by various business needs with filtering logic and copied into two more folders. These folders are then zipped again and sent to three different destination FTP servers.

 Speaking to the business team, I got this information:

     a) First FTP destination receives all of the documents that were present in the incoming zips.
     b) Second FTP destination gets about 25% of these documents that match specific criteria.
     c) Third FTP destination gets remaining 75% of these documents.

Analysis of Performance / speed:

Why should this Perl program take 1 1/2 hours, or 10 times longer, when running on the Unix system? Was the manager accurate when he said he could download the file in 10 minutes? I tried downloading the same file and found that he was right: it indeed took about 11 minutes to download 1 GB from the same server. But the Perl program had taken 1 1/2 hours per GB. Did the Linux/Unix box have some problem? Maybe a network or device issue? Maybe the partition where the downloaded files were kept had some disk I/O issue?

When I downloaded the same big zip file into the same folder the program used, using the Unix sftp command, I could rule out that possibility. The sftp command downloaded the file successfully in 10 minutes, i.e. 10% faster than Windows and definitely not 10 times slower. No problem with the folder or the network. So what was the problem?

Only one dimension was left: the Perl program itself. When I looked at it, I found that the FTP tool used was the Net::SFTP module. That module is written entirely in Perl, and being pure Perl it can be slow, especially when the number of packets / blocks of data received is very large. For GB-sized files, the number of blocks sent one by one may be close to 100,000. With that in mind, I searched the internet for newer modules; if I found none, I would run the Unix sftp command as a system command from Perl and then use the file. But I found a superb Perl module named Net::SFTP::Foreign, which was exactly what I was searching for. This module is blazing fast, drives the system's ssh underneath (a transport "foreign" to Perl's own implementation, hence the name), and is so robust that it provides tons of features. The most relevant of those features was the block_size option. By setting block_size to, say, 100,000 we make every FTP round trip (request, receive, acknowledge) fetch 100,000 bytes. That is, in a mere 10 round trips we can get 1 MB of data, which would need 125 round trips with the old Net::SFTP module.

When I used this Net::SFTP::Foreign module with default options, it downloaded the 1 GB file in about 11 minutes. Very good!! Then I put it on steroids by setting block_size to 100,000 and tried again. Result? It downloaded the 1 GB zip file in 6 minutes, i.e. about 170 MB a minute!! Awesome!!
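
For illustration, a minimal sketch of such a download; the host, credentials and file paths are hypothetical, and block_size is the value discussed above:

use strict;
use warnings;
use Net::SFTP::Foreign;

# hypothetical connection details
my $sftp = Net::SFTP::Foreign->new(
    'ftp.example.com',
    user     => 'itrade_user',
    password => 'secret',
);
$sftp->die_on_error("SFTP connection failed");

# block_size controls how many bytes each request/response round trip carries
$sftp->get('incoming/docs_1.zip', 'download/docs_1.zip',
           block_size => 100_000)
    or die "download failed: " . $sftp->error;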

I did not wait long; I went on to write my own flex_ftp.pm, a wrapper around Net::SFTP::Foreign with a few built-in options, flexible block_size negotiation and progress reporting. It was not easy, it took four days, but at the end I had a cool module ready to plug in. I replaced the original code with flex_ftp.pm (i.e. Net::SFTP::Foreign underneath) and tried the run. The same 7 GB download that had taken more than 11 hours completed in 43 minutes, i.e. a mere 6% of the time!!

There were other areas in the program, some loops and queries, that were fine-tuned too. They did improve, but not to the extent of bragging rights :-), so let us move to the other issue:

Analysis of disk space:

ip60s56m3> df .

Filesystem     1K-blocks       Used   Available   Use%
/dev/sdb1      138881684  131667900       45184   100%


Yes, the above is from the production machine when the program failed. Now, reiterating (a), (b) and (c) that we learnt from the business team:

     a) First FTP destination receives all of the documents that were present in the incoming zips.
     b) Second FTP destination gets about 25% of these documents that match specific criteria.
     c) Third FTP destination gets remaining 75% of these documents.

Writing these down as a table, we can see the 7 GB of zip files were extracted into individual documents totaling 8 GB, then copied into two different folders of 2 GB and 6 GB during the filtering process. These three folders are then zipped into files totaling 7 GB, 1.5 GB and 5 GB respectively and sent to their destinations.

Folder            Purpose                            Size
Download folder   Incoming zip files                 7 GB (.zips)
All_docs          To zip & FTP to 1st destination    8 GB (unzipped documents)
Harl_print        To zip & FTP to 2nd destination    2 GB (copy from All_docs)
Online_docs       To zip & FTP to 3rd destination    6 GB (copy from All_docs)
All_docs          Zipped file to 1st destination     7 GB (.zips)
Harl_print        Zipped file to 2nd destination     1.5 GB (.zips)
Online_docs       Zipped file to 3rd destination     5 GB (.zips)
Total                                                36.5 GB


Summing them up, no doubt there was no disk space to hold this 36.5 GB, and the system failed.

After this analysis, I asked more questions:
Why should we keep the incoming zip files instead of deleting them after expansion? The manager's answer: we cannot afford to lose them, since we spent 11 hours downloading them!!

But I had just implemented the new FTP module that downloads everything in about 40 minutes, so now we can afford it!! So I did further analysis and devised more techniques:


  1. Download the zip files one by one, immediately unzipping each one and deleting it after extracting its contents into the All_docs folder. At the end of this download-expand-delete process, we have only 8 GB of extracted documents inside All_docs and no zip files lying around. 7 GB of disk space saved!!
  2. Instead of copying files into the Harl_print folder, we can create links inside it (the Unix ln command). These links do not create a copy of the file on disk, so they consume no disk space; they just create a new name that points back to the original file while behaving like a consistent copy of it. So these "2 GB" of links point back to the original files in the All_docs folder. 2 GB of disk space saved!!
  3. For the Online_docs folder, we can just repeat the technique in (2), i.e. create links instead of copies. 6 GB saved!! (See the sketch right after this list.)
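
Here is a rough sketch of steps 1-3 in Perl, reusing the hypothetical $sftp handle from the earlier snippet. The folder names come from the table above, while the remote path, file names and the matches_harl_criteria() helper are made up for illustration:

use strict;
use warnings;

# Step 1: download, expand, delete -- one zip at a time.
my $zips = $sftp->ls('incoming', names_only => 1, wanted => qr/\.zip$/)
    or die "ls failed: " . $sftp->error;
for my $zip (@$zips) {
    $sftp->get("incoming/$zip", "download/$zip", block_size => 100_000)
        or die "download of $zip failed: " . $sftp->error;
    system('unzip', '-q', "download/$zip", '-d', 'All_docs') == 0
        or die "unzip of $zip failed";
    unlink "download/$zip" or warn "could not delete download/$zip: $!";
}

# Steps 2 and 3: hard links instead of copies -- zero extra disk space,
# but they only work when both folders sit on the same filesystem.
sub matches_harl_criteria { ... }       # placeholder for the real filtering rule
for my $doc (grep { -f } glob 'All_docs/*') {
    (my $name = $doc) =~ s{^.*/}{};
    my $target = matches_harl_criteria($doc) ? 'Harl_print' : 'Online_docs';
    link $doc, "$target/$name" or die "link $doc -> $target/$name failed: $!";
}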


Up to this point, we have kept the disk space used by this program to 8 GB :-). The internal business logic is complete; now it is time to zip and send:

Now we can create the zips to be transferred one by one, FTP each to its destination and delete it after the transfer. This is just the same as step (1), but sending instead of downloading; that is all. This again ensures no extra disk space is used, other than about 1.x GB for each zip while it is being created and FTPed.

So after deleting each zip once it has been sent, we are left with just the 8 GB. During the entire process we used about 8 GB (of expanded files) plus roughly 1.x GB for the zip file in flight, so the whole process completes using only around 9 GB of disk space.

This is far better than using 36.5 GB and exhausting the disk!! We efficiently use just about 25% of the disk space originally needed.

P.S.: My manager said I should not have wasted two days implementing and testing this; I should have simply asked the Admin team to add another 40 GB or so to the mount. However, he was satisfied with the download speed-up and said '6 minutes per GB is acceptable'.

-*-*-
*-*-*-*-*
-*-*-




Sunday, February 6, 2011

Ann Hardy , a mammal from Fujitsu Consulting

I met Ann Hardy, a middle aged lady working as placement agent in Fujitsu consulting. With sagging all over her body and suffering from Hernia, she looked like a plastic cover filled with water.

She had setup an interview with a few guys from Bank Of America, Broadway, NYC, after an initial telephonic screening in Oracle 11g fine tuning. In that Oracle 11g interview, the interviewer never asked me a single question in Oracle!!; rather he was asking about my experience and projects in general. I was smart enough to understand that the interview did not know anything in Oracle. But not smart enough to understand the fact that Ann Hardy was actually fooling me. The fact was : she promised me a job in Oracle 11g fine tuning and but in order to pacify me to take a testing job, she arranged a mock interview in Oracle making me to believe it was an Oracle job. Believing it was a performance tuning job, I went all the way to New York for a testing job !!.

Once in the second round of interview, I did not have any trouble in answering the questions in Networking, C++ and PERL. But there was no Oracle questions asked. The offer was made and I joined as 'performance engineer'. She told team is very flexible, and everyone work there for more than five years. After a while I understood it is merely testing and data collection job. An arrogant manager named Paul Puglia was leading the team. Steve Thomas was his manager, but originally a puppet. He is another 300lbs guy with no control ever. He does not have minimal amount of guts or courtesy to say anything.

My job is to test and send some report. Funnily, If I send report by 6:00 that is no good.. I have to wait until another team member who lives in California to return back from walking with his dog. He usually takes his dog walk at 4:00. Returns by 4:30 , which is 7:30 in NYC. I have to copy this crappy data in to excel sheet, format, under line them (wow.. great Performance engineer job ) and send him...

Funny thing is they suddenly escorted me out, because I was not 'up to the mark' in performance :-) and the beast Ann Hardy did not pay my salary.. leave alone 2 weeks notice, I had to fight with her for several months to get paid for the days I worked. She was saying I did not submit timesheet for the week (they escorted me out, then how I would be able to submit time sheet?)

The message of this blog, is nothing but :

1) Don't join Fujitsu or even Bank of America. Both are dangerous, politics companies.

2) If there are layer by layer in your job you are in for trouble. the layers are meant only to waste your time and do some tricks in time sheets, payment etc.

3)Now a days they will ask a lot of questions before offering. If they don't interview with relevant questions, better you be careful.

Tuesday, January 4, 2011

We have all seen the world's most famous photographs. As far as I am concerned, to the question "which of the photographs someone has taken in their life is the finest or the most important?", anyone would answer, "the important one is the one I am yet to take." Just as John Keats said, "Heard melodies are sweet, but those unheard are sweeter" ("Ode on a Grecian Urn"). That is human nature.

In the Philippines, a man who was both a wedding photographer and a council member took his most important photograph. He can never take a more important picture than that one; in truth, he can never take any picture at all, because he was shot dead the very instant he took that photograph. The picture he took is important because in it he captured the very people who shot him.





His name was Reynaldo Dagsa. He lived in a small city called Caloocan City in the Philippines. It appears he may have been shot dead because he had taken tough action against local thugs. Using the photo, the police caught the killers within two days. Though the shooter's face is partly hidden by the gun, the other killer's face is clearly visible. For their safety, the faces of the Dagsa family members have been blurred in the photo.


Caloocan city, north of Manila, Philippines, Tuesday, Jan. 4, 2011.
Dagsa, a local councilman and a wedding photographer, was taking pictures of his family outside their home to welcome the new year early Jan. 1, when he slumped on the pavement from an assassin's bullet. His last frame showed clearly the alleged assassin, Frederick Sales, aiming his pistol behind smiling members of his family. Police said Sales and another man, who acted as a lookout, were arrested Monday

Wednesday, March 3, 2010

how to fix HP Laptop Blank Screen

Suddenly my HP dv6000 series laptop got a black/blank/dark LCD screen.

I made a few attempts to fix it, but could not succeed:
checked Fn->video out, pressed the power switch repeatedly, disconnected power, closed/hibernated/Fn->snooze and reopened. Nothing worked. No. Nada.

This is caused by a saved state: the computer goes into a mode, and when you switch it off and wake it up, it starts and returns to the same state/mode. This dv6000 has 'snooze on battery drain' enabled, so disconnecting power won't help either. So the solution is to reset it without letting it smoothly turn itself off.

What I did to fix it: after disconnecting from power, I JUST pulled the battery out from the back, crashing the system. Then I put the battery back and switched it on.

When I started it again it asked "where am I? what state was I in?" (actually it asked whether to start in 'normal mode' or go into BIOS setup, etc.)

Hope it works for you.

Saturday, December 12, 2009

USA - INDIA - ISLAM

It is becoming very common for Indian magazines to write satires about US , their economy, imperialism and culture and even their presidents in one page and writing something pro-Islamic on the other page. Is this trend fair ? Who is friend to India ? US or Muslims ?

US is the worlds biggest democratic country. India is the biggest in paper, but when it comes to human rights, human personal space, liberation US is THE DEMOCRATIC. Ever a layman can sue a police constable in India? In US it is possible. Can a Indian Chief minister illegally marry several women? yes. we know they have done it. No.. Nay...Not possible in US. Few months back, In Malasia, building a temple in a land bought by Hindus, was stopped by some fundamentalist groups. Such will never happen in US. US is true democratic, truly liberated country. Indian magazines worry that US pushes army to Afghanistan and middle east. But do they know, if US stops serving in Afghan, in a month, a thousand terrorists will be jobless and infiltrate from Afghan into Indian border most likely into Kashmir. There is no other country in this world that helps India, more than US. Still magazines write badly about US? WHY ?

Islam is one of the religions in India. Muslims are Indian citizens like Hindus, Christians, Sikhs and several others and they are not superior to another. But most of the magazines write about Muslims in a pitiable manner somehow attempting to create sympathy. Commonly found are : Poverty among Muslims, Unemployment among Muslims, prejudice, their sacrifice for our nation, suppression on Muslim and how all these make a 'terrorist' from a 'good Muslim'. Also plan to convert them back to 'good citizens' are already implemented as effective laws : Reservations in Engineering Colleges, Medical Colleges, Law schools, etc etc etc. Gift them further with Scholarship ( how else one poor muslim will be able to come out with flying colors just with a seat for him in University?) . For non-students ? Fund them Loans for their business. For older citizens ? Fund $1000 for their Haj Travel. This ideas will just work as nicely as they have been working in Kashmir for last 30 years which calls for separation now.

Wait... give me a minute... I saw few faces in the same magazines cover.. yes, It was ex. President of India, APJ Abdul Kalam... and previous Indian youths' religions' captain (sorry Crickets captain ) Mohammed Azaruddhin.. and Sharuk Khan, Salman Khan... A.R.Rehman.... and... Are these not Muslims?


The fact is : Muslims are living safer and successful in India, than in any other Islamic nation. They have more rights in India. India is so flexible for them : They can marry four women, while a Hindu can marry one. They have better opportunities in Govt, Schools and everywhere possible. There are well educated Muslims, Working as Judges in apex court of India, leading India in politics, Surgeons, small businesses, Industries, sports . They are Indians and We should respect them, love them. They should join in Indian river as a stream and run along with us flourishing the riverbed. Not against the great Indian river. It is DUTY of our magazines to explain the Muslims that they are our people, they are doing well and the put fact straight: India is giving them great opportunities. But the magazines never seem to do that. Rather they stimulate and help astray the innocent minds of Islamic citizens. WHY ?

I don't know the answer... but seems to me that, even our Indian history itself puts all wrong. History books always write that Gandhi is father of India, Mohd. Ali Jinnah is father of Pakistan. Please write it plain truth in our school books : Mohd. Ali Jinnah was against Indian Independence since he did not want a non-muslim to rule a country consisting of muslim people or whatever reason. He openly called for havoc unless partition. He challenged, planned and executed mass murder and rape that affected millions of Hindus until agreed for partition. Why our history teachers won't teach this in classes?


Looking at US at same time, 70 years back : US presidents are great. Lincoln is considered as the greatest ruler ever lived on the earth. But there is one greater than him from India's perspective : Franklin D Roosevelt, one of the greatest presidents of US, helped India to get independence. He forced Winston Churchill to sign a pact, that British should free their colonies and give independence, for US to help them in WWII ; And we know the biggest colony of Britain is 'British India'. Mr. Roosevelt, could have just said 'well Mr.Churchill, just forgo all the claims on our land, India is not my business'. He did not. He had higher purpose that is rare among Indian leaders. He spoke for colonies too. However, Roosevelt did not secure any good to India. He gave up and agreed Winston Churchill's stance. i.e. The independence is applicable to Whites and Middle east. No white should be slave of others. When it comes to Africans or Indian, ok.. ups and downs are common.. thou shalt be slaves..

Why our history books in school don't say at least Roosevelt had a better interest than Jinnah?


I think it was Nehru and Rajendra Prasad who controlled these publications. They screwed up everything.. There is no point in repenting on spilled milk. But we can clean it. We should send petitions to education minister to teach truth, true history on our books. That will not humiliate any religion. It will make them remember the truth that they were not victims and mould them as good citizens acknowleging the past. We have to tell them all those Muslims stayed in India are good patriots and we love each other. At the young minds, we have to etch and ensure that they will never be astrayed by fundamentalist preachings. We should make all Indian children to be patriotic and responsible.


-Kargil Jay

Sunday, January 13, 2008

Undoing the Polarizer



A polarizer is a glass filter that cuts off light from one particular plane. It is generally called a 'polarizer' or a 'polarizing filter'.


It can be used in photography to avoid reflections that cause the original colors to be washed out by bright light. In situations where sunlight produces harsh, white-washed output, the polarizer can be handy.

The right situations to use a polarizer:

1) Window glass reflections, where the window's reflection of objects on the photographer's side becomes obtrusive in the photo.
2) Photos of clear ponds or streams, where the sky's reflection creates haze or a glittering surface, or when you want to photograph fish or alligators under the water.
3) Deeply colored objects in sunlight, say a rose or a morning glory. The flower's petals can reflect the sunlight harshly, creating a lighter, less saturated surface at some angles. A polarizer can cut off that reflection, resulting in well saturated, deeply colored photos.

In the above situations the polarizer cuts off the reflection in one plane and lets the photographer avoid the unwanted light. The example below illustrates how to cut off the reflection of the blue sky. The first photo was taken with the bare lens; the second one using the polarizer correctly*:

( both photos by Kargil Jay in Blue Bell, Philadelphia; fall 2010, copyright )





Though the polarizer did its job perfectly, removing the blue sky's reflection especially from the bottom half of the photo, there was neither a fish nor white sand underwater to highlight the effect.

* Correctly: fitting the polarizer to the front of the camera alone is not going to help. We have to rotate the filter correctly relative to the angle of the sunlight (or any light), because a polarizer (any polarizer) can cut off only one plane of light. If you look at the inner side of the polarizer's rim you will see two 'cut marks' opposite each other. If you sketch a straight line between them, like a diameter, that is the plane of light which is cut off by the polarizer; draw more lines parallel to it on either side, and all these lines define the plane whose light is filtered away. To find these poles you can do this: hold the polarizer in front of your laptop's LCD and look through it while rotating the filter. At one point the polarizer will totally black out the screen; that is the polarizing plane. In the polarizer photo shown at the top of this page, you can see the 'cut mark' on the inner rim at the 2 o'clock position; that defines the polarizing plane. When you shoot, that cut mark should be parallel to the incoming light's plane in order to cut off that light.

Leaving the polarizer on the camera forever is not a good idea except in bright situations; use it when you need it. Sometimes undoing the polarizer can be useful, especially when reflections add elements to the photo. In the photo below, I removed the polarizer because the reflections are the sole subject:



Notes:
1) A circular polarizer does not mean 'several planes'. All polarizing filters are meant to filter only one plane of light.
2) Small point-and-shoot cameras will not fit any polarizer unless screw threads are present on the front of the lens.
3) Polarizer filters come in different widths, so we must choose 52 mm, 64 mm or whatever the camera's thread size is before purchasing. The one shown above is 72 mm, suitable for SLR lenses only.
4) Generally all polarizers work decently and any decent brand is OK.