
Wednesday, October 10, 2007

Interesting Numbers {Nov 2007}

Some interesting numbers

  • India added 8.03 million mobile customers in July 2007.
  • Foreign exchange reserves touched $229 billion.
  • India employed 5,62,000 people in IT services, 5,45,000 in ITES, and 1,44,000 in engineering services in 2006.
  • Indian life expectancy has increased from 31 years in 1947 to 64 years in 2005, thanks to inexpensive antibiotics.
  • In India, there are 20 annual air trips per 1,000 people, while the USA has 2,300 and Sri Lanka has 30.
  • Average pay in the government and private sectors in India is Rs 2,45,745 and Rs 5,65,214 respectively.
  • 2 million of the 40 million people working in managerial jobs left USA due to a supposed 'hidden' bias in dealing with foreigners.
  • Hon Hai Precision Industry, which manufactures laptops, mobiles and MP3 devices (including the iMac, iPhone and iPod), employs 4,50,000 people across the world!

Tuesday, October 2, 2007

What the Web’s most popular sites are running on

TechCrunch, FeedBurner, iStockPhoto, YouSendIt, Meebo, Vimeo and Alexaholic.

These are some of the most popular websites on the Internet. You have heard about them, you have read about them and you have most likely used or visited at least one of them. But how often have you read about what these websites are actually running on? This article dives into the facts and figures about the underlying hardware and software that keep these sites running smoothly in spite of their massive popularity.

Pingdom performed a survey of these seven “super sites” that focused on web, database and file server numbers and setup, operating systems, bandwidth usage, network hardware and other technical questions relevant to maintaining a site. For those who are interested in the nitty-gritty details there is a PDF matrix with the survey results attached to this article.

The variety of websites in the survey gives a good cross section of different kinds of setups. They all represent the crème de la crème in their respective categories, including blogging and blogging tools, stock photo libraries, file sharing, instant messaging, video sharing, and web statistics.

Common trends

Though statistically these seven sites constitute only a small drop in an ocean of websites (there are more than 100 million domain names on the Internet), in many respects they show a surprising consistency in their choices of underlying technology, usually with a strong bias towards open source. These are some of the common trends we found during the survey.

Penguin the most popular server animal

Linux rules the game with these sites. All except one use Linux exclusively, with Alexaholic being the standout since it’s hosted on Windows. Not a single site uses the otherwise popular FreeBSD operating system.

“Linux was selected for multiple reasons,” says Jan Mahler, network operations manager at YouSendIt. “It has a proven track record in scaling, open source code to allow for altering code as necessary, price, excellent support if necessary and ease of finding talent to support and maintain it.”

Similar sentiments are shared by all of the companies in the survey that use Linux.

“Initially, the fact that the software stack was free (as in beer) had a major influence on our decision,” says Brent Nelson, senior systems administrator at iStockPhoto. “But moving forward, standardness and supportability started becoming major factors. Using the big-name Linux distributions gives us support with big-name hardware vendors, and vice versa. Commercial solutions for backup and site acceleration are supported under Linux on x86.”

Apache serves the most pages

With Linux hosting comes the common use of the Apache web server. It’s by far the most deployed web server on the Internet with a 58.7% market share (Netcraft), so it’s only natural that it would also be used by a majority of the sites in the survey.

However, even though it’s the behemoth in the web server market, Apache is slowly losing ground to competing platforms such as Microsoft’s IIS and up-and-comers like Lighttpd, at least according to data from companies such as Port 80 and Netcraft.

MySQL dominates the databases

With open source ruling the game it shouldn’t come as a surprise that the database of choice for all but one of the sites is MySQL, the ultra-popular Swedish open-source database.

“The features that you get for free on MySQL, with replication, in-memory and fault-tolerant databases (if using MySQL cluster), transaction support, and the wicked performance, cost thousands of dollars with other database engines,” says Joseph Kottke, director of network operations at FeedBurner.

These sentiments are echoed by the other participants in the survey as well.

“We needed something proven, flexible and low-cost,” says Simon Yeo, director of operations at Meebo. (His alternative title is “ops guy.” Other laid-back titles at Meebo include “marketing dude,” “server chick,” and “Mr. Sparkle.”)

These sites are far from alone in favoring MySQL. According to the MySQL website it’s the fastest-growing database in the industry, with more than 10 million active installations and 50,000 daily downloads.

PHP rules server-side scripting

Just as Apache is the most common web server software, PHP rakes in another “win” for open source when it comes to server-side scripting languages. PHP has been the most popular server-side scripting language for years and will probably remain so for some time, despite the hype around Ruby on Rails and other frameworks and scripting languages that are growing in popularity.

As of November 2006, there were more than 19 million websites (domain names) using PHP.

Clustering for reliability and performance

Clustering servers can improve both availability and performance, and can help with load balancing. Five of the seven sites use clustering for their web servers, and four of them use it for their database servers.

It should be noted that since TechCrunch only uses one web server and one database server, it can’t, and doesn’t need to, do clustering. In other words you could say that five out of six sites with multiple web and database servers in this survey use clustering.
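As a rough illustration of how a cluster can help with both load balancing and availability, here is a minimal sketch of round-robin request distribution that skips servers marked as down. It is not taken from any of the surveyed sites; the class name `RoundRobinPool` and the server names are hypothetical.

```python
class RoundRobinPool:
    """Hypothetical pool that hands out servers in turn, skipping unhealthy ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        # Take a failed server out of rotation.
        self.healthy.discard(server)

    def mark_up(self, server):
        # Put a recovered server back into rotation.
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once, starting from the rotation pointer.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


pool = RoundRobinPool(["web1", "web2", "web3"])  # made-up server names
first_round = [pool.pick() for _ in range(3)]
pool.mark_down("web2")
after_failure = [pool.pick() for _ in range(3)]
print(first_round, after_failure)
```

With a single web server, as in TechCrunch's case, there is nothing to rotate between, which is why clustering does not apply there.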

Going against the grain

Just as the sites have some things in common, there are those that stand out from this (admittedly small) crowd with slightly more unconventional software choices.

Meebo with Lighttpd

Even though it runs on Linux servers, Meebo has avoided the use of Apache in favor of Lighttpd (pronounced “lighty”), a smaller and more lightweight web server with better performance than Apache.

“Lighttpd tends to work really well with AJAX-based sites like ours,” says Simon Yeo.

Incidentally, Lighttpd is also used by giants such as YouTube and Wikipedia, and is also very popular with the Ruby on Rails community.

Lighttpd may only have a small percentage of the web server market right now, but it is growing extremely fast and will be a serious challenger down the line. Netcraft data from February shows a jump from 170,000 websites to 700,000 websites in just a month. That’s roughly a fourfold jump (an increase of a little over 300%) in a very short time frame.
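A quick check of the arithmetic behind those Netcraft figures, as a small sketch (the function names are ours, for illustration only). It separates the growth factor, meaning how many times larger the new figure is, from the percentage increase, meaning how much was added relative to the starting figure.

```python
def growth_factor(old, new):
    """How many times larger the new value is than the old one."""
    return new / old


def percent_increase(old, new):
    """Increase expressed as a percentage of the old value."""
    return (new - old) / old * 100.0


old_sites, new_sites = 170_000, 700_000  # figures quoted in the article
print(f"growth factor: {growth_factor(old_sites, new_sites):.2f}x")
print(f"percent increase: {percent_increase(old_sites, new_sites):.0f}%")
```

The two measures always differ by exactly 100 percentage points, which is an easy thing to mix up when quoting growth numbers.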

Alexaholic with Windows, IIS and MS SQL Server

Alexaholic is the only website in the survey that doesn’t run on Linux servers. It uses ASP.NET 2.0 on Windows with Internet Information Server (IIS) together with MS SQL Server. Ron Hornbaker, the man behind Alexaholic, cites his familiarity with the .NET platform as the main reason for his choice of platform.

“I’m most comfortable coding with C#.NET, and this was a personal project,” he says.

Ron Hornbaker built the first version of Alexaholic in just one (admittedly intense) weekend, which can definitely be seen as proof that the ASP.NET environment can be very productive.

As noted earlier in this article, IIS is actually gaining ground on Apache according to several sources, so even though Alexaholic is the only site in this survey to use Windows and IIS, it has plenty of company, with 31.1% of the Internet’s websites hosted on IIS (Netcraft).

Server-side Java with Apache Tomcat

Both TechCrunch and FeedBurner run Apache Tomcat for Java servlet support. Even then, the ever-popular PHP is still used as well. (TechCrunch uses the WordPress open-source blog software, which uses PHP and MySQL.)

In addition to this, FeedBurner also uses Perl, the previous champion of CGI scripts.

Different needs = different setups

Blogs deliver mostly static content, and even very popular ones can run very well with just a couple of servers, as is the case with TechCrunch. Alexaholic, in spite of delivering massive amounts of statistics, can get away with two web servers and two database servers since it can pull a lot of its data directly from Alexa. Add more dynamic content, or streaming content, and the game changes.

In addition to their web and database servers, YouSendIt has 170 file servers split between the U.S. east and west coast just to deliver files. Vimeo has 100 content delivery servers for the sole purpose of streaming video. Meebo has more than 40 web servers to handle their AJAX-based messaging application, and FeedBurner uses 70 web servers and 15 database servers to, pardon the pun, feed its feeds. It even has a replicated second site with just as many servers.

The greatest technical challenge

“The greatest challenge was finding the most efficient ways to locate hotspots and bottlenecks in the application,” says Joseph Kottke (FeedBurner). “Once we came up with a loose methodology for locating problems, the analysis became very easy. Detailed monitoring was crucial in this, keeping track of disk, CPU and memory usage, slow database queries, handler details in MySQL, etc.”

There seems to be a general agreement that it’s a difficult challenge to scale a website gracefully as the number of visitors grows. It’s also extremely important if you don’t want to lose your momentum. The Web is fickle, and if your service doesn’t perform and deliver the goods, users will soon go somewhere else. For FeedBurner, and any site for that matter, it’s extremely important to know where your bottlenecks are, a sentiment echoed by Brent Nelson (iStockPhoto) and the other participants in the survey:

“Pretty much every aspect of the site has been a bottleneck at some time,” he says. “Database servers, PHP sessions, web server load due to PHP execution, network, storage systems have all caused performance issues in the past. No single approach would be appropriate for every decision - and previous decisions need to be re-evaluated as you scale to the next level.”

There is another aspect of scaling that also comes into play, and that is functionality. How much functionality can you add before it starts to confuse your users?

“It’s a challenge to balance simplicity with functionality,” says Ron Hornbaker (Alexaholic). “I’m always tempted to throw more features in, but sometimes less is more.”

The big money sinks

The general consensus is that bandwidth costs account for a large share of the operating expenses. Several of the sites stated that bandwidth is the single most expensive aspect of running their site, but the more hardware you have, the more server costs, power consumption and co-location hosting also become significant expenses.

Joseph Kottke from FeedBurner sums it up as he comments on FeedBurner’s main operating expenses: “Hosting costs: Cages and cabinets, and power. Sweet, precious power.”

The power costs are a real problem for anyone with a lot of hardware. Google, operating a server park with more than 200,000 servers, has long been lobbying for server hardware with more efficient power consumption.

Final thoughts

As could have been guessed, LAMP (Linux, Apache, MySQL and PHP) is by far the most common setup of the surveyed websites. The dominance is far from total, though, and the elements of LAMP are all challenged by alternatives that are growing in popularity.

It’s worth noting that even among the alternatives, most are open source. Open source fits well with the Internet’s focus on standardization, more so now than ever before.

It’s also interesting that most of the companies motivated their choice of technology partly with the word “familiarity.” They used the technology they were most comfortable with, all the way from the one-man project Alexaholic to the larger sites in the survey.

“iStockphoto grew out of a web development and hosting company,” says Brent Nelson. “We used PHP, MySQL and Linux to build client sites, so it was a logical choice to use it to build our own.”

Since it’s human nature to stick with what we know and what we feel comfortable with (which is often a smart choice, productivity-wise), any new technology needs to be significantly better and significantly more comfortable than the existing market leaders before the majority of people will switch.

Pingdom intends to do a similar survey every year to follow the technical trends and provide insights into the workings of massively popular websites such as these.

We would like to thank all of those who participated in the survey for being so helpful and generous in providing information and insights about the operation of their websites.

Learn and be inspired by these companies and individuals. If you have a blog with the hopes of becoming big, you may not need that much in terms of hardware. But if you’re building a web app that’s going to handle a million file transfers daily, take a look at YouSendIt and prepare for some serious investments.

How the survey was made

The seven participants all responded to a set of 28 survey questions (all responses available in the PDF matrix) plus a number of follow-up questions about their website infrastructure where they could further explain their choices.


[source]

Google availability differs greatly between countries

September 26, 2007

Google Search users in the United States are 10 times more likely to encounter a problem than users in Brazil, according to this unique one-year survey from Pingdom.

Google has a large number of localized versions of their Google Search homepage. We have monitored the uptime of Google Search for 32 different countries during a whole year to see how they perform when it comes to availability (uptime).

The website with the most downtime was the Swedish Google Search (www.google.se) which was unavailable a total of 48 minutes. The website with the least downtime was the Brazilian Google Search (www.google.com.br) which was only unavailable for a total of 3 minutes.

In other words, Google Search users in Sweden are 16 times more likely to encounter a problem than Google Search users in Brazil.

The American Google Search (www.google.com) ended up in position 26 out of 32, with ten times more downtime than the Brazilian Google Search.

Downtime measured from September 1, 2006 to September 1, 2007

Country                URL                  Downtime (minutes)   Uptime over a year
Brazil                 www.google.com.br    3                    99.999%
Netherlands            www.google.nl        11                   99.998%
India                  www.google.co.in     12                   99.998%
Thailand               www.google.co.th     13                   99.997%
Japan                  www.google.co.jp     15                   99.997%
Canada                 www.google.ca        16                   99.997%
Mexico                 www.google.com.mx    16                   99.997%
Egypt                  www.google.com.eg    16                   99.997%
Chile                  www.google.cl        17                   99.997%
France                 www.google.fr        19                   99.996%
Greece                 www.google.gr        19                   99.996%
United Arab Emirates   www.google.ae        20                   99.996%
United Kingdom         www.google.co.uk     20                   99.996%
Poland                 www.google.pl        20                   99.996%
Argentina              www.google.com.ar    21                   99.996%
Hong Kong              www.google.com.hk    22                   99.996%
Spain                  www.google.es        22                   99.996%
Italy                  www.google.it        22                   99.996%
Belgium                www.google.be        22                   99.996%
Switzerland            www.google.ch        22                   99.996%
Australia              www.google.com.au    26                   99.995%
Romania                www.google.ro        27                   99.995%
Saudi Arabia           www.google.com.sa    27                   99.995%
Malaysia               www.google.com.my    28                   99.995%
Germany                www.google.de        29                   99.994%
United States          www.google.com       31                   99.994%
China                  www.google.cn        34                   99.993%
Israel                 www.google.co.il     34                   99.993%
Turkey                 www.google.com.tr    40                   99.992%
Singapore              www.google.com.sg    46                   99.991%
Taiwan                 www.google.com.tw    46                   99.991%
Sweden                 www.google.se        48                   99.991%

The average downtime of these 32 websites is 23 minutes in a year.

It is interesting that several countries not traditionally associated with a good internet infrastructure ended up in top positions with very little downtime. Notable examples are Brazil, India, Thailand and Mexico.

Sweden on the other hand has a reputation for being one of the forerunners on the internet, so it is very surprising to find the Swedish Google Search at the bottom of the list.

Even though this survey contains some surprising results, none of the monitored websites passed below 99.99% uptime, which has to be considered extremely good, even for a company with the resources of Google.
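For readers who want to check the figures, here is a small sketch of how downtime minutes translate into an uptime percentage and into the "times more likely" comparisons. It assumes a 365-day monitoring window, which matches the September 1, 2006 to September 1, 2007 period; the function names are ours, not Pingdom's.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day monitoring window


def uptime_percent(downtime_minutes, period_minutes=MINUTES_PER_YEAR):
    """Uptime as a percentage of the monitoring period."""
    return 100.0 * (1 - downtime_minutes / period_minutes)


def relative_downtime(a_minutes, b_minutes):
    """How many times more downtime site A had than site B."""
    return a_minutes / b_minutes


# Downtime in minutes, taken from the survey figures quoted in the article.
downtime = {"www.google.com.br": 3, "www.google.com": 31, "www.google.se": 48}

for url, minutes in downtime.items():
    print(f"{url}: {uptime_percent(minutes):.3f}% uptime")

print(f"Sweden vs Brazil: {relative_downtime(48, 3):.0f}x the downtime")
print(f"USA vs Brazil: {relative_downtime(31, 3):.1f}x the downtime")
```

Because a year contains more than half a million minutes, even the worst result in the survey stays above 99.99% uptime.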

Survey Data in an Excel file

[source]

Microsoft's AntiSpyware Tool That Removes Internet Explorer

Many Microsoft Windows users who downloaded the recently released AntiSpyware program from Microsoft, or had it installed through an automatic Windows update, woke up to a surprise. Unintentionally, the heuristics of the software detected Internet Explorer as spyware, and removed the program from their systems.

Microsoft has pulled the program from its website until the problem can be corrected. Elias Weatherbee, a Microsoft representative, said the program was "only in beta" and that "a fix was forthcoming."

"It shows how powerful our AntiSpyware program is," said Weatherbee. "Not only is it able to remove spyware from the system, but also the source of most spyware. Our competitors can't match that."

A representative from Lavasoft, which sells Ad-Aware, another spyware removal program, complained that Microsoft was using its monopoly and knowledge of the operating system to "offer features that others can't match."

"Tough shit," said Weatherbee.

Many computer users did not view this new "feature" positively. "I tried to check the weather this morning and all my little blue 'e' icons were missing. I couldn't get to the Internet at all. I guess I'll have to get a new computer," said Windows XP user Graham Newton.

Users of alternative browsers were happy to see Internet Explorer gone. Thad Freeman of the Mozilla Users Group said, "I've been trying for years to get rid of Internet Explorer. I never imagined that Microsoft would do it for me. I'm ecstatic."

Microsoft technical support was advising customers to reinstall Windows to regain Internet access and to disable automatic updates.

Symantec Antivirus Research reported that virus sightings were down by 95% this morning.

[source]

MS Windows Vista service pack that will install XP

Redmond, WA – In response to customer demands Microsoft announced that instead of patching bugs and improving features of Windows Vista in the next service pack release, they would just install XP.

"We're focused on giving the customer what they want, and what they want is to just go back to XP," said Microsoft Development Chief Greg Elston.

Elston said not only will the move improve customer satisfaction with Vista, but will allow the company to focus resources on the next operating system instead of the flailing Vista.
"We can move people off of Vista development now, and move them to Windows 7 development," said Elston.
"That should allow us to only delay Windows 7 by thirteen months past its scheduled date instead of the planned eighteen."

Customers have had many complaints about Vista, so it wasn't surprising the response to the move was mostly positive.
"Ever since I installed Vista I've wanted to go back to XP," said Trey Sportia.
"I'm glad Microsoft has given me an easy downgrade path."

[Source]


Saturday, September 1, 2007

GSLV rocket lifts off from Sriharikota

The Geosynchronous Satellite Launch Vehicle-F04 lifted off from the Sriharikota launch range with India's latest communications satellite INSAT-4CR at 1821 IST on Sunday.

The launch had earlier been put off twice: the countdown was halted three seconds before blast-off, and the launch was then rescheduled to 1821 IST.

The countdown had stopped due to a "technical snag in parameters related to the launch", sources in the Indian Space Research Organisation said.

The GSLV will put into orbit the INSAT-4CR, which carries 12 high-power Ku-band transponders for direct-to-home television services, video picture transmission and digital satellite news gathering.

The last launch of the GSLV in July 2006 ended in failure, with the rocket falling into the Bay of Bengal shortly after blasting off from the Sriharikota launching range.

Saturday, August 25, 2007

Bomb Blasts in Hyderabad, India

At least 42 people, including five women and seven students, were killed and 50 injured in two explosions at a crowded park and a popular eatery here last evening, three months after the Mecca Masjid blasts, police said today. The weekend outing at the popular Gokul Chat shop in the Kothi locality turned into a tragedy when a deafening explosion ripped through it, killing 32 people and wounding 21, they said.

Five minutes earlier, 10 people, most of them from outside the state, were killed and 29 injured in another blast at an open-air auditorium in Lumbini Park, near the state secretariat in the heart of the city, while a laser show was underway, they said. The explosion occurred around 19:40 hrs on 25 August 2007. The blast at the auditorium, where 500 people were present, was so powerful that some bodies were flung into the air. Among the dead at Lumbini Park were two students from Ahmedabad. Four Railways employees are among those killed in the blasts. The condition of some of the injured was stated to be serious, police said. Have a look at the coverage of the tragedy published by Eenadu [Telugu daily].