Architecture of Flickr & LiveJournal

Today I found a really interesting presentation on Flickr’s backend. You can download it here, but I warn you, this is hardcore techie stuff! There are also two interesting presentations on LiveJournal’s backend here and here. This material is incredibly useful to us because it gives us guidance on engineering best practices for our own backend.

Information architecture is the core of our business, and I spend 60% of my time (at Adoos) thinking about how to structure and organise information. Architecture is what makes the application robust and scalable. How do I make sure this ship doesn’t sink when we try to grow from 1 million visitors to 50 million?

What I would also love to see is a diagram of the architecture of AdWords and Google Search, but unfortunately that’s not possible…


FLICKR’S BACKEND

Flickr is possibly the best online photo management and sharing application in the world. The application is built with PHP, Apache, and MySQL, using both the InnoDB and MyISAM storage engines.

Backend DNA:
• PHP 4
• Smarty for templating
• PEAR for XML and Email parsing
• Perl for controlling…
• ImageMagick, for image processing
• MySQL (4.0 / InnoDB)
• Java, for the node service
• Apache 2, Red Hat, etc.

Some Numbers:
• One programmer, one designer, etc.
• ~60,000 lines of PHP code
• ~60,000 lines of templates
• ~70 custom Smarty functions/modifiers
• ~25,000 DB transactions/second at peak
• ~1000 pages per second at peak
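
A quick back-of-the-envelope check on the last two figures shows how database-heavy each page is. This is only a rough sketch: it assumes the two peaks happen at the same moment, which the presentation does not state, and the variable names are my own.

```python
# Peak figures quoted in the list above.
db_transactions_per_second = 25_000
pages_per_second = 1_000

# Assumption: both peaks occur at the same time, so the ratio is meaningful.
transactions_per_page = db_transactions_per_second / pages_per_second

print(f"~{transactions_per_page:.0f} DB transactions per page at peak")
```

If the assumption holds, every page served costs roughly 25 database transactions, which is why the database layer gets so much attention when scaling.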

Application Architecture:

[Diagram: Flickr application architecture]

LIVEJOURNAL’S BACKEND

LiveJournal is a free blogging service. They’ve written an excellent presentation on how to scale an open-source web service from 1 server to 100+ servers.

Backend DNA:
– 100+ servers
– Linux, Debian
– Apache, Perl
– MySQL (InnoDB & MyISAM)
– Perlbal – open-source HTTP proxy
– memcached – open-source distributed caching system
– MogileFS – open-source distributed file system
– Nagios, Cricket
– BIG-IPs
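
To make the "distributed caching" idea behind memcached concrete, here is a minimal sketch of the usual read-through pattern: look in the cache first, and only hit the database on a miss. Everything in it is an illustration rather than LiveJournal's actual code. `CacheNode`, `ShardedCache`, `load_user_from_db` and `get_user` are hypothetical names, plain Python dicts stand in for real memcached servers, and the simple hash-modulo key placement is an assumption.

```python
import hashlib


class CacheNode:
    """In-memory stand-in for one cache server (hypothetical, not memcached)."""

    def __init__(self, name):
        self.name = name
        self.store = {}


class ShardedCache:
    """Spreads keys across several nodes by hashing the key."""

    def __init__(self, nodes):
        self.nodes = nodes

    def _node_for(self, key):
        # Assumption: simple hash-modulo placement of keys on nodes.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def get(self, key):
        return self._node_for(key).store.get(key)

    def set(self, key, value):
        self._node_for(key).store[key] = value


# Counts how often the (pretend) database is actually queried.
DB_READS = {"count": 0}


def load_user_from_db(user_id):
    """Hypothetical slow database lookup."""
    DB_READS["count"] += 1
    return {"id": user_id, "name": f"user{user_id}"}


def get_user(cache, user_id):
    """Try the cache first; fall back to the database on a miss."""
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:
        user = load_user_from_db(user_id)
        cache.set(key, user)
    return user


cache = ShardedCache([CacheNode("cache1"), CacheNode("cache2")])
get_user(cache, 42)  # miss: reads the database and fills the cache
get_user(cache, 42)  # hit: served from the cache
print(DB_READS["count"])  # 1
```

The second lookup never touches the database, which is the whole point: the more requests the cache absorbs, the fewer queries reach MySQL.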

Some Numbers:
– 50+ million dynamic pageviews/day.
– 5+ million accounts.
– 30 GB of cached data.
– ~100,000 queries per second at peak.
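
To put the daily pageview figure on the same scale as the per-second numbers, it helps to convert it to an average rate. This is just arithmetic on the figure above. Traffic is not spread evenly over the day, so the real peak will be higher than this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figure quoted in the list above (a lower bound, since it says "50+").
pageviews_per_day = 50_000_000

average_pageviews_per_second = pageviews_per_day / SECONDS_PER_DAY

print(f"average: ~{average_pageviews_per_second:.0f} dynamic pageviews/second")
```

That works out to roughly 579 dynamic pageviews per second on average, around the clock.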

Server architecture diagram:

[Diagram: LiveJournal server architecture]
