Sunday, March 29, 2015

TRAI seeks comments on Net Neutrality in India

The Telecom Regulatory Authority of India (TRAI) has published a consultation paper asking stakeholders to comment on important questions related to net neutrality. When it comes to net neutrality, all of us are stakeholders as users of the open and transparent Internet, so we should all submit our views on these important questions. Please read the document (it's long, but very informative) and send in your responses by 8th April, 2015.

Here are the questions asked:

Question 1 
Is it too early to establish a regulatory framework for OTT services, since internet penetration is still evolving, access speeds are generally low and there is limited coverage of high-speed broadband in the country? Or, should some beginning be made now with a regulatory framework that could be adapted to changes in the future?

Question 2
Should the OTT players offering communication services (voice, messaging and video call services) through applications (resident either in the country or outside) be brought under the licensing regime?

Question 3
Is the growth of OTT impacting the traditional revenue stream of TSPs? If so, is the increase in data revenues of the TSPs sufficient to compensate for this impact?

Question 4
Should the OTT players pay for use of the TSPs network over and above data charges paid by consumers? If yes, what pricing options can be adopted? Could such options include prices based on bandwidth consumption? Can prices be used as a means of product/service differentiation?

Question 5 
Do you agree that imbalances exist in the regulatory environment in the operation of OTT players? If so, what should be the framework to address these issues? How can the prevailing laws and regulations be applied to OTT players (who operate in the  virtual world) and compliance enforced? What could be the impact on the economy?

Question 6
How should the security concerns be addressed with regard to OTT players providing communication services? What security conditions such as maintaining data records, logs etc. need to be mandated for such OTT players? And, how can compliance with these conditions be ensured if the applications of such OTT players reside outside the country?

Question 7 
How should the OTT players offering app services ensure security, safety and privacy of the consumer? How should they ensure protection of consumer interest?

Question 8 
In what manner can the proposals for a regulatory framework for OTTs in India draw from those of ETNO, referred to in para 4.23 or the best practices summarised in para 4.29? And, what practices should be proscribed by regulatory fiat?

Question 9 
What are your views on net-neutrality in the Indian context? How should the various principles discussed in para 5.47 be dealt with?

Question 10 
What forms of discrimination or traffic management practices are reasonable and consistent with a pragmatic approach? What should or can be permitted?

Question 11 
Should the TSPs be mandated to publish various traffic management techniques used for different OTT applications? Is this a sufficient condition to ensure transparency and a fair regulatory regime?

Question 12 
How should the conducive and balanced environment be created such that TSPs are able to invest in network infrastructure and CAPs are able to innovate and grow? Who should bear the network upgradation costs?

Question 13 
Should TSPs be allowed to implement non-price based discrimination of services? If so, under what circumstances are such practices acceptable? What restrictions, if any, need to be placed so that such measures are not abused? What measures should be adopted to ensure transparency to consumers?

Question 14 
Is there a justification for allowing differential pricing for data access and OTT communication services? If so, what changes need to be brought about in the present tariff and regulatory framework for telecommunication services in the country?

Question 15 
Should OTT communication service players be treated as Bulk User of Telecom Services (BuTS)? How should the framework be structured to prevent any discrimination and protect stakeholder interest?

Question 16 
What framework should be adopted to encourage India-specific OTT apps?

Question 17 
If the OTT communication service players are to be licensed, should they be categorised as ASP or CSP? If so, what should be the framework?

Question 18 
Is there a need to regulate subscription charges for OTT communication services?

Question 19 
What steps should be taken by the Government for regulation of non-communication OTT players?

Question 20 
Are there any other issues that have a bearing on the subject discussed?




Wednesday, August 22, 2012

Amazon Glacier: Archival storage that's cheap?

Amazon Glacier is a new service from Amazon that offers archival/cold storage at a cheap and flexible on-demand price of $0.01/GB/month. They say this is highly durable storage, with a durability of 99.999999999% (eleven nines, the same as Amazon S3), but retrieval is delayed by several hours (as opposed to instant retrieval in S3, with its availability of two-nines over a year).

Traditionally, cold storage meant tape. The last time I personally used tape backup for a server was over a decade ago; disks have taken over as the backup medium in most companies (except for cases that really need cold storage forever, like CERN's LHC).

Let's do some simple math for a disk-based system.

The cheapest 3TB SATA disk costs about $100 (an enterprise-class drive would be about three times that). What's the actual usable storage? A "3TB" drive contains 3 trillion bytes, which is only about 2.7 terabytes in base-2 terms, and a filesystem adds a few GBs of overhead.
Within a storage pod, we could use Reed-Solomon-style encoding to provide solid redundancy, with 25% of the space spent on error-correction bits. Taking all this into consideration, we get only about 2,025 GB of usable space per disk. If we want 3 geo-separated replicas, the effective storage per disk goes down to 675 GB.

These drive costs are typically amortized over 3 years, so the media cost alone per GB per month is:
$100 / 36 months / 675 GB ≈ $0.004/GB/month.

Now, the server, power, cooling and space costs remain to be accounted for within the remaining $0.006/GB/month. A 60-drive server would cost about $2,000 excluding the drives, or about $55/month amortized over the same 3 years. Add about $70/month for space rent and another $70/month for power and cooling, and that's roughly $200/month for 60 drives' worth of storage.
So the overhead cost per month is ($200/month) / (675 GB × 60 drives) ≈ $0.005/GB/month.
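The arithmetic above can be collected into a tiny cost model. This is just a sketch: every constant is an assumption stated in this post, not a measured value.

```python
# Back-of-the-envelope cost model for DIY disk-based cold storage.
# Every constant below is an assumption from the text, not a measurement.

DRIVE_PRICE_USD = 100       # cheapest 3TB SATA drive
USABLE_GB_PER_DISK = 2025   # after base-2 conversion, FS overhead, 25% parity
REPLICAS = 3                # geo-separated copies
AMORTIZE_MONTHS = 36        # 3-year drive/server life

SERVER_PRICE_USD = 2000     # 60-bay chassis, excluding drives
DRIVES_PER_SERVER = 60
RENT_USD_PER_MONTH = 70     # rack space
POWER_USD_PER_MONTH = 70    # power + cooling


def effective_gb_per_disk() -> float:
    """Usable GB per disk after replication."""
    return USABLE_GB_PER_DISK / REPLICAS


def media_cost() -> float:
    """Drive cost alone, in $/GB/month."""
    return DRIVE_PRICE_USD / AMORTIZE_MONTHS / effective_gb_per_disk()


def overhead_cost() -> float:
    """Server, rent, power and cooling, in $/GB/month."""
    monthly = (SERVER_PRICE_USD / AMORTIZE_MONTHS
               + RENT_USD_PER_MONTH + POWER_USD_PER_MONTH)
    return monthly / (effective_gb_per_disk() * DRIVES_PER_SERVER)


def total_cost() -> float:
    return media_cost() + overhead_cost()


if __name__ == "__main__":
    print(f"media:    ${media_cost():.4f}/GB/month")
    print(f"overhead: ${overhead_cost():.4f}/GB/month")
    print(f"total:    ${total_cost():.4f}/GB/month")
```

The total lands a little under $0.009/GB/month, which is why the comparison with Glacier's $0.01 only holds if the people and networking costs discussed below stay small.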

If we turn off the servers completely we can save more on power. If we build denser servers we can amortize the server cost better.

So it seems possible to build a disk-based solution that works even at (relatively) smaller scales: a pod would be a row of 12 racks with 10 4U 60-disk servers per rack, i.e. about 4.9 petabytes of usable storage.

Obviously I've not factored in the cost of the people needed to develop, deploy and maintain such a system. That development effort would not be a trivial investment and would make sense only at large scale and for strategic reasons.

Also, I've not factored in the networking infrastructure needed to provide equal-cost access to all the data.

So, in conclusion, Amazon Glacier may well be the most cost-effective solution for small-scale archival needs.

Tuesday, August 30, 2011

Stuff to get it right early for a startup

I'll make this a short post. It takes less time to set this up initially and get all of your projects to conform to it than to retro-fit it later, and the effort will pay for itself ten times over in time saved through increased productivity.

  • A source control repository
    • Separate binary file assets (like lots of images, videos etc.) from text file assets (like source code) into separate repositories.
    • Use a distributed version control repository, like git.
  • Integrated Code review tool, like gerrit.
  • Integrated Bug database, like bugzilla (it's very customizable and fast) or jira (newer versions are pretty good).
  • Integrated code browser, like opengrok. 
  • Every project should be buildable, preferably using autotools.
    • Even if it's 3rdparty code, never just keep the binary. Always keep the source in good building shape.
    • Also, save the web URL or location from where it was downloaded. There may be a bugfix or an update you'll want to pick up later.
    • Ensure build is fast. Use distcc and ccache to make it faster.
    • Split the overall code into independent layered one-way dependency projects.
  • Continuous build and deploy and smoke-test setup
    • This is extremely important for a project that's in active development.
    • Ensure smoke test is up-to-date and extensive.
    • Build system itself should be version-controlled. Treat build systems as sacrosanct. Don't install or upgrade packages randomly on this system.
  • Don't be stingy about the hardware for these systems.
  • Backup everything.
For a startup, hiring a build-automation engineer who can do all this stuff correctly and efficiently is probably money well spent, because it will save tons of effort on the part of your costlier engineers, architects etc.

If you are building an Internet application, there are more things to pay attention to - like ensuring your build system can publish to a software distribution system, upgradable builds with build IDs and version numbers, automatic dependency management etc. Maybe another quick post on that sometime later.

Wednesday, April 13, 2011

nginx 1.0!

My favorite server, nginx, has hit its 1.0 release. With this release they have made public the svn repo holding the code, with history all the way back to 2002. The repo is at svn://svn.nginx.org.

Kudos to Igor and his team on this awesome piece of practical software. I see myself continuing to be a fan of both Apache httpd and nginx for a long time to come.

Saturday, October 30, 2010

Thrift for serializing/deserializing objects in Membase

First off, if you haven't heard of Membase, you should check it out. It's an evolution of sorts from memcached.

Typically when you use memcached or Membase to store/retrieve key-value data, the value part is not a simple datatype. Instead, it is most likely a serialized representation of some complex application-specific data structure. It's great to set/get a complex data structure with a single remote call like this, but what can become a problem very soon is the performance of the serialize/deserialize operations that need to happen with every set/get.

With PHP, the obvious way to do this is to use the language's built-in serialization facility. Since the serialized format is ASCII-based, I would guess its performance is not optimal (especially for deserialization). Also, one would want to compress the data to reduce transfer and storage costs, which again adds to the cost of set/get operations.
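As a rough illustration of that serialize/deserialize cost (a Python sketch rather than PHP, since the trade-off is language-agnostic; the sample record is made up), one can simply time round-trips through a binary and a text format:

```python
# Time full serialize + deserialize round-trips of a cache value.
# The record below is a made-up stand-in for an application data structure.
import json
import pickle
import timeit

record = {
    "user_id": 42,
    "tags": ["alpha", "beta", "gamma"] * 50,
    "profile": {"name": "x" * 100, "scores": list(range(200))},
}

def roundtrip_pickle(obj):
    """Binary serialization, analogous to a compact binary wire format."""
    return pickle.loads(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))

def roundtrip_json(obj):
    """Text serialization, analogous to PHP's ASCII-based serialize()."""
    return json.loads(json.dumps(obj))

if __name__ == "__main__":
    for name, fn in (("pickle", roundtrip_pickle), ("json", roundtrip_json)):
        secs = timeit.timeit(lambda: fn(record), number=10_000)
        print(f"{name}: {secs:.3f}s for 10,000 round-trips")
```

Numbers will vary by machine, but running something like this against real cache values is a quick way to see whether serialization is a meaningful fraction of each set/get.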

I'm looking at one such application that could be optimized to work more efficiently in these areas. I've looked at Google Protocol Buffers: it's very easy to understand and use and has very good documentation. Unfortunately it doesn't have good support for PHP. So I'm now looking at Thrift. Thrift was initially developed at Facebook for use primarily with PHP and other languages, so it has good PHP support and comparable performance and functionality to protobufs. But its documentation seems too sparse.

On Compression
LZO is a more suitable compression algorithm here for reasons of CPU and memory efficiency. When compression is part of web request handling, one has to carefully trade off compressed size against speed.
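A minimal sketch of that trade-off, using Python's stdlib zlib as a stand-in (LZO bindings are a separate third-party package), with a made-up repetitive payload:

```python
# Compare output size and compression time across zlib levels.
# The payload is a made-up stand-in for a repetitive application response.
import time
import zlib

payload = b"some fairly repetitive application payload " * 2000

def compress_stats(level):
    """Return (compressed size in bytes, seconds taken) for a zlib level."""
    start = time.perf_counter()
    out = zlib.compress(payload, level)
    return len(out), time.perf_counter() - start

if __name__ == "__main__":
    for level in (1, 6, 9):
        size, secs = compress_stats(level)
        print(f"level {level}: {size} bytes in {secs * 1e3:.2f} ms")
```

For request-path compression, a sensible rule of thumb is to pick the lowest level that meets your size budget, since the higher levels cost disproportionately more CPU per byte saved.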

Sunday, April 04, 2010

Need for service with guarantee of security and privacy

Current Situation
Pretty much all of the online communication services we use today (e-mail, social media sites etc.) are "free" in the sense that we don't pay a subscription fee for using them. And as such, the "Terms of Service" are heavily tilted towards the service provider.

In most cases, the only way such a free service provider makes money is by mining the data collected when we use the service. Every time we use such a service, we are inputting some data for query, transmission or storage. Most of the time this is sensitive, confidential, private data: your contacts, personal messages that reveal who you are, what you like or don't like, and what, when and where you do things.

Mining this information to profile the user and show targeted ads, or to do market demographics research and sell the results to marketers, are the most common ways of making money. In such cases no particular user's data is specifically exposed, as it's all aggregate information, so such uses may be acceptable.

What is scary is unauthorized or accidental data leakage, theft of data by criminal "hackers", or even government-backed agencies getting access to this data to spy on people or conduct corporate espionage.

What's needed?
I don't know if there will ever be a complete solution to this problem. But to start with, we need guarantees about privacy and security from service providers. These should be verified and certified by multiple third-party agencies, and ideally be scientifically provable. And there should be stringent consequences for breaching the guarantee, whatever the reason may be.

Now, I know such security is difficult and will cost a lot of money. So, it is acceptable to have subscription fees to cover such services.

What is alarming is that today there are absolutely no such services in existence. Even people who value their privacy and security and are willing to pay a subscription fee for it have no choice but to use ad-powered free services.

Sunday, February 14, 2010

Flash video sucks!

Summer is almost here. With the rising ambient temperature, my macbook gets hot sooner - especially when my browser is open. The reason is the all-pervading Adobe Flash player powering ads and video players on web pages.

This has made watching videos on youtube or ted.com an unpleasant experience. If I watch a video in full screen, I notice both cores on this macbook running at a full 100%. That's horribly wrong, when the same video needs less than 1% played via a standalone video player like VLC.

This really needs to be fixed; something is horribly broken here. Is it just me, or is everyone else simply putting up with this problem?

Friday, January 15, 2010

Google finally getting into data backup!?!

With its latest announcement that Google Docs will host any type of file, Google is foraying into the "your data in the cloud, access, organize, share - anytime from anywhere" business that we have been envisioning for a long time (over 4 years now!).

What's interesting is the approach Google has taken. Instead of the traditional approach of building all the features for this product vision from the ground up and releasing the end product, Google has built seemingly independent products and tested the waters first. Once users have accepted each of those individual pieces reasonably well, they integrate them all to provide a powerful experience. (Privacy conspiracy theorists may say this is much like slowly boiling a frog!)

Interestingly enough, Google's storage price seems to be the cheapest at the moment at $0.25/GB/year, though the initial free offering is just 1 GB with a 250MB file-size limit. At this price it's cheaper than Amazon, and as expected for an end-user product there are no transfer (bandwidth) charges. In comparison, Microsoft SkyDrive offers 25 GB of free space with a 50MB file-size limit.

Even though Google doesn't have its own backup client that can run on your desktop like traditional backup clients, I'm sure that, given the good data APIs available to 3rd-party developers, such clients will crop up like mushrooms.

Surely this will change the market for good in the long term. Let's see how the traditional backup companies (including us) react.

Saturday, January 09, 2010

Macbook Pro Battery

One thing I realized with my last Macbook (White) is that always putting it to sleep and never shutting it down (especially overnight) is not good for the battery. By doing that I consumed extra battery cycles, and the battery discharge time has come down to just over 2 hours. Last week I got a new macbook pro, and this time I figured out how to make it hibernate (not just sleep) upon closing the lid. With that, I've needed only 3 battery recharge cycles in the last week.

Here's how to put your macbook to deep-sleep (hibernate):
Put these two lines in your ~/.bash_profile:

alias hibernateoff='sudo pmset -a hibernatemode 0'
alias hibernateon='sudo pmset -a hibernatemode 5'

Whenever you are about to close the lid for a long time (like before you go to bed), turn hibernation on by invoking hibernateon in a terminal. At other times, when you don't want it to go into deep-sleep (say, when closing the lid while moving between meeting rooms at the office), just turn it off with hibernateoff.

I also realized that these new batteries don't need to be discharged and recharged regularly, as they don't have the "memory" problem that older battery technologies did. So I use the battery only when I need to and stay on the power adaptor when I can, keeping my battery cycle count low.

Here's my battery info for future (self-) reference:
+-o AppleSmartBattery  
    {
      "ExternalConnected" = Yes
      "TimeRemaining" = 0
      "InstantTimeToEmpty" = 65535
      "ExternalChargeCapable" = Yes
      "CellVoltage" = (4189,4189,4190,0)
      "PermanentFailureStatus" = 0
      "BatteryInvalidWakeSeconds" = 30
      "AdapterInfo" = 0
      "MaxCapacity" = 5573
      "Voltage" = 12568
      "Quick Poll" = No
      "Manufacturer" = "DP"
      "Location" = 0
      "CurrentCapacity" = 5573
      "LegacyBatteryInfo" = {"Amperage"=226,"Flags"=5,"Capacity"=5573,"Current"=5573,"Voltage"=12568,"Cycle Count"=3}
      "BatteryInstalled" = Yes
      "FirmwareSerialNumber" = 9626
      "CycleCount" = 3
      "AvgTimeToFull" = 0
      "DesignCapacity" = 5450
      "ManufactureDate" = 15124
      "BatterySerialNumber" = "xxxxxxxxxxxx"
      "PostDischargeWaitSeconds" = 120
      "Temperature" = 3099
      "InstantAmperage" = 0
      "ManufacturerData" = <000000000000000000000000xxxxxxxxxx000000000000000>
      "MaxErr" = 1
      "FullyCharged" = Yes
      "DeviceName" = "xxxxxxxxx"
      "IOGeneralInterest" = "IOCommand is not serializable"
      "Amperage" = 226
      "IsCharging" = No
      "DesignCycleCount9C" = 1000
      "PostChargeWaitSeconds" = 120
      "AvgTimeToEmpty" = 65535
    }


As this battery is designed to last 1000 cycles, I'm hoping it will give me 6 hours of runtime when I need it for a long, long time - at least 3 years.

Thunderbird 3: Better search user-experience, not there yet.

For my work e-mail, I used Microsoft Outlook with an Exchange server for 2 years and liked it a lot. The global address book integration, distribution-list expansion, and calendar/meeting scheduling features are awesome. At my current workplace we don't have Exchange, and I'm on a macbook, so my choices are Apple Mail, Thunderbird or Microsoft Entourage.
I tried Microsoft Entourage and didn't like it: it's nothing like Outlook, and its UI looks as if it's been resurrected from the 1970s. Without an Exchange server to connect to, it doesn't have much advantage over the others.

I tried Apple Mail too, for a couple of months, and didn't like it either. Although it looks great compared to Entourage or Thunderbird, it isn't great at handling lots of e-mails in lots of IMAP folders, and its search is slow.

Then I tried Thunderbird. I've been a big fan of Mozilla for a long time, and being a supporter of open source (where appropriate!), I decided I could put up with minor quirks here and there and use Thunderbird as my primary mail client. I've now been using it for the past 2 years.

Over the years, Thunderbird has improved quite significantly. In particular, its ability to handle a huge number of e-mails in a huge number of IMAP folders is great, and its search is quite fast. Although there have been lots of crashes (expected, as I'm always on beta or even alpha builds), the latest Thunderbird 3 release has been quite stable - no crashes so far. Overall I'm happy.

But I think Thunderbird can do much better with just a few minor improvements. Here's my list of low-hanging-fruit enhancements that would greatly improve its UX.

  • Keyboard accelerators or special keywords (search operators) that map to the search filters in the quick-search drop-down. This would speed up the search experience in a big way. I've filed this as an enhancement request in the Thunderbird bug tracker: https://bugzilla.mozilla.org/show_bug.cgi?id=538738. Please leave a comment there if you also think it's important.
  • Multiple addresses on a single line in the compose window. The current behaviour is annoying when replying to a message with a lot of recipients. Here's the enhancement request for this one: https://bugzilla.mozilla.org/show_bug.cgi?id=495241
  • In thread view, when a new message arrives in a collapsed thread, the thread should be shown in bold to indicate there is an unread message hidden there. Otherwise the user may miss it.
If you are a Thunderbird hacker, please consider working on these. I'd like to spend time on them myself. With Jetpack for Thunderbird, it may even be possible to get them done as a simple jetpack.