Harshdeep 2.0

January 27, 2009

The Making of Latest in Music – Using the YouTube API

Filed under: Latest in Music, Programming, Video — harshdeep @ 2:12 pm

This is the second part of the Making of Latest in Music trilogy (here is part 1). Latest in Music is a YouTube mashup that scrapes the listings of top songs from various websites and shows their music videos by searching for them on YouTube. This is the first web application I have built, and I’m amazed by how quickly one can build something interesting with the available tools.

An important part of LiM is interaction with YouTube. YouTube generously provides an API to access the service. Initially I didn’t feel the need to use it: all I had to do was search for a song, which can be done simply by inserting the query string at the right place in the URL http://www.youtube.com/results?search_query=<my_query_string> and fetching that page. However, this approach turned out to be insufficient for two reasons.
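A minimal sketch of the "insert the query string in the URL" approach, assuming Ruby's standard uri library. The base URL and the search_query parameter come from above; the helper name youtube_results_url and the sample song are made up for illustration. The main thing it shows is that the query has to be escaped before it goes into the URL.

```ruby
require "uri"

# Base URL as given in the post.
YOUTUBE_RESULTS_URL = "http://www.youtube.com/results"

# Hypothetical helper: builds the results-page URL for a free-text query.
# encode_www_form escapes spaces and reserved characters such as "&",
# so the whole query stays inside the single search_query parameter.
def youtube_results_url(query)
  "#{YOUTUBE_RESULTS_URL}?#{URI.encode_www_form({ "search_query" => query })}"
end

puts youtube_results_url("Coldplay Viva la Vida")
```

The resulting string can then be fetched with any HTTP client and the returned page scraped for video links.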

Firstly, some videos on YouTube cannot be embedded on external pages (like this one). Extending the basic approach to determine whether a video is embeddable would require another HTTP page fetch. With the YouTube API, however, it is only a matter of setting an additional parameter in the search request (format=5).
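Here is a rough sketch of what such a search request might look like. Only the format=5 parameter comes from this post. The feed URL and the q parameter name are assumptions based on my recollection of the GData-era API, so check them against the API documentation before relying on them. The helper name is hypothetical too.

```ruby
require "uri"

# ASSUMPTION: endpoint and "q" parameter are recalled from the GData-era
# YouTube API and are not stated in the post. Only format=5 is.
API_VIDEOS_URL = "http://gdata.youtube.com/feeds/api/videos"

# Hypothetical helper: builds a search URL restricted to embeddable videos.
# base_url is a parameter so a different endpoint can be swapped in.
def embeddable_search_url(query, base_url = API_VIDEOS_URL)
  params = { "q" => query, "format" => 5 }
  "#{base_url}?#{URI.encode_www_form(params)}"
end

puts embeddable_search_url("Viva la Vida")
```

Because the filter is just one more query parameter, no second request per video is needed to check embeddability.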

Secondly, the initial users complained that there was no way to play all the songs one after the other. I figured this could be done by creating a YouTube playlist with those songs. This definitely required the use of the API.

Using the API was pretty straightforward, but I learnt a few things the hard way. This might be a useful read if you are going to use the YouTube API for the first time.

  1. Login

    The generic Google login does not work for the YouTube API calls that require authentication. It probably works fine for other Google APIs, like the ones for Google Docs, but with the YouTube API you’ll get “Service Forbidden” errors. You need to create a login specifically on YouTube.

  2. HTTP version

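    The YouTube API requires HTTP version 1.1. If you are using Ruby (version > 1.6), Net::HTTP defaults to its 1.2 mode, and that caused errors for me. Strictly speaking, 1.1 and 1.2 here are behaviour modes of the net/http library rather than HTTP protocol versions (there is no HTTP 1.2), but calling Net::HTTP.version_1_1 before sending any requests is what made the Google servers happy with me.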

  3. API call frequency

    If you make a lot of YouTube API calls in a short time, you will start getting Forbidden errors. I couldn’t think of a better way to handle it than reducing the call frequency artificially by putting a sleep between the calls.
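The sleep-between-calls workaround from the last point can be sketched like this. The helper name each_throttled, the delay value and the injectable sleeper are all illustrative choices, not what LiM actually runs. The block stands in for a real API call.

```ruby
# Hypothetical helper: runs the block once per item and pauses between
# consecutive calls. `delay` is in seconds; `sleeper` is injectable so the
# pause can be replaced (for example in tests) without really waiting.
def each_throttled(items, delay: 1.0, sleeper: ->(seconds) { sleep(seconds) })
  results = []
  items.each_with_index do |item, index|
    sleeper.call(delay) if index > 0 # no pause before the first call
    results << yield(item)
  end
  results
end

# Usage: upcase stands in for an API request made for each song.
songs = ["song a", "song b", "song c"]
found = each_throttled(songs, delay: 0.1) { |song| song.upcase }
puts found.inspect
```

A fixed delay is crude. It slows down every run, even when the API would have accepted the calls, but it is the simplest thing that keeps the request rate down.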

January 14, 2009

The Making of Latest in Music – Ruby on Rails

Filed under: Latest in Music, Programming — harshdeep @ 11:28 am

Last week I unleashed www.latestinmusic.com to the unsuspecting world. Keeping in touch with music is never going to be the same again. You don’t go finding the new songs, they come to you (in your RSS reader).

Coming back to Planet Earth, it’s a modest little site that I thought would be useful for me. Hopefully it will be useful to others as well. It took me less than a week to build it. This being my first web application, I was learning as I went; a seasoned web developer would probably take less than a day.

I used Ruby on Rails for development. The decision was primarily based on all the hype that the platform has been getting for simplicity and elegance. In my case, the hype turned out to be completely justified. Ruby, as a language, is sheer pleasure to write code in. Rails takes care of the mundane low-level things like maintaining connections with the databases, and provides a set of powerful abstractions to work on top of. It does take some time to get used to and there is definitely a lot of scope for improvement in the documentation, but once you cross the initial hurdles, it lets you be very productive.

One of the hang-ups that I have from my desktop/mobile development experience is expecting an all-encompassing IDE like Visual Studio, Xcode or Eclipse. Nobody should have to do serious development in Notepad anymore. Thankfully I discovered Aptana RadRails pretty early. It’s not without its share of annoying bugs, but I think it does the job pretty well. You can edit Ruby code, JavaScript code, layouts in .html.erb files and CSS stylesheets in the coziness of the same IDE, and it lets you visually debug the code while running the application locally. TextMate is a popular choice among Ruby developers but it is available for Mac OS X only. Aptana is cross-platform.

The next question was where to get the application hosted. I first tried GoDaddy because I’d used it before to register domains. They do support Ruby on Rails but for some reason I could not get my app running with them. I tried contacting their customer care and they duly told me that it’s not them, it’s me.

Being a newbie, I thought it would be easier for me to host my app with one of the new hosting providers that focus exclusively on Ruby on Rails apps. Sure enough, I could set it all up with HostingRails in a couple of hours. Their FAQ section turned out to be particularly useful.

Overall, it was fun working with Ruby on Rails. I’ll hopefully use it for more projects.

June 21, 2007

Reusing High Level Modules – Dependency Inversion Principle

Filed under: Design Patterns, Dev, Geek, Programming — harshdeep @ 8:14 am

It is easier to make a low-level module reusable than a high level module. Firstly, a low-level module generally has clearer goals (do one thing and do it right) and wider usability (the number of people who need a generic stack is much larger than the number who need a document indexer). Secondly, a low-level module has fewer dependencies on other modules. When one moves a module from one application to another, one also needs to move all the modules that it depends on and, since dependency is transitive, also the modules that those modules depend on, and so on. So, the higher the efferent coupling of a module, the harder it is to reuse.

Note that we are talking about reusing a module and not just copying chunks of code. Code copying is not code reuse.

code copying … comes with a serious disadvantage: you own the code you copy! If it doesn’t work in your environment, you have to change it. If there are bugs in the code, you have to fix them. If the original author finds some bugs in the code and fixes them, you have to find this out, and you have to figure out how to make the changes in your own copy. Eventually the code you copied diverges so much from the original that it can hardly be recognized. The code is yours. While code copying can make it easier to do some initial development; it does not help very much with the most expensive phase of the software lifecycle, maintenance.

I prefer to define reuse as follows. I reuse code if, and only if, I never need to look at the source code (other than the public portions of header files). I need only link with static libraries or include dynamic libraries. Whenever these libraries are fixed or enhanced, I receive a new version which I can then integrate into my system when opportunity allows.

Now, to make my high level component reusable, I need to remove its dependencies on low-level modules. This is one of the motivations behind the Dependency Inversion Principle put forward by Robert C. Martin in another of his brilliant papers on Design Patterns.

Consider the implications of high level modules that depend upon low level modules. It is the high level modules that contain the important policy decisions and business models of an application. It is these models that contain the identity of the application. Yet, when these modules depend upon the lower level modules, then changes to the lower level modules can have direct effects upon them; and can force them to change.

This predicament is absurd! It is the high level modules that ought to be forcing the low level modules to change. It is the high level modules that should take precedence over the lower level modules. High level modules simply should not depend upon low level modules in any way.

Moreover, it is high level modules that we want to be able to reuse. We are already quite good at reusing low level modules in the form of subroutine libraries. When high level modules depend upon low level modules, it becomes very difficult to reuse those high level modules in different contexts. However, when the high level modules are independent of the low level modules, then the high level modules can be reused quite simply.

He defines the Dependency Inversion Principle as:

a) High level modules should not depend upon low level modules. Both should depend upon abstractions.

b) Abstractions should not depend upon details. Details should depend upon abstractions.

    Here’s an example from the same paper. In the traditional layered design, where each layer depends directly on the one beneath it, a change in the lowest level Utility Layer can affect the highest level Policy Layer.

    Instead of letting each layer depend directly on the one underneath it, I can make each of the higher level layers use the lower layer through an interface (abstract class) that the actual layer implements (derives from).
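A minimal Ruby sketch of this arrangement, reduced to two layers. All the class and method names are made up for illustration and are not from the paper. Ruby has no abstract classes, so the interface is approximated by a module whose method raises until an implementer overrides it.

```ruby
# Interface owned by the higher layer. The default method raises, which is
# the closest plain Ruby gets to a pure virtual function.
module RecordStore
  def store(record)
    raise NotImplementedError, "#{self.class} must implement store"
  end
end

# Higher level layer: knows only about the RecordStore interface.
class PolicyLayer
  def initialize(record_store)
    @record_store = record_store
  end

  def accept(order_id)
    @record_store.store("order:#{order_id}")
  end
end

# Lower level layer: implements the interface the higher layer defines.
class InMemoryStore
  include RecordStore
  attr_reader :records

  def initialize
    @records = []
  end

  def store(record)
    @records << record
    record
  end
end

lower = InMemoryStore.new
PolicyLayer.new(lower).accept(42)
puts lower.records.inspect
```

PolicyLayer never names InMemoryStore, so the lower layer can be rewritten or replaced without touching it, as long as store keeps its contract.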

    Now none of the higher level layers will be affected if any of the lower level layers change, as long as they keep abiding by their respective interfaces. If I switch to a third party library for any of the lower level layers, I can write an Adapter to make it conform to its interface, thereby not affecting the higher level layer at all.
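The Adapter idea can be sketched as follows, assuming a hypothetical third-party VendorLogger whose method names differ from the log(message) interface that the higher layer expects. None of these classes refer to a real library.

```ruby
# Stand-in for a third-party class with its own method names and arguments.
class VendorLogger
  def write_line(severity, text)
    "[#{severity}] #{text}"
  end
end

# Adapter: exposes the log(message) interface the higher layer was written
# against and translates each call into the vendor's API.
class VendorLoggerAdapter
  def initialize(vendor = VendorLogger.new)
    @vendor = vendor
  end

  def log(message)
    @vendor.write_line("INFO", message)
  end
end

# Higher level layer: only ever calls log(message).
class ReportGenerator
  def initialize(logger)
    @logger = logger
  end

  def run
    @logger.log("report generated")
  end
end

puts ReportGenerator.new(VendorLoggerAdapter.new).run
```

Swapping in another vendor means writing another small adapter. ReportGenerator stays exactly as it is.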

    In many simple cases, dependency inversion can also be achieved through callbacks. A very common example is when a module provides APIs to allow the application to set its own memory allocation and de-allocation callbacks. The application may do this when it wants to use a heap optimized for small memory allocations, or if it wants to keep track of total memory allocated.
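To illustrate the callback flavour, here is a small sketch. Ruby manages memory itself, so the buffers below merely stand in for the allocation hooks that a C or C++ module would expose. The Checksummer class and its allocate/release parameters are hypothetical.

```ruby
# Lower level module: asks the caller-supplied callbacks for its working
# buffer instead of deciding for itself how buffers are obtained.
class Checksummer
  def initialize(allocate: ->(size) { "\0" * size },
                 release: ->(buffer) { buffer.clear })
    @allocate = allocate
    @release = release
  end

  # Copies the payload into a buffer from the callback, sums its bytes
  # modulo 256, then hands the buffer back through the release callback.
  def checksum(payload)
    buffer = @allocate.call(payload.bytesize)
    buffer[0, payload.bytesize] = payload
    sum = buffer.bytes.sum % 256
    @release.call(buffer)
    sum
  end
end

# Application side: an allocation callback that tracks total bytes requested.
total_allocated = 0
tracking_allocate = lambda do |size|
  total_allocated += size
  "\0" * size
end

summer = Checksummer.new(allocate: tracking_allocate)
summer.checksum("abc")
puts total_allocated
```

The module depends only on the shape of the two callbacks, so the application can plug in a pooled or instrumented allocator without the module changing.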

    However, I think there are cases when it’s alright if you don’t follow DIP.

    1. Lower-level module is highly stable. If you know that the lower-level module won’t change much during the lifetime of the depending module, and you are never going to have to replace it, even when the depending module is reused in another application, there is no harm in depending directly on it.
    2. Lower-level module is highly specific. Again, if you’ll never have to replace the lower-level module with another, you can depend directly on it.
    3. Performance is crucial. Use of abstract classes and virtual functions has a performance penalty, so it’s not advisable for the performance-critical parts of the application. However, one can consider using plain function callbacks to achieve dependency inversion in such cases, as in the allocation/de-allocation routine example above.
