eHub Interviews CiteULike

Posted by Emily Chang on Saturday, October 8th, 2005. Filed under: eHub Interviews

Visit CiteULike, originally added to eHub on Oct 06, 05.

Thanks to Richard Cameron, creator of CiteULike, for this email interview posted October 7, 2005.

eHub: What is your web application/service about?

CiteULike: It’s all about making an extremely dull task faced by academics a little bit more bearable. There’s an important principle in science that in order to be a well-behaved member of the club, you need to record the full details of all the relevant academic articles you read in the process of discovering your Grand Theory of Everything… and you need to cite them at the end of the paper when you finally write it up. This all sounds vaguely like a job for a social bookmarking application, and indeed it is. There’s a twist though, which is that web bookmarks simply consist of titles and URLs. Academic citations are a bit more complex (authors, journal names, page numbers), and so I have an amazing battery of regular expressions coded into the application which can automatically extract this stuff without the user having to copy and paste. That was the really dull part of being an academic, and it’s what CiteULike tries to alleviate. The added bonus is that you get all the social “Web 2.0” perks for free: being able to explore other people’s libraries and see what they’re reading is an amazing boon. I’ve discovered lots of important material which I’d never have noticed otherwise.
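
As a rough illustration of the regular-expression approach described above, here is a minimal Python sketch that pulls authors, year, title, journal and page numbers out of one plain-text citation. It is not CiteULike's code (the site is written in Tcl, and its actual patterns are not shown in this interview). The citation layout, the name `parse_citation` and the sample citation are all invented for the example, and a real extractor would need many such patterns, one per publisher format.

```python
import re

# Assumed (hypothetical) layout: "Authors (Year). Title. Journal, Volume, First-Last."
CITATION_RE = re.compile(
    r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\.\s+"   # authors, then "(2004)."
    r"(?P<title>.+?)\.\s+"                            # title up to the first ". "
    r"(?P<journal>[^,]+),\s*"                         # journal name up to a comma
    r"(?P<volume>\d+),\s*"                            # volume number
    r"(?P<first_page>\d+)-(?P<last_page>\d+)\.?$"     # page range
)


def parse_citation(text):
    """Return a dict of citation fields, or None if the layout is not recognised."""
    match = CITATION_RE.match(text.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Authors are assumed to be separated by semicolons in this layout.
    fields["authors"] = [name.strip() for name in fields["authors"].split(";")]
    return fields


# Made-up citation, used only to exercise the pattern.
example = "Smith, J.; Jones, A. (2004). A grand theory of everything. Journal of Examples, 12, 34-56."
print(parse_citation(example))
```

Text that does not fit the assumed layout returns `None` rather than a partial guess, which is one simple way to leave the user's data alone when the pattern is unsure.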

eHub: Why did you start this project?

CiteULike: Out of grumpiness. When I was doing my brief stint of “going back to academia”, one of the first fiats to be delivered by my boss was that I should be using the “industry standard” desktop citation manager software.

I grumpily went off to download the Mac OS X port of this thing. Installed it. Started her up, and launched her on her maiden voyage (“May God bless her and all those who fail in her”). It crashed within five minutes.

Eventually I did get it working, but I felt like I was using some Microsoft DOS application which had been ported to Windows by a convoluted process of sitting on it and squashing it until it ran. The bent remains of this original software felt like they’d then been turned into a Mac OS X app in rather a hurry by means of a bullet point on an executive roadmap document and a programmer who perhaps didn’t really appreciate how OS X applications are supposed to feel. So, that’s the software I was being “encouraged” to use, and I felt depressed every time I launched it. I got so depressed, in fact, that I never got round to recording any information in it. I found it preferable to print off all the papers I’d read, put them in a big pile on the shelf, and every time I wanted to write a bibliography I simply went through them one by one, picked out the relevant articles, and simply typed the details manually into a text file.

After the rage boiled within me for about a month, I sat down to write something better. CiteULike was originally going to be a Mac OS X desktop application which was going to look a bit like iTunes and categorise articles in a similar way. It was also going to synchronise with a central server so that I could see my library both at work and at home.

I just happened to write the server first. This was about the time del.icio.us was starting to become useful and popular, so I put a trivial little web interface onto the server one evening. Next morning I discovered that 20 people had registered on it. Lord only knows how they found out about it – all of the people I gave the original URL to swear blind that they didn’t tell a soul.

eHub: How much time do you devote to its growth?  Do you have a day job?

CiteULike: Yes, I have a full time day job. I’m back in the real world now, and I spend my days writing software to automatically trade financial securities for a small group of private investors. On the one hand, the disadvantage is that this leaves me much less time to work on CiteULike than I’d want. On the other hand it pays quite well, so I don’t have the requirement to perform a smash-and-grab commercialisation of the site (by, oh, say… selling out to Microsoft) on the grounds that I was hungry and needed to eat [see discussion on business model, or lack of it, below].

eHub: How large is your team and what are your backgrounds?

CiteULike: It’s just me, I’m afraid.

I started off life in a small startup company writing the software for online sports betting, which we eventually sold to most of the major bookmakers in the UK. After that, I went to go and work in a hedge fund for a while. After a brief shot at an academic life, I gave up and I’m now back doing financial work.

eHub: What is your design philosophy?

CiteULike: I maintain that people who have been enraged by certain shoddy aspects of how computers work always write the best software. Historically, outside computing, that’s been the case too. All the best revolutions have come from people unhappy with their lot in life trying to do something about it. While I can’t quite claim to draw a direct parallel between someone’s copy of Microsoft Word crashing and being packed off to work down the pit at the age of fourteen with a canary and a miner’s helmet, it’s still jolly annoying. In fact, apart from being kept on hold for hours listening to piped music when attempting to deal with any major utility company’s call centre, dealing with computer software misbehaving is probably one of the more annoying things to come out of the last twenty years.

And grumpiness doesn’t just drive people to do useful things on an individual level. Big institutions can benefit from it too. In fact, the Americans managed to put a man on the moon simply because Gagarin and Korolyov beat the US space programme to it with manned space flight and it annoyed the Kennedy administration.

And what about the good old days of the big public spats in 18th century natural philosophy? While a lot of it would be condemned as “unprofessional” by marauding groups of management consultants, a lot of good work was done when two people argued about an idea, and each sought to prove the other wrong by coming up with something better.

It’s now reassuring to watch companies like Google, Microsoft and Apple starting to fight like cats in a bag once more. When they do that, they start exploring very different, and diametrically opposite ways of doing things. That’s when the real innovation starts. Contrast this with politicians consistently moving their parties towards the centre ground of politics. That, of course, is the “optimal” thing to do (in a two-party political system, anyone who moves away from the centre would lose centrist votes), but it really does stifle progress.

Of course, what would be really exciting would be to watch very small companies, or even individuals, taking on the giants of the computer industry. I think, in part, that’s what’s happening with Web 2.0, and it’s quite exciting.

So, that’s a fairly simple and unconventional design philosophy for you. Find something (or even some company) which really annoys you. Do something about it, preferably something which involves solving the problem in a radically different way. Chances are that if it was annoying you in the first place, then the accepted way of doing it was wrong. Your alternative has a much better chance of being right.

eHub: What technologies are you currently using?

CiteULike: Most of the site is written in a language called Tcl. It suits me just fine, although I’m starting to feel like I missed a trick by not spotting Ruby on Rails sooner.

I use PostgreSQL as a database, and I make extensive use of memcached for caching (or, rather, a rewrite of memcached in a language called Erlang which I ought to release soon). There’s even some ongoing work in Common Lisp, which is a suitable choice for parsing and transforming documents which can be expressed as trees. More news on this new aspect to the site sometime relatively soon.
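
For readers unfamiliar with the idea, putting a cache such as memcached in front of a database usually means checking the cache first and only querying the database on a miss. The Python sketch below shows that general pattern only. It is not how CiteULike is implemented: a plain dictionary stands in for memcached, a lookup function stands in for a PostgreSQL query, and the names `ReadThroughCache` and `load_from_db` are hypothetical.

```python
class ReadThroughCache:
    """Check an in-memory store first; fall back to the database on a miss."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # stand-in for a database query
        self._store = {}                   # stand-in for memcached
        self.misses = 0

    def get(self, key):
        if key in self._store:
            return self._store[key]
        # Cache miss: fetch from the database and remember the result.
        self.misses += 1
        value = self._load_from_db(key)
        self._store[key] = value
        return value

    def invalidate(self, key):
        # Call this after the underlying row changes so stale data is dropped.
        self._store.pop(key, None)


# A dictionary plays the part of the database table in this example.
fake_db = {"article:1": "An example article title"}
cache = ReadThroughCache(fake_db.__getitem__)
cache.get("article:1")  # first read goes to the "database"
cache.get("article:1")  # second read is served from the cache
print(cache.misses)
```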

eHub: If your project is live, what are the most requested features from your users/community?

CiteULike: Oh, it’s embarrassing. I don’t have a fully functional API yet. I meant to. I meant to do it almost a year ago, but for some reason I never got round to it, and it’s still top of my “todo” list.

eHub: Does your user base reside in a primary geographic location or is it distributed?

CiteULike: Academics are a fairly international bunch. Although articles tend to be published in English, the users are from all over the world. In fact, due to the amazing efforts of certain users who have volunteered to translate the site, it’s now available in seven languages.

eHub: What is the greatest challenge to your success?

CiteULike: Well, I still have my day job (which I intend to keep), so there’s no real burning motivation to be successful in the sense of making lots of money out of it. I think it’s got to the stage where there’s probably far too much work for me to do on my own as a side-project. I think I need to think about how the site can keep growing without going down an unpleasant over-commercialised route. Doing something silly like trying to charge users for it would just completely destroy the social network which makes it work, and doing something insane like trying to profit from the personal information about users would just be criminally unethical. I need to find some other option. I could open source it, but I’m not convinced that such an approach is going to work too well with a web service like this.

eHub: What is the one thing you need to get to the next phase of the project?

CiteULike: I suppose if I stick to the “design philosophy” outlined above, then I need to find some other artefact of the world which irritates me. Not being an academic any more, I’m not experiencing as many of those horrors as I probably should be, so maybe I need to hear some pet hates from the users? Maybe we should have an anonymous “agony aunt” style submission box?

eHub: Do you have a business model?  If so, what is it?

CiteULike: None. Formally stated, the sum total of my thoughts could be expressed as: “Haven’t a clue”.

For a while, about a year ago, I thought that it might be possible to make a business out of it, but I never thought of anything which didn’t feel like I’d just be shamelessly profiteering from data which wasn’t really mine to begin with.

This, I think, will be an interesting aspect of Web 2.0 to watch over the next year. Many of the sites are created by people who’ve quit their jobs to go and work on something exciting. However, business models seem to be a bit hazy (mine being an extreme example of that). In the FAQs there’s sometimes a discussion of putting contextual advertising on the site. This doesn’t always work though (if you can think of a t-shirt merchandising opportunity and/or product placement offer for readers of “The critical probability for random Voronoi percolation in the plane is 1/2” then please send me a business plan right away).

So, as the supply of money to buy pizza diminishes, and the grim reality of having to live off noodles beckons, what’s to stop the people who run these sites realising that their biggest liquid asset is the personal data in the database?

For instance, in my case, I’m sure I could dream up quite an evil money-making scheme very easily. I’m bound to have plenty of highly qualified PhD students as users on CiteULike. I know what they’re reading, so I can probably have a good guess about their fields of expertise, and they’re probably going to be looking for jobs fairly soon. In the biotech, computer, or any other profitable industry, what’s to stop me attempting to sell this information (after all, I know all their email addresses) to Burke & Hare Recruitment Consultants, Ltd in return for sufficient money (at least) to switch back from noodles to pizza?

Hopefully my morals (and the fact that the day job pays a salary) would stop me doing this in my case, but you do need to wonder what the future holds for the hundreds of Web 2.0 sites being created every day.

eHub: If you’re able to disclose this information, how much traffic or usage do you see on an average day?

CiteULike: It’s getting relatively busy now. It averages at something like half a million page impressions per day. I’m sure some of that will be from RSS polls and search engines though.

eHub: What is the one thing you’re most proud of about the project?

CiteULike: I’m just amazed people actually use it and find it useful.

eHub: How would you describe the shift that’s occurring with the web right now to future generations?

CiteULike: Progress always seems to come in fits and starts. In science, the fits are called “paradigm shifts” after Kuhn. Looking back on them retrospectively, they always seem so obvious and you’re tempted to ask, “why didn’t anyone think of that before?”

However, hopefully one thing to come out of this will be that the world realises that it’s possible to create useful web sites without millions of dollars of venture capital and a team of a million programmers. In the same way that it was easy for an individual to create a personal homepage (with flashing text and animated GIFs) on geocities.com in 1997, it would be nice if it was as easy for individuals to create nice little useful Web 2.0 applications in 2007.

eHub: What site(s) do you visit everyday other than your own?

CiteULike: news.bbc.co.uk, del.icio.us/popular and bloglines.

eHub: How many hours of sleep do you get a night?

CiteULike: Plenty.
