Wednesday, May 26, 2010

New Software Thoughts

I don't have much free time to spend on programming, but when I do it usually goes on fixing or improving existing software. I'm at a point now where I can see the end of my current list of projects (Vampire Castle DS still to go) and have been thinking about some cool new software to write.

Here are a few ideas:

Javais (Windows/Linux):
A virtual butler with a British accent. Ever wanted your own butler to notify you, tell you the time, start programs, do web searches or generally be a dumbed-down prototype of Tony Stark's Jarvis? Voice recognition is obviously going to be a key factor.

Local on the 8s (Nintendo DS):
A Nintendo DS weather program that simulates the Weather Channel with some truly unique features that cannot be found in similar DS software (I'm keeping quiet about features in this one).

Squashie (Windows/Linux):
A drag and drop media converter that converts your flacs/oggs etc. into a format suitable for your media player and pulls in metadata (such as photos, videos, text) from the web during conversion, which is then copied to your player. Originally I intended this to convert my oggs to low bitrate (128kbps) mp3s so I could fit them on a low memory player, but obviously this program could be so much more.
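As a rough illustration of the original ogg to 128kbps mp3 use case, here is a minimal Python sketch that only builds the command line for a single file. It assumes the ffmpeg command-line tool with the libmp3lame encoder as the backend, which is an assumption made for the example and not a decision from this post; the function name `mp3_command` and the output folder layout are hypothetical too.

```python
from pathlib import Path
from typing import List


def mp3_command(source: Path, out_dir: Path, bitrate_kbps: int = 128) -> List[str]:
    """Build an ffmpeg command that converts one audio file to mp3.

    Assumption: ffmpeg (built with libmp3lame) is installed and on the PATH.
    Nothing is executed here; the caller decides when to run the command.
    """
    # Keep the original file name but swap the extension for .mp3.
    target = out_dir / (source.stem + ".mp3")
    return [
        "ffmpeg",
        "-i", str(source),            # input file (ogg, flac, ...)
        "-codec:a", "libmp3lame",     # encode the audio stream with LAME
        "-b:a", f"{bitrate_kbps}k",   # target audio bitrate
        str(target),
    ]
```

The resulting list could be handed to `subprocess.run(cmd, check=True)`. A drag and drop front end would call this once per dropped file.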

4 comments:

Buddhist Monkey said...

Re: Squashie - doesn't Handbrake do this quite well? Perhaps simply put a wrapper around Handbrake that allows me to batch process stuff.

Benjamin said...

@Buddhist Monkey

Really good point. I don't want to reinvent the wheel. I've just looked at Handbrake and see it's geared towards video - I was really looking to do something audio only, with encoder plugins (LAME/Ogg Vorbis etc.), and add static metadata to it. I'm favouring the butler app at the moment :)

Buddhist Monkey said...

@Benjamin

Regarding the audio conversion app, I think MediaCoder does that in spades. Their program does have a batch utility as well, which is why I use it for music.

As for the butler, are you planning to make it simply a link between a voice recog system and your computer, i.e. use XYZ to recognize the voice and turn it into text, feed text to Butler, and Butler then executes text within preset parameters ("destroy the world!" might be answered by "Simpsons did it!" instead of "Would you like to play a game?")? Or something else entirely? I would be interested in the Butler concept, but worried that it would run into the type of problem that is trivial for a human to resolve but maddeningly complicated to teach a computer to solve.

So now I'm getting into AI it seems.

Benjamin said...

I'd be saving the true AI for a later day (or more likely plugins...). The idea would be a number of common tasks run by keyword recognition with some predefined phrases e.g.

"Javais (identifies a sentence starting) find directions to (identifies command google maps) London Euston (identifies location)."

The home location along with other presets will be assumed.
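As a rough sketch of the keyword recognition described above, the Python below takes text that has already come out of the voice recognition step and splits it into a command and its argument. The phrase table, the command names and the `parse` function are all hypothetical and chosen purely for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

WAKE_WORD = "javais"  # identifies the start of a sentence meant for the butler

# Hypothetical table of predefined phrases and the command each one selects.
PHRASES = {
    "find directions to": "google_maps",
    "what is the time": "tell_time",
    "search for": "web_search",
}


@dataclass
class Command:
    name: str       # which command to run, e.g. "google_maps"
    argument: str   # whatever followed the phrase, e.g. a location


def parse(utterance: str) -> Optional[Command]:
    """Return the recognised command, or None if the sentence is not for us."""
    # Drop punctuation so a trailing full stop does not end up in the argument.
    words = re.sub(r"[^\w\s]", "", utterance).strip()
    if not words.lower().startswith(WAKE_WORD + " "):
        return None
    rest = words[len(WAKE_WORD):].strip()
    for phrase, name in PHRASES.items():
        if rest.lower().startswith(phrase):
            # Slice the original text so the argument keeps its capitalisation.
            return Command(name, rest[len(phrase):].strip())
    return None
```

With this table, `parse("Javais find directions to London Euston.")` yields the `google_maps` command with `London Euston` as its argument, and anything not starting with the wake word is ignored.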

Javais would be expected to respond with something along the lines of "Very good sir, I'll find the route for you....." while opening a browser at the relevant (pre-populated) link (and, if sensible, in an unoccupied area of the screen). Later versions (or again plugins) could add to this and show a butler face that appears to be thinking etc., or flash graphics (e.g. like Iron Man's).

Combinations and different tasks can be added by plugins, the idea being that on the face of it there is some semblance of intelligence even if there really isn't. The AI behind turning text into commands will also be pluggable, but initially it will be restricted to simple commands or combinations of them.
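One way the plugin idea could look is sketched below in Python: each plugin registers a handler under a command name, and a dispatcher looks the handler up and returns the butler's spoken reply plus a link to open. The registry, the decorator, the `HOME` preset and the fallback reply are hypothetical, and the Google Maps link format is an assumption based on the public Maps URLs scheme.

```python
from typing import Callable, Dict, Tuple
from urllib.parse import urlencode

HOME = "Home"  # hypothetical preset standing in for the user's home location

# A handler takes the command argument and returns (spoken reply, link).
Handler = Callable[[str], Tuple[str, str]]
PLUGINS: Dict[str, Handler] = {}


def plugin(name: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler under a command name."""
    def register(func: Handler) -> Handler:
        PLUGINS[name] = func
        return func
    return register


@plugin("google_maps")
def directions(destination: str) -> Tuple[str, str]:
    # Pre-populate the link with the home preset and the spoken destination.
    query = urlencode({"api": 1, "origin": HOME, "destination": destination})
    return ("Very good sir, I'll find the route for you.",
            "https://www.google.com/maps/dir/?" + query)


def dispatch(name: str, argument: str) -> Tuple[str, str]:
    """Run the plugin registered for a command, or apologise politely."""
    handler = PLUGINS.get(name)
    if handler is None:
        return ("I'm afraid I can't help with that, sir.", "")
    return handler(argument)
```

The caller would speak the reply and, when the link is not empty, pass it to something like `webbrowser.open(url)`. Adding a new task then means adding one more decorated function.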

The naff bit will be training the voice recognition, but I guess I could add that as an "introduce yourself to Javais" step. The deal breaker would be how good the voice recognition really is; if it's no good then the app becomes a bit pointless.

Basically this would be GestureMagic (a program launcher) on steroids, using voice recognition instead of mouse gestures (incidentally I abandoned that program because the source was prototype quality and too difficult to maintain).

Crikey. That's looking like a very big program.